Algorithms for Intelligent Systems Series Editors: Jagdish Chand Bansal · Kusum Deep · Atulya K. Nagar
Mohammad Shorif Uddin Jagdish Chand Bansal Editors
Proceedings of International Joint Conference on Advances in Computational Intelligence IJCACI 2022
Algorithms for Intelligent Systems Series Editors Jagdish Chand Bansal, Department of Mathematics, South Asian University, New Delhi, Delhi, India Kusum Deep, Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India Atulya K. Nagar, School of Mathematics, Computer Science and Engineering, Liverpool Hope University, Liverpool, UK
This book series publishes research on the analysis and development of algorithms for intelligent systems with their applications to various real world problems. It covers research related to autonomous agents, multi-agent systems, behavioral modeling, reinforcement learning, game theory, mechanism design, machine learning, meta-heuristic search, optimization, planning and scheduling, artificial neural networks, evolutionary computation, swarm intelligence and other algorithms for intelligent systems. The book series includes recent advancements, modification and applications of the artificial neural networks, evolutionary computation, swarm intelligence, artificial immune systems, fuzzy system, autonomous and multi agent systems, machine learning and other intelligent systems related areas. The material will be beneficial for the graduate students, post-graduate students as well as the researchers who want a broader view of advances in algorithms for intelligent systems. The contents will also be useful to the researchers from other fields who have no knowledge of the power of intelligent systems, e.g. the researchers in the field of bioinformatics, biochemists, mechanical and chemical engineers, economists, musicians and medical practitioners. The series publishes monographs, edited volumes, advanced textbooks and selected proceedings. Indexed by zbMATH. All books published in the series are submitted for consideration in Web of Science.
Mohammad Shorif Uddin · Jagdish Chand Bansal Editors
Proceedings of International Joint Conference on Advances in Computational Intelligence IJCACI 2022
Editors Mohammad Shorif Uddin Department of Computer Science and Engineering Jahangirnagar University Dhaka, Bangladesh
Jagdish Chand Bansal Department of Mathematics South Asian University New Delhi, India
ISSN 2524-7565 ISSN 2524-7573 (electronic) Algorithms for Intelligent Systems ISBN 978-981-99-1434-0 ISBN 978-981-99-1435-7 (eBook) https://doi.org/10.1007/978-981-99-1435-7 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
This book contains outstanding research papers as the proceedings of the International Joint Conference on Advances in Computational Intelligence (IJCACI 2022). IJCACI 2022 was jointly organized by South Asian University (SAU), India, and Jahangirnagar University (JU), Bangladesh, under the technical co-sponsorship of the Soft Computing Research Society, India. It was held on 15–16 October 2022 at South Asian University (SAU), India. The conference was conceived as a platform for disseminating and exchanging ideas, concepts, and results of researchers from academia and industry to develop a comprehensive understanding of the challenges of advancing computational intelligence. This book will help in strengthening congenial networking between academia and industry. The conference focused on collective intelligence, soft computing, optimization, cloud computing, machine learning, intelligent software, robotics, data science, data security, big data analytics, and signal and natural language processing.
This conference is a continuation of five earlier events: (1) the International Workshop on Computational Intelligence (IWCI 2016), held on 12–13 December 2016 at JU, Bangladesh, in collaboration with SAU, India, under the technical co-sponsorship of the IEEE Bangladesh Section; (2) the International Joint Conference on Computational Intelligence (IJCCI 2018), held on 14–15 December 2018 at Daffodil International University (DIU), Bangladesh, in collaboration with JU, Bangladesh, and SAU, India; (3) the International Joint Conference on Computational Intelligence (IJCCI 2019), held on 25–26 October 2019 at the University of Liberal Arts Bangladesh (ULAB) in collaboration with JU, Bangladesh, and SAU, India; (4) IJCACI 2020, held on 20–21 November 2020 at DIU, Bangladesh, in collaboration with JU, Bangladesh, and SAU, India; and (5) IJCACI 2021, held on 23–24 October 2021 in hybrid mode at SAU, India, in collaboration with JU, Bangladesh. All accepted and presented papers of IWCI 2016 are in the IEEE Xplore Digital Library, and those of IJCCI 2018, IJCCI 2019, IJCACI 2020, and IJCACI 2021 are in the Springer Nature book series Algorithms for Intelligent Systems (AIS).
We have tried our best to enrich the quality of IJCACI 2022 through a stringent and careful peer-review process. IJCACI 2022 received a significant number of technical contributed articles from distinguished participants from home and abroad.
After a very stringent peer-reviewing process, only 59 high-quality papers were finally accepted for presentation and for the final proceedings. This book thus presents novel contributions in areas of computational intelligence and serves as reference material for advanced research.
Dhaka, Bangladesh
New Delhi, India
Mohammad Shorif Uddin Jagdish Chand Bansal
Contents
1 An Approach for Sign Language Recognition with Deep Learning Algorithm . . . 1
  Kalyanapu Srinivas, K. Ranjithkumar, V. Rakesh Datta, and K. Rama Devi
2 Using the Contrastive Language-Image Pretraining Model for Object Detection on Images Containing Textual Labels . . . 11
  Kishan Prithvi, N. Pratibha, Aditi Shanmugam, Y. Dhanya, and Harshata S. Kumar
3 An Entropy-Based Hybrid Vessel Segmentation Approach for Diabetic Retinopathy Screening in the Fundus Image . . . 19
  A. Mary Dayana and W. R. Sam Emmanuel
4 Facial Recognition Approach: As per the Trend of 2022–23 Using Python . . . 31
  Basetty Mallikarjuna, Aditi Uniyal, Samyak Jain, Bharat Bhushan Naib, and Amit Kumar Goel
5 Parametric Optimization of Friction Welding on 15CDV6 Aerospace Steel Rods Using Particle Swarm Algorithm . . . 41
  P. Anchana and P. M. Ajith
6 Review on Fetal Health Classification . . . 51
  Vimala Nagabotu and Anupama Namburu
7 SDN-Based Task Scheduling to Progress the Energy Efficiency in Cloud Data Center . . . 61
  J. K. Jeevitha, R. Subhashini, and G. P. Bharathi
8 A Review on Machine Learning-Based Approaches for Image Forgery Detection . . . 75
  Sonam Mehta and Pragya Shukla
9 Ponzi Scam Attack on Blockchain . . . 91
  R. B. Amle and A. U. Surwade
10 Using the Light Gradient Boosting Machine for Prediction in QSAR Models . . . 99
  Marc Stawiski, Patrick Meier, Rolf Dornberger, and Thomas Hanne
11 The Adaptation of the Discrete Rat Swarm Optimization Algorithm to Solve the Quadratic Assignment Problem . . . 113
  Toufik Mzili, Mohammed Essaid Riffi, and Ilyass Mzili
12 Multiple Distributed Generations Optimization in Distribution Network Using a Novel Dingo Optimizer . . . 125
  Abdulrasaq Jimoh, Samson Oladayo Ayanlade, Funso Kehinde Ariyo, Abdulsamad Bolakale Jimoh, Emmanuel Idowu Ogunwole, and Fatina Mosunmola Aremu
13 Prediction of Daily Precipitation in Bangladesh Using Time Series Analysis with Stacked Bidirectional Long Short-Term Memory-Based Recurrent Neural Network . . . 139
  Shaswato Sarker and Abdul Matin
14 A New Face Recognition System . . . 157
  Anmol Tyagi and Kuldeep Singh
15 A Review on Quality Determination for Fruits and Vegetables . . . 175
  Sowmya Natarajan and Vijayakumar Ponnusamy
16 A New Image Encryption Scheme Using RSA Cryptosystem and Arnold Algorithm . . . 187
  Arabind Kumar and Sanjay Yadav
17 An Integrated ATPRIS Framework for Smart Sustainable and Green City . . . 199
  Nidhi Tiwari, Pratik Jain, and Mukesh Kumar Yadav
18 Detection of DDoS Attacks in Cloud Systems Using Different Classifiers of Machine Learning . . . 207
  Swati Jaiswal, Pallavi Yevale, Anuja R. Jadhav, Renu Kachhoria, and Chetan Khadse
19 EEG Signal Classification for Left and Right Arm Movements using Machine Learning . . . 221
  Swati Shilaskar, Niranjan Tapasvi, Shripad Govekar, Shripad Bhatlawande, and Rajesh Jalnekar
20 A Holistic Study on Aspect-Based Sentiment Analysis . . . 233
  Himanshi and Jyoti Vashishtha
21 Behavioral System for the Detection of Modern and Distributed Intrusions Based on Artificial Intelligence Techniques: Behavior IDS-AI . . . 251
  Imen Chebbi, Ahlem Ben Younes, and Leila Ben Ayed
22 A Novel Integer-Based Framework for Secure Computations over Ciphertext Through Fully Homomorphic Encryption Schemes in Cloud Computing Security . . . 263
  V. Biksham and Sampath Korra
23 A Narrative Review of Students' Performance Factors for Learning Analytics Models . . . 273
  Dalia Abdulkareem Shafiq, Mohsen Marjani, Riyaz Ahamed Ariyaluran Habeeb, and David Asirvatham
24 Optimum Design of Base Isolation Systems with Low and High Damping . . . 285
  Ayla Ocak, Sinan Melih Nigdeli, and Gebrail Bekdaş
25 The General View of Virtual Reality Technology in the Education Sector . . . 295
  Ghaliya Al Farsi, Azmi bin Mohd. Yusof, Ragad Tawafak, Sohail Malik Iqbal, Abir Alsideiri, Roy Mathew, and Maryam AlSinani
26 Detection of Sign Language Using TensorFlow and Convolutional Neural Networks . . . 305
  Ayush Upadhyay, Parth Patel, Riya Patel, and Bansari Patel
27 Exploring Spatial Variation of Soils Using Self-organizing Maps (SOMs) and UHPLC Data for Forensic Investigation . . . 317
  Nur Ain Najihah Mohd Rosdi, Loong Chuen Lee, Nur Izzma Hanis Abdul Halim, Jeevna Sashidharan, and Hukil Sino
28 Application of Deep Learning for Wafer Defect Classification in Semiconductor Manufacturing . . . 327
  Nguyen Thi Minh Hanh and Tran Duc Vi
29 Helping the Farmer with the Detection of Potato Leaf Disease Classification Using a Convolutional Neural Network . . . 341
  Surya Kant Pal, Vineet Roy, Rita Roy, P. S. Jha, and Subhodeep Mukherjee
30 Analysis of ML-Based Classifiers for the Prediction of Breast Cancer . . . 351
  Bikram Kar and Bikash Kanti Sarkar
31 Covid-19 Detection Using Deep Learning and Machine Learning from X-ray Images–A Hybrid Approach . . . 361
  Afeefa Rafeeque and Rashid Ali
32 Detecting COVID-19 in Inter-Patient Ultrasound Using EfficientNet . . . 373
  Amani Al Mutairi, Yakoub Bazi, and Mohamad Mahmoud Al Rahhal
33 Expert System for Medical Diagnosis and Consultancy Using Prediction Algorithms of Machine Learning . . . 381
  L. M. R. J. Lobo and Dussa Lavanya Markandeya
34 Dense Monocular Depth Estimation with Densely Connected Convolutional Networks . . . 393
  Adeeba Ali, Rashid Ali, and M. F. Baig
35 Learning Automata Based Harmony Search Routing Algorithm for Wireless Sensor Networks . . . 407
  Karthik Karmakonda, M. Swamy Das, and Bandi Rambabu
36 Financial Option Pricing Using Random Forest and Artificial Neural Network: A Novel Approach . . . 419
  Prem Vaswani, Padmaja Mundakkad, and Kirubakaran Jayaprakasam
37 Hosting an API Documentation Portal Using Swagger and Various AWS . . . 435
  Anjali Sinha and K. Nagamani
38 Bi-Level Linear Fuzzy Fractional Programming Problem Under Trapezoidal Fuzzy Environment: A Solution Approach . . . 447
  Sujit Maharana and Suvasis Nayak
39 Comparative Assessment of Runoff by SCS-CN and GIS Methods in Un-Gauged Watershed: An Appraisal of Denwa Watershed . . . 461
  Papri Karmakar, Aniket Muley, Govind Kulkarni, and Parag Bhalchandra
40 Transfer Learning-Based Machine Learning Approach to Solve Problems of E-commerce: Image Search . . . 479
  Kirti Jain
41 Performance Analysis of Rotten Vegetable Classifier Using Convolutional Neural Networks . . . 491
  Sonali Chakraborty
42 Reduced-Order Model of the Russian Service Module via Loewner Framework . . . 503
  Sanwar Alam and Mohammad N. Murshed
43 Error Analysis of Hydrate Formation Pressure Prediction Using ANN Algorithms and ANFIS . . . 511
  S. Asif Mohammed and R. Asaletha
44 An Efficient Methodology for Brain Tumor Segmentation Using Genetic Algorithm and ANN Techniques . . . 525
  Ankita, Ramesh Kait, and Fairy
45 Generating Abstract Art from Hand-Drawn Sketches Using GAN Models . . . 539
  Sugato Chakrabarty, Rithika F. Johnson, M. Rashmi, and Rishita Raha
46 Portfolio Selection Strategy: A Teaching–Learning-Based Optimization (TLBO) Approach . . . 553
  Akhilesh Kumar, Gayas Ahmad, and Mohammad Shahid
47 Friend Recommendation System Based on Heterogeneous Data from Social Network . . . 565
  Animesh Chandra Roy and A. S. M. Mofakh Kharul Islam
48 Towards Quality Improvement and Prediction of Closed Questions on Stack Overflow . . . 581
  Md. Nahidul Islam Opu and Animesh Chandra Roy
49 A Knowledge-Based Consultant Student System Using Reasoning Techniques for Selection of Courses in Smart University . . . 597
  Viet Phuong Truong and Quoc Hung Nguyen
50 Face Mask Detection in Real-Time Using an Automatic Door Control System . . . 607
  G. Vimala Kumari, M. Sunil Prakash, V. Sailaja, and G. Usha Sravani
51 FIODC Architecture: The Architecture for Fashion Image Annotation . . . 623
  Smita Vinit Bhoir, Sahil Sanjay Chavan, Sharvay Shashikant Chavan, and Aishwarya Anand
52 A Deep Learning Approach to Analyze the Stock Market During COVID-19 . . . 643
  Lomat Haider Chowdhury, Nusrat Jahan Farin, and Salekul Islam
53 Detection of Social Bots in Twitter Network . . . 655
  Mahesh Chandra Duddu and S. Durga Bhavani
54 EKF-Based Position Estimation of Wheeled Mobile Robot for 2-D Motion . . . 669
  Arun Kumar, Maneesha, and Praveen Kant Pandey
55 Control Application to a Heating System Used in an Auditorium with Complex Topology . . . 679
  Eusébio Conceição, João Gomes, Mª Inês Conceição, Mª Manuela Lúcio, and Hazim Awbi
56 Mathematical Analysis of Effect of Nutrients on Plankton Model with Time Delay . . . 689
  Rakesh Kumar and Navneet Rana
57 Enhanced Dragonfly-Based Secure Intelligent Vehicular System in Fog via Deep Learning . . . 705
  Anshu Devi, Ramesh Kait, and Virender Ranga
58 Towards Ranking of Gene Regulatory Network Inference Methods Based on Prediction Quality . . . 717
  Softya Sebastian and Swarup Roy
59 Role of Viral Infection in Toxin Producing Phytoplankton and Zooplankton Dynamics: A Mathematical Study . . . 729
  Rakesh Kumar and Amanpreet Kaur
Author Index . . . 743
About the Editors
Prof. Mohammad Shorif Uddin completed his Ph.D. in Information Science at Kyoto Institute of Technology, Japan, in 2002, a Master of Technology Education at Shiga University, Japan, in 1999, a Bachelor of Electrical and Electronic Engineering at Bangladesh University of Engineering and Technology (BUET) in 1991, and also a Master of Business Administration (MBA) from Jahangirnagar University in 2013. He began his teaching career as a Lecturer in 1991 at Chittagong University of Engineering and Technology (CUET). In 1992, he joined the Computer Science and Engineering Department of Jahangirnagar University, and at present he is a Professor of this department. He served as the Chairman of the Computer Science and Engineering Department of Jahangirnagar University from June 2014 to June 2017 and as Teacher-in-Charge of the ICT Cell of Jahangirnagar University from February 2015 to April 2023. He worked as an Adviser of ULAB from September 2009 to October 2020 and of Hamdard University Bangladesh from November 2020 to November 2021. He undertook postdoctoral research at the Bioinformatics Institute, Singapore; Toyota Technological Institute, Japan; Kyoto Institute of Technology, Japan; Chiba University, Japan; Bonn University, Germany; and the Institute of Automation, Chinese Academy of Sciences, China. His research is motivated by applications in the fields of artificial intelligence, machine learning, imaging informatics, and computer vision. He holds two patents for his scientific inventions and has published around 200 research papers in international journals and conference proceedings. In addition, he has edited a good number of books and written many book chapters. He has delivered a remarkable number of keynotes and invited talks and has also acted as a General Chair, TPC Chair, or Co-Chair of many international conferences. He received the Best Paper award at the International Conference on Informatics, Electronics & Vision (ICIEV2013), Dhaka, Bangladesh, and the Best Presenter Award from the International Conference on Computer Vision and Graphics (ICCVG 2004), Warsaw, Poland. He was the Coach of the Jahangirnagar University ACM ICPC World Finals Teams in 2015 and 2017 and has supervised a good number of doctoral and Master's theses. He is a Fellow of IEB and BCS, a Senior Member of IEEE, and an Associate Editor of IEEE Access.
Dr. Jagdish Chand Bansal is an Associate Professor at South Asian University, New Delhi, and Visiting Faculty in Mathematics and Computer Science at Liverpool Hope University, UK. Dr. Bansal obtained his Ph.D. in Mathematics from IIT Roorkee. Before joining SAU New Delhi, he worked as an Assistant Professor at ABV-Indian Institute of Information Technology and Management Gwalior and at BITS Pilani. His primary area of interest is swarm intelligence and nature-inspired optimization techniques. Recently, he proposed a fission-fusion social structure-based optimization algorithm, Spider Monkey Optimization (SMO), which is being applied to various problems from the engineering domain. He has published more than 70 research papers in various international journals and conferences. He is the Editor-in-Chief of the journal MethodsX published by Elsevier. He is the series editor of the book series Algorithms for Intelligent Systems (AIS) and Studies in Autonomic, Data-driven and Industrial Computing (SADIC) published by Springer. He is the Editor-in-Chief of the International Journal of Swarm Intelligence (IJSI) published by Inderscience. He is also an Associate Editor of Engineering Applications of Artificial Intelligence (EAAI) and Array published by Elsevier. He is the General Secretary of the Soft Computing Research Society (SCRS). He has also received gold medals at the UG and PG levels.
Chapter 1
An Approach for Sign Language Recognition with Deep Learning Algorithm Kalyanapu Srinivas, K. Ranjithkumar, V. Rakesh Datta, and K. Rama Devi
K. Srinivas (B) · K. Ranjithkumar · V. Rakesh Datta · K. Rama Devi
Department of Computer Science and Engineering, Vaagdevi Engineering College, Warangal (Telangana State), India
e-mail: [email protected]

1 Introduction
One of the interesting research directions concentrates on human sensory organs and their working mechanisms. Most sensory organs are affected by various factors such as climatic conditions, diet, etc., and the resulting deficiencies need to be addressed technically in order to help the affected groups. One of the most common sensory deficiencies observed in people is loss of hearing, so those affected need to rely on sign language in order to understand and express precise messages. A lot of earlier research used images, sensors, etc., but it is not affordable for many people due to high cost. Sign language recognition systems fall under the machine learning category and need to identify normal characters and numbers from human hand gestures. People who are disabled (deaf or dumb) try to communicate with others using sign gestures. Not everyone is familiar with these sign gestures, so an interpreter may be needed to convert the sign language into meaningful text. The work in this paper is to develop a model suitable for all people, removing the communication gap between normal and disabled (deaf or dumb) individuals [1].
According to a WHO survey, around 63 million people in India suffer from hearing impairment, and as per an NSSO survey, 291 out of every 100,000 people suffer from severe hearing loss [2], most of them youngsters aged 0–14 years. Therefore, there is a significant effect on economic output. The identified problem is that people who are disabled (deaf or dumb) face difficulty in communicating with normal people who have no knowledge of sign gestures. To overcome this, a model is proposed that can detect the sign gestures of a disabled individual
and feed the sign gestures to convolutional neural networks, which detect the sign and translate it into a form a normal person can understand. This model thus helps disabled people communicate with people who are not disabled.
2 Related Work
Impaired people (deaf or dumb) communicate with non-impaired people in order to share their feelings, using hand movements together with facial expressions. Collections of such hand movements form gestures carrying meaningful information, called sign language, as described by [3]. Around the world there are about 300 different sign languages. These languages are helpful to impaired people, but they are difficult for non-impaired people to understand, as put forward by [4]. Therefore, to remove the gap between them, a sign language recognition system forms a bridge so that everyone can communicate freely. Sign language recognition systems have been developed by adopting different processing methods in [5, 6], and [7]. The most widely used method in such systems is the Hidden Markov Model (HMM) [8]. Single-stream (SHMM), multi-stream (MHMM), and Tied-Mixture Density HMM [9] are the HMM methodologies used in sign language recognition systems. Other processing methods have adopted neural networks [10–14], elliptical Fourier descriptors [15], support vector machines [18], a convolutional residual network [19], the unsupervised Self-Organizing Map (SOM) neural network [16], the Simple Recurrent Network (SRN) [17], a wavelet-dependent method [20], and Eigenvalues with Euclidean distance [21].
A sensor device by Microsoft is used for sign language recognition in the proposal by [22]. Experiments are performed on color images and depth images; extracted moments such as angle, location, and shape are taken as inputs to an SVM classifier, and the output results in an accuracy improvement. An Android-based application was developed for sign language recognition by [23]. The app takes images as input and forwards them to a server system, where feature extraction is performed in a MATLAB application, as used by [24]. Using trained neural networks and pattern recognition, image classification is performed to create a text output.
Various learning strategies are available to perform machine learning tasks; these strategies use data to train neural network algorithms for classification. For analysing images with robust algorithms, convolutional neural networks are the best option. Deep learning has various applications such as image processing and natural language processing, and it is also used in medical science, media and entertainment, autonomous cars, and so on. For automated image processing, CNN algorithms are the most suitable, and most organizations use them for tasks such as recognizing the objects in an image. Fundamentally, a CNN is used for image analysis tasks such as image recognition, object detection, and segmentation. Convolutional neural networks are composed of three kinds of layers: convolutional layers, pooling layers, and fully connected layers.
TensorFlow is an open-source framework for creating end-to-end machine learning applications. It is a symbolic mathematics library that uses dataflow and differentiable programming to carry out various tasks focused on the training and inference of deep neural networks, and it enables developers to build AI applications using a variety of tools, libraries, and community resources. Google's TensorFlow is currently the most popular deep learning library, and Google uses machine learning in its products to improve its search engine, translation, image captioning, and recommendations.
OpenCV is one of the most suitable open-source libraries for computer vision, machine learning, and image processing. It supports a variety of programming languages, including Python, C++, and Java, and it can detect objects, faces, and human handwriting in images and videos, covering image handling from basic to advanced operations on images and videos through a large set of OpenCV functions and projects.
NumPy provides a high-performance multidimensional array object as well as tools for working with these arrays; it is the most widely used scientific computing package in Python. Each NumPy array is a table of elements, all of the same type, indexed by a tuple of positive integers. The data-type object provides information about the layout of the array: the values of an array are stored in a buffer, which can be considered a contiguous block of memory bytes interpreted using the type object.
Python was created by Guido van Rossum and released in 1991. It is used for web development, software development, science, and system integration, and it is used for building web applications on the server side. Python's indentation-based syntax allows programmers to write programs with fewer lines than other programming languages. Python is based on an interpreter, which means that code can be executed as soon as it is written. Unlike other programming languages, Python uses new lines to complete statements rather than semicolons or parentheses, and it uses whitespace to delimit scope, such as the bodies of loops, functions, and classes, where other languages typically use curly brackets.
3 Methodology
The stages involved in sign language recognition (SLR) can be organized into five phases: image capture, image pre-processing, segmentation, feature extraction, and classification.
3.1 Data Collection
To collect the dataset, OpenCV is used to capture the images. Python's image and video processing libraries are used in this proposal; OpenCV is one of the largest among them and provides a number of processing functions related to images and videos. With the help of a camera, OpenCV captures video and allows a video capture object to be created, which then undergoes the operations needed for further processing. Image capture with OpenCV follows these steps (a minimal sketch is given below):
a. cv2.VideoCapture() creates a video capture object for the camera.
b. A while loop with the read() method reads the frames.
c. cv2.imshow() displays the frames read from the video.
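A minimal sketch of this capture loop follows; the camera index, save path, and key bindings are not specified in the paper and are chosen here only for illustration.

```python
import cv2

# Open the default camera (index 0) and grab frames until 'q' is pressed.
cap = cv2.VideoCapture(0)
count = 0
while cap.isOpened():
    ret, frame = cap.read()          # read one frame from the camera
    if not ret:
        break
    cv2.imshow("capture", frame)     # display the current frame
    key = cv2.waitKey(1) & 0xFF
    if key == ord('s'):              # press 's' to save the frame as a dataset sample
        cv2.imwrite(f"dataset/sample_{count}.jpg", frame)   # assumes dataset/ exists
        count += 1
    elif key == ord('q'):            # press 'q' to stop capturing
        break
cap.release()
cv2.destroyAllWindows()
```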
3.2 Image Pre-processing
Here the captured images undergo operations such as morphological transformation, background noise removal, and image smoothing to enhance them. This process removes the noise in the image and makes sure it is clear enough for the required features to be extracted. Pepper-like background noise is removed, and segmentation then divides the image into black and white regions, as shown in Fig. 1: the white region shows the skin area, and the black region represents the background.
Fig. 1 Image after segmentation
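The exact filters and thresholds are not fixed by the paper; the sketch below shows one plausible pre-processing chain (median filtering, Otsu thresholding, and morphological opening) that produces the black-and-white segmentation described above.

```python
import cv2
import numpy as np

def preprocess(frame):
    """Denoise a captured frame and binarize it into skin (white) vs. background (black)."""
    blurred = cv2.medianBlur(frame, 5)                        # remove salt-and-pepper noise
    gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding is an illustrative choice for separating hand and background.
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Morphological opening cleans residual background speckles.
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel, iterations=2)
    return mask
```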
3.3 Feature Extraction
Hand contours are identified from the given image, and these contours are used to compute feature values in percentages. If the computed feature value is zero, no contour is identified in the image. For illustration, a 3 × 3 grid sample is used. The advantage of this proposal is that the features created change depending on the orientation of each hand stance. Different hand positions take up different amounts of grid space and fragment space. As a result, the feature vector precisely captures the hand's shape and position. Each hand posture is represented by a separate cluster using these M × N characteristics.

Feature value = (area of contour) / (area of fragment)
To track the objects in the foreground area, contours are used. If all the contours of current frame are identified, then the top three correspond to our region of interest.
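A sketch of this grid-based feature computation is given below, under the assumption that the filled area of the largest contour inside each fragment is divided by the fragment's area (grid size 3 × 3 as in the illustration); the paper does not spell out the exact implementation.

```python
import cv2
import numpy as np

def grid_features(mask, rows=3, cols=3):
    """Fraction of each grid fragment covered by the hand contour (all zeros when no contour)."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return np.zeros(rows * cols)
    hand = max(contours, key=cv2.contourArea)                 # largest contour = hand region
    filled = np.zeros_like(mask)
    cv2.drawContours(filled, [hand], -1, 255, thickness=-1)   # fill the hand contour
    h, w = mask.shape
    feats = []
    for r in range(rows):
        for c in range(cols):
            cell = filled[r * h // rows:(r + 1) * h // rows,
                          c * w // cols:(c + 1) * w // cols]
            feats.append(cv2.countNonZero(cell) / float(cell.size))  # area ratio per fragment
    return np.array(feats)
```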
3.4 Classification and Training the Model
To create the model, the following steps are followed:
• The TensorFlow library and a 2D CNN model are used in this proposal.
• Convolution layers scan the images with filters; the dot product of the frame pixels and the filter weights is computed.
• Key features of the input image are captured for further processing. A pooling layer is applied after each convolution layer.
• A fully connected output layer is the network's final layer.
• The model is then compiled.
A minimal training sketch is given below. To start, the data is loaded into the model using Keras's image data generator; its flow_from_directory function loads the train and test sets, and the names of the numbered folders become the class labels for the loaded images. The training script also contains the callbacks used, as well as the two different optimization algorithms that were tried. During inference, the maximum contour is found; if a contour is detected, a hand is present, and the frame of the ROI is treated as a test image. Segmenting the hand means taking the maximum contour and the detected hand image.
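The sketch below assembles these steps with TensorFlow/Keras. The directory names, input size, number of classes, and the Adam optimizer are assumptions, since the paper does not fix them.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE, NUM_CLASSES = (64, 64), 10   # assumed input size and number of gesture classes

# Sub-folder names under dataset/train and dataset/test become the class labels.
train_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "dataset/train", target_size=IMG_SIZE, color_mode="grayscale", class_mode="categorical")
test_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "dataset/test", target_size=IMG_SIZE, color_mode="grayscale", class_mode="categorical")

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(*IMG_SIZE, 1)),
    layers.MaxPooling2D((2, 2)),                       # pooling after each convolution layer
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),   # fully connected output layer
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_gen, validation_data=test_gen, epochs=10)
```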
Fig. 2 Experimented output
3.5 Recognition
In this step, classifiers that use algorithms to interpret the signals are employed. Popular classifiers include the Hidden Markov Model (HMM), the K-Nearest Neighbor model, Support Vector Machines (SVM), Artificial Neural Networks (ANN), and Principal Component Analysis (PCA). The classifier adopted in this model, however, is a CNN for image classification and recognition. The CNN employs a hierarchical model with high precision, constructing a funnel-shaped network that ends in a fully connected layer in which all neurons are connected to one another, and the output is then processed. After recognition, the output is displayed as shown in Fig. 2. This output is mapped to the collected dataset shown in Fig. 3 in order to identify the type of character to be displayed.
3.6 Working System
The flowchart in Fig. 4 represents the working of this system. First, the images are captured by the selected image acquisition device, such as a webcam. They then undergo image pre-processing and segmentation, which gives the binary form of the image. This binary image undergoes feature extraction, where the regions of interest (ROI) are identified. Finally, the chosen CNN algorithm is applied to classify and train the model to recognize the gestures or signs in real time.
Fig. 3 Collected dataset
4 Conclusion and Future Scope
Impaired people (deaf or dumb) communicate with non-impaired people in order to share their feelings using hand movements with facial expressions. Collections of such hand movements form gestures carrying meaningful information, called sign language. To overcome the communication gap, a model is proposed. The goal of this model is to predict numeric hand gestures, as shown in Fig. 5, and to present accurate prediction results both offline and online. In the future, the work can be extended to the prediction of two-handed gestures and of further static alphabet symbols.
Fig. 4 Sign gestures processing system
Fig. 5 Sample numerical hand gestures
References
1. Berke J, Lacy J (2021) Hearing loss/deafness | Sign language. https://www.verywellhealth.com/sign-language-nonverbal-users-1046848
2. National Health Mission-report of deaf people in India (2021) nhm.gov.in
3. Konstantinidis D, Dimitropoulos K, Daras P (2018) Sign language recognition based on hand and body skeletal data. In: 3DTV-Conference. https://doi.org/10.1109/3DTV.2018.8478467
4. Bragg D, Koller O, Bellard M, Berke L, Boudreault P, Braffort A, Caselli N, Huenerfauth M, Kacorri H, Verhoef T, Vogler C, Morris MR (2019) Sign language recognition, generation, and translation: an interdisciplinary perspective. In: 21st International ACM SIGACCESS Conference on Computers and Accessibility (2019). https://doi.org/10.1145/3308561
5. Cheok MJ, Omar Z, Jaward MH (2017) A review of hand gesture and sign language recognition techniques. Int J Mach Learn Cybern 10(1):131–153. https://doi.org/10.1007/S13042-017-0705-5
6. Wadhawan A, Kumar P (2019) Sign language recognition systems: a decade systematic literature review. Arch Comput Methods Eng 28(3):785–813. https://doi.org/10.1007/S11831-019-09384-2
7. Camgöz NC, Koller O, Hadfield S, Bowden R (2020) Sign language transformers: joint end-to-end sign language recognition and translation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. pp 10020–10030. https://doi.org/10.1109/CVPR42600.2020.01004
8. Gaus YFA, Wong F (2012) Hidden markov model-based gesture recognition with overlapping hand-head/hand-hand estimated using Kalman filter. In: Proceedings of the 3rd International Conference on Advanced Intelligent Systems Modelling and Simulation, ISMS 2012. pp 262–267. https://doi.org/10.1109/ISMS.2012.67
9. Suharjito RA, Wiryana F, Ariesta MC, Kusuma GP (2017) Sign language recognition application systems for deaf-mute people: a review based on input-process-output. Procedia Comput Sci 116:441–448. https://doi.org/10.1016/J.PROCS.2017.10.028
10. Cui R, Liu H, Zhang C (2019) A deep neural framework for continuous sign language recognition by iterative training. IEEE Trans Multimed 21:1880–1891. https://doi.org/10.1109/TMM.2018.2889563
11. Bantupalli K, Xie Y (2018) American sign language recognition using deep learning and computer vision. In: Proceedings of the 2018 IEEE International Conference on Big Data (Big Data 2018). pp 4896–4899. https://doi.org/10.1109/BIGDATA.2018.8622141
12. Hore S, Chatterjee S, Santhi V, Dey N, Ashour AS, Balas VE, Shi F (2017) Indian sign language recognition using optimized neural networks. Adv Intell Syst Comput 455:553–563. https://doi.org/10.1007/978-3-319-38771-0_54
13. Kumar P, Roy PP, Dogra DP (2018) Independent bayesian classifier combination based sign language recognition using facial expression. Inf Sci (Ny) 428:30–48. https://doi.org/10.1016/J.INS.2017.10.046
14. Sharma A, Sharma N, Saxena Y, Singh A, Sadhya D (2020) Benchmarking deep neural network approaches for Indian sign language recognition. Neural Comput Appl 33:6685–6696. https://doi.org/10.1007/S00521-020-05448-8
15. Kishore PVV, Prasad MVD, Prasad CR, Rahul R (2015) 4-camera model for sign language recognition using elliptical fourier descriptors and ANN. In: 2015 International Conference on Signal Processing and Communication Engineering Systems (SPACES 2015), Assoc with IEEE. pp 34–38. https://doi.org/10.1109/SPACES.2015.7058288
16. Tewari D, Srivastava SK (2012) A visual recognition of static hand gestures in indian sign language based on kohonen self-organizing map algorithm. Int J Eng Adv Technol 165
17. Gao W, Fang G, Zhao D, Chen Y (2004) A Chinese sign language recognition system based on SOFM/SRN/HMM. Pattern Recognit 37:2389–2402. https://doi.org/10.1016/J.PATCOG.2004.04.008
18. Quocthang P, Dung ND, Thuy NT (2017) A comparison of SimpSVM and RVM for sign language recognition. In: ACM International Conference Proceeding Series. pp 98–104. https://doi.org/10.1145/3036290.3036322
19. Pu J, Zhou W, Li H (2019) Iterative alignment network for continuous sign language recognition. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-June. pp 4160–4169. https://doi.org/10.1109/CVPR.2019.00429
20. Kalsh EA, Garewal NS Sign language recognition system. Int J Comput Eng Res 6
21. Singha J, Das K (2013) Indian sign language recognition using eigen value weighted euclidean distance based classification technique. (IJACSA) Int J Adv Comput Sci Appl 4
22. Raheja JL, Mishra A, Chaudhary A (2016) Indian sign language recognition using SVM. Pattern Recognit Image Anal 26(2):434–441
23. Loke P, Paranjpe J, Bhabal S, Kanere K (2017) Indian sign language converter system using an android app. In: International Conference on Electronics, Communication and Aerospace Technology. IEEE. 978-1-5090-5686-6/17
24. Srinivas K, Janaki V, Shankar V, KumarSwamy P (2020) Crypto key protection generated from images and chaotic logistic maps. In: Advanced Informatics for Computing Research. ICAICR 2020. Communications in Computer and Information Science, vol 1394, pp 253–262. Springer. ISSN 1865-0929
Chapter 2
Using the Contrastive Language-Image Pretraining Model for Object Detection on Images Containing Textual Labels Kishan Prithvi , N. Pratibha , Aditi Shanmugam , Y. Dhanya , and Harshata S. Kumar
K. Prithvi (B) · N. Pratibha · A. Shanmugam · Y. Dhanya · H. S. Kumar
B.M.S. Institute of Technology and Management, Bangalore 560054, India
e-mail: [email protected]

1 Introduction and Motivating Work
In order for artificial intelligence to make progress in understanding the world around us, it needs to be able to interpret and reason about multimodal messages. Multimodal machine learning aims to build models that can process and relate information from multiple modalities. From early research on audio-visual speech recognition to the recent explosion of interest in language and vision models, multimodal machine learning is a vibrant multi-disciplinary field of increasing importance and with extraordinary potential. Multimodal machine learning models provide us with the means of developing robust AI tools that simulate human behavior. Making predictions based on several modalities of data not only helps with the generalizability of neural networks but also gives rise to more confident and efficient ways of finding patterns in data [1, 2].
This paper outlines and builds upon the work on the Contrastive Language-Image Pretraining (CLIP) model by Radford et al. [3], a neural network trained on several image-text pairs for the object detection task. The model can be instructed in natural language to predict the most relevant text snippet, given some image sample. The model is not directly optimized for the task; it functions based on zero-shot learning. In addition to experimenting with various ConvNet architectures in the contrastive language-image pretraining setting in zero-shot and linear probe variants, the project also explores the problem of typographical attacks on such models and overcomes the limitations of the model presented by Alcorn et al. [4]. Finally, the project also aims to develop a standalone task-specific object detection model without explicit optimization of the model, while exploring factors like scalability and generalizability.
Object detection and classification tasks have a wide variety of applications. Computer vision models are restricted to making predictions on a predefined set of classes; this limits their generalizability and scalability, as it requires a lot of task-specific, well-annotated data and training. We leverage the concept of contrastive language-image pretraining to perform object detection on images of objects containing textual labels. Our experiments are also performed on custom datasets that contain a very small batch of images per class (Fig. 1).
Fig. 1 Methodology flowchart of the project
2 Methodology
The project life cycle is decomposed into four main stages: data preparation, model development and training, packaging and validation of the model, and task-specific deployment for performing object detection on images containing textual information.
2.1 Dataset Creation This task requires a Dataset containing images of objects along with textual information or labels on objects. A good example of this would be packaged goods with labels on the packaging. The datasets used in our experiments were generated using a custom web-scraper that takes keywords as input and downloads images from Google Images. The dataset was manually cleaned in order to remove blurred, poor-quality samples as well as any irrelevant images. Three datasets to run experiments were generated. A similar dataset approach was taken in Barbu et al. [5].
2.2 Breakfast Cereals The first dataset contains 12 classes of popular American breakfast cereals.
2.3 Beverages
This dataset contains 12 classes of frequently consumed beverages.
2.4 Snack Dataset
This dataset contains 6 classes of packed snacks. Each class contains 16 images on average, as justified in the paper by Dosovitskiy et al. [6]. To increase the size of the dataset, data transformation and augmentation scripts are integrated into the training script for the model [7]; a sketch of such augmentations is shown below.
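The sketch below illustrates the kind of augmentation pipeline referred to above, written with torchvision. The folder path is illustrative, and the normalization constants are CLIP's published preprocessing statistics rather than values stated in this paper.

```python
import torchvision.transforms as T
from torchvision.datasets import ImageFolder

# Augmentations mentioned in Sect. 4: random flips, crops and colour jitter.
train_tf = T.Compose([
    T.RandomResizedCrop(336),            # 336-px input resolution of CLIP ViT-L/14@336px
    T.RandomHorizontalFlip(),
    T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    T.ToTensor(),
    T.Normalize(mean=(0.48145466, 0.4578275, 0.40821073),    # CLIP image normalization
                std=(0.26862954, 0.26130258, 0.27577711)),
])
dataset = ImageFolder("datasets/snacks", transform=train_tf)  # one sub-folder per class
```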
2.5 Model The primary model selected to run our experiments is the CLIP-ViT-L/14. The CLIPViT-L/14 maps text and images to a shared vector space. This version of the CLIP model is superior to the others as it is trained at a higher 336-pixel resolution for one extra epoch.
2.6 Training
The pre-trained model is used in transfer learning mode to run our experiments. Our experiments leverage the saved weights of the CLIP-ViT-L/14 model trained end-to-end on the CIFAR100 dataset. The model is trained on each dataset for 10 epochs.
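For reference, a zero-shot prediction with the OpenAI CLIP package looks roughly as follows; the class names and prompt template are illustrative, not the ones used in our experiments.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-L/14@336px", device=device)   # 336-px ViT-L/14 variant

class_names = ["Coca-Cola", "Pepsi", "Sprite"]                    # illustrative labels
text = clip.tokenize([f"a photo of a {c} package" for c in class_names]).to(device)

image = preprocess(Image.open("sample.jpg")).unsqueeze(0).to(device)
with torch.no_grad():
    logits_per_image, _ = model(image, text)    # image-text similarity scores
    probs = logits_per_image.softmax(dim=-1)
print(class_names[int(probs.argmax())])          # most likely textual label
```

In the linear probe variant, the same image encoder is frozen and a small classifier is fitted on its features instead of relying on the text encoder.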
3 Proposed Technique
Initially, the validity of the workspace is checked. If the workspace is invalid, CLIP and all its dependencies are loaded. Then the dataset is checked; if data is absent, the 'imagescraper' script is run to generate the dataset, which is then loaded. Training is then performed on zero-shot CLIP; training completion is continuously monitored and troubleshot. After training is complete, we obtain predictions and validate the model. Hyperparameter tuning is then done to improve the results. The approach to visual representations in Bulent et al. [8] is analyzed and improved upon. The data is then analyzed; if the results are unsatisfactory, hyperparameter tuning is repeated. Once the results are satisfactory, the model is deployed and tested.
4 Experiments and Results
The model is trained for 10 epochs and validated on each dataset to obtain the Top-1, Top-3 and Top-5 accuracy. The training images are augmented by random flips, crops and color jitter. Data augmentation helps increase the size of our dataset while generating variations of the same sample for robust training. In order to obtain baseline results (refer Fig. 2), we perform the same experiments by transfer learning and training from scratch on a ResNet-50. We also test semi-supervised learning as used in Dai et al. [9]. In experiments performed on all three datasets, the ResNet50 trained for 30 epochs is able to match a CLIP model trained for 10 epochs. The ResNet50 fails to perform object detection when training from scratch but performs relatively well with transfer learning. A comparison of results for each of the three datasets, and the accuracies of the CLIP model and ResNet50, are given below (Fig. 3) (Table 1).
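The Top-k accuracies reported in Table 1 can be computed with a small helper of the following form; this is a generic sketch, not the exact evaluation script used in the experiments.

```python
import torch

def topk_accuracy(logits, labels, ks=(1, 3, 5)):
    """Fraction of samples whose true label appears in the top-k predicted classes."""
    results = {}
    for k in ks:
        topk = logits.topk(k, dim=1).indices               # (N, k) highest-scoring classes
        hit = (topk == labels.unsqueeze(1)).any(dim=1)     # is the true label among the top k?
        results[f"top-{k}"] = hit.float().mean().item()
    return results
```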
Fig. 2 Baseline results-cereal dataset
Fig. 3 Baseline and CLIP results for the datasets
Table 1 Dataset training accuracy results

Dataset                   | ResNet50 transfer (%) | ResNet50 trained from scratch (%) | CLIP linear probe (%) | CLIP zero-shot (%)
Breakfast cereal dataset  | 91.66                 | 8.33                              | 96.64                 | 99.37
Beverage dataset          | 68.29                 | 4.48                              | 98.24                 | 96.64
Snacks dataset            | 90.09                 | 14.28                             | 93.52                 | 99.24
5 Discussion and Conclusion
For the first phase, data preparation, a web-scraping script written in Python was developed. Using libraries such as Selenium, Beautiful Soup and Scrapy, the script takes an input keyword and automatically downloads the corresponding images into a dataset folder. The folder is structured in a way that is easy to read by both human beings and the deep learning model. The script further generates a CSV file recording the location of each sample for easy access to the dataset. A custom data loader to transform the images for the input layer of the model has also been developed. This phase of the project has been completed successfully. The deployment phase of the project requires setting up a Raspberry Pi to prepare it for deep learning applications; the Arm Compute Library developed for edge devices enables deep learning on the edge. This phase of the project is also complete, and the edge device is successfully set up to run deep learning models. The intermediate phase is to develop the model architecture first using Python and then replicate it in MATLAB for integration with hardware. The development of the model using Python is underway, with the MATLAB version to follow. Finally, the training and validation, followed by the deployment of the model on the edge device, will be carried out. As per the aim of the project to develop and deploy a model that achieves robust performance on object detection tasks, we use several deep learning concepts such as multimodal machine learning, zero-shot learning, and convolutional neural networks, to name a few. Concepts such as web scraping and edge computing are also used. To conclude, the project aims to be applied to a task-specific application while improving the generalizability and scalability of the model. A simple extension of this system would be to increase the size of the dataset and the number of classes and to apply bi-directional transformers as used in Devlin et al. [10].
References
1. Buolamwini J, Gebru T (2018) Gender shades: intersectional accuracy disparities in commercial gender classification. In: Conference on fairness, accountability and transparency. pp 77–91
2. Coates A, Ng A, Lee H (2011) An analysis of single-layer networks in unsupervised feature learning. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics. pp 215–223
3. Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, Krueger G, Sutskever I (2021) Learning transferable visual models from natural language supervision
4. Alcorn MA, Li Q, Gong Z, Wang C, Mai L, Ku W-S, Nguyen A (2019) Strike (with) a pose: neural networks are easily fooled by strange poses of familiar objects. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 4845–4854
5. Barbu A, Mayo D, Alverio J, Luo W, Wang C, Gutfreund D, Tenenbaum J, Katz B (2019) Objectnet: a large-scale bias-controlled dataset for pushing the limits of object recognition models. In: Advances in Neural Information Processing Systems. pp 9453–9463
6. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S et al. An image is worth 16x16 words: transformers for image recognition at scale
7. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: CVPR09
8. Bulent Sariyildiz M, Perez J, Larlus D (2020) Learning visual representations with caption annotations. arXiv e-prints, pp arXiv–2008
9. Dai AM, Le QV (2015) Semi-supervised sequence learning. In: Advances in neural information processing systems. pp 3079–3087
10. Devlin J, Chang M-W, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
Chapter 3
An Entropy-Based Hybrid Vessel Segmentation Approach for Diabetic Retinopathy Screening in the Fundus Image A. Mary Dayana and W. R. Sam Emmanuel
A. M. Dayana (B) · W. R. S. Emmanuel
Department of Computer Science, Nesamony Memorial Christian College, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India
e-mail: [email protected]
W. R. S. Emmanuel
e-mail: [email protected]

1 Introduction
The retinal fundus image possesses blood capillaries responsible for supplying blood to the retina and transmitting signals from the retina to the brain [1]. The deterioration of blood vessels causes eye ailments such as diabetic retinopathy, hypertensive retinopathy, and retinal vein occlusion. Diabetic retinopathy is a severe condition of diabetes that degrades blood vessels, causing leakage of fluids in the retina, which in turn form deposits that can interfere with vision. The formation of lesions like microaneurysms and hemorrhages signifies the early stage of DR. The advanced stage indicates retinal neovascularization causing severe vision loss. Therefore, detecting these pathologies is the primary task in screening DR. The presence of thin, elongated retinal vasculature and variations within the retinal fundus image make DR lesion detection difficult and limit performance. Therefore, it is necessary to segment the blood vessels to reduce the spurious responses in DR lesion detection and segmentation. Furthermore, the analysis of vessel attributes such as thickness, diameter, and tortuosity helps ophthalmologists scrutinize DR and other eye-related diseases. However, manual assessment of the vasculature is a tedious task and necessitates expertise and experience [2, 3]. At the same time, early detection and timely treatment are crucial for reducing vision loss. Therefore, a reliable automated vessel extraction technique is required to analyze fundus images accurately. Several supervised techniques produce better accuracy but rely on a large labelled training dataset, which is computationally expensive and consumes large
memory. Meanwhile, unsupervised segmentation methods like thresholding [4], region growing [5], clustering [6], filtering [7], morphology, and model-based techniques are computationally efficient and do not require a trained dataset. However, the segmentation of thin arteries from the retinal background remains challenging even though significant advancements have been achieved. In addition, misclassified vessel pixels lead to false negatives, and undetected thin vessels lead to under-segmentation. The main focus of this paper is to overcome the above limitations by introducing a hybrid vessel segmentation technique based on an entropy measure combining the deep joint segmentation and sparse fuzzy c-means clustering algorithms. The deep joint iterative segmentation approach identifies the optimal segments based on the distance measure of the estimated region similarity and the deep points. The sparse representation of the FCM clustering technique further improves the clustering performance while reducing the proportion of noisy features. Finally, the entropy measure of the hybrid techniques produces the refined blood vessel segmentation results. The suggested technique can effectively segment the unconnected and low-contrast tiny blood vessels, which is considered a limitation of the existing methods. The remaining work is structured as follows: the review of the related research work is given in Sect. 2, Sect. 3 explains the employed methodology, Sect. 4 details the experimental requirements, and Sect. 5 presents the results and a detailed discussion. Finally, Sect. 6 summarizes the work with a brief conclusion.
2 Related Work Over the years, several methods have been employed for blood vessel segmentation in fundus images. A semantic-guided network with a multi-directional feature learning module was introduced in [8] for vessel segmentation, integrating dilated and strip convolutions. In [9], a feed-forward neural network with classical edge detection filters segments the retinal vessels with robust performance, thereby reducing spurious detection in vessel segmentation. Similarly, Li et al. [10] proposed a lightweight Convolutional Neural Network (CNN) to learn the features and segment the thin vessels from the retinal background. Even though these methods detect narrow and low-contrast blood vessels, they lead to high computational complexity. In addition, the hyperparameter tuning and huge memory requirements linked with training the model hamper the performance of the supervised methods. Sheng et al. [11] introduced an effective linear iterative clustering algorithm using spatial, colour, and distance features and a minimum spanning tree. Meanwhile, the authors in [12] developed a novel vessel detection technique using a maximum entropy-based expectation-maximization algorithm and achieved better performance. Saroj et al. [7] proposed an entropy-based optimal thresholding algorithm with a matched filter for vessel segmentation. Though these methods perform better in vessel detection, some vessel pixels are misclassified as non-vessels, leading to false negatives. An automated segmentation method combining Hessian-based vessel filtering and fuzzy entropy-based thresholding was introduced in [13], where
a multiscale linear filter-based Hessian matrix is applied for vessel enhancement. Although the thresholding-based methods are simple to implement, they ignore the spatial details between the pixels, resulting in an unconnected vascular structure. Memari et al. [6] segmented the blood vessels using a matched filter and the Fuzzy C-Means clustering technique. Alhussein et al. [14] proposed a simple unsupervised method combining a Hessian matrix and an intensity-based transformation. The limitation of these approaches is that the small thin vessels remain undetected, leading to under-segmentation. Even though the methods mentioned above achieved significant progress in vessel segmentation, the existing techniques are incapable of detecting the thin vessels from the background due to intensity inhomogeneity, low contrast, uneven illumination, and image blurring. In addition, most methods are sensitive to noise and require a post-processing step to reduce the noise effects. Other constraints that need to be addressed are under-segmentation, vessel connectivity, false detection of non-vascular structures, and computational complexity. Therefore, a new entropy-based segmentation approach was developed to overcome these limitations, incorporating sparse fuzzy C-means and deep joint segmentation.
3 Proposed Methodology The proposed vessel detection method includes two steps- preprocessing and segmentation. The preprocessing phase removes the noise, corrects illumination, and enhances the image contrast for further segmentation. The entropy measure of sparse FCM and deep joint segmentation yields the final segmented blood vessels of the retina for DR screening. The systematic flow of the developed model is shown in Fig. 1.
3.1 Preprocessing The colour fundus images are affected by noise, poor contrast, and uneven illumination [15]; therefore, they are preprocessed. First, median filtering eliminates noise while preserving the sharp features in the fundus image. Then, the non-uniform
Fig. 1 Workflow of the proposed method
illumination is corrected by computing the median around each pixel in the local neighbourhood and subtracting it from the original image. Finally, Contrast Limited Adaptive Histogram Equalization (CLAHE) is applied to the image’s lightness (L) channel to preserve the colour consistency. In addition, it enhances the contrast, uniformly equalizes the intensity, and reduces noise amplification.
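A minimal OpenCV sketch of this preprocessing pipeline is given below. The kernel sizes, the CLAHE clip limit, and the tile grid are illustrative assumptions, since the paper does not specify them; the snippet is a sketch of the described steps rather than the authors' implementation.

```python
import cv2

def preprocess_fundus(path, background_kernel=25):
    """Denoise, correct illumination and enhance contrast of a colour fundus image."""
    img = cv2.imread(path)
    img = cv2.medianBlur(img, 3)                       # noise removal, edges preserved
    # illumination correction: estimate the local median background and subtract it
    background = cv2.medianBlur(img, background_kernel)
    corrected = cv2.addWeighted(img, 1.0, background, -1.0, 128)
    # CLAHE applied only to the lightness (L) channel to keep colour consistency
    lab = cv2.cvtColor(corrected, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(enhanced, cv2.COLOR_LAB2BGR)
```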
3.2 Blood Vessel Segmentation Accurate retinal blood vessel segmentation can aid in the early detection of ocular diseases. Therefore, a hybrid segmentation approach using sparse FCM and Deep Joint segmentation is developed from which the entropy is measured to determine the blood vessel structure accurately.
3.2.1 Sparse Fuzzy C-means
The standard FCM algorithm has a considerable proportion of noisy features and often causes over-segmentation problems [16]. The main objective of sparse FCM is to achieve fuzzy membership with sparsity, which lowers the percentage of noisy features and enhances clustering ability. The sparse FCM algorithm [17] computes the cluster group considering the distance between the centroids and the data points. Further, L1-regularization, which maximizes the weighted between-cluster sum of squares (BCSS), is applied to obtain the sparse solution of weights. The sparse clustering framework is described as:

$$\max_{w,\,\phi(M)} \sum_{a=1}^{p} W_a\, b_a\big(U_a, \phi(M)\big) \quad \text{s.t.} \quad \|w\|_2 \le 1,\ \|w\|_1 \le f,\ W_a \ge 0 \tag{1}$$

where $w = (\omega_1, \omega_2, \ldots, \omega_p)$ represents the feature weights, $b_a(U_a, \phi(M))$ denotes a function related to the $a$-th feature, $U = (u_{ij}) \in A^{s \times p}$ defines the dataset with $s$ objects and $p$ features, $\phi(M)$ is the model parameter, $f$ is the tuning parameter that controls the relevant features, and $\|w\|_1 = \sum_{a=1}^{p} |W_a|$ signifies the L1-norm of $w$.
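The full sparse FCM of [17] alternates a weighted fuzzy clustering step with an L1-constrained update of the feature weights. The snippet below is only a minimal sketch of the first of those two steps, a fuzzy C-means iteration with fixed, user-supplied feature weights; the sparse weight update is omitted, and all names and parameter values are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def weighted_fcm(X, w, n_clusters=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Fuzzy C-means with fixed per-feature weights w (the L1-constrained
    sparse update of w from [17] is not implemented in this sketch)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    U = rng.dirichlet(np.ones(n_clusters), size=n)      # fuzzy memberships, rows sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]  # membership-weighted centroids
        # feature-weighted squared distances to every centroid
        d2 = (((X[:, None, :] - centers[None, :, :]) ** 2) * w).sum(axis=-1) + 1e-12
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)    # standard FCM membership update
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return U, centers

# toy usage: two well-separated 2-D clusters, equal feature weights
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5.0])
U, centers = weighted_fcm(X, w=np.array([0.5, 0.5]))
print(centers)
```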
3.2.2 Deep Joint Segmentation
Deep Joint segmentation [18] is an iterative procedure developed by connecting the grid segments in the image. At first, partition the input image into grids G with dimensions 2 × 2. Then, merge pixels within the grid provided a mean and the threshold value. The equation for joining the pixels is given as

$$G_k = \frac{\sum_{x=1}^{Q} P_x}{Q} \pm h \tag{2}$$

where $P_x$ denotes the pixels corresponding to the grid $G_k$, $Q$ denotes the number of pixels in the grid $G_k$, and $h$ indicates the threshold, assumed to be 1. Then, the similarities of the pixel intensity are considered for region fusion, such that for every grid only one grid point is selected, and the mean value should be less than 3. The region similarity is derived using Eq. (3)

$$R_k = \frac{\sum_{e=1}^{z} P_e^{y}}{S} \tag{3}$$

where $P_e^{y}$ represents the grouped pixels in the grid $G_k$ and $S$ represents the total grouped pixels. Meanwhile, the pixels having the minimum mean value are merged to find the mapped points. Finally, the deep points are derived as

$$DP = P_m + B_d \tag{4}$$

where $P_m$ represents the missed pixels such that $1 < m \le Y$, $Y$ is the total number of missed pixels, and $B_d$ represents the mapped pixels. Missed pixels are the pixels that are not within the threshold boundary. Mapped pixels are obtained by combining pixels with the lowest mean value. Next, the segmented points are randomly selected from the deep points, and the distance between them is evaluated to choose the new points. The newly segmented points are chosen using the shortest distance. The best-segmented points are then determined by analyzing the Mean Square Error (MSE) value. Until the stopping criteria are satisfied and the optimal segments are determined, the process is repeated.
3.2.3 Entropy Estimation
The results of sparse FCM and Deep Joint segmentation are given as $Z_1$ and $Z_2$. The neighbourhood values are obtained by computing the entropy of $Z_1$ and $Z_2$. The outcome is defined as in Eq. (5)

$$C_{ij} = \sum_{\eta=1}^{2} Z_{\eta}^{ij} N_{\eta}^{ij} \tag{5}$$

where $Z_{\eta}^{ij}$ denotes the segmentation result and $N_{\eta}^{ij}$ denotes the neighbourhood values estimated by calculating the entropy of $Z_1$ and $Z_2$. The outcome is obtained by comparing $C_{ij}$ with a threshold value $T$:

$$E = \begin{cases} 0, & C_{ij} > T \\ 1, & C_{ij} \le T \end{cases} \tag{6}$$
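A minimal sketch of this fusion step is shown below, assuming that the two inputs are binary masks and that the neighbourhood values are local Shannon entropies computed over a sliding window; the window size and the threshold T are illustrative choices, not values specified in the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_entropy(mask, size=9):
    """Shannon entropy of the foreground fraction inside a size x size window."""
    p = uniform_filter(mask.astype(float), size=size)
    p = np.clip(p, 1e-6, 1 - 1e-6)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def entropy_fusion(z1, z2, t=0.5):
    """Combine the sparse-FCM mask z1 and the deep-joint mask z2 as in Eqs. (5)-(6)."""
    c = z1 * local_entropy(z1) + z2 * local_entropy(z2)   # Eq. (5)
    return (c <= t).astype(np.uint8)                      # Eq. (6)
```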
4 Experimental Setup The proposed method implementation has been performed in Python. In addition, three public benchmark databases, DRIVE, STARE and DIARETDB1, are utilized to evaluate the algorithmic performance of the proposed approach.
4.1 Data Sets The DRIVE (Digital Retinal Images for Vessel Extraction) [19] is a public dataset with 40 colour fundus images. Only seven images show signs of mild DR, and the remaining 33 do not reveal any symptoms of DR. The set of 40 images, captured at 565 × 584 resolution with a 45° FOV, is equally distributed into 20 testing and 20 training images. The STARE (STructured Analysis of the Retina) [20] database comprises 20 colour fundus images, 10 of which reveal pathologic signs. The database is publicly available, and the images have a 35° FOV and a fixed resolution of 605 × 700. The DIARETDB1 [21] dataset is a publicly available standard retinopathy dataset that comprises 89 fundus images, of which five are normal and the remaining 84 show signs of DR. Each image in the database is captured with a 50° FOV and a resolution of 1500 × 1152.
4.2 Evaluation Metrics The segmentation performance is assessed using Accuracy, Sensitivity, Specificity, F1-Score, and Area Under the ROC Curve (AUC).

$$\text{Accuracy} = \frac{T_{pos} + T_{neg}}{T_{pos} + F_{pos} + T_{neg} + F_{neg}} \tag{7}$$

$$\text{Sensitivity} = \frac{T_{pos}}{T_{pos} + F_{neg}} \tag{8}$$

$$\text{Specificity} = \frac{T_{neg}}{F_{pos} + T_{neg}} \tag{9}$$

$$\text{F1-Score} = \frac{T_{pos}}{T_{pos} + \frac{1}{2}(F_{pos} + F_{neg})} \tag{10}$$

where $T_{pos}$, $T_{neg}$, $F_{pos}$ and $F_{neg}$ represent true positive, true negative, false positive, and false negative pixels, respectively.
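As a brief illustration, the four pixel-wise metrics of Eqs. (7)-(10) can be computed from a predicted mask and its ground truth as in the sketch below (this is not the authors' evaluation code).

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Pixel-wise metrics of Eqs. (7)-(10) for binary masks pred and gt."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)
    tn = np.sum(~pred & ~gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    return {
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "f1_score":    tp / (tp + 0.5 * (fp + fn)),
    }
```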
5 Results and Discussion The segmented outputs of the blood vessels obtained using the fundus images in the DRIVE, STARE and DIARETDB1 datasets are demonstrated in Figs. 2, 3 and 4. The images in the first column are taken from the datasets, the second column shows the preprocessed images, the third column shows the ground truth obtained from the corresponding datasets, and the fourth column shows the segmented results obtained by the proposed model. The performance of the suggested method evaluated on the public datasets is illustrated in Table 1, and the graphical analysis is shown in Fig. 5. The performance is analyzed using measures like Accuracy, Sensitivity, Specificity, F1-Score, and AUC. The comparative performance of the proposed and the existing techniques is depicted in Tables 2 and 3. The proposed entropy-based hybrid segmentation approach achieves 0.977 accuracy, 0.882 sensitivity, 0.985 specificity, 0.921 F1-score, and 0.987 AUC value on the DRIVE dataset. Moreover, the proposed method attains 0.979, 0.892, 0.989, 0.987, and 0.991 for accuracy, sensitivity, specificity, F1-score, and AUC on the STARE dataset. When evaluated on the DIARETDB1 dataset, the developed method produces 0.972 accuracy, 0.885 sensitivity, 0.965 specificity, 0.953 F1-score and 0.987 AUC value. The obtained segmentation results on all the evaluation metrics prove its efficiency in classifying the background and the vessel pixels. However, the authors of [10] and [13] produce equivalent results regarding accuracy and AUC on the DRIVE dataset. Similarly, on the STARE dataset, the authors of [13] and [22] obtained the same accuracy of 0.970. Although the accuracy, sensitivity, F1-score, and AUC values are comparatively higher than those of existing methods, the work presented in
Fig. 2 Segmentation result using DRIVE dataset. a Original image b Preprocessed image c Ground truth d Vessel segmented image by the proposed approach
Fig. 3 Segmentation result using STARE dataset. a Original image b Preprocessed image c Ground truth d Vessel segmented image by the proposed approach
Fig. 4 Segmentation result using DIARETDB1 dataset. a Original image b Preprocessed image c Vessel segmented image by the proposed approach
[22] produces a specificity value of 0.989, similar to the proposed approach in the STARE dataset. The authors in [9] produce a lower sensitivity value and fail to detect the small tiny vessel pixels from the fundus background. Even though the proposed technique could detect thin blood vessels and reveals better results when compared
Table 1 Performance of the proposed method on DRIVE, STARE and DIARETDB1 datasets

Dataset      Accuracy  Sensitivity  Specificity  F1-Score  AUC
DRIVE        0.977     0.882        0.985        0.921     0.987
STARE        0.979     0.892        0.989        0.987     0.992
DIARETDB1    0.972     0.885        0.965        0.953     0.987
with state-of-the-art techniques, further improvement in performance is needed for clinically assisted diagnosis.
Fig. 5 Analysis of the proposed method on DRIVE, STARE and DIARETDB1 datasets
Table 2 Comparative performance of the proposed and existing methods on the DRIVE dataset

Methods              Accuracy  Sensitivity  Specificity  F1-Score  AUC
Khowaja et al. [2]   0.975     0.817        0.971        0.799     –
Tchinda et al. [9]   0.948     0.735        0.977        –         0.967
Li et al. [10]       0.956     0.792        0.981        –         0.980
Wang et al. [13]     0.956     0.807        0.978        0.825     0.980
Zou et al. [22]      0.951     0.776        0.979        0.813     –
Jiang et al. [23]    0.970     0.835        0.983        0.832     –
Proposed method      0.977     0.882        0.985        0.921     0.987
Table 3 Comparative performance of proposed and existing methods on the STARE dataset

Methods              Accuracy  Sensitivity  Specificity  F1-Score  AUC
Khowaja et al. [2]   0.975     0.824        0.975        0.798     –
Tchinda et al. [9]   0.954     0.726        0.976        –         0.968
Li et al. [10]       0.967     0.835        0.982        –         0.987
Wang et al. [13]     0.970     0.843        0.984        0.852     0.982
Zou et al. [22]      0.970     0.812        0.989        0.855     –
Jiang et al. [23]    0.976     0.879        0.985        0.830     –
Proposed method      0.979     0.892        0.989        0.987     0.991
6 Conclusion Segmenting retinal vessels is crucial in diagnosing eye illnesses such as diabetic retinopathy and hypertension. The extraction of vessels from the fundus images helps ophthalmologists to observe the vessel structure and detect the pathologies. The suggested entropy-based hybrid segmentation approach segments the tiny retinal capillaries accurately and efficiently. The experimental results reveal that the proposed method achieves higher accuracy than the existing vessel segmentation methods. Besides, the developed method captures more discriminative representations of the small blood vessels. However, a more reliable and generalized method for vessel segmentation will be studied in future work, focusing on measuring the vessel structures and determining the location of vessel bifurcations, which can assist clinicians in analyzing the images.
References 1. Khan KB, Khaliq AA, Jalil A, Iftikhar MA, Ullah N, Aziz MW, Ullah K, Shahid M (2019) A review of retinal blood vessels extraction techniques: challenges, taxonomy, and future trends. Pattern Anal Appl 22(3):767–802 2. Khowaja SA, Khuwaja P, Ismaili IA (2019) A framework for retinal vessel segmentation from fundus images using hybrid feature set and hierarchical classification. Signal, Image Video Process 13(2):379–387 3. Zhang J, Dashtbozorg B, Bekkers E, Pluim JPW, Duits R, Ter Haar Romeny BM (2016) Robust retinal vessel segmentation via locally adaptive derivative frames in orientation scores. IEEE Trans Med Imaging 35(12):2631–2644. 4. Wiharto, Palgunadi YS (2019) Blood vessels segmentation in retinal fundus image using hybrid method of frangi filter, otsu thresholding and morphology. Int J Adv Comput Sci Appl 10(6):417–422 (2019). 5. Rodrigues EO, Conci A, Liatsis P (2020) ELEMENT: Multi-modal retinal vessel segmentation based on a coupled region growing and machine learning approach. IEEE J Biomed Heal Informatics 24(12):3507–3519 6. Memari N, Ramli AR, Saripan MIB, Mashohor S, Moghbel M (2019) Retinal blood vessel segmentation by using matched filtering and fuzzy C-means clustering with integrated level set method for diabetic retinopathy assessment. J Med Biol Eng 39(5):713–731.
7. Saroj SK, Kumar R, Singh NP (2020) Fréchet PDF based matched filter approach for retinal blood vessels segmentation. Comput Methods Programs Biomed 194:105490 8. Guo S (2022) CSGNet: cascade semantic guided net for retinal vessel segmentation. Biomed Signal Process Control 78:103930 9. Saha Tchinda B, Tchiotsop D, Noubom M, Louis-Dorr V, Wolf D (2021) Retinal blood vessels segmentation using classical edge detection filters and the neural network. Informatics Med Unlocked 23:100521 10. Li X, Jiang Y, Li M, Yin S (2021) Lightweight attention convolutional neural network for retinal vessel image segmentation. IEEE Trans Ind Informatics 17(3):1958–1967 11. Sheng B, Li P, Mo S, Li H, Hou X, Wu Q, Qin J, Fang R, Feng DD (2019) Retinal vessel segmentation using minimum spanning superpixel tree detector. IEEE Trans Cybern 49(7):2707–2719 12. Jainish GR, Jiji GW, Infant PA (2020) A novel automatic retinal vessel extraction using maximum entropy based EM algorithm. Multimed Tools Appl 79:22337–22353 13. Wang H, Jiang Y, Jiang X, Wu J, Yang X (2018) Automatic vessel segmentation on fundus images using vessel filtering and fuzzy entropy. Soft Comput 22(5):1501–1509 14. Alhussein M, Aurangzeb K, Haider SI (2020) An unsupervised retinal vessel segmentation using hessian and intensity based approach. IEEE Access 8:165056–165070 15. Mary Dayana A, Sam Emmanuel WR (2020) A patch—based analysis for retinal lesion segmentation with deep neural networks. In: Lecture Notes on Data Engineering and Communications Technologies. pp. 677–685. 16. Jia X, Lei T, Du X, Liu S, Meng H, Nandi AK (2020) Robust self-sparse fuzzy clustering for image segmentation. IEEE Access 8:146182–146195 17. Chang X, Wang Q, Liu Y, Wang Y (2017) Sparse regularization in fuzzy C-means for highdimensional data clustering. IEEE Trans Cybern 47(9):2616–2627 18. Michael Mahesh K, Arokia Renjit J (2020) DeepJoint segmentation for the classification of severity-levels of glioma tumour using multimodal MRI images. IET Image Process 14(11):2541–2552 19. Staal J, Member A, Abràmoff MD, Niemeijer M, Viergever MA, Ginneken B. Van, Member A, Detection AR (2004) Ridge-based vessel segmentation in color images of the retina. IEEE Trans Med Imaging 23(4):501–509 20. Hoover A, Kouznetsova V, Goldbaum M (2000) Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response. IEEE Trans Med Imaging 19:203–210 21. DIARETDB1 - Standard Diabetic Retinopathy Database. https://www.it.lut.fi/project/ima geret/diaretdb1/index.html. Accessed 17 June 2020. 22. Zou B, Dai Y, He Q, Zhu C, Liu G, Su Y, Tang R (2021) Multi-label classification scheme based on local regression for retinal vessel segmentation. IEEE/ACM Trans Comput Biol Bioinforma 18(6):2586–2597 23. Jiang Y, Liang J, Cheng T, Zhang Y, Lin X (2022) MCPANet: multiscale cross-position attention network for retinal vessel image segmentation. Symmetry (Basel) 14:1357.
Chapter 4
Facial Recognition Approach: As per the Trend of 2022–23 Using Python Basetty Mallikarjuna , Aditi Uniyal, Samyak Jain, Bharat Bhushan Naib, and Amit Kumar Goel
1 Introduction Both industry and academia are looking for new trends in face recognition with the Python programming language in the year 2022-23 [1]. Face recognition problems have existed since the 1960s, but how to precisely and effectively identify individuals has long been a fascinating issue [2]. Facial recognition has gained more and more attention in recent years, thanks to the rapid development of artificial intelligence and machine learning. Face recognition provides several advantages over standard recognition systems such as fingerprint recognition and iris recognition, including non-contact operation, high concurrency, and user-friendliness [3]. It has a lot of promise in governance, public facilities, security, e-commerce, shopping, education, and many other areas [4]. In earlier frameworks, face recognition was carried out manually, which required considerable time and effort [5]. Because of manual check-ups and the need to observe large numbers of people in crime investigation and police verification, people face hardships and obstructions in such frameworks [6]. In the current circumstances, this work is done simply and is easy to implement in crime investigation. Various public places operate cameras for video surveillance, and collecting these data and verifying them manually is an enormous process [7]. A particularly challenging case for face recognition systems is that of identical twins [8]. Work on facial recognition tools such as the AdaBoost and Haar cascade algorithms started in the 1990s. These algorithms are implemented in Python, and this remains a significant research gap, as shown in Fig. 1.
B. Mallikarjuna (B) · A. Uniyal · S. Jain · B. B. Naib · A. K. Goel School of Computing Science and Engineering, Galgotias University, Gautham Budha Nagar, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_4
Fig. 1 Facial recognition algorithms published in IEEE and ACM journals
AI and ML algorithms have solved many real-world problems, for example in smart cards, and digital image processing is a fast-developing discipline with increasing academic and industrial research [9]. Image processing is one prominent research area that supports AI and ML, and it is the ultimate technology that can fulfil all living beings' visual functions [10]. Before facial recognition, most of the literature starts with biometrics.
2 Biometrics Biometrics is a very special and observable trait of a human being that may be used to instantaneously identify and confirm a person's identity [11]. Both physiological and behavioural aspects of biometrics can be quantified. Physiological biometrics are measurements and data produced directly from a component of the human body [12]. Behavioural biometric measurements are supplied from voice-scan, signature-scan, and keystroke-scan [13]. The following are the objectives of biometrics and face recognition: • Use the current hardware infrastructure, cameras, and image-capturing software; the devices will function normally [14]. • It does not affect the user's physical body. • It is precise and allows for high enrolment and verification rates [15]. • The outcome of the comparison does not need to be interpreted by an expert. • It is the only biometric that enables passive identification in a single step. There are many situations where it can be deployed in the public sector for identification (for example, identifying a terrorist in crowded places such as airport terminals) [16].
In this article, Sect. 2 describes biometrics, Sect. 3 reviews the related work on various face recognition techniques, the methodology is explained in Sect. 4, the current features of the face recognition system and the computational approach are addressed through the results in Sect. 5, and Sect. 6 presents the conclusion, followed by the references.
3 Related Work Since the 1960s, different face recognition approaches have been developed that identify different attributes of image objects such as the eyes, chin, lips, ears, and mouth [1]. In 1970, Goldstein and Harmon developed 21 theoretical guidelines covering, for instance, identification, automatic recognition, and verification [2]. In 1988, Kirby and Sirovich used a standard linear-algebra-based numerical technique for the recognition of faces [3]. The following applications of face recognition have been identified:
i. Biometrics: Face detection is frequently employed as part of, or together with, a facial recognition system. It is also utilized in video surveillance, human-computer interfaces, and managing picture databases [4].
ii. Photography: Face detection is used for autofocus in several contemporary digital cameras. Face detection may also be used to identify places of interest in photo slideshows using the Ken Burns pan-and-scale effect. Smile detection is frequently used in modern products to capture a shot at the right time [5].
iii. Lip Reading: Face detection is vital in the process of inferring speech from visual information. Automated lip reading may be utilized to help computers figure out who is talking [6].
Image import is a crucial process: the software code reads the image and makes sure the image contains one face so that the software can detect it. There are various other biometric techniques such as iris-scan, retina-scan, and hand-scan [7]. Yang, Kriegman, and Ahuja presented a categorization of face detection systems. Face detection techniques may fall into several classifications; the two major categories are as follows:
i. Knowledge-Based: The knowledge-based method recognizes faces based on a set of rules derived from human knowledge. A nose, eyes, and mouth, for example, must be present at specific distances, angles, and locations from one another for a face to be complete [8]. A key downside of these systems is the difficulty of developing an acceptable set of rules: there may be many false positives if the criteria are either too broad or too detailed. This method is insufficient for locating several faces in a huge number of photographs [9].
ii. Feature-Based: This technique extracts basic structural elements of any given face. After being trained as a classifier, it is used to distinguish between facial and non-facial areas. The goal is to go beyond our immediate understanding of faces [10].
Fig. 2 Haar feature selection
Paul Viola and Michael Jones suggested an effective object detection approach utilizing Haar feature-based cascade classifiers, described in their work "Rapid Object Detection using a Boosted Cascade of Simple Features" in 2001 [11]. To train the classifier, the method requires a huge number and variety of positive images (images with faces) and negative images (images without faces), from which the features must then be extracted. The components of the face detection algorithm are as follows:
iii. Haar Feature Selection: In the Haar feature selection method, the eye region is darker than the upper cheeks, the nasal bridge is brighter than the eyes, and there are definite locations for the eyes, mouth, and nose. The two-rectangle feature is the difference between the sums of the pixels within two rectangular regions and is mostly used to identify edges (a, b). The three-rectangle feature subtracts the sum of two outer rectangles from that of a central rectangle and is mostly used to identify lines (c, d), as shown in Fig. 2 [12].
iv. Haar-like features: A Haar-like feature considers adjacent rectangular regions at one point in a detection window, sums the pixel intensities in each region, and computes the difference between them. This difference is then used to categorize subsections of a photograph. For example, for a human face, it is a common notion that the region of the eyes is considerably darker than the surrounding regions of the face [13].
v. Edge features: These detect edges relative to their surroundings. Consider the brow and the eyes/eyebrows while recognizing someone's face; the brow is a smooth, exposed area of the face [14].
vi. Line features: The region around the nostrils is normally turned away from the light, making it darker. Another fascinating way that line features are being used is in eye-tracking technology [15].
vii. Center-surrounded features: These track down diagonal lines and features in a picture and work best at a small scale. Depending on the lighting, they can pick out the edges of the chin, jaw, wrinkles, and so forth [16].
viii. AdaBoost: Yoav Freund and Robert Schapire developed AdaBoost, also known as Adaptive Boosting, which is a statistical classification meta-algorithm
[17]. It may be applied to a variety of learning algorithms to improve performance [18]. AdaBoost is adaptive in the sense that subsequent weak learners are adjusted in favour of instances misclassified by previous classifiers. In certain cases, it is less susceptible to the overfitting problem than other learning algorithms. The strong classifier is

$$C(x) = \begin{cases} 1, & \text{if } \sum_{t} \alpha_t h_t(x) \ge \frac{1}{2}\sum_{t} \alpha_t \\ 0, & \text{otherwise} \end{cases}$$

• Weak classifiers are consolidated in a weighted sum to form a strong classifier. Given example images $(x_1, y_1), \ldots, (x_n, y_n)$, where $y_i = 0, 1$ for negative and positive examples respectively.
• Initialize the weights $w_{1,i}$ for $y_i = 0, 1$, where $m$ and $l$ are the numbers of negative and positive examples.
The pseudo code of the AdaBoost algorithm is as follows [19].
Pseudo code 1: AdaBoost algorithm [20]
For $t = 1, \ldots, T$:
  Normalize the weights, $w_{t,i} \leftarrow w_{t,i} / \sum_{j} w_{t,j}$.
  Select the weak classifier $h(x) = h(x, f, p, \theta)$, where $f$, $p$, and $\theta$ are the minimizers of the weighted error.
  Update the weights, where $e_i = 0$ if example $x_i$ is classified correctly and $e_i = 1$ otherwise.
The final strong classifier is

$$C(x) = \begin{cases} 1, & \text{if } \sum_{t} \alpha_t h_t(x) \ge \frac{1}{2}\sum_{t} \alpha_t \\ 0, & \text{otherwise} \end{cases}$$
where $\alpha_t = \log(1/\beta_t)$. The aforementioned algorithms have existed in the past; the methodology for the current 2022-23 trend is as follows.
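As a brief, hedged illustration of the boosting idea described above (not the authors' code), scikit-learn's AdaBoostClassifier combines many weak decision stumps into a weighted strong classifier of the form C(x). The toy data below merely stands in for Haar-feature vectors labelled face/non-face; all values are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for Haar-feature vectors with face / non-face labels.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The default weak learner is a depth-1 decision tree (a "stump"); AdaBoost
# re-weights misclassified samples and combines the stumps into a strong classifier.
clf = AdaBoostClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```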
4 Methodology With this approach it is easy to perform face detection, and the approach follows the steps below.
Algorithm 1: Facial recognition approach
Step 1: The detectMultiScale function of the face cascade recognizes the face objects, as in the following Python code.
print("Found {0} faces!".format(len(faces)))  # report the number of detected faces
for (x, y, k, l) in faces:
Fig. 3 The cascade function flowchart to display the face image
cv2.rectangle(image, (x, y), (x+k, y+l), (0, 255, 0), 2)  # draws a rectangle around each detected face, using the BGR colour (0, 255, 0)
Step 2: By default, detection is performed on the grayscale image.
Step 3: The scale factor compensates for how close the face object is to the camera: a face close to the camera appears bigger, while a face farther from the camera appears smaller.
Step 4: The face detection algorithm uses a rectangle that covers the face objects. The minNeighbors parameter specifies how many neighbouring detections a candidate window must gather, and the minSize parameter specifies the minimum size of the detection window. The Haar cascade function displays the face as shown in Fig. 3. The Haar cascade method recognizes or verifies a person's identity by checking their face. Face recognition is software that can recognize people in photos and videos, and it is most necessary and useful during police and criminal investigations. The detector can be run from the command line as follows:
$ python face_detect.py abba.png haarcascade_frontalface_default.xml
Pseudo code 2: Haar cascade algorithm
The user sets the target values: the maximum acceptable false positive rate and the minimum acceptable detection rate per layer. Each stage of the cascade contains three kinds of parameters: the features, the threshold value of each stage, and the stage classifier. At first:
4 Facial Recognition Approach: As per the Trend of 2022–23 Using Python
37
Fig. 4 Face recognition approach in crime investigation
ni = 0 ; ni = ni + 1 i = i+1 • Using the use of AdaBoost, develop a classifier with the ni feature using P and N. • Reduce the threshold value of the cascaded classifier in accordance with the detection rate. N=0 d = di – If gi >gtarget, then in a list of non-face photos, assess the current cascaded detector value and add any false detections to the set N. There are two main ways of comparing face recognition used by police in crime investigation as shown in Fig. 4. • Verification: Here, the system compares the individual to who they think they are and responds with a yes or no. • Identification: The system compares the individual in question to the rest of the system’s users. The database returns a ranked list of matches. All identification and authentication methods operate in the four steps listed below i. ii. iii. iv. v.
Capture—Throughout registration and, more typically, during the identification or verification procedure, the device gathers a physical or behavioral sample. Extraction—Specific data is extracted from the sample, and a prototype is created. Enrollment Module−A computerized system that scans and records a digital or analogue image of a living person’s personal features. Database−A separate entity that manages the compression, processing, storage, and compression of collected and stored data. Identification Module—It is a last module, the Identification Module, which communicates with the application system.
38
B. Mallikarjuna et al.
5 Results and Discussion Face recognition, as per the new trend of 2022-2023, is simple to implement using OpenCV, the most popular computer vision library, written in C/C++ and used here from the Python platform. The execution of this project uses the hardware and software configuration shown in Table 1. The user-supplied values are read by the following lines of code:
imagePath = sys.argv[1]   # get the image path from the user
cascPath = sys.argv[2]    # get the cascade file path from the user
faceCascade = cv2.CascadeClassifier(cascPath)   # load the cascade, which contains the data needed to detect faces
A consolidated, runnable version of this snippet is sketched after Table 1. The authors' faces detected by the program are shown in the snapshot in Fig. 5. The advantages of this approach are as follows:
• Face recognition programs are numerous and have gained social acceptance; anyone can take a photo of themselves and use this work.
• The programming is easy to use, and anyone can operate it simply, even without specialist knowledge.
The disadvantage of this approach is as follows:

Table 1 Hardware and software configurations

Operating system   Windows 10 advanced version 10.0.22000.708
Processor          Windows-supported Intel and AMD processor, x64/x86
Hard disk          500 GB
RAM                8 GB
Software           Python 3.10.0 with OpenCV (32/64-bit as appropriate); type >>> import cv2 — if there are no errors, the installation succeeded and we can proceed to write the code
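The fragments above can be assembled into the following minimal, runnable sketch of the detection pipeline of Algorithm 1. The particular scaleFactor, minNeighbors, and minSize values, the output file name, and the fallback to OpenCV's bundled cascade file are illustrative assumptions rather than settings prescribed by the authors.

```python
import sys
import cv2

# Usage: python face_detect.py <image> [cascade.xml]
image_path = sys.argv[1]
cascade_path = sys.argv[2] if len(sys.argv) > 2 else \
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"

face_cascade = cv2.CascadeClassifier(cascade_path)
image = cv2.imread(image_path)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)   # detection runs on the grayscale image

faces = face_cascade.detectMultiScale(
    gray,
    scaleFactor=1.1,     # image pyramid step
    minNeighbors=5,      # neighbouring detections a candidate window must gather
    minSize=(30, 30),    # smallest face window considered
)

print("Found {0} faces!".format(len(faces)))
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("faces_detected.png", image)
```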
Fig. 5 Authors images shown in the python platform (Authors Own figures of this paper)
• This facial recognition system cannot distinguish between identical twins.
6 Conclusion This paper explained the facial recognition software as per the current trend of the year 2022−2023 methodology using Python programming language. This software is typically coupled with face recognition and is simple to operate. The basic technology has improved, and equipment costs have decreased significantly as a result of the variance and rising computer power. Face recognition software is becoming more cost-effective, trustworthy, and accurate. As a consequence, there are no technical or financial impediments to expanding the pilot to a full deployment.
References 1. Larrain T, Bernhard JS, Mery D, Bowyer KW (2017) Face recognition using sparse fingerprint classification algorithm. IEEE Trans Inf Forensics Secur 12(7):1646–1657 2. Mallikarjuna B, Sathish K, Venkata Krishna P, Viswanathan R (2021) The effective SVM-based binary prediction of ground water table. Evol Intell 14(2): 779−787. 3. Mallikarjuna B, Ramanaiah KV, Nagaraju A, Rajendraprasad V (2013) Face detection and recognizing object category in boosting framework using genetic algorithms. Int J Comput Sci Artif Intell 3(3):87 4. Sukmandhani AA, Sutedja I (2019) Face recognition method for online exams. In: 2019 International Conference on Information Management and Technology (ICIMTech). IEEE, vol 1, pp 175−179. 5. Basetty M, Ramanaiah KV, Mohanaiah P, Reddy VV (2012) Recognizing human-object using genetic algorithm for face detection in natural driving environment. I-Manager’s J Softw Eng 7(2): 10. 6. Altameem A, Mallikarjuna B, Saudagar AKJ, Sharma M, Poonia RC (2022) Improvement of automatic glioma brain tumor detection using deep convolutional neural networks. J Comput Biol. 7. Mokhayeri F, Granger E (2019) Video face recognition using siamese networks with blocksparsity matching. IEEE Trans Biom, Behav, Identity Sci 2(2):133–144 8. Mallikarjuna B, Shrivastava G, Sharma M (2022) Blockchain technology: a DNN token-based approach in healthcare and COVID-19 to generate extracted data. Expert Syst 39(3):e12778 9. Dhiman, G (2020) An innovative approach for face recognition using raspberry Pi. Artif Intell Evol 102−107 10. Mallikarjuna B, Viswanathan R, Naib BB (2019) Feedback-based gait identification using deep neural network classification. J Crit Rev 7(4): 2020. 11. Mallikarjuna B, Addanke S, Anusha DJ (2022) An improved deep learning algorithm for diabetes prediction. In: Handbook of Research on Advances in Data Analytics and Complex Communication Networks. IGI Global, pp. 103−119. 12. Mallikarjuna B, Addanke S, Sabharwal M (2022) An Iimproved model for house price/land price prediction using deep Llearning. In: Handbook of Research on Advances in Data Analytics and Complex Communication Networks. IGI Global, pp. 76−87. 13. Deeba F, Memon H, Dharejo FA, Ahmed A, Ghaffar A (2019) LBPH-based enhanced real-time face recognition. Int J Adv Comput Sci Appl 10(5).
14. Phornchaicharoen A, Padungweang P (2019) Face recognition using transferred deep learning for feature extraction. In: 2019 Joint International Conference on Digital Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT-NCON). IEEE, pp 304−309. 15. Mallikarjuna B, Reddy DAK (2019) Healthcare application development in mobile and cloud environments. In: Internet of things and personalized healthcare systems. Springer, Singapore, pp 93−103. 16. Khan M, Chakraborty S, Astya R, Khepra S (2019) Face detection and recognition using OpenCV. In: 2019 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS). IEEE, pp 116−119. 17. Mallikarjuna B (2022) The effective tasks management of workflows inspired by NIM-game strategy in smart grid environment. Int J Power Energy Convers 13(1):24–47 18. Mallikarjuna B (2022) An effective management of scheduling-tasks by using MPP and MAP in smart grid. Int J Power Energy Convers 13(1):99–116 19. Mallikarjuna B (2020) Feedback-based fuzzy resource management in IoT-based-cloud. Int J Fog Comput (IJFC) 3(1):1–21 20. Mallikarjuna B (2022) Feedback-based resource utilization for smart home automation in fog assistance IoT-based cloud. In: Research Anthology on Cross-Disciplinary Designs and Applications of Automation. IGI Global, pp 803−824.
Chapter 5
Parametric Optimization of Friction Welding on 15CDV6 Aerospace Steel Rods Using Particle Swarm Algorithm P. Anchana
and P. M. Ajith
1 Introduction Friction welding is a solid-state joining method in which the joint is formed by the plastic deformation of the materials between one stationary and other spinning work parts. Friction welding has many advantages over other fusion welding procedures because sufficient strength may be achieved without melting the materials. Several welding parameters such as friction pressure, upsetting pressure, speed of rotation, friction time, upsetting time, burn-off length, and properties of materials were significantly affecting the quality of the joint. The quality of the joint can be assured by selecting the optimum level of input parameters [1]. 15CDV6 is a high-strength low alloy steel, which is widely used in aerospace industries. This steel has a very high strength-to-weight ratio along with good toughness and weldability [2]. Particle swarm optimization is a numerical technique that is randomized. The swarm population evolves by a process in which a finite number of individuals or particles travel around the search space in seek of the best answer. Each particle’s mobility is determined by its own experience as well as the activity of its neighbors. In this technique, each particle tracks all of its movements, and the global optimum is determined by selecting the best particle in its sector. A weighted acceleration is modified at each time step [3]. Ajith et al. [4] used the particle swarm optimization method to improve the friction welding parameters of duplex stainless steel. Friction pressure, upsetting pressure, P. Anchana (B) APJ Abdul Kalam Technological University, Trivandrum-695 016, Trivandrum, Kerala, India e-mail: [email protected] P. Anchana · P. M. Ajith Department of Mechanical Engineering, College of Engineering Trivandrum-695016, Trivandrum, Kerala, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_5
speed, and burn-off length were selected as the input parameters and tensile strength and hardness were considered as the response parameters for the multiobjective optimization. Sathiya et al. [5] optimized the tensile strength and material loss using genetic algorithm, simulated annealing (SA) and particle swarm optimization. Senapati and Bhol [6] calculated the best grain size value for maximum tensile strength using PSO. Katherasan et al. [7] used artificial neural networks (ANN) to simulate weld bead shape in the flux cored arc welding (FCAW) process, and particle swarm optimization was used to find the best parameters. Taguchi L25 orthogonal array was selected for experimental design. Malviya and Pratihar [8] designed PSO to investigate several neural networks intended to conduct input-output modelling of the MIG welding process in both forward and backward directions. Choudhary et al. [9] conducted a hybrid PSO-GA algorithm to obtain the best submerged arc welding parameters. Shojaeefard et al. [10] developed an ANN model to study the relationships between input and output variables and MOPSO was used to compute pareto optimal non-dominated solutions. Tagimalek et al. [11] used the ANN, ICA, and PSO methods to investigate the interactions and primary effects of welding factors on the mechanical properties of Al5083 and pure copper dissimilar welds. Ambekar and Kittur [12] used Taguchi orthogonal array L8 friction welding experiments and a combined WPCA-ANN-PSO technique to perform MRO of friction stir welding operations. Anand et al. [13] investigated the friction welding process of 15CDV6 by using a coupled optimization of ANN and other techniques for modelling and obtaining input output relationship in both forward and reverse directions for automation purpose. ANFIS model was explained by Chandrasekhar et al. [14] to investigate welding process parameters such as current, voltage, and torch speed in order to produce optimal weld bead shape features such as depth of penetration, bead width, and HAZ width. The ANFIS model in GA was utilized during A-TIG welding of reducedactivated ferritic-martensitic steels to evaluate the objective function and discover the optimal option for achieving target weld bead shape and HAZ width. Srinivasan et al. [15] conducted modeling and optimization of process parameters in Tungsten Inert Gas welding (TIG) of 15CDV6 alloy steel. The primary objective of this paper is to optimize the ultimate tensile strength and micro hardness of friction weld of 15CDV6 alloy steel rods by selecting optimal process parameters using PSO.
2 Experimental Procedure 15CDV6 alloy steel rods were selected as the base material for the friction welding experiments. Each rod has a length of 70 mm and a diameter of 16 mm. Figure 1 shows the experimental setup.
Fig. 1 Experimental setup of friction welding
Table 1 Taguchi L9 orthogonal array and performance results

Experiment number  Friction pressure (MPa)  Upsetting pressure (MPa)  Speed (rpm)  Friction time (s)  Ultimate tensile strength (MPa)  Hardness (HV)
1                  10                       10                        1000         6                  1280                             392
2                  10                       50                        1500         8                  1282                             395
3                  10                       90                        2000         10                 1285                             398
4                  50                       10                        1500         10                 1292                             401
5                  50                       50                        2000         6                  1298                             403
6                  50                       90                        1000         8                  1300                             405
7                  90                       10                        2000         8                  1305                             403
8                  90                       50                        1000         10                 1318                             410
9                  90                       90                        1500         6                  1320                             411
The experiment was designed based on Taguchi’s L9 orthogonal array. Three levels of four friction welding parameters such as friction pressure, upsetting pressure, speed of rotation, and friction time were selected for friction welding experiments. The ultimate tensile strength and microhardness were measured after FW of 15CDV6 alloy steel. Universal tensile testing machine and Vicker’s micro hardness tester were used for measuring ultimate tensile strength and micro hardness respectively. Table 1 represents the experimental runs with performance results.
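The main-effect regression reported later in Eq. (1) can be reproduced from the Table 1 data with an ordinary least-squares fit. The short sketch below is only an illustration of that relationship and is not part of the authors' workflow.

```python
import numpy as np

# L9 runs from Table 1: columns are [FP, UP, S, FT]; response is UTS in MPa.
X = np.array([[10, 10, 1000, 6], [10, 50, 1500, 8], [10, 90, 2000, 10],
              [50, 10, 1500, 10], [50, 50, 2000, 6], [50, 90, 1000, 8],
              [90, 10, 2000, 8], [90, 50, 1000, 10], [90, 90, 1500, 6]], float)
uts = np.array([1280, 1282, 1285, 1292, 1298, 1300, 1305, 1318, 1320], float)

A = np.column_stack([np.ones(len(X)), X])          # intercept plus four main effects
coef, *_ = np.linalg.lstsq(A, uts, rcond=None)
print("coefficients [intercept, FP, UP, S, FT]:", np.round(coef, 5))
```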
3 Optimization Technique The PSO algorithm begins its search and provides precise solutions by continuously updating generations through iterations. It is based on the intelligence strategy used by animals or birds in groups. PSO incorporates the food-searching mechanism
Fig. 2 Flow chart of PSO [16]
utilized by swarms. The optimization problem is defined as selecting the optimal answer from a set of potential solutions [16]. The procedure involved in the PSO algorithm is described in Fig. 2.
4 Results and Discussions The objective functions for the optimization were designed based on the L9 orthogonal array and are given in Eqs. (1) and (2):

$$\text{UTS} = 1278.94 + 0.4000\,\text{FP} + 0.1167\,\text{UP} - 0.00333\,\text{S} - 0.250\,\text{FT} \tag{1}$$

$$\text{HV} = 389.63 + 0.1625\,\text{FP} + 0.0750\,\text{UP} - 0.00100\,\text{S} + 0.25\,\text{FT} \tag{2}$$
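The following is a minimal NumPy sketch of PSO maximizing the UTS model of Eq. (1) with the swarm settings of Table 4. The parameter bounds are assumed from the factor levels of Table 1, and the linearly decreasing inertia weight is an implementation choice, so this is an illustration rather than the authors' MATLAB code.

```python
import numpy as np

def uts(x):
    """Regression model of Eq. (1): UTS as a function of [FP, UP, S, FT]."""
    fp, up, s, ft = x
    return 1278.94 + 0.4000 * fp + 0.1167 * up - 0.00333 * s - 0.250 * ft

# Bounds assumed from the L9 factor levels in Table 1.
lb = np.array([10.0, 10.0, 1000.0, 6.0])
ub = np.array([90.0, 90.0, 2000.0, 10.0])

rng = np.random.default_rng(0)
n_particles, n_iter = 300, 100                     # Table 4 settings
w_max, w_min, c1, c2 = 0.9, 0.4, 2.05, 2.05

pos = rng.uniform(lb, ub, size=(n_particles, 4))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_val = np.array([uts(p) for p in pos])
gbest = pbest[np.argmax(pbest_val)].copy()

for t in range(n_iter):
    w = w_max - (w_max - w_min) * t / (n_iter - 1)  # linearly decreasing inertia weight
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lb, ub)                # keep particles inside the bounds
    vals = np.array([uts(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[np.argmax(pbest_val)].copy()

print("best parameters [FP, UP, S, FT]:", gbest)
print("predicted UTS (MPa):", uts(gbest))
```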
Analysis of variance (ANOVA) was conducted to validate the accuracy and significance of regression model. Tables 2 and 3 explain the ANOVA charts for UTS and micro hardness respectively. The coefficient of determination of (R2 ) determines the strength of the model and high value indicates that the model is admissible for
Table 2 ANOVA table for ultimate tensile strength

Source                   DF  Adj SS   Adj MS   F-value  P-value
Regression               4   1684.83  421.21   45.88    0.001
FP (Friction pressure)   1   1536.00  1536.00  167.31   0.000
UP (Upsetting pressure)  1   130.67   130.67   14.23    0.020
S (Speed of rotation)    1   16.67    16.67    1.82     0.249
FT (Friction time)       1   1.50     1.50     0.16     0.707
Error                    4   36.72    9.18
Total                    8   1721.56
Table 3 ANOVA table for microhardness

Source                   DF  Adj SS   Adj MS   F-value  P-value
Regression               4   310.500  77.625   27.00    0.004
FP (Friction pressure)   1   253.500  253.500  88.17    0.001
UP (Upsetting pressure)  1   54.000   54.000   18.78    0.012
S (Speed of rotation)    1   1.500    1.500    0.52     0.510
FT (Friction time)       1   1.500    1.500    0.52     0.510
Error                    4   11.500   2.875
Total                    8   322.000
the future work. In this case, the R² value for the UTS model is 97.87% and that of the microhardness model is 96.43%. The optimization algorithm was coded in MATLAB®. Figures 3 and 4 depict the predicted tensile strength and microhardness of the welded joints. Table 4 describes the input parameters required to start the PSO algorithm, such as the number of agent particles, iterations, maximum and minimum permitted inertia weights, and learning rates. After 500 iterations, the global best values for tensile strength and microhardness were obtained as 1319.9 MPa and 412.2556 HV, respectively.
5 Confirmation Test The friction welding was conducted based on the optimum parameters in order to verify the best outputs resulting from the PSO algorithm. Table 5 depicts the confirmation results. It is very clear that the results obtained from the confirmation tests are very close to the outputs of PSO. The microhardness of the test specimen was measured along the longitudinal direction with a load of 0.5 kgf and a dwell time of 10 s. The specimen subjected to the microhardness test and the variation of hardness along the longitudinal direction are
Fig. 3 PSO convergence graph for Ultimate tensile strength
Fig. 4 PSO convergence graph for micro hardness
described in Fig. 5a and b. The diamond indentation formed at the midpoint is depicted in Fig. 5C. The different zones such as weld zone (WZ), and heat affected zone (HAZ) for optimized parameters are presented in Fig. 6.
Table 4 Parameters for initializing the PSO algorithm [9]

Input parameters  Value
Agent particles   300
Iterations        100
Wmax              0.9
Wmin              0.4
C1                2.05
C2                2.05

Table 5 Measured and predicted outputs from the optimum conditions

                UTS (MPa)  Hardness (HV)
Measured        1312.3     409.1
Predicted       1319.9     412.2556
Relative error  0.579%     0.771%
Fig. 5 a Specimen prepared for micro hardness test b Micro hardness profile for optimized specimen c Indentation formed during micro hardness test
6 Conclusions In this paper, the optimization of the response parameters, namely ultimate tensile strength and microhardness, in friction welding of 15CDV6 aerospace steel rods was studied using the PSO algorithm. Based on the experimentation and optimization, the following conclusions are drawn:
Fig. 6 Optical micrographs of different regions of friction weld joint
• The friction pressure and upsetting pressure have the greatest influence on the ultimate tensile strength and micro hardness of the friction weld specimen. High values of friction pressure and upsetting pressure contribute to increased tensile strength and micro hardness. • The PSO algorithm optimizes the objective functions satisfactorily, and the percentage error was determined to be negligibly low. Higher tensile strength and micro hardness were predicted.
References 1. Kimura M, Ishii H, Kusaka M, Kaizu K, Fuji A (2009) Joining phenomena and joint strength of friction welded joint between pure aluminium and low carbon steel. Sci Technol Weld Join 14:388–395 2. Ramesh MVL, Srinivasa Rao P, Venkateswara Rao V (2015) Microstructure and mechanical properties of laser beam welds of 15CDV6 steel. Def Sci J 65:339–342 3. Romero-Hdz J, Saha B, Toledo-Ramirez G, Beltran-Bqz D (2016) Welding sequence optimization using artificial intelligence techniques, an overview. Int J Comput Sci Eng 3:90–95 4. Ajith PM, Husain TMAFSAL, Sathiya P, Aravindan S (2015) Multi-objective optimization of continuous drive friction welding process parameters using response surface methodology with intelligent optimization algorithm. J Iron Steel Res Int 22:954–960 5. Sathiya P, Aravindan S, Haq AN, Paneerselvam K (2009) Optimization of friction welding parameters using evolutionary computational techniques. J Mater Process Technol 209:2576– 2584 6. Senapati NP, Bhoi RK (2020) Grain size optimization using PSO technique for maximum tensile strength of friction stir-welded joints of AA1100 aluminium. Arab J Sci Eng 45:5647–5656 7. Katherasan D, Elias JV, Sathiya P, Haq AN (2012) Flux cored arc welding parameter optimization using particle swarm optimization algorithm. Procedia Eng 38:3913–3926 8. Malviya R, Pratihar DK (2011) Tuning of neural networks using particle swarm optimization to model MIG welding process. Swarm Evol Comput 1:223–235 9. Choudhary A, Kumar M, Gupta MK, Unune DK, Mia M (2020) Mathematical modeling and intelligent optimization of submerged arc welding process parameters using hybrid PSO-GA evolutionary algorithms. Neural Comput Appl 32:5761–5774
10. Shojaeefard MH, Behnagh RA, Akbari M, Givi MKB, Farhani F (2013) Modelling and pareto optimization of mechanical properties of friction stir welded AA7075/AA5083 butt joints using neural network and particle swarm algorithm. Mater Des 44:190–198 11. Tagimalek H, Maraki MR, Mahmoodi M, Moghaddam HK, Farzad-Rik S (2022) Prediction of mechanical properties and hardness of friction stir welding of Al 5083/pure Cu using ANN, ICA and PSO model. SN Appl Sci 4. 12. Ambekar M, Kittur J (2020) Multiresponse optimization of friction stir welding process parameters by an integrated WPCA-ANN-PSO approach. Mater Today Proc 27:363–368 13. Anand K, Barik BK, Tamilmannan K, Sathiya P (2015) Artificial neural network modeling studies to predict the friction welding process parameters of incoloy 800H joints. Eng Sci Technol an Int J 18:394–407 14. Neelamegam C, Sapineni V, Muthukumaran V, Tamanna J (2013) Hybrid intelligent modeling for optimizing welding process parameters for reduced activation ferritic-martensitic (RAFM) steel. J Intell Learn Syst Appl 05:39–47 15. Srinivasan L, Khan MC, Kannan TDB, Sathiya P, Biju S (2019) Application of genetic algorithm optimization technique in TIG welding of 15CDV6 aerospace steel. Silicon 11:459–469 16. Kumaran SS, Kaliappan J, Srinivasan K, Hu YC, Padmanaban S, Srinivasan N (2020) Realizing a novel friction stir processing-enabled FWTPET process for strength enhancement using firefly and pso methods. Materials (Basel) 13:1–19
Chapter 6
Review on Fetal Health Classification Vimala Nagabotu and Anupama Namburu
1 Introduction In 2012, there were around 213 million pregnancies worldwide [66]. 190 million (88%) of these pregnancies occurred in developing countries, whereas 22.9 million (10.9%) occurred in industrialised countries. Pregnancy complications, such as haemorrhage, miscarriage complications, high plasma compression, maternal infection, and clogged labour, claimed the lives of 293,337 people in 2013 [45]. Statistics show that in the year 2015, over 303,000 women died during or after childbirth [40]. According to 2016, approximately 835 women die each day as a result of pregnancy disorders. Pregnancy-related medical problems and death, which affect mothers and/or their children, are still a notable issue to date. The mortality rate is notably high in various parts of the Earth. Many maternal deaths occur in underdeveloped nations [40]. Global differences in medical services and treatment are reflected in this huge and unequal death distribution. A study proves that a considerable mortality difference exists within countries too, as well as between income differences as well as rural and urban living differences. As a result, delivery problems are one of the leading causes of medical deaths in places [40, 45]. The majority of the problems mentioned in the literature are occurred during pregnancy, some may appear during conception and worsen during the child’s development after the birth. But these maternal deaths are due to low resources facilities, which could be avoided. The Child birth problems include high plasma pressure, inflections, hypertension, pregnancy loss, preterm labour, and still birth. In addition to these, other complications include severe nausea, vomiting, and iron deficiency and anemia [5, Anupama Namburu authors contributed equally to this work. V. Nagabotu · A. Namburu (B) School of Computer Science Engineering, VIT-AP University, Amaravati, Guntur 522337, Andhra Pradesh, India e-mail: [email protected] V. Nagabotu e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_6
46]. As a result, these disorders may affect pregnancy, adding to the demand for novel tools used to screen and assess foetal health. Even mothers’ medical issues that impact the foetus, childbirth-related concerns, as well as foetal diseases are examples of these conditions [25]. Pre-eclampsia, high blood pressure, renal and auto-immune disease, maternal diabetes, and thyroid disease have all been linked to an increased risk for the foetus [42, 55, 58, 77]. Prolonged pregnancy, vaginal bleeding, decreased foetal movements, and prolonged ruptured membranes are additional medical problems that endanger the foetal health [75]. Other risk factors for foetuses include intrauterine growth restriction, foetal inflexion, and multiple gestations [23, 75]. As a result, these disorders can include neurodevelopmental problems in children, such as non-ambulant intellectual palsy, growing delays, acoustic and visual weakening, and foetal concession, which all can be the causes of mortality in neonates. By continuously detecting and monitoring the foetal heart rate (FHR) and uterine retrenchments throughout the period to assess foetal well-being and monitor for increased risks of childbirth complications. This helps with the tracking of embryonic hypoxia’s course and prompt intervention before disease [31]. The foetal heart rate as well as its changeability, receptiveness, and likely checking are important indicators for measuring foetal healthiness throughout [20]. An ultrasonogram is placed on the abdomen to obtain the FHR. The test is based on FHR, uterine shrinkage, and foetal programme motion and is used to sense and identify potentially hazardous events. During the prenatal and intrapartum phases of childbirth, obstetricians routinely use cardiography to check and evaluate foetal status. Improved technology in modern medical practises has recently allowed the use of strong and effective methods for generating self-machined predictions in large medical applications [47–49]. Machine learning tools, when properly implemented, can greatly assist people in creating perfect medical decisions and diagnoses, lowering mother and foetal transience as well as working in childbirth and benefiting people. The foetal heart rate pattern is difficult to recognise, but a CAD system based on the created technologies helps to allow automatic foetal state classifications during pregnancy [12] (Table 1).
2 Literature Review

In earlier work [16], a Support Vector Machine (SVM) method was used to determine the foetal state in pregnancy using CAD approaches. Other research uses Neural Network and Random Forest classifiers to classify cardiotocography [8, 73]. However, all these approaches were designed with the aid of clinical diagnostic datasets of patients and CTG data recorded during childbirth, and they report only binary results for pathological cases in pregnancy. Goal four of the United Nations Sustainable Development Goals, which aimed to reduce the worldwide under-five mortality rate to 1.9%, ultimately failed [80]. The neonatal period was responsible for 46% of all deaths among children under the age of five in 2015 [38]. Preterm birth (36%), intrapartum events (24%), and infections are the leading causes of morbidity and mortality in this category. A 2018 study shows that Pakistan, a developing country, has the highest newborn death rate, at 4.6% [6].
Table 1 List of abbreviations used in the paper

Abbreviation | Definition
FHR | Foetus heart rate
CAD | Computer aided design
CTG | Cardiotocography
SVM | Support Vector Machine
LGBM | Light Gradient Boosting
LS-SVM | Least squares support vector machine
MLP | Multi-layer Perceptron
LSTM | Long short-term memory network
CNN | Convolutional neural network
CWT-CNN | Continuous wavelet transform and convolutional neural network
AI | Artificial intelligence
RF | Random Forest
KNN | K-nearest neighbours
NN | Neural network
ANFIS | Adaptive-network-based fuzzy inference system
MLPNN | Multiple layer perceptron neural network
LSVM | Lagrangian support vector machine
PCA | Principal Component Analysis
ELM | Extreme Learning Machine
XGB | Extreme Gradient Boosting
CART | Classification and Regression Tree
META_DES | Meta-learning and feature selection for ensemble
According to the FIGO (International Federation of Gynaecology and Obstetrics) standards, CTG can be classified as normal, suspect, or abnormal depending on the FHR baseline, its variability, and its accelerations and decelerations [13]. Specialised health personnel (obstetricians) or machine learning technology can perform this interpretation. In a recent Cochrane review, Grivell et al. found a noteworthy reduction in perinatal loss with CTG (relative risk: 0.20, 95% confidence interval [CI]: 0.04–0.88) [7]. However, given that the included studies were of moderate quality, more research is needed to examine the influence of CTG on perinatal outcomes [7]. AI employs scientific procedures and a large number of measurement points from the human body to produce structured data. AI [7] has been used to improve cancer recurrence identification and mortality estimation [21], cardiovascular risk prediction [76], and radiological imaging [43]. Existing algorithms are good at predicting the foetus's pathological condition but not so effective at predicting the suspect state [17, 51, 73]. The goal was therefore to develop a model that could detect suspicious pregnancies (both suspect and abnormal) and support highly qualified medical workers. Machine learning
techniques can help doctors make better decisions in tough situations such as diagnosis, maternal mortality rate (MMR) reduction, and labour problems. It is challenging to classify the stages of foetal health, but ML classification systems excel at it [61]. Standard classification algorithms include SVM, Neural Networks (NN), and RF [69]. In terms of accurately classifying the stages of foetal health, the RF classifier performs better. Foetal monitoring can reduce foetal mortality, even in the second trimester [68]. AI has recently become an important method for primary diagnosis and exact classification. A detailed comparison of fifteen machine learning methods employed to distinguish between healthy and diseased foetuses was conducted [19], with the properties extracted from cardiotocography recordings. Such methods effectively analyse great volumes of real-time data in order to provide improved answers as well as a framework using new classification models [3]. Cardiotocography is utilised to quickly measure the FHR and offer medical personnel accurate data and updates. The efficiency of cardiotocography in monitoring foetal health during labour has been reviewed: CTG is used to assess the foetal heart rate and uterine contractions, and thus plays an important role in both pre- and post-labour foetal assessment; it also tracks the baby's movement frequency. A recent rise in these kinds of approaches shows that NN, KNN, SVM, and Decision Tree classifiers outperform others [72]. All of the classifiers are effective, and the prediction model uses 30 ranking characteristics to identify common risk variables [40]. Among the most common strategies for selecting optimum features for prediction, feature extraction and feature selection are preferred in ML algorithms; this also helps in determining which features to prioritise [10]. Metrics such as accuracy, specificity, precision, and recall can be used to assess a procedure's performance. The efficiency of classifying cardiac anomalies using machine learning for monitoring foetal alterations and improving the acquired image has been studied [24]. The procedure of assessing historical data while using cardiotocography is outlined in [41]. The image shown was acquired through a CNN architecture in Deep Learning (DL), with the box confirming the statistics from cardiotocography against the image acquisition dataset [71]. A machine learning (ML)-based approach to FHR signal analysis [59] is better suited for predicting risk in foetal health. Among the classifiers used for FHR-based risk prediction were Naive Bayes (NB) [33, 39, 54], support vector machine (SVM)-radial basis function [18, 26], SVM Linear [14], and classification trees [9, 34, 64]. These findings indicate that the technique should be widely applied in clinics to forecast intrauterine growth restriction in foetuses. To achieve high speed and accuracy, Extreme Gradient Boosting (XGBoost) regressors, Light Gradient Boosting Machine (LightGBM) regressors, and Category Boosting (CatBoost) regressors have recently been added to the literature. These boosting algorithms have proved effective in many medical applications. XGBoost is a scalable algorithm and has proved to be a strong contender for solving machine learning problems. In [35], the researchers used a rule base and the XGBoost method to analyse the patterns of the FHR signal. XGBoost takes advantage of combining hundreds of trees, converting a weak classifier into a stronger one.
High precision, resistance to over-fitting, and robust scalability are all benefits of XGBoost. It is capable of handling distributed, high-dimensional features [39]. Since then, many ensemble models have been developed. Ensemble modelling is the creation of numerous diverse models to forecast an outcome, either
by utilising a variety of modelling algorithms or by utilising various training data sets. The ensemble model then combines the output of each base model to produce a single, conclusive prediction for the unseen data. Ensemble models are used to decrease over-fitting of the prediction, and the ensemble approach reduces prediction error: the error of the combined model remains low as long as the base models are diverse and independent. In order to make a prediction, the method relies on the wisdom of the crowd. Although the ensemble model contains numerous base models, it behaves and operates as a single model. All of these models are summarised in Tables 2 and 3.
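To make the boosting and stacking ideas above concrete, the following is a minimal sketch, not taken from any of the reviewed papers, of a stacked ensemble trained on 21-feature CTG data; the CSV path, the label column name, and the model settings are illustrative assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import (RandomForestClassifier, GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Hypothetical CSV holding the 21 CTG features plus a 3-class label
# (1 = normal, 2 = suspect, 3 = pathological).
df = pd.read_csv("ctg.csv")
X, y = df.drop(columns=["fetal_health"]), df["fetal_health"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Diverse, independent base models; a meta-learner combines their outputs,
# so the ensemble behaves as a single model.
base_models = [
    ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
    ("gb", GradientBoostingClassifier(random_state=42)),
    ("knn", KNeighborsClassifier(n_neighbors=7)),
]
ensemble = StackingClassifier(estimators=base_models,
                              final_estimator=LogisticRegression(max_iter=1000),
                              cv=5)
ensemble.fit(X_train, y_train)

# Accuracy, precision, and recall for the three foetal states.
print(classification_report(y_test, ensemble.predict(X_test)))
```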
Table 2 Representation of various works done on foetal health classification

References | Dataset | Factors and classes | Accuracy
Discriminant Analysis [27] | CTG dataset | 21, 3 | 82.1
Decision tree [27] | CTG dataset | 21, 3 | 86
SVM [49] | CTG dataset | 21, 3 | 84
Naïve Bayes [49] | CTG dataset | 21, 3 | 83.7
Random Forest [4] | CTG dataset | 21, 3 | 98
Decision tree [4] | CTG dataset | 21, 3 | 96
k-Nearest Neighbour [4] | CTG dataset | 21, 3 | 90
Logistic Regression [4] | CTG dataset | 21, 3 | 96
Gradient Boosting classifier [36] | CTG dataset | 21, 3 | 95.5
Cat boost classifier [36] | CTG dataset | 21, 3 | 95.50
LGBM [36] | CTG dataset | 21, 3 | 95.3
Extreme Gradient boosting [36] | CTG dataset | 21, 3 | 95.1
Cascade forest classifier [36] | CTG dataset | 21, 3 | 94.7
Random forest classifier [36] | CTG dataset | 21, 3 | 94.0
Extra tree classifier [36] | CTG dataset | 21, 3 | 93.6
Decision tree classifier [36] | CTG dataset | 21, 3 | 91.1
Linear discriminant analysis [36] | CTG dataset | 21, 3 | 88.2
AdaBoost classifier [36] | CTG dataset | 21, 3 | 87.5
Logistic regression [36] | CTG dataset | 21, 3 | 88.5
k-neighbour's classifier [36] | CTG dataset | 21, 3 | 88.5
Stack model [36] | CTG dataset | 21, 3 | 95.2
Blender model [36] | CTG dataset | 21, 3 | 95.9
Random forest [57] | CTG dataset | 21, 3 | 83.30
LS-SVM [79] | CTG dataset | 21, 3 | 80.64
AlexNet [53] | CTG dataset | 21, 3 | 83.70
DenseNet [56] | CTG dataset | 21, 3 | 93.67
MLP [37] | CTG dataset | 21, 3 | 85.98
LSTM [78] | CTG dataset | 21, 3 | 96.25
CWT-CNN [81] | CTG dataset | 21, 3 | 94.08
AlexNet-SVM [50] | CTG dataset | 21, 3 | 99.91
Table 3 Representation of various works done on foetal health classification

References | Dataset | Factors and classes | Accuracy
ANFIS [52] | CTG dataset | 21, 2 | 97.16
SVM-GA [51] | CTG dataset | 21, 2 | 99.34
MLPNN [27] | CTG dataset | 21, 3 | 97.8
MLPNN [73, 74] | CTG dataset | 21, 3 | –
Fuzzy C-Mean, K-Mean [74] | CTG dataset | 21, 3 | –
DT (C4.5) [32] | CTG dataset | 21, 3 | 92.43
DT (C4.5)-based AdaBoost Ensbl [32] | CTG dataset | 21, 3 | 95.01
MLPNN [32] | CTG dataset | 21, 3 | 92.10
MLPNN based AdaBoost Ensbl. [32] | CTG dataset | 21, 3 | 92.05
LSVM [9], LSVM with PCA [30] | CTG dataset | 21, 3 | –
LS-SVM-PSO-BDT [79] | CTG dataset | 21, 3 | 91.62
ELM [79] | CTG dataset | 6, 3 | 91.03
ELM with IAGA-M1 (6 features) [79] | CTG dataset | 6, 3 | 93.61
ELM with PCA (6 components) [65] | CTG dataset | 6, 3 | 89.60
ELM with Basic GA (14 features) [65] | CTG dataset | 14, 3 | 97.87
SVM [44] | CTG dataset | 21, 3 | 93
RF [44] | CTG dataset | 21, 3 | 94.5
MLP [44] | CTG dataset | 21, 3 | 92.53
KNN [44] | CTG dataset | 21, 3 | 91.23
XGB [63] | CTG dataset | 21, 3 | 98
SVM [63] | CTG dataset | 21, 3 | 98
KNN [63] | CTG dataset | 21, 3 | 98
LGBM [63] [14] | CTG dataset | 21, 3 | 99
RF [63] | CTG dataset | 21, 3 | 98
ANN [63] | CTG dataset | 21, 3 | 17
LSTM [63] | CTG dataset | 21, 3 | 34
Deep Forest [15] | CTG dataset | 21, 3 | 95.07
Random Forest [60] | CTG dataset | 10, 3 | 93.74
SVM [2] | CTG dataset | 21, 3 | 92.39
Random Forest [28] | CTG dataset | 21, 3 | 94.80
Naïve Bayes [1] | CTG dataset | 16, 3 | 85.88
Random Forest [29] | CTG dataset | 21, 3 | 95.11
CART with Gini index [64] | CTG dataset | 21, 3 | 90.12
Random Forest [70] | CTG dataset | 21, 3 | 93.40
META-DES, ensemble classifier [22] | CTG dataset | 21, 3 | 84.62
Bagging based Random Forest [67] | CTG dataset | 21, 3 | 94.73
Stacking ensemble learning [11] | CTG dataset | 10, 3 | 96.05
Random Forest [62] | CTG dataset | 21, 3 | 97.18
SVM [62] | CTG dataset | 21, 3 | 95.88
ANN [62] | CTG dataset | 21, 3 | 99.09
3 Conclusion

In this paper, we discussed different models for classifying foetal health based on the CTG dataset. Foetal health refers to the development of the foetus and its regular interaction with the mother's uterus during pregnancy. Most gestation-period complications put the foetus in a difficult situation that restricts proper growth and can result in deficiencies or death. Predicting risk factors before problems arise can help ensure a healthy pregnancy and promote proper foetal development. This study described the numerous methods and studies conducted thus far for accurately projecting foetal well-being and developmental state from a collection of pre-classified patterns. Foetal health prediction is essential, and a predictive ensemble model can be created using machine learning and deep learning algorithms.
References 1. Afridi R, Iqbal Z, Khan M et al (2019) Fetal heart rate classification and comparative analysis using cardiotocography data and known classifiers. Int J Grid Distrib Comput (IJGDC) 12:31– 42 2. Agrawal K, Mohan H (2019) Cardiotocography analysis for fetal state classification using machine learning algorithms. In: 2019 International conference on computer communication and informatics (ICCCI), IEEE, pp 1–6 3. Akhtar F, Li J, Azeem M et al (2020) Effective large for gestational age prediction using machine learning techniques with monitoring biochemical indicators. J Supercomput 76(8):6219–6237 4. Alam MT, Khan MAI, Dola NN, et al (2022) Comparative analysis of different efficient machine learning methods for fetal health classification. Appl Bionics Biomech 5. Albahlol IA, Almaeen AH, Alduraywish AA et al (2020) Vitamin d status and pregnancy complications: serum 1, 25-di-hydroxyl-vitamin d and its ratio to 25-hydroxy-vitamin d are superior biomarkers than 25-hydroxy-vitamin d. Int J Med Sci 17(18):3039 6. Alive EC (2018) The urgent need to end newborn deaths. UNICEF, New York 7. Allonen S (2018) Käyttäjien asenteet ja odotukset tekoälyyn urheilussa ja terveydenseurannassa: case ibm watson 8. Arif M (2015) Classification of cardiotocograms using random forest classifier and selection of important features from cardiotocogram signal. Biomater Biomech Bioeng 2(3):173–183 9. Arif MZ, Ahmed R, Sadia UH et al (2020) Decision tree method using for fetal state classification from cardiotography data. J Adv Eng Comput 4(1):64–73 10. Azar AT (2014) Neuro-fuzzy feature selection approach based on linguistic hedges for medical diagnosis. Int J Model Ident Control 22(3):195–206 11. Bhowmik P, Bhowmik PC, Ali UME, et al (2021) Cardiotocography data analysis to predict fetal health risks with tree-based ensemble learning 12. Ayres-de Campos D, Bernardes J, Garrido A, et al (2000) Sisporto 2.0: a program for automated analysis of cardiotocograms. J Maternal-Fetal Med 9(5):311–318 13. Ayres-de Campos D, Spong CY, Chandraharan E et al (2015) Figo consensus guidelines on intrapartum fetal monitoring: cardiotocography. Int J Gynecol Obstet 131(1):13–24 14. Chauhan VK, Dahiya K, Sharma A (2019) Problem formulations and solvers in linear svm: a review. Artif Intell Rev 52(2):803–855 15. Chen Y, Guo A, Chen Q et al (2021) Intelligent classification of antepartum cardiotocography model based on deep forest. Biomed Signal Process Control 67(102):555 16. Comert Z, Kocamaz A (2017) Comparison of machine learning techniques for fetal heart rate classification
17. Cömert Z, Kocamaz AF, Güngör S (2016) Cardiotocography signals with artificial neural network and extreme learning machine. In: 2016 24th Signal processing and communication application conference (SIU), IEEE, pp 1493–1496 18. Cömert Z, Boopathi AM, Velappan S, et al (2018) The influences of different window functions and lengths on image-based time-frequency features of fetal heart rate signals. In: 2018 26th Signal processing and communications applications conference (SIU), IEEE, pp 1–4 19. Cömert Z, Sengür ¸ A, Budak Ü et al (2019) Prediction of intrapartum fetal hypoxia considering feature selection algorithms and machine learning models. Health Inf Sci Syst 7(1):1–9 20. Cosmi EV (1997) New technology” evaluation and standardization offetal monitoring. Int J Gynecol Obstet 59:169–173 21. Cruz JA, Wishart DS (2006) Applications of machine learning in cancer prediction and prognosis. Cancer Inf 2(117693510600200):030 22. Cruz RM, Sabourin R, Cavalcanti GD et al (2015) Meta-des: a dynamic ensemble selection framework using meta-learning. Pattern Recogn 48(5):1925–1935 23. Fisk N, Smith R (2001) Fetal growth restriction; small for gestational age. Turnbull’s obstetrics pp 197–209 24. Garcia-Canadilla P, Sanchez-Martinez S, Crispi F et al (2020) Machine learning in fetal cardiology: what to expect. Fetal Diagn Ther 47(5):363–372 25. Grivell RM, Alfirevic Z, Gyte GM, et al (2015) Antenatal cardiotocography for fetal assessment. Cochrane Database Syst Rev (9) 26. Haque E, Gupta T, Singh V, et al (2022) Detection and classification of fetal heart rate (fhr). In: International conference on artificial intelligence and sustainable engineering, Springer, pp 437–447 27. Huang ML, Hsu YY (2012) Fetal distress prediction using discriminant analysis, decision tree, and artificial neural network 28. Imran Molla M, Jui JJ, Bari BS, et al (2021) Cardiotocogram data classification using random forest based machine learning algorithm. In: Proceedings of the 11th National Technical Seminar on Unmanned System Technology 2019, Springer, pp 357–369 29. Islam SFN, Yulita IN (2020) Predicting fetal condition from cardiotocography results using the random forest method. In: 7th Mathematics, science, and computer science education international seminar, MSCEIS 2019 30. Je˙zewski M, Czaba´nski R, Łe˛ski J (2014) The influence of cardiotocogram signal feature selection method on fetal state assessment efficacy. J Med Inf Technol 23:51–58 31. Jørgensen JS, Weber T (2014) Fetal scalp blood sampling in labor-a review. Acta Obstet Gynecol Scand 93(6):548–555 32. Karabulut EM, Ibrikci T (2014) Analysis of cardiotocogram data for fetal distress determination by decision tree based adaptive boosting approach. J Comput Commun 2(9):32–37 33. Kong Y, Xu B, Zhao B, et al (2021) Deep gaussian mixture model on multiple interpretable features of fetal heart rate for pregnancy wellness. In: Pacific-Asia conference on knowledge discovery and data mining, Springer, pp 238–250 34. Kumar GR, Sheshanna KV, Basha SR, et al (2021) An improved decision tree classification approach for expectation of cardiotocogram. In: Proceedings of international conference on computational intelligence, data science and cloud computing, Springer, pp 327–333 35. Kuo PL, Yen LB, Du YC et al (2021) Combination of xgboost analysis and rule-based method for intrapartum cardiotocograph classification. J Med Biol Eng 41(4):534–542 36. Li J, Liu X (2021) Fetal health classification based on machine learning. 
In: 2021 IEEE 2nd International conference on big data artificial intelligence and internet of things engineering (ICBAIE), IEEE, pp 899–902 37. Li J, Chen ZZ, Huang L et al (2018) Automatic classification of fetal heart rate based on convolutional neural network. IEEE Internet Things J 6(2):1394–1401 38. Liu L, Oza S, Hogan D et al (2016) Global, regional, and national causes of under-5 mortality in 2000–15: an updated systematic analysis with implications for the sustainable development goals. The Lancet 388(10063):3027–3035
39. Liu L, Jiao Y, Li X et al (2020) Machine learning algorithms to predict early pregnancy loss after in vitro fertilization-embryo transfer with fetal heart rate as a strong predictor. Comput Methods Programs Biomed 196(105):624 40. Lu C, Zhu Z, Gu X (2014) An intelligent system for lung cancer diagnosis using a new genetic algorithm based feature selection method. J Med Syst 38(9):1–9 41. Magenes G, Signorini MG (2021) Cardiotocography for fetal monitoring: Technical and methodological aspects. In: Innovative technologies and signal processing in perinatal medicine. Springer, pp 73–97 42. Mander R, Fleming V (2002) Failure to progress. The contraction of the midwifery profession, NY 43. McBee MP, Awan OA, Colucci AT et al (2018) Deep learning in radiology. Acad Radiol 25(11):1472–1480 44. Mehbodniya A, Lazar AJP, Webber J, et al (2022) Fetal health classification from cardiotocographic data using machine learning. Expert Syst 39(6):e12,899 45. Mensah GA, Sampson UK, Roth GA, et al (2015) Mortality from cardiovascular diseases in sub-saharan africa, 1990–2013: a systematic analysis of data from the global burden of disease study 2013. Cardiovasc J Afr 26(2 H3Africa Suppl):S6 46. Miao JH, Miao KH (2018) Cardiotocographic diagnosis of fetal health based on multiclass morphologic pattern predictions using deep learning classification. Int J Adv Comput Sci Appl 9(5) 47. Miao JH, Miao KH, Miao GJ (2015) Breast cancer biopsy predictions based on mammographic diagnosis using support vector machine learning. Multi J Sci Technol J Sel Areas Bioinform 5(4):1–9 48. Miao KH, Miao GJ et al (2013) Mammographic diagnosis for breast cancer biopsy predictions using neural network classification model and receiver operating characteristic (roc) curve evaluation. Multi J Sci Technol J Sel Areas Bioinform 3(9):1–10 49. Miao KH, Miao JH, Miao GJ (2016) Diagnosing coronary heart disease using ensemble machine learning. Int J Adv Comput Sci Appl 7(10) 50. Muhammad Hussain N, Rehman AU, Othman MTB et al (2022) Accessing artificial intelligence for fetus health status using hybrid deep learning algorithm (alexnet-svm) on cardiotocographic data. Sensors 22(14):5103 51. Ocak H (2013) A medical decision support system based on support vector machines and the genetic algorithm for the evaluation of fetal well-being. J Med Syst 37(2):1–9 52. Ocak H, Ertunc HM (2013) Prediction of fetal state from the cardiotocogram recordings using adaptive neuro-fuzzy inference systems. Neural Comput Appl 23(6):1583–1589 53. Ogasawara J, Ikenoue S, Yamamoto H et al (2021) Deep neural network-based classification of cardiotocograms outperformed conventional algorithms. Sci Rep 11(1):1–9 54. Ogunyemi D, Jovanovski A, Friedman P et al (2019) Temporal and quantitative associations of electronic fetal heart rate monitoring patterns and neonatal outcomes. J Maternal-Fetal Neonatal Med 32(18):3115–3124 55. Parsons J, Sparrow K, Ismail K et al (2018) Experiences of gestational diabetes and gestational diabetes care: a focus group and interview study. BMC Pregnancy Childbirth 18(1):1–12 56. Parvathavarthini S, Sharvanthika K, Bohra N, et al (2022) Performance analysis of squeezenet and densenet on fetal brain mri dataset. In: 2022 6th International conference on computing methodologies and communication (ICCMC), IEEE, pp 1340–1344 57. Peterek T, Gajdoš P, Dohnálek P et al (2014) Human fetus health classification on cardiotocographic data using random forests. In: Volume II (ed) Intelligent data analysis and its applications. 
Springer, pp 189–198 58. Pickersgill A, Meskhi A, Paul S (1999) Key questions in obstetrics and gynaecology. CRC Press 59. Ponsiglione AM, Cosentino C, Cesarelli G et al (2021) A comprehensive review of techniques for processing and analyzing fetal heart rate signals. Sensors 21(18):6136 60. Prasetyo SE, Prastyo PH, Arti S (2021) A cardiotocographic classification using feature selection: a comparative study. JITCE (J Inf Technol Comput Eng) 5(01):25–32
61. Quilligan EJ, Paul RH (1975) Fetal monitoring: is it worth it? Obstet Gynecol 45(1):96–100 62. Rafie A, Chenouni S, Alami N, et al (2022) Classification of fetal state using machine learning models. In: E3S Web of conferences, EDP sciences, p 01027 63. Rahmayanti N, Pradani H, Pahlawan M et al (2022) Comparison of machine learning algorithms to classify fetal health using cardiotocogram data. Proc Comput Sci 197:162–171 64. Ramla M, Sangeetha S, Nickolas S (2018) Fetal health state monitoring using decision tree classifier from cardiotocography measurements. In: 2018 Second international conference on intelligent computing and control systems (ICICCS), IEEE, pp 1799–1803 65. Ravindran S, Jambek AB, Muthusamy H, et al (2015) A novel clinical decision support system using improved adaptive genetic algorithm for the assessment of fetal well-being. Comput Math Methods Med 66. Sedgh G, Singh S, Hussain R (2014) Intended and unintended pregnancies worldwide in 2012 and recent trends. Stud Fam Plann 45(3):301–314 67. Shah SAA, Aziz W, Arif M, et al (2015) Decision trees based classification of cardiotocograms using bagging approach. In: 2015 13th international conference on frontiers of information technology (FIT), IEEE, pp 12–17 68. Sharanya S, Venkataraman R (2021) An intelligent context based multi-layered Bayesian inferential predictive analytic framework for classifying machine states. J Ambient Intell Humaniz Comput 12(7):7353–7361 69. Signorini MG, Pini N, Malovini A et al (2020) Integrating machine learning techniques and physiology based heart rate features for antepartum fetal monitoring. Comput Methods Programs Biomed 185(105):015 70. Sontakke SA, Lohokare J, Dani R, et al (2018) Classification of cardiotocography signals using machine learning. In: Proceedings of SAI intelligent systems conference, Springer, pp 439–450 71. Sridar P, Kumar A, Quinton A et al (2019) Decision fusion-based fetal ultrasound image plane classification using convolutional neural networks. Ultrasound Med Biol 45(5):1259–1273 72. Stoean R, Stoean C (2013) Modeling medical decision making by support vector machines, explaining by rules of evolutionary algorithms with feature selection. Expert Syst Appl 40(7):2677–2686 73. Sundar C, Chitradevi M, Geetharamani G (2012) Classification of cardiotocogram data using neural network based machine learning technique. Int J Comput Appl 47(14) 74. Sundar C, Chitradevi M, Geetharamani G (2013) An overview of research challenges for classification of cardiotocogram data. J Comput Sci 9(2):198 75. Thapa J, Sah R (2017) Admission cardiotocography in high risk pregnancies. Nepal J Obstet Gynaecol 12(1):50–54 76. Weng SF, Reps J, Kai J, et al (2017) Can machine-learning improve cardiovascular risk prediction using routine clinical data? PloS one 12(4):e0174,944 77. Woodrow P (2018) Intensive care nursing: a framework for practice. Routledge 78. Yefei Z, Yanjun D, Xiaohong Z, et al (2021) Bidirectional long short-term memory-based intelligent auxiliary diagnosis of fetal health. In: 2021 IEEE region 10 symposium (TENSYMP), IEEE, pp 1–5 79. Yılmaz E, Kılıkçıer Ç (2013) Determination of fetal state from cardiotocogram using ls-svm with particle swarm optimization and binary decision tree. Comput Math Methods Med 80. You D, Hug L, Ejdemyr S et al (2015) Global, regional, and national levels and trends in under-5 mortality between 1990 and 2015, with scenario-based projections to 2030: a systematic analysis by the un inter-agency group for child mortality estimation. 
The Lancet 386(10010):2275– 2286 81. Zhao Z, Deng Y, Zhang Y et al (2019) Deepfhr: intelligent prediction of fetal acidemia using fetal heart rate signals based on convolutional neural network. BMC Med Inform Decis Mak 19(1):1–15
Chapter 7
SDN-Based Task Scheduling to Progress the Energy Efficiency in Cloud Data Center J. K. Jeevitha, R. Subhashini, and G. P. Bharathi
J. K. Jeevitha (B)
Department of Information Technology, PSNA College of Engineering and Technology, Dindigul, India
e-mail: [email protected]
R. Subhashini
Department of Information Technology, Sathyabama Institute of Science and Technology, Chennai, India
e-mail: [email protected]
G. P. Bharathi
Department of Computer and Communication Engineering, Sri Sai Ram Institute of Technology, Chennai, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_7

1 Introduction

Cloud computing is one of the important networking concepts, and nowadays people use cloud technologies extensively. Cloud computing provides various types of resources and services to clients on a pay-per-use basis, along with a low level of administrative risk [1]. This type of computing makes resources such as storage, servers, software, and infrastructure available over the Internet. Many business organizations and IT corporations have adopted the cloud for sharing business data. End users were not satisfied with the existing traditional network, because they expected high resource availability, network speed, reliability, fault tolerance, and a proper Service-Level Agreement (SLA). Traditional cloud service providers include Azure, Microsoft, Amazon, IBM, etc. These providers have implemented and installed cloud data centers across the world, in different geographical locations, to serve their customers. A data center holds a huge number of servers (thousands of servers), and these servers consume a huge amount of energy, approximately more than 50% of their energy, for a small amount of work execution.
As cloud services play an increasing role among various organizations, those organizations simultaneously consume more energy. It has been concluded that this energy utilization is comparable to 1.6% of the power supplied to a whole metropolis [2]. According to a report submitted by the United States Environmental Protection Agency, United States data centers used 100 billion kilowatt-hours of energy, and the price of energy consumption also increased by 75%. The growth of new technologies and social networking applications such as Facebook, WhatsApp, Instagram, and Twitter requires a large amount of resources (storage), which in turn necessitates numerous servers to store the data of social media and other new technologies, thereby boosting power consumption. High energy consumption leads to increased levels of carbon dioxide in the world [3]. When energy consumption is amplified, the energy cost also increases, and the energy cost is higher than the infrastructure cost. In the data center, a huge amount of power is required for the cooling system, and as a result it contributes to global warming. A free ("idle") server consumes a large amount of energy compared with the useful work it performs: as per the report of the Natural Resources Defense Council (NRDC), idle servers in the cloud data center utilize up to 97% of energy [4]. When a server becomes overloaded, energy consumption also increases, so most research has focused on sharing or distributing tasks from overloaded servers to freely available servers. One of the important energy-saving concepts is to power off the freely available ("idle") servers. Several techniques are used to address the energy consumption problem, such as load balancing, dynamic resource allocation, task scheduling, and virtual machine migration. This research work focuses on the Energy-Efficient Genetic Algorithm (EEGA) and the Software-Defined Network (SDN). Both EEGA and SDN are put into operation on a fat-tree topology architecture, and EEGA is compared with existing algorithms such as Round Robin (RR) and the Traditional Genetic Algorithm (TGA). The experimental results are obtained in the CloudSim simulator. The paper is organized as follows: Sect. 1 is the Introduction, Sect. 2 the Literature Survey, Sect. 3 the Proposed Work, Sect. 4 the model, Sect. 5 the Experimental Results, and Sect. 6 the Conclusion, followed by References.
2 Literature Survey

One study focused on a frequency scaling technique known as Dynamic Voltage Frequency Scaling (DVFS). The method was applied to a graphical gaming server, and the energy efficiency of the server was analysed. It was evaluated against a "Non-Power Technique" (NPT) and a "Static-Threshold-Detection Technique" (STDT); Central Processing Unit (CPU), Random
Access Memory (RAM), bandwidth, and input/output files were considered as the parameters for evaluation [5]. Another work concentrated on a VM migration technique and introduced two layers, a "Cloud-Manager Layer" (CML) and a "Green-Manager Layer" (GML). The CML was used to choose the optimal and correct resources from the previously available resources, while the GML was utilized to select the optimal resources according to the task requirements and completed the task-to-resource mapping. This method reduced the waiting time, response time, and energy consumption of the servers [6]. Another researcher proposed two heuristic methods for VM consolidation in the cloud data center; both algorithms are temperature-aware and improved energy efficiency [7]. Thermal-Aware Task Scheduling ("TATS") was implemented to improve energy use in the cloud computing data center; it is a hybrid of the Art Algorithm and the First Fit Algorithm [8]. Another research effort concentrated on workload assignment and the Dynamic Voltage Frequency Scaling (DVFS) technique [9]. A load balancing technique based on Bayes' theorem, referred to as LB-BC, performed the virtual-machine-to-physical-machine mapping; all of these methods were executed in the CloudSim simulator to achieve better energy efficiency [10]. "Passive optical cross-connect networks" (POXN) have also been applied in the cloud data center: POXN was used to diminish energy consumption and to improve the QoS parameters, and it is favourable for all types of routing, such as uni-cast and multi-cast routing. The above-mentioned research papers concentrated on various types of scheduling techniques, VM migration, and DVFS. This research concentrates on EEGA and SDN with a fat-tree topological architecture. The experimental results are obtained in the Java-enabled CloudSim simulator, and the EEGA algorithm's performance is compared with RR and TGA.
3 Proposed Work

3.1 Software-Defined Network (SDN)

SDN is one of the recently emerging technologies in cloud computing. It consists of three different layers: the infrastructure, application, and control layers. SDN is used to separate the data forwarding plane from the control plane. In this research work, SDN is implemented for congestion-free scheduling. The control layer acts as the boundary between the infrastructure and application layers through the north-bound and south-bound interfaces. The SDN
acts as a centralized manager, monitoring the network traffic and network load. Based on this monitoring, SDN computes different sets of traffic rules and different sets of traffic paths. SDN incorporates network elasticity and makes real-time traffic adjustments. The main advantages of SDN are energy conservation, reduced traffic congestion, low maintenance, and cost-effectiveness. A traffic-aware method has been implemented in the cloud computing paradigm, because it is an optimal way to cut energy consumption. The traffic-aware methodology is used to monitor idle or lightly utilized paths. Idle components consume a disproportionately large amount of energy compared with fully utilized components. SDN notices the idle paths and blocks them, which means switching off the idle paths. This type of method can reduce energy consumption by more than 50%. The hierarchical, tree-based Data Center Network (DCN) architecture is the most suitable for SDN, because it is dynamically scalable. Therefore, this research work implements SDN on a fat-tree topological architecture [12].
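As a rough illustration of this traffic-aware idea (not the authors' controller implementation), the sketch below classifies links as active or idle from per-link utilisation statistics; the link identifiers, the 5% threshold, and the statistics source are assumptions made only for the example.

```python
# Simplified sketch of the traffic-aware idea: inspect per-link utilisation
# statistics and mark lightly used ("idle") links so they can be powered down.
IDLE_THRESHOLD = 0.05   # links below 5% utilisation are treated as idle

def classify_links(link_stats):
    """link_stats maps link-id -> (bytes_sent_per_s, capacity_bytes_per_s)."""
    active, idle = [], []
    for link, (rate, capacity) in link_stats.items():
        utilisation = rate / capacity
        (idle if utilisation < IDLE_THRESHOLD else active).append(link)
    return active, idle

# Example: three links between edge and aggregation switches in one pod.
stats = {"e1-a1": (9.0e8, 1.0e9), "e1-a2": (2.0e7, 1.0e9), "e2-a1": (1.0e6, 1.0e9)}
active, idle = classify_links(stats)
print("keep powered:", active)
print("switch off:", idle)    # traffic on these links would first be rerouted
```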
3.2 Fat-Tree Topological Architecture

The fat-tree data center architecture is a well-established architecture among cloud data centers. This DCN is composed of various layers: the core switch layer, the aggregation layer, and the edge switch layer. N servers are connected at the edge switch layer, and the core switch layer is attached downward to the aggregation switch layer. The aggregation layer performs one major function, which is connecting the overall architecture to the Internet. In the fat-tree topological architecture, 2K pods are implemented, each pod consists of K² switches, and the whole data center contains 2K³ servers. The fat-tree topological architecture is designed with bisectional bandwidth, so it reduces energy consumption in the cloud data center by 50%. The fat-tree topological architecture is well suited to huge (large-scale) networks [13].
3.3 Energy-Efficient Genetic Algorithm for Task Scheduling

Figure 1 explains the architecture for scheduling with SDN. The end users' resource requirements are satisfied through the Service-Level Agreement (SLA), and the cloud providers should ensure throughput and proper resource utilization. Day by day the number of cloud end users increases, and this rapid growth of end users affects the QoS parameters. A good task scheduling approach may improve resource utilization and raise the level of the QoS parameters. In this research work, a scheduling optimization algorithm is implemented. Heuristic algorithms like the Traditional Genetic Algorithm (TGA), Particle Swarm Optimization (PSO), and Ant Colony Optimization (ACO) are used to solve such optimization difficulties.
Fig. 1 Architecture diagram for SDN-based scheduling
In this paper, some alterations to the TGA have been made. The alteration is that, in every population, the parents are involved in generating the children through the crossover process. In this algorithm, one additional phase is considered, known as "Event Selection"; this phase is used to choose high-quality chromosomes and to avoid the drawback of the population size. This change increases the efficiency of the algorithm.
3.3.1 Traditional Genetic Algorithm (TGA)
TGA is based on the basic biological concept of evolution described by Darwin. Following this theory, the concept of survival of the fittest is applied to cloud task scheduling. The phases of the TGA are as follows [14, 15]:

Initial Population: The initial population is the group of all individual chromosomes. During this phase a few functions are applied to the chromosomes to produce the next generation. The pairing chromosomes are chosen depending on some exact criterion.

Fitness Value: The fitness value decides the efficiency of an individual chromosome. It is used to estimate the dominance of the individual chromosome; thus the survival of the chromosome depends on the fitness value.

Selection: In this step, good chromosomes are chosen to produce the next generation. Numerous selection rules can be used to choose the good chromosomes, such as the "roulette wheel", the "Boltzmann strategy", "tournament selection", and rank-based selection.

Crossover: This step recombines the chromosomes to generate new children. The crossover step is used to produce a novel solution from the previous population.

Mutation: The mutation process is done after the crossover. It is considered the operator for generating new genetic diversity. In this phase, one or more of the chromosome's gene values are changed.

Figure 2 [14] explains the entire process of TGA.
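The following is a compact sketch of these GA phases applied to mapping tasks onto VMs, with the completion time of the busiest VM as the fitness value (in line with the end-time objective defined in Sect. 3.3.2); the task complexities, VM speeds, population size, and rates are made-up illustrative values, not the experimental settings of this paper.

```python
import random

tasks = [4, 8, 3, 7, 6, 2, 9, 5]      # co_i: computational complexity of each task
vm_speed = [2.0, 3.0, 4.0]            # pc_j: processing speed of each VM
POP, GENS, MUT_RATE = 20, 100, 0.1

def makespan(chrom):
    # Completion time of the busiest VM, using PT_ij = co_i / pc_j.
    load = [0.0] * len(vm_speed)
    for co, vm in zip(tasks, chrom):
        load[vm] += co / vm_speed[vm]
    return max(load)

def tournament(pop):
    # Selection: keep the fitter (lower makespan) of two random chromosomes.
    a, b = random.sample(pop, 2)
    return a if makespan(a) <= makespan(b) else b

def crossover(p1, p2):
    # Single-point crossover between two parent chromosomes.
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def mutate(chrom):
    # Mutation: reassign a task to a random VM with small probability.
    return [random.randrange(len(vm_speed)) if random.random() < MUT_RATE else g
            for g in chrom]

# Initial population: random task-to-VM assignments.
population = [[random.randrange(len(vm_speed)) for _ in tasks] for _ in range(POP)]
for _ in range(GENS):
    population = [mutate(crossover(tournament(population), tournament(population)))
                  for _ in range(POP)]

best = min(population, key=makespan)
print("best task-to-VM mapping:", best, "makespan:", round(makespan(best), 2))
```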
3.3.2 The Proposed Energy-Efficient Genetic Algorithm
The major goal of the algorithm is that each selection process is considered as a solution. After each selection process, the solution is added to the population, so at the end of every selection process a new and good solution is added. The proposed EEGA algorithm promotes the chromosome with the best fitness value during the selection process; otherwise, a chromosome is not chosen for the crossover procedure.
Fig. 2 Traditional genetic algorithm
Initial Population: The first step is the initial population, which is produced randomly with the help of encoded binary values (0 and 1). Every chromosome or gene consists of tasks and VMs, and each task and VM is addressed by its corresponding ID. The tasks and VMs are represented in Fig. 3 [12]. The VMs and tasks are encoded in this manner, for example, VM2-TS7-TS2-TS6 -> [0110-0100-1000-1001].

Fig. 3 Number of tasks and VM representation

Fitness calculation: The main objective of the task scheduling is a proper resource assignment that decreases the end (completion) time of the tasks [15]:

EndTime = ETmax[i, j],  i ∈ Ta, i = 1, 2, 3, ..., m;  j ∈ VM, j = 1, 2, 3, ..., n    (1)
ETmax denotes the maximum end time of a task on VM j; Ta denotes the set of tasks, VM the set of Virtual Machines, m the number of tasks, and n the number of VMs.

PT_ij = co_i / pc_j    (2)

where PT_ij represents the processing time, co_i represents the task's computational complexity, and pc_j represents the processing speed of the VM.

PR = Σ_{i=1}^{m} PT_ij    (3)
where PR represents the total processing time of the tasks; Eq. (3) [15] is used to calculate the processing time.

The process of selection: This process is known as Event Selection. It makes the computation more effective and can easily be implemented as a concurrent process. The Event Selection process is used to rectify the disadvantage of the population size. Initially, two individuals are picked from the population in a random manner. A random number (rn) in the range between 0 and 1 is then considered; if rn

TGA > EEGA. From the beginning, the EEGA algorithm offered the best output. In Fig. 6, the response time is shown; here also EEGA provided the best results. The response time is analyzed against the number of tasks and the time taken. The number of tasks is gradually increased from 10 to 150, and at every step EEGA proved the best. The EEGA optimization
technique was examined with different factors and evaluated against existing methods and techniques. During the experiments, EEGA gave the best results for all parameters alike, namely energy consumption, task completion, and response time. Figures 7 and 8 clearly represent the delay and the task completion time, and Fig. 9 reflects the energy consumption results. In Fig. 9, the values are analyzed against the number of VMs, with the energy consumption expressed in units: the VMs are noted on the X-axis and the energy consumption units are marked on the Y-axis. Initially, the experiment started with 5 VMs; after that, the number of VMs was gradually increased, and the energy consumption values increased accordingly. The overall output of Fig. 9 is RR > TGA > EEGA.

Fig. 5 Waiting time
Fig. 6 Response time
Fig. 7 Delay
Fig. 8 Task completion time
Fig. 9 Energy consumption
6 Conclusion

The main aim of this research paper is to improve the energy efficiency of cloud data center servers. One of the main factors in energy consumption is poorly utilized network components and links, so this research work switches off the idle components in the cloud data center; the identification of the idle components is done by the SDN. Proper scheduling methods improve resource utilization and energy use. In this paper, the EEGA algorithm is introduced together with the fat-tree topological architecture. The EEGA algorithm was evaluated against RR and TGA, and it improved the QoS parameters and the energy efficiency.
References 1. Mell P, France T (2009) Definition of cloud computing, Technical report, SP 800–145, National Institute of Standard and Technology (NIST) 2. Pagare, Damodar, Jayshri, Koli NA (2013) Energy-efficient cloud computing: a vision, introduction, efficient cloud computing: a vision, introduction, and open challenges. Int J Comput Sci Netw 2(2):96–102 3. Bo L, Li J, Huai J, Wo T, Li Q, Zhong L (2009) Enacloud: an energy-saving application live placement approach for cloud computing environments. International conference on cloud computing, Bangalore, 21–25 Sept 2009, pp 17–24.https://doi.org/10.1109/CLOUD.2009.72 4. Smith JW, Sommerville I (2010) Green cloud: a literature review of energy-aware computing and cloud computing 5. AhmadB, Maroof Z, McClean S, Charles D, Parr G (2020) Economic impact of energy saving techniques in cloud server. Cluster Comput 23:611–621 6. Geetha, Robin P (2020) C.R.R. Power conserving resource allocation scheme with improved QoS to promote green cloud computing. J Ambient Intell Human Comput. https://doi.org/10. 1007/s12652-020-02384-2 7. Yavari M, HadiFathi M, HadiFathi M (2019) Temperature and energy aware consolidation algorithms in cloud computing. J Cloud Comput 8, Article number: 13, 1–16. https://doi.org/ 10.1186/s13677-019-0136-9 8. Sobhanayak S, Turu AK (2019) Energy-Efficient task scheduling in cloud data centera temperature aware approach. IEEE conference record # 45616; IEEE Xplore ISBN: 978-1-7281-0167-5, pp 48–55 9. Higuera-Toledano T, Risco-Mart´ı n JL, Arroba P, Jos´e (2017) Green adaptation of real-time web services for industrial CPS within a cloud environment. IEEE Trans Indus Inf 13(3):1–8 10. Zhao J, Yang K, Wei X, Ding Y, Hu L, Xu G (2016) A heuristic clustering-based task deployment approach for load balancing using bayes theorem in cloud environment. IEEE Trans Parallel Distrib Syst 27(2):305–316 11. Jeyasekar, Nanda S, Uthra A (2018) Green SDN: trends of energy conservation in software defined network. Int J Eng Technol 7(3.12):9–13 12. Ali TE, Morad AH, Mohammed, Abdala A (2020) Traffic management inside software-defined datacentre networking. Bull Electr Eng Inf 9(5)
13. Jang SH, Kim TY, Kim JK, Lee JS (2012) The study of genetic algorithm-based task scheduling for cloud computing. Int J Control Autom 5:157–162 14. Buyya R, Ranjan R, Calheiros RN (2009) Modeling and simulation of scalable cloud computing environments and the CloudSim toolkit: Challenges and opportunities. High performance computing & simulation, 2009. HPCS’09. International Conference on, pp 1–11 15. Kruekaew B, Kimpan W (2014) Virtual machine scheduling management on cloud computing using artificial bee colony. In: Proceedings of the international multiconference of engineers and computer scientists
Chapter 8
A Review on Machine Learning-Based Approaches for Image Forgery Detection Sonam Mehta and Pragya Shukla
S. Mehta (B) · P. Shukla
Department of Computer Engineering, Institute of Engineering and Technology, Indore, India
e-mail: [email protected]
P. Shukla
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_8

1 Introduction

The advent of the Internet and handheld devices has led to the widespread use of social media platforms, Bunk et al. [1]. This has in turn resulted in the enormous sharing of data, of which images have gained the greatest prominence. The amount of information stored in images is extremely large, and the ease of sharing such images has its own repercussions. Images can be visualized as two-dimensional data streams made up of pixels rendering the following information, Pomari et al. [2]: (1) location, (2) intensity, and (3) RGB value. The rampant sharing of images also raises the chances of images being tampered with or forged, which can have disastrous consequences such as damaged reputations and, in some extreme cases, massive public furore and violence. Hence, it is mandatory to detect image forgeries and filter them out of networks in very little time so that they cannot reach large masses. This is, however, challenging due to the following reasons, Rao and Ni [3]: (1) The number of images shared is staggeringly large. (2) Image editing tools are extremely sophisticated, making manual detection infeasible, Seibold et al. [4].
S. Mehta (B) · P. Shukla Department of Computer Engineering, Institute of Engineering and Technology, Indore, India e-mail: [email protected] P. Shukla e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_8
75
76
S. Mehta and P. Shukla
(3) Pixels of both forged and unforged images exhibit so much numeric similarity, that is, it is infeasible to directly classify images based on pixel-based information directly, Li et al. [5]. (4) Statistical techniques for image forgery detection generally yield low accuracy of detection, Nirajdar and Mankar [6]. With the latest advances in machine learning techniques, image forgery detection can be done using machine learning approaches, to yield high accuracy and less time, Pun et al. [7].
2 Types of Image Forgery Authors show that image forgery can be detected by analyzing divergences in the attributes of images, and the techniques employed can be fundamentally classified as active and passive approaches, Bayar and Stamm [8]. The active approach is in general attributed by visible changes in image features while pre-processing with or without a priori information. Whereas, in case of a passive approach, a priori information of image features are absent and changes in image texture are imperceptible, Cozzolino et al. [9]. The prevalent methods of forgery are Hashmin et al. [10]. Copy-Move Forgery: In this case, a section of the image is copied and repeated in some other sections of the image, Lynch et al. [11]. Due to similar properties of the forged section of the image, this type of forgery may be difficult to detect. Image splicing: In this case, a part of an image is used to add to a different image. In general, due to dissimilarities in statistical properties of the images, it is relatively easier to detect compared to copy and move forgery, Hussain et al. [12]. Retouching: In this case, properties such as grayscale value or RGB value along with illumination are tampered to create a forged image and are challenging to be detected Fan et al. [13].
3 Challenges in Machine Learning-Based Automated Detection of Image Forgery Machine Learning-based approaches are invariably needed as the data to be analyzed is extremely complex for time-critical applications. In case machine learning is used to identify image forgery, handpicked features are to be computed from images of both categories which are the forged and unforged images. This approach may have the advantage of not having to depend on extremely exhaustive datasets for direct processing, Muhammad et al. [14]. However, it is often complex to decide which image features are to be computed in order to make the final classification accurate. Moreover, due to noise and blurring effects in images, the accurate computation of image features is also complex, Chierchia et al. [15]. Stochastic features which are
applicable to a wide range of images are therefore generally computed and used for feature selection. Deep learning approaches, on the other hand, do not require a handpicked feature extraction stage but rather compute image features directly, Tafti et al. [16]. Deep neural networks are the most common type of deep learning model being used for image classification problems in various fields. Lower-level features are computed at the outer layers of the deep neural network, while higher-order features are computed at the deeper layers. The major limitation of this approach is that a staggering amount of labeled data is required for effective learning of the deep learning model, which can then lead to high-accuracy classification. The most common deep learning models used for image forgery detection are the convolutional neural network (CNN) and its variants, such as the region-based CNN (R-CNN) and the Residual Network (ResNet), Barad and Goswami [17]. Recent work on image splicing forgery detection is based on the masked region-based CNN (R-CNN) model MobileNet. Another approach is based on illumination and statistical inconsistency, which marks deviations in the pixels of the image for forgery detection. The divergences in image attributes, along with time-constrained applications, make it extremely challenging to detect image forgery with high accuracy. It is almost infeasible to detect image forgery with the naked eye (manual approach) due to the complexity of the visual appearance, Yao et al. [18]. Figure 1 depicts the visual similarity between forged and unforged images, which is highly imperceptible through visual inspection and may be extremely time-consuming to assess, Shivakumar and Baboo [19]. Other associated challenges include, Yerushalmy and Hel-Or [20]:

Scene complexity: The background statistics are identical to those of the actual image, making forgery detection challenging.

Uneven lighting: Blurring and low contrast make accurate detection challenging.

Blurring effects: Due to a smaller number of captured pixels or misalignment of pixels, the image becomes de-focused, making forgery detection challenging.

Variation in aspect ratios: Variations in aspect ratio render forgery detection less effective.

Similarity in background: Identical pixel attributes make forgery detection challenging.

Typically, the performance metrics used to evaluate an automated image forgery classifier are the classification accuracy, sensitivity, precision, and recall. Moreover, the time complexity of the algorithm plays a crucial role, as automated classifiers may find application in time-constrained settings such as social media, wherein a large execution time may lead to seriously undesirable repercussions.
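To illustrate how such a CNN-based classifier is typically set up, the Keras sketch below trains a small network to separate forged from authentic images and reports the metrics just mentioned; this is a generic sketch, not the architecture of any cited model, and the directory layout, input size, and layer sizes are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical folder with two sub-directories, e.g. authentic/ and forged/.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "forgery_data/train", label_mode="binary",
    image_size=(128, 128), batch_size=32)

model = models.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
    layers.Conv2D(16, 3, activation="relu"),   # outer layers: low-level features
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),   # deeper layers: higher-order artefacts
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),     # forged vs. authentic
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy",
                       tf.keras.metrics.Precision(name="precision"),
                       tf.keras.metrics.Recall(name="recall")])
model.fit(train_ds, epochs=5)
```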
Fig. 1 Visual similarity between forged and unforged images
4 Previous Work

This section highlights the salient points of the contemporary noteworthy contributions in the field along with their findings (Table 1). A detailed study of contemporary approaches shows that the most common techniques, including Support Vector Machines, resampling, feature point matching, Local Binary Patterns, neural networks, and the DCT, basically fall under the following categories:

(1) Handpicked Feature Extraction with a Machine Learning-Based Classifier: In this approach, handpicked features such as LBP, HOG, or statistical features along with their derivatives are extracted and serve as the basis for training a classifier; a sketch of this pipeline is given below. This approach does not need extremely exhaustive datasets for finding patterns. However, accurate feature extraction and feature selection remain a challenge.
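The following is a small sketch of the handpicked-feature pipeline in category (1): per-image statistical features feed a conventional classifier (here an SVM). The feature set and the random stand-in data are illustrative assumptions; real systems would extract LBP- or HOG-style descriptors from labelled forged and authentic images.

```python
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def handpicked_features(image):
    # Simple statistical descriptors computed per image.
    flat = image.ravel().astype(float)
    return [flat.mean(), flat.var(), skew(flat), kurtosis(flat)]

rng = np.random.default_rng(1)
images = rng.integers(0, 256, size=(200, 64, 64))   # stand-in grayscale images
labels = rng.integers(0, 2, size=200)               # stand-in forged (1) / authentic (0) labels

X = np.array([handpicked_features(img) for img in images])
clf = SVC(kernel="rbf", C=1.0)
print("cross-validated accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```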
Table 1 Summary of noteworthy contributions in the field

1. "An efficient approach for copy-move image forgery detection using convolution neural network", Koul et al. [21], Springer 2022.
   Approach: Convolutional neural networks.
   Results and findings: Slant-based convolutional neural network used for identifying copy-and-move forgery. Accuracy of 97.52% achieved.

2. "Image splicing detection using discriminative robust local binary pattern and support vector machine", Akram et al. [22], Emerald Insight 2022.
   Approach: Support Vector Machine.
   Results and findings: Splicing image forgery detection based on discriminative robust local binary pattern (DRLBP) and support vector machine (SVM).

3. "Detection and Localization of Multiple Image Splicing Using MobileNet V1", Kadam et al. [23], IEEE Access 2021.
   Approach: MobileNet V1.
   Results and findings: Image splicing forgery detection model based on masked region-based CNN (R-CNN) model MobileNet. Classification accuracy of 82% achieved.

4. "Fusion of Handcrafted and Deep Features for Forgery Detection in Digital Images", Walia et al. [24], IEEE Access 2021.
   Approach: Feature extraction followed by ANN-based classification.
   Results and findings: Amalgamation of 648-D Markov-based features and YCbCr colorspace-based local binary pattern (LBP) features used to train residual network (ResNet) 18. Accuracy of 99.3% attained.

5. "Encoder-decoder-based convolutional neural networks for image forgery detection", Baich et al. [25], Springer 2021.
   Approach: Encoder-based CNN and ResNet.
   Results and findings: Encoder-decoder-based ResNet named Fals-Unet used for identifying manipulated regions of images. Average accuracy of 84% achieved.

6. "Deepfake Detection in Media Files—Audios, Images, and Videos", Nasar et al. [26], IEEE 2020.
   Approach: Recurrent Neural Networks.
   Results and findings: The approach uses recurrent neural networks (RNNs) for the detection of Deep Fakes, i.e., tampered videos with faces of people morphed onto videos.

7. "Syn2Real: Forgery classification via unsupervised domain adaptation", Kumar et al. [27], IEEE 2020.
   Approach: AlexNet-based CNN.
   Results and findings: Deep learning model AlexNet used for the detection of copy-and-move forgery.

8. "Image forgery detection based on physics and pixels: a study", Kumar et al. [28], Taylor and Francis 2019.
   Approach: Illumination and statistical inconsistency detection.
   Results and findings: The approach uses physical changes or deviations in pixel patterns in the image to identify re-sampling image forgery.

9. "Hybrid LSTM and encoder-decoder architecture for detection of image forgeries", Bappy et al. [29], IEEE 2019.
   Approach: Hybrid LSTM-encoder-based CNN.
   Results and findings: Resampling features are used to identify image forgery based on the hybrid encoder and Long Short-Term Memory (LSTM) structure. The frequency-domain correlation among re-sampling features is predominantly used as the discriminating feature.

10. "Image splicing forgery detection based on low-dimensional singular value decomposition of discrete cosine transform", Moghaddasi et al. [30], Springer 2019.
    Approach: DCT, feature extraction, and SVM.
    Results and findings: This approach uses the handpicked features mean, variance, skewness, and kurtosis for detecting retouching image forgery based on the Support Vector Machine (SVM). Prior to feature extraction, the images are converted to the transform domain using the discrete cosine transform (DCT) and singular value decomposition (SVD). Principal Component Analysis

11. "Boundary-based image forgery detection by fast shallow CNN", Zhang et al. [31], IEEE 2018.
    Approach: Shallow CNN architecture.
    Results and findings: A boundary-based approach has been developed for detection of tampering using shallow convolutional neural networks. The boundary-based feature is computed as a feature in the layers of the CNN.

12. "Detection of copy-move image forgery based on discrete cosine transform", Alkawaz et al. [32], Springer 2018.
    Approach: DCT-based feature extraction followed by deep learning.
    Results and findings: This approach detects image forgery in the transform domain, which is another way to detect image forgeries. The discrete cosine transform (DCT) of the images is computed and the DCT values are used to detect forgery.

13. "Copy-move forgery detection based on deep learning", Ouyang et al. [33], IEEE 2017.
    Approach: Deep neural networks with backpropagation.
    Results and findings: A deep learning-based approach is designed for detecting copy-and-move forgery. It employs a deep neural network with stacked hidden layers computing attributes to detect image forgery.

14. "Copy-move forgery detection based on deep learning", Zhou et al. [34], Springer 2017.
    Approach: –
    Results and findings: This approach proposes a block-based rich model convolutional neural network (rCNN) architecture for detection of splicing image forgery. The block-based rCNN approach is found to be effective in identifying splicing image forgery even after JPEG compression of the forged images.

15. "An Efficient Approach for Digital Image Splicing Detection Using Adaptive SVM", Kaur et al. [35], IJCSIS 2016.
    Approach: Feature extraction and classification using SVM.
    Results and findings: A method to detect splicing image forgery based on an adaptive support vector machine (SVM) classifier; the hyperplane adapts to the data fed to it.

16. "Joint image splicing detection in DCT and contourlet transform domain", Zhang et al. [36], Elsevier 2016.
    Approach: DCT and contourlet-based approach.
    Results and findings: –

17. "Illuminant-based transformed spaces for image forensics", Carvalho et al. [37], IEEE 2016.
    Approach: Illumination inconsistency.
    Results and findings: The approach analyses image illumination for forgery detection. It was shown that the image illumination coefficient varies sporadically in the case of common image forgeries such as copy-and-move or splicing, thereby providing ways to detect the same.
The approach developed in this case is the transform domain detection of image forgery based on the discrete cosine transform and contourlet transform. The contourlet transform is able to pick up sudden spectral changes occurring due to forgeries
Results and finding
84 S. Mehta and P. Shukla
IEEE 2014
Wang et al. [40]
Exploring DCT co-efficient quantization effects for local tampering detection
Elsevier 2015
Publication
IEEE 2016
Gao et al. [38]
Uliyan et al. [39] Copy-move image forgery detection using Hessian and center symmetric local binary pattern
Authors
Paper name
Bayesian sample steered discriminative regression for biometric image classification
Table 1 (continued)
DCT features for forgery detection
Hessian features and centersymmetric local binary pattern (CSLBP)
BayesNet
Approach
(continued)
This techniques relies on the sudden abnormalities in the Discrete Cosine Transform (DCT) matrix co-efficient values which occur during forgery. The quantization of the samples is analyzed to predict forgery
This approach employs Hessian features and center- symmetric local binary pattern (CSLBP) features for identifying copy and move image forgeries and the affected regions in case the forgery operation is followed by post-processing techniques such as JPEG compression and scaling operations
The approach used in this work an amalgamation of regression learning and the Bayesian classifier. While image parameters are fed to the regression learner for pattern recognition, the Bayes classifier classifies the new samples
Results and finding
8 A Review on Machine Learning-Based Approaches … 85
Publication IEEE 2014
Authors
Zhao et al. [41]
Paper name
A Distributed Local Margin Learning-based scheme for highdimensional feature processing in image tampering detection
Table 1 (continued) Results and finding
Local Margin Learning (LML) A Local Margin Learning and dimensional reduction (LML)-based approach is designed for image forgery detection. The main approach is dimensionally reducing the image features for detection of forgery to facilitate classification using simple classifiers such as Euclidean distance-based classifiers
Approach
86 S. Mehta and P. Shukla
8 A Review on Machine Learning-Based Approaches …
87
(2) Feature Extraction with Filtering and/or dimensional reduction: This approach either works on removal of noise and disturbance effects employing transform domain filtering such as DCT, DWT, and Contourlet followed by feature extraction. It may also contain separate dimensional reduction techniques such as singular value decomposition (SVD) or principal component analysis (PCA). Feature selection still remains a major challenge. (3) Deep Learning-Based Approaches: In this approach, no handpicked feature extraction is needed. Thus the rigor of feature extraction is avoided. However, to attain high accuracy of classification, exhaustive datasets are needed which is a major limitation. Typical techniques are the CNN and its derivatives like Residual Neural Network, MobileNet, Shallow CNN, and Deep Neural Network with back propagation.
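To make category (1) concrete, the following is a minimal, illustrative sketch of a handpicked-feature pipeline, with HOG features feeding an SVM. The image data, labels, and HOG settings are placeholders for illustration and do not reproduce any of the surveyed papers.

```python
# Minimal sketch of category (1): handpicked features (here HOG) feeding a
# classical classifier (here an SVM). The images and labels below are random
# placeholders used only so the example runs end to end.
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def extract_features(images):
    # images: iterable of 2-D grayscale arrays; HOG is one possible handpicked feature
    return np.array([hog(img, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
                     for img in images])

rng = np.random.default_rng(0)
images = rng.random((40, 128, 128))        # placeholder "images"
labels = rng.integers(0, 2, 40)            # placeholder forged / authentic labels

X = extract_features(images)
X_train, X_test, y_train, y_test = train_test_split(X, labels, random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```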
5 Comparative Analysis
A comparative analysis of the methods based on classification accuracy is depicted in Fig. 2. The accuracies of the baseline techniques have been compared and tabulated in Table 2, using a standard dataset for training and testing, namely the NIST'16 and COVERAGE forgery datasets. The forgeries include copy-move-, splicing-, and removal-based forgeries. The dataset has been split into 75% for training, 25% for testing, and 5% for validation (cross-validation). The subsets for training, testing, and validation have been chosen randomly. The images used in the dataset are 1024 × 1024 RGB images.
Fig. 2 Comparative accuracy analysis
Table 2 Comparison of benchmark techniques
S. no. | Dataset | Approach | Accuracy
1 | NIST'16, COVERAGE | CNN | 97.52
2 | NIST'16, COVERAGE | R-CNN | 82
3 | NIST'16, COVERAGE | ResNet | 99.3
4 | NIST'16, COVERAGE | LBP-SVM | 98.95
5 | NIST'16, COVERAGE | LSTM-Encoder-Decoder | 98.50
6 | NIST'16, COVERAGE | FCN | 74.28
6 Conclusion
It can be concluded from the previous discussion that image forgery can have far-reaching repercussions in the world of social media and high-speed Internet. This paper presents the potential consequences of image forgery and the need for highly accurate automated tools for image forgery detection. A comparative analysis of machine learning and deep learning-based algorithms for image forgery detection has been presented, along with their respective pros and cons. The various contemporary approaches employed for this task, such as machine learning, deep learning, transform-domain analysis, feature selection, and evaluation, have been highlighted. The performance metrics commonly used for the evaluation of automated classifiers have also been presented. It is expected that this work paves the path for future researchers to develop meticulously planned machine learning models with the aim of surpassing existing work in terms of classification accuracy and computational complexity.
References 1. Bunk J, Bappy J, Mohammad T, Nataraj L, Flenner A, Manjunath B, Chandrasekaran S, Roy A, Peterson L (2017) Detection and localization of image forgeries using resampling features and deep learning. IEEE conference on computer vision and pattern recognition workshops, 1881–1889 2. Pomari T, Ruppert G, Rezende E, Rocha A, Carvalho T (2018) Image splicing detection through illumination inconsistencies and deep learning. IEEE Int Conf Image Process 2018:3788–3792 3. Rao Y, Ni J (2016) A deep learning approach to detection of splicing and copy-move forgeries in images. IEEE international workshop on information forensics and security (WIFS), 1–6 4. Seibold C, Samek W, Hilsmann A, Eisert P (2017) Detection of face morphing attacks by deep learning. International workshop on digital watermarking: digital forensics and watermarking, 107–120 5. Li J, Li X, Yang B, Sun X (2014) Segmentation-based image copy-move forgery detection scheme. IEEE Trans Inf Forensics Secur 10:507–518
6. Birajdar G, Mankar G (2013) Digital image forgery detection using passive techniques: a survey 10(3):226–245 7. Pun C, Yuan X, Bi X (2015) Image forgery detection using adaptive oversegmentation and feature point matching. IEEE Trans Inf Forensics Secur 10(8):1705–1716 8. Bayar B, Stamm M (2016) Deep learning approach to universal image manipulation detection using a new convolutional layer. In: Proceedings of the 4th ACM workshop on information hiding and multimedia security, ACM, 5–10 9. Cozzolino D, Gragnaniello D, Verdoliva L (2014) Image forgery detection through residualbased local descriptors and block-matching. IEEE international conference on image processing, 5297–5301 10. Hashmi MF, Hambarde AR, Keskar AG (2013) Copy move forgery detection using DWT and SIF T features. International conference on intelligent systems design and applications, 188–193 11. Lynch G, Shih F, Liao H (2013) An efficient expanding block algorithm for image copy-move forgery detection. Elsevier, Information Sciences, 239, 253–265 12. Hussain M, Muhammad G, Saleh SQ, Mirza AM, Bebis G (2012) Copy-move image forgery detection using multi-resolution weber descriptors. International conference on signal image technology and internet based systems, 395–401 13. Fan W, Wang K, Cayre F, Xiong Z (2012) 3D lighting-based image forgery detection using shape-from-shading. In: Proceedings of the 20th European signal processing conference, 1777– 1781 14. Muhammad G, Hussain M, Bebis G (2012) Passive copy move image forgery detection using undecimated dyadic wavelet transform. Digital Investigation, Elsevier, 9(1):49–57 15. Chierchia G, Parrilli S, Poggi G, Verdoliva L, Sansone C (2011) PRNU-based detection of small-size image forgeries. International conference on digital signal processing, 1–6 16. Tafti AP, Malakooti MV, Ashourian M, Janosepah S (2011) Digital image forgery detection through data embedding in spatial domain and cellular automata. International conference on digital content, multimedia technology and its applications, 11–15 17. Barad Z, Goswami M (2020) Image forgery detection using deep learning: a survey, 6th international conference on advanced computing and communications (ICACC-2020), 571–576 18. Yao H,Wang S, Zhao Y, Zhang X (2011) Detecting image forgery using perspective constraints. IEEE Signal Process Lett 19(3)3:123–126 19. Shivakumar B, Baboo S (2011) Detection of region duplication forgery in digital images using SURF. J Comput Sci Issues 8(1):199–205 20. Yerushalmy I, Hel-Or H (2011) Digital image forgery detection based on lens and sensor aberration. Int J Comput Vis. Springer 72:71–91 21. Koul S, Kumar M, Khurana S, Mushtaq F, Kumar K (2022) An efficient approach for copy-move image forgery detection using convolution neural network. Multimedia Tools Appl Springer 81:11259–11277 22. Akram A, Ramzan S, Rasool A, Jaffar A, Furqan U, Javed W (2022) Image splicing detection using discriminative robust local binary pattern and support vector machine. World J Eng Emerald Insight 19(4):459–466 23. Kadam K, Ahirrao S, Kotecha K, Sahu S (2021) Detection and localization of multiple image splicing using MobileNet V1. IEEE Access 9:162499–162519 24. Walia S, Kumar K, Kumar M, Gao X (2021) Fusion of handcrafted and deep features for forgery detection in digital images. IEEE Access 9:99742–99755 25. Biach F, Iala I, Laanaya H, Minaoui K (2021) Encoder-decoder based convolutional neural networks for image forgery detection. Multimedia Tools Appl Springer 466:22611–22628 26. 
Nasar BF, Lason ER (2020) Deepfake detection in media files—audios, images and videos. IEEE Recent Adv Intell Comput Syst, 74–79 27. Kumar A, Bhavsar A, Verma R (2020) Syn2Real: forgery classification via unsupervised domain adaptation. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, 63–70
28. Kumar M, Srivastava S (2019) Image forgery detection based on physics and pixels: a study. Astralian J Forensic Sci Taylor Francis 15(2):119–134 29. Bappy JH, Simons C, Nataraj L, Manjunath BS, Roy-Chowdhury AK (2019) Hybrid LSTM and encoder–decoder architecture for detection of image forgeries. IEEE Trans Image Process 28(7):3286–3300 30. Moghaddasi Z, Jalab H, Noor R (2019) Image splicing forgery detection based on lowdimensional singular value decomposition of discrete cosine transform coefficients. Neural Comput Appl Springer 31:7867–7877 31. Zhang Z, Zhang Y, Zhou Z, Luo J (2018) Boundary-based image forgery detection by fast shallow CNN. International conference on pattern recognition, 2658–2663 32. Alkawaz M, Sulong G, Saba T, Rehman A (2018) Detection of copy-move image forgery based on discrete cosine transform. J Neural Comput Appl 30:183–192 33. Ouyang J, Liu Y, Liao M.(2017).Copy-move forgery detection based on deep learning. International congress on image and signal processing, BioMedical Engineering and Informatics (CISP-BMEI), Shanghai, China, 1-5, Forensics and Watermarking, Springer 2017, pp 65–76 34. Zhou J, Ni J, Rao Y (2017) Block based convolutional neural network for image forgery detection. International workshop on digital watermarking, digital forensics and watermarking, 65–76 35. Kaur G, Khehra B (2016) An efficient approach for digital image splicing detection using adaptive SVM. Int J Comput Sci Inf Security (IJCSIS) 14(6):168–173 36. Zhang Q, Lu W, Weng J (2016) Joint image splicing detection in DCT and contourlet transform domain. J Vis Commun Image Represent Elsevier 40:449–458 37. Carvalho T, Faria F, Pedrini H (2016).Illuminant-based transformed spaces for image forensics. IEEE Trans Inf Forensics Security 11(4):720–733 38. Gao G, Yang J, Wu S, Jing X, Yue D (2015) Bayesian sample steered discriminative regression for biometric image classification. J Appl Soft Comput 37:48–59 39. Uliyan DM, Jalab HA, Wahab A (2015).Copy move image forgery detection using Hessian and center symmetric local binary pattern. IEEE conference on open systems, 7–11 40. Wang W, Dong J, Tan T (2014) Exploring DCT coefficient quantization effects for local tampering detection. IEEE Trans Inf Forensics Security 9(10):1653–1666 41. Zhao X, Li J, Wang S, Li S (2014) A distributed local margin learning based scheme for high-dimensional feature processing in image tampering detection. International conference on multimedia and expo, Chengdu, China, pp 1–6
Chapter 9
Ponzi Scam Attack on Blockchain R. B. Amle and A. U. Surwade
1 Introduction
Nowadays, blockchain technology is very popular because of the decentralized nature of the blockchain. Satoshi Nakamoto presented two key concepts: Bitcoin and the blockchain. Bitcoin is a virtual cryptocurrency. Blockchain technology uses advanced cryptography; transactions happen in a decentralized peer-to-peer network, and transaction processing takes place without the involvement of a third party such as a bank or any other financial institution [1, 2]. Vitalik Buterin developed Ethereum. Ethereum is a blockchain platform on which anyone can write code or programs, called smart contracts, and this code runs decentralized applications [3]. Nick Szabo introduced the smart contract [4]. Bitcoin and Ethereum are the two blockchain networks that execute smart contracts. Smart contracts are executed on the decentralized Ethereum platform; they can be developed using programming languages such as Solidity, Serpent, and other low-level Lisp-like languages, and loaded into the Ethereum Virtual Machine (EVM) for execution [3, 5, 6]. As a new technology, blockchain attracts many attacks and scams, such as compromise of the private key, double spending, and the 51% vulnerability. One of the major scams occurring on blockchain technology is the Ponzi scheme. A Ponzi scheme is a financial fraud in which a company makes fake promises to investors of returning a high amount of money on their investments. Since blockchain technology works over the Internet, there are many attackers and malicious users who try to hack a node or the entire network using many attacks.
R. B. Amle (B) · A. U. Surwade
School of Computer Sciences, Kavayitri Bahinabai Chaudhari North Maharashtra University, Jalgaon, India
e-mail: [email protected]
A. U. Surwade
e-mail: [email protected]
Some of the attacks on the blockchain network are the selfish mining attack, the DAO attack, the liveness attack, Border Gateway Protocol (BGP) hijacking, and the balance attack. In recent times, criminal activities have been increasing day by day on blockchain networks. In Bitcoin, users are assigned addresses without proof of identity, and therefore payments made using Bitcoin are difficult to track, so criminal activities can be executed on the blockchain. A Ponzi scheme is a classical fraud which promises high return rates with little risk to investors: it pays older investors with the funds of new investors, but if there is not enough money circulating, the scheme unravels and the later investors lose their money. The WannaCry (or WannaCrypt) ransomware is one popular example of such criminal activity [7, 8]. This chapter is organized as follows: this introduction is followed by the motivation and then by the literature review; the next section identifies the research gap and presents the proposed architecture, followed by the conclusion and references.
2 Motivation The recent work of [9, 10] conducted a systematic study over the Ponzi schemes on Ethereum. In particular, 16,082,269 transactions were collected from July 2015 to May 2017. It was found that 17,777 transactions were related to Ponzi schemes, which had already collected over 410,000 US dollars within only two years. Chen et al. [10] proposed a method to extract features from both accounts and the operation codes to identify Ponzi schemes on Ethereum. Meanwhile, the work of [11] proposed a novel approach to detect and quantify Ponzi schemes on bitcoin. In particular, to address the difficulty of identifying Ponzi schemes as they often use multiple addresses, a clustering method [9, 11] was proposed to identify the addresses. They found that 19 out of 32 Ponzi schemes use more than one address.
3 Literature Review Table 1 summarizes the literature review in chronological order on Ponzi scheme. It is mainly focused on the techniques or strategies applied by several researchers to address or solve the problem.
Table 1 Important contributions
- Bartoletti et al. [9] (2019). Dataset: 184 contracts from the goo.gl/CvdxBp website. Techniques/methodology: a survey of Ponzi schemes on Ethereum, analysing their behaviour and impact from various viewpoints. Contribution: identified 138 Ponzi schemes deployed on Ethereum and released a public dataset on the Ethereum platform.
- Chen et al. [10] (2018). Dataset: smart contracts from http://etherscan.io. Techniques/methodology: machine learning methods and data mining techniques to detect Ponzi schemes on the blockchain. Contribution: proposed a model estimating that more than 400 Ponzi schemes are running on the Ethereum platform.
- Bartoletti et al. [11] (2018). Dataset: extracted Bitcoin addresses. Techniques/methodology: data mining techniques to automatically detect and quantify Bitcoin Ponzi schemes. Contribution: applied data mining algorithms and classifiers for detecting Ponzi schemes on Bitcoin.
- Drew and Drew [12] (2010). Dataset: –. Techniques/methodology: case studies. Contribution: used case studies to identify three Ponzi schemes run by Kenneth Wayne, Allan McFarlane, and Bernard Madoff.
- Smith [13] (2010). Dataset: –. Techniques/methodology: focused on transaction failures, investor transaction problems, securities, financial instruments, and markets. Contribution: described the Madoff Ponzi scheme and its impact on investors.
- Lewis [14] (2012). Dataset: –. Techniques/methodology: used tools to detect the fraudulent activity and existence of specific Ponzi schemes, with particular attention to the Madoff Ponzi scheme and to fraud detection systems in the financial services industry. Contribution: an overview of the Ponzi scheme from various aspects, compared with other financial arrangements; discusses why victims blame themselves and the changes necessary to protect investors and avoid repetition.
- Moore et al. [15] (2012). Dataset: High-Yield Investment Programs (HYIPs). Techniques/methodology: monitored nine HYIP websites daily and observed the guidelines that help investors invest their money. Contribution: identified 22 currencies accepted by HYIPs and summarized an HYIP analysis of online Ponzi-scheme fraud websites.
- Papilloud and Haesler [16] (2014). Dataset: –. Techniques/methodology: a survey of electronic money, pure Ponzi schemes, and virtuous Ponzi schemes. Contribution: described how electronic money payments in societies can become involved in Ponzi schemes, and differentiated pure Ponzi schemes from virtuous Ponzi schemes.
- Vasek and Moore [17] (2019). Dataset: data collected from www.bitcointalk.org. Techniques/methodology: described the advertised scams and created a scam list. Contribution: focused on Ponzi schemes in Bitcoin and the kinds of scams; examined 11,424 threads on the site and found 1,780 distinct scams on bitcointalk.org.
- Toyoda et al. [18] (2019). Dataset: Bitcoin addresses and High-Yield Investment Programs (HYIPs). Techniques/methodology: an AC technique to control the Bitcoin addresses and an HYIP identification methodology to classify whether a Bitcoin address belongs to an HYIP or not; decision tree and support vector machine classification methods are used. Contribution: proposed a novel dataset collection approach which significantly increases the number of identified HYIP owners; an accuracy of 93.75% is reported for the proposed methodology.
- Chen et al. [19] (2019). Dataset: –. Techniques/methodology: –. Contribution: proposed a machine learning method to detect smart Ponzi schemes and identified about 500 smart Ponzi schemes running on Ethereum.
- Fei et al. [20] (2020). Dataset: –. Techniques/methodology: a survey and interviews were carried out. Contribution: investigated the demographic characteristics of the victims of a Ponzi scheme; out of 698 people, 30 were invited to participate in in-depth interviews.
- Chiluwa and Chiluwa [21] (2020). Dataset: –. Techniques/methodology: qualitative methods applied to the webpage discourse of fraudulent websites. Contribution: identified five Ponzi schemes on website homepages in Nigeria between 2016 and 2019, and observed that the Ponzi-scheme websites in Nigeria target particular groups and cultures.
- Peng and Xiao [22] (2020). Dataset: operation codes of smart contracts on Ethereum. Techniques/methodology: eight classification algorithms are used: Logistic Regression, Decision Trees, Support Vector Machine, Random Forests, Extremely Randomized Trees, Gradient Boosting Machines, XGBoost, and LightGBM. Contribution: proposed a model that uses smart contract operation codes to detect smart Ponzi schemes on Ethereum.
- Chen et al. [23] (2021). Dataset: Ethereum smart contracts. Techniques/methodology: a survey of post-deployment Ethereum smart contract development using offline checking methods, online checking methods, and other methods. Contribution: an overview of smart contract maintenance and related concerns on Ethereum is carried out.
4 Research Gap Identified
After carrying out the literature review, the following research gaps have been identified:
[A] It is important to investigate the role of Bitcoin and its importance in the working of the blockchain.
[B] It is necessary to investigate the execution of smart contracts and their susceptibility to different types of scams or frauds, with a particular focus on the Ponzi scheme.
[C] It is necessary to develop a method for the detection of Ponzi scheme attacks (Ponzi scams) on the blockchain.
Based on the research gaps identified in this section, an architecture has been proposed to classify Ponzi schemes, as shown in Fig. 1. Bitcoin addresses are collected from various sources and websites. The standard public dataset available at [24] has been downloaded; it consists of Bitcoin addresses along with features of Bitcoin Ponzi schemes. These features would be considered during the process of feature extraction. Regular Bitcoin addresses which are free from Ponzi scams would be considered as non-Ponzi Bitcoin addresses. After collecting all the features of both Ponzi and non-Ponzi Bitcoin addresses, the data would be divided into two sets, of which one set is used for training the machine learning classifier and the other part would be used for testing it. The Bitcoin transaction addresses would thus be classified into two different classes: the Ponzi scheme class and the non-Ponzi scheme class.
Fig. 1 Architecture for Bitcoin Ponzi Scheme Classification
The important features would be extracted from both classes, the Ponzi scheme class and the non-Ponzi scheme class. A machine learning classifier is proposed, which would initially be trained with the extracted Ponzi scheme and non-Ponzi scheme features. After successful training of the machine learning classifier, new Bitcoin addresses would be tested with it. The classifier would then classify Bitcoin transactions as Ponzi scheme or non-Ponzi scheme based on these features.
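As a rough illustration of this training and testing flow, the sketch below uses a generic tabular feature set and a random forest as a stand-in classifier; the file name, column names, and model choice are assumptions made for illustration only and are not part of the proposed architecture.

```python
# Minimal sketch of the proposed Ponzi / non-Ponzi classification flow.
# Assumptions: features are tabular (one row per Bitcoin address) and a
# column "label" marks Ponzi (1) vs non-Ponzi (0). The file and column
# names are hypothetical, and RandomForest is a stand-in classifier.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

data = pd.read_csv("bitcoin_ponzi_features.csv")   # hypothetical feature file
X = data.drop(columns=["address", "label"])
y = data["label"]

# Split into a training set and a testing set, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)                          # train on extracted features
print(classification_report(y_test, clf.predict(X_test)))
```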
5 Conclusion
Blockchain technology is becoming popular because of its decentralized, peer-to-peer transactions and immutable properties. This chapter has discussed attacks on the blockchain, including the Ponzi scheme attack. As a new technology, the blockchain is vulnerable to malicious attacks initiated through scams such as the Ponzi scheme. A systematic literature review has been carried out on the Ponzi scheme. Based on this review, the chapter has identified the research gap, and an architecture has been proposed to detect Ponzi scheme attacks on the blockchain. This architecture would use a machine learning classifier which will be trained with the public standard dataset mentioned in [24] and would classify Ponzi schemes and non-Ponzi schemes.
References 1. Attaran M (2020), Blockchain technology in healthcare: challenges and opportunities. Int J Healthcare Manag, 1–14. https://doi.org/10.1080/20479700.2020.1843887 2. Antonopoulos AM (2017) Mastering bitcoin programming the open Blockchain, Second Edition. O’Reilly Media 3. Nakamoto S (2008) Bitcoin: a peer-to-peer electronic cash system, www.bitcoin.org/bitcoin. pdf 4. Gamage HTM, Weerasinghe HD, Dias GJ (2020) A survey on Blockchain technology concepts. Appl Issues, Springer Nat Comput Sci. Springer Nature Singapore.https://doi.org/10.1007/s42 979-020-00123-0 5. Merkle RC (1990) A certified digital signature. Lect Notes Comput Sci G. Brassard (Ed.): Advances in cryptology—CRYPT0 ‘89, LNCS 435, 435, pp. 218–238. https://doi.org/10.1007/ 0-387-34805-0_21 6. Namasudra S, Deka GC, Johri P, Hosseinpour M, Gandomi AH (2020) The revolution of Blockchain: state-of-the-art and research challenges. Arch Comput Methods Eng.https:// doi.org/10.1007/s11831-020-09426-0 7. Kosba JA, Shi EA (2016) The ring of Gyges: investigating the future of criminal smart contracts. In: the proceedings of the 2016 ACM SIGSAC conference on computer and communications security, New York, NY, USA, pp 283–295 8. Zheng Z,Xie S, Dai HN (2019) An overview on smart contracts: challenges, advances, and platforms. Future Gen Comput Syst.https://doi.org/10.1016/j.future.2019.12.019
9. Bartoletti M, Carta S, Cimoli T, Saia R (2019) Dissecting Ponzi schemes on Ethereum: identification, analysis, and impact. Future Gen Comput Syst, arXiv preprint arXiv:1703.03779 10. Chen W, Zheng Z, Cui J, Ngai E, Zheng P, Zhou Y (2018) Detecting Ponzi schemes on Ethereum: Towards healthier blockchain technology. In: the Proceedings of the 2018 world wide web conference on world wide web, pp 1409–1418 11. Bartoletti M, Pes B, Serusi S (2018) Data mining for detecting bitcoin Ponzi schemes. In: the Proceedings of the crypto valley conference on Blockchain Technology (CVCBT), IEEE, pp 75–84 12. Drew JM, Drew ME (2010) The Identification of Ponzi schemes (51–70). Griffith Law Rev 19:1.https://doi.org/10.1080/10854668.2010.10854668 13. Felicia S (2010) Madoff Ponzi scheme exposes, the myth of the sophisticated investor. Univ Baltimore Law Rev 40(2), Article 3. Available at: http://scholarworks.law.ubalt.edu/ublr/vol40/ iss2/3 14. Lewis MK (2012) New dogs, old tricks. Why do Ponzi schemes succeed? University of South Australia, Australia, https://doi.org/10.1016/j.accfor.2011.11.002 15. Moore T, Han J, Clayton R (2012) The postmodern ponzi scheme: empirical analysis of highyield investment programs, pp 41–56. A.D. Keromytis (Ed.): FC 2012, LNCS 7397 16. Papillouda C, Haeslerb A (2014) The veil of economy: electronic money and the pyramidal structure of societies, pp 54–68, Taylor & Francis, Distinktion: Scandinavian J Soc Theory 15(1), https://doi.org/10.1080/1600910X.2014.882853 17. Vasek M, Moore T (2019) Analyzing the bitcoin ponzi scheme ecosystem. In: Zohar A et al. (eds) International financial cryptography association, FC 2018 Workshops, LNCS 10958, pp 101–112, https://doi.org/10.1007/978-3-662-58820-8_8 18. Toyoda K, Mathiopoulos PT, Ohtsuki T (2019) A novel methodology for hyip operator’s bitcoin addresses identification. IEEE Access, 2169–3536 Vol 7, https://doi.org/10.1109/Access.2019. 2921087 19. Chen W, Zheng Z, Ngai E, Zheng P, Zhou Y (2019) ExploitingBlockchain data to detect smart ponzi schemes on Ethereum. IEEE Access 7,https://doi.org/10.1109/Access.2019.2905769 20. Fei L, Shi H, Sun X, Liu J, Shi H, Zhu Y (2020) The profile of ponzi scheme victims in china and the characteristics of their decision-making process. Deviant Beh,https://doi.org/10.1080/ 01639625.2020.1768639 21. Chiluwa IM, Chiluwa I (2020) We are a mutual fund, How Ponzi scheme operators in Nigeria apply indexical markers to shield deception and fraud on their websites. Soc Semiotics.https:// doi.org/10.1080/10350330.2020.1766269 22. Peng J, Xiao G (2020) Detection of smart ponzi schemes using opcode. In: Zheng Z et al. (eds) BlockSys. Springer Nature Singapore, pp 192–204. https://doi.org/10.1007/78-981-15-92133_15 23. Chen J, Xia X, Lo D, Grundy J, Yang X (2021) Maintenance-related concerns for post-deployed Ethereum smart contract development: issues, techniques, and future challenges. Empirical Soft Eng.,https://doi.org/10.1007/s10664-021-10018-0 24. The Standard dataset available at www.goo.gl/ToCho7 downloaded on August 2022
Chapter 10
Using the Light Gradient Boosting Machine for Prediction in QSAR Models Marc Stawiski, Patrick Meier, Rolf Dornberger, and Thomas Hanne
1 Introduction The quantitative relationship between pharmacological, chemical, biological, and physical effects of a molecule with its chemical structure can be modeled with a quantitative structure–activity relationship (QSAR) model. Instead of randomly testing a large number of potential drug candidates against a drug target, as in highthroughput screening, the use of QSAR models provides a novel method to predict potential candidates based on their structure. The usefulness of such QSAR models has already been proven for over 50 years in many application areas of pharmacology, such as HIT identification and toxicity prediction in drug discovery [1].
M. Stawiski · P. Meier
School of Life Sciences, University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
R. Dornberger · T. Hanne (B)
Institute for Information Systems, University of Applied Sciences and Arts Northwestern Switzerland, Basel, Olten, Switzerland
e-mail: [email protected]
1.1 Genotoxicity
In preclinical development, molecules are tested for their genotoxicity, i.e., their ability to damage genetic information in cells, which is mandatory for every new drug submission according to the International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) Guideline S2 (R2). To reduce the drug attrition rate due to genotoxicity, the genotoxicity of small molecules is computationally predicted during the drug discovery phase
using expert systems and/or QSAR models. The computational prediction of genotoxicity offers first approaches to virtualize further toxicological studies to reduce the number of animal experiments in the future [2].
1.2 MAO-B Bioactivity Monoamine Oxidase Type-B (MAO-B) is an enzyme in our body that breaks down several chemicals in the brain, including dopamine. By giving a medication that blocks the effect of MAO-B, an MAO-B inhibitor, more dopamine is available to be used by the brain. This can modestly improve motor symptoms of Parkinson’s disease [3]. Using the MAO-B enzyme as a target for predicting the bioactivity of small molecules exemplifies the implementation of a virtual screening approach in the phase of HIT identification with a QSAR model. By using virtual screening in this phase of drug discovery it is possible to reduce the number of compounds to be screened for in the subsequent drug discovery steps. Such procedures in the phase of HIT identification have already been successfully used in previous studies [4].
1.3 CCR5 Bioactivity C–C chemokine receptor type-5 is a protein on the surface of white blood cells that is involved in the immune system as a receptor for chemokines. CCR5 inhibitors are a new class of antiretroviral drugs used to treat human immunodeficiency virus (HIV). They are designed to prevent HIV infection of CD4 T-cells by blocking the CCR5 receptor [5]. As with inhibition of the MAO-B enzyme, inhibition of the target CCR5 has a positive effect on the clinical picture. The receptor CCR5 is used as a target for virtual screening and small molecules are sought that are able to inhibit this receptor.
2 Description of the Optimization Problem
2.1 Problem Model
The QSAR models form our problem model, which consists of two parts:
Description of molecular structure. For the QSAR model, the chemical structures and their substructures need to be converted from the Simplified Molecular Input Line Entry System (SMILES), a chemical notation that allows a chemical structure to be represented in such a way that it can be used by a computer, into the molecular descriptors and fingerprints shown in Fig. 1. The finally obtained data format is the result of a
mathematical procedure. This conversion of a symbolic representation of molecule-encoded chemical information into a useful number, or the result of a standardized experiment, can be achieved using the open-source software PaDEL-Descriptor [6].
Supervised Machine Learning. To identify correlations between molecular descriptors and fingerprints and biological properties such as genotoxicity or bioactivity, supervised machine learning models are most commonly used [2, 7, 8]. This approach requires labeled datasets to train the algorithm to classify data or predict outcomes accurately.
Fig. 1 SMILES generation algorithm for ciprofloxacin
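As an illustration of this conversion step, the sketch below computes a substructure-style fingerprint from a SMILES string using RDKit. This is only a stand-in for the PaDEL-Descriptor workflow actually used in this chapter (which produces 881 substructure fingerprint bits), and the molecule (aspirin) is an arbitrary example rather than the ciprofloxacin of Fig. 1.

```python
# Illustrative alternative to the PaDEL-Descriptor step: computing a
# substructure-style fingerprint from a SMILES string with RDKit.
# The SMILES below (aspirin) and the fingerprint type are stand-ins.
from rdkit import Chem
from rdkit.Chem import MACCSkeys

smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"      # aspirin, used only as an example
mol = Chem.MolFromSmiles(smiles)          # parse SMILES into a molecule object
fp = MACCSkeys.GenMACCSKeys(mol)          # 167-bit MACCS substructure keys
features = list(fp)                       # 0/1 feature vector for a QSAR model
print(len(features), sum(features))
```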
2.2 Optimization Methods Random Forest. To apply supervised machine learning in QSAR models, mostly Random Forest (RF) methods are used, as exemplified given in [2, 7, 8]. The RF approach is a classification and regression method widely used in statistical analysis and predictive modeling. RF is a robust decision tree algorithm that consists of multiple uncorrelated decision trees. All decision trees are evolved under a certain type of randomization during the learning process, where each tree is created using a bootstrap sample of the input data and a random selection of the best k predictors in each partition of the tree. In a supervised learning approach, each tree in the forest can make a decision, in the case of classification the model predicts the class with the most votes, whereas in the case of regression the mean of all trees determines the predicted value. Due to the ensemble nature of RF, it is difficult to understand the relationship between predictors and observations; however, it is possible to quantify the effects of predictors within the ensemble on the prediction using the improvement criteria for the entire ensemble [7]. In a RF model, various hyperparameters can be tuned to optimize the model. The following hyperparameter settings have been applied, see Table 1. Light Gradient Boosting Machine. Still untested in QSAR models is the method LightGBM, a highly efficient Gradient Boosting Decision Tree, which includes two techniques: gradient-based one-sided sampling and exclusive feature bundling to process a large number of data instances or a large number of features [9]. The LightGBM has been used to diagnose Parkinson’s disease using only speech recordings [10] and has also shown superior performance in detecting daily activities using sensory data from wearables. Furthermore, in a LightGBM model, various hyperparameters can be tuned to optimize the model. The following hyperparameter settings were applied, see Table 2. Table 1 Hyperparameters of RF
Parameter name | Description | Used parameter
n_estimators | Number of trees in the forest | 1000
max_depth | Maximum depth of the tree | 10
min_samples_split | Minimum number of samples to split | 5
min_samples_leaf | Minimum number of samples per leaf | 3
Table 2 Hyperparameters of LightGBM
Parameter name | Description | Used parameter range
n_estimators | Number of estimators | [6, 24, 96]
Max depth | Maximum tree depth for base learners | [21, 42]
Num leaves | Maximum tree leaves for base learners | [32, 64]
Boosting type | Boosting algorithm type | ["gbdt", "dart"]
Learning rate | Boosting learning rate | [0.063, 0.126]
Subsample for bin | Number of samples for constructing bins | [60000]
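Section 2.3 describes how these ranges are swept by a grid search with stratified fivefold cross-validation. A minimal sketch of such a sweep, assuming the scikit-learn and lightgbm Python packages and placeholder fingerprint data, could look as follows; the 48 parameter combinations match the 48 models mentioned in Sect. 3.

```python
# Minimal sketch of a grid search over the Table 2 ranges with stratified
# fivefold cross-validation. X and y below are random placeholders standing
# in for the 881 substructure fingerprint bits and the class labels.
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

param_grid = {
    "n_estimators": [6, 24, 96],
    "max_depth": [21, 42],
    "num_leaves": [32, 64],
    "boosting_type": ["gbdt", "dart"],
    "learning_rate": [0.063, 0.126],
    "subsample_for_bin": [60000],
}  # 3*2*2*2*2*1 = 48 combinations

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 881))   # placeholder fingerprint matrix
y = rng.integers(0, 2, size=200)          # placeholder class labels

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(LGBMClassifier(), param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```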
2.3 Implementation and Comparison of Different Optimization Methods
Modeling process. A grid search algorithm is applied that systematically combines all previously defined hyperparameters to create different QSAR models. After empirical evaluation of suitable parameter ranges, we used the parameters listed in Table 2 for the grid search.
Internal validation. The predictive performance of these QSAR models is checked by stratified fivefold cross-validation [2] using only the training dataset. In cross-validation, not the entire dataset is used to build the predictive model; instead, the training dataset is split five times into a sub-training dataset and a sub-test dataset, so that the performance of a model built on the sub-training dataset can be evaluated on the sub-test dataset. Stratification ensures that the class distribution in the training and test datasets remains the same. By repeating this fivefold cross-validation, the average accuracies (or R-squared values, respectively) can be used to reliably evaluate the performance of the predictive QSAR model.
External validation. The four best QSAR models from the internal validation are compared again using an external test set. For this, the entire training dataset is used to build and train the predictive QSAR models, so that the performance of the different QSAR models built on the training dataset can be evaluated on the external test dataset.
Classification metrics. To compare the classification models, the following metrics are considered: accuracy, precision, and recall, as well as the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC). The metrics are calculated as follows, using the terms True Positive (tp), False Negative (fn), False Positive (fp), and True Negative (tn):
$$\text{Accuracy}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}}-1} 1(\hat{y}_i = y_i) \tag{1}$$

$$\text{Precision} = \frac{tp}{tp + fp} \tag{2}$$

$$\text{Recall} = \frac{tp}{tp + fn} \tag{3}$$

$$\text{Fall-out} = \frac{fp}{fp + tn} \tag{4}$$

The ROC curve is created by plotting the Recall against the Fall-out at all classification thresholds.
Regression metrics. To compare the regression models, the following metrics are considered: Max Error, Mean Absolute Error (MAE), and Mean Squared Error (MSE), as well as the R-squared value.

$$\text{Max Error}(y, \hat{y}) = \max_i \left| y_i - \hat{y}_i \right| \tag{5}$$

$$\text{MAE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}}-1} \left| y_i - \hat{y}_i \right| \tag{6}$$

$$\text{MSE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}}-1} (y_i - \hat{y}_i)^2 \tag{7}$$

$$R^2(y, \hat{y}) = 1 - \frac{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \hat{y}_i)^2}{\sum_{i=0}^{n_{\text{samples}}-1} (y_i - \bar{y})^2} \tag{8}$$
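For reference, Eqs. (1)-(8) correspond directly to standard scikit-learn metric functions; the small example below only illustrates the calls with made-up values.

```python
# The metrics in Eqs. (1)-(8) map onto scikit-learn functions; the tiny
# arrays below are made-up values used only to illustrate the calls.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             roc_auc_score, max_error, mean_absolute_error,
                             mean_squared_error, r2_score)

# Classification example (1 = positive class, e.g., mutagen).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]   # scores for the ROC / AUC
print(accuracy_score(y_true, y_pred), precision_score(y_true, y_pred),
      recall_score(y_true, y_pred), roc_auc_score(y_true, y_score))

# Regression example for Eqs. (5)-(8).
r_true = [3.1, 2.0, 5.5, 4.2]
r_pred = [2.9, 2.4, 5.0, 4.5]
print(max_error(r_true, r_pred), mean_absolute_error(r_true, r_pred),
      mean_squared_error(r_true, r_pred), r2_score(r_true, r_pred))
```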
3 Results 3.1 Genotoxicity Data generation. The two datasets from Chu et al. [7] are used as the data basis, comprising a total of 5395 molecules split into two sets, a training set of 4402 molecules (2549 mutagens and 1853 non-mutagens) and a test set of 993 molecules (498 mutagens and 495 non-mutagens). From these molecules, 881 substructure fingerprints are calculated to characterize the structure of the compound and are further used as features in the QSAR problem model. Internal validation. A total of 48 different LightGBM QSAR models are created. All created models achieve a good performance during the fivefold cross validation
with respect to accuracy (0.803 ± 0.025), see Fig. 2. The best LightGBM model shows an accuracy of 0.829 ± 0.020. The additionally created reference RF model achieves an accuracy of 0.826 ± 0.015 during the fivefold cross validation, which is only marginally worse than the best LightGBM model, shown in Fig. 3. External validation. In the external validation with the test dataset, the best LightGBM model shows an accuracy of 0.774, precision of 0.798, and recall of 0.737. The reference RF model shows an accuracy of 0.778, precision of 0.810, and recall of 0.729, which is only marginally better in accuracy and precision, compared
Fig. 2 Accuracy comparison between all LightGBM models and the reference RF model during internal validation for the genotoxicity QSAR problem
Fig. 3 Comparison between the best LightGBM model and the reference RF model during internal validation for the genotoxicity QSAR problem
Fig. 4 Comparison of different metrics between the best LightGBM models and the reference RF model during external validation for the genotoxicity QSAR problem
to the LightGBM model. Significant differences between the two methods are only seen in the required training times, the RF model runs 13 s, while the LightGBM model only requires between 1.04 and 1.59 s for the training process. The other metrics and further machine learning models are listed in Fig. 4 for the external validation.
3.2 MAO-B Bioactivity Data generation. Existing data of small molecules exhibiting a half maximal inhibitory concentration value on the target MAO-B are retrieved from the ChEMBL database and stored as a dataset. 5067 small molecules entries are found corresponding to this target. After preprocessing the data, removing salt residues from the SMILES, and filtering out duplicate entries, 4200 entries were considered. From these molecules, 881 substructure fingerprints are calculated to characterize the structure of the compound and further used as features in the QSAR problem model. The data is then divided (80:20) into separate datasets, the training set containing 3360 molecules and the test set 840 molecules. Internal validation. A total of 48 different LightGBM QSAR models are created. For further analysis and graphical representation, only 40 models with a positive R-square value are considered. In general, the created models achieve a weak performance during the fivefold cross validation with respect to R-squared (0.411 ± 0.168), see Fig. 5; however, the best LightGBM model shows a moderate R-squared value of 0.636 ± 0.011. The additionally created reference RF model achieves an R-squared value of 0.563 ± 0.013 during the fivefold cross validation, which is worse than the result of the best LightGBM model, shown in Fig. 6. External validation. In the external validation with the test dataset, the best LightGBM model shows an R- squared value of 0.648. The reference RF model
Fig. 5 Accuracy comparison between the LightGBM models and the reference RF model during internal validation for the MAO-B QSAR problem
Fig. 6 Comparison between the best LightGBM model and the reference RF model during internal validation for the MAO-B QSAR problem
shows an R-squared value of 0.586. Significant differences between the two methods are only seen in the required training times, the RF model taking 64.6 s, while the LightGBM model only requires between 0.85 and 1.25 s for the training process. The other metrics and further machine learning models are listed in Fig. 7 for the external validation.
Fig. 7 Comparison of different metrics between the best LightGBM models and the reference RF model during external validation for the MAO-B QSAR problem
3.3 CCR5 Bioactivity Data generation. Existing data of small molecules exhibiting a half maximal inhibitory concentration value on the target CCR5 are retrieved from the ChEMBL database and stored as a dataset. 3862 small molecule entries are found corresponding to this target. After preprocessing the data, removing salt residues from the SMILES, and filtering out duplicate entries, 2086 entries were considered. From these molecules, 881 substructure fingerprints are calculated to characterize the structure of the compound and are further used as features in the QSAR problem model. The data is then divided (80:20) into separate datasets, the training set containing 1668 molecules and the test set with 418 molecules. Internal validation. A total of 48 different LightGBM QSAR models are created. For further analysis and graphical representation, only 40 models with a positive R-square value were considered. In general, the created models achieve a moderate performance during the fivefold cross validation with respect to R-squared (0.523 ± 0.171), see Fig. 8; however, the best LightGBM model shows a good R-squared value of 0.713 ± 0.021. The additionally created reference RF model achieves an R-squared value of 0.674 ± 0.013 during the fivefold cross validation, which is worse than the result of the best LightGBM model, shown in Fig. 9. External validation. In the external validation with the test dataset, the best LightGBM model shows an R-squared value of 0.767. The reference RF model shows an R-squared value of 0.695. Significant differences between the two methods are only seen in the required training times, the RF model taking 29.1 s, while the LightGBM models only require between 0.52 and 0.66 s for the training process. The other metrics and further machine learning models are listed in Fig. 10 for the external validation.
Fig. 8 Accuracy comparison between the LightGBM models and the reference RF model during internal validation for the CCR5 QSAR problem
Fig. 9 Comparison between the best LightGBM model and the reference RF model during internal validation for the CCR5 QSAR problem
Fig. 10 Comparison of different metrics between the best LightGBM models and the reference RF model during external validation for the CCR5 QSAR problem
4 Conclusions The LightGBM, developed with a focus on performance and scalability, is slowly finding its way into practical use and may replace the RF as the most popular supervised learning method for prediction in QSAR models. Especially in the field of high-throughput screening, the data volume is increasing significantly. Thus, it is important to choose a reliable and time-efficient method when selecting a machine learning approach. One of the biggest drawbacks of the RF is the extensive time consumption it requires to compute each decision tree, whereas LightGBM requires considerably less time for the calculation of the individual decision trees. A major difference between the two methods lies in the construction of the trees. LightGBM does not build a tree row by row, as the RF method does, but leaf by leaf. LightGBM selects the leaf that it assumes will provide the greatest loss reduction. It implements a highly optimized histogram-based decision tree algorithm that offers great advantages in both efficiency and memory consumption. In our investigated applications, we are not dealing with big data, however we already observe significant time differences. Furthermore, our study shows that the emerging LightGBM also has the potential for QSAR models and provides equivalent results to the popular RF. Both methods delivered comparable scores, in the internal and external validation.
References 1. Muratov EN, Bajorath J, Sheridan RP et al (2020) QSAR without borders. Chem Soc Rev 49(11):3525–3564. https://doi.org/10.1039/d0cs00098a 2. Yang X, Zhang Z, Li Q, Cai Y (2021) Quantitative structure–activity relationship models for genotoxicity prediction based on combination evaluation strategies for toxicological alternative experiments. Sci Rep 11(1):8030. https://doi.org/10.1038/s41598-021-87035-y 3. Binde CD, Tvete IF, Gåsemyr JI, Natvig B, Klemp M (2020) Comparative effectiveness of dopamine agonists and monoamine oxidase type-B inhibitors for Parkinson’s disease: a multiple treatment comparison meta-analysis. Europ J Clinical Pharmacol 76(12):1731–174. https://doi. org/10.1007/s00228-020-02961-6 4. Shao J, Gong Q, Yin Z, Pan W, Pandiyan S, Wang L (2022) S2DV: converting SMILES to a drug vector for predicting the activity of anti-HBV small molecules. Briefings Bioinformat 23(2):bbab593. https://doi.org/10.1093/bib/bbab593 5. Rao PKS (2009) CCR5 inhibitors: Emerging promising HIV therapeutic strategy. Indian J Sexually Trans Diseases AIDS 30(1):1–9. https://doi.org/10.4103/2589-0557.55471 6. Yap CW (2011) PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J Comput Chem 32(7):1466–1474. https://doi.org/10.1002/jcc.21707 7. Chu CSM, Simpson JD, O’Neill PM, Berry NG (2021) Machine learning—predicting ames mutagenicity of small molecules. J Mol Graph Model 109:108011. https://doi.org/10.1016/j. jmgm.2021.108011 8. Simeon S, Anuwongcharoen N, Shoombuatong W, Malik AA, Prachayasittikul V, Wikberg JE, Nantasenamat C (2016) Probing the origins of human acetylcholinesterase inhibition via QSAR modeling and molecular docking. PeerJ 4:e2322. https://doi.org/10.7717/peerj.2322
9. Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu TY (2017) LightGBM: a highly efficient gradient boosting decision tree. In: Proceedings of the 31st international conference on neural information processing systems, Red Hook, NY, USA, Dec. 2017, pp 3149–3157, Curran Associates (2017) 10. Karabayir I, Goldman SM, Pappu S, Akbilgic O (2020) Gradient boosting for Parkinson’s disease diagnosis from voice recordings. BMC Med Inform Decis Mak 20(1):1–7. https://doi. org/10.1186/s12911-020-01250-7
Chapter 11
The Adaptation of the Discrete Rat Swarm Optimization Algorithm to Solve the Quadratic Assignment Problem Toufik Mzili, Mohammed Essaid Riffi, and Ilyass Mzili
1 Introduction The lack of deterministic algorithms that can optimally solve NP-hard combinatorial optimization problems (POCs) within a reasonable computation time has led to the use of approximate search algorithms, specifically metaheuristics, for this purpose. These metaheuristics are constrained to search within combinatorial spaces. Researchers, mathematicians, and scientists are working diligently to develop metaheuristics and approximate algorithms capable of solving these types of combinatorial problems. Several types of heuristics and metaheuristics [1–5] have been identified, including those inspired by physics and chemistry (e.g., the simulated annealing algorithm [6] based on heat variation), those based on the mathematical modeling of the law of gravity and Newton’s motion (e.g., the gravitational search algorithm [7, 8]), and those inspired by natural selection processes (e.g., the genetic algorithm [9, 10]). Metaheuristics based on the behaviors of animal swarms [11–15] such as hunting, foraging, and prey pursuit are also of interest. These algorithms have recently gained importance and evolved due to their ability to cover the entire search space, their low complexity and parameter requirements, and their low memory usage. Some algorithms have shown promising results in approximating solutions for various continuous and discrete optimization problems.
T. Mzili (B) · M. E. Riffi
Department of Computer Science, Faculty of Sciences, Chouaib Doukkali University, El Jadida, Morocco
e-mail: [email protected]
I. Mzili
Department of Management, Faculty of Economics and Management, Hassan First University, Settat, Morocco
In the rest of this work, we study the performance of these algorithms in solving discrete optimization problems such as the traveling salesman problem and the quadratic assignment problem. We also present a new bio-inspired metaheuristic inspired by the natural behavior of rats. The rat swarm optimizer (RSO) is a metaheuristic inspired by the attacking and fighting behavior of rats. According to the results reported by Dhiman et al. [16], this optimizer gives excellent results in solving continuous optimization problems, outperforming most known metaheuristics in this respect, and it has also been applied successfully to the famous discrete traveling salesman problem [17]. In this work, we extend this optimizer to solve a discrete problem that is considerably larger than the traveling salesman problem, namely, the quadratic assignment problem.
2 RSO Algorithm
The rat-inspired algorithm was initially proposed by Dhiman et al. [16] in 2021 to deal with continuous optimization problems. It was created by simulating predators attacking and fighting while searching for prey. Rats are social animals that prefer to live in groups and perform many of their tasks together, including hunting, attacking, and foraging. The two behaviors that serve as the basis for this bio-inspired algorithm are:
Hunting behavior, in which rats hunt their prey in packs. Each time the group members think they have located the prey, they designate a captain and follow him; because the captain changes repeatedly, the pack ends up covering the entire search area.
Fighting behavior with the prey: in order to capture their prey, the rats fight with it, and in several cases this fight causes the death of some rats, which translates into the cancellation of the corresponding solutions.
2.1 Mathematical and Logical Modeling of Behavior This section explains the chasing and fighting behavior of rats. The proposed RSO algorithm is then described.
2.1.1 Prey Pursuit

Rats are sociable animals that typically hunt their prey in packs as a consequence of their agonistic social behavior. To define this behavior quantitatively, we assume that the best searcher knows the location of the prey. The other searchers can then update their positions with respect to the best searcher found so far. The following equations are proposed to
model this mechanism:

Loc = δ × Loc_i + β × (Loc_Best − Loc_i)    (1)

where Loc_Best is the best optimal solution found so far and Loc_{i+1} specifies the updated location of the rats. The parameters δ and β are determined as follows:

δ = θ − ρ × (θ / MaxIteration),   1 ≤ θ ≤ 5,   ρ = 1, 2, 3, . . . , MaxIteration    (2)

Therefore, δ and β play a crucial role in maintaining an efficient balance between exploration and exploitation during the iterative process. These parameters are assigned random values in the ranges [0, 2] and [1, 5], which contributes to their sensitivity in adapting to the different stages of the process.
2.1.2 Combating a Prey

In many cases, the chase ends with the death of some rats. The following equation was presented as a mathematical definition of this process:

Loc_{i+1} = |Loc_Best − Loc_i|    (3)

where Loc_{i+1} specifies the rat's next updated position. The best solution is preserved, and the positions of the other search agents are updated in relation to the best search agent.
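To make the update rules above concrete, the following minimal sketch applies Eqs. (1)–(3) to a population of candidate solutions in a continuous search space. It is only an illustration: the population size, bounds, and fitness function are assumed values, and, following the original RSO formulation, the intermediate position from Eq. (1) is the quantity substituted into Eq. (3).

```python
import numpy as np

def rso_continuous(fitness, dim, n_rats=30, max_iter=100, lb=-10.0, ub=10.0):
    """Minimal sketch of the continuous Rat Swarm Optimizer (Eqs. 1-3)."""
    rng = np.random.default_rng(0)
    rats = rng.uniform(lb, ub, size=(n_rats, dim))            # initial positions
    best = min(rats, key=fitness).copy()                      # Loc_Best so far
    for rho in range(1, max_iter + 1):
        theta = rng.uniform(1, 5)                             # 1 <= theta <= 5
        delta = theta - rho * (theta / max_iter)              # Eq. (2)
        beta = rng.uniform(0, 2)
        for i in range(n_rats):
            step = delta * rats[i] + beta * (best - rats[i])  # Eq. (1)
            rats[i] = np.clip(np.abs(best - step), lb, ub)    # Eq. (3)
            if fitness(rats[i]) < fitness(best):
                best = rats[i].copy()
    return best

# Example: minimize the sphere function in five dimensions.
print(rso_continuous(lambda x: float(np.sum(x ** 2)), dim=5))
```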
3 Quadratic Assignment Problem (QAP)

Koopmans and Beckmann [18] introduced the quadratic assignment problem (QAP) in 1957 in the context of the location of indivisible economic activities. The goal of the problem is to minimize the overall assignment cost when assigning a group of facilities to a group of locations. The assignment cost for a pair of facilities depends on the flow between them as well as the distance between their locations.

Example Consider a facility location problem with four facilities (and four locations). One possible assignment is the following: facility 2 is assigned to location 1, facility 1 to location 2, facility 4 to location 3, and facility 3 to location 4. This assignment can be written as the permutation p = (2, 1, 4, 3).
Fig. 1 Facility location problem with four facilities
In Fig. 1, a line between a pair of facilities indicates a required flow between them; the thickness of the line increases with the flow value. In order to calculate the assignment cost of a permutation, the required flows between facilities (as shown in Table 1) and the distances between locations (as shown in Table 2) are needed. The assignment cost of this permutation is then calculated as follows:

Objective function = flow(1, 2) × distance(2, 1) + flow(1, 4) × distance(2, 3) + flow(2, 4) × distance(1, 3) + flow(3, 4) × distance(3, 4) = 3 × 22 + 2 × 40 + 1 × 53 + 4 × 55 = 419.

Table 1 The movements between facilities
Facility i    Facility j    Flow (i, j)
1             2             3
1             4             2
2             4             1
3             4             4

Table 2 Distances between sites
Location i    Location j    Distance (i, j)
1             3             53
2             1             22
2             3             40
3             4             55
Formulation in Mathematics

The Koopmans–Beckmann formulation of the QAP is presented here. Given a set of facilities and a set of locations, together with the flows between facilities and the distances between locations, the goal of the quadratic assignment problem is to assign each facility to a location while minimizing the overall cost.

Sets
N = {1, 2, 3, . . . , n}
S_n = {φ : N → N} represents the set of all permutations.

Parameters
F = (f_{i,j})   matrix of flows between facilities i and j.
D = (d_{i,j})   matrix of distances between locations i and j.

Optimization Problem

min_{φ ∈ S_n}  Σ_{i=1}^{n} Σ_{j=1}^{n} f_{i,j} × d_{φ(i)φ(j)}    (4)

A permutation φ, where φ(i) is the location to which facility i is assigned, is used to indicate the assignment of facilities to locations. The cost of assigning facility i to location φ(i) and facility j to location φ(j) equals the individual product f_{i,j} × d_{φ(i)φ(j)}.
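As a quick illustration of Eq. (4), the sketch below evaluates the assignment cost of a permutation for the four-facility example above. The helper and data layout are assumptions made here for illustration: the flows of Table 1 are stored as an upper-triangular matrix and the distances of Table 2 as a symmetric matrix, with all unlisted pairs assumed to be zero, so that the double sum reproduces the cost of 419 computed in the worked example.

```python
import numpy as np

def qap_cost(perm, flow, dist):
    """Eq. (4): sum over i, j of flow[i][j] * dist[perm[i]][perm[j]]."""
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))

# Table 1 as an upper-triangular flow matrix (0-indexed facilities).
flow = np.array([[0, 3, 0, 2],
                 [0, 0, 0, 1],
                 [0, 0, 0, 4],
                 [0, 0, 0, 0]])
# Table 2 as a symmetric distance matrix (0-indexed locations).
dist = np.array([[0, 22, 53, 0],
                 [22, 0, 40, 0],
                 [53, 40, 0, 55],
                 [0, 0, 55, 0]])

perm = [1, 0, 3, 2]                 # p = (2, 1, 4, 3) written with 0-based indices
print(qap_cost(perm, flow, dist))   # prints 419
```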
4 Discrete RSO Algorithm for QAP

The standard RSO is, in principle, a continuous optimization method created to optimize continuous nonlinear functions, and this technique cannot be used to solve discrete problems directly. In this work, a DRSO is derived from the original RSO to solve discrete combinatorial problems, of which the QAP is an example. The fundamental equations need to be modified in order to apply the DRSO to the QAP. The first modification concerns the position representation, the second the position update equation, and the third the RSO parameters and operators. Additionally, various neighborhood search techniques are commonly utilized for combinatorial problems to improve the quality of the result. The two-exchange neighborhood function is an
appropriate neighborhood search technique for QAPs in this study. The subsection that follows will go into further information about this function.
4.1 Update Position

In the fundamental RSO created by Dhiman et al. [16], the movement of each virtual rat in an n-dimensional search space (n being the size of the problem) is described by its location. During the process of hunting and searching for prey, the rats tend to move toward the best location (solution) found since the first iteration by updating their positions P_i at time step t. During the hunt, certain rats perish in the conflict with the prey. Therefore, in the instance of the QAP, each rat represents a potential solution, and Loc_Best represents the best assignment of the n facilities discovered so far.
4.2 Check the Position Quality

The process of hunting and fighting the prey can, in many cases, lead to the death of rats that are not strong enough. This process can be modeled as follows: each rat represents a solution, and the death of a rat is translated as the discarding of that solution, see Eq. (3).
4.3 Operators of the Discrete RSO

In the continuous case, the logical and mathematical operators are applied to real and natural numbers. In the discrete case, however, they cannot be applied with their continuous definitions, because most discrete problems aim at optimizing discrete structures such as orders, sequences, or permutations. Therefore, these operators have to be replaced by discrete operators.
• The addition operator: this operator is used to add a step to the current position and thus change it. In the discrete case, it can be presented as the application of a set of permutations (swaps) that change the locations of the facilities.
• The subtraction operator: the subtraction Loc_Best − Loc_i between two solutions can be defined as the set of permutations (swaps) to be performed on Loc_i in order to obtain Loc_Best.
• The multiplication operator: this operator can be defined as an operator that reduces the number of permutations retained, applicable between a real number and a list, as in β × (Loc_Best − Loc_i).
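A minimal sketch of how these discrete operators can be realized for permutation-encoded solutions is given below; the swap-list representation and the truncation used for the multiplication are illustrative choices, not necessarily the authors' exact implementation.

```python
def subtract(loc_best, loc_i):
    """loc_best - loc_i: the list of swaps that transforms loc_i into loc_best."""
    current, swaps = list(loc_i), []
    for pos, wanted in enumerate(loc_best):
        if current[pos] != wanted:
            j = current.index(wanted)
            swaps.append((pos, j))
            current[pos], current[j] = current[j], current[pos]
    return swaps

def multiply(beta, swaps):
    """beta * swaps: keep only a fraction of the swap list (0 <= beta <= 1 assumed)."""
    return swaps[:int(round(beta * len(swaps)))]

def add(loc_i, swaps):
    """loc_i + swaps: apply the retained swaps to the current permutation."""
    new = list(loc_i)
    for a, b in swaps:
        new[a], new[b] = new[b], new[a]
    return new

loc_i, loc_best = [2, 0, 3, 1], [0, 1, 2, 3]
print(add(loc_i, multiply(0.5, subtract(loc_best, loc_i))))
```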
4.4 Parameters of Discrete RSO

The parameters of the Discrete Rat Swarm Optimizer (DRSO) for solving the Quadratic Assignment Problem (QAP) are presented in Table 3. Additionally, various neighborhood search techniques are always utilized for combinatorial problems to enhance the quality of the solutions. The two-exchange neighborhood function is a reliable neighborhood search technique for QAPs in this study. The following section provides this function's specifics.

Table 3 Parameters of discrete RSO
Parameter                      Value
Population of rats: N          60
δ                              A random value between [1, 5]
β                              A random value between [0, 1]
Number of iterations           1000
4.5 The Two Exchange

For the QAP, the two-exchange move is a very simple local optimization technique. Starting from a random assignment, it repeatedly exchanges the locations of two facilities, as long as this results in a lower-cost assignment. The k-opt move generally adheres to the same philosophy as the 2-opt: rather than exchanging only two locations, k locations of the assignment are replaced with k other locations, as long as this results in a lower cost.
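The sketch below outlines the two-exchange local search just described, reusing the qap_cost helper from the earlier example; the first-improvement acceptance strategy shown here is one possible choice rather than the authors' stated one.

```python
def two_exchange(perm, flow, dist):
    """Swap pairs of locations while the QAP cost keeps improving."""
    best, best_cost = list(perm), qap_cost(perm, flow, dist)
    improved = True
    while improved:
        improved = False
        for a in range(len(best) - 1):
            for b in range(a + 1, len(best)):
                cand = list(best)
                cand[a], cand[b] = cand[b], cand[a]      # exchange two locations
                cost = qap_cost(cand, flow, dist)
                if cost < best_cost:                     # keep the improving move
                    best, best_cost, improved = cand, cost, True
    return best, best_cost
```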
4.6 The Objective Function The objective function is the sum of the distances between the locations multiplied by the flow between the locations, see Eq. (4). Here is the pseudocode for the discrete rat swarm optimization algorithm (Algorithm 1).
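As a rough illustration of Algorithm 1, the following sketch outlines one plausible main loop of the DRSO for the QAP, combining the discrete operators of Sect. 4.3 with the two-exchange search of Sect. 4.5. The exact update order, the scaling of the retained swap fraction, and the replacement rule are assumptions, not the authors' implementation.

```python
import random

def drso_qap(flow, dist, n_rats=60, max_iter=1000, seed=1):
    """Sketch of a discrete RSO main loop for the QAP."""
    rng = random.Random(seed)
    n = len(flow)
    rats = [rng.sample(range(n), n) for _ in range(n_rats)]      # random permutations
    best = min(rats, key=lambda p: qap_cost(p, flow, dist))
    for rho in range(1, max_iter + 1):
        theta = rng.uniform(1, 5)
        delta = theta - rho * (theta / max_iter)                 # Eq. (2)
        beta = rng.uniform(0, 1)
        for i, rat in enumerate(rats):
            swaps = subtract(best, rat)                          # Loc_Best - Loc_i
            kept = multiply(min(abs(delta) * beta, 1.0), swaps)  # discrete beta * (...)
            candidate = add(rat, kept)                           # move toward the best
            candidate, _ = two_exchange(candidate, flow, dist)   # local improvement
            if qap_cost(candidate, flow, dist) < qap_cost(rat, flow, dist):
                rats[i] = candidate                              # weaker rat is discarded
        best = min(rats + [best], key=lambda p: qap_cost(p, flow, dist))
    return best, qap_cost(best, flow, dist)
```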
5 Computer Results and Analysis

5.1 Results

The experiments were implemented in C++ and conducted on a quad-core Intel Core i5 processor with 4 GB of RAM. The QAPLIB library was used for testing, and the instances tested ranged in size from 12 to 100, as indicated by the instance names (e.g., the number 19 in the instance named "els19" indicates the number of facilities). To compare the performance of our algorithm with other metaheuristics, we defined the comparison criteria in Table 4 and present the results produced by the Discrete Rat Swarm Optimization (DRSO) method in Table 5.

Table 4 The comparison criteria
Criteria     Description
Opt-know     The optimal solution
Best         The length of the best solution found by each algorithm
Time (s)     The average time in seconds over the 30 executions
Worst        The length of the worst solution found by each algorithm
Err (%)      The percentage deviation of the average solution length from the optimal: (average − opt) / opt × 100
Table 5 Results produced by DRSO method
Instance   Opt-know     Best         Worst        Average      Err   Time (s)
Bur26a     5,426,670    5,426,670    5,426,670    5,426,670    0     0.5
Bur26b     3,817,852    3,817,852    3,817,852    3,817,852    0     0.2
Bur26c     5,426,795    5,426,795    5,426,795    5,426,795    0     1.3
Bur26d     3,821,225    3,821,225    3,821,225    3,821,225    0     2.0
Bur26e     5,386,879    5,386,879    5,386,879    5,386,879    0     1.7
Bur26f     3,782,044    3,782,044    3,782,044    3,782,044    0     1.2
Bur26g     10,117,172   10,117,172   10,117,172   10,117,172   0     0.3
Bur26h     7,098,658    7,098,658    7,098,658    7,098,658    0     0.9
Chr12a     9552         9552         9552         9552         0     0.4
Chr12b     9742         9742         9742         9742         0     0.9
Chr12c     11,156       11,156       11,156       11,156       0     0.1
Chr15a     9896         9896         9896         9896         0     0.1
Had12      1652         1652         1652         1652         0     0
Had14      2724         2724         2724         2724         0     0
Had16      3720         3720         3720         3720         0     0
Had18      5358         5358         5358         5358         0     1
Had20      6922         6922         6922         6922         0     1
Nug12      578          578          578          578          0     0
Nug14      1014         1014         1014         1014         0     0
Nug15      1150         1150         1150         1150         0     1
5.2 Comparison of DRSO with CSO

To demonstrate the robustness of our algorithm, we compare it to a well-known metaheuristic based on the natural behavior of cats in the wild, as adapted by Bouzidi and Riffi [19], which has been shown to be effective in solving discrete optimization problems such as the traveling salesman problem and scheduling problems. In order to compare the performance of our proposed algorithm with that of the cat-inspired metaheuristic, we use two criteria: the average error, to measure the deviation of the solutions obtained, and the running time, to determine the faster algorithm. These results are presented in Table 6. According to the results presented in Table 6 and the graphs in Figs. 2 and 3, it can be seen that the Discrete Rat Swarm Optimization (DRSO) algorithm was able to produce good solutions in a shorter amount of time compared to the cat swarm optimization algorithm. Additionally, as shown in Fig. 2, all of the solutions obtained by the DRSO algorithm are close to the best-known solution.
Table 6 Comparison of DRSO with CSO
Instance   Opt          CSO Opt      CSO Err   CSO Time (s)   DRSO Opt     DRSO Err   DRSO Time (s)
Bur26a     5,426,670    5,426,670    0         15             5,426,670    0          0.5
Bur26b     3,817,852    3,817,852    0         20             3,817,852    0          0.2
Bur26c     5,426,795    5,426,795    1         24             5,426,795    0          1.3
Bur26d     3,821,225    3,821,225    0         69             3,821,225    0          2.0
Bur26e     5,386,879    5,386,879    0         5              5,386,879    0          1.7
Bur26f     3,782,044    3,782,044    0         21             3,782,044    0          1.2
Bur26g     10,117,172   10,117,172   0         36             10,117,172   0          0.3
Bur26h     7,098,658    7,098,658    0         22             7,098,658    0          0.9
Fig. 2 Percentage error comparison between DRSO and CSO
Fig. 3 Comparison of the average running time
6 Conclusion In this study, we introduced an adaptation of the Discrete Rat Swarm Optimization (DRSO) algorithm to solve the Quadratic Assignment Problem (QAP). This initial version of the DRSO algorithm, which is not hybridized with any other method, has already shown success in solving the traveling salesman problem. In this research, we aimed to expand the applicability of our approach to more challenging discrete problems. To evaluate the effectiveness of our initial adaptation of the DRSO algorithm, we compared its performance to that of a bio-inspired metaheuristic based on cat behavior. The results of this comparison demonstrate that the initial adaptation of the DRSO algorithm is effective in solving problems of this nature and that further development, such as the incorporation of additional improvement heuristics or hybridization with other metaheuristics, could lead to even better results. In the DRSO algorithm, each rat is a simple entity that can move in a single dimension to find the optimal path through its environment. Each rat can move to a nearby location or search for a different location within a certain radius of its current position. Author Contributions Research problem, M.R., T.M. and I.M.; conceptualization, M.R., T.M. and I.M.; methodology, M.R., T.M., and I.M.; formal analysis, M.R., T.M.; resources, M.R., I.M.; original drafting, M.R. and T.M.; reviewing and editing, M.R., M., I.M; project administration, M.R., T.M., IM; supervision, M.R., I.M.; proposal, improvement, and ideation, M.R., I.M. All authors have read and approved the published version of the manuscript. Acknowledgements The authors would like to express their gratitude to editors and anonymous referees for their informative, helpful remarks and suggestions to improve this paper as well as the important guiding significance to our research. Funding This research received no external funding.
Conflict of Interest The authors declare that they have no conflict of interest.
References 1. Gao XZ, Govindasamy V, Xu H, Wang X, Zenger K (2015) Harmony search method: theory and applications. Comput Intell Neurosci 2015:1–10. https://doi.org/10.1155/2015/258491 2. Golabian H, Arkat J, Tavakkoli-Moghaddam R, Faroughi H (2021) A multi-verse optimizer algorithm for ambulance repositioning in emergency medical service systems. J Ambient Intell Humaniz Comput 13(1):549–570. https://doi.org/10.1007/s12652-021-02918-2 3. Arnold DV, Beyer H-G (2002) Noisy optimization with evolution strategies. Springer Science & Business Media 4. Barbarosoglu G, Ozgur D (1999) A tabu search algorithm for the vehicle routing problem. Comput Oper Res 26(3):255–270. https://doi.org/10.1016/s0305-0548(98)00047-1
5. Atashpaz-Gargari E, Lucas C (2007) Imperialist competitive algorithm: an algorithm for optimization inspired by imperialistic competition. IEEE Xplore. https://doi.org/10.1109/CEC. 2007.4425083 6. Goto T, Najafabadi HR, Falheiro M, Martins TC, Barari A, Tsuzuki MSG (2021) Topological optimization and simulated annealing. IFAC-PapersOnLine 54(1):205–210. https://doi.org/10. 1016/j.ifacol.2021.08.078 7. Wang Y, Gao S, Yu Y, Cai Z, Wang Z (2021) A gravitational search algorithm with hierarchy and distributed framework. Knowl-Based Syst 218:106877. https://doi.org/10.1016/j.knosys. 2021.106877 8. Kaveh A, Dadras A (2017) A novel meta-heuristic optimization algorithm: thermal exchange optimization. Adv Eng Softw 110:69–84. https://doi.org/10.1016/j.advengsoft.2017.03.014 9. Forrest S (1996) Genetic algorithms. ACM Comput Surv 28(1):77–80. https://doi.org/10.1145/ 234313.234350 10. Koza JR (2010) Human-competitive results produced by genetic programming. Genet Program Evolvable Mach 11(3–4):251–284. https://doi.org/10.1007/s10710-010-9112-3 11. Abualigah L, Elaziz MA, Sumari P, Khasawneh AM, Alshinwan M, Mirjalili S, … Gandomi AH (2022) Black hole algorithm: a comprehensive survey. Appl Intell.https://doi.org/10.1007/ s10489-021-02980-5 12. Agharghor A, Riffi ME, Chebihi F (2019) Improved hunting search algorithm for the quadratic assignment problem. Indonesian J Electr Eng Comput Sci 14(1):143. https://doi.org/10.11591/ ijeecs.v14.i1.pp143-154 13. Cui Y, Meng X, Qiao J (2022) A multi-objective particle swarm optimization algorithm based on two-archive mechanism. Appl Soft Comput 119:108532. https://doi.org/10.1016/j.asoc.2022. 108532 14. Solving the Quadratic Assignment Problem using the Swallow Swarm Optimization Problem (2019) Int J Eng Adv Technol 8(6):3116–3120. https://doi.org/10.35940/ijeat.f9132.088619 15. Mzili I, Riffi ME, Benzekri F (2017) Penguins search optimization algorithm to solve quadratic assignment problem. Proceedings of the 2nd international conference on big data, cloud and applications. https://doi.org/10.1145/3090354.3090375 16. Dhiman G, Garg M, Nagar A, Kumar V, Dehghani M (2020) A novel algorithm for global optimization: rat swarm optimizer. J Ambient Intell Humaniz Comput 12(8):8457–8482. https:// doi.org/10.1007/s12652-020-02580-0 17. Mzili T, Riffi ME, Mzili I, Dhiman G (2022) A novel discrete Rat swarm optimization (DRSO) algorithm for solving the traveling salesman problem. Decision making: applications in management and engineering, 5(2), 287–299. https://doi.org/10.31181/dmame0318 062022m 18. Koopmans TC, Beckmann M (1957) Assignment problems and the location of economic activities. Econometrica 25(1):53–76. https://doi.org/10.2307/1907742 19. Bouzidi A, Riffi ME (2014) Discrete cat swarm optimization algorithm applied to combinatorial optimization problems. 2014 5th workshop on codes, cryptography and communication systems (WCCCS), 2014, pp 30–34,https://doi.org/10.1109/WCCCS.2014.7107914
Chapter 12
Multiple Distributed Generations Optimization in Distribution Network Using a Novel Dingo Optimizer Abdulrasaq Jimoh, Samson Oladayo Ayanlade, Funso Kehinde Ariyo, Abdulsamad Bolakale Jimoh, Emmanuel Idowu Ogunwole, and Fatina Mosunmola Aremu
1 Introduction Power generation’s only objective is to satisfy the load demand of the consumers. In light of the fact that distribution networks serve as a conduit via which consumers are linked to the power system, these networks are crucial to the functioning of the power system [1, 2]. The distribution networks are confronted with a wide range of challenges, including inadequate voltage profiles, significant power losses, and insufficient reactive power supplies, among others [3]. Poor voltage profiles and power losses are among the most significant issues that distribution networks are contending with. The allowable voltage magnitude for the distribution networks for the voltage profile is ±5% of the nominal voltage magnitude [4]. In other words, if the nominal voltage magnitude is 1.0 p.u., then a voltage magnitude between 0.95 and 1.05 p.u. would be the permissible nominal voltage. The efficiency of the distribution networks is greatly impacted by a voltage magnitude that is deemed inappropriate
A. Jimoh · S. O. Ayanlade Lead City University, Ibadan, Nigeria e-mail: [email protected] A. Jimoh (B) · F. K. Ariyo Obafemi Awolowo University, Ile-Ife, Nigeria e-mail: [email protected] A. B. Jimoh University of Ilorin, Ilorin, Nigeria E. I. Ogunwole Cape Peninsula University of Technology, Cape Town, South Africa F. M. Aremu Kwara State University, Malete, Nigeria © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_12
outside of this range [5]. Poor distribution network efficiencies are also caused by significant real and reactive power losses. A variety of ideas have been put forward by academics throughout the world to address these issues. Some of these solutions include the insertion of shunt capacitors [6], adjusting the transformer tap, reconfiguring the network [7], placing DGs, etc. The DG placement approach is the most successful of all of these techniques [8]. Multiple DGs are typically used to further improve the efficiency of the distribution networks [9]. The placement of several DGs has the potential to dramatically minimize real and reactive power losses and enhance the voltage profile of the networks. DG optimization, however, is absolutely essential for integrating several DGs into the distribution networks. The goal of DG optimization is to enhance the performance of distribution systems by placing and sizing DGs as efficiently as possible. Several optimization solution algorithms exist in the literature for solving optimization problems, especially in power systems. There are quite a few that were used to optimize a single DG as well as multiple DGs. Most of these algorithms are metaheuristic and some of them, as documented in literature, are the firefly algorithm (FA) [10], particle swarm optimization (PSO) [11], grey wolf optimizer (GWO) [12], genetic algorithm (GA) [13], whale optimization algorithm (WOA) [14], oppositional sine cosine muted differential evolution algorithm (O-SCMDEA) [15], krill herd algorithm (KHA) [16], etc. Some of these algorithms proved to be effective as far as the optimization of DGs is concerned, but they have various shortcomings. Some of these drawbacks include slow convergence to appropriate solutions and getting locked in local optima when dealing with circumstances that are heavily constrained. The deployment of a novel dingo optimizer, which is not susceptible to some of these drawbacks, to optimize multiple DGs to improve the performance of the distribution network was proposed in this study. To address the multiple DGs optimization problem, a script was created in the MATLAB/Simulink software using the proposed optimization approach, and the standard IEEE 33-bus network was used as the test case.
2 Problem Formulation

2.1 Objective Function

Minimizing the overall real power loss is the prime objective of the multiple DG deployments in the distribution network. Therefore, the objective function to be minimized is the total real power loss, expressed as the sum of the losses in all the network line segments, as given by Eq. (1):

OF_min = Σ_{i=1}^{nb} |I_i|² R_i    (1)

where nb = total number of branches, R_i = resistance of the ith branch, and |I_i| = current magnitude of the ith branch.
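A minimal sketch of how this loss objective can be evaluated from a branch list is shown below; the branch data structure (a resistance and a current magnitude per branch) and the numeric values are assumptions used only for illustration.

```python
def total_real_power_loss(branches):
    """Eq. (1): sum of |I_i|^2 * R_i over all nb branches."""
    return sum((abs(i_mag) ** 2) * r for r, i_mag in branches)

# Branches given as (resistance, current magnitude) pairs; values are illustrative.
branches = [(0.0922, 3.9), (0.4930, 3.5), (0.3660, 2.9)]
print(total_real_power_loss(branches))
```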
2.2 Constraints

The objective function is subject to the following equality and inequality constraints.

Power Flow Equations. For any power system, the power flow equations are given by Eqs. (2) and (3):

P_Gi = P_Di + Σ_{j=1}^{nb} |V_i||V_j| (G_ij cos θ_ij + B_ij sin θ_ij)    (2)

Q_Gi = Q_Di + Σ_{j=1}^{nb} |V_i||V_j| (G_ij sin θ_ij − B_ij cos θ_ij)    (3)

where V_i and V_j = voltages of buses i and j, respectively; P_Gi and P_Di = real power generated and demanded at bus i; Q_Gi and Q_Di = reactive power generated and demanded at bus i; and θ_ij = voltage angle difference between buses i and j.

Real Power Generation Constraint of DG. The size of each installed DG unit is constrained within the limits below:

P_DG(min) ≤ P_DG ≤ P_DG(max)    (4)

where P_DG(min) = 100 kW and P_DG(max) = 75% of the total network real power demand.

Bus Voltage Limitation. The standard voltage variation limits are given in Eq. (5):

V_min ≤ V_i ≤ V_max    (5)

where V_min = minimum voltage (V_min = 0.95), V_max = maximum voltage (V_max = 1.05), and V_i = bus voltage.

DG Penetration Limitation. The installed DGs' total real power output must be less than 75% of the network's total real power consumption:

Σ_{l=1}^{m} P_DG(l) ≤ 0.75 × Σ_{i=1}^{nb} P_d(i)    (6)
3 Dingo Optimizer In 2021, Bairwa introduced the Dingo optimizer, which was based on the social organization and shrewd collective hunting strategies utilized by dingoes to attack their prey. Dingoes are clever, adept communicators, and they live in communities of 12–15 individuals. They reside in a society with an efficient social structure. The alpha, who is the group’s strongest member or leader, is responsible for making choices that are binding on all group members. Beta dingoes serve as the group’s second in command, uphold order, and act as a liaison between the alpha and the other dingoes. All other dingoes assist the alphas and betas in capturing prey so that the pack has food. Encircling, hunting, and assaulting prey are the three different categories and models for hunting tactics.
3.1 Encircling

Dingoes are naturally good at finding their prey. The pack of dingoes surrounds the prey after spotting its location. Equations (7)–(11) are used to mathematically model this behavior [17]:

D_d = | A · P_p(x) − P(i) |    (7)

P(i + 1) = P_p(i) − B · D_d    (8)

A = 2 · a_1    (9)

B = 2b · a_2 − b    (10)

b = 3 − I × (3 / I_max)    (11)

where D_d = the separation distance between the dingo and the prey, P_p = the position vector of the prey, P = the position vector of the dingo, A and B = coefficient vectors, and a_1 and a_2 = random vectors in [0, 1]. At each iteration, b reduces linearly from 3 to 0. Dingoes can be relocated anywhere in the search area surrounding the prey by Eqs. (7) and (8).
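A small sketch of the encircling step (Eqs. 7–11) for a continuous search space is shown below; the vector shapes and the random-number handling are illustrative assumptions.

```python
import numpy as np

def encircle(prey_pos, dingo_pos, iteration, max_iter, rng):
    """Relocate one dingo around the prey using Eqs. (7)-(11)."""
    b = 3.0 - iteration * (3.0 / max_iter)        # Eq. (11): decays from 3 to 0
    a1 = rng.random(dingo_pos.shape)              # random vector in [0, 1]
    a2 = rng.random(dingo_pos.shape)              # random vector in [0, 1]
    A = 2.0 * a1                                  # Eq. (9)
    B = 2.0 * b * a2 - b                          # Eq. (10)
    D = np.abs(A * prey_pos - dingo_pos)          # Eq. (7)
    return prey_pos - B * D                       # Eq. (8)

rng = np.random.default_rng(0)
print(encircle(np.array([1.0, 2.0]), np.array([0.5, -1.0]),
               iteration=10, max_iter=200, rng=rng))
```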
3.2 Hunting

Dingoes are mathematically modeled with the assumption that the pack is aware of the likely location of the prey. In addition to leading the hunt, the alpha dingo occasionally engages in it with the beta dingo. According to Eqs. (12)–(17), the dingoes revise their positions [17]:

D_α = | A_1 · P_α − P |    (12)

D_β = | A_2 · P_β − P |    (13)

D_o = | A_3 · P_o − P |    (14)

P_1 = P_α − B_1 · D_α    (15)

P_2 = P_β − B_2 · D_β    (16)

P_3 = P_o − B_3 · D_o    (17)

The intensity of each dingo is estimated using Eqs. (18)–(20):

I_α = log( 1 / (F_α − (1E−100)) + 1 )    (18)

I_β = log( 1 / (F_β − (1E−100)) + 1 )    (19)

I_o = log( 1 / (F_o − (1E−100)) + 1 )    (20)
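The following sketch mirrors the hunting update (Eqs. 12–20) for a single dingo driven by the alpha, beta, and other leading dingoes. Combining P_1, P_2, and P_3 by a simple average is an assumption borrowed from similar leader-based optimizers, since the aggregation step is not spelled out here, and the fitness values F_α, F_β, and F_o in the example call are placeholders.

```python
import numpy as np

def hunt(p, p_alpha, p_beta, p_o, f_alpha, f_beta, f_o, b, rng):
    """One hunting update toward the alpha, beta, and other best dingoes."""
    eps = 1e-100
    candidates = []
    for leader in (p_alpha, p_beta, p_o):
        A = 2.0 * rng.random(p.shape)                 # coefficient vector, Eq. (9)
        B = 2.0 * b * rng.random(p.shape) - b         # coefficient vector, Eq. (10)
        D = np.abs(A * leader - p)                    # Eqs. (12)-(14)
        candidates.append(leader - B * D)             # Eqs. (15)-(17)
    # Leader intensities, Eqs. (18)-(20), e.g. for ranking or logging.
    intensities = [np.log(1.0 / (f - eps) + 1.0) for f in (f_alpha, f_beta, f_o)]
    return np.mean(candidates, axis=0), intensities   # assumed aggregation of P1-P3

rng = np.random.default_rng(1)
new_pos, inten = hunt(np.zeros(2), np.array([1.0, 1.0]), np.array([0.8, 1.2]),
                      np.array([0.5, 0.9]), 0.12, 0.15, 0.2, b=1.5, rng=rng)
print(new_pos, inten)
```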
3.3 Attacking Prey

A dingo assault on the prey has occurred if there is no update. In order to mimic this tactic, the value of b is progressively lowered.
3.4 Searching

To search for and attack their prey, dingoes always move forward [17]. The vector B indicates whether a dingo is approaching its prey or moving away from it, depending on whether its value is less than or equal to 1.
3.5 Application of Dingo Optimizer for Multiple DG Placements

The DOX technique is implemented to solve the multiple DG placement problem using the following algorithm:

Step 1: Input the line and load data of the network, including the DOX parameters.

Step 2: Initialization. A dingo is a possible solution for the DOX application that consists of DG locations and DG sizes. An illustration of n dingoes in a population is given in Eq. (21):

D = [D_1; D_2; . . . ; D_n] =
[bus.DG_1^1, . . . , bus.DG_m^1   size.DG_1^1, . . . , size.DG_m^1;
 bus.DG_1^2, . . . , bus.DG_m^2   size.DG_1^2, . . . , size.DG_m^2;
 . . .
 bus.DG_1^n, . . . , bus.DG_m^n   size.DG_1^n, . . . , size.DG_m^n]    (21)

Each dingo in the population can be represented as:

D_i = [bus.DG_1^i, . . . , bus.DG_m^i   size.DG_1^i, . . . , size.DG_m^i]    (22)

Equation (22) demonstrates that each dingo's solution vector has two components. The first part represents the buses chosen for DG integration, while the second part represents the sizes of the DG units. In Eq. (22), bus.DG_1, bus.DG_2, . . . , bus.DG_m are the buses chosen for the placement of DGs, and size.DG_1, size.DG_2, . . . , size.DG_m are the sizes (in kW) of the DG units to be installed at those buses, respectively. Upon initialization, each dingo in the DOX may be seen as a randomly generated solution. As a result, each dingo D_i in the population is randomly initialized as:

bus.DG_{r1}^i = round( bus_{lower,r1}^i + rand × (bus_{upper,r1}^i − bus_{lower,r1}^i) )    (23)

size.DG_{r2}^i = round( size_{lower,r2}^i + rand × (size_{upper,r2}^i − size_{lower,r2}^i) )    (24)
where r1 = 1, 2, . . . , m and r2 = 1, 2, . . . , m. Except for the slack bus, DGs can be mounted on any bus in the distribution network. Due to the inequality restriction of Eq. (4), the size of each DG ranges from 150 kW to the maximum DG power, with the lower and upper limits of placement being from bus 2 to the final bus of the network.

Step 3: Evaluation of each dingo's fitness. Utilizing the Newton–Raphson approach, the load flow for each dingo is calculated in order to obtain the fitness function, which in this work represents the power loss. Based on the fitness values, the following are identified:
• the dingo with the best search (D_α),
• the dingo with the second best search (D_β),
• the remaining search dingoes (D_o).

Step 4: Updating the dingoes' status. For i = 1 : D_n, update the status of the current search agent using Eqs. (12)–(17).

Step 5: Estimation of the fitness of the updated dingoes. A load flow is performed for each dingo to obtain the power loss (the objective and fitness function).

Step 6: Determination of the fitness values of the dingoes. Keep track of the values of S_α, S_β, and S_o, and keep track of the values of b, A, and B.

Step 7: Termination criterion. The status of the dingoes is updated continually until the maximum number of iterations is reached.

The proposed DOX for placing multiple DGs is depicted in a flowchart in Fig. 1.
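A minimal sketch of the dingo (solution) encoding and the random initialization of Eqs. (21)–(24) is given below; the bounds and the 33-bus defaults are illustrative assumptions.

```python
import numpy as np

def init_dingoes(n_dingoes, n_dg, n_bus, size_min_kw, size_max_kw, seed=0):
    """Each dingo = [DG buses | DG sizes], randomly initialized per Eqs. (23)-(24)."""
    rng = np.random.default_rng(seed)
    # Candidate buses run from 2 to n_bus (the slack bus 1 is excluded).
    buses = np.round(rng.uniform(2, n_bus, size=(n_dingoes, n_dg)))
    sizes = np.round(rng.uniform(size_min_kw, size_max_kw, size=(n_dingoes, n_dg)))
    return np.hstack([buses, sizes])          # population matrix of Eq. (21)

# Example: 1000 dingoes, three DGs on a 33-bus feeder; the size bounds are assumed.
population = init_dingoes(n_dingoes=1000, n_dg=3, n_bus=33,
                          size_min_kw=150, size_max_kw=2000)
print(population.shape, population[0])
```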
4 Results and Discussion The IEEE 33-bus network was used to test the efficacy of the proposed DOX approach in addressing multiple DG placements, and MATLAB software was used to implement it. The network’s line and load parameters were sourced from [18]. The IEEE 33-bus network, as shown in Fig. 2, consists of 33 buses and 32 branches. In this research, a maximum of three DGs were placed on the network. The population (Dn = 1000) and the maximum number of iterations (iter max = 200) are the DOX parameters used in this study. According to the convergence curve shown in Fig. 3, the simulation converges after 20 iterations. Table 1 presents the simulation results. Buses 12, 25, and 30 were the optimum placements for the DG units after optimization, and Table 1 shows that the optimum DG unit sizes for these buses were 916, 818, and 1046 kW, respectively. The next subsections examine how the allocation of the multiple DG units affects the network voltage profile as well as real and reactive power losses.
Fig. 1 Implementation of the DOX multiple DG placement
Fig. 2 IEEE 33-bus network

Fig. 3 Convergence curve

Table 1 Optimal DG allocations
DG size (kW)    Location
916             12
818             25
1046            30
Fig. 4 Voltage profile of the IEEE 33-bus network before and after multiple DG placement
4.1 Voltage Profile

Figure 4 shows the voltage profile of the test network both before and after the placement of multiple DGs. The bulk of the bus voltage magnitudes fell outside the allowed voltage limits; hence, the voltage profile was insufficient prior to the placement of multiple DGs. Figure 4 shows that the voltage magnitudes at buses 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, and 18 were all below the lower voltage limit of 0.95 p.u., with bus 18 having the least voltage magnitude of 0.9131 p.u. After the placement of multiple DG units, the voltage magnitudes of those buses that were previously outside the permitted limits were significantly improved, with bus 33 now experiencing the least voltage magnitude, as shown in Fig. 4. This resulted in a significant improvement in the overall network voltage profile. As shown in Fig. 4, the network's performance was improved by using DOX to optimize a number of DG units on the test network.
4.2 Real Power Loss

Before the placement of multiple DG units, the total real power loss was 202.71 kW. The DOX optimization method was used to position and size the multiple DG units in the best possible way. A remarkable 64.10% reduction in the total real power loss, down to 72.77 kW, was achieved. For easier visual examination and comprehension, Fig. 5 shows the real power losses in each branch of the test network before and after the placement of multiple DGs. As shown in Fig. 5, the real power losses in all the branches of the network have been tremendously decreased. The network's branch
Fig. 5 Branch real power losses
2–3 showed a maximum real power loss of 65.56 kW, which was remarkably reduced to 14.86 kW.
4.3 Reactive Power Loss

Before the installation of the multiple DG units, there was a total reactive power loss of 145.87 kVar. Following the deployment of the multiple DG units optimized by the DOX process, the total reactive power loss was greatly reduced to 54.63 kVar, which translates to a considerable 62.55% reduction. For easier visual observation and comprehension, Fig. 6 presents the reactive power losses observed in all of the network's branches before and after the placement of multiple DG units. The branch 5–6 showed a maximum reactive power loss of 40.67 kVar, which was reduced to 12.19 kVar, as shown in Fig. 6. Overall, when the multiple DG units were optimally distributed over the network using DOX, there was a significant general decrease in reactive power losses in all of the network's branches.
4.4 Comparison with Other Optimization Technique A comparison of the proposed DOX with existing approaches from the literature was conducted and is shown in Table 2 to demonstrate the DOX’s capacity to allocate multiple DGs in distribution networks in the most efficient way possible.
Fig. 6 Branch reactive power losses
Table 2 IEEE 33-bus comparison results
Method           Location/size (kW)                        Total (kW)   Power loss (kW)
Base Case        –                                         –            202.7
PSO [11]         14/691, 24/986.1, 29/1277.3               2954.4       74.09
WOA [14]         14/1021.6, 24/1200, 31/1200               3421.6       79.72
KHA [16]         13/810, 25/836, 30/841                    2487         75.41
Q-SCMDEA [15]    30/1048.3, 13/805.2, 24/1093.6            2947.1       72.78
Proposed DOX     12/916, 25/818, 30/1046                   2780         72.77
Table 2 shows that the proposed approach outperformed other methods for optimizing multiple DG units and enhancing the performance of distribution networks.
5 Conclusion

In this study, the use of DOX to optimize multiple DG units to improve the performance of distribution networks has been demonstrated. The proposed optimization technique turned out to be successful, since it distributed multiple DG units on the tested distribution network in the best possible way. The proposed approach also outperformed some of the other methods in the literature when its results were compared with theirs. In order to increase the performance of the distribution network via multiple DG penetration, DOX is, therefore, recommended.
References 1. Ayanlade SO, Komolafe OA, Adejumobi IO, Jimoh A (2020) Distribution system power loss minimization based on network structural characteristics of network. In: 1st faculty of engineering and environmental sciences conference (2019) proceedings, pp 849–861, Uniosun, Osogbo, Nigeria 2. Jimoh A, Ayanlade SO, Ariyo FK, Jimoh AB (2022) Variations in phase conductor size and spacing on power losses on the Nigerian distribution network. Bull Electr Eng Inf 11(3):1222– 1233 (2022). https://doi.org/10.11591/eei.v11i3.3753 3. Ayanlade SO, Ogunwole EI, Salimon SA, Ezekiel SO (2022) Effect of optimal placement of shunt facts devices on transmission network using firefly algorithm for voltage profile improvement and loss minimization, Cham, pp 385–396. https://doi.org/10.1007/978-3-030-987411_32 4. Ayanlade SO, Komolafe OA (2019) Distribution system voltage profile improvement based on network structural characteristics of network. In: 2019 OAU faculty of technology conference (OAU TekCONF 2019) Proceedings, pp 75–80, OAU, Ile-Ife, Nigeria 5. Okelola MO, Salimon SA, Adegbola OA, Ogunwole EI, Ayanlade SO, Aderemi BA (2021) Optimal siting and sizing of D-STATCOM in distribution system using new voltage stability index and bat algorithm. In: 2021 international congress of advanced technology and engineering (ICOTEN) 2021 Proceedings, pp 1–5. https://doi.org/10.1109/ICOTEN52080.2021. 9493461 6. Okelola MO, Adebiyi OW, Salimon SA, Ayanlade SO, Amoo AL (2022) Optimal sizing and placement of shunt capacitors on the distribution system using whale optimization algorithm. Nigerian J Technol Dev 19(1):39–47. https://doi.org/10.4314/njtd.v19i1.5 7. Al Samman M, Mokhlis H, Mansor NN, Mohamad H, Suyono H, Sapari NM (2020) Fast optimal network reconfiguration with guided initialization based on a simplified network approach. IEEE Access 8:11948–11963. https://doi.org/10.1109/ACCESS.2020.2964848 8. Adepoju GA, Aderinko HA, Salimon SA, Ogunade FO, Ayanlade SO, Adepoju TM (2021) Optimal placement and sizing of distributed generation based on cost-savings using a twostage method of sensitivity factor and firefly algorithm. In: 1st ICEECE & AMF proceedings, pp 52–58 9. Alzaidi KMS, Bayat O, Uçan ON (2019) Multiple DGs for reducing total power losses in radial distribution systems using hybrid WOA-SSA algorithm. Int J Photoenergy 2019:1–20. https:// doi.org/10.1155/2019/2426538 10. Anbuchandran S, Rengaraj R, Bhuvanesh A, Karuppasamypandiyan M (2022) A multiobjective optimum distributed generation placement using firefly algorithm. J Electr Eng Technol 17(2):945–953. https://doi.org/10.1007/s42835-021-00946-8 11. Prakash D, Lakshminarayana C (2016) Multiple DG placements in distribution system for power loss reduction using PSO algorithm. Procedia Technol 25:785–792. https://doi.org/10. 1016/j.protcy.2016.08.173 12. Sultana U, Khairuddin AB, Mokhtar A, Zareen N, Sultana B (2016) Grey wolf optimizer based placement and sizing of multiple distributed generation in the distribution system. Energy 111:525–536. https://doi.org/10.1016/j.energy.2016.05.128 13. Bhattacharya M, Das D (2016) Multi-objective placement and sizing of DGs in distribution network using genetic algorithm. In: 2016 national power systems conference (NPSC) proceedings, pp 1–6. https://doi.org/10.1109/NPSC.2016.7858906 14. Kamel S, Selim A, Jurado F, Yu J, Xie K, Yu C (2019) Multi-objective whale optimization algorithm for optimal integration of multiple DGs into distribution systems. 
In: 2019 IEEE innovative smart grid technologies-Asia (ISGT Asia) proceedings, pp 1312–1317. https://doi. org/10.1109/ISGT-Asia.2019.8881761 15. Dash SK, Mishra S, Abdelaziz AY, Alghaythi ML, Allehyani A (2022) Optimal allocation of distributed generators in active distribution networks using a new oppositional hybrid sine cosine muted differential evolution algorithm. Energies 15(2267):1–35. https://doi.org/ 10.3390/en15062267
16. Sultana S, Roy PK (2016) Krill herd algorithm for optimal location of distributed generator in radial distribution system. Appl Soft Comput 40:391–404. https://doi.org/10.1016/j.asoc. 2015.11.036 17. Bairwa AK, Joshi S, Singh D (2021) Dingo Optimizer: a Nature-inspired metaheuristic approach for engineering problems. Math Probl Eng 2021:1–12. https://doi.org/10.1155/2021/ 2571863 18. Iyer H, Ray S, Ramakumar R (2005) Voltage profile improvement with distributed generation. In: IEEE power engineering society general meeting proceedings (2005), pp 2977–2984. https:// doi.org/10.1109/PES.2005.1489406
Chapter 13
Prediction of Daily Precipitation in Bangladesh Using Time Series Analysis with Stacked Bidirectional Long Short-Term Memory-Based Recurrent Neural Network Shaswato Sarker and Abdul Matin
1 Introduction Precipitation is one of the essential meteorological factors of agricultural progress. Accurate knowledge of rainfall and other correlating meteorological factors can make or break the probability of a successful harvest, especially in developing countries like Bangladesh. These factors include humidity, temperature, wind speed, and more. The high population density growth and industrial development in this South Asian country demand efficient use of all water resources, including precipitation. Bangladesh is located in the northeastern region of South Asia. Geographically, it lies between 20◦ 34’ and 26◦ 38’ north latitude and 88◦ 04’ and 92◦ 41’ east longitude. Indian mountainous lands surround the country north and northeast, while the Bay of Bengal lies to the south. Its geographical position makes the weather pattern inside the country vary greatly. Hence, precipitation becomes unpredictable. Because rain-based agriculture, which utilizes little or no irrigated water, accounts for about 5.59% of the total cropped area from food staples such as wheat, maize, pulse, and oilseed in 2007–08 [1], accurate knowledge of precipitation becomes essential for agricultural progress. Traditionally, statistical models have been widely used for precipitation forecasting worldwide. However, recent progress in machine learning and related technology has prompted researchers to combine both approaches, as high degrees of accuracy have been observed when forecasting precipitation in general and complex circumstances. Rahman et al. have introduced a rainfall prediction model specifically for Dhaka city based on Artificial Neural Networks or ANNs [2]. They used Radial Basis Function (RBF) and Multi-layer Perceptron (MLP) to fit the models. Zhang et al. have used the Support Vector Regression and Multi-Layer Perceptron model to predict rainfall for annual and non-monsoon rainfall in Odisha [3]. The study showed promising results S. Sarker (B) · A. Matin Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_13
for precipitation prediction when they used the MLP model to predict rainfall in monsoon months and the SVR model to predict rainfall in non-monsoon months. Mahsin et al. have used Seasonal Autoregressive Integrated Moving Average (SARIMA) to predict monthly precipitation two years into the future with reliable accuracy [4]. SARIMA is another time series modeling method used frequently to calculate or forecast precipitation worldwide. In Bangladesh, too, extensive work has been done using similar auto-regressive models for weather prediction. Graham et al. have used it to forecast rainfall in Allahabad [5], and Bari et al. have done similar work in Sylhet [6]. Another time series analysis model, LSTM or Long Short-Term Memory, was applied by Zhang et al. using multiple meteorological factors as input parameters for rainfall prediction in East China [7]. Rainfall forecasting is also essential for other catchment management procedures, such as flood warning systems or meteorological drought prediction. Luk et al. introduced two practicable approaches to forecast precipitation [8]. They declared that the first method is a research-based process modeling approach that may not be viable because of three main factors. They are as follows: 1. Precipitation is an output of many complex atmospheric variables which vary in the spatio-temporal dimensions; 2. Even if the precipitation processes can be reported concisely and thoroughly, the magnitude of calculations involved may be inhibitory; 3. The data that is available to assist in defining control variables for the process models, such as precipitation intensity, wind speed, vaporization, and other factors, are limited in both the spatial and temporal dimensions. According to their work, the second method to forecast precipitation is based on recognizing patterns. This methodology aims to recognize precipitation patterns based on their historical features. The reasoning behind this method is to find pertinent spatial and temporal features in historical precipitation patterns and use these to predict the evolution of future precipitation. This pattern recognition methodology refers only to precipitation patterns, with no consideration of the physical laws of the precipitation processes. This approach is considered appropriate for this study because the main concern is to forecast short-duration precipitation based on daily precipitation records at specific locations within the jurisdiction of one of Bangladesh’s 35 meteorological stations, similar to the criteria suggested in their paper. They also mentioned that a thorough understanding of the physical laws is not required for this approach, and the requirements for the data are not as extensive as for a process model. This paper uses a time series analysis model called Stacked Bidirectional LSTM (BDLSTM) with only one meteorological factor: the daily rainfall of the past 50 years. It uses the second process Luk et al. described to analyze the time series dataset to figure out only rainfall patterns [8]. A total of 35 meteorological stations were considered. Mean squared error was used to score the goodness of fit criterion.
2 Methods and Materials 2.1 Dataset Description and Research Area For this study, daily rainfall data was collected for the past 50 years (1970–2019) from the Division of Agricultural Statistics, Bangladesh Rice Research Institute (BRRI), Bangladesh, for stations established in 1970 or before. For stations established after 1970, all daily precipitation data recorded from their establishment till December 2019 was collected. For this precipitation prediction study, 580321 daily data entries spread over 35 meteorological stations were considered. Furthermore, the dataset was automatically separated into training, test, and validation sets in the training model. The training set was used to train or fit the network, while the validation set tested the network’s performance. Mean Absolute Error, Mean Squared Error, and Root Mean Square Error were used for overall performance metrics. While the network was being trained, the Mean Squared Error and accuracy and validation accuracy were considered for performance metrics.
2.2 Time Series Analysis Time series analysis is a method of data analysis that uses time series data, which contains a series of data points indexed at even intervals of time and ordered chronologically. It is a sequence of discrete time data, meaning each successive data point is taken at a discrete time rate, such as hourly, daily, and even yearly. This data is easily visualized with a time series graph. The graph plots observed values on the y-axis against time increments on the x-axis. Time series analysis takes this data to extract meaningful statistical information. It usually precedes time series prediction or forecasting, using a model to forecast future values based on previously observed values. The same concept was applied in this study, specifically for rainfall or precipitation over Bangladesh. Other uses of time series analyses would be predicting stock prices, birth rates, disease and mortality rates, wildlife populations, cholesterol measurements, heart rate monitoring, and more. It can be applied to real-valued data, discrete numerical data, continuous data, or discrete symbolic data, such as in the case of natural language processing. Time series is also used for data imputation, anomaly, and pattern detection. Ismail et al. introduced three formal definitions of Time Series analysis that are widely accepted. 1. Definition 1: X = [x1 , x2 , ..., x T ], a univariate time series, is an ordered set of real values. The length of X here equals the number of real values, which is T . 2. Definition 2: X = [X 1 , X 2 , ..., X M ], an M-dimensional multivariate time series, consists of M different univariate time series with X i ∈ RT . 3. Definition 3: A dataset D = (X 1 , Y1 ), (X 2 , Y2 ), ..., (X N , Y N ) is a collection of pairs (X i , Yi ) where X i could either be a univariate or multivariate time series with Yi as its corresponding one-hot label vector.
They mentioned that, for a dataset with K classes, the one-hot label vector Yi is a vector of length K where each element j ∈ [1, K ] is equal to 1 if the class of X i is j and 0 otherwise. As a result, they argue that the task of Time Series Analysis consists of training a classifier on a dataset D to map from the space of possible inputs to a probability distribution over the class variable values or labels [9]. Time series analysis is used in this study because all four of its main characteristics can be found in the selected dataset. The dataset has a trend over time and exhibits strong seasonality, and the data has a serial correlation between subsequent observations. Finally, because of technical or human error, the dataset shows irregular components, and random variations in data are not explainable by any other factors, often referred to as white noise or data outliers. There are many types of time series models. Among them, this study uses the Stacked BDLSTM model.
2.3 Long Short-Term Memory Long Short-Term Memory or LSTM is a unique Recurrent Neural Network (RNN) architecture. It was designed explicitly for modeling temporal sequences [10]. First introduced by Hochreiter and Schmidhuber, it can learn long-term dependencies, unlike the standard RNN [11]. In their paper, Selvaraj and Marudappa stated that a standard RNN might encounter the vanishing gradient problem. In the structure of recurrent neural networks, the challenge in learning the long-term dependencies is called the vanishing gradient problem. Here, with the increase in the length of input sequences, the difficulty of capturing the influence of the earliest stages increases. So the gradients to the earliest several input points get eliminated or explode and become equal to zero. Like most RNNs, it has hidden states and feedback loops in the recurrent layer. It differs because it was designed to remember long-term information. After all, other than the hidden states, it includes cell states or memory cells that can maintain information in memory for long periods. LSTMs also have a set of gates to control when data enters the memory, the output timing, and when the data is erased from memory. Because of its recurrent nature, the activation function of the LSTM is considered the identity function with a derivative of 1.0. Therefore, the gradient being backpropagated neither vanishes nor explodes. It remains constant [10]. The shift from standard RNN to LSTM adds more controlling knobs to the architecture, bringing far better flexibility in the control over outputs. LSTMs give the user better control in return for complexity and operating cost. The LSTM cell contains 1. Forget Gate f t : A neural network with a sigmoid. Here, f t = sigmoid(W f [h t−1 , X (t)] + b f )
(1)
2. Input Gate i t : A neural network with sigmoid. Here, i t = sigmoid(Wi [X (t), h t−1 ] + bi )
(2)
13 Prediction of Daily Precipitation in Bangladesh …
143
3. Output Gate ot : A neural network with sigmoid. Here, ot = sigmoid((Wo [h t−1 , X (t)] + bo )
(3)
4. Candidate Layer ct, : A neural network with Hyperbolic Tangent (Tanh). Here, ct, = tanh(Wc [h t−1 , X (t)] + bc )
(4)
5. Hidden State h t : A vector. Here, h t = ot ∗ tanh(ct )
(5)
6. Memory State ct : A vector. Here, ct = f t ∗ ct − 1 + It ∗ ct,
(6)
where t is time step, X (t) is the input vector, h t−1 is the previous state hidden vector, W is the weight, and b is the bias for each gate. The basic structural diagram of an LSTM network is shown in Fig. 1. The reasons we decided to use the LSTM model over others are as follows: 1. In this study, the oldest stations have more than 50 years of daily precipitation records. For example, Dhaka city has 18229 records, including outliers. Using a traditional RNN could potentially cause the influence of the first few records on the later records to vanish to zero. 2. Moreover, each station has an overwhelming number of 0 values in the dataset. It is 40–50 percent for some stations but approximately 70 percent for others. We decided against using other machine learning models, such as Support Vector Classification (SVC), to train the model. 3. In a study to compare the accuracy between ARIMA and LSTM models when handling time series data, Siami-Namini et al. concluded that the results indicated LSTMs were superior to ARIMA. More specifically, the LSTM-based algorithms improved the prediction of time series data by 85 percent on average compared to the ARIMA model. Furthermore, the paper reported no improvements when the number of epochs was changed [12].
2.4 Stacked Bidirectional LSTM Regarding sequence classification problems, bidirectional LSTMs can improve performance compared to the standard LSTM. Bidirectional LSTMs are an extension
of the ordinary LSTM. A standard model has one LSTM on the input sequence, and a Bidirectional LSTM trains two separate LSTMs. The first LSTM is trained on the input sequence just as it is, and the other one is trained on a reversed copy of the input sequence. It works best in problems where all time steps of the input sequence are available. On the other hand, stacked or deep BDLSTMs are networks with multiple BDLSTM hidden layers. The output of one layer is fed as the input of another layer. This stacked architecture, which can enhance the power of neural networks, is adopted in this study. A diagram of the stacked BDLSTM network is given in Fig. 2. This study stacks three bidirectional LSTM layers on top of each other, with dropout layers in between to reduce overfitting. Finally, a dense layer was used for the output of the model. As mentioned above, this study adopts the second method of rainfall prediction as introduced by Luk et al. [8] and uses past rainfall records, which have characteristics of temporal dependencies, as the only meteorological parameter for pattern recognition. Past studies have shown that deep BDLSTM architectures with several hidden layers can build up a progressively higher level of representations of sequence data and, thus, work more effectively [13].

Fig. 1 Schematic diagram of an LSTM
Fig. 2 Architecture of the stacked bidirectional long short-term memory model used in this study
2.5 Measure of Fit and Scoring

This study uses the mean squared error (MSE) to fit the trained model. The MSE measures how close a fitted line is to the actual data points in the training set. It is the sum of the squares of the differences between the predicted and actual target variables, divided by the number of data points. MSE is a risk function corresponding to the expected value of the squared error loss:

MSE = (1/n) Σ_{i=1}^{n} (y_i − ȳ_i)²    (7)

where n is the number of data points, y_i is the observed value, and ȳ_i is the predicted value. In this study, the use of the mean absolute percentage error (MAPE), a widely used estimator for time series forecasting because of its ease of interpretation [2], was avoided. MAPE is interpreted in terms of percentage error and does not depend on the scales of the actual variables, making it an ideal estimator for monthly and yearly precipitation forecasting. However, in this study of daily precipitation forecasting, an overwhelming number of zero-valued records exist. This may cause a divide-by-zero error, since MAPE relies on dividing the prediction error by the actual values:

MAPE = (1/n) Σ_{i=1}^{n} (y_i − ȳ_i) / y_i    (8)
Here, the difference between the observed (true) value y_i and the predicted value \bar{y}_i is further divided by the true value y_i to get the prediction error. Hence, MAPE was not considered. The estimators used in this study to score the model's overall performance alongside MSE are the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). They are described below. The MAE of the model refers to the mean of the absolute values of the differences between the actual value y_i and the predicted value \bar{y}_i over all instances of the test dataset:

MAE = \frac{1}{n} \sum_{i=1}^{n} \left| \bar{y}_i - y_i \right|    (9)
The Root Mean Square Error (RMSE) of the model refers to the square root of the average squared difference between the true and predicted scores. In other words, it is the root of the MSE:

RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y}_i)^2}    (10)

Another way to evaluate the model's performance used in this study is accuracy. It is simply the ratio of correctly predicted observations to the total observations, which can be helpful in specific interpretations of data.
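These estimators can be computed directly; the short NumPy sketch below (with illustrative values rather than the study's data) also shows why MAPE breaks down on the zero-valued records that dominate daily rainfall series:

```python
import numpy as np

y_true = np.array([0.0, 0.0, 12.4, 3.1, 0.0])   # illustrative daily rainfall (mm)
y_pred = np.array([0.2, 0.0, 10.9, 2.7, 0.1])

mse = np.mean((y_true - y_pred) ** 2)            # Eq. (7)
mae = np.mean(np.abs(y_pred - y_true))           # Eq. (9)
rmse = np.sqrt(mse)                              # Eq. (10)
print(mse, mae, rmse)

# MAPE (Eq. (8)) divides each error by the true value, so zero-valued
# records immediately produce divide-by-zero (inf/nan) terms:
with np.errstate(divide="ignore", invalid="ignore"):
    mape = np.mean((y_true - y_pred) / y_true)
print(mape)                                      # nan here, illustrating the problem
```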
3 Experiment

3.1 Data Pre-processing

The daily precipitation time series dataset we collected had erroneous values, such as records collected on nonexistent dates like June 31 or February 29 in a non-leap year. These data points were first removed before splitting the data into training, test, and validation sets. Then, the dataset was pre-processed, scaling it to a standard range for ease of calculation before fitting the model. However, the input variables in the dataset contained many outliers because of technical or human-caused errors during recording. N/A values and '−99999' values were removed, and 0 values were inserted in place of missing records. For this reason, the Robust scaler was used to scale the data; otherwise, the standardization process would become skewed or biased due to the many outlier values. The Robust scaler uses the median (the 50th percentile) and the interquartile range (between the 25th and 75th percentiles) to deal with the biasing problem. Each value has the median subtracted from it and is then divided by the interquartile range, which is the difference between the 75th and 25th percentiles:

Robust\ Scaler = \frac{\mathrm{Value} - \mathrm{median}}{75p - 25p}    (11)
Here, 75p is the 75th percentile and 25p is the 25th percentile. The resulting values are centred on a median of zero and scaled by the interquartile range. Although the outliers no longer influence the scaling, they remain present, retaining the same relative relationships to the other values.
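A minimal sketch of this scaling step, assuming scikit-learn's RobustScaler as the implementation (the paper does not name a specific library for it); by default it centres on the median and divides by the 25th–75th percentile range, exactly as in Eq. (11):

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Illustrative rainfall column: many zeros plus a couple of large outliers.
rain = np.array([[0.0], [0.0], [3.2], [0.0], [57.0], [1.1], [0.0], [210.0]])

scaler = RobustScaler(quantile_range=(25.0, 75.0))   # (value - median) / IQR
scaled = scaler.fit_transform(rain)

print(scaler.center_, scaler.scale_)   # learned median and interquartile range
print(scaled.ravel())                  # outliers remain, but no longer drive the scale
```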
3.2 Defining the Model

In this study, the Google Brain Team's TensorFlow [14], along with Keras [15], was used to build the model. TensorFlow is a machine learning system and interface for expressing various machine learning algorithms, together with an implementation for executing them. Keras is an API used in TensorFlow that makes the model-building process more straightforward with simple code. The model was built as a Sequential model, which groups a linear stack of layers. Three Bidirectional LSTM layers were stacked, each feeding its output to the next. Experimentation showed that using only one or two layers gives significantly worse results, while adding more layers gave minimal improvements relative to the computational cost. In each Bidirectional LSTM layer, 128 units were used for the computation. Here, units means the dimension of the inner cell of the LSTM, i.e., the number of neurons connected to the layer holding the vector of the hidden state and input of the artificial neural network. Various unit counts of the form 2^n were experimented with to find which had the best efficiency with respect to computation time and performance metrics. Because three BDLSTM layers were stacked here, the return-sequences option of the first two layers was set to true so that the second and third layers received three-dimensional sequence inputs. The final output layer was a standard densely connected neural network layer, called 'dense' in Keras; each unit of a dense layer is connected to every unit of its preceding layer. Here, the number of units in the dense layer was set to 1, and it was configured to activate only with rectified linear units (ReLU) [16]. Between each pair of BDLSTM layers, and between the third BDLSTM layer and the dense layer, dropout layers were placed to keep overfitting to a minimum. The dropout rate was set to 0.1, so randomly selected neurons are ignored during training with a 10 percent probability. A rate of 0.1 gave the best decrease in loss, while anything above 0.2 resulted in the model being unable to fit correctly. The optimizer used was Adam (adaptive moment estimation) [17], a widely used optimizer for iteratively updating the weights in time series prediction.
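A sketch of the architecture described above, written with Keras (layer counts, unit sizes, dropout rate, activation, and optimizer follow the text; the input window length and variable names are illustrative assumptions):

```python
import tensorflow as tf
from tensorflow.keras import layers

window, n_features = 30, 1            # assumed shape of the past-rainfall input window

model = tf.keras.Sequential([
    layers.Input(shape=(window, n_features)),
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),  # 1st BDLSTM layer
    layers.Dropout(0.1),
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),  # 2nd BDLSTM layer
    layers.Dropout(0.1),
    layers.Bidirectional(layers.LSTM(128)),                         # 3rd BDLSTM layer
    layers.Dropout(0.1),
    layers.Dense(1, activation="relu"),   # single-unit dense output with ReLU
])

model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.summary()
```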
3.3 Fitting the Model

The model was set to run for 500 epochs for each of the 35 stations. After 200 to 300 epochs, the model would show only negligible improvements, depending on the station; however, it was run for the remaining epochs for experimentation purposes. As seen in Fig. 3, the improvement in the loss function (MSE) plateaus after approximately 200 epochs. The loss spikes at epochs 330 and 400, as seen in Fig. 3; generally, however, it stayed within the 0.8 to 1.2 MSE range. Figure 4 shows a zoomed-in view of the loss from epoch 300 to epoch 500. In this study, the validation split for the training set was set to 0.1. In other words, 90 percent of the data was used to train the model, while 10 percent was used to validate the performance of the training model.

Fig. 3 Epochs versus loss for model fitting (Dhaka)

Fig. 4 Zoomed-in view of epochs versus loss for model fitting (Dhaka)
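Continuing the model sketch above, the corresponding training call would look roughly as follows (the dummy arrays stand in for one station's scaled, windowed rainfall series; their shapes are assumptions):

```python
import numpy as np

X_train = np.random.rand(1000, 30, 1).astype("float32")   # placeholder windowed inputs
y_train = np.random.rand(1000, 1).astype("float32")       # placeholder next-day targets

history = model.fit(
    X_train, y_train,
    epochs=500,             # as in the study; improvements plateau after ~200-300 epochs
    validation_split=0.1,   # 90% of the training data for fitting, 10% for validation
    verbose=0,
)
print(min(history.history["val_loss"]))
```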
4 Results and Discussion

Table 1 lists the MAE, MSE, and RMSE scores of the trained model for the 35 meteorological stations. As seen in Table 1, the average MSE for precipitation prediction varies significantly from station to station. For stations with relatively complete precipitation datasets, like Mymensingh, the MAE is low at 1.3414 and the MSE at 0.997121639. The network accurately predicted precipitation seasonality, as seen in Fig. 5. While there were several errors and spikes when predicting the daily precipitation, the overall result is among the better ones of the 35 stations. Figure 6a shows a zoomed-in image with the predicted precipitation values and the true values plotted separately, and Fig. 6b shows the values together for better comparison. However, compared to Mymensingh, for Ambagan, where the collected data is incomplete, the MAE and MSE are larger at 3.720231 and 1.409450345, respectively; the MAE in particular is more than double Mymensingh's value. Ambagan station was established in 1999; hence, the number of valid collected data points was also lower, at only 6695, whereas Mymensingh had 9599 valid data points. For this pattern recognition method, improvements are expected in predicting stations with high MAE, like Ambagan, by collecting more daily precipitation data. Figure 7 shows the historical rainfall data of Ambagan, where the seasonality was again correctly predicted. However, in Fig. 8a and b, we can see that the predicted daily rainfall is more random and inaccurate.
Table 2 compares the results reported in this study with similar work. We compare the best results obtained in our work to those found in comparative studies. For studies that report the best results for multiple models, we include only the models with architectures similar to ours if the network used is an LSTM; if other neural networks are used, we include only the variations with the best results for better comparison. It is worth noting that our network outperforms those obtained in similar works, the only exception being the SVR with the combined Rainfall and Temperature Dataset used by Rasel et al. [20], providing a remarkable addition to the literature.
5 Conclusions

Precipitation prediction using time series analysis with a stacked BDLSTM-based RNN has been the focus of this study. The model with the most suitable architecture was identified and developed using TensorFlow and Keras. The data was pre-processed to deal with the outlier values in the daily precipitation dataset. The model was fitted for each of the 35 meteorological stations in Bangladesh, showing varying degrees of success as measured by the MSE. The model showed remarkable results for stations with the most comprehensive data, the fewest outliers, and the most accurate past precipitation records. Regarding the overall performance of the trained model, it is clear that the stacked BDLSTM is a flexible
Table 1 MAE, MSE, and RMSE scores of the trained model for 35 meteorological stations

Station Index  Station      MAE          MSE          RMSE
1              Dhaka        1.442075     1.200328702  1.095595136
2              Tangail      1.609438     0.787142273  0.887210388
3              Mymensingh   1.3414       0.997121639  0.998559782
4              Faridpur     1.307441     2.262624006  1.504202116
5              Madaripur    1.02418291   1.03218302   1.015964084
6              Srimangal    2.767083     2.497229477  1.580262471
7              Sylhet       2.077293     2.418298526  1.555087948
8              Bogra        2.089107     2.044293119  1.429787788
9              Dinajpur     2.617183     1.234666592  1.111155521
10             Ishurdi      2.183188     1.901296173  1.378874966
11             Rajshahi     1.87145      2.085304561  1.444058365
12             Rangpur      2.062848     1.822790388  1.350107547
13             Sydpur       3.422027     0.789756872  0.888682661
14             Chuadanga    2.721903     0.775860899  0.880829665
15             Jessore      2.021922     1.86099592   1.364183243
16             Khulna       2.007989     2.221868465  1.490593326
17             Mongla       3.649806     1.312493882  1.145641254
18             Satkhira     2.00259      0.949392019  0.974367497
19             Barisal      1.954238     1.042344328  1.020952657
20             Bhola        2.106744     1.013991425  1.006971412
21             Khepupara    2.284613     0.969003799  0.984379906
22             Patuakhali   2.429007     1.098318822  1.048007072
23             Chandpur     2.272697     0.816096198  0.903380428
24             Teknaf       2.359197     0.984618369  0.992279381
25             Chittagong   2.406155     1.774922156  1.332262045
26             Comilla      1.88458      0.917790677  0.958013923
27             Cox's Bazar  1.890064     1.976887675  1.406018376
28             Feni         2.39538      1.014153774  1.007052021
29             Hatiya       2.206219     0.906053181  0.951868258
30             Kutubdia     2.982359     1.665570718  1.29056992
31             M.court      2.139879     0.926314456  0.962452314
32             Rangamati    2.198487     0.918213429  0.958234538
33             Sandwip      2.160157     2.059725075  1.435174232
34             Ambagan      3.720231     1.409450345  1.18720274
35             Sitakunda    2.393559     0.922348202  0.960389609
Average                     2.275474114  1.388841405  1.178491156
Fig. 5 Historical precipitation plot of Mymensingh with predicted values
Fig. 6 Predicted and true precipitation values for Mymensingh
Fig. 7 Historical precipitation plot of Ambagan with predicted values
Fig. 8 Predicted and true precipitation values for Ambagan
Table 2 Estimation results compared to the proposed model

Reference          Model                                                                MAE     RMSE
Cui et al. [18]    3-layers LSTM                                                        2.483   –
                   3-layers LSTM + 1-layer DNN                                          2.63    –
                   3-layers BDLSTM                                                      2.476   –
                   SBU-LSTMs: 1-layer BDLSTM + 3 middle BDLSTM layers + 1-layer LSTM    2.549   –
Zhang et al. [19]  K-Means Clustering with LSTM                                         7.45    –
                   K-Means Clustering with Deep Belief Network (DBN)                    7.56    –
                   K-Means Clustering with Frequency Matching Method                    8.05    –
                   K-Means Clustering with Stepwise Linear Regression                   7.43    –
                   K-Means Clustering with Support Vector Machine (SVM)                 7.43    –
Rasel et al. [20]  SVR with only Rainfall Dataset                                       1.7     27.68
                   ANN with only Rainfall Dataset                                       10.87   31.97
                   SVR with combined Rainfall and Temperature Dataset                   0.17    27.6
                   ANN with combined Rainfall and Temperature Dataset                   11.33   27.53
Proposed Model     3-layers BDLSTM with only Rainfall Dataset                           2.275   1.178
and effective tool in pattern recognition. It shows significant results mainly for time series datasets such as precipitation records. Concerning the dataset, daily precipitation records contain more zero values, outliers, and missing values than monthly or yearly precipitation records. Hence, when using similar machine learning algorithms, most researchers use monthly or yearly rainfall data and other correlated meteorological phenomena such as wind speed, temperature, and humidity. However, this study shows that it is possible to train a model to perform daily precipitation prediction with just past daily rainfall data via pattern recognition. This study also clarifies the limitations of using only one parameter for pattern recognition; most notably, the fewer the training samples, the more inaccurate the prediction results. As far as precipitation prediction is concerned, future research should be aimed at overcoming the limitations of this study. The most obvious direction is the use of multiple parameters, such as the ones mentioned above, and the analysis of the significance of such parameters for precipitation prediction.
Acknowledgements This paper and its research would not have been completed successfully without the support of my supervisor, Abdul Matin, Assistant Professor of the Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology. His enthusiasm and knowledge have kept me on track since the beginning of my study. Special thanks to the late
Dr. Tamal Lata Aditya, the members of the Department of Plant Breeding and Genetics, and the Department of Statistics of Bangladesh Rice Research Institute for helping in the collection of the applied dataset and understanding it. The cooperation of these personages has improved this study in uncountable ways and saved me from many errors of my own doing.
References 1. Kashem MA, Hossain MA, Faroque MAA (2016) Rainfed agriculture: on going and future researchable issues in Bangladesh. J Sci Found 7(1):1–8 2. Rahman A, Akter M, Majumder AK (2017) Prediction of rainfall over Dhaka City using artificial neural network. Jahangirnagar Univ J Stat Stud 34:25–36 3. Zhang X, Mohanty SN, Parida AK, Pani SK, Dong B, Cheng X (2020) Annual and non-monsoon rainfall prediction modelling using SVR-MLP: an empirical study from Odisha. IEEE Access 8:30223–30233 4. Mahsin MD (2011) Modeling rainfall in Dhaka division of Bangladesh using time series analysis. J Math Model Appl 1(5):67–73 5. Graham A, Mishra EP (2017) Time series analysis model to forecast rainfall for Allahabad region. J Pharma Phytochem 6(5):1418–1421 6. Bari SH, Rahman MT, Hussain MM, Ray S (2015) Forecasting monthly precipitation in Sylhet city using ARIMA model. Civil Environ Res 7(1):69–77 7. Zhang CJ, Zeng J, Wang HY, Ma LM, Chu H (2020) Correction model for rainfall forecasts using the LSTM with multiple meteorological factors. Meteorol Appl 27(1):e1852 8. Luk KC, Ball JE, Sharma A (2001) An application of artificial neural networks for rainfall forecasting. Math Comput Model 33(6–7):683–693 9. Ismail Fawaz H, Forestier G, Weber J, Idoumghar L, Muller PA (2019) Deep learning for time series classification: a review. Data Mining Knowl Discovery 33(4):917–963 10. Poornima S, Pushpalatha M (2019) Prediction of rainfall using intensified LSTM based recurrent neural network with weighted linear units. Atmosphere 10(11):668 11. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780 12. Siami-Namini S, Tavakoli N, Namin AS (2018) A comparison of ARIMA and LSTM in forecasting time series. In: 2018 17th IEEE international conference on machine learning and applications (ICMLA). IEEE, pp 1394–1401 13. Graves A, Jaitly N, Mohamed AR (2013) Hybrid speech recognition with deep bidirectional LSTM. In: 2013 IEEE workshop on automatic speech recognition and understanding. IEEE, pp 273–278 14. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Zheng X, et al (2016) Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467 15. Chollet F (2018) Keras: the python deep learning library. Astrophysics source code library, ascl-1806 16. Nair V, Hinton GE (2010) Rectified linear units improve restricted boltzmann machines. In Icml 27, pp 807–814 17. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 18. Cui Z, Ke R, Pu Z, Wang Y (2020) Stacked bidirectional and unidirectional LSTM recurrent neural network for forecasting network-wide traffic state with missing values. Transp Res Part C Emerging Technol 118:102674
19. Zhang CJ, Zeng J, Wang HY, Ma LM, Chu H (2020) Correction model for rainfall forecasts using the LSTM with multiple meteorological factors. Meteorol Appl 27(1):e1852 20. Rasel RI, Sultana N, Meesad P (2017) An application of data mining and machine learning for weather forecasting. In: International conference on computing and information technology. Springer, Cham, pp 169–178
Chapter 14
A New Face Recognition System
Anmol Tyagi and Kuldeep Singh
1 Introduction

Social interactions rely heavily on facial expressions to communicate a person's identity and range of emotions. When we meet someone for the first time, it is easy for us to recognize their face, even if we have not seen them in years. This ability remains quite strong despite changing circumstances, ageing, and visual distractions such as beards, spectacles, or changes in hairstyle. A person's face is an integral element of visual perception, and recognizing it is actually one of our most fundamental talents. Imagine looking at a portrait photograph without realizing that you are the subject of the shot [1]. Even worse, imagine that, while you go about your regular activities, you run into known people and recognize their looks, but you lack the cognitive capacity to identify them without other available indicators, such as voice, hairstyle, gait, clothing, and context. There are certain areas in the brain that give us such remarkable talents as face recognition [2].
1.1 Background

People detection and tracking for restricted or high-security places is one of the significant study subjects that has attracted a lot of interest in the past few years. Although human identification and counting technologies are commercially available, there is a need for more research to solve the problems of real-life settings. There are plenty of security cameras deployed around us, but there are no methods to monitor all of
A. Tyagi (B) · K. Singh Department of Electronics and Communication Engineering, Guru Jambheshwar University of Science and Technology, 125001, Hisar, Haryana, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_14
them continually. It is necessary to build computer vision-based technologies that automatically scan the images in order to detect harmful scenarios or odd behavior [3]. An automated video surveillance system handles real-time monitoring of individuals within a busy setting, leading to a description of their behaviors and interactions. It needs identification and tracking of persons to provide security, safety, and site management [4]. Face identification is one of the key phases in automated video surveillance. Face detection from a video sequence is generally achieved via a background subtraction technique, a frequently used method for identifying moving objects from static cameras. If a human being is discovered, tracking lines are established around that person and monitored [5, 6].
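As a concrete illustration of the background-subtraction step mentioned above, here is a minimal sketch using OpenCV's MOG2 subtractor (the choice of MOG2, the video path, and the blob-size threshold are assumptions, not details from the cited systems):

```python
import cv2

cap = cv2.VideoCapture("surveillance.mp4")      # hypothetical video file
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                               # foreground mask
    mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)[1]   # drop shadow pixels
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:                             # ignore small blobs
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("tracking", frame)
    if cv2.waitKey(30) & 0xFF == 27:                             # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```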
1.2 Face Recognition

To verify a user's identity via ID verification services, face recognition systems compare the image of a person's face against a database of known people's faces. These systems function by analyzing an image to determine a person's facial characteristics [7]. Computer programs of this kind were initially created in the 1960s. Smartphones and other gadgets, including robots, are increasingly utilizing face recognition technologies. Biometrics encompasses the examination of a human's physiological traits, which is why computerized face recognition falls under this category [8]. Despite the fact that facial-recognition systems are less precise than other biometric technologies, such as iris and fingerprint detection, the approach is widely accepted. Human-computer interaction, surveillance, and picture indexing have all benefited from enhanced facial recognition systems [9]. Governments and commercial enterprises throughout the globe are already using facial recognition technologies, although many of these systems have already been discarded due to their ineffectiveness. Critics argue that facial-recognition technologies infringe on residents' privacy, often misidentify people, promote gender and racial stereotyping, and do not adequately secure biometric data. Meta has stated that it will shut down Facebook's facial recognition technology, erasing the face scan data of more than a billion users, in response to mounting social concerns. In the history of face recognition, this will be one of the biggest revolutions [10, 11] (Fig. 1).
The steps involved in a face recognition model are [12]:
1. Face Detection: Keep track of the coordinates of the bounding boxes created around the faces that are located.
2. Face Alignment: Normalize each face in order to match the training database.
3. Feature Extraction: Extract characteristics from faces for training and recognition tasks.
4. Face Recognition: The face is compared to a database of previously known faces.
Fig. 1 Face recognition system
Fig. 2 Working of face recognition (capturing → extracting → comparing → matching)
In the old approach to facial recognition, each of these four processes had to be handled by a distinct module, which was time-consuming [13]. Libraries now exist that accomplish all four of these tasks in one go, as illustrated in the sketch below (Fig. 2).
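One such library is the open-source face_recognition Python package (built on dlib); the sketch below is an illustration of the four steps using that library, with placeholder image paths:

```python
import face_recognition

# 1-3. Detect, align, and encode the faces in a known image and a query image.
known_image = face_recognition.load_image_file("known_person.jpg")   # placeholder path
query_image = face_recognition.load_image_file("query_photo.jpg")    # placeholder path

known_encoding = face_recognition.face_encodings(known_image)[0]     # 128-d embedding
query_locations = face_recognition.face_locations(query_image)
query_encodings = face_recognition.face_encodings(query_image, query_locations)

# 4. Recognition: compare each detected face against the known encoding.
for loc, enc in zip(query_locations, query_encodings):
    match = face_recognition.compare_faces([known_encoding], enc, tolerance=0.6)[0]
    print(loc, "match" if match else "unknown")
```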
1.3 Techniques for Face Recognition

In spite of the fact that humans can easily recognize faces, computers have a hard time with facial recognition. As the lighting and facial expressions vary, so does the appearance of the three-dimensional human face being recognized by facial recognition algorithms [14]. To conduct this computational job, face recognition algorithms employ four different procedures. Face detection is first used to isolate the subject's head from the rest of the image. The second step takes into account factors including the position of the subject's face, the size of the image, and the quality of the photograph itself, such as lighting and grayscale [15]. In the third stage, the alignment approach is meant to allow accurate extraction of facial features; the characteristics of the face, such as the eyes, nose, and mouth, are located in the photograph. In the
fourth stage, the established feature vector of the face is compared to a database of other faces [16].
1.4 Face Recognition and Verification Structures

Face detection is a challenging task, since the combination of scale, region, pose, occlusion, facial expression, and lighting conditions changes the appearance of a face. Feature-based and appearance-based strategies are combined for face detection and verification, making use of structural cues such as skin color and motion, to name a few [17].
1.5 Face Recognition with Python

This section describes how to perform face recognition with Python: the required libraries are installed, and faces are then recognized in both still photographs and live video feeds. As we will see, real-time facial recognition is possible with this solution [18].
1.6 Deep Learning

Deep learning is one of the many subfields of machine learning. Using well-structured data, it is able to learn and provide useful, knowledge-rich outputs. It is often referred to as "deep neural learning" because it is inspired by the way the brain processes new information [19]. Deep learning takes many data points into account, whereas classical machine learning systems generally need to be supervised: it is up to the programmer to specify the details of what is being searched for. In order to find a picture of a dog, for example, one must define the search for the computer quite thoroughly. With a huge knowledge base and more autonomous software, better results are obtained on every search, which is why deep learning can be faster and more accurate than simple machine learning [20]. Machine learning itself is an important application of AI: it allows a system to develop and improve without being explicitly programmed for every case. Its goal is to create computer programs that can learn from data, and the major objective is to allow computers to learn on their own, without the help of a human teacher, so that human aid is not needed at all. The following are some of the benefits of machine learning [21] (Fig. 3):
Fig. 3 Deep learning (deep learning as a subset of machine learning, which is a subset of artificial intelligence)
1. It is used across entire industries, such as financial services and healthcare.
2. It allows devices to reduce their time cycles, improving their efficiency in terms of the energy they require.
3. Using this strategy, social media businesses like Google and Facebook customize their ads based on the interests of their users.
4. For big and complicated process systems, machine learning may improve quality.
1.7 Image Processing

An image is simply a two-dimensional signal. It is described mathematically by a function F(x, y) of two spatial coordinates x and y, and the pixel value at any point can be obtained from F(x, y). The amplitude of F at the spatial coordinates (x, y) of a particular point is defined as the intensity of the image at that position. In a digital picture, all three of F's parameters (x, y, and amplitude) take finite, discrete values, so a picture may be defined as a two-dimensional array of rows and columns [22]. Each element of the digital image has a unique value at its precise location in the image; these elements are called picture elements, or pixels, and pixels are most typically used to denote the image's individual components. Signal processing has various aspects, including signal analysis, storage, and filtering (e.g., digital signal processing), and it covers audio and visual signals as well as other kinds of signals [23]. Image processing deals with the subset of these signals in which both the input and the output are pictures; as the name indicates, it deals with picture processing, and both digital image processing and analogue image processing are subsets of this larger field [24, 25]. Digital Image Processing (DIP) refers to the software used to process digital images and spans computer graphics, communications, photography, camera mechanisms, and pixels; images may be enhanced, and signal processing may be applied to analogue or digital data, including picture and voice signals. Various formats are available for images, and in digital image processing, algorithms are used to change pictures [26]. Adobe Photoshop is the industry standard for digital picture editing (Fig. 4).
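A small sketch of this F(x, y) view of a digital image, using OpenCV and NumPy (the file names are placeholders):

```python
import cv2

img = cv2.imread("sample.jpg", cv2.IMREAD_GRAYSCALE)  # 2-D array of pixel values
if img is None:
    raise FileNotFoundError("sample.jpg is a placeholder path")

rows, cols = img.shape
print(rows, cols)          # the image as an array of rows and columns
print(img[10, 20])         # intensity F(x, y) at row 10, column 20
img[10, 20] = 255          # digital image processing = algorithmically changing pixels
cv2.imwrite("modified.jpg", img)
```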
Fig. 4 Image processing (digital image processing, computer vision, computer graphics, and AI, with images and descriptions as inputs and outputs)
1.8 Need of Research

Terror acts have sparked a worldwide effort by governments to improve border security. The use of biometric face recognition technology is one of several techniques under testing to keep countries safe. Human-restricted regions might benefit greatly from an automated real-time tracking system that could one day become the standard security platform for border control across the world. The development of technologies for identifying and monitoring persons in limited or high-security settings has been one of the most significant study topics in recent years. Although commercially available human recognition and counting technologies exist, further research is needed to address the challenges of real-world situations. There is a plethora of surveillance cameras set up all around us, but it is impossible to keep tabs on them all simultaneously. A computer vision-based solution is needed to automatically analyze these images to identify problematic situations or aberrant behavior. Using an automated video surveillance system, individuals may be seen in real time and their activities and interactions can be described. Security, safety, and site management all rely on the ability to identify and monitor individuals. Automated video surveillance relies heavily on object detection. Background subtraction is the primary method for identifying objects in a video: moving objects may be detected from static cameras using this widely utilized method, which, as the name implies, removes a video's background. Using a single camera, the surveillance system here aims to identify and monitor humans. Segmenting moving objects in video is accomplished by employing a camera set at an appropriate location together with a background removal technique. Tracing lines are drawn around the human body if a human entity is recognized. The system processes a human input in seconds and sends an e-mail warning for security reasons when it detects one. The primary goal is to build a real-time security system. After studying the literature on face location and face recognition and determining possible, verifiable scenarios where such frameworks might be advantageous, we arrived at the following problem scope for our venture. Prerequisites for any supporting systems were noted:
• Automated facial recognition in static photos.
• A frontal-view facial recognition system.
• These systems will be provided with expressionless, frontal-view images of faces.
• All installed systems must handle variation in illumination.
• The performance of all systems must be near real-time.
• Face detection must be supported in both fully automated and manually operated modes simultaneously.
• Only one known picture from each person will be used for frontal-view face recognition.
• Automated face detection and identification technologies should be integrated into a single system. The segmented image generated by the face detection sub-system must show some degree of invariance to scaling and rotation errors for the face recognition sub-system.
2 Literature Review

Song Zhou et al. (2018) studied 3D face recognition, which has been a popular research field in both industry and academia. It inherits advantages from standard 2D face recognition, such as the natural recognition process and a wide range of applications. Moreover, 3D face recognition systems can accurately distinguish human faces even under dim lighting and with variable facial postures and emotions, settings in which 2D face recognition systems would have tremendous trouble working. The article outlines the history of and the most recent achievements in the 3D face recognition research arena. The frontier research outcomes were introduced in three categories: pose-invariant recognition, expression-invariant recognition, and occlusion-invariant recognition. To assist future study, the document compiles information regarding freely available 3D face datasets and also identifies major outstanding problems [1].
Xiao-Zhen Li et al. (2018) presented a quantum grey-image encryption-compression strategy based on the quantum cosine transform and a 5-dimensional hyperchaotic system, developed to achieve improved encryption efficiency and to accomplish the compression of quantum images. The original picture was compressed using the quantum cosine transform and Zigzag scan coding. Owing to the 5-dimensional hyperchaotic system's more complicated dynamic behavior, better randomness, and unpredictability, the suggested quantum image compression technique has a bigger key space and superior security. It was demonstrated that the suggested quantum picture encryption-compression system is more efficient and secure than its classical counterpart [2].
Yunfei Li et al. (2018) noted that it is difficult to recognize faces in the real world because of the variety in lighting, background, and pose. Face recognition algorithms based on deep learning have recently demonstrated the ability to learn useful aspects of faces and achieve quite outstanding results. A face-recognition system based only on deep learning, however, entirely neglects the beneficial hand-crafted traits that have been extensively researched over a long period of time. Face
identification using a deep learning feature aided by the facial texture feature (FTFA-DLF) was therefore suggested in that research. In the proposed FTFA-DLF, deep learning and hand-crafted characteristics may be combined. The suggested FTFA-DLF approach extracts hand-crafted features as textural characteristics from the eye, nose, and mouth areas. A combination of deep learning and hand-crafted features was then added to the goal function layer, which adaptively alters the deep learning features to better cooperate with the hand-crafted features and achieve superior face recognition performance. Using the LFW face database, the suggested face recognition system was able to obtain a 97.02 percent accuracy rate [3].
R. Ponuma et al. (2018) reviewed compressive sensing-based encryption, which uses a low-complexity sampling approach to accomplish simultaneous compression and encryption while also being computationally secure. Incoherent rotated chaotic measurement matrices based on a new 1D chaotic map are presented in that study for the first time. The suggested map's chaotic behavior was put to the test in an experiment. The suggested map generates a chaotic sequence that distorts and muddles the linear measurements. Data storage and bandwidth needs are decreased by the use of a chaos-based measurement matrix, because only the parameters necessary to produce the chaotic sequence need to be stored. Data transmission is safe thanks to the chaos's sensitivity to its parameters. Both the input data and the parameter used to produce the chaotic map influence the secret key used in the encryption procedure; thus, the suggested system was able to withstand plaintext-based attacks. The suggested scheme's key space is big enough to withstand statistical assaults. Experimentation and security analysis show that the suggested compression-encryption method is safe and secure [4].
Mohamed Uvaze Ahamed Ayoobkhan et al. (2018) introduced a method for lossless picture compression based on prediction errors. Using a unique classifier that incorporates wavelet and Fourier descriptor information, greater compression is achieved. ANNs were utilized as the predictive model, and an optimal ANN configuration was found for each picture class. The prediction errors are encoded using entropy encoding in the second step, which further enhances the compression performance. By making the projected values integers at both the compression and decompression stages, lossless prediction may be achieved. CLEF med 2009, COREL1 k, and conventional benchmarking pictures were used to evaluate the new technique. It was found that the suggested technique produces good compression ratio values, and for standard photos, the compression ratio values achieved were greater than those obtained by well-known methods [5].
Yongqing Zhang et al. (2018) observed that in recent years deep convolutional networks have proved their potential to improve discriminative power compared to other machine learning methods; however, their feature learning process is not particularly well understood. ICANet, a cascaded linear convolutional network based on ICA filters, was described in that study. A convolutional layer, a binary hash, and a block histogram make up ICANet. In comparison to other approaches, it offers the following advantages: each layer parameter in ICANet can be trained simply, compared to more complex deep learning models, because the network topology is simple and computationally efficient, and the ICA filter may be taught
using an unsupervised approach on unlabeled samples. With ICANet as a benchmark, a deep learning framework may be applied to large-scale picture categorization. As a last step, they tested ICANet's face recognition capabilities against two publicly available databases: AR and FERET [6].
S. Hanis et al. (2018) noted that compression and security have become more important as a result of the widespread availability and increased use of multimedia apps. A technique for key generation and an algorithm for encrypting double images were presented. To produce the encryption keys, a new modified convolution and chaotic mapping approach was used. Pictures were shortened and merged using the suggested logistic mapping, with the four least significant bits of the two images combined first. Additionally, cellular automata-based diffusion is applied to the final picture to further enhance security. The encryption system appears to have been improved by using both confusion and diffusion. The unpredictability of the key and of the algorithm was successfully tested and found to perform well in terms of randomness. Real-time scenarios benefit from the fact that two pictures are compressed and encrypted concurrently [7].
Puja S. Prasad et al. (2019) observed that face recognition is a difficult problem to solve because of the sheer volume of data available. Biometrics is dominated by deep learning, which has offered a decent solution in terms of recognition performance. Using deep learning, they investigated face representation under a variety of situations, including lower and upper face occlusions, misalignment, varied head positions, shifting illuminations, and inaccurate facial feature localization. In that study, two well-known deep learning models, VGG-Face and Lightened CNN, were used to extract face representations. The work demonstrates the robustness of the deep learning models to various forms of misalignment and their ability to tolerate errors in interocular distance localization [8].
James R. Clough et al. (2019) presented, to the best of their knowledge, the first strategy for directly incorporating topological prior information into deep learning-based segmentation. The persistent homology notion, a topological data analysis tool, is used in their approach to capture high-level topological properties of segmentation results in a way that is differentiable with respect to the pixelwise likelihood of being assigned to a certain class. The segmentation's intended Betti numbers form the topological prior knowledge. They demonstrate the technique by applying it to the problem of left-ventricle segmentation of cardiac MR images of patients from the UK Biobank dataset, where it increases segmentation performance in terms of topological correctness without losing pixelwise accuracy [9].
Fraol Gelana et al. (2019) studied the fact that more and more public locations like hotels and cinemas are being targeted by terrorists and lone wolves, which has made it clear that CCTV coverage needs to be considerably denser. The proliferation of CCTV cameras has made it nearly impossible for a human operator to review all the video streams and identify potential terrorist occurrences. An "Active Shooter" is one of the most prevalent terror events; an example is the attack on an outdoor music festival in Las Vegas, USA, on Oct. 1, 2017, in which an assailant opened fire with a gun. Accordingly, the identification of an "Active Shooter"
with a visible firearm, and the audible alarming of the CCTV operator to a potentially dangerous occurrence, was carried out in that study. A convolutional neural network classifier is used to categorize items as either a gun or not a gun, and the proposed method's classification accuracy is 97% [10].
Nhat-Duc Hoang et al. (2019) presented an automated model for identifying and categorizing cracks in asphalt pavement. For feature extraction, image processing techniques such as steerable filters, projective integrals of images, and an improved image thresholding approach were used. Many scenarios for feature selection were tested for the purpose of creating data sets from digital photos. Machine learning methods such as the support vector machine (SVM), the artificial neural network (ANN), and the random forest (RF) were trained and tested on these data sets. The best results were achieved by combining attributes derived from the projective integral with the properties of crack objects. For classification accuracy, Wilcoxon signed-rank tests support the claim that, among SVM, ANN, and RF, SVM obtained the best level of success (97.50%). The suggested automated technique could therefore aid transportation organizations and inspectors in the assessment of pavement quality [11].
Eftychios Protopapadakis et al. (2019) described a crack detection system for concrete tunnel surfaces. High accuracy requirements, low operational time requirements, limited hardware resources, variable lighting conditions, low-textured lining surfaces, scarcity of training data, and abundance of noise are all addressed by the proposed methodology, which utilizes deep convolutional neural networks and domain-specific post-processing techniques. Instead of laborious manual feature extraction, the proposed approach takes advantage of the representational capability provided by CNN convolutional layers, which automatically select useful features. As a result, the suggested framework achieves high performance rates at a substantially reduced execution time compared to prior methodologies. The system was tested and validated in the Egnatia Motorway's tunnels in the Greek town of Metsovo. The results show that the suggested method is superior to other approaches and has a bright future as a driver for autonomous concrete-lining tunnel-inspecting robots [12].
Dr. V. Suresh et al. (2019) presented face recognition-based attendance monitoring for educational institutions as the primary goal of their project, which aims to improve and modernize the existing attendance system by making it more efficient and effective. The existing approach was riddled with ambiguity, making attendance tracking both inaccurate and inefficient, and many issues developed when the authority was unable to enforce the previous system's regulations. The face-recognition system is the technology at work behind the scenes. One of the most distinguishing features of a person is their face; as a result, it is commonly used to verify a person's identity, because the chances of a face being misidentified or copied are so minimal. Facial recognition data is fed into an algorithm using face databases built for this research. Faces are then matched to the database throughout the attendance-taking process in order to verify identities. Whenever a person is detected,
their attendance will be entered into a spreadsheet and saved. The students' attendance record will be sent to their professors at the end of each day in the form of an Excel spreadsheet [13].
Thulluri Krishna Vamsi et al. (2019) noted that today's world faces a wide range of security concerns, which must be tackled with the most up-to-date technologies. Face recognition was used in their research to collect human photos and compare them with database photographs; the door is unlocked via an electromagnetic lock if the user's identity matches one in the system. Fast and precise facial recognition systems that can identify attackers and prevent illegal access to highly protected places are becoming increasingly important, and among biometrics, face recognition ranks as one of the most secure methods. The timing and accuracy of face recognition in real-time situations is regarded as one of the most important issues, and multicore systems have been presented as a means of addressing a number of them. Taking the current problem into account, the LBPH algorithm provides a thorough architectural design and an analysis for a real-time face recognition system. Photographs are saved in a database using this technique, which converts color images to grayscale and breaks them down into individual pixels. To open the door, the microcontroller transfers power to the motor driver unit, which in turn powers the electromagnetic lock; the door locks again if the microcontroller cuts power to that unit. The study continues by discussing the sophisticated implementations made possible by the use of embedded system concepts that go beyond the established approaches [14].
Umara Zafar et al. (2019) noted that several difficulties in face image recognition, including variable pose, expression, lighting, and resolution, make this research challenge extremely difficult. A recognition method's capacity to deal with low-quality face photos is a critical component of its overall resilience. Deep convolutional neural networks (DCNNs) are well-suited for face identification because of their ability to learn robust features from raw face photos. To create a prediction, DCNNs employ SoftMax to measure the model's confidence in a class for an input face picture. Because of this, the SoftMax probabilities do not accurately represent model confidence and can be deceptive in feature spaces that were not covered by training instances. Using model uncertainty, the authors aimed to reduce the number of false positives in face recognition systems. Experiments on open-source datasets reveal that model uncertainty improves accuracy by 3–4% over DCNNs and traditional machine learning approaches when applied to these datasets [15].
3 Research Methodology

It has been observed that several research studies already exist, but the issue with previous research work is a lack of accuracy and performance. The present work reviewed previous research in the area of face recognition and identified issues of accuracy and performance; moreover, some research works had issues with multiple face detection. The proposed work has applied the
Fig. 5 Research methodology (literature survey: face recognition system, image processing, Python; problem statement: multiple face detection, accuracy, and performance issues; proposed statement: integration of an image compression mechanism into the face detection system to improve performance and accuracy; evaluation: comparative analysis of performance and accuracy, implementation of multiple face detection)
image compression mechanism to reduce the image size in order to speed up face detection. Finally, a comparative study of accuracy and performance was made in order to evaluate the proposed work (Fig. 5).
Algorithm for face detection (a structural sketch of this flow is given below):
1. Input the image and perform window scanning to get S.
2. Perform illumination correction to get S'.
3. Apply a gradient map to get G.
4. Perform the block rank pattern test:
   IF true, perform the geometric face model test:
     If true, consider it a face;
     Else, consider it a non-face;
   Else, consider it a non-face.
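A rough Python sketch of this flow follows; the two pattern tests are represented by crude placeholder heuristics because the chapter does not provide their implementations, so this is only a structural illustration, not the actual detector:

```python
import cv2
import numpy as np

def block_rank_pattern(grad):
    # Placeholder heuristic standing in for the block rank pattern test.
    return grad.mean() > 20

def geometric_face_model(grad):
    # Placeholder heuristic standing in for the geometric face model test
    # (a crude left-right symmetry check on the gradient map).
    h, w = grad.shape
    return abs(grad[:, : w // 2].mean() - grad[:, w // 2 :].mean()) < 10

def detect_faces(image, win=64, step=32):
    faces = []
    for y in range(0, image.shape[0] - win + 1, step):         # 1. window scanning -> S
        for x in range(0, image.shape[1] - win + 1, step):
            s = image[y:y + win, x:x + win]
            s_corr = cv2.equalizeHist(s)                        # 2. illumination correction -> S'
            g = np.abs(cv2.Sobel(s_corr, cv2.CV_64F, 1, 1))     # 3. gradient map -> G
            if block_rank_pattern(g) and geometric_face_model(g):  # 4. pattern and model tests
                faces.append((x, y, win, win))                  # consider this window a face
    return faces

img = cv2.imread("group_photo.jpg", cv2.IMREAD_GRAYSCALE)       # placeholder path
if img is not None:
    print(detect_faces(img))
```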
4 Result and Discussion

The present research considers a dataset of six characters: Ian Malcolm, Alan Grant, Ellie Sattler, Claire Dearing, John Hammond, and Owen Grady. To implement face recognition, the following tools have been used.
OpenCV: Our face recognition system's first stage is to detect faces. Deep learning is used to extract face embeddings from each face. A face recognition model is then trained on the embeddings. Finally, OpenCV is used to identify faces in photos and video streams.
Deep learning: Face-recognition software may benefit from the latest advances in deep learning. For example, a trained model may be used to detect people from a database in other photos and videos using facial embeddings extracted from images of their faces (Fig. 6).
The Bing Image Search API was used to construct an example face recognition dataset; six characters from the Jurassic Park film series are shown. The image dataset is then compressed and training is performed. Finally, a composite image is used for testing, in which faces are recognized according to the trained model (Fig. 7).
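As an illustration of this detect → embed → train → recognize flow, here is a minimal sketch using the face_recognition library and a scikit-learn SVM as the recognition model (the dataset folder layout, file paths, and classifier choice are assumptions; the chapter does not specify them):

```python
import glob
import face_recognition
from sklearn.svm import SVC

# Build (embedding, label) pairs from an assumed folder-per-character layout,
# e.g. dataset/owen_grady/*.jpg
X, y = [], []
for path in glob.glob("dataset/*/*.jpg"):
    label = path.split("/")[-2]
    image = face_recognition.load_image_file(path)
    encodings = face_recognition.face_encodings(image)
    if encodings:                       # skip images where no face was found
        X.append(encodings[0])
        y.append(label)

clf = SVC(kernel="linear", probability=True).fit(X, y)   # recognition model

# Recognize every face detected in a composite test image.
test = face_recognition.load_image_file("composite.jpg")  # placeholder path
for enc in face_recognition.face_encodings(test):
    print(clf.predict([enc])[0])
```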
Fig. 6 Image dataset
Fig. 7 Multiple face detection and identification
5 Comparative Analysis
This section presents the comparative analysis of time and file-space consumption for the proposed and conventional face detection approaches. The comparison of time taken is given in Fig. 8 and Table 1, and the comparison of file size in Fig. 9 and Table 2. The confusion matrix for conventional face detection is given in Table 3, and that for the proposed face detection in Table 4.
Fig. 8 Comparison of time taken
Table 1 Comparison of time taken (in seconds)

Images  Conventional approach  Proposed approach
1       0.903157627            0.09222942
2       2.723790506            2.07552306
3       3.061927303            2.75117646
4       4.201901528            3.70034699
5       4.577506078            4.18033226
6       6.537213925            5.61916959
7       7.77368333             6.82813547
8       7.432674618            7.27400176
9       8.678987474            8.30049636
10      10.33464955            9.96144301
Fig. 9 Comparison of file size

Table 2 Comparison of file size (in kB)

Images  Conventional approach  Proposed approach
1       5.78203346             5.42799776
2       10.13636               9.58213921
3       8.147052               7.43287004
4       12.1676015             11.8137784
5       6.25941332             5.82822702
6       12.8026995             11.810289
7       13.028142              12.3541929
8       12.2422228             11.9422597
9       10.1894864             9.34536124
10      16.3579357             16.0764768
Table 3 Confusion matrix in case of conventional face detection

           Person A  Person B  Person C
Person A   6         1         1
Person B   1         5         2
Person C   2         0         6

Class  n (truth)  n (classified)  Accuracy  Precision  Recall  F1 Score
1      9          8               79.17%    0.75       0.67    0.71
2      6          8               83.33%    0.63       0.83    0.71
3      9          8               79.17%    0.75       0.67    0.71

Result: TP: 17, Overall Accuracy: 70.83%
Table 4 Confusion matrix in case of proposed face detection

           Person A  Person B  Person C
Person A   7         0         1
Person B   1         6         1
Person C   1         0         7

Result: TP: 20, Overall Accuracy: 83.33%

Class  n (truth)  n (classified)  Accuracy  Precision  Recall  F1 Score
1      9          8               87.5%     0.88       0.78    0.82
2      6          8               91.67%    0.75       1.0     0.86
3      9          8               87.5%     0.88       0.78    0.82
5.1 Comparative Analysis of Accuracy

See Table 5 and Fig. 10.

Table 5 Comparative analysis of accuracy

Conventional face detection  Proposed face detection
70.83%                       83.33%
Fig. 10 Comparison of Overall Accuracy
6 Conclusion

The simulation shows that the proposed work takes less time during face recognition than the previous model. Moreover, file-space consumption is lower in the case of the proposed work, and the accuracy of the proposed work is better than that of the conventional approach. In this way, the proposed work provides a more efficient multiple face detection and recognition mechanism.
References 1. Zhou S, Xiao S (2018) 3D face recognition: a survey. Human-centric Comput Inf Sci 8(1). https://doi.org/10.1186/s13673-018-0157-2 2. Li XZ, Chen WW, Wang YQ (2018) Quantum image compression-encryption scheme based on quantum discrete cosine transform. Int J Theor Phys 57(9):2904–2919. https://doi.org/10. 1007/s10773-018-3810-7 3. Li Y, Lu Z, Li J, Deng Y (2018) Improving deep learning feature with facial texture feature for face recognition. Wirel Pers Commun 103(2):1195–1206. https://doi.org/10.1007/s11277018-5377-2 4. Ponuma R, Amutha R (2018) Compressive sensing based image compression-encryption using Novel 1D-Chaotic map. Multimed Tools Appl 77(15):19209–19234. https://doi.org/10.1007/ s11042-017-5378-2 5. Uvaze M, Ayoobkhan A, Chikkannan E, Ramakrishnan K, Balasubramanian SB (2018) Prediction-based lossless image compression, vol 2018. Springer International Publishing. https://doi.org/10.1007/978-3-030-00665-5 6. Zhang Y, Geng T, Wu X, Zhou J, Gao D (2018) ICANet: a simple cascade linear convolution network for face recognition. Eurasip J Image Video Process 1:2018. https://doi.org/10.1186/ s13640-018-0288-4 7. Hanis S, Amutha R (2018) Double image compression and encryption scheme using logistic mapped convolution and cellular automata. Multimed Tools Appl 77(6):6897–6912. https:// doi.org/10.1007/s11042-017-4606-0 8. Prasad PS, et al (2019) Deep learning based representation for face recognition. May 2012, pp 419–424 9. Clough JR, Oksuz I, Byrne N, Schnabel JA, King AP (2019) Explicit topological priors for deeplearning based image segmentation using persistent homology, vol 11492. LNCS, Springer International Publishing. https://doi.org/10.1007/978-3-030-20351-1_2
10. Gelana F, Yadav A (2019) Firearm detection from surveillance cameras using image processing and machine learning techniques, vol 851. Springer, Singapore. https://doi.org/10.1007/978981-13-2414-7_3 11. Hoang ND, Nguyen QL (2019) A novel method for asphalt pavement crack classification based on image processing and machine learning. Eng Comput 35(2):487–498. https://doi.org/ 10.1007/s00366-018-0611-9 12. Protopapadakis E, Voulodimos A, Doulamis A, Doulamis N, Stathaki T (2019) Automatic crack detection for tunnel inspection using deep learning and heuristic image post-processing. Appl Intell 49(7):2793–2806. https://doi.org/10.1007/s10489-018-01396-y 13. Suresh V, Dumpa SC, Vankayala CD, Rapa J (2019) Facial recognition attendance system using python and OpenCv. Quest J Softw Eng Simul 5(2):2321–3809 [Online]. www.questjournal s.org 14. Vamsi TK (2019) Face recognition based door unlocking system using Raspberry Pi’, Academia. Edu.stem using Raspberry Pi. Academia Edu 5(2):1320–1324 15. Zafar U et al (2019) Face recognition with Bayesian convolutional networks for robust surveillance systems. Eurasip J Image Video Process 1:2019. https://doi.org/10.1186/s13640-0190406-y 16. Ding X, Raziei Z, Larson EC, Olinick EV, Krueger P, Hahsler M (2020) Swapped face detection using deep learning and subjective assessment. Eurasip J Inf Secur 1:2020. https://doi.org/10. 1186/s13635-020-00109-8 17. Khan S, Akram A, Usman N (2020) Real time automatic attendance system for face recognition using face API and OpenCV. Wirel Pers Commun 113(1):469–480. https://doi.org/10.1007/ s11277-020-07224-2 18. Oloyede MO, Hancke GP, Myburgh HC (2020) A review on face recognition systems: recent approaches and challenges. Multimed Tools Appl 79(37–38):27891–27922. https://doi.org/10. 1007/s11042-020-09261-2 19. Ríos-Sánchez B, Da Silva DC, Martín-Yuste N, Sánchez-Ávila C (2020) Deep learning for face recognition on mobile devices. IET Biom 9(3):109–117. https://doi.org/10.1049/iet-bmt.2019. 0093 20. Tirupal T, Rajesh P, Nagarjuna G, Sandeep K, Ahmed P (2020) Python based multiple face detection system. 6:5–14 21. Yuan Z (2020) Face detection and recognition based on visual attention mechanism guidance model in unrestricted posture. Sci Program 2020. https://doi.org/10.1155/2020/8861987 22. Zhu Z, Cheng Y (2020) Application of attitude tracking algorithm for face recognition based on OpenCV in the intelligent door lock. Comput Commun 154(900):390–397. https://doi.org/ 10.1016/j.comcom.2020.02.003 23. Agrawal P et al (2021) Automated bank cheque verification using image processing and deep learning methods. Multimed Tools Appl 80(4):5319–5350. https://doi.org/10.1007/s11042020-09818-1 24. Haq MA, Rahaman G, Baral P, Ghosh A (2021) Deep learning based supervised image classification using UAV images for forest areas classification. J Indian Soc Remote Sens 49(3):601–606. https://doi.org/10.1007/s12524-020-01231-3 25. Sunaryono D, Siswantoro J, Anggoro R (2021) An android based course attendance system using face recognition. J King Saud Univ Comput Inf Sci 33(3):304–312. https://doi.org/10. 1016/j.jksuci.2019.01.006 26. Thomas RM, Sabu M, Samson T, Mol S, Thomas T (2021) Real time face mask detection and recognition using python. 9(7):57–62 [Online]. www.ijert.org
Chapter 15
A Review on Quality Determination for Fruits and Vegetables
Sowmya Natarajan and Vijayakumar Ponnusamy
1 Introduction
Fruit and vegetable quality can be determined from a few main attributes: flavour (aroma, taste), texture, colour and appearance, and nutritional value [1]. Many features, such as soluble solids, smell, acidity and firmness, help to determine the quality of fruits and vegetables. Colour serves as a good indicator of fruit freshness, outer skin appearance and the presence of defects. Firmness is usually measured with a penetrometer in the field itself to identify the degree of fruit ripeness, stiffness and elasticity, while texture features describe tenderness, consistency, juiciness, firmness and smoothness. The presence of specific nutrients can also be inferred from the colour of the fruit (Hamza and Chtourou [15]), and pigmented components can be detected through the reflected colour of vegetables and fruits, since almost all of their colour is produced by a few families of pigments. The primary families of colour pigments are carotenoids (orange, yellow, red), chlorophyll (green) and flavonoids, which comprise anthoxanthins and anthocyanins (red, purple, blue). Chlorophyll pigments are sensitive to acid and heat but stable to alkali; carotenoids are stable to heat but sensitive to oxidation and light; anthocyanins are sensitive to heat and pH, while betalains and flavonoids are highly sensitive to heat. Sensory measurement describes the quality of food products, and the extracted features indicate a preference for selecting a good sample over a defective one. Sensory descriptive analysis is employed to determine small quality differences between samples of the same group or species (Barrett et al. [2]; Sowmya and Vijayakumar [3]). It also involves the determination of features such as smell, firmness, colour, gloss, shape and size of the fruit or vegetable samples.
S. Natarajan · V. Ponnusamy (B) Department of ECE, SRM Institute of Science and Technology, Chennai, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_15
Fig. 1 Techniques employed for quality determination of vegetables and fruits (methodologies employed: sensor, spectroscopic, colorimetric, image and chemical analysis; extracted features: colour, shape, texture, flavour, aroma, firmness, size, freshness; processing: statistical analysis and machine learning/deep learning; outcome: quality determination, classification, grading and sorting)
Instrumental analysis involves the determination of colour pigments, acidity, texture and the chemical molecular components of the samples; instrumental techniques provide more sensitive and accurate results and relate directly to physical and chemical properties. Figure 1 shows the various methods and techniques deployed for quality determination of fruits and vegetables. From Fig. 1 it can be seen that both physical and chemical analyses are performed to extract the specific attributes used to classify quality. The features are processed through statistical and machine learning analysis for classification, and the classification results are used for sorting and grading of fruits and vegetables. The following literature survey outlines the emerging destructive and non-destructive techniques for fruit quality analysis: imaging, sensor-based analysis, spectroscopy, chemistry-based destructive experimentation, and some of the spectroscopic sensors employed for quality assessment.
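As a minimal, self-contained illustration of the pipeline in Fig. 1 (feature extraction followed by a learned classifier), the sketch below extracts coarse colour histograms from RGB images and trains a support vector machine. The dataset, labels and parameter choices are placeholders rather than those of any specific study reviewed here.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def colour_histogram(img: np.ndarray, bins: int = 8) -> np.ndarray:
    """Concatenate per-channel histograms of an RGB image (H x W x 3, uint8)."""
    feats = [np.histogram(img[..., c], bins=bins, range=(0, 256), density=True)[0]
             for c in range(3)]
    return np.concatenate(feats)

def train_quality_classifier(images, labels):
    """images: list of RGB arrays; labels: quality grade per image (placeholders)."""
    X = np.stack([colour_histogram(im) for im in images])
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)
    clf = SVC(kernel="rbf").fit(X_tr, y_tr)
    return clf, clf.score(X_te, y_te)   # classifier and held-out accuracy
```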
2 Imaging and Instrumental Methods for Quality and Grading
Colour, shape, size, gloss, acidity, firmness, molecular changes in the sample, odour and aroma are the attributes considered for fruit quality evaluation; texture, moisture, firmness, colour, shape, gloss, aroma, acidity and size are the main features used to determine the quality of fruit samples. Colour is often associated with the quality of agricultural products. An automatic grading system determines quality either by comparing the colour of the product with a predefined colour or against a fixed set of reference parameters.
This method is not convenient for the end user to predict the quality of the product. Automating the sorting and grading of food products increases productivity and the economic growth of the country, whereas the traditional way of sorting is time-consuming, subjective, expensive and inconsistent. Colour is an indirect measurement for classifying fruit ripening and maturity, and visual appearance generally plays a significant role in how buyers judge the quality of fresh commodities (Mitcham et al. [4]). The human eye perceives the light reflected from fresh fruits and vegetables, which helps to characterize the commodity; the colour attribute depends on the light intensity, the physical and chemical characteristics of the fresh product, and the ability of the individual to specify the colour. Gloss is another visual attribute used to judge the quality of fresh products: the light reflected from the surface of the product is observed to identify this attribute. Shape and size are further subjective evaluation attributes that capture desirable and undesirable aspects of commodities. The product must also be examined for defects such as disease, cuts, injury, physiological disorders and bruises; during quality identification this test helps to grade fresh fruits and vegetables. Firmness is the next factor, describing the degree of crispness or softness of the product, and it can be assessed by finger touch. Soluble solids content (SSC) can be determined from the refractive index measured with a refractometer, and the measured value is compared against reference values to assess freshness. Titration is a destructive indicator: a known volume of fresh fruit juice is titrated with 0.1 N sodium hydroxide to determine the titratable acidity, which is indicated using a pH meter (Mitcham et al. [4]). Some hazardous chemical substances are sprayed on fruits and vegetables to keep them looking fresh for a long time, and these chemicals cause severe health issues when consumed over a long period. Steroid hormones are emerging organic analyte pollutants present in the agro-ecosystem and are found in fruits, leaves and roots; progestins, oestrogens, androgens and glucocorticoids are the steroid hormones selected for fruits and vegetables. One work focuses on determining these steroid hormones sensitively through a multiclass analytical method, with spinach (leafy vegetable), carrot (root vegetable) and strawberry (fruit) chosen for testing. The results of Merlo et al. [5] show that the chemometric approach is able to quantify steroid hormone levels of a few nanograms per gram (ng/g) of sample. Malathion is an organophosphorus pesticide sprayed on farm fields to kill pests. The work by Huang et al. [6] presents a single-particle detection method based on colorimetric analysis to quantify malathion and offers new design insights for organophosphorus detection using an ultrasensitive assay. The probe was developed using MnO2-coated gold nanoparticles: malathion inhibits alkaline phosphatase activity, so the colour of the probe remains unchanged, and dark-field optical spectroscopy identifies the colour change of the probe, whose scattering intensity indicates the presence of malathion in fruits and vegetables. The work quantifies a malathion limit of detection as low as 0.82 pg/mL.
Shomaji et al. [7] measured chemicals present on the surface of vegetable samples by employing 1H NMR relaxometry. Raw okra, peas and red chillies were used as pure samples, while copper-sulphate-coated okra, malachite-green-dyed peas and Sudan-dye-treated red chillies were used as adulterated samples for experimentation. The measurement was obtained from the absorption relaxation spectra of NMR relaxometry, and the Carr-Purcell-Meiboom-Gill method estimated the effective transverse relaxation time constant, which is compared with library values to detect and quantify the adulterant dye. The proposed method was able to detect a toxic element at a concentration as low as 1 g/L (0.1%). In warm water the dyes wash out faster (2 min) than in water at room temperature, and the dye concentration was confirmed to increase with time. The proposed system does not support real-time analysis and was not tested with local market products. Colour is one of the features most widely used for image indexing and retrieval: first a colour space such as RGB, HSI or CIE Lab is specified, then features are extracted from that colour space. The review of Bhargava and Bansal [8] focuses on computer vision techniques deployed for sorting and grading the quality of fruits and vegetables. In food applications, various image acquisition tools such as computed tomography (CT), charge-coupled device (CCD) and CMOS sensors, and magnetic resonance imaging (MRI) are utilized. The acquired images are pre-processed and segmented, features are extracted, and the features are processed with machine learning and deep learning algorithms to classify or determine the quality of the samples. Colour correlogram, colour moments, coherence vector and colour histogram are the colour features extracted from the colour space; with these colour features it is possible to determine fruit maturity with 99.10% accuracy. Fruit quality has even been determined using RGB LEDs with wavelengths of 525, 635 and 470 nm. Maturity, ripening classification, sorting, grading and quality determination are all analysed using computer vision imaging techniques. A fruit quality measurement and evaluation method using the surface colour of banana fruit as the validation parameter was proposed by Wang et al. [9]: ten banana images acquired at various stages were used to train a back-propagation neural network (BPNN). Each RGB colour value was divided into five levels, which were used as training and validation inputs for the BPNN, with quality levels from high to low encoded as the binary values (10000), (01000), (00100), (00010) and (00001). The method shows the feasibility and usefulness of the determination but does not report the accuracy achieved by the BPNN.
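Colour moments of the kind listed above (mean, standard deviation and skewness per channel) are straightforward to compute; the sketch below is a generic illustration, not the exact feature set of any cited work.

```python
import numpy as np

def colour_moments(img: np.ndarray) -> np.ndarray:
    """First three colour moments (mean, std, skewness) for each RGB channel."""
    moments = []
    for c in range(3):
        channel = img[..., c].astype(np.float64).ravel()
        mean = channel.mean()
        std = channel.std()
        skew = np.mean((channel - mean) ** 3) / (std ** 3 + 1e-12)
        moments.extend([mean, std, skew])
    return np.array(moments)   # 9-dimensional feature vector
```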
An artificial-intelligence-based fruit quality grading system working on image processing was proposed by Priya et al. [10]. Classification algorithms such as K-nearest neighbour, convolutional networks and support vector machines are employed to classify fruits by quality and grade. From the input images, colour, mean and Histogram of Oriented Gradients (HOG) features are chosen to classify the quality, and real-time, microwave and RGB camera images are employed for classification. The KNN classifier achieves 88% fruit-freshness accuracy with the HOG feature. An audio system is integrated with the design to announce whether the fruit is good or bad for the images obtained. The system works well for image datasets containing specific features such as HOG components, contour, size, colour and texture; only apples were analysed, and it could be tested on other varieties of fruits and vegetables. Lee et al. [11] developed a colour-mapping concept to make quality assessment easier. A unique set of colour-mapping indices is developed for a specific application; the system converts the 3D RGB colour space into a 1D smooth and continuous colour space, and a threshold bar adjusts the colour preference towards slightly darker or slightly greener colours. The method was tested on tomato samples to determine maturity date and surface defects. Colour is the key sensory characteristic in evaluating the quality and ripeness of fruits, but traditional spectrophotometric and colorimetric analysis is limited to homogeneous colour distributions. Ratprakhon et al. [12] deal with the categorization of ripeness levels in mango fruit: mango peel and pulp are used to determine the ripeness of mangoes from the 'Mahachanok', 'Kent' and 'Nam Dokmai' cultivars, since the pulp and peel colour shows the pigment changes that occur during ripening. A charge-coupled device (CCD) camera is deployed in combination with colour standards and a computer vision system; the RGB colour values are converted into hue-saturation-intensity (HSI) values and the Natural Colour Standard (NCS). The colour-coded index combined with the computer vision system helps to predict the ripeness levels of mangoes with a colour pixel precision of ±5%. The system is not suitable for samples with no colour change across maturity stages; future work could determine the post-ripening stage during transport and storage, and firmness, total soluble solids, acidity and maturity parameters could also be measured to obtain a clearer biochemical view of the mango sample. The review of Dubey and Jalal [13] focuses on image processing and computer vision approaches to classify and detect disease in fruit samples, and discusses the main image processing steps of pre-processing, segmentation, feature extraction and classification. For automatic classification and recognition of diseases in fruits, colour and texture features play a significant role; features are extracted from the infected region of the segmented image, and average classification errors of 1 and 3% were reported using different ML algorithms. Only one fruit type was evaluated, and the work could be extended to more samples with different species varieties in a single image.
Kaur and Sharma [14] employ 2D fruit images to classify and grade fruits with an ANN model. The 2D images are classified with respect to the colour and shape of the fruit, but many fruit images may have similar colour and shape features, so these alone are not effective for differentiating and identifying fruit images. To increase the determination accuracy, an artificial neural network model is deployed: texture, colour and shape features are extracted from the images
and processed through the ANN, which classifies the fruit samples into 'good', 'better' and 'worst' qualities. The system fails to report the classification accuracy and analyses only a single variety of fruit. Three maturity and ripeness stages of apple are determined using colour features in the work of Hamza and Chtourou [15], where images obtained with a normal camera are applied to a fuzzy classifier. An apple sample is taken for experimentation, and the green, red and yellow areas are the attributes chosen for analysing ripeness and maturity. To achieve better performance, a gradient method was applied to tune the fuzzy classifier. The selected features were processed with an ANN, an SVM and a fuzzy inference system, and 99.33% accuracy was obtained with the fuzzy inference system in determining the level of ripeness and maturity. Colorimetric and sensory evaluation techniques were employed by Gnanasekharan et al. [16] to determine the change in green colour for broccoli, spinach, tomato and cucumber samples. The vegetables were kept under four controlled temperature environments (21, 4, 10 and 37 °C); colorimetric evaluation was conducted for 13 days, whereas sensory evaluation was carried out for 4 days. The study finds that the colour change is detected readily by both sensory and colorimetric evaluation when the samples are stored at abuse temperatures, and the colour-change pattern varies with the type of vegetable owing to differences in surface and object-light interaction. Analysis of variance (ANOVA) shows p < 0.001, a highly significant colour variation, for the broccoli, tomato and spinach samples, while cucumber shows insignificant colour variation due to injury and storage temperature. The study does not determine the green-colour change when the vegetables are kept at normal temperature. An optical sensor was developed by Yahaya et al. [17] to determine the fruit index at the peak responsive wavelength, with mango samples used to determine ripeness and maturity state. Blue (470 nm), green (525 nm) and red (635 nm) monochromatic LEDs are used, and an optical detector produces data for all three spectral bands. These optical detector data are used to train and classify the mango maturity index with a multiple linear regression model. The coefficient of determination reaches R2 = 0.879 for the combination of various ripeness indices, and testing with monochromatic (red) light alone still gave an accuracy of R2 = 0.795. The research could be extended to other fruit varieties and different wavelength ranges. A Near Field Communication (NFC) tag sensor was developed to classify and grade fruits based on colour by Lazaro et al. [18], with banana and apple samples used to test the system. The colour sensor acquires the RGB values from the samples and converts them into hue-saturation-value (HSV) form for further classification; hue angle and saturation are the significant features chosen for ripeness classification. The features are processed with machine learning algorithms such as Naïve Bayes, linear discriminant analysis, decision tree, K-nearest neighbour and SVM. For the banana sample, the Naïve Bayes algorithm achieves the best ripening-classification accuracy of 93%; for red apple, LDA obtains 73% accuracy, and
for golden apple Naïve Bayes achieves 83.50% accuracy. Moreover, a mobile application was developed to measure the quality of samples and communicate it to a cloud server. The system was able to measure only a single sample at a time and was tested with only three kinds of fruit; outliers are identified when samples are found with more pigments, after which the measurements are repeated. The system can be further trained to determine the ripeness, maturation and grading of fruits. Given its safety and nutritional value, the apple is considered a highly popular product. The research work of Li et al. [19] aims to classify and grade apple quality: a normal camera captures the images, attributes are selected from the images, and machine learning models such as a support vector machine (SVM), a convolutional neural network (CNN) and the Google Inception v3 model are applied. These models are trained and tested for quality determination, of which the CNN achieves 99% accuracy for three different varieties of apples. Dragon fruit mellowness and harvesting time are examined in the research work by Vijayakumar and Vinoth Kanna [20]: live images are captured and tested with the ResNet-152 deep CNN model, which classifies the harvesting period and mellowness with an area under the ROC curve (AUROC) of 1.0. Measurement of a freshness index is also a kind of fruit or vegetable quality measurement: a luster sensor is used by Althaus and Blanke [21] to determine the freshness index of bell pepper samples, with the sensor evaluating freshness indices from changes in surface appearance. Yellow, red and green bell peppers show freshness indices of 42.2, 40.4 and 16.2% on day 0, and the freshness rate decreases with the number of days; the work could be extended to other fruit varieties with changes in colour, size and shape, and to checking maturity. A new texture feature algorithm named Colour Completed Local Binary Pattern (CCLBP) was developed to recognize fruits and vegetables by Tao et al. [22]. A database of 13 fruit varieties with images under different illuminations is used, together with an outdoor database of 47 fruit and vegetable varieties. CCLBP uses the HSV colour histogram and the border/interior pixel classification histogram to obtain colour features from the images; the extracted features are applied to a matching-score fusion algorithm and verified using a neural network (NN) model. CCLBP achieves a 5% better precision rate than the conventional method, and the highest recognition rate of 94.26% is achieved by the CCLBP fusion algorithm, which implies that the proposed system enables better vegetable and fruit recognition. Spatially resolved spectroscopy also finds application in determining fruit and vegetable quality features (Si et al. [23]). This spectroscopic method includes tissue-structure examination to find defects in the fruit and involves both chemical and physical destructive analysis; however, it is limited in determination accuracy and speed of detection.
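Several of the studies above, such as the apple grading of Li et al. [19] and the dragon-fruit model of [20], rely on convolutional networks. A compact Keras sketch of such a classifier is given below; the layer sizes, the 64 × 64 input resolution and the three quality classes are illustrative assumptions, not the architectures reported in those papers.

```python
import tensorflow as tf

# Minimal CNN for grading fruit images into three quality classes (illustrative only)
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),   # e.g. fresh / average / defective
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=10, validation_data=(val_images, val_labels))
```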
Using morphology, texture and colour features, the research work by Narendra et al. [24] identifies external defects in fruits and vegetables. Apple, orange and tomato RGB images are obtained to classify the presence or absence of external defects: the RGB images are converted into the L*a*b colour space and a K-means clustering mechanism is applied for classification. Accuracies of 87% for apple, 83% for orange and 93% for tomato are obtained in determining and classifying defects on the samples. Table 1 reviews the methodologies and algorithms used to determine food quality, together with the accuracy levels achieved.

Table 1 Methods to determine the quality of fruits

| Literature | Method of determination/dataset utilized | Fruits/vegetables and features utilized | Algorithms employed | Accuracy achieved |
|---|---|---|---|---|
| Priya et al. [10] | Image processing/real-time RGB images | Apple; mean, colour, HOG | KNN | Acc. 88% |
| Shomaji et al. [7] | 1H NMR relaxometry/real-time sample testing | Red chilli, okra, peas; relaxation-time constants | Chemical analysis | 1 g/L LOD of Sudan dye |
| Gnanasekharan et al. [16] | Colorimetric analysis/local grocery store | Broccoli, spinach, tomato and cucumber | ANOVA | p < 0.001 |
| Bhargava et al. [8] | Optical sensor/online image dataset | Mango; fruit index at responsive wavelength | Multiple linear regression | Acc. 87.9% |
| Lazaro et al. [18] | NFC colour sensor tag/real-time images | Red apple, golden apple, banana; HSV | Naïve Bayes, LDA, DT, KNN and SVM | Acc. 93% |
| Ratprakhon et al. [12] | Charge-coupled device camera/real-time images | Mango; RGB, HSI, NCS | – | Ripeness determination ±5% |
| Hamza et al. [15] | Fruit ripeness classification/real time | Fruits; area of red, green, yellow | ANN, SVM and FIS | Acc. 99.3% |
| Li et al. [19] | Mobile camera/on-field dataset | Apple | CNN | Acc. 99% |
| Tao [22] | Online image dataset | 47 varieties of fruit and vegetables | CCLBP | Acc. 94.26% |
| Althaus et al. [21] | Luster sensor/real-time dataset | Yellow, green, red bell pepper | Freshness indices | Acc. 42–16.2% on day 0 |

*Acc. = accuracy
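To make the colour-space-plus-clustering route of Narendra et al. [24], described just before Table 1, more concrete, a minimal sketch using OpenCV and scikit-learn is given below. The choice of three clusters and the rule of treating the smallest cluster as the defect region are illustrative assumptions, not the authors' exact procedure.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def defect_mask(bgr_img: np.ndarray, k: int = 3) -> np.ndarray:
    """Cluster L*a*b* pixel values and flag the smallest cluster as a candidate defect."""
    lab = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2LAB)
    pixels = lab.reshape(-1, 3).astype(np.float64)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(pixels)
    counts = np.bincount(labels, minlength=k)
    defect_cluster = int(np.argmin(counts))      # assumption: defects cover the least area
    return (labels == defect_cluster).reshape(bgr_img.shape[:2])
```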
3 Summary and Discussion
The quality of fruits and vegetables is determined mainly by changes in their features, either internal or external. Traditional determination relies on online image datasets and on destructive, titration-based and experimental methods. Conventional methods are laborious, slow to deliver results, unsuitable for on-field analysis and limited in the number of samples analysed. With the available online images it is possible to distinguish a single fruit species variety, but the reported classification accuracy is low; with online image datasets or real-time images it is possible to grade and classify a single variety of fruit sample, and imaging can also quantify the presence of hazardous chemicals in the sample. Instrumental analysis struggles to sort more than two varieties. In the near future, a system needs to be designed that considers the three or four factors that most affect rapid quality determination of a fruit sample. Although the authors considered only a few colour spaces, other colour spaces may be used to enhance the accuracy of determination, and images can be obtained from various countries and regions to make the analysis free of regional variation. Machine learning and deep learning algorithms are used to train and test real-time images and achieve better quality determination, but they still struggle to distinguish real-time images. A generalized system could be developed to sort, grade and analyse quality.
4 Conclusion
Fruit and vegetable quality determination can be carried out by both destructive and non-destructive analysis. This review focused on sensor-, imaging- and colorimetric-based evaluation, which provides good quality grading and sorting for a minimum of two varieties of fruit sample. Imaging techniques give scope to sort fruit quality based on colour dullness, shape defects and texture analysis, and machine learning combined with neural network models provides better accuracy in quality recognition. Sensor measurement faces technical issues such as the placing angle and the distance between the sensor and the product, and it requires human assistance to rotate the sample for full-coverage analysis. Some sensor-based real-time measurements fail on local market samples and were tested with only a single variety of fruit or vegetable. Samples are also tested under various temperatures and humidities and over a corresponding number of days, which provides better quality analysis; time consumption is the main drawback. Often only a single spectral wavelength range and a single sample variety are used to determine quality, and deep neural network algorithms employed to classify fruit quality sometimes fail to report the classification accuracy. Future work can focus on developing a real-time sensor for multiple-sample measurement that delivers on-time results, and portable sensors can be designed to determine quality attributes in real time. Simultaneous measurement of multiple fruit or vegetable
samples also needs to be enhanced in the near future. The review revealed that for quality detection of each type of fruit and vegetable a separate mechanism or model currently has to be developed. This leads to the research question of whether it is possible to develop a generalized quality model for all fruits and vegetables rather than one model for every variety. A deep learning model trained on all possible fruits and vegetables may improve the performance of quality detection for all types of vegetables and fruits.
References 1. https://www.google.com/search?q=quality+determination+of+fruits+and+vegetables+using+ the+colour&rlz=1C1CHBD_enIN916IN916&oq=quality+determination+of+fruits+and+veg etables+using+the+colour+&aqs=chrome..69i57.83638j1j7&sourceid=chrome&ie=UTF-8. Accessed March 30 2022 2. Barrett DM, Beaulieu JC, Shewfelt R (2010) Color, flavor, texture, and nutritional quality of fresh-cut fruits and vegetables: desirable levels, instrumental and sensory measurement, and the effects of processing. Crit Rev Food Sci Nutr 50(5):369–389 3. Natarajan S, Ponnusamy V (2021) A review on the organic and non-organic fruits and vegetable detection methods. In: 2021 sixth international conference on wireless communications, signal processing and networking (WiSPNET). pp 1–5: https://doi.org/10.1109/WiSPNET51692. 2021.9419397 4. Mitcham B, Cantwell M, Kader A (1996) Methods for determining quality of fresh commodities. Perishs Handl Newsl 85:1–5 5. Merlo F, Suppini S, Maraschi F, Profumo A, Speltini A (2022) A simple and fast multiclass method for determination of steroid hormones in berry fruits, root and leafy vegetables. Talanta Open 5:100081 6. Huang M, Fan Y, Yuan X, Wei L (2022) Color-coded detection of malathion based on enzyme inhibition with dark-field optical microscopy. Sens Actuators, B Chem 353:131135 7. Shomaji S, Masna NVR, Ariando D, Deb Paul S, Horace-Herron K, Forte D, Mandal S, Bhunia S (2021) Detecting dye-contaminated vegetables using low-field NMR relaxometry. Foods 10(9):2232 8. Bhargava A, Bansal A (2021) Fruits and vegetables quality evaluation using computer vision: a review. J King Saud Univ-Comput Inf Sci 33(3):243–257 9. Wang Y, Cui Y, Chen S, Zhang P, Huang H (2009) Study on fruit quality measurement and evaluation based on color identification. In: 2009 International conference on optical instruments and technology: optoelectronic imaging and process technology, vol 7513. SPIE , pp 114–119 10. Priya PS, Jyoshna N, Amaraneni S, Swamy J (2020) Real time fruits quality detection with the help of artificial intelligence. Mater Today Proc 33:4900–4906 11. Lee DJ, Archibald JK, Xiong G (2010) Rapid color grading for fruit quality evaluation using direct color mapping. IEEE Trans Autom Sci Eng 8(2):292–302 12. Ratprakhon K, Neubauer W, Riehn K, Fritsche J, Rohn S (2020) Developing an automatic color determination procedure for the quality assessment of mangos (Mangifera indica) using a CCD camera and color standards. Foods 9(11):1709 13. Dubey SR, Jalal AS (2015) Application of image processing in fruit and vegetable analysis: a review. J Intell Syst 24(4):405–424 14. Kaur M, Sharma R (2015) Quality detection of fruits by using ANN technique. IOSR J. Electron Commun Eng Ver. II 10(4):2278–2834 15. Hamza R, Chtourou M (2020) Design of fuzzy inference system for apple ripeness estimation using gradient method. IET Image Proc 14(3):561–569
16. Gnanasekharan V, Shewfelt RL, Chinnan MS (1992) Detection of color changes in green vegetables. J Food Sci 57(1):149–154 17. Yahaya OKM, MatJafri MZ, Aziz AA, Omar AF (2014) Non-destructive quality evaluation of fruit by color based on RGB LEDs system. In: 2014 2nd international conference on electronic design (ICED). IEEE, pp 230–233 18. Lazaro A, Boada M, Villarino R, Girbau D (2019) Color measurement and analysis of fruit with a battery-less NFC sensor. Sensors 19(7):1741 19. Li Y, Feng X, Liu Y, Han X (2021) Apple quality identification and classification by image processing based on convolutional neural networks. Sci Rep 11(1):1–15 20. Vijayakumar T, Vinothkanna MR (2020) Mellowness detection of dragon fruit using deep learning strategy. J Innov Image Process (JIIP) 2(01):35–43 21. Althaus B, Blanke M (2021) Development of a freshness index for fruit quality assessment— using bell pepper as a case study. Horticulturae 7(10):405; Young M (1989) The technical writer’s handbook. University Science, Mill Valley, CA 22. Tao H, Zhao L, Xi J, Yu L, Wang T (2014) Fruits and vegetables recognition based on color and texture features. Trans Chin Soc Agric Eng 30(16):305–311 23. Si W, Xiong J, Huang Y, Jiang X, Hu D (2022) Quality assessment of fruits and vegetables based on spatially resolved spectroscopy: a review. Foods 11(9):1198 24. Narendra VG, Pinto AJ (2020) Defects detection in fruits and vegetables using image processing and soft computing techniques. In: International conference on harmony search algorithm. Springer, Singapore, pp 325–337
Chapter 16
A New Image Encryption Scheme Using RSA Cryptosystem and Arnold Algorithm Arabind Kumar and Sanjay Yadav
1 Introduction
The Internet is the medium through which ever-growing volumes of media move from one place to another. There are many ways of communicating information over the Internet, for example e-mail, messages and images; in current communication, images are the most widely used. The Internet has turned the world into a digitally connected hub through the massive transmission of digital data [1]. Still, data security remains a great concern; in particular, image security is a prime focus because of its broad use in areas such as education, security and administration [2]. With the rapid advancement of the Internet, information technology and digital communications, the volume of multimedia data transmission keeps increasing in modern society. An image is an important information carrier in human life, showing or describing target phenomena directly. Recently, leaks of digital image data have had serious consequences, so the protection of images through encryption has become particularly important. Conventional encryption algorithms store and encrypt data as a one-dimensional string. For images, the large amount of data makes the computation of such encryption schemes complicated, and the high correlation between adjacent pixels requires a sufficiently long keystream for a high security level. Accordingly, a new encryption algorithm is needed for the secure communication of images [3] (refer to Fig. 1). The growth, sharing and interconnection of computer networks have driven the rapid development and use of the Internet. At present, the Internet is a vast, widely distributed global information service centre that provides text, image, sound, animation and other service data.
Fig. 1 General block diagram of cryptosystem
However, data on the Internet can easily be read, stolen, altered, copied and spread illegally [4]. In particular, images, as one of the most important forms of electronic communication, play a significant role in multimedia communication, and many images involve personal privacy or state secrets, so the protection of image data becomes especially important. Images, however, are characterized by large amounts of data, high redundancy and strong correlation, which makes conventional encryption algorithms unsuitable for image encryption [5]. Digital images are widely used in application areas such as the military, finance, law and medicine, where the requirements for image security are growing ever higher. This section surveys the current state of research on digital image encryption techniques, focusing on the advantages and shortcomings of chaos-based image encryption algorithms, and finally discusses the use of chaotic image encryption techniques in medical image encryption [6]. Pre-processing is required when encrypting images: the two-dimensional pixel matrix is converted into a one-dimensional stream of plaintext data, the corresponding algorithm is then applied to encrypt it under the control of the key, and the original image is restored during decryption. The advantage of this kind of cryptographic algorithm is that the algorithm is public; the drawback is that it is inefficient, so it is often used in combination with other encryption techniques [7–9].
2 Related Works
The RSA algorithm is one of the most commonly used asymmetric algorithms and is based on public-key encryption. In general, both a private and a public key are used, which makes the decryption process take more time than encryption.
The first attempt to speed up RSA encryption was made by Fiat. Hiral Rathod, Mahendra Singh Sisodia, Sanjay Kumar Sharma, et al. explained that, for image cryptography, conventional cryptosystems may be used to encrypt images [10, 11]. However, this has two problems. The first is that an image is almost always much larger than text, so the cryptosystems need a great deal of time to encrypt it. The second is that the decrypted data should be identical to the original data; owing to the nature of human perception, a decrypted image containing a small amount of distortion is usually acceptable [12, 13]. Several chaos-based encryption algorithms are available in the literature. Singh and Sinha (2009a, b) designed two chaos-based algorithms in the Hartley and gyrator transform domains. Liu et al. (2013) presented an optical chaotic color image encryption scheme using the Hartley transform. El-Latif et al. (2013) described a chaotic secret image-sharing scheme based on random grids and error diffusion. Sun (2017) presented sine-iteration chaotic-map-based encryption using the complementary rule of DNA computing [14]. Q. Zhang et al. (2010) and Jithin and Sankar (2020) introduced chaotic image cryptosystems using DNA sequencing operations. Sheela et al. (2018) presented modified chaotic image encryption using a hybrid shift transform. Wang et al. (2019) proposed a fast chaotic image encryption algorithm based on parallel computing with a logistic map. Lone and Singh (2020) described a chaotic image encryption algorithm using affine Hill ciphers. Wang and Gao (2020a, b) introduced two image encryption schemes using matrix semi-tensor product theory and a Boolean network algorithm. Joshi et al. (2020) gave chaotic multiple color image encryption algorithms using Arnold plus fractional Fourier transforms after conversion into Bayer images [14, 15].
(2020) if quantum shading picture encryption utilizing controlled substitute quantum walk-based pseudo-irregular number generator [21–25].
3 Arnold Algorithm As per the Arnold change conception, the coding of the position space in the first place of the pixel of the picture has rudimentarily moved; in the event that the pixel has moved farther contrasted with the pixels of the first picture, the caliber of obstruction is more eminent. Accordingly, its randomization work makes the pixel picture a debacle. Thus, we make safer our picture utilizing the Arnold Transformation. The Arnold change is chiefly utilized in picture scrambling to get the picture in the procedure of picture encryption. Arnold’s transformation is defined as follows [26, 27]:
$$\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x_n \\ y_n \end{pmatrix} \bmod N$$

In general, the Arnold transformation is defined as

$$\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & b \\ a & ab+1 \end{pmatrix} \begin{pmatrix} x_n \\ y_n \end{pmatrix} \bmod N$$
Input: P (plain image), a, b; Output: C (confused image).
(1) Read the image P and get its size N × N;
(2) Let img = P and let C be a zero image with the same size as P;
(3) For each row x and column y, do:
(4) X = (x + by) mod N + 1;
(5) Y = (ax + (ab + 1)y) mod N + 1;
(6) C(X, Y) = img(x, y);
(7) Return C.
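A direct NumPy implementation of the scrambling steps above might look as follows; it is a minimal sketch that assumes a square single-channel image stored as an N × N array and uses 0-based indexing, so the '+1' of the 1-based pseudo-code is dropped.

```python
import numpy as np

def arnold_scramble(img: np.ndarray, a: int, b: int, iterations: int = 1) -> np.ndarray:
    """Scramble a square image with the generalized Arnold (cat) map.

    The pixel at (x, y) moves to ((x + b*y) mod N, (a*x + (a*b + 1)*y) mod N).
    """
    N = img.shape[0]
    assert img.shape[0] == img.shape[1], "Arnold map needs a square image"
    out = img
    for _ in range(iterations):
        scrambled = np.zeros_like(out)
        x, y = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
        X = (x + b * y) % N
        Y = (a * x + (a * b + 1) * y) % N
        scrambled[X, Y] = out[x, y]
        out = scrambled
    return out
```

Because the transformation matrix has determinant 1, the map is a bijection modulo N, so descrambling is possible either by applying the inverse matrix or by iterating the forward map until its period is reached.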
4 RSA Cryptosystem
In 1977, three researchers at MIT in the USA, Rivest, Shamir and Adleman, introduced a public-key cryptosystem known as the RSA cryptosystem. RSA relies on modular arithmetic and uses very large numbers (e.g., 1024 bits). It is the fundamental public-key cryptosystem, on which one can also build a collection of the central ideas of modern public-key cryptography.
Particular attention will be given to the problem of integer factorization, which plays such a large role in the security of RSA. In this approach, a public-key cryptosystem is used to support a new image encryption scheme; a public-key cryptosystem is also called an asymmetric-key cryptosystem. Several factorization methods will be introduced and examined, including current distributed strategies for factoring very large integers; the security of RSA stems from the cost of factoring large numbers [28]. The RSA algorithm uses the following procedure to generate public and private keys (refer to Fig. 2).
Fig. 2 RSA algorithm for key generation
Explanation of RSA with the help of an example:
• Take p = 7, q = 11, so n = 77 and ϕ(n) = 60.
• Alice chooses e = 17 as her public exponent, making her private exponent d = 53.
• Bob wants to send Alice the secret message HELLO (07 04 11 11 14):
  07^17 mod 77 ≡ 28, 04^17 mod 77 ≡ 16, 11^17 mod 77 ≡ 44, 11^17 mod 77 ≡ 44, 14^17 mod 77 ≡ 42.
• Bob sends 28 16 44 44 42.
• Alice receives 28 16 44 44 42 and uses her private key d = 53 to decrypt the message:
  28^53 mod 77 ≡ 07, 16^53 mod 77 ≡ 04, 44^53 mod 77 ≡ 11, 44^53 mod 77 ≡ 11, 42^53 mod 77 ≡ 14.
• Alice translates the numbers back to letters and reads HELLO.
• No one else could read the message, as only Alice knows her private key, which is needed for decryption.
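The arithmetic of this example can be checked with a few lines of Python; the snippet below merely verifies the toy numbers above (real RSA keys use primes that are hundreds of digits long).

```python
# Verify the toy RSA example: p = 7, q = 11, n = 77, phi = 60, e = 17, d = 53
p, q, e = 7, 11, 17
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)                              # modular inverse of e mod phi -> 53 (Python 3.8+)
message = [7, 4, 11, 11, 14]                     # H E L L O
cipher = [pow(m, e, n) for m in message]         # [28, 16, 44, 44, 42]
plain = [pow(c, d, n) for c in cipher]           # [7, 4, 11, 11, 14]
print(d, cipher, plain)
```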
5 Proposed Algorithm
The asymmetric RSA encryption algorithm makes encryption more secure, and the receiver does not hesitate to give every sender a different key to protect the communication. A further advantage of the RSA algorithm is that it is hard to break, because it involves the factorization of products of primes that are difficult to factor. If, through some modification or attempted theft, an attacker obtains a decryption key that is practically equivalent to the original key, he may be able to recover about 70–80% of the image [29], and there is then a possibility of understanding the real (decrypted) image. To address this problem, we additionally use the Arnold transform here. According to the Arnold transform concept, the position coordinates of the pixels of the original image are rearranged; the farther a pixel moves compared with its position in the original image, the greater the degree of disorder. Although this rearrangement does not change the grey levels of the pixels of the original image, it changes the visual appearance: compared with the original, the scrambled image looks like chaos, which indicates that the randomization is effective. This scrambling is the reason why an attacker who could otherwise reconstruct 70–80% of the real image cannot do so before applying Arnold's inverse transform, so the image becomes very difficult to understand and to hack. Another advantage of the Arnold transform is that it uses the modulo operation, so an attacker would also have to know the number of iterations of the transform; predicting a wrong number makes the image even more muddled and confusing. This study integrates the quantum logistic map, the RSA algorithm, the Arnold transform and a diffusion operation to realize a new asymmetric image encryption scheme.
Fig. 3 Flowchart of the proposed scheme (encryption path: plain image with p, q → group pixels → RSA encryption → recover pixels → encrypted image; decryption path: encrypted image → group pixels → RSA decryption and Arnold transform → recover pixels → decrypted image)
The structure of the proposed image encryption scheme is shown in Fig. 3. The image encryption procedure is as follows:
Step 1: Input the prime numbers p and q and a plain image of size N × N.
Step 2: Convert the input image into its pixel values.
Step 3: From the prime numbers p and q, generate the encryption and decryption keys.
Step 4: Encrypt each pixel of the image with the RSA algorithm.
Step 5: Rearrange the encrypted pixels obtained from Step 4 into a binary image of size M × M.
Step 6: Divide the encrypted image into different subparts.
Step 7: Apply the Arnold transformation to each subpart.
Step 8: Apply the RSA decryption algorithm again.
Step 9: Combine all subparts and arrange them in the form of an image.
Step 10: The output will be the original image.
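A highly simplified sketch of the encryption side of these steps is given below, reusing the arnold_scramble function sketched in Sect. 3. It is meant only to illustrate the flow of the scheme and is not the authors' exact implementation: it skips the block-wise subdivision of Steps 5–6, applies the Arnold map to the whole cipher array, and uses small illustrative primes (per-pixel textbook RSA on 8-bit values is insecure and is shown purely for demonstration).

```python
import numpy as np

def rsa_keys(p: int, q: int, e: int = 17):
    """Step 3: derive the public/private key pair from the primes p and q."""
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)                  # modular inverse, Python 3.8+
    return (e, n), (d, n)

def encrypt_image(img: np.ndarray, public_key, a: int = 1, b: int = 1, rounds: int = 3):
    """Steps 4-7: RSA-encrypt each pixel value, then Arnold-scramble the cipher array.

    n must exceed the maximum pixel value (255 for an 8-bit image),
    and the image must be square for the Arnold step.
    """
    e, n = public_key
    cipher = np.array([pow(int(px), e, n) for px in img.flatten()],
                      dtype=object).reshape(img.shape)
    return arnold_scramble(cipher, a, b, iterations=rounds)

# Hypothetical usage with small illustrative primes (real keys would be far larger):
# public, private = rsa_keys(61, 53)          # n = 3233 > 255
# plain = np.random.randint(0, 256, size=(64, 64))
# encrypted = encrypt_image(plain, public)
# Decryption reverses the steps: undo the Arnold scrambling (inverse map), then
# compute pow(c, d, n) for every pixel and reshape back into the image (Steps 8-10).
```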
6 Result Analysis
Our observations of the results confirm that the strength of this system lies in the RSA algorithm, one of the most powerful encryption algorithms currently available. Its strength comes from a key length of up to 2048 bits, which increases the robustness of the algorithm and reduces the chances of key breaking by intruders. In the first scenario, we generated the keys and stored them as files on the computer so that they can be reused without re-entering data and repeating the calculations. This step also reduces the system's extensive key-generation computation and shortens the time spent in encryption and decryption, one of the main weaknesses of the RSA algorithm (Fig. 4).
Fig. 4 Results of the test images: a original image; b encrypted image using keys 1; c encrypted image using keys 2; d difference image of (b) and (c); e decrypted image using wrong keys 2; f decrypted image using correct key [30]
The histogram analysis of the proposed scheme is given in Fig. 5. The closer this correlation coefficient is to zero, the weaker the relationship between the original signal and its shifted version (refer to Fig. 6). Finally, an image algorithm for a fragile planar design is devised, which improves on the conventional algorithm and speeds up its computation; it proposes setting the initial value of the chaotic system from the coefficients of the wavelet transform of the image, using the chaotic system's sensitivity to the initial value as the sensitivity of the fragile planar design to detect tampering. Simulation shows that the two proposed kinds of image planar design have visual imperceptibility and behave better under common noise interference and signal processing. Cropping attack analysis: several attacks exist against image encryption schemes and for damaging protected data; one of the best known is the cropping attack, and its results are shown in Fig. 7.
Fig. 5 Histogram test: a plain image of a boat; b histogram of the plain image boat; c cipher image of a boat; d histogram of the cipher image boat; e plain image of male; f histogram of the plain image male; g cipher image of male; h histogram of the cipher image male; i plain image of landscape; j histogram of the plain image landscape; k cipher image of landscape; l histogram of the cipher image landscape [30]
7 Future Direction
Future work on this paper can include the application of other asymmetric cryptosystems and, in the case of the division of images into blocks, any symmetric cryptosystem. Furthermore, the scheme can be applied to audio and video encryption and decryption.
Fig. 6 Correlation of two adjacent pixels in a plain image of 512 × 512 Boat in a horizontal, b vertical, and c diagonal directions. Correlation of two adjacent pixels in the cipher image: d horizontal, e vertical, and f diagonal directions [30, 31]
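The adjacent-pixel correlation reported in Fig. 6 is a standard metric and can be reproduced with a short NumPy routine such as the following sketch, which assumes a 2-D grey-scale array.

```python
import numpy as np

def adjacent_correlation(img: np.ndarray, direction: str = "horizontal") -> float:
    """Correlation coefficient between each pixel and its neighbour.

    Values near 1 indicate strong correlation (typical of plain images);
    values near 0 indicate that encryption has removed the redundancy.
    """
    img = img.astype(np.float64)
    if direction == "horizontal":
        x, y = img[:, :-1].ravel(), img[:, 1:].ravel()
    elif direction == "vertical":
        x, y = img[:-1, :].ravel(), img[1:, :].ravel()
    else:  # diagonal
        x, y = img[:-1, :-1].ravel(), img[1:, 1:].ravel()
    return float(np.corrcoef(x, y)[0, 1])
```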
Fig. 7 Cropping attack analysis: a–b are the images cropped with 25%. c–d are the images cropped with 50%. e–f are the corresponding decrypted images respectively [29]
References
1. Hu G, Li B (2020) Coupling chaotic system based on unit transform and its applications in image encryption. Signal Process 178:107790. https://doi.org/10.1016/j.sigpro.2020.107790
2. Malik DS, Shah T (2020) Color multiple image encryption scheme based on 3D-chaotic maps. Math Comput Simulat 178:646–666. https://doi.org/10.1016/j.matcom.2020.07.007
3. Zhou M, Wang C (2020) A novel image encryption scheme based on conservative hyperchaotic system and closed-loop diffusion between blocks. Signal Process 171:107484. https://doi.org/10.1016/j.sigpro.2020.107484
4. Slimane NB, Bouallegue K, Machhout M (2017) Designing a multi-scroll chaotic system by operating logistic map with fractal process. Nonlinear Dynam 88:1655–1675. https://doi.org/10.1007/s11071-017-3337-0
5. Guo Y, Qi G, Hamam Y (2016) A multi-wing spherical chaotic system using fractal process. Nonlinear Dyn 85:2765–2775. https://doi.org/10.1007/s1107-016-2861-7
6. Zhang Y (2020) The fast image encryption algorithm based on lifting scheme and chaos. Inf Sci 520:177–194. https://doi.org/10.1016/j.ins.2020.02.012
7. Akhshani A, Akhavan A, Lim S-C, Hassan Z (2012) An image encryption scheme based on quantum logistic map. Commun Nonlinear Sci Numer Simulat 17:4653–4661. https://doi.org/10.1016/j.cnsns.2012.05.033
8. Seyedzadeh SM, Norouzi B, Mosavi MR, Mirzakuchaki S (2015) A novel color image encryption algorithm based on spatial permutation and quantum chaotic map. Nonlinear Dyn 81:511–529. https://doi.org/10.1007/s11071-015-2008-2
9. Zhang J, Huo D (2019) Image encryption algorithm based on quantum chaotic map and DNA coding. Multimed Tools Appl 78:15605–15621. https://doi.org/10.1007/s11042-018-6973-6
10. Huang L, Cai S, Xiong X, Xiao M (2019) On symmetric color image encryption system with permutation-diffusion simultaneous operation. Opt Laser Eng 115:7–20. https://doi.org/10.1016/j.optlaseng.2018.11.015
11. Ye G, Huang X (2017) An efficient symmetric image encryption algorithm based on an intertwining logistic map. Neurocomputing 251:45–53. https://doi.org/10.1016/j.neucom.2017.04.016
12. Zhang Y, Wang X (2014) A symmetric image encryption algorithm based on mixed linear-nonlinear coupled map lattice. Inf Sci 273:329–351. https://doi.org/10.1016/j.ins.2014.02.156
13. Wu C, Hu K, Wang Y, Wang J, Wang Q (2019) Scalable asymmetric image encryption based on phase-truncation in cylindrical diffraction domain. Opt Commun 448:26–32. https://doi.org/10.1016/j.optcom.2019.05.009
14. Kumari E, Mukherjee S, Singh P, Kumar R (2020) Asymmetric color image encryption and compression based on discrete cosine transform in Fresnel domain. Results Opt 1:100005. https://doi.org/10.1016/j.rio.2020.100005
15. Klima RE, Sigmon N (2012) Cryptology: classical and modern with maplets. CRC Press, Boca Raton, FL
16. Rivest RL, Shamir A, Adleman L (1978) A method for obtaining digital signatures and public-key cryptosystems. Commun ACM 21:120–126
17. Rivest R, Shamir A, Adleman L (1978) A method for obtaining digital signatures and public-key cryptosystems. Commun ACM 21(2):120–126
18. Fiat A (1989) Batch RSA. In: Proceedings of the LNCS conference on crypto '89, Berlin, pp 175–185
19. Yunfei L, Qing L, Tong L, Wenming X (2010) Two efficient methods to speed up the batch RSA decryption. In: Third international workshop on advanced computational intelligence
20. Boneh D, Shacham H (2002) Fast variants of RSA. RSA Laboratories Cryptobytes Rep 5(1):1–8
21. Castelluccia C, Mykletun E, Tsudik G (2006) Improving secure server performance by rebalancing SSL/TLS handshakes. In: Proceedings of the ACM conference on ACM symposium on information, computer and communications security, New York, pp 26–34
22. Grobler TL, Penzhorn WT (2006) Fast decryption methods for the RSA cryptosystem. In: IEEE proceedings
23. Lal NA (2017) A review of encryption algorithms-RSA and Diffie-Hellman. Int J Sci Technol Res 6(7)
24. Osho O, Zubair YO, Ojeniyi JA (2014) A simple encryption and decryption system. In: Conference: international conference on science, technology, education, arts, management and social sciences
25. Nisha S, Ftika M (2017) RSA public key cryptography algorithm: a review. Int J Sci Technol Res 6:187–191
26. Kakkar A, Singh ML (2012) Comparison of various encryption algorithms and techniques for secured data communication in multinode network. Int J Eng Technol (IJET) 1(2) 27. Seth SM, Mishra R (2011) Comparative analysis of encryption algorithms for data communication. Int J Comput Sci Technol 2(2) 28. Soni GK, Arora H, Jain B (2019) A novel image encryption technique using Arnold transform and asymmetric RSA algorithm. In: International conference on artificial intelligence: advances and applications, pp 83–90 29. Mir UH, Singh D, Lone PN (2021) Color image encryption using RSA cryptosystem with a chaotic map in Hartley domain. Inf Secur J A Glob Perspect 30. Jiao K, Ye G, Dong Y, Huang X, He J (2020) Image encryption scheme based on a generalized Arnold map and RSA algorithm. Hindawi Secur Commun Netw 2020 31. Ye G, Jiao K, Huang X (2021) Quantum logistic image encryption algorithm based on SHA-3 and RSA. Springer Nature B.V.
Chapter 17
An Integrated ATPRIS Framework for Smart Sustainable and Green City
Nidhi Tiwari, Pratik Jain, and Mukesh Kumar Yadav
1 Introduction
This model brings together user-friendly features which, according to their respective definitions and eligibility, are organized around three key capabilities: the ability to learn, persistence, and flexibility. These key capabilities can be enhanced by sub-features that may be associated with more than one of them, such as the general efficiency of persistence and adaptability. In detail, the ability to learn can be enhanced by strategies and actions aimed at improving: communication skills, which allow people to connect and exchange information; monitoring, which allows the conditions of the urban plan to be observed at all times; information, which allows a more detailed description of events and processes; memory, which allows learning from past events in order to anticipate possible future situations; collaboration and cooperation between different stakeholders; and participation, which allows people to take part in decision-making processes. The sustainable and renewable smart city concept is an effort to use different modules with efficient and reliable solutions. The ecosystem is plagued by economic instability, rapid urbanization, climate change, and population growth; problems with health, vehicles, pollution, lack of resources, waste management, and poor infrastructure are emerging, and urban development is declining as a result. This has introduced technology as a solution for dealing with all these problems wisely. Smart cities ensure a sustainable environment with the support of Big Data and the Internet of Things.
N. Tiwari (B) · M. K. Yadav Department of Electronics and Communication Engineering, SAGE University, Indore, India e-mail: [email protected] P. Jain Institute of Management, Studies, SAGE University, Indore, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_17
A city becomes smarter when investment in daily services, a modern transportation system, and up-to-date ICT infrastructure promotes sustainable economic growth and quality of life through prudent, participatory management of natural resources [1–3]. Such a city is built on ICT tools and on development instruments that improve society and quality of life [4]. Information and Communication Technology (ICT) is used to reduce greenhouse gas emissions and energy consumption in mainstream infrastructure [5, 6].
2 ATPRIS Framework
The smart city framework is composed of various parameters (refer to Fig. 1). Each parameter should be consistent with renewable and sustainable energy. The focus here is on four important parameters:
1. Smart Agriculture System
2. Intelligent Transportation System
3. Technology: Information and Communication Technology (ICT)
4. Infrastructure and Health care
2.1 Smart Agriculture System
In our country, about 70% of the people depend on agriculture, which is why it is known as an agricultural country. With growing urbanization, smart technology can be used in the agricultural field to improve the quality of different types of farmland. Important tools such as robots, drones, artificial intelligence, sensors, and IoT devices are used to make the agricultural field smart and intelligent. In urban areas, agricultural land has been destroyed due to the increase in urbanization [7].
Fig. 1 ATPRIS framework: Smart Agriculture, Food and Water Resources, Technology, Intelligent Transportation System, Sustainability, Infrastructure
One can farm in small areas by using smart agriculture tools. Hydroponic and greenhouse farming are solutions for farming in small urban areas. Making each person capable of growing what he or she needs should be an important goal on the way to a smart city. After the COVID situation, we all understand the importance of the body's internal immunity, and this can be supported by consuming pure vegetables and fruits. This type of farming can be controlled using different types of sensors such as soil, water, light, moisture, and temperature sensors. Using GPS and satellite systems, one can monitor crops from anywhere on a mobile device. Farms using smart agriculture solutions can save up to 85% of the water consumed and up to 50% of the energy consumed. They also report up to a 40% increase in crop yield, reduced cost of fertilization and chemical treatment, and up to 60% fewer losses resulting from human error [8]. The greenhouse is a smart farming concept. Farming depends completely on environmental conditions, so in a smart city one should consider a smart solution for farming. Figure 2 describes the greenhouse setup. It consists of different types of sensors, a solar cell, a controller board, and the greenhouse structure. A solar tracking system is used to track solar energy throughout the day. The proposed circuit diagram is shown in Fig. 3. Four main sensors are connected: a soil sensor, a light sensor, a temperature sensor, and a humidity sensor. These sensors are connected to the controller board, and all decisions are taken by the controller. A GPRS module and a Wi-Fi module are connected to synchronize in real time with the app. Figure 4 explains the whole process of the smart agriculture system using the greenhouse setup, which depends on crop selection. After setting up the greenhouse, all sensors are connected to the system and then activated. Data from the sensors are sent to the controller board for decision-making. Through cloud storage, the data is sent to the app. An alarm system is activated if the parameters cross their limits.
Fig. 2 Block diagram for greenhouse setup
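To make the decision flow above concrete, the following is a minimal sketch of the controller loop. The threshold values and the sensor, cloud, and alarm functions are hypothetical placeholders; none of these names or numbers come from the chapter.

```python
# Minimal sketch of the greenhouse decision loop described above.
# Thresholds, sensor readings and the alarm hook are hypothetical placeholders.

SOIL_MOISTURE_MIN = 30.0   # percent, assumed
TEMPERATURE_MAX = 35.0     # degrees Celsius, assumed
HUMIDITY_MAX = 80.0        # percent, assumed

def read_sensors():
    """Stub for the soil, light, temperature and humidity sensors."""
    return {"soil": 25.0, "light": 540.0, "temperature": 31.0, "humidity": 70.0}

def send_to_cloud(readings):
    """Stub for the GPRS/Wi-Fi sync with the mobile app via cloud storage."""
    print("synced:", readings)

def raise_alarm(parameter, value):
    """Stub for the alarm that fires when a parameter crosses its limit."""
    print(f"ALARM: {parameter} out of range ({value})")

def control_step():
    readings = read_sensors()
    send_to_cloud(readings)
    if readings["soil"] < SOIL_MOISTURE_MIN:
        raise_alarm("soil moisture", readings["soil"])   # e.g. trigger irrigation
    if readings["temperature"] > TEMPERATURE_MAX:
        raise_alarm("temperature", readings["temperature"])
    if readings["humidity"] > HUMIDITY_MAX:
        raise_alarm("humidity", readings["humidity"])

if __name__ == "__main__":
    control_step()
```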
Fig. 3 Block diagram of circuit diagram
Fig. 4 Flow chart of functioning
2.2 Intelligent Transportation System
Figure 5 shows the important modules of an intelligent transportation system. These three modules, the traffic management system, the smart parking system, and pollution-under-control norms, play an important role in making a city smarter. A smart parking system is a good way to use every corner of the parking space, but surveillance is a must all around the parking area. Hence, this system consists of a smart parking system with a surveillance robot. The complete system comprises the smart parking circuitry, sensors at the parking lots, and a surveillance robot.
Fig. 5 Modules of the intelligent transportation system: traffic management, parking system, pollution under control
An automatic door is connected at the entry and exit level and is driven by servomotors for automatic detection of an object and opening of the door [9]. Figure 6 shows the proposed system for smart parking. Parking of vehicles is a major problem in every city, and this cost-effective solution provides a system to make parking easier. In this system, the parking slots are fitted with sensors to detect slot occupancy. A display unit is connected at the entrance of the system and shows the status of the parking slots. A surveillance robot is included in this system for safety purposes. (ii) The second important factor is emission norms. A sustainability and renewable energy solution is to reuse waste gases and fuel from vehicles; a good example is converting waste PM2.5 from vehicles into electrical energy using suitable sensors [10, 11]. Electric vehicles are the future of any smart and green city. Their pollution rate is very low, as an electric vehicle does not contain a petrol or diesel engine. Fossil fuel vehicles increase pollution because of their internal combustion engines, which is why the automotive industry is shifting from fossil fuels to electric solutions. There are various ways to use waste energy [12, 13]. Figure 7 shows the circuit diagram to control the traffic system. A CCTV camera is connected to a Raspberry Pi through a network switch, and an Ethernet cable is used for the connection between the network switch and the Raspberry Pi. The Raspberry Pi acts as a minicomputer to store and communicate data. The Raspberry Pi is connected to an Arduino microcontroller with a USB cable.
Fig. 6 Proposed system for smart parking system
Fig. 7 Circuit diagram for smart traffic system module
A second USB cable connects a speed sensor to measure vehicle speed. The Arduino takes the data from the Raspberry Pi and makes decisions according to the traffic density. The traffic light LEDs are connected to the Arduino through relays for on–off switching.
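As an illustration of the density-based control implied by this circuit, here is a small sketch in Python; the vehicle-count source, the serial command format, and the timing thresholds are assumptions rather than details given in the chapter.

```python
# Sketch of the density-based signal timing implied by the circuit in Fig. 7.
# The vehicle-count source and the link to the Arduino are assumed; the
# thresholds and durations are illustrative only.

def green_time_for(vehicle_count: int) -> int:
    """Map the vehicle count seen by the CCTV camera to a green-light duration (s)."""
    if vehicle_count < 5:
        return 10
    if vehicle_count < 15:
        return 20
    return 35

def send_to_arduino(duration_s: int) -> None:
    """Stub for the USB/serial command that drives the traffic-light relays."""
    print(f"GREEN {duration_s}")

def control_cycle(vehicle_count: int) -> None:
    send_to_arduino(green_time_for(vehicle_count))

if __name__ == "__main__":
    for count in (3, 12, 40):          # example traffic densities
        control_cycle(count)
```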
2.3 Technology: Information and Communication Technology (ICT)
During the COVID era, everything shifted toward digital technology, and Information and Communication Technology plays an important role in the future. ICT tools in the fields of safe and convenient transportation, energy saving, and infrastructure play a vital role in developing a smart city and maintaining sustainability. ICT is beneficial for the following areas:
• Social and environmental challenges: tracking climate change and trying to keep the ecosystem balanced and the city sustainable.
• Universal access to health and education.
• Universal coverage of health problems.
• Universal coverage of education.
• Electronic payment systems.
• Community health workers.
• Education: digital library design, educational tools, virtual laboratories, changes in the education system, online platforms.
2.4 Infrastructure and Health Care
A smart city depends on the following important natural parameters:
• Wind
• Light
• Heat
These parameters are available naturally. To develop any infrastructure, one should take care of all natural parameters; they must not be adversely affected in any case. Infrastructure can be composed of various natural materials and equipped with devices that do not harm the environment. Most devices in the biomedical field are electronically operated. Artificial intelligence has changed the whole scenario in every field. Devices based on machine learning are more accurate, and in the medical field accuracy is most important because it is a matter of life. Decision-making is also easier using the latest technology, and image processing technology is becoming more precise and accurate. IBM proposed a clinical decision support device, IBM Watson [7]. The medical system should be contactless, especially for infected patients; for this, robots can play an important role in distributing medicine, assisting diagnosis, and maintaining patient databases.
3 Conclusion
The ATPRIS framework defines various modules to make a city smarter. In the smart city concept, the healthcare system should be more efficient and accurate, which is possible using the latest technologies and platforms such as IoT devices and machine learning. Smart hospitals, with automated processes, are the future of the smart city framework. Another main parameter is traffic, which affects every citizen's life; it should be hurdle-free, secure, and less time-consuming. Our model of the smart city is composed of smart traffic and parking solutions. Today's latest IoT-based technology will be helpful in building smart cities. This paper emphasizes the use of renewable and sustainable energy in every field; decomposition of waste garbage is one of the best sources of renewable energy.
References 1. Caragliu A, Del Bo C, Nijkamp P (2009) Smart cities in Europe. In: Proceedings of the 3rd Central European conference in regional science, Košice, Slovak Republic, Oct 7–9. http:// www.cers.tuke.sk/cers2009/PDF/01_03_Nijkamp.pdf 2. Batty M, Axhausen KW, Giannotti F, Pozdnoukhov A, Bazzani A, Wachowicz M, Ouzounis G, Portugali Y (2012) Smart city of the future. Eur Phys J Spec Top 214:481–518 3. Papa R, Galderisi A, Vigo Majello MC, Saretta E (2015) Smart and resilient cities. A systemic approach for developing cross-sectoral strategies in the face of climate change. Tema. J Land Use Mobil Environ 8(1):19–49. https://doi.org/10.6092/1970-9870/2883 4. Gaura A, Scotneya B, Parra G, McCleana S (2015) Smart city architecture and its applications based on IoT. In: The 5th international symposium on internet of ubiquitous and pervasive things (IUPT 2015), vol 52, pp 1089–1094 5. Angelidou M (2015) Smart cities: a conjuncture of four forces. Cities 47:95–106; Sta HB (2017) Quality and the efficiency of data in “Smart-Cities”. Future Gener Comput Syst 74:409–416 6. Kummitha RKR, Crutzen N (2017) How do we understand smart cities? An evolutionary perspective. Cities 67:43–52 7. Kyriazisa D, Varvarigoua (2013) Autonomous and reliable internet of things: smart autonomous and reliable internet of things. In: International workshop on communications and sensor networks (ComSense), pp 442–448 8. Sánchez López T et al (2012) Adding sense to the internet of things an architecture framework for smart object systems. Pers Ubiquitous Comput 16:291–308 9. Tiwari N, Jain R, Prajapati DK, Upadhyay A, Yadav M (2021) Car accident prevention using alcohol sensor. In: Proceedings of second international conference on smart energy and communication. Algorithms for intelligent systems. Springer, Singapore. https://doi.org/10.1007/978981-15-6707-0_61 10. Sankath V, Prajapati DK, Tiwari N (2021) Evolution and advancement in defence radar technology. Wesleyan J Res Int J Res 14(0.1) (XXVI). ISSN: 0975-1386 11. Prajapati DK, Sankath V, Tiwari N (2021) IOT based traffic system. Wesleyan J Res Int J Res 14(0.1) (XIV). ISSN: 0975-1386 12. R High (2019) The era of cognitive systems: an inside look at IBM Watson and how it works. http://www.redbooks.ibm.com/redpapers/pdfs/redp4955.pdf. Accessed 20 March 2019 13. Nayak RR, Sahana SK, Bagalkot AS et al (2013) Smart traffic congestion control using wireless communication. Int J Adv Res Comput Commun Eng 2(9)
Chapter 18
Detection of DDoS Attacks in Cloud Systems Using Different Classifiers of Machine Learning Swati Jaiswal , Pallavi Yevale , Anuja R. Jadhav , Renu Kachhoria , and Chetan Khadse
1 Introduction Nowadays, Cloud computing is the most popular model which can be used for communication, sharing data, storing data, health care, e-commerce, online banking, as a server, software, networking, computing resources, and many other purposes via the Internet. Cloud computing model is making human life simpler for social, official, educational, and entertainment communication. The Internet is the only tool to avail these services, which turns a huge number of devices with internet connectivity. So, the increased connectivity results in a heightened risk of security. Security aspects that challenge Cloud Computing are identity, authentication, authorization, confidentiality, integrity, isolation, and availability explained by Bonguet and Bellaiche [1]. The biggest and most common threat is the Distributed Denial of Service (DDoS) attack. DoS and DDoS attacks have been targeting wired networks, machine to machine, big data, wireless sensor networks, IoT, and cloud computing environments in the last three decades as explained by Kyambadde and Ngubiri [2], Dibaei et al. [3], Medeira et al. [4], Lohachab and Karambir [5], and Subramanian and Jeyaraj [6]. In 1980, the network research community came to know about DoS attacks, and in 1999 the first DDoS attack was reported by the Computer Incident Advisory Capability (CIAC) by Zargar et al. [7]. Till today, the largest DDoS attacks reported are the S. Jaiswal (B) · A. R. Jadhav · R. Kachhoria Pimpri Chinchwad College of Engineering, Pune, India e-mail: [email protected] P. Yevale Dr. D. Y. Patil Institute of Engineering, Management & Research, Pune, India C. Khadse MIT World Peace University, Pune, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_18
February 2020 attack reported by AWS, the February 2018 GitHub DDoS attack, the 2016 Dyn attack, the 2015 GitHub attack, the 2013 Spamhaus attack, the 2000 Mafiaboy attack, and the 2007 Estonia attack. DDoS is a notable attack that intentionally occupies resources and bandwidth in order to interrupt and block users and deny them services. A Denial of Service (DoS) attack is an attack in which the attacker makes a machine or network resource unavailable to its actual users by indefinitely or temporarily interrupting the services of a host connected to the Internet. A DDoS attack is a subtype of a DoS attack and is a serious problem for Internet services. A DDoS attack uses multiple connected online devices, collectively known as a botnet, to overwhelm the target resources of a router or a server with fake traffic. The exhaustion of resources prevents authorized users from using them, and the bandwidth of the network is consumed. DDoS attacks are large-scale attacks that ensure that authorized users cannot access devices. A difficult task is to distinguish authorized users from DDoS attacks, as DDoS traffic pretends to be authorized traffic, as explained in Toklu and Şimşek [8].
1.1 Types of DDoS Attacks
The different types of DDoS attacks are flood attacks, protocol attacks, and application attacks (Fig. 1), as discussed by Lohachab and Karambir [5] and Jaafar et al. [9].
1.1.1 Flood Attacks
In a flood attack, attackers send a very high volume of traffic to a legitimate system so that it cannot examine and admit the permitted incoming network traffic. This can be achieved through network time protocol (NTP) amplification, domain name system (DNS) amplification, user datagram protocol (UDP) flooding, transmission control protocol (TCP) flooding, and ICMP (ping) flooding. The detection approach in [9] introduced two different algorithms: a training algorithm, which is used to construct normal patterns of network traffic, and a testing algorithm, which determines the type of traffic received from the network.
1.1.2 Protocol Attacks
In a protocol attack, the attacker engages the processing capacity of network resources. This can be achieved by fragmented packet attacks, Ping of Death, SYN floods, Smurf attacks, CGI request attacks, and Slowloris.
Fig. 1 Types of DDoS attacks
1.1.3 Application Attacks
In an application attack, the attacker gains access to unauthorized areas mostly through the application layer. This can be achieved by DNS services and HTTP flooding. The rest of the paper is organized as follows: In Sect. 2, the related works are presented. In Sect. 3 a proposed hybrid algorithm is explained.
2 Related Work Cloud computing’s new security challenges are increasing day by day. In the last few years, many researchers are doing research on cloud computing security challenges. Valuable work is done on different DDoS attacks, prevention, and mitigation techniques in the cloud computing environment. Some of the recently published security surveys are as follows. Jaafar et al. [9] presented a review of the HTTP DDoS attack detection methods at the application layer. It shows work is based on various parameters like timestamps, source IP, destination IP, TCP packet header and covariance matrix, HTTP count, threshold, and many more. They have taken into consideration High-Rate DDoS attack and Low-Rate DDoS attack detection levels. It is mentioned that precious work is done with different datasets for analysis of attacks and performance matrix is also discussed. Medeira et al. [4] discussed various Application Layer DDoS attack detections by using big data technologies. It is stated that HTTP, SIP, SOAP, and DNS protocols are used to perform application layer DDoS attacks. Big Data Technologies such as Big Data Analytics and Big Data Processing are used to process large and complex data, with different features, sources, and applications. These technologies are particularly useful in solving cyber security issues. Suggested application layer DDoS attacks can be detected by using big data technologies, and the time required to process and analyze the data can be reduced. Toklu and Sim¸ ¸ sek [8] proposed new approaches that filter out the mixed high-rate DDoS from the low-rate DDoS attacks which make effortless implementation. The given system detects high- and low-rate DDoS attacks simultaneously. The distinction between the arrival times from the packets to the victim router is analyzed by DFT. Testing is done with DAF and DDFT methods with mixed High-rate and Low-rate attack scenarios. Both the methods successfully detect 100% high-rate and low-rate DDoS attacks and reported an issue regarding the proposed system that the accuracy rate of the detection method decreases if high-rate and low-rate attacks are close to each other. Subramanian and Jeyaraj [6] presented a detailed comparative study of the strengths and weaknesses of previous work. 1. Communication level challenges: listed network-level security issues, application-level security issues, and host-level threats. 2. Virtualization challenges: a. Virtual machine-level security challenge: It is recommended that high-rate packets should not be passed to a virtual machine for maintaining the robustness of the system, use of unrealistic assumptions which increase the complexity should be avoided, and identified assumptions/parameters should not be neglected. b. Hypervisor-level security challenge: The use of multi-factor authentication to improve the security of the hypervisor level is recommended.
c. Hardware-level security challenge: It is recommended hardware-level issues can be reduced by using a strong authentication mechanism at the virtual layer. Strong authentication mechanisms at the virtual layer can be used to reduce hardware-level issues. 3. Data-level challenges: They are listed issues about data-in-transit and data-inrest. 4. Service-level agreements: It is mentioned to avoid over/underestimation in the required resource; the pay-as-use model must have a proper service-level agreement. Lohachab and Karambir [5] explained the categories, classification, and working mechanism of DDoS attacks and their impact on various layers of IoT. The article provides a comparative analysis of DDoS attacks on the perception layer, transport layer, and application layer of IoT framework and its impact, tools used for conducting DDoS attacks, existing botnet prevention and detection techniques, and prevention and detection techniques with different DDoS variants. Sreeramand Vuppala [10] developed the Bio-Inspired Anomaly-based HTTP-Flood Attack Detection model for fast and early detection. Machine learning matrices are used for HTTP-Flood Attack Detection. The dataset is prepared as well as testing and training are done on the dataset. The proposed technique is tested against the CAIDA dataset. The proposed model performance is evaluated with the parameters Precision, Recall, Specificity, Accuracy, and F-measure. The accuracy of the proposed model is improved when it is compared to ARTP and FCAAIS and also achieved maximum prediction accuracy. Bonguet and Bellaiche [1] presented a survey which describes cloud availability targeted by DoS and DDoS attacks. Specific XML-DoS and HTTP-DoS attacks are described. The authors mentioned defenses applied to DoS and DDoS attacks in Cloud Computing. Defenses against XML-DoS and HTTP-DoS attacks are described in detail and evaluated defense systems in DDoS Attacks.
3 A Hybrid Approach Cloud computing is an on-demand availability of resources like data storage and computing power without the direct access of any management or user. While using cloud storage for storing ample data from various sources, there is a threat to cloud computing in the form of attacks. There are a number of attacks such as DoS, black hole attacks, gray hole, and DDoS attacks. A DDoS attack is a malicious attempt to disrupt network services or resources by flooding the target with a large volume of traffic from multiple sources. This is accomplished by leveraging multiple computers, referred to as bots, to simultaneously send an overwhelming number of requests, connection requests, and packets to the target, resulting in the denial of service to legitimate system users. So to avoid such issues, the author proposes a hybrid algorithm to detect and classify the DOS attack using different machine learning
algorithms. The algorithm achieves greater efficiency and accuracy than other approaches used to detect DDoS attacks. The approach works in the following way:
1. Analyze the accepted data by cleaning it and avoiding any kind of redundancy that would affect the results.
2. Remove all uncertain and empty values from the data.
3. Use different types of machine learning classifiers to achieve better accuracy in the detection of DDoS attacks.
4. Build a hybrid model that makes use of a support vector machine, random forest, and KNN. This model aims to provide better accuracy than the individual models.
The author also examines other existing algorithms, such as captcha puzzles and meta-heuristic methods, which are designed for detecting DDoS attacks, but both methods are not adequate for this purpose. Another existing algorithm, based on adaptive and selective verification, was developed to detect DDoS at the network layer but failed against HTTP post-flooding attacks. To overcome such issues, the author proposes a hybrid algorithm based on machine learning classifiers, which can efficiently detect and counter DDoS attacks on cloud systems. The approach works as follows (Fig. 2):
Step 1: State the dataset of items containing different values.
Step 2: Use a total of 23 parameters and features, such as source and destination address, packet ID, information related to from_node and to_node, type of packet, packet size, and packet rate.
Step 3: Compute the correlation on the basis of the source and destination addresses.
Step 4: Use the SVM, KNN, and Naive Bayes classifiers to detect DDoS attacks.
Step 5: Calculate the model accuracy for each classifier.
Step 6: Consider the worst-case settings for the classifiers, such as the sigmoid kernel for SVC, defaults for Naive Bayes, and k = 5 for KNN.
Working of the hybrid algorithm:
1. The dataset is divided into training data and testing data in the ratio 80:20.
2. The training data is further split into a validation set and a training set.
3. The training set is given as input to the base classifiers, i.e. SVM, KNN, and Naive Bayes.
4. Each classifier produces two sets, a test prediction set and a validation prediction set, so three test prediction sets and three validation prediction sets are generated in total. The three test prediction sets are combined into one set, and the three validation prediction sets are combined into one validation set.
5. After the base classifiers have been applied, the combined test and validation prediction sets are provided as input to the random forest classifier.
Fig. 2 Working architecture of the hybrid approach (the dataset is split into training and test data; the training set and validation set feed the KNN, SVM, and Naïve Bayes classifiers, whose test and validation predictions are blended and passed to a Random Forest to form the hybrid model)
6. The newly obtained validation set is provided as input to the Random Forest algorithm to create the final hybrid model.
7. The resulting model is then used to make predictions over the test dataset to obtain the final output.
8. On the basis of this, the accuracy of these classifiers has been measured and is given as
Table 1 Analysis of different classifiers with hybrid approach
S. No.   SVM     KNN     Naïve Bayes   Hybrid approach
Slot 1   70.16   77.83   76.5          80.16
Slot 2   68.67   74.83   75.34         78.67
Slot 3   69.67   74.5    74.34         79.12
Fig. 3 Comparison of SVM, KNN, and Naïve Bayes with Hybrid approach
SVM—70.16%, KNN—77.833%, and Gaussian Naive Bayes—76.5%.
The blending technique divides the training data into a training set and a validation set, for example train_x, val_x, train_y, val_y = train_test_split(x_train, y_train, stratify=y_train, test_size=0.2, random_state=0). The size of the dataset was 3000 records; of that, (1920, 23) was used for training and (480, 23) was used to test the trained model. After blending and the use of the base classifiers, the output is given as input to the RF technique, which produces 80.16% accuracy, as shown in Table 1. The advantage of this approach is that it provides better prediction accuracy than the individual models, as depicted in Fig. 3, and it classifies and predicts DDoS attacks with greater accuracy. The disadvantage is that a real-time system analysis is still required to counter denial of service, and a more accurate model or algorithm is needed to approach 100% accuracy.
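For readers who want to reproduce the blending scheme, the following is a hedged scikit-learn sketch; the feature matrix, label encoding, and random data are placeholders, while the 80:20 split, the 20% validation carve-out, the three base classifiers, and the Random Forest meta-classifier follow the description above.

```python
# Hedged sketch of the blending scheme described above, using scikit-learn.
# Dataset contents are placeholders; only the split ratios and the choice of
# base/meta classifiers follow the text.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 23))          # 23 traffic features, as in the text
y = rng.integers(0, 2, size=3000)        # 0 = benign, 1 = DDoS (assumed labels)

# 80:20 train/test split, then a validation split carved out of the training data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
train_x, val_x, train_y, val_y = train_test_split(
    X_train, y_train, stratify=y_train, test_size=0.2, random_state=0)

bases = [SVC(kernel="sigmoid"),
         KNeighborsClassifier(n_neighbors=5),
         GaussianNB()]

val_preds, test_preds = [], []
for clf in bases:
    clf.fit(train_x, train_y)
    val_preds.append(clf.predict(val_x))       # validation prediction set
    test_preds.append(clf.predict(X_test))     # test prediction set

# Blend: the stacked base predictions become features for the Random Forest
meta = RandomForestClassifier(random_state=0)
meta.fit(np.column_stack(val_preds), val_y)
final = meta.predict(np.column_stack(test_preds))
print("hybrid accuracy:", accuracy_score(y_test, final))
```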
4 A New Approach Using Classifiers—Rainbow Technique The aim of the proposed approach is to detect DDoS attack with the help of combining different classifiers together to achieve more accuracy and reduce false rates while working in cloud systems as explained in Fig. 4. The approach worked as follows: Step 1: Datasets containing different values from different sources.
Fig. 4 Working model—Rainbow algorithm (collection of the dataset from different sources, pre-processing of the data, training/testing split, training and validation sets, K-fold validation, different classifiers, RF technique, Rainbow algorithm)
Step 2: Dataset contains multiple parameters, and out of that source IP and destination IP address are used as the main parameters. Step 3: Calculation of the correlation among different parameters is done on the basis of source and destination IP addresses. Step 4: Classifiers such as ANN, Decision Tree (DT), SVM, and Random Forest are used to detect DDoS attacks. Step 5: Calculating model average accuracy based on classifiers and their working. Step 6: Parameters like error, precision, sensitivity, false rate, recall, specificity, etc. are also calculated on the basis of data available and classifiers.
Process of the hybrid algorithm:
1. The dataset is collected and combined from various sources.
2. Pre-processing is carried out to clean the data by removing repeated, null, and missing values.
3. The dataset is then divided into training data and testing data in the ratio 75:25.
4. The training data is divided into a validation set and a training set.
5. The training sets are provided as input to different supervised machine learning classifiers, i.e. Decision Tree and ANN.
6. After the ML classifiers have been applied, the newly obtained validation set is provided as input to SVM and random forest.
7. The Rainbow algorithm is a new approach derived by combining the Random Forest and SVM classifiers.
8. The obtained model is then used to make predictions and produce the final output.
9. On the basis of this, the accuracy of these classifiers has been measured.
Table 2 provides the average accuracy, sensitivity, specificity, recall, error, false positive rate, and precision obtained using a K-fold cross-validation process on multiple machine learning classifiers. Figures 5 and 6 show the behavior of the individual classifiers (ANN, DT, KNN, and SVM) on the different parameters. The analysis of the proposed algorithm alongside the other algorithms is presented in Fig. 7 on the basis of average accuracy, error, precision, recall, and the remaining metrics.
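The K-fold evaluation behind Table 2 can be sketched as follows; the classifier set follows the text, but the data are random placeholders and the RSVM combination is approximated here as a soft-voting ensemble of Random Forest and SVM, which is only one plausible reading of "combining Random Forest and SVM classifiers".

```python
# Hedged sketch of the K-fold comparison behind Table 2.
# Data are placeholders; the RSVM ensemble is an assumed approximation.
import numpy as np
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 23))
y = rng.integers(0, 2, size=1000)

classifiers = {
    "SVM": SVC(probability=True),
    "KNN": KNeighborsClassifier(),
    "DT": DecisionTreeClassifier(random_state=0),
    "ANN": MLPClassifier(max_iter=500, random_state=0),
    "RSVM": VotingClassifier(
        estimators=[("rf", RandomForestClassifier(random_state=0)),
                    ("svm", SVC(probability=True))],
        voting="soft"),
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```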
5 Conclusion
In this research work, an analysis of the detection of DDoS attacks in a cloud computing environment is presented. The proposed algorithm is able to detect DDoS attacks with greater accuracy and precision than the existing algorithms. The model can handle complex and noisy data received from multiple sources; hence, it is a more robust and consistent approach against DDoS attacks. It can also be used to analyze complex calculations over a large amount of data. The obtained models are used to make predictions over the test dataset to get the final output. The results show an increase in average accuracy, specificity, sensitivity, and precision, and it has been observed that the error and false positive rate are reduced in this work. To achieve higher accuracy, the proposed algorithm can be combined with deep learning approaches, which would also require a large number of training samples from a real-time environment.
Table 2 Result of different classifiers with new hybrid approach
Sr. No.   Validation                Classifier name   Avg. accuracy   Sensitivity   Specificity   Recall   Error   Precision   False positive rate
1         K-fold cross validation   RSVM              91              92.8          91.9          93.1     0.32    94          93.1
2         K-fold cross validation   ANN               89              90            88            94.29    0.42    91.83       94.29
3         K-fold cross validation   DT                87              91            89.9          93.65    0.39    90.67       93.65
4         K-fold cross validation   KNN               84              91            91.3          92       0.46    91.34       92
5         K-fold cross validation   SVM               90              92            91            94.5     0.32    93.28       94.5
Fig. 5 Measures of accuracy, sensitivity, and specificity using K-fold cross-validation process
Fig. 6 Classification of classifiers on FPR, precision, recall, and error parameters
Fig. 7 Analysis of SVM and RSVM on the basis of sensitivity, specificity, recall, error, FPR, and average accuracy
References 1. Bonguet A, Bellaiche M (2017) A survey of Denial-of-Service and distributed Denial of Service attacks and defenses in cloud computing. Futur Internet 9(3). https://doi.org/10.3390/fi9030043 2. Kyambadde G, Ngubiri J (2018) A tool to mitigate denial of service attacks on wired networks 3. Dibaei M et al (2020) Attacks and defences on intelligent connected vehicles: a survey. Digit Commun Netw 6(4):399–421. https://doi.org/10.1016/j.dcan.2020.04.007 4. Medeira P, Grover J, Khorajiya M (2017) Detecting application layer DDoS using big data technologies. J Emerg Technol Innov Res 4(6):25–31 5. Lohachab A, Karambir B (2018) Critical analysis of DDoS—an emerging security threat over IoT networks. J Commun Inf Netw 3(3):57–78. https://doi.org/10.1007/s41650-018-0022-5 6. Subramanian N, Jeyaraj A (2018) Recent security challenges in cloud computing. Comput Electr Eng 71(June):28–42. https://doi.org/10.1016/j.compeleceng.2018.06.006 7. Zargar ST, Joshi J, Tipper D (2013) A survey of defense mechanisms against distributed denial of service (DDoS) flooding attacks. IEEE Commun Surv Tutor 15(4):2046–2069. https://doi. org/10.1109/SURV.2013.031413.00127 8. Toklu S, Sim¸ ¸ sek M (2018) Two-layer approach for mixed high-rate and low-rate distributed denial of service (DDoS) attack detection and filtering. Arab J Sci Eng 43(12):7923–7931. https://doi.org/10.1007/s13369-018-3236-9 9. Jaafar GA, Abdullah SM, Ismail S (2019) Review of recent detection methods for HTTP DDoS attack. J Comput Netw Commun 2019. https://doi.org/10.1155/2019/1283472 10. Sreeram I, Vuppala VPK (2019) HTTP flood attack detection in application layer using machine learning metrics and bio inspired bat algorithm. Appl Comput Inf 15(1):59–66. https://doi.org/ 10.1016/j.aci.2017.10.003
Chapter 19
EEG Signal Classification for Left and Right Arm Movements using Machine Learning Swati Shilaskar , Niranjan Tapasvi, Shripad Govekar, Shripad Bhatlawande , and Rajesh Jalnekar
1 Introduction Disorders related to the nervous system [1] may cause loss of control over different muscles of the body. Worldwide, 2–5 people per 100,000 people suffer from severe amyotrophic lateral sclerosis or adamant disorder. These patients are not able to make particular body movements. Many people suffer from loss of upper or lower limbs due to some accidents. Brain-computer interface (BCI) is helpful in obtaining EEG signals related to arm movement and in detecting any kind of brain disorder. These signals were classified into [2] various frequency bands according to their frequencies ranging from 0.1 to 100 Hz. The intention of the proposed work is to find solutions to such problems using advanced technology. The objective of the paper has always been to improve patients’ quality of life. EEG signals along with a prosthetic arm provide control over the limb movement based on the person’s imagery or thoughts. The idea of limb activation has been proven to alter brain electrical activity and generate signals. The activation of hand region neurons is accompanied by the preparation for an actual movement/imagery. The EEG signal [3] corresponding to left or right arm movement can be used to stimulate the appropriate movement of the prosthetic arm or control the direction of a wheelchair.
S. Shilaskar (B) · N. Tapasvi · S. Govekar · S. Bhatlawande · R. Jalnekar Department of Electronics and Telecommunication Engineering, Vishwakarma Institute of Technology, Pune 411037, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_19
2 Literature Survey EEG signals are obtained [4] during the execution of distinct motions of the same limb. These signals were evaluated from a publicly available dataset of 15 patients using a 61-channel montage. The frequency radius of generated signals was from [5] 0.1–100 Hz. In order to eliminate the artifacts, the filtration process of frequency range was carried out through a band pass filter. Electrodes were placed [6, 7] on the frontal, temporal, perinatal and occipital lobes of the brain. Activities were categorized under 4 classes such as right hand up, left hand up, both hands down and both hands up performed under visual cues. An assistive pneumatic [8] planar robot was predicted from signal connection for 16 EEG channels. Activities were classified into 6 classes. The sampling frequency used was 256 Hz [10]. The data was collected [9, 11] with 56 channel montage at a 128 Hz sampling frequency. The collected data had a frequency range from 0 to 32 Hz. Extraction of the feature was done using Wavelet Transform (WT) and Fast Fourier Transformation (FFT) and dimensionality reduction using Principal Component Analysis (PCA). EEG signals of hand movements [9] were captured at different velocities. Filtering operations like constant Q filtering, low pass filtering and Laplacian filtering were performed. Fisher Linear Discriminant (FLD) was used for classification. FlickerNoise Spectroscopy (FNS) was used [12] for extracting features from EEG signals corresponding to hand movement imagination. The frequency range of the collected signal was 0–28 Hz. Band Pass Filter (BPF) and Laplacian filter were used for filtering operation. Wavelet transform [13] and power spectral density estimates were used for feature extraction. Feature reduction was done using Independent Component Analysis (ICA), Kernel Ridge Regression (KRR) and PCA. Quadratic discriminant analysis (QDA) [14, 15], Linear Support Vector Machine (LSVM) and KNN were used as classifiers. The EEG signals were decoded [16, 17] for imagined hand movements in stroke patients. It was carried out by analyzing Motor Imagery (MI) constructed cerebral cortical tasks out of EEG recordings. Phase Locking Value (PLV) was used for the feature extraction. An algorithm based on trajectory planning [18] of the robotic arm was proposed. The collection of EEG signals was carried out with a 64-channel montage. The collected signals were filtered in the beta frequency band with fifth-order Butterworth BPF. A support Vector Machine classifier (SVM) was used in order to the classification of EEG signals of both arm movements. A sampling frequency of 1200 Hz [19] was used. The data was analyzed in time and wavelet domains. Extraction of wavelet domain features was carried out using db2 wavelet transform. Support Vector Machine was employed as a classifier.
3 Methodology A machine learning-based system to classify both (left and right) arm movements was used. The testing folder contained data of 2 subjects and the training folder contained data of 6 subjects. The block diagram for the classification of arm movements is indicated in Fig. 1.
3.1 Experimental Setup The EEG signal of four arm movements was recorded. Eight healthy subjects participated in EEG data collection. These were right hand up, right hand down, left hand up and left hand down. The collection of data for each activity was conducted for 60 s. Subjects had closed their eyes during the data collection session to avoid eye-related artifacts. The experimental setup of data collection is shown in Fig. 2.
3.2 Data Collection
The EEG signals were captured with a 10–20 international standard 32-electrode system. The data collection was carried out at a sampling frequency of 512 Hz. The montage consists of the C4, C3, Cz, P3, P4, F3 and F4 electrodes, which were mounted on specific regions of the brain. Electrodes F3 and F4 were mounted on the frontal region, P3 and P4 were mounted on the parietal region, and C3, C4 and Cz
Fig. 1 Block diagram for classification of arm movement
Fig. 2 Data collection setup
Fig. 3 Montage
Fig. 4 Brain map for both hands up and both hands down
were mounted on the central region of the motor cortex. The montage of discussed electrodes is shown in Fig. 3. The brain map highlights the active regions of the motor cortex for both hands up and both hands down movements. The brain map diagrams are shown in Fig. 4.
3.3 Data Preprocessing The Butterworth band pass filter (BPF) of fifth order and a notch filter act as filters to obtain EEG signals. The notch filter was used for separating powerline noises. The range of cutoff frequencies of the notch filter was 49 and 51 Hz respectively and the range of cutoff frequencies for the band pass filter was 14 Hz and 30 Hz, respectively. This frequency range comes under the Beta band. This band contains
Table 1 Frequency bands and behavior traits
Waves   Frequency bands (Hz)   Behavior trait
Delta   0.5–3                  Deep sleep
Theta   3–7                    Deep meditation
Alpha   7–15                   Eyes closed, awake
Beta    14–30                  Awake, engaged in activities
Gamma   30 and above           Unifying conscious
signals of limb movements of the body. Other frequency bands and their associated activities are shown in Table 1. The response of the band pass filter is shown in Figs. 5 and 6. The low-frequency signals from the delta, theta and alpha bands lying in 0.5–15 Hz and the gamma band consisting of frequencies above 30 Hz were removed from the original signal; the beta band (14–30 Hz) was retained. The power spectral density of the right hand up signal is calculated, and this waveform is shown in Fig. 7. It is observed from Fig. 7 that signals in the 20–60 Hz range carry the majority of the power. The power spectral density for the filtered signals is given in Eqs. 1 and 2:

$$S_p(f) = \frac{T_s}{N}\left(\sum_{n=1}^{N} x[n]\, e^{-j2\pi f n T}\right)^{2} \tag{1}$$

$$P_m = \sum_{k=f_1}^{f_2} S(k) \tag{2}$$

where S(k) is the power spectral density and x[n] is a random process of the signal.

Fig. 5 Original and filtered signal of right hand up movement
Fig. 6 Frequency distribution of filtered signal
Fig. 7 Power spectral density of right hand signal
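The filtering stage of Sect. 3.3 can be sketched with SciPy as follows; the fifth-order Butterworth band-pass (14–30 Hz), the notch around 50 Hz, and the 512 Hz sampling rate come from the text, while the remaining design choices (zero-phase filtering, notch quality factor) are assumptions.

```python
# Hedged sketch of the pre-processing in Sect. 3.3: a 5th-order Butterworth
# band-pass (14-30 Hz) plus a notch around the 50 Hz powerline component.
import numpy as np
from scipy.signal import butter, iirnotch, filtfilt

FS = 512.0                      # sampling frequency (Hz)

def preprocess(eeg: np.ndarray) -> np.ndarray:
    # 5th-order Butterworth band-pass for the beta band (14-30 Hz)
    b_bp, a_bp = butter(5, [14.0, 30.0], btype="bandpass", fs=FS)
    x = filtfilt(b_bp, a_bp, eeg)
    # Notch filter centred on 50 Hz (quality factor assumed)
    b_n, a_n = iirnotch(w0=50.0, Q=30.0, fs=FS)
    return filtfilt(b_n, a_n, x)

if __name__ == "__main__":
    t = np.arange(0, 2.0, 1.0 / FS)
    demo = np.sin(2 * np.pi * 20 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
    print(preprocess(demo).shape)
```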
3.4 Feature Extraction
A small segment of the waveform was analyzed with different data lengths. The windowing was carried out with 3 different slice lengths of 250, 500 and 1000, and the results were plotted in Fig. 8. It is observed that, compared to the 250 and 1000 slice lengths, the 500 slice length retains the variation in the patterns. Features were calculated and concatenated into one single feature vector by performing statistical operations on them. The energy and entropy for each channel are given in Eqs. 3 and 4, respectively:

$$E = \sum_{i=1}^{N} X_i^{2} \tag{3}$$

$$H = -\sum_{i=1}^{N} X_i \log\!\left(X_i^{2}\right) \tag{4}$$
Through the use of mathematical variables such as the minimum value, maximum value, mean and standard deviation, features can be extracted from the dataset. The minimum is the smallest value among the available data points and is given in Eq. 5:

$$\mathrm{Min} = \min(y(t)) \tag{5}$$

The maximum is the highest value among the available data points and is given in Eq. 6:

$$\mathrm{Max} = \max(y(t)) \tag{6}$$

The mean is the average value over all instances and is calculated as in Eq. 7:

$$\mathrm{Mean}(y) = \frac{\sum y(t)}{N} \tag{7}$$

The standard deviation provides a measure of how the data is dispersed with respect to the mean. A low standard deviation means the data points tend to cluster around the mean, and a high standard deviation means the data points are more spread out. The mathematical expression for the standard deviation is given in Eq. 8:

$$\text{Standard Deviation}(\sigma) = \sqrt{\frac{\sum\left[y(t) - \bar{y}\right]^{2}}{N}} \tag{8}$$

where N is the number of instances. Boxplots were plotted to visualize the variations in the features. Significant variations were seen in the max value and the standard deviation. The boxplots for the standard deviation and the max value are shown in Figs. 9 and 10.

Fig. 8 Effect of different slice lengths on raw EEG signal
Fig. 9 Standard deviation of EEG signal
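A compact sketch of the windowing and statistical feature extraction described in this section is given below; the 500-sample slice length follows the text, while the helper names and the stacking of features into a matrix are assumptions.

```python
# Hedged sketch of the per-window statistical features from Sect. 3.4.
import numpy as np

def window_features(x: np.ndarray) -> np.ndarray:
    """Energy, entropy, min, max, mean and standard deviation of one window."""
    energy = np.sum(x ** 2)
    entropy = -np.sum(x * np.log(x ** 2 + 1e-12))   # as in Eq. 4, eps for stability
    return np.array([energy, entropy, x.min(), x.max(), x.mean(), x.std()])

def extract(signal: np.ndarray, slice_len: int = 500) -> np.ndarray:
    windows = [signal[i:i + slice_len]
               for i in range(0, len(signal) - slice_len + 1, slice_len)]
    return np.vstack([window_features(w) for w in windows])

if __name__ == "__main__":
    sig = np.random.default_rng(1).normal(size=5120)
    print(extract(sig).shape)      # (number of windows, 6 features)
```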
3.5 Classification
An array of machine learning algorithms was used for classification. The classifiers used were random forest, decision tree and KNN. A random forest contains an ensemble of decision trees; it is constructed on numerous subsets of the dataset to improve the predictive accuracy of the model. It uses the Gini index to compute the probability of specific classes, as in Eq. 9, where P+ and P− are the class probabilities:

$$GI = 1 - \left[(P+)^{2} + (P-)^{2}\right] \tag{9}$$
Fig. 10 Amplitude of max
The decision tree is based on a tree structure: the root node represents the feature rules, the branches represent decisions, and the leaf nodes indicate the outcomes. The KNN algorithm classifies data based on the classes of its nearest neighbors. The Euclidean method is used to determine the distance between two data points, as shown in Eq. 10:

$$d = \sqrt{(x_2 - x_1)^{2} + (y_2 - y_1)^{2}} \tag{10}$$
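The classification stage can be illustrated with scikit-learn as follows; the feature matrix and labels are random placeholders standing in for the statistical feature vectors of Sect. 3.4, and the 80/20 split is an assumption.

```python
# Hedged sketch: comparing the three classifiers used in this chapter.
# X stands in for the per-window statistical features, y for the movement labels.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))            # placeholder feature vectors
y = rng.integers(0, 4, size=400)         # four movement classes (assumed encoding)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, model.predict(X_te)))
```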
4 Results and Discussion
The accuracy, precision, recall, F1 score and AUC of all the algorithms are shown in Table 2. These are the parameters used to evaluate the performance of the system. Apart from the classifiers mentioned above, SVM and XGBoost classifiers were also considered, but their accuracies were below 60%, so they are not reported in this paper.

Table 2 Classification performance (%)
CF     Acc     Prec    Rec     F1      AUC
R.F    98.09   98.12   98.09   98.01   67
D.T    98.66   98.67   98.66   98.66   96
KNN    76.32   76      68      65      94
Note CF—Classifiers, R.F.—Random Forest, D.T.—Decision Tree, Acc—accuracy, Prec—Precision, Rec—Recall, F1—F1 Score, AUC—AUC Score
5 Conclusion This paper presented an EEG-based system for mapping left and right arm movement signals. The signals were collected from the frontal, central and perinatal regions of the brain. The arm movements were decoded from a low-frequency band between 14 and 30 Hz. This beta band consists of low-frequency components. It resembles the alertness and engagement of a person in a particular activity. The statistical features such as mean, max and standard deviation were extracted to generate a feature vector. The classifiers used were Random Forest, Decision Tree and K-Nearest Neighbor. Among these classifiers, random forest provided the highest accuracy of 98.09%. The results obtained from this study can be used in arm prostheses and related bio-medical applications. Acknowledgements We express our sincere gratitude to the Doctors, supporting staff and Participants in this study. The authors sincerely thank the All India Council for Technical Education (AICTE), Government of India, New Delhi, and Vishwakarma Institute of Technology, Pune, for providing financial support (File No. 8-53/FDC/RPS (POLICY-I)/2019–20) to carry out this research work.
References 1. Aljalal M, Ibrahim S, Djemal R, Ko W (2020) Comprehensive review on brain-controlled mobile robots and robotic arms based on electroencephalography signals. In: Intelligent service robotics, pp 179–183 2. Mammone N, Ieracitano C, Morabito FC (2020) A deep CNN approach to decode motor preparation of upper limbs from time–frequency maps of EEG signals at source level. In: Neural networks, pp 357–372 3. Lv J, Li Y, Gu Z (2010) Decoding hand movement velocity from electroencephalogram signals during a drawing task. In: Biomedical engineering online, pp 1–21 4. Morash V, Bai O, Furlani S, Lin P, Hallett M (2008) Classifying EEG signals preceding right hand, left hand, tongue, and right foot movements and motor imageries. In: Clinical neurophysiology, pp 2570–2578 5. Kuo C-C, Lin WS, Dressel CA, Chiu AWL (2011) Classification of intended motor movement using surface EEG ensemble empirical mode decomposition. In: Annual international conference of the IEEE engineering in medicine and biology society, pp 6281–6284 6. Kim J-H, Bießmann F, Lee S-W (2014) Decoding three-dimensional trajectory of executed and imagined arm movements from electroencephalogram signals. In: IEEE transactions on neural systems and rehabilitation engineering, pp 867–876 7. Woo J-S, Müller K-R, Lee S-W (2015) Classifying directions in continuous arm movement from EEG signals. In: The 3rd international winter conference on brain-computer interface, pp 1–2 8. Ubeda A, Azorín JM, García N, Sabater JM, Pérez C (2012) Brain-machine interface based on EEG mapping to control an assistive robotic arm. In: 2012 4th IEEE RAS & EMBS international conference on biomedical robotics and biomechatronics, pp 1311–1315 9. Ramoser H, Muller-Gerking J, Pfurtscheller G (2000) Optimal spatial filtering of single trial EEG during imagined hand movement. IEEE Trans Rehabil Eng 441–446
10. Shedeed HA, Issa MF, El-sayed SM (2013) Brain EEG signal processing for controlling a robotic arm. In: International conference on computer engineering and systems (ICCES) Cairo, Egypt, pp 1000–1005 11. Van den Noort JC, Steenbrink F, Roeles S, Harlaar J (2015) Real-time visual feedback for gait retraining: toward application in knee osteoarthritis. Med Biol Eng Comput 275–286 12. Broniec A (2016) Analysis of EEG signal by flicker-noise spectroscopy: identification of right/left-hand movement imagination. Med Biol Eng Comput 1935–1947 13. Scherer R, Moitzi G, Daly I, Müller-Putz GR (2013) On the use of games for noninvasive EEG-based functional brain mapping. In: IEEE transactions on computational intelligence and AI in games, pp 155–163 14. Khasnobish A, Bhattacharyya S, Konar A, Tibarewala DN, Nagar AK (2011) A two-fold classification for composite decision about localized arm movement from EEG by SVM and QDA techniques. In: International joint conference on neural networks, pp 1344–1351 15. Haji Babazadeh M, Azimirad V (2014) Brain-robot interface: distinguishing left and right hand EEG signals through SVM. In: Second RSI/ISM international conference on robotics and mechatronics, pp 777–783 16. Huong NTM, Linh HQ, Khai LQ (2014) Classification of left/right hand movement EEG signals using event related potentials and advanced features. In: International conference on the development of biomedical engineering in Vietnam, pp 209–215 17. Benzy VK, Vinod AP, Subasree R, Alladi S, Raghavendra K (2020) Motor imagery hand movement direction decoding using brain computer interface to aid stroke recovery and rehabilitation. In: IEEE transactions on neural systems and rehabilitation engineering, pp 3051–3062 18. Roy R, Mahadevappa M, Kumar CS (2016) Trajectory path planning of EEG controlled robotic arm using GA. Proc Comput Sci 147–151 19. Ghaemi A, Rashedi E, Pourrahimi AM, Kamandar M, Rahdari F (2016) Automatic channel selection in EEG signals for classification of left or right hand movement in Brain Computer Interfaces using improved binary gravitational search algorithm. Biomed Signal Process Control 487–49
Chapter 20
A Holistic Study on Aspect-Based Sentiment Analysis Himanshi and Jyoti Vashishtha
1 Introduction Online and social media, review and feedback, and health materials may all benefit from sentiment analysis. Everything from marketing to customer service to health can be done with it, as well. The use of sentiment analysis for Twitter research is the subject of the study. Researchers combined Naive Bayes, KNN, and LSTM principles to create an integrated method. Using Naive Bayes’s system, it is possible to sort items into several categories. There are times when KNN can aid. Learning-based neural networks may be used to better understand and predict the results of data collecting. Data from customer reviews were used to do sentimental analysis. As a result of these challenges, researchers have developed a new way that they hope will help them better understand the feelings of individuals. It has been incorporated to the LSTM learning model to create a sentiment analysis tool that can be used to analyze tweets. A more precise and dependable response may be obtained, according to the findings of the simulation. Product evaluations that use emotion analysis for text mining are becoming more widespread. Many computer linguistics experiment are also examined the primary goal of this project is to examine the connections between tweets. Client goods have been taken into consideration in the evaluation process. Some more machine-learning algorithms have been investigated besides Naive Bayes analysis and SVM. In addition to the Recurring Neural Network (RNN), deep neural networks were considered by researchers. For the sentiment analysis, we need a better solution. Himanshi (B) · J. Vashishtha Department of Computer Science and Engineering, Guru Jambheshwar University of Science and Technology, Hisar, India e-mail: [email protected] J. Vashishtha e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_20
To ascertain the underlying tone of a document’s words, natural language processing employs a method known as sentiment analysis, often known as opinion mining. Organizations often use this technique to isolate and categorize the aspects of a certain product, service, or idea. Sentiment analysis plays a significant role in commercial decision-making. One sort of sentiment analysis is called “aspect-based sentiment analysis,” and it helps businesses thrive by revealing which features of their goods need to be improved in light of customer input in order to become bestsellers. Several studies have looked at sentiment analysis from a machine learning or deep learning perspective. However, the traditional method of sentiment analysis has a number of drawbacks that make it impractical to use in practice. The traditional method has problems with speed and precision. That’s why it’s important to enhance the precision and speed of aspect-based sentiment analysis.
1.1 Sentiment Analysis
Sentiment analysis is an NLP technique for classifying data as positive, negative, or neutral. Working on textual data, it can be used by businesses to monitor customer feedback and learn more about their customers' wants and needs. In other words, sentiment analysis looks at how people feel about a topic, whether it concerns a person or a thing in general, and expressions are classed as positive, negative, or neutral. Take the phrase "I really appreciate the new design of your website!" as an example: sentiment analysis is a way of determining whether such a piece of text is strongly positive or largely unfavourable. A mix of NLP and ML techniques is used in text analytics to assign sentiment ratings to themes, categories, or entities inside a phrase. NLP, text analysis, computational linguistics, and biometrics are all utilized in sentiment analysis to systematically detect, extract, quantify, and analyze emotional states and subjective information.
1.2 Aspect-Based Sentiment Analysis
In the field of text analysis, aspect-based sentiment analysis (ABSA) is used to determine the emotional tone of collected data with respect to its individual aspects. ABSA is a method for analyzing customer feedback by identifying patterns in how customers feel about different aspects of a service or product. Instead of determining the overall tone of a piece of writing, aspect-level sentiment categorization analyzes how the writer feels about a specific aspect mentioned in a phrase, sentence, or paragraph. For example, in a remark like "great lunch but poor service," both positive and negative opinions are expressed about different aspects, and the ABSA technique separates them. Aspect extraction is the process of identifying key concepts, such as product traits or features, that are relevant to opinion mining and sentiment analysis.
Fig. 1 Process flow of aspect-based sentiment analysis: Input Sentence → Pre-processing → Word Vector → Aspect Detection → Aspect → Aspect Sentiment Classification → Sentiment
Aspect identification and sentiment categorization are the two main models in our system, and we tested two different approaches for each. For the aspect detection model, we examined a GRU-based aspect classifier with word embeddings and a bag-of-words model with additional word-weight outputs. The first approach is a common deep learning method, whereas the second is comparable to Wang's technique. Each approach starts with data that has been normalized, tokenized, and stripped of punctuation symbols as part of the pre-processing module. The original word vector and the results of the aspect detection model are fed into the aspect sentiment classification model, which returns a sentiment class for each detected aspect based on the original word vector. The process flow is shown in Fig. 1.
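A highly simplified sketch of this two-stage pipeline is given below, using the bag-of-words variant for aspect detection; the tiny inline dataset, the single-aspect labels, and the logistic regression classifiers are illustrative assumptions, not the GRU-based models discussed above.

```python
# Hedged sketch of the two-stage pipeline in Fig. 1 (bag-of-words variant).
# The data, aspect labels and classifiers are invented purely for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = ["the pasta was delicious",
           "the waiter was rude",
           "quick and friendly service"]
aspect_labels = ["food", "service", "service"]        # simplified single aspect
sentiment_labels = ["positive", "negative", "positive"]

# Stage 1: aspect detection from bag-of-words features
aspect_model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
aspect_model.fit(reviews, aspect_labels)

# Stage 2: aspect sentiment classification on the same word vectors
sentiment_model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
sentiment_model.fit(reviews, sentiment_labels)

sentence = "the service was slow"
print(aspect_model.predict([sentence]), sentiment_model.predict([sentence]))
```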
1.3 Machine Learning
The observations above show that predictive operations may be carried out in an industrial setting. Machine learning techniques may be used to learn complex tasks that are beyond the reach of hand-written rules, for example by analyzing data from the many sensors attached to machines and identifying abnormal actions. Machine learning is the most extensively utilized artificial intelligence approach; the term "Artificial Intelligence" (AI) refers to a kind of intelligence created by computers. In this article, the idea of adopting a clustered concept is also discussed. The use of reinforcement learning settings may save both time and space. Automated learning is a major application of AI.
Because of this, it is possible for the system to grow and expand; it grows through experimentation rather than by following a predetermined path.
Naive-Bayes: In statistics, Bayes' theorem is applied to a family of simple "probabilistic classifiers" with strong independence assumptions between features. In combination with kernel density estimation, they are among the simplest Bayesian network models. Naive Bayes uses many characteristics to predict the probability of different classes, and this method is a standard baseline in text categorization and other multi-class problems.
SVM: As a supervised machine learning approach, SVM may be used for both classification and regression, although it is most commonly applied to classification. The SVM technique seeks to locate a hyperplane in N-dimensional space that can be used to unambiguously categorize data points. Support-vector machines are supervised learning models used for data evaluation in classification and regression analysis.
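The following is a hedged sketch of how Naive Bayes and SVM classifiers are typically applied to sentiment-labelled text with scikit-learn; the toy reviews and labels are invented for illustration only:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Toy corpus: each review is labelled positive (1) or negative (0).
reviews = ["great lunch and friendly staff",
           "poor service and cold food",
           "excellent aspect coverage, loved it",
           "terrible experience, would not return"]
labels = [1, 0, 1, 0]

# Multinomial Naive Bayes over TF-IDF features.
nb_model = make_pipeline(TfidfVectorizer(), MultinomialNB())
nb_model.fit(reviews, labels)

# Linear SVM: finds a separating hyperplane in the TF-IDF feature space.
svm_model = make_pipeline(TfidfVectorizer(), LinearSVC())
svm_model.fit(reviews, labels)

print(nb_model.predict(["the lunch was great"]))   # expected: [1]
print(svm_model.predict(["service was poor"]))     # expected: [0]
```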
1.4 Deep Learning
Deep learning is one of several subfields of machine learning. Its capacity to learn from well-structured data allows it to generate data-rich findings. The approach is also referred to as "deep neural learning," and the coining of the phrase "deep neural network" is generally credited to a single individual. Like the rest of machine learning, it relies on algorithms, but training a deep-learning algorithm takes a lot of data, and according to one study classical machine learning requires more human supervision to automate. There are no hard and fast programming rules: in order to locate an image of a dog, for example, one needs to provide the computer with as much information as possible, and results are more favourable for programs that have access to vast databases of information. In terms of performance, deep learning beats basic machine learning. Because the learned representation is effectively organized into layers, this kind of learning is also known as "hierarchical learning." Machine learning is currently employed in a broader range of applications than ever before, and artificial neural networks play a major role in these techniques.
LSTM: The LSTM is a widely used artificial recurrent neural network (RNN) and a common choice in deep learning. Unlike a typical feed-forward neural network, an LSTM contains feedback connections, so it is not limited to classifying individual data points; it can process entire sequences of information such as text, audio, and video. LSTM networks may be used for classification and for processing time-series data to make forecasts, even when the time lags between important events in the series are of unknown length. Long short-term memory defines a category of recurrent neural network that has been shown to learn the order dependence required in sequence prediction problems.
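Below is a minimal Keras sketch of an LSTM sentence classifier of the kind discussed above; the vocabulary size, sequence length, and layer widths are arbitrary placeholder values, not parameters from the paper:

```python
from tensorflow.keras import layers, models

VOCAB_SIZE, MAX_LEN = 10_000, 60   # placeholder hyper-parameters

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),          # padded sequences of token ids
    layers.Embedding(VOCAB_SIZE, 128),       # token ids -> dense vectors
    layers.LSTM(64),                          # recurrent layer over the token sequence
    layers.Dense(3, activation="softmax"),    # positive / negative / neutral
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(padded_token_ids, sentiment_labels, epochs=5)  # trained on pre-tokenized, padded reviews
```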
Table 1 Comparison of BERT and RoBERTa

| Comparison | BERT | RoBERTa |
|---|---|---|
| Parameters | Base: 110 M; Large: 340 M | Base: 125 M; Large: 355 M |
| Layers / hidden dimensions / self-attention heads | Base: 12/768/12; Large: 24/1024/16 | Base: 12/768/12; Large: 24/1024/16 |
| Training time | Base: 8 × V100 × 12 days; Large: 280 × V100 × 1 day | 1024 × V100 × 1 day (4–5× more than BERT) |
| Performance | Outperformed SOTA in Oct 2018 | 88.5 on GLUE |
| Pre-training data | BooksCorpus + English Wikipedia = 16 GB | BERT data + CC-News + OpenWebText + Stories = 160 GB |
| Methods | Bidirectional transformer, MLM and NSP | BERT without NSP, using dynamic masking |
BERT: Google developed Bidirectional Encoder Representations from Transformers (BERT) for NLP pre-training; it was produced and released in 2018 by Jacob Devlin and his Google colleagues. BERT is an open-source machine learning framework for natural language processing whose purpose is to help computers make sense of ambiguous language in documents by building context from the surrounding material.
RoBERTa: RoBERTa is a robustly optimized, improved version of Google's self-supervised BERT approach for pre-training NLP systems. The model was trained on 1024 V100 GPUs for 500 K steps with an 8 K batch size. Pre-training with language models has brought large increases in performance, although comparing the various methods is difficult. The most significant difference lies in the masking strategy: BERT uses static masking, in which the same chunk of the phrase is masked in each epoch, whereas RoBERTa employs dynamic masking, in which different parts of the phrase are masked depending on the epoch. The two methods are compared in Table 1.
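One common way to load pre-trained BERT or RoBERTa checkpoints for sentence-level sentiment classification is through the Hugging Face transformers library; the checkpoint name and label count below are illustrative, not choices made by the paper:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "roberta-base"   # "bert-base-uncased" works the same way
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=3)

inputs = tokenizer("great lunch but poor service", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits      # one score per sentiment class
print(logits.softmax(dim=-1))            # class probabilities (head untrained, so near-uniform)
```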
1.5 Role of Machine Learning in Aspect-Based Sentiment Analysis
Texts are analyzed for polarity, from positive to negative, using sentiment analysis. By being trained with examples of emotions in text, machine learning algorithms may eventually learn to determine sentiment autonomously, without human input. There is a clear reason why deep learning models such as RNNs are so popular in natural language processing: when dealing with sequential data such as text, these networks come in handy because of their recurrence. By ingesting the text token by token, they can make repeated predictions about the tone of the text.
Fig. 2 Role of ML in aspect-based sentiment analysis
Sentiment analysis and other similar machine learning methods fit here: sentiment analysis is built on NLP and ML, as illustrated in Fig. 2. Some examples of machine learning techniques used in sentiment analysis include SVM, RNN, CNN, RF, NB, and LSTM. ML and DL algorithms may be used to extract sentiment and opinion from unstructured data, and sentiment analysis is a strong text analysis technique. For many, deep learning (DL) is seen as the next step in the development of artificial intelligence (AI), and the most up-to-date, effective, and widely accepted approach to sentiment analysis is a hybrid model. Sentiment analysis is the systematic identification, extraction, quantification, and analysis of emotional states and subjective information, and it makes use of natural language processing, text analysis, computational linguistics, and biometrics (refer to Fig. 3).
2 Literature Review
The present research considers the existing work in the areas of machine learning, sentiment analysis, deep learning, and LSTM. It also considers the limitations, algorithms, and different platforms involved.
2.1 Literature Survey
Wang et al. [1] reviewed the specialized tools that companies and academic groups have created as deep learning spreads across a wide range of areas [1].
Fig. 3 Sentiment analysis based on machine learning and deep learning
Brychcín et al. [2] presented the results of their system as part of the SemEval 2014 ABSA task. They wanted to determine what people thought about certain characteristics of certain target entities. Using supervised machine learning, they developed a first system that relies solely on the training data for information. Restaurants and computers were analyzed in the study, and the paper demonstrates that the method yields encouraging outcomes [2]. Song et al. [3] examined ABSA, whose purpose is to identify the emotional slant of a text in relation to a given feature; pre-trained BERT performs brilliantly on this task and reaches state-of-the-art performance levels with fine-tuning [3]. Al-Smadi et al. [4] suggested using LSTM neural networks to do sentiment analysis of Arabic hotel reviews; the first component is a conditional random field character classifier that uses a bidirectional LSTM [4]. Syam Mohan et al. [5] surveyed the abundance of knowledge that may be found through the variety of Internet apps that allow people to express themselves. User emotion may be gleaned from linguistic phrases, which contain hidden information that can be deciphered and analyzed; sentiment analysis, the automatic extraction of opinions and attitudes from textual data, goes by many different titles [5]. Dhuria et al. [6] worked in the discipline of opinion mining and sentiment analysis, in which people's views, feelings, attitudes, and emotions are analyzed from written language. This technique has been applied in a variety of ways, including the study of how events in social media networks affect people's attitudes about certain businesses and services. As blogs, forums, microblogs, and social networks such as Twitter and Facebook have grown in popularity, sentiment analysis has become increasingly important [6]. Kumar et al. [7] focused on the emotion of Twitter texts, both random samples and benchmarks, which have been studied extensively over the last decade. While most studies
have only looked at text, images, or GIF videos, a few have also used visual analysis of photos to predict emotion. The dominance of visual content such as images, memes, and GIFs in social media feeds has made this a non-trivial aspect of these platforms; since this visual language has the power to shift, confirm, or even grade the polarity of feeling in such multimodal text, it must be analyzed [7]. Jain et al. [8] looked at NLP researchers who were working on better ways to categorize spam, which has become particularly pressing because of the meteoric growth of social media platforms such as Facebook and Twitter. Spammers have ramped up their spamming efforts in an attempt to gain an edge, whether that benefit is commercial or not. The study uses a deep learning approach, a developing field of research, and the results reveal that LSTM outperforms standard machine learning algorithms by a significant margin for spam detection [8]. Nguyen et al. [9] presented the expanding avalanche of massive datasets that is revolutionizing numerous academic disciplines and may lead to technological advances that can benefit billions of people. The subject of ML, and specifically the branch of DL, has made significant strides in recent years; it is now feasible to analyze and learn from huge numbers of real-world samples in a broad range of ways using methodologies pioneered in these two areas of research. There is a significant and expanding number of frameworks and libraries that implement ML algorithms, and this trend is anticipated to continue [9]. Janiesch et al. [10] observed that AI, like statistics and calculus, has become a valuable tool in engineering and experimental investigation. AI, ML, and DL are the foundations of data science, a burgeoning subject for scholars, and these two data science pillars are discussed in depth in that study, which begins with the basics of machine learning and also covers deep learning, a new trend in the field of artificial intelligence [10]. Geluvaraj et al. [11] noted that algorithms and cybersecurity continue to advance. The field of computer science known as AI aims to give machines the appearance of human intellect, and AI-based technologies, or cognitive systems, are helping to automate numerous tasks and tackle problems that are beyond the capabilities of most humans. Traditional cybersecurity methods may not be able to identify newer forms of malware and cyberattacks, so it is important to take a more aggressive stance in the long run; in ML-based solutions to these security challenges, data from prior attacks are used to respond to subsequent ones. AI technologies in cybersecurity also have the benefit of freeing up a large amount of time for IT staff [11]. Bird et al. [12] presented a technique for classifying text emotion, expressed negatively or positively in an ensemble, to produce a score between 1 and 5. The full TripAdvisor restaurant review corpus of 684 word-stems was mined with information-gain attribute selection in order to develop a high-performing model from the reviews, and the best classification results were achieved with an ensemble classifier that combined different techniques [12].
In 2019, Li [13] employed BERT for end-to-end ABSA. In 2019, Mathur [14] completed a study on emerging trends in expert applications and security. In 2021, Dai [15] explored whether syntax matters and presented a strong framework for aspect-based sentiment analysis using RoBERTa. In 2019, Do [16] assessed deep learning for aspect-based sentiment analysis. In 2019, M. Hoang [17] researched ABSA using BERT. In 2019, Nandal [18] focused on ABSA approaches with mining of reviews. In 2019, Krishan [19] presented air quality modelling leveraging LSTM over NCT-Delhi. In 2019, Fang [20] explored a transfer learning-based approach for ABSA. In 2019, F. Hemmatian [21] surveyed categorization techniques for sentiment analysis and opinion mining. In 2019, Yue [22] presented a survey of sentiment analysis in social media. In 2019, Chen [23] presented a hybrid CNN-LSTM model for typhoon formation forecasts. In 2021, Yadav [24] surveyed aspect-based sentiment analysis. In 2022, Jie et al. [25] researched a sentiment analysis model for short text based on deep learning. In 2022, Sahoo et al. [26] surveyed sentiment analysis for predicting Twitter data using machine learning and deep learning. Sharma et al. [27] offered a study of machine learning techniques for analyzing Twitter user sentiment in 2022. Over the last several decades, there has been an explosion of research into this field. The format of tweets is complicated and hence difficult to comprehend, and their brevity raises additional concerns, such as the proliferation of slang and abbreviations. Several existing publications on Twitter sentiment analysis are discussed, outlining their techniques and models, along with a more generic approach written in Python.
2.2 Comparative Study
In Table 2, we have summarized the recent work done on aspect-based sentiment analysis using various methodologies and approaches along with their limitations.

Table 2 Literature survey

| S.no. | Author year | Title | Methodology | Limitation |
|---|---|---|---|---|
| 1. | Wang [1] | Different ML and DL libraries and their applications: a review | Machine learning, deep learning | Need to improve the accuracy |
| 2. | Brychcín [2] | Sentiment analysis from multiple perspectives using ML | Machine learning, sentiment analysis | Technical feasibility of work is less |
| 3. | Song [3] | ABSA and NLI using BERT intermediate layers | Sentiment analysis, natural language | Need to do more work on performance |
| 4. | Al-Smadi [4] | Arab evaluations are subjected to aspect-based sentiment analysis using long short-term memory deep neural networks | Sentiment analysis | Scope of work is limited |
| 5. | Mohan [5] | Survey on ABSA using ML techniques for sentiment analysis | Sentiment analysis, machine learning | Performance is very low |
| 6. | Dhuria [6] | Sentiment analysis: a data extraction method in natural language processing | Sentiment analysis, machine learning | Need to enhance the scale of application |
| 7. | Kumar [7] | Study of the sentiment in multimodal Twitter data | Sentiment analysis | Complex to implement in real work |
| 8. | Jain [8] | Improving spam detection with semantic LSTM | LSTM | Need to improve the performance |
| 9. | Nguyen [9] | A study of ML and DL tools and frameworks for large-scale data mining | Machine learning, deep learning | Scope and scalability of work is limited |
| 10. | Chahal [10] | ML and DL | Machine learning | There is lack of performance |
| 11. | Britto [11] | The annual international conference on networking, virtualization, and communications technologies | Computer networks | Need to do more work on accuracy |
| 12. | Bird [12] | Sentiment analysis at a high level of detail using an ensemble-based approach | Sentiment analysis | Performance of this research is very low |
| 13. | Li [13] | Incorporating BERT into a comprehensive aspect-based sentiment analysis framework | Sentiment analysis | Limited scope |
| 14. | Mathur [14] | New directions in secure expert applications | Security | Limited scope |
| 15. | Dai [15] | Do people care about syntax? With RoBERTa, we have a solid foundation for ABSA | Sentiment analysis | Lack of flexibility and scalability |
| 16. | Do [16] | A survey of DL methods for sentiment analysis with multiple perspectives | Sentiment analysis | Need to integrate some technical work |
| 17. | Hoang [17] | BERT for ABSA | Sentiment analysis | Scope of this research is less |
| 18. | Nandal [18] | Comparison between aspect-based sentiment analysis and opinion mining techniques | Sentiment analysis | There is no security in this system |
| 19. | Krishan [19] | LSTM air quality modelling over New Delhi, India | LSTM | Performance of this research is very low |
| 20. | Fang [20] | An ABSA method based on transfer learning | Sentiment analysis | There is no security and performance in this system |
| 21. | Hemmatian [21] | Review of opinion mining and sentiment analysis classification methods | Sentiment analysis | There is no implication in future |
| 22. | Yue [22] | The state of social media sentiment analysis | Sentiment analysis | Lack of technical work |
| 23. | Chen [23] | Typhoon forecasting using a CNN-LSTM hybrid | LSTM | Scope of this research is less |
| 24. | Yadav [24] | An in-depth analysis of opinion mining | Sentiment analysis | Performance of this research is very low |
3 Problem Statement Unstructured free-form writing is widely used to write reviews for books on the Internet. Unstructured data are tough to work with. Sentiment analysis was used just at the aspect level in the study’s methodology. Aspect-level sentiment analysis requires two tasks to be completed. First, it is necessary to identify which aspects of a given object are being considered. An object’s attributes are referred to as its aspects. Using the collected features, we will next attempt to classify the evaluations based on their emotional content. Sentiment analysis is carried out using pre-trained
transformer models like RoBERTa. This study's objective is to develop a data extraction technique that can be applied to all of these pre-trained models.
Machine Learning: The goal of machine learning (ML) is to understand and develop techniques that "learn," that is, that use data to increase performance on some set of tasks; it is classified under AI. In order to draw inferences or make choices without being explicitly programmed, machine learning algorithms create a model using sample data, referred to as training data. Many fields rely on machine learning algorithms, such as medicine, email filtering, voice recognition, and computer vision, because it is either impractical or impossible to create custom algorithms to handle the specific problems.
Deep Learning: The machine learning approach known as deep learning mimics the way people learn by observing others' behaviour and then constructing an internal model based on that data. The ability of autonomous automobiles to identify a stop sign, or to tell a person from a lamppost, is made possible by deep learning.
Accuracy Parameters: Choosing a performance metric often depends on the business problem being solved. Suppose you have 100 examples in your dataset, and you have fed each one to your model and received a classification. The predicted versus actual classifications can be charted in a table called a confusion matrix (refer to Fig. 4).
Accuracy: As a heuristic, or rule of thumb, accuracy can tell us immediately whether a model is being trained correctly and how it may perform in general. However, it does not give detailed information regarding its application to the problem.
Accuracy = (TP + TN) / (TP + TN + FP + FN)

Fig. 4 Confusion matrix and accuracy parameters

|                           | Real label: Positive | Real label: Negative |
|---------------------------|----------------------|----------------------|
| Predicted label: Positive | True Positive (TP)   | False Positive (FP)  |
| Predicted label: Negative | False Negative (FN)  | True Negative (TN)   |
Precision: Precision helps when the costs of false positives are high. So let's assume the problem involves the detection of skin cancer. Lots of extra tests and stress are at stake. When false positives are too high, those who monitor the results will learn to ignore them after being bombarded with false alarms.

Precision = TP / (TP + FP)
Recall: A false negative can have devastating consequences: get it wrong and we all die. When false negatives are frequent, you get hit by the very thing you want to avoid. A false negative is when you decide to ignore the sound of a twig breaking in a dark forest, and you get eaten by a bear. If you had a model that let in nuclear missiles by mistake, you would want to throw it out; if you had a model that kept you awake all night because of chipmunks, you would want to throw it out, too. If, like most people, you prefer not to get eaten by the bear and also not to stay up all night worrying about chipmunk alarms, then you need to optimize for an evaluation metric that is a combined measure of precision and recall.
Recall = TP / (TP + FN)
F1 Score: F1 is an overall measure of a model's accuracy that combines precision and recall, in that weird way that addition and multiplication just mix two ingredients to make a separate dish altogether. That is, a good F1 score means that you have low false positives and low false negatives, so you're correctly identifying real threats and you are not disturbed by false alarms. An F1 score is considered perfect when it's 1, while the model is a total failure when it's 0.

F1 = 2 × (precision × recall) / (precision + recall)
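The four formulas above can be computed directly from the confusion-matrix counts. A small sketch with made-up counts for a 100-example validation set:

```python
def classification_metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical confusion-matrix counts (illustrative only).
acc, prec, rec, f1 = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
# accuracy=0.85 precision=0.89 recall=0.80 f1=0.84
```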
It has been observed that existing machine learning and deep learning models take a lot of time during training and testing. Moreover, there is a lack of accuracy in existing research works.
Table 3 Comparison between machine learning and deep learning

| | Machine learning | Deep learning |
|---|---|---|
| Significance | Machine learning is a child of AI and the parent of deep learning | Although DL is based on neural networks, that does not mean DL cannot use other ML algorithms in its own parts, such as using SVM in the activation function |
| Operation | ML algorithms get a training data set and learn how to predict similar events in the future, usually on a test set | DL is mostly based on neural networks, which are one kind of ML algorithm; DL works mostly on feature selection |
| Methods | Supervised and unsupervised | Supervised and unsupervised |
| Data | A few thousand examples | More than a million examples |
| Algorithm | Linear and logistic regression, support vector machine (SVM), Naïve Bayes, KNN, decision tree, random forest | Convolutional neural network (CNN), RNN, LSTM |
4 Comparison of ML and DL
ML, an application of AI, allows the system to learn and improve from experience without being explicitly programmed to do so. In order to train and obtain reliable results, machine learning relies on data; in the field of ML, computer software is developed that uses data to learn by itself. Deep learning makes use of both ANNs and RNNs to achieve its goals. Its algorithms are constructed in much the same way as in machine learning, but with many more tiers of algorithms in the system. The term "artificial neural network" refers to these entire algorithmic network structures put together as a single unit; for the uninitiated, it is like a miniature version of the human brain, complete with interconnected neural networks, and it uses these algorithms and this approach to tackle the most difficult issues. Table 3 shows a comparison of significance, operation, methods, data, and algorithms.
5 Need of Research
To better serve their customers and assess public opinion, businesses may use machine learning to do sentiment analysis. The findings of a sentiment analysis also provide real-world information that can be put to use in order to make better judgements. There is a clear reason why deep learning models like RNNs are so popular in NLP: when dealing with sequential data such as text, these networks come in handy because of their recurrence. As each token in a piece of text is ingested, they may be used to repeatedly forecast the sentiment in sentiment analysis. The ability to identify and interpret client sentiment requires the use of sophisticated sentiment analysis software, and customer experience may be improved by firms that use these techniques to better understand their customers' feelings. The f-score
for aspect extraction is greater in this approach than in others, but the precision of the aspect-to-sentiment mapping is lower. The rule-based method's effectiveness varies with the number of input aspect words and the quality of the natural language processing. Considering previous research, it has been observed that there is scope to make sentiment analysis more scalable, flexible, and efficient. Advanced machine learning and filtering mechanisms might provide better accuracy along with performance; thus, there is a need for research that could provide all of these benefits. By examining and fixing the flaws in traditional machine learning methods, researchers hope to boost the precision and efficacy of sentiment analysis. Future research is expected to provide better accuracy and performance by integrating data filtering approaches that resolve the issues of previous research works.
6 Scope of Research
In order to discover more about what their consumers are thinking and feeling, companies use sentiment analysis. It is a very simple kind of analytics that helps companies identify their areas of strength (positive sentiments) and weakness (negative sentiments). Texts are analyzed for polarity, from positive to negative, and by being trained with examples of emotions in text, machine learning algorithms may eventually learn to determine sentiment autonomously without human input. Because it learns rapidly even on huge datasets, logistic regression is an excellent model. New methods of aspect-sentiment matching may also be made more effective by combining multiple techniques. There is a need to introduce a sentiment mechanism capable of providing high accuracy, precision, f-score, and recall during the classification of sentiments; moreover, such a mechanism should provide high performance during training and testing operations.
References 1. Wang Z, Liu K, Li J, Zhu Y, Zhang Y (2019) Various frameworks and libraries of machine learning and deep learning: a survey. Arch Comput Methods Eng 0123456789. https://doi.org/ 10.1007/s11831-018-09312-w. 2. Brychcín T, Konkol M, Steinberger J (2014) UWB: Machine learning approach to aspect-based sentiment analysis. In Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014); 25th international conference on computational linguistics (COLING 2014), pp 817–822. https://doi.org/10.3115/v1/s14-2145 3. Song Y, Wang J, Liang Z, Liu Z, Jiang T (2020) Utilizing BERT intermediate layers for aspect based sentiment analysis and natural language inference. http://arxiv.org/abs/2002.04815 4. Al-Smadi M, Talafha B, Al-Ayyoub M, Jararweh Y (2019) Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. Int J Mach Learn Cybern 10(8):2163–2175. https://doi.org/10.1007/s13042-018-0799-4 5. Mohan S, Sunitha R (2020) Survey on aspect based sentiment analysis using machine learning techniques. Eur J Mol Clin Med 07(10)
6. Dhuria S (2015) Sentiment analysis: an approach in natural language processing for data extraction. Int J New Innov Eng Technol 2(4):27–31 7. Kumar A, Garg G (2019) Sentiment analysis of multimodal twitter data. Multimed Tools Appl. https://doi.org/10.1007/s11042-019-7390-1 8. Jain G, Sharma M, Agarwal B (2019) Optimizing semantic LSTM for spam detection. Int J Inf Technol 11(2):239–250. https://doi.org/10.1007/s41870-018-0157-5 9. Nguyen G et al (2019) Machine learning and deep learning frameworks and libraries for largescale data mining: a survey. Artif Intell Rev 52(1):77–124. https://doi.org/10.1007/s10462018-09679-z 10. Chahal A, Gulia P (2019) Machine learning and deep learning. Int J Innov Technol Explor Eng 8(12):4910–4914. https://doi.org/10.35940/ijitee.L3550.1081219 11. Britto J, Chaudhari V, Mehta D, Kale A, Ramteke J (2019) International conference on computer networks and communication technologies, vol 15. Springer Singapore. https://doi.org/10. 1007/978-981-10-8681-6 12. Bird JJ, Ekárt A, Buckingham CD, Faria DR (2019) High resolution sentiment analysis by ensemble classification. Adv Intell Syst Comput 997:593–606. https://doi.org/10.1007/978-3030-22871-2_40 13. Li X, Bing L, Zhang W, Lam W (2019) Exploiting BERT for end-to-end aspect-based sentiment analysis, W-NUT@EMNLP 2019—5th proceedings of the workshop on noisy user-generated text, pp 34–41. https://doi.org/10.18653/v1/d19-5505 14. Mathur R, Pathak V, Bandil D (2019) Emerging trends in expert applications and security, vol 841. Springer Singapore. https://doi.org/10.1007/978-981-13-2285-3 15. Dai J, Yan H, Sun T, Liu P, Qiu X (2021) Does syntax matter? A strong baseline for aspect-based sentiment analysis with RoBERTa, pp 1816–1829. https://doi.org/10.18653/v1/2021.naacl-mai n.146 16. Do HH, Prasad PWC, Maag A, Alsadoon A (2019) Deep learning for aspect-based sentiment analysis: a comparative review. Expert Syst Appl 118:272–299. https://doi.org/10.1016/j.eswa. 2018.10.003 17. Hoang M, Bihorac OA, Rouces J (2019) Aspect-based sentiment analysis using BERT. In Proceedings of the 22nd nordic conference on computational linguistics, pp 187–196. https:// www.aclweb.org/anthology/W19-6120 18. Nandal N, Pruthi J, Choudhary A (2019) Aspect based sentiment analysis approaches with mining of reviews: a comparative study. Int J Recent Technol Eng 7(6):95–99 19. Krishan M, Jha S, Das J, Singh A, Goyal MK, Sekar C (2019) Air quality modelling using long short-term memory (LSTM) over NCT-Delhi, India. Air Qual Atmos Heal 12(8):899–908. https://doi.org/10.1007/s11869-019-00696-7 20. Fang X, Tao J (2019) A transfer learning based approach for aspect based sentiment analysis. In 2019 6th international conference on social networks analysis management security (SNAMS 2019), pp 478–483. https://doi.org/10.1109/SNAMS.2019.8931817 21. Hemmatian F, Sohrabi MK (2019) A survey on classification techniques for opinion mining and sentiment analysis. Artif Intell Rev 52(3):1495–1545. https://doi.org/10.1007/s10462-0179599-6 22. Yue L, Chen W, Li X, Zuo W, Yin M (2019) A survey of sentiment analysis in social media. Knowl Inf Syst 60(2):617–663. https://doi.org/10.1007/s10115-018-1236-4 23. Chen R, Wang X, Zhang W, Zhu X, Li A, Yang C (2019) A hybrid CNN-LSTM model for typhoon formation forecasting. GeoInformatica 23(3):375–396. https://doi.org/10.1007/ s10707-019-00355-0 24. Yadav K, Kumar N, Maddikunta PKR, Gadekallu TR (2021) A comprehensive survey on aspect-based sentiment analysis. 
Int J Eng Syst Model Simul 12(4):279–290 25. Jie L, Gui Z (2022) Research on sentiment analysis model of short text based on deep learning. Scientific Programming, Hindawi. https://doi.org/10.1155/2022/2681533
26. Sahoo M, Rautaray J (2022) Survey on sentiment analysis to predict twitter data using machine learning and deep learning. Int J Eng Res Technol (IJERT) 11(07) 27. Sharma A, Kedar A, Jadhav H (2022) A survey on sentiment analysis of twitter using machine learning. Int J Innov Res Sci Eng Technol 11:3721–3725. https://doi.org/10.15680/IJIRSET. 2022.1104100
Chapter 21
Behavioral System for the Detection of Modern and Distributed Intrusions Based on Artificial Intelligence Techniques: Behavior IDS-AI Imen Chebbi, Ahlem Ben Younes, and Leila Ben Ayed
1 Introduction
Protecting online identity against risks to confidentiality, integrity, and accessibility becomes a challenge that has to be addressed more and more as the majority of the population gains internet access. A network intrusion detection system (IDS) is defined as a tool that locates and identifies unusual network traffic in order to flag and categorize suspicious activities. It is an important part of network security and acts as the first line of defense against potential attacks by alerting an administrator or the appropriate individuals to potentially dangerous network behavior. Various artificial intelligence (AI) techniques are proposed in several academic articles for a robust network intrusion detection system (IDS). As more individuals have access to the internet, finding network traffic anomalies is becoming a more difficult undertaking. The Cisco Annual Internet Report [1] predicts that by the start of 2023, roughly two-thirds of the world population, or around 29.3 billion networked devices, will be online. In order to ensure user privacy and safety online, dangerous network behavior must be quickly identified and reported. Network intrusion, in a nutshell, is the term for unauthorized, possibly dangerous behavior on a digital network. A computer network's confidentiality, integrity, and accessibility may be threatened by an intrusion, which might result in problems including system compromise and privacy violations.
Network assaults include spoofing, worms, buffer overflow attacks, traffic flooding, and denial-of-service (DoS) attacks. According to [2], signature-based detection systems and anomaly-based detection systems are the two fundamental categories of intrusion detection systems. Known attack signatures are generally used by signature-based systems to identify illicit activities. Anomaly-based systems, on the other hand, are primarily used to identify unusual network behavior in scenarios where attack signatures are unknown. To automatically identify suspect network behavior, these detection systems employ a variety of techniques, from statistics to deep learning models. The following are the important contributions of this paper:
– Data quality: It is necessary to process data before using it in machine learning algorithms. In this paper, we have used the UNSW-NB15 dataset.
– Proposed a behavioral system for the detection of modern and distributed intrusions based on artificial intelligence techniques: behavior IDS-AI.
– For the feature reduction method, we have used an auto-encoder.
– The evaluation of the UNSW-NB15 dataset processing with two models, Deep Neural Network (DNN) and Convolutional Neural Network (CNN).
– A collection of open-source Python programs for analyzing the data and reproducing the experiments.
– Comparison of our model with other works available in the literature.
The findings of the research serve as the basis for evaluating this learning approach with other publicly available datasets. The body of the work is structured as follows: In Sect. 2, we present a literature review. In Sect. 3, we show the principles of intrusion detection systems (IDS) and the artificial intelligence approaches utilized to analyze the dataset. In Sect. 4, we describe our proposed approach. Section 5 presents the experimental results of our approach. Finally, Sect. 6 draws our conclusions and plan for future work.
2 Literature Review
The attackers' rising ability, which affects the dependability of data communication and networks, makes computer security risks particularly difficult to deal with. An intrusion attempts to violate a security aim and also infects systems. In order to protect networks and systems against intrusions, several tools and techniques, such as IDS, have been created [3–5]. By categorizing data activity as normal or intrusive, intrusion detection is a collection of techniques used to identify undesired behaviors [4–6]. The strategies for intrusion detection find and halt intrusions that occur either inside or outside a monitored network. Accordingly, there are two basic detection techniques that may be applied. The first one is called misuse detection, and it uses recognized attack signatures to find incursions. The second one, based on departure from a typical profile, is called anomaly detection or behavioral detection [4, 7].
The hybrid detection techniques incorporate the benefits of both abuse and anomaly detection and seek to improve IDS detection rate and accuracy. As a result, intrusion detection research is still relevant and active. Nowadays, ML techniques have been incorporated to strengthen computer security and improve intrusion detection. Numerous research contributions examine the use of ML approaches in intrusion detection to improve training and data quality and provide dependable IDS that performs accurately. The data are not always acquired in an organized form, on the other hand. Processing unstructured data is necessary for useful analysis. To enhance the data’s quality and make precise conclusions, this operation is a crucial step. Before beginning the training and classification processes, data quality procedures are put into practice [8, 9]. Additionally, feature selection is a desired method that aims to choose the pertinent features in order to lower the computational cost of modeling and enhance the predictive model performance [9, 10].
3 Fundamentals This section discusses the most important aspects of this paper, particularly the principles of intrusion detection systems (IDS) and the artificial intelligence approaches utilized to analyze the dataset.
3.1 Behavior-Based Intrusion Detection System IDSs seek to identify deviations from typical network behavior by processing logs created by active network devices such as PCs or routers. This unusual activity is known as a network intrusion, and initially, intrusion detection software was built to read and analyze logs created by various systems. The same motive has been extended to network traffic processing, with intrusion detection systems (IDSs) being the most prominent application for collecting, analyzing, and classifying network traffic content. IDSs that are behavior-based offer two key benefits over ones that are signature-based. On the one hand, they are capable of spotting unidentified abnormalities and “zero-day” assaults. However, the normalcy profile may be educated for each contextual situation, which might make it more difficult for an attacker to discover any possible security holes in a network. The most pertinent profiling methods are fully described in [11], along with a summary of notable recent breakthroughs in behavior-based IDSs. This sort of IDS produces a significant percentage of false positives despite being able to respond to unknown threats, which necessitates ongoing and effective tweaking of the training dataset and adjusting reaction levels.
Fig. 1 The structure of an autoencoder [12]
3.2 Feature Reduction Technique
An auto-encoder is a tool for reducing a dataset's feature set to a size that does not negatively impact classification performance as a whole. Auto-encoders are neural networks with two symmetrical components: an encoder and a decoder. The decoder reconstructs the input data using the characteristics, also known as latent representations, that the encoder obtains from the raw data. Figure 1 shows the structure of the auto-encoder.
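A minimal Keras sketch of the symmetric encoder-decoder structure described above is given below; the latent size of 16 is a placeholder and is not a value fixed by the paper, while the 42 input features correspond to the UNSW-NB15 attribute count:

```python
from tensorflow.keras import layers, models

n_features, latent_dim = 42, 16     # 42 UNSW-NB15 features; latent size is a placeholder

inputs = layers.Input(shape=(n_features,))
encoded = layers.Dense(latent_dim, activation="relu")(inputs)       # encoder: raw features -> latent representation
decoded = layers.Dense(n_features, activation="linear")(encoded)    # decoder: reconstruct the original features

autoencoder = models.Model(inputs, decoded)
encoder = models.Model(inputs, encoded)      # used afterwards to produce the reduced feature set
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(x_unlabeled, x_unlabeled, epochs=20, batch_size=256)
# x_reduced = encoder.predict(x_unlabeled)
```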
3.3 Models
To evaluate our approach, we have used the two models described in this subsection.
Deep Neural Network (DNN). A DNN is an end-to-end machine learning algorithm made up of many interconnected layers. Patterns are extracted from simple feature representations with limited prior knowledge. This deep learning model is widely used in cases where traditional machine learning algorithms cannot solve the problem properly. Figure 2 shows the structure of the DNN.
Convolutional Neural Network (CNN). The CNN algorithm was created to mimic the human visual system, which is why CNNs perform exceptionally well on computer vision problems. The CNN model is made up of alternating convolution and pooling layers. Figure 3 shows the structure of the CNN.
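The sketch below shows a small fully connected DNN of the kind the paper evaluates; the layer sizes are illustrative, and a 1-D CNN variant would replace the first dense layers with Conv1D/MaxPooling1D blocks. None of these values are the paper's actual hyper-parameters:

```python
from tensorflow.keras import layers, models

def build_dnn(n_inputs, n_classes):
    """Small fully connected classifier over the (reduced) feature vector."""
    return models.Sequential([
        layers.Input(shape=(n_inputs,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation="softmax"),   # normal + 9 attack categories -> 10 classes
    ])

dnn = build_dnn(n_inputs=16, n_classes=10)
dnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```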
4 Our Proposed Approach
In this part, we present our approach, Behavior IDS-AI. The suggested approach, as shown in Fig. 4, is made up of three major phases: data quality, unsupervised feature learning, and supervised classification. The first phase is used to remove inconsistencies (Infinity, NaN, and null values). The second phase performs unsupervised learning based on the use of a deep auto-encoder; the auto-encoder is applied to unlabeled data in order to reduce its dimensionality and to derive a new feature
Fig. 2 The structure of the DNN [12]
Fig. 3 The structure of the CNN [12]
representation. The third phase then performs supervised classification using a Deep Neural Network (DNN) or Convolutional Neural Network (CNN) based on the output of the earlier phases. More details are given in the following subsections.
Phase 1: Data quality process. The gathering and preparation of data is the major objective of this phase. The system runs a process that can collect and compile the required data from networks. After the data are obtained, the acquired network traffic is subjected to a particular data preparation. The data are evaluated at the data preprocessing stage, and incompatible data types are disregarded; in addition, the data are cleaned, and the cleaned data are preserved. In this work, we have used the UNSW-NB15 dataset.
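A minimal pandas sketch of the data-quality step (dropping Infinity, NaN, and null values) is shown below; the file name is a placeholder for the training split and is not prescribed by the paper:

```python
import numpy as np
import pandas as pd

df = pd.read_csv("UNSW_NB15_training-set.csv")   # placeholder path to the training split

df = df.replace([np.inf, -np.inf], np.nan)       # map Infinity values to NaN
df = df.dropna()                                  # drop rows containing NaN / null values
df = df.drop_duplicates()                         # optional: remove duplicated records

print(df.shape)
```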
Fig. 4 Our proposed approach behavior IDS-AI
Phase 2: Unsupervised feature learning. The complexity of the created model can rise because redundancy and noise, which can affect learner performance, are present when there are many characteristics. These problems can be addressed through dimensionality reduction. A feature extraction technique is used to perform the necessary transformations and produce the most significant features from the unlabeled input data. The produced low-dimensional data are then categorized, which allows the classifier to perform better overall. In this phase, we have used an auto-encoder because it is an unsupervised neural network algorithm; its principal goal is to compress its original input and reproduce it as an equivalent reconstructed output.
Phase 3: Supervised classification. In this stage, the class vector from the initial dataset is combined with the newly extracted feature space. As a consequence, we obtain a reduced, labeled training set that can be used as input data by the classifier of our choice. There are several supervised learning algorithms, such as classification rules, neural networks, decision trees, Random Forest, Convolutional Neural Network (CNN), and Deep Neural Network (DNN). In our work, we selected two supervised classifiers for this task: Deep Neural Network (DNN) and Convolutional Neural Network (CNN). We end up with a behavioral analysis of our result.
4.1 Dataset Description The choice of dataset is one of the most crucial aspects in developing an intrusion detection system (IDS). The selected dataset is used to test the viability of a proposed IDS model as well as to train an IDS model. We take into account the UNSW-NB15 [13] dataset, a publicly accessible intrusion detection dataset that has been extensively used in earlier publications. UNSW-NB15 Dataset. In order to address the vulnerabilities revealed in the KDDCup 99 [14] and NSL-KDD [15] datasets, the Australian Centre for Cyber
Table 1 Description of UNSW-NB15 dataset

| Category | Train | Test |
|---|---|---|
| Normal | 56,000 | 37,005 |
| Backdoor | 1746 | 583 |
| Analysis | 2000 | 677 |
| Fuzzers | 18,185 | 6062 |
| Shellcode | 1133 | 378 |
| Reconnaissance | 10,492 | 3496 |
| Exploits | 33,393 | 11,132 |
| DoS | 12,264 | 4089 |
| Worms | 130 | 44 |
| Generic | 40,000 | 18,871 |
| Total | 175,343 | 82,337 |
Security (ACCS) cyber security research team has developed a new dataset named UNSW-NB15. A grouping of complete connection records with 10 attacks includes 82,337 test connection records and 175,343 train connection records. 42 characteristics with parallel class labels for normal and nine distinct assaults make up the partitioned dataset. Detailed data and information on the type of simulated assaults are provided in Table 1.
4.2 Metrics Evaluation
The performance of our approach is evaluated based on the accuracy, precision, recall, and F-measure metrics. They are defined as follows:
Accuracy: It is the proportion of correctly classified instances to all instances. Only when a dataset is balanced can this performance metric, also known as detection accuracy, be used.

Accuracy = (TP + TN) / (TP + FP + FN + TN)    (1)
Precision: It measures the proportion of correctly predicted attacks to all the samples predicted as attacks.

Precision = TP / (TP + FP)    (2)
Recall: It measures the proportion of samples that have been successfully identified as attacks to samples that actually are attacks. Another name for it is a detection rate.
Recall = TP / (TP + FN)    (3)
F-Measure: The harmonic mean of the Precision and Recall is how it is defined. In other words, it is a statistical technique for evaluating a system's correctness while taking into account both its precision and recall.

F1-Measure = 2 × (Precision × Recall) / (Precision + Recall)    (4)
5 Experimental Results
TensorFlow under Windows 10 was utilized in the experiments as the backend, whereas Python and Keras were used for coding.
5.1 Results
On the training set, we train our model, adjust the decision criteria, and compute the effectiveness metrics on the validation set. We then utilize the production set, a dataset devoid of labels, to spot any irregularities. This last phase is important because, when the model is used in practice, it will generate predictions on real-time data without labels against which to assess them. In order to ensure that the model is not behaving abnormally on this unknown, label-less production set, it is always good practice to hold out a production set. Table 2 shows the experimental results of our model for the UNSW-NB15 dataset.
5.2 Related Work Comparison
Several publications have extensively used the UNSW-NB15 dataset to compare network intrusion detection systems, namely those based on deep learning and machine learning approaches. Here, we compare our approach with four works: Breiman [16], Cortes et al. [17], Moradi et al. [18], and Cui et al. [19].
Table 2 Overall performance (UNSW-NB15 dataset)

| Model | Accuracy | F1 score | Recall | Precision |
|---|---|---|---|---|
| Deep neural network (DNN) | 0.9967 | 0.9998 | 0.9983 | 0.9979 |
| Convolutional neural network (CNN) | 0.9966 | 0.9982 | 0.9978 | 0.9986 |
Table 3 Comparison with other works available in the literature

| Type | ACC (%) |
|---|---|
| Our model (DNN) | 99.67 |
| Our model (CNN) | 99.66 |
| RF [16] | 74.35 |
| SVM [17] | 68.49 |
| MLP [18] | 78.32 |
| GMM-WGAN [19] | 84.87 |
Table 3 details the evaluation of our best outcomes against those of other authors for the UNSW-NB15 dataset.
Fig. 5 Comparison with other works available in the literature
260
I. Chebbi et al.
6 Conclusion and Future Works The growth of Internet access has resulted in a rise in the volume and sophistication of cyber assaults. Intrusion detection is a collection of upgraded approaches used to monitor systems and data in order to make them safer. We describe in this research a dependable network intrusion detection solution based on artificial intelligence approaches. A preprocessing step is being put up to boost the detection rate and accuracy of IDS based on data heterogeneity. In addition, for high data quality, a feature selection methodology based on the entropy choice method is handled before developing the model. The validation of a novel strategy is accomplished through provided solutions that ensure efficient accuracy. The results are compared using the datasets: UNSW-NB15. As a result, the innovative suggested network intrusion detection technique has several advantages and high accuracy when compared to existing models. Other efficient strategies will be integrated with future studies to improve the detection rate and accuracy of our methodology.
References 1. Cisco Annual Internet Report (2018–2023) https://www.cisco.com/c/en/us/solutions/collat eral/executive-perspectives/annual-internetreport/white-paper-c11-741490.html 2. Shimeall TJ, Spring JM (2014) Introduction to information security, pp 253–274 (2014) 3. Fernandes G, Rodrigues JJPC, Carvalho LF (2019) A comprehensive survey on network anomaly detection. Telecommun Syst 70:447–489 4. Guezzaz A, Asimi A, Tbatou Z, Asimi Y, Sadqi Y (2019) A global intrusion detection system using pcapsocks sniffer and multilayer perceptron classifier. Int J Netw Secur 21(3):438–450 5. Khraisat A, Gondal J, Vamplew P, Kamruzzaman J (2019) Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity, vol 2 6. Guezzaz A, Asimi Y, Azrour M, Asimi A (2021) Mathematical validation of proposed machine learning classifier for heterogeneous traffic and anomaly detection. Big Data Mining Anal 4(1):18–24 7. Ji S-Y, Jeong B-K, Choi S, Jeong DH (2016) A multilevel intrusion detection method for abnormal network behaviors. J Netw Comput Appl 62:9–17 8. Jeyakumar K, Revathi T, Karpagam S (2015) Intrusion detection using artificial neural networks with best set of features. 3e Int Arab J Inf Technol 12(6A) 9. Rostami M, Berahmand K, Nasiri E, Forouzandeh S (2021) Review of swarm intelligence-based feature selection methods. Eng Appl Artif Intell 100(104210) 10. Alazzam H, Sharieh A, Sabri KE (2020) A feature selection algorithm for intrusion detection system based on pigeon inspired optimizer. Expert Syst Appl 148(113249) 11. Mishra P, Varadharajan V, Tupakula U, Pilli ES (2018) A detailed investigation and analysis of using machine learning techniques for intrusion detection. IEEE Commun Surv Tutor 21:686– 728 12. Liu H, Lang B (2019) Machine learning and deep learning methods for intrusion detection systems: a survey. Appl Sci 9(20):4396. https://doi.org/10.3390/app9204396 13. Moustafa N, Slay J (2016) The evaluation of network anomaly detection systems: statistical analysis of the unsw-nb15 data set and the comparison with the kdd99 data set. Inf Secur J: A Global Pers 25(1–3):18–31 14. KDD Cup 1999 (2007) http://kdd.ics.uci.edu/databases/kddcup00/kddcup99.html
21 Behavioral System for the Detection of Modern and Distributed …
261
15. Dhanabal L (2015) Shantharajah S PA study on NSL-KDD dataset for intrusion detection system based on classification algorithms. Int J Adv Res Comput Commun Eng 4(6):446–452 16. Breiman L (2020) Random forests machine learning 45(1):5–32. https://doi.org/10.1023/A: 1010933404324 17. Cortes C, Vapnik V (1995) Support vector machine. Mach Learn 20(3):273–297. https://doi. org/10.1007/BF00994018 18. Moradi M, Zulkernine M (2004) A neural network based system for intrusion detection and classification of attacks. In Proceedings of the IEEE international conference on advances in intelligent systems-theory and applications, IEEE Lux-embourgKirchberg, Luxembourg, pp 15–18 19. Cui J, Zong L, Xie J, Tang M (2023) A novel multi-module integrated intrusion detection system for high-dimensional imbalanced data. Appl Intell. https://doi.org/10.1007/s10489022-03361-2
Chapter 22
A Novel Integer-Based Framework for Secure Computations over Ciphertext Through Fully Homomorphic Encryption Schemes in Cloud Computing Security V. Biksham
and Sampath Korra
1 Introduction Homomorphic Encryption scheme plays a vital role in cloud computing security. It is a special cryptographic technique where computations are performed on the ciphertext by the server. The computations are performed through an evaluation function and the re- sults are sent back to client in an encrypted format. It plays vital role in cloud computing, blockchain, healthcare, machine learning and Internet of Things (IoT) for data security [1]. Homomorphic Encryption enables the users to perform various computations over ciphertext without needing the key and the decryption process. However, it matches the results when we perform the same computations on the plaintext. The working principle of the Homomorphic encryption process is demonstrated in Fig. 1 with six steps. As in step1, the client encrypts the message using an algorithm and key, resulting in the ciphertext which is stored in cloud as shown in step2. Later the client sends a computation request in step3 with arbitrary function f (), then the server evaluates the query function f () and returns the results to the client in an encrypted from as represented in step4 and step5. Finally the client receives the encrypted results and decrypts it and verifies the matches of the results as same computations are prosecuted on plaintext as directed in step6. Homomorphic’s assets aid in the development of secure e-voting system with a high privacy data retrieval strategy, as well as the use of cloud computing while ensuring the confidentiality of processed data. V. Biksham (B) Professor, Department of CSE, Sreyas Institute of Engineering and Technology, Hyderabad, India e-mail: [email protected] S. Korra Associate Professor & Head- IoT, Department of CSE, Sri Indu College of Engineering and Technology (A), Sheriguda, Ibrahimpatnam, Hyderabad, Telangana, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_22
Fig. 1 Procedure of homomorphic encryption
The main idea of secure computation emerged in the 1970s, around the time the RSA algorithm was introduced. Rivest et al. [2] proposed performing computations over ciphertext without affecting security, but the idea remained a theoretical, unsolved problem. Three decades later, Craig Gentry [3] proposed a solution using ideal lattices and bootstrapping. Fully homomorphic encryption supports arbitrary computation over ciphertexts without any need to decrypt or to compute over the original data. Thus, FHE comprises an encryption algorithm E and a decryption algorithm D such that:
C1 = E(m1), C2 = E(m2)   (1)
D(f(C1, C2)) = f(m1, m2)   (2)
where C1 and C2 are ciphertexts, m1 and m2 are plaintexts, and f is an arbitrary function. More precisely, in practice the FHE technique provides an alternate computation F' that, when applied directly to encrypted data, results in the encryption of the application of F to the data in the clear; more formally, F(unencrypted data) = Decrypt(F'(encrypted data)).
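As a concrete illustration of Eqs. (1)–(2) for a single operation, textbook (unpadded) RSA [4] is multiplicatively homomorphic: the product of two ciphertexts decrypts to the product of the plaintexts. The following minimal Python sketch uses deliberately tiny, insecure parameters chosen only for readability; it is not the scheme proposed in this chapter.

```python
# Minimal, insecure textbook-RSA demo of the multiplicative homomorphic
# property D(C1 * C2 mod n) = m1 * m2 mod n. Parameters are toy values.
p, q = 61, 53                 # small primes (illustration only)
n = p * q                     # RSA modulus
phi = (p - 1) * (q - 1)
e = 17                        # public exponent, gcd(e, phi) = 1
d = pow(e, -1, phi)           # private exponent (Python 3.8+ modular inverse)

def enc(m):                   # E(m) = m^e mod n
    return pow(m, e, n)

def dec(c):                   # D(c) = c^d mod n
    return pow(c, d, n)

m1, m2 = 7, 9
c1, c2 = enc(m1), enc(m2)
c_mul = (c1 * c2) % n         # f(C1, C2): multiply the ciphertexts
assert dec(c_mul) == (m1 * m2) % n   # equals f(m1, m2) = m1 * m2
print(dec(c_mul))             # -> 63
```

Partially homomorphic schemes such as RSA, ElGamal, and Paillier support exactly this kind of single-operation homomorphism; FHE extends it to arbitrary combinations of additions and multiplications.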
1.1 Algorithms of Homomorphic Encryption Scheme
A public-key homomorphic encryption scheme consists of the following algorithms, as shown in Fig. 2. • KeyGen(parameters) takes the system parameters as input and generates a key pair (pk, sk) as well as a public evaluation key for homomorphic computations. • Enc(pk, m), an encryption algorithm that encrypts plaintext m with the public key pk.
Fig. 2 Architecture of Homomorphic Encryption
Fig. 3 Experimental results of proposed scheme
• Dec(sk, c), a decryption algorithm that uses the private key sk to decrypt a ciphertext c. • Add(c1, c2), a homomorphic addition operation: given encryptions c1 and c2 of m1 and m2, it produces a ciphertext encrypting the sum m1 + m2. • Mult(c1, c2), a homomorphic multiplication operation that outputs a ciphertext encrypting the product m1 · m2 given encryptions c1 and c2 of m1 and m2. (A toy sketch of these five algorithms is given at the end of this subsection.) Partially Homomorphic Encryption (PHE) and Fully Homomorphic Encryption (FHE) are the two main types of homomorphic encryption techniques. PHE systems, including RSA [4], ElGamal [5], Paillier [6], and others, allow encrypted data to be either added or multiplied. Constructing a scheme that could support both operations at the same time remained an open problem. Although Boneh et al. [7] came close, allowing unlimited additions and one multiplication, it was not until 2009 that the three-decade-old problem was solved in an influential work by Gentry [8], who demonstrated
that fully homomorphic encryption can perform both addition and multiplication simultaneously.
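To make the five algorithms listed above concrete, the Python sketch below implements them for a toy symmetric-key scheme over the integers in the spirit of the van Dijk et al. construction [9] discussed in Sect. 2 (a ciphertext of a bit m has the form pq + 2r + m for a secret odd integer p). It is an illustration only: the actual construction is public-key, uses secure parameter sizes, and is not the scheme proposed in this chapter.

```python
import random

# Toy symmetric "somewhat homomorphic" encryption over the integers.
# Encrypts single bits; parameters are far too small to be secure and are
# chosen only to illustrate KeyGen / Enc / Dec / Add / Mult.
def keygen(bits=21):
    return random.randrange(2**(bits - 1), 2**bits) | 1   # odd secret integer p

def enc(p, m):                       # m is a bit (0 or 1)
    q = random.randrange(2**30, 2**31)   # large random multiplier
    r = random.randrange(0, 16)          # small non-negative noise
    return p * q + 2 * r + m

def dec(p, c):
    return (c % p) % 2               # correct while the noise stays well below p

def add(c1, c2):                     # homomorphic XOR of the plaintext bits
    return c1 + c2

def mult(c1, c2):                    # homomorphic AND of the plaintext bits
    return c1 * c2

p = keygen()
c1, c2 = enc(p, 1), enc(p, 0)
print(dec(p, add(c1, c2)))   # 1 XOR 0 -> 1
print(dec(p, mult(c1, c2)))  # 1 AND 0 -> 0
```

A scheme like this is only somewhat homomorphic: each multiplication roughly squares the noise term, so only a limited number of operations can be evaluated before decryption fails, which is exactly the limitation that the squashing and bootstrapping steps described in Sect. 2 address.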
2 Background
In 2009, the breakthrough concept of a fully homomorphic encryption scheme [3] was introduced by a young IBM scientist, Craig Gentry, using ideal lattices. The main idea is to first construct a Somewhat Homomorphic Encryption (SHE) scheme, then squash the ciphertext, and finally perform the ciphertext refresh procedure called bootstrapping, which turns it into a fully homomorphic encryption scheme. This framework motivated researchers to contribute to the field of homomorphic encryption. Although Gentry's scheme encouraged the research community to develop practical implementations of homomorphic encryption, in 2010 Van Dijk et al. [9] developed a fully homomorphic encryption scheme using integers instead of ideal lattices. This scheme is conceptually simple in its encryption and decryption processes and satisfies the correctness of the homomorphic encryption properties; its main advantage is a reduction in the public key size. Coron et al. [10] proposed a compression-based fully homomorphic encryption scheme. In 2012, Zvika Brakerski et al. [11] developed fully homomorphic encryption without bootstrapping, and an NTRU-based FHE scheme was introduced by López-Alt, Tromer, and Vaikuntanathan in the same year; these schemes are based on the hardness of the Ring Learning With Errors (R-LWE) problem. The above schemes are regarded as the first and second generations of fully homomorphic encryption schemes. In 2013, Craig Gentry et al. [12] proposed a new method for constructing FHE schemes that eliminates the costly relinearization step in homomorphic multiplication; with this method, less noise is observed in the ciphertext after multiple computations, together with better efficiency and strong security. Ring variants of the GSW cryptosystem, namely FHEW [13] and TFHE [14], were subsequently developed. The effectiveness of FHEW was further improved by the Torus-based TFHE scheme, which develops a ring variant of the bootstrapping methods to reduce the noise level and speed up the computations. These schemes are regarded as the third generation of FHE schemes. Various open-source libraries, frameworks, and tools for performing fully homomorphic encryption (FHE) operations on encrypted data sets are available as GitHub repositories. Following the introduction of the CKKS technique in 2018, attention turned to encrypted machine learning, because the CKKS scheme encrypts approximate values rather than exact numbers. When computers store real-valued data, they keep approximate values using long significand bits rather than exact values, and the CKKS system was created to deal efficiently with the errors caused by such approximations; this suits machine learning, which already contains inherent noise. Biksham et al. [16, 19] present a symmetric-key-based lightweight homomorphic encryption scheme using integers, in which the plaintext is taken as a matrix and encrypted using matrix multiplication with the Chinese Remainder Theorem.
Baiyu Li and Daniele Micciancio [15] discuss passive attacks against CKKS in a 2020 article, implying that the standard IND-CPA specification may not be adequate in cases when decryption results are made public [17].
3 Proposed Scheme
With the aim of achieving a fully homomorphic encryption scheme, we propose a secure and novel scheme that supports computations over ciphertext with low noise and highly accurate results. In addition, we analyze the experimental results and compare them with state-of-the-art approaches. We also show that the proposed scheme works efficiently and adapts to real-world applications such as cloud computing and secure big data analytics.
Key Generation: the following steps are used to generate the key that drives the encryption process.
Step 1: choose the function f(n) = (Xn + My + 2), where X and M are randomly chosen integers and n and y are polynomial limits within the range 0 to T, where T is the threshold of f(n); n is the public key and the generated f(n) is the private key.
Step 2: evaluate the function f(n) from n = 1 to y − 1, where y is chosen by the client depending on the size of the plaintext.
Step 3: each outcome of f(n) from Step 2 is reduced modulo 2 after the sum of all the averaged inputs is computed.
Step 4: finally, the key that serves as input to the encryption and decryption process is generated.
Encryption: steps to perform homomorphic encryption on the plaintext:
1. Download the library for fully homomorphic encryption from https://github.com/kryptnostic/fhe-core
2. Extract the zip file into an fhe-core folder.
3. Run the command ./gradlew build inside the fhe-core folder to build the project.
4. Run the command ./gradlew idea inside the fhe-core folder to set up the project in the IntelliJ development environment.
5. Create a file with input text of any size, e.g., 100 KB, 1 MB, 10 MB, 50 MB, etc.
6. Specify a ciphertext block length of 128 bytes and a plaintext block length of 64 bytes to generate the private key.
7. Place the created input file in the path fhe-core/kryptnostic-core/src/main/java/com/kryptnostic/multivariate
8. Run the program at the path fhe-core/kryptnostic-core/src/main/java/com/kryptnostic/multivariate/Main.java
Homomorphic Computation:
9. Perform encryption of the plaintext using the public key to obtain the ciphertext. The plaintext is converted to a bit vector, and homomorphic functions such as AND or XOR are applied during encryption.
Decryption:
10. During decryption of the ciphertext, the same homomorphic function (AND or XOR) that was applied in step 9 during encryption is applied again.
11. The result obtained is deciphered using the private key, which belongs to the data owner.
12. Compare the original plaintext with the result obtained after decryption with the private key.
The input text is a combination of arbitrary text and numbers; it is used for the encryption process with a public key size of at least 107,583 bits and generates the ciphertext. (A minimal illustration of why XOR-based encryption of bit vectors behaves homomorphically, as used in steps 9–10, is sketched below.)
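As referenced above, the following minimal Python sketch shows why XOR-based encryption of bit vectors is homomorphic with respect to XOR: XOR-ing two ciphertext bit vectors yields a valid encryption of the XOR of the plaintexts under the combined key. It is only an illustration of the principle under that simplifying key-handling assumption, not the kryptnostic fhe-core construction; supporting AND as well would additionally require a multiplicative structure such as the integer scheme sketched in Sect. 1.1.

```python
import secrets

# Illustration: bitwise XOR encryption of bit vectors is XOR-homomorphic.
# E_k(m) = m XOR k, so E_k1(m1) XOR E_k2(m2) = (m1 XOR m2) XOR (k1 XOR k2),
# i.e. a valid encryption of m1 XOR m2 under the combined key k1 XOR k2.
def to_bits(text):
    return [int(b) for byte in text.encode() for b in format(byte, "08b")]

def xor_bits(a, b):
    return [x ^ y for x, y in zip(a, b)]

m1, m2 = to_bits("hello"), to_bits("world")
k1 = [secrets.randbits(1) for _ in m1]       # per-message random key bits
k2 = [secrets.randbits(1) for _ in m2]

c1, c2 = xor_bits(m1, k1), xor_bits(m2, k2)  # encrypt both bit vectors
c_xor = xor_bits(c1, c2)                     # homomorphic XOR on ciphertexts
k_xor = xor_bits(k1, k2)                     # combined decryption key

assert xor_bits(c_xor, k_xor) == xor_bits(m1, m2)   # matches plaintext XOR
```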
4 Experimental Results We made an attempt to increase the performance of the proposed system and used the latest System specifications for performing experimental on Obuntu 21.10 version of the Linux flavor operating system with 16 GB RAM, 4.3 Ghz processing speed of programmable processor GPU. Table 1 shows the time taken for performing the encryption and decryption process with respect to homomorphic computations and it reveals that the time taken for thje encryption and decryption process is directly proportional to the plaintext size where the public and private key are constant and are generated through the key generation function. The proposed framework suitable for performing homomorphic additive and multiplicative operations except 10 KB of plaintext size is restricted. Additive operation is only due to repeated noise added to the ciphertext. The variable of time is shown in the graph1 as illustrated in time seconds. The proposed work is analyzed and tested automatically by using scyther tool. The main aim of the scyther tool is automated analysis of security protocols with respect to security claims. By using this tool, we analyzed and reported our proposed framework to work under restrictions to all cryptanalysis and make more reliable and secure to Table 1 Experimental results of proposed work Sl. No
Plaintext size
Public key
Encryption time
Private key
Decryption time
Homomorphic operation
1
10 KB
107,121
0.0670 s
52,285
0.022 s
Additive and multiplicative
2
1 MB
107,261
1.584 s
53,280
0.879 s
Additive and multiplicative
3
50 MB
107,583
11.038 s
53,200
1.9680 s
Additive and multiplicative
4
100 MB
107,821
18.201 s
53,927
2.655 s
Additive and multiplicative
5
1 GB
115,250
20.154 s
54,300
3.536 s
Additive
Scyther is a formal automated analysis tool for cryptographic protocols that works under the perfect-cryptography assumption, i.e., every cryptographic function is assumed to be perfect. Scyther allows users to analyze multiple protocols in a single run, which is an advantage over other protocol analyzers, and it presents its results in a GUI; it offers more advanced features than existing formal automatic protocol verification tools. The tool provides a graphical user interface, shown in Fig. 4, that complements its command-line and Python scripting interfaces; the GUI is aimed at users interested in verifying or understanding a protocol. The input language of Scyther is SPDL (Security Protocol Description Language), which is based on the operational semantics found in [18]. SPDL can be used in three ways: to verify whether the security claims in the protocol description hold; to automatically generate appropriate security claims for a protocol and verify them; and to analyze the protocol by performing a complete characterization. Figure 5 shows the result of the formal analysis of the proposed security scheme: the scheme is verified, and no attacks were found within the bound on the threshold value R for key generation, which establishes the security claims for the symmetric key-based nth tensor power FHE scheme. The protocol successfully satisfies all Scyther claims for A (Initiator) and B (Responder), and no attacks are found. The authentication claims Alive, Weakagree, and Niagree are
Fig. 4 The GUI interface of the Scyther tool in Linux
Fig. 5 Results of Scyther tool security protocol verification claims
used to detect replay, reflection, and man-in-the-middle attacks. The secret key is the single confidentiality claim. Scyther can characterize protocols, yielding a finite representation of all possible protocol behaviors.
5 Comparative Analysis
The experimental results obtained with the proposed framework have been compared with the latest state-of-the-art approaches of Li et al. [15] and Cheon et al. [17], as shown in Table 2. Compared with the recent significant work of Li et al. [15], the supported input size is increased by 32 bits; after encryption, the generated ciphertext size is larger by 2.22 MB for a large amount of text; the noise rate is reduced by 0.16%; and the bootstrapping process (ciphertext refresh) is faster by 0.07 s. Compared with the work of Cheon et al. [17], the supported plaintext data size is increased, the noise rate is reduced by 1.02%, and the ciphertext refresh process (recrypt) is faster by 0.19 s.
Table 2 Comparison
Parameter | Cheon et al. [17] | Li et al. [15] | Proposed
Input size | 256 | 264 | 296
Ciphertext size | 1.00 MB | 1.21 MB | 3.43 MB
Rate of noise | 6.25% | 5.41% | 5.25%
Time taken for Recrypt | 0.42 s | 0.30 s | 0.23 s
6 Conclusion
We proposed new security model definitions that extend the traditional homomorphic encryption security notion and properly capture the passive-security requirement for homomorphic integer-based encryption schemes. The need for adequate security notions for approximate encryption shows that correctness and security are two essential issues of a cryptographic system that must be considered at the same time. We used the Scyther tool for automated security protocol verification; it supports the verification, falsification, and analysis of security protocols and helps assess compliance with security standards. In future work, we suggest implementing the FHE scheme for other formats such as PDF, image, audio, and video files, pursuing research directions that reduce the noise level after computations over ciphertext, and improving the bootstrapping mechanism.
References
1. Ruiqi L, Chunfu J (2020) A multi-key homomorphic encryption scheme based on NTRU. Acta Cryptologica Sinica 7(05):683–697
2. Rivest RL, Adleman L, Dertouzos ML (1978) On data banks and privacy homomorphisms. In: Foundations of secure computation. Academic Press, pp 169–179
3. Gentry C (2009) Fully homomorphic encryption using ideal lattices. In: Proceedings of the forty-first annual ACM symposium on theory of computing, pp 169–178
4. Armknecht F, Katzenbeisser S, Peter A (2012) Shift-type homomorphic encryption and its application to fully homomorphic encryption. In: Mitrokotsa A, Vaudenay S (eds) Progress in cryptology—AFRICACRYPT 2012. Lecture notes in computer science, vol 7374. Springer, pp 234–251
5. https://cryptographyacademy.com/elgamal/
6. Paillier P (1999) Public-key cryptosystems based on composite degree residuosity classes. In: International conference on the theory and applications of cryptographic techniques. Springer, pp 223–238
7. Boneh D, Gentry C, Halevi S, Wang F, Wu DJ (2013) Private database queries using somewhat homomorphic encryption. In: International conference on applied cryptography and network security. Springer, pp 102–118
8. Gentry C (2009) A fully homomorphic encryption scheme. Ph.D. thesis, Stanford University
9. van Dijk M, Gentry C, Halevi S, Vaikuntanathan V (2010) Fully homomorphic encryption over the integers. In: Advances in cryptology—EUROCRYPT 2010. Lecture notes in computer science, vol 6110. Springer, Berlin, pp 24–43
10. Coron J-S, Naccache D, Tibouchi M (2012) Public key compression and modulus switching for fully homomorphic encryption over the integers. In: Pointcheval D, Johansson T (eds) Advances in cryptology—EUROCRYPT 2012. Lecture notes in computer science, vol 7237. Springer, pp 446–464
11. Brakerski Z, Gentry C, Vaikuntanathan V (2011) Fully homomorphic encryption without bootstrapping. Electron Colloquium Comput Complexity (ECCC) 18:111
12. Gentry C, Sahai A, Waters B (2013) Homomorphic encryption from learning with errors: conceptually-simpler, asymptotically-faster, attribute-based. In: CRYPTO. Springer
13. Ducas L, Micciancio D (2014) FHEW: a fully homomorphic encryption library. GitHub
14. Chillotti I, Gama N, Georgieva M, Izabachene M (2016) Faster fully homomorphic encryption: bootstrapping in less than 0.1 seconds
15. Li B, Micciancio D (2020) On the security of homomorphic encryption on approximate numbers. IACR ePrint Archive 2020/1533
16. Biksham V, Vasumathi D (2020) A lightweight fully homomorphic encryption scheme for cloud security. Int J Inf Comput Secur 13(3/4):357–371
17. Cheon JH, Hong S, Kim D (2020) Remark on the security of CKKS scheme in practice. IACR ePrint Archive 2020/1581
18. Cremers CJF (2008) The Scyther tool: verification, falsification, and analysis of security protocols. In: Computer aided verification. Springer, pp 414–418
19. Biksham V, Vasumathi D (2020) An efficient symmetric key based lightweight fully homomorphic encryption scheme. In: Proceedings of the 3rd international conference on computational intelligence and informatics (ICCII-2018). Springer, Singapore, pp 417–425
Chapter 23
A Narrative Review of Students’ Performance Factors for Learning Analytics Models Dalia Abdulkareem Shafiq, Mohsen Marjani, Riyaz Ahamed Ariyaluran Habeeb, and David Asirvatham
1 Introduction Student dropout and high failure rates are one of the major problems faced by institutions during the teaching and learning process [1]. In education, teachers or course instructors rely on common traditional attributes to assess student performance. For example, certain attributes such as exam grades, class participation, Cumulative Grade Point Average (CGPA), and submission of assignments all can indicate the success of a student, however, these attributes often lack the understanding of student behaviors and their interaction with the courses offered. Other critical factors can be overlooked when assessing student success such as social and economic factors [2, 3]. In addition, with the increasing implications of an online (E-learning) environment, more factors will be introduced that could affect student performance such as engagement with the e-learning system, challenges of shifting to an online learning environment, etc. [4]. Before developing a predictive model, selecting data sources and the accurate factors to determine at-risk students is crucial. In institutions, student-related data can be retrieved from multiple different systems such as the central academic system or the Virtual Learning Environment, both systems can be independent and have no integration [5]. Therefore, it becomes important to extract raw data from each system and convert it into an appropriate dataset for analysis and predictions.
D. A. Shafiq (B) · M. Marjani · D. Asirvatham School of Computer Science, Taylor’s University, No 1, Jalan Taylor’s, 47500 Subang Jaya, Malaysia e-mail: [email protected] R. A. A. Habeeb Department of Information System, Faculty of Computer Science and Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_23
Students’ interactions with the LMS features differ based on the context it’s been used [6], for example, in blended learning where it will be a mix of on and off-campus activities and courses, universities with complete online courses, and finally, online assignments in a traditional (face-to-face) learning. Therefore, it’s vital to understand the variables or factors influencing student performance. For instance, in Massive Open Online Courses (MOOCs) such as Coursera, edX, and Udacity [7], independent variables can be collected using system and event logs. The benefits of such platforms range from free and open access to different courses from multiple prestigious universities. Additionally, MOOCs platforms encourage better learning for students using technology-based learning tools such as forums, quizzes, and online lectures, and so on. However, the retention rate is decreasing, and an often small percentage of students, rarely exceeding 25%, manage to complete the courses [7–9]. Dropout prediction has gained massive attention in the research community, especially in e-learning [5] and therefore, it becomes vital to understand what factors affect the performance of students for educators to provide timely feedback and early interventions to mitigate the risk. The rest of the paper is organized as follows. Section 2 presents a narrative literature review where the potential variables are discussed and analyzed. Section 3 provides a discussion of the review’s findings and the research gaps. Finally, Sect. 4 concludes the review and suggests points for future research.
2 Literature Review In this section, a narrative review of the potential factors is provided for three main categories: Behavioral, traditional, and participation and engagement factors. This section provides a study of researchers’ views from twenty (20) recent journal articles in the year 2017 to 2021 on the potential factors used in Learning Analytics for the online and blended learning environment to reveal the significant and insignificant factors (refer Fig. 1).
2.1 Behavioural Factors Umer et al. [7] focused on predicting the student performance by analyzing the digital traces they leave behind when pursuing a course. The authors here have performed process mining to extract features from the platform logs in addition to traditional features such as demographic, academic grades, submission of tasks, and time spent. Features extracted from logs often describe the behavior of students and the way they interact with online materials such as lecture videos. The approach is to compare students’ weekly activities with logs of the top student (with a grade higher than 90) and a fitness score is assigned for each student. Based on the result obtained from
Fig. 1 Taxonomy of learning analytics factors
feature ranking using Random Forest (RF), two important features are the time spent weekly on the platform and the lecture video activities (refer Fig. 2).
Fig. 2 Behavioral factors in an online learning environment [15]
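As an illustration of the kind of feature ranking referred to above, the scikit-learn sketch below fits a Random Forest to a synthetic table of LMS-style features and reads off the impurity-based importances. The feature names, the synthetic data, and the specific estimator settings are assumptions made only to show the mechanics; this is not the dataset or the exact procedure of Umer et al. [7].

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500
# Synthetic LMS-style features (hypothetical names, illustration only)
features = ["weekly_time_spent", "video_plays", "logins_per_week", "forum_posts"]
X = rng.normal(size=(n, len(features)))
# Synthetic pass/fail label driven mainly by time spent and video activity
y = (0.8 * X[:, 0] + 0.6 * X[:, 1] + 0.2 * rng.normal(size=n) > 0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
for name, score in sorted(zip(features, model.feature_importances_),
                          key=lambda p: p[1], reverse=True):
    print(f"{name:>20s}  {score:.3f}")   # higher score = more important feature
```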
The results are in line with the study provided by Wang et al. [10] where it has revealed that the time spent on a session in the online learning platform can significantly contribute to students’ performance. The findings revealed that Male students from science backgrounds typically have the highest probability of lower time spent per study session. Another study proved that the Time spent has also emerged as a significant feature according to the Gini Index feature selection [11] in a blended learning environment where students are required to access the platform regularly for 14 academic weeks in addition to their face-to-face lessons. However, the online behavior of students can be affected by other factors, and often finding a relationship between variables does not show those factors. For example, in contrast to previous studies, Saqr et al. [12] findings reveal that spending more time online does not necessarily indicate higher engagement from a particular student as he/she might leave a system module open for a period without actually engaging in any learning-related activities [13]. Additionally, the significance of the variables used can heavily rely on different course structures in different study disciplines. If students are not required to access the platform regularly, then time-related features may not be useful for prediction. Moreover, these findings are supported by Estacio and Raga [14] where spending longer time on a learning platform does not necessarily indicate the student’s success as the visualization results revealed that students attempting have higher average academic grades. Another variable that is exploited by researchers is the number of interactions. Queiroga et al. [5] extracted the interactions of students in VLE every two weeks. This approach helps to continuously monitor student performance and it can be generalizable to different courses due to the common factor used. The results obtained suggested that the number of interactions is the same for both failure and successful students, whereby students who are academically struggling tend to access the platform more to search for assistance. Similarly, successful students tend to interact regularly with VLE resources. Additionally, the number of logins into the system (total logins), amount of Moodle views [16] or course homepage views [17], and Moodle quiz attempts have proven to contribute significantly to students’ academic performance [16]. This has been supported by Kondo [18] where it is found that the number of logins is higher from week 0 to week 3 of the academic semester and it progressively increased in the latter weeks. Moreover, Jokhan et al. [19] confirmed that the average logins per week were statistically significant and retailed a strong predictive ability among other variables. Also, the number of interactions significantly rises during examination weeks [20]. Therefore, it is concluded here that by carefully monitoring the activities and behavioural changes of students weekly can help to assess their performance and design strategies and supporting plans catered to their curriculum needs. Furthermore, in an online environment, course instructors are most likely to deliver their classes through interactive online videos or lecture videos. It has been proven that video interaction can affect a student’s performance. Lu et al. [21] have identified seven critical factors such as the number of times students click “Play” or “Backward” during weekly video views. 
These findings are also supported by Hasan et al. [22] where the results from a Principal Component Analysis confirmed that actions
Table 1 Example of interactions classification [1]
Attribute | Description
Cognitive count | Interactions involving content access and visualization: File upload/download, VPL exercise, URL access
Teaching count | Interactions with the professor: comments on the files sent
Social count | Interactions with other students: forum participation
on video such as "Play", "Rewind/Rewatch", and "Paused" are related to student performance. According to the CN2 rules, students who play or pause a video three to seven times or more have a high chance, 62–83%, of failing the module. Finally, the results show that video interactions along with on/off-campus activities can further enhance learning. Cognitive and behavioural data are rarely addressed by current studies and are therefore worth investigating in the future. Students may be struggling but too afraid to approach instructors or seek help [20], so understanding the reasoning behind this behaviour is crucial to provide students with the right support. Since these factors cannot be extracted from an online or blended learning environment, Mitra and Le [20] used a survey questionnaire to obtain information regarding these attributes. Their study found that students who are more capable of managing their studies (have clear guidelines), have performance-oriented and achievable goals, and prefer a learning style that involves more hands-on experience are expected to succeed more in the course. This is an expected observation, as these types of students are typically motivated and more organized. Additionally, Mitra and Le [20] studied the personality traits of students and concluded that more social students exhibit good performance. Interestingly, the authors also indicated that students with a high level of academic self-concept, meaning individuals who believe they have enough preparation and high intellectual abilities, tend to show low performance. This can be explained psychologically, as individuals sometimes misjudge their own cognitive abilities. Buschetto Macarini et al. [1], in turn, included cognitive, social, and teaching factors as interaction counts of the activities related to these categories, as illustrated in Table 1. Although it is beneficial to extract the number of interactions from LMS activities, Umer et al. [23] suggest that these are not enough and that further investigation is required to understand the various kinds of online activities that differentiate multiple groups of students.
2.2 Traditional Factors Aside from behavioral factors, demographic variables can be an indicator of student performance in online learning environments. In this section, studies that utilized variables related to students’ backgrounds and demographic are reviewed.
Wang et al. [10] compared the performance of students according to their gender (Male and Female) at Shaanxi Normal University for MOOC where it included 2,936 students. In addition to behavioral factors, other variables were considered such as gender, subject background, and academic qualification. The results reveal that in general, Male students exhibit low performance in academics and learning, for example, they score low for online homework, practical skills, online tests, and the final written exam. This has been supported by other studies as well [24] where the findings indicate that female students have a higher probability of 1.2 times more obtaining passing grades compared to Male students. This concludes that gender can be a sign of academic success. Ethnicity and Region factors have also been identified as significant predictors of student performance [15, 25]. In a study conducted at the American University of Nigeria (AUN) [25] it is discovered that whether students are living closer to campus or further, they can still score higher where the results revealed that students living in the southwest (further from campus) have a higher chance (coefficient of 0.95) of achieving a passing CGPA. This has been proven by Kondo et al. [18] where the authors found that the “Attendance rate” for students is an important and significant variable for all weeks in the semester except for the first week which is the “orientation week”. As expected, attendance is an evident factor of whether the student is performing well or not regardless of the learning environment. In Rayasam’s [24] study, data visualization results reveal that 99% of students who took a leave of absence at the Massachusetts Institute of Technology (MIT) tend to drop out. We can conclude that the Attendance variable does help in identifying weak students, however, it may not always be the case. Students do have a choice of whether to attend classes or not, and often their attendance can be affected by factors such as health status and religious events [26]. One of the most common indicators of students’ academic performance in higher education is their Grades or Cumulative Grade Point Average (CGPA) [25]. Most lecturers rely on this factor to determine whether the student is performing poorly as students are expected to keep a CGPA of 2.0 and above. This is a common factor used by current research studies to identify at-risk students. Bañeres et al. [27] used student’s grades solely as input data to build a Gradual At-Risk (GAR) model that includes a threshold to identify the level of risk for a student and provide feedback accordingly with an accuracy of 95.7% for the K-Nearest Neighbours algorithm. This can indicate that grades are dominant indicators of student performance and success. Other studies have found that partial grades in a course or scores of quizzes and assignments or homework are highly influential on the student’s performance [13, 21, 28]. While grades may reflect a student’s knowledge and the level of engagement [14], however, using grades as an indicator of success fails to interpret and does not explain why students are performing this way and therefore may be a limiting factor to understand the student’s behaviour. Additionally, based on the literature it is found that the students’ activity score does not necessarily impact their performance as long as the student is actively participating in the course or other activities and has high
academic background [17]. Therefore, a combination of LMS data and scores can create a stronger prediction power of students’ academic performance [7]. Additionally, Cui et al. [13] reveal that factors relating to the assessment of students such as midterm and final exams are measures that indicate the knowledge acquired by students and not their overall engagement and interactions with peers and instructors. Therefore, such traditional factors fail to analyze and identify students failing to engage as expected through a course at an early stage. According to an initial exploratory analysis done by Mitra and Le [20], it has been revealed some background variables such as ethnicity and age have a very weak relationship with either of the outcome variables such as student success or session participation. This has been proven by Yakubu and Abubakar [25] where demographic data such as age is used to predict the student’s results. The findings revealed that age is not a predictor of student success as it contributed less to the prediction model with an odd ratio of approximately 1.06 based on the Logistic Regression model. This could be explained by the relatively low number of samples of students above 25 years old as the majority of the participants (96.88%) are within the youth age group for university. Therefore, a more balanced dataset is required to study the significance of age on student’s learning outcomes. Moreover, it is found that in graduate courses these static variables are considered to be less or not statistically significant [28] to predict at-risk students. The level of study is another useful factor to predict student’s results [25] where it is found that as students progress with their courses the difficulty level increases and therefore this might lead to higher dropout rates as the requirements for the next course are not met.
2.3 Participation and Engagement Factors In educational institutions, it is important to create more structured courses with interactional materials to keep the student more engaged as this could improve their academic performance [1]. Therefore, investigating these factors is essential. In this section, studies that utilized variables related to student participation and engagement are reviewed. One of the many online tools used to anticipate participation and discussion among students is Forums. Several studies considered this variable as the most significant predictor to indicate students’ participation in the online environment to predict their success and analyze student performance [6, 11, 17, 28–31]. Forum variable could include several activities extracted from logs to track how students engage with this feature in online learning tools such as accessing and reading forums, posting, and adding messages to it to promote discussion among students. This factor has remained persistent in the literature as a strong predictor to indicate the student’s participation on online platforms such as LMS. Each researcher incorporated the Forum factor differently, where many studies have extracted the raw number of posts, comments, and access times to forums [1, 5,
6, 15, 17, 28, 30, 31] whereas others have taken another approach such as comparing the total number of posts for a particular student in comparison with their peers [28] to analyze whether the student is at risk or not. Forum posts have been selected by many studies to show that they can have a significant effect on learning performance. This variable can indicate that students who are actively initiating and participating in forum discussions during the delivery of a course can perform better [29]. Another way that sparks communication among students and indicates strong participation in the course is the number of messages. Zacharis [32] proved that messages exchanged between students and their team members, instructors, and contributions made during teamwork activities are strong predictors of student success. According to the rules generated by the Tree model analysis, if the number of messages is high such as 172 messages, and constant contributions are made then the student is most likely to pass or succeed in the course. These factors are commonly applied in Blended Learning courses where interactions and communication skills are highly required. Factors relating to engagement have been proven to be a significant predictor of atrisk students [12] and their final grade [33]. Often these variables are obtained through students’ interaction with LMS materials where it is revealed that the frequency of students is more important to predict their engagement better than counting their interactions. This is mainly because public holidays could affect the student’s interaction. Additionally, student activity data is another type of engagement indicator and student’s efforts [13].
3 Discussion and Research Gaps Student dropout is a critical issue faced by most universities. There can be several reasons for students to drop out such as financial matters, and low quality of teaching [34]. Therefore, it becomes important to understand what factors affect student performance. In Fig. 3, the potential factors used for each prediction goal are illustrated. The prediction of at-risk students as early as possible can be achieved using accurate and significant variables. Based on this review, it is revealed that demographic, behaviour, LMS data, grades, and attendance data are mostly used in Learning Analytics, however, other critical factors are ignored. Therefore, we conclude three main research gaps to be addressed by researchers in the future: 1. Researchers should integrate other factors relating to student’s personalities and emotions [15]. 2. Moreover, it is revealed that most researchers address the retention problem by integrating factors relating to students only, while it has been proven that other factors relating to teachers and instructors can affect students’ performance
Fig. 3 Learning analytics factors and prediction goals
such as student-to-teachers relations, assigned instructors’ qualifications. These findings are in line with the recent review presented in [35]. 3. Lastly, as can be seen in Fig. 4, about 4% of the studies integrate factors that can be used for a model focused on multi-course, while 11% of studies focus on course-specific factors. These statistics reveal that future researchers should investigate factors that can be reused for the prediction of failure and successful students in multiple courses, to improve the generalizability of the models.
Fig. 4 Percentage of studies according to the focus of the model
4 Conclusion
The aim of this research was to discuss and review different views of researchers on the potential variables used in Learning Analytics for the prediction of at-risk and successful students. It is concluded that most researchers propose models intended to improve retention rates for a specific course, revealing that a more generalized model is required in the future. Moreover, factors relating to instructors' characteristics and to students' personalities and emotions remain largely unexplored. In future work, a predictive analytics model will be developed to compare and evaluate the effectiveness of potential variables in predicting failing and successful students, in order to provide experiment-based findings and conclusions.
Acknowledgements We would like to thank Taylor's University for giving us the opportunity to carry out this research under the Taylor's Research Excellence Scholarship 2021 and under the Grant IVERSON/2018/SOCIT/001.
References 1. Buschetto Macarini LA, Cechinel C, Batista Machado MF, Faria Culmant Ramos V, Munoz R (2019) Predicting students success in blended learning—evaluating different interactions inside learning management systems. Appl Sci 9(24):5523. https://doi.org/10.3390/app9245523 2. Tight M (2020) Student retention and engagement in higher education. J Furth High Educ 44(5):689–704. https://doi.org/10.1080/0309877X.2019.1576860 3. Romero C, Ventura S (2020) Educational data mining and learning analytics: an updated survey. WIREs Data Min Knowl Discov 10(3):1–21. https://doi.org/10.1002/widm.1355 4. Kibuku RN, Ochieng PDO, Wausi PAN (2020) e-Learning challenges faced by universities in Kenya: a literature review. Electron J e-Learn 18(2):150–161. https://doi.org/10.34190/EJEL. 20.18.2.004 5. Queiroga EM et al (2020) A learning analytics approach to identify students at risk of dropout: a case study with a technical distance education course. Appl Sci 10(11):3998. https://doi.org/ 10.3390/app10113998 6. Bravo-Agapito J, Romero SJ, Pamplona S (2021) Early prediction of undergraduate student’s academic performance in completely online learning: a five-year study. Comput Human Behav 115(June 2020):106595. https://doi.org/10.1016/j.chb.2020.106595 7. Umer R, Susnjak T, Mathrani A, Suriadi S (2017) On predicting academic performance with process mining in learning analytics. J Res Innov Teach Learn 10(2):160–176. https://doi.org/ 10.1108/JRIT-09-2017-0022 8. Mubarak AA, Cao H, Hezam IM (2021) Deep analytic model for student dropout prediction in massive open online courses. Comput Electr Eng 93. https://doi.org/10.1016/j.compeleceng. 2021.107271 9. Soffer T, Cohen A (2019) Students’ engagement characteristics predict success and completion of online courses. J Comput Assist Learn 35(3):378–389. https://doi.org/10.1111/jcal.12340 10. Wang G-H, Zhang J, Fu G-S (2018) Predicting student behaviors and performance in online learning using decision tree. In: 2018 seventh international conference of educational innovation through technology (EITT), pp 214–219.https://doi.org/10.1109/EITT.2018.00050 11. Akçapınar G, Altun A, A¸skar P (2019) Using learning analytics to develop early-warning system for at-risk students. Int J Educ Technol High Educ 16(1):40. https://doi.org/10.1186/ s41239-019-0172-z
12. Saqr M, Fors U, Tedre M (2017) How learning analytics can early predict under-achieving students in a blended medical education course. Med Teach 39(7):757–767. https://doi.org/10. 1080/0142159X.2017.1309376 13. Cui Y, Chen F, Shiri A (2020) Scale up predictive models for early detection of at-risk students: a feasibility study. Inf Learn Sci 121(3/4):97–116. https://doi.org/10.1108/ILS-05-2019-0041 14. Estacio RR, Raga RC Jr (2017) Analyzing students online learning behavior in blended courses using moodle. Asian Assoc Open Univ J 12(1):52–68. https://doi.org/10.1108/AAOUJ-012017-0016 15. Chen Y, Zheng Q, Ji S, Tian F, Zhu H, Liu M (2020) Identifying at-risk students based on the phased prediction model. Knowl Inf Syst 62(3):987–1003. https://doi.org/10.1007/s10115019-01374-x 16. Adejo OW, Connolly T (2018) Predicting student academic performance using multi-model heterogeneous ensemble approach. J Appl Res High Educ 10(1):61–75. https://doi.org/10. 1108/JARHE-09-2017-0113 17. Helal S et al (2018) Predicting academic performance by considering student heterogeneity. Knowl-Based Syst 161(July):134–146. https://doi.org/10.1016/j.knosys.2018.07.042 18. Kondo N, Okubo M, Hatanaka T (2017) Early detection of at-risk students using machine learning based on LMS log data. In: 2017 6th IIAI international congress on advanced applied informatics (IIAI-AAI), pp 198–201. https://doi.org/10.1109/IIAI-AAI.2017.51 19. Jokhan A, Sharma B, Singh S (2019) Early warning system as a predictor for student performance in higher education blended courses. Stud High Educ 44(11):1900–1911. https://doi. org/10.1080/03075079.2018.1466872 20. Mitra S, Le K (2019) The effect of cognitive and behavioral factors on student success in a bottleneck business statistics course via deeper analytics. Commun Stat—Simul Comput 0(0):1–30. https://doi.org/10.1080/03610918.2019.1700279 21. Lu OHT, Huang AYQ, Huang JCH, Lin AJQ, Ogata H, Yang SJH (2018) Applying learning analytics for the early prediction of students’ academic performance in blended learning. Educ Technol Soc 21(2):220–232 22. Hasan R, Palaniappan S, Mahmood S, Abbas A, Sarker KU, Sattar MU (2020) Predicting student performance in higher educational institutions using video learning analytics and data mining techniques. Appl Sci 10(11):3894. https://doi.org/10.3390/app10113894 23. Umer R, Mathrani A, Susnjak T, Lim S (2019) Mining activity log data to predict student’s outcome in a course. In: Proceedings of the 2019 international conference on big data and education—ICBDE’19, pp 52–58. https://doi.org/10.1145/3322134.3322140 24. Rayasam AS (2020) Predicting at-risk students from disparate sources of institutional data. Massachusetts Institute of Technology 25. Yakubu MN, Abubakar AM (2021) Applying machine learning approach to predict students’ performance in higher educational institutions. Kybernetes. https://doi.org/10.1108/K-122020-0865 26. Gray CC, Perkins D (2019) Utilizing early engagement and machine learning to predict student outcomes. Comput Educ 131(July 2018):22–32. https://doi.org/10.1016/j.compedu. 2018.12.006 27. Bañeres D, Rodríguez ME, Guerrero-Roldán AE, Karadeniz A (2020) An early warning system to detect at-risk students in online higher education. Appl Sci 10(13):4427. https://doi.org/10. 3390/app10134427 28. Bainbridge J, Melitski J, Zahradnik A, Lauría EJM, Jayaprakash S, Baron J (2018) Using learning analytics to predict at-risk students in online graduate public affairs and administration education. J Public Aff Educ 21(2):247–262. 
https://doi.org/10.1080/15236803.2015.12001831 29. Mwalumbwe I, Mtebe JS (2017) Using learning analytics to predict students' performance in Moodle learning management system: a case of Mbeya university of science and technology. Electron J Inf Syst Dev Ctries 79(1):1–13. https://doi.org/10.1002/j.1681-4835.2017.tb00577.x
30. Glavas M, Bakaric MB, Matetic M (2018) Applying advanced linear models in the task of predicting student success. In: 2018 41st international convention on information and communication technology, electronics and microelectronics (MIPRO) (May 2019):0744–0748. https:// doi.org/10.23919/MIPRO.2018.8400138 31. Han W, Jun D, Kangxu L, Xiaopeng G (2017) Investigating performance in a blended SPOC. In 2017 IEEE 6th international conference on teaching, assessment, and learning for engineering (TALE), vol 1, no 12, pp 239–245. https://doi.org/10.1109/TALE.2017.8252340 32. Zacharis NZ (2018) Classification and regression trees (CART) for predictive modeling in blended learning. Int J Intell Syst Appl 10(3):1–9. https://doi.org/10.5815/ijisa.2018.03.01 33. Shelton BE, Hung J-L, Lowenthal PR (2017) Predicting student success by modeling student interaction in asynchronous online courses. Distance Educ 38(1):59–69. https://doi.org/10. 1080/01587919.2017.1299562 34. Behr A, Giese M, Teguim Kamdjou HD, Theune K (2020) Dropping out of university: a literature review. Rev Educ 8(2):614–652. https://doi.org/10.1002/rev3.3202 35. Shafiq DA, Marjani M, Habeeb RAA, Asirvatham D (2022) Student retention using educational data mining and predictive analytics: a systematic literature review. IEEE Access 10:72480– 72503. https://doi.org/10.1109/ACCESS.2022.3188767
Chapter 24
Optimum Design of Base Isolation Systems with Low and High Damping Ayla Ocak, Sinan Melih Nigdeli, and Gebrail Bekdaş
1 Introduction
Seismic isolators are systems that are generally placed at the base of the structure; they are quite rigid in the vertical direction while allowing movement in the horizontal direction. In the face of dynamic loads such as earthquakes, the isolators move horizontally and try to break the link between the structure and the vibrations created by the dynamic load. To ensure vibration-free behavior of the superstructure, the isolators are required to be ductile. However, extremely ductile behavior of isolator systems designed according to the seismicity of the region may put building safety at risk in the face of an unusual earthquake [1, 2]. Considering this, the ductility of the isolator should be limited with possible undesigned earthquake scenarios in mind. Ductility can be adjusted depending on the period and mass of the isolator, and the mass of an isolator added to the structure can be thought of as the weight of one floor of the structure. Choosing the optimum damping ratio is another way to increase ductility. Setting the isolator parameters correctly, so that the control is as efficient as possible, makes an optimization process necessary for the seismic isolator. Metaheuristic algorithms are an optimization method that can be easily applied to engineering problems thanks to their simple and understandable structure inspired by nature and the behavior of living beings. These algorithms contain design factors related to the natural processes and behaviors that inspired them and are usually named after them [3–12]. In recent years, such algorithms have been frequently preferred for the optimization of isolator parameters [13–20].
A. Ocak · S. M. Nigdeli (B) · G. Bekdaş Department of Civil Engineering, Istanbul University-Cerrahpaşa, 34320 Avcılar, Istanbul, Turkey e-mail: [email protected]
G. Bekdaş e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_24
The flower pollination algorithm (FPA) is an algorithm developed by Yang in 2012, inspired by the pollination process of flowers [10]. In addition to applications to engineering problems, there are studies in which it is used for damper optimization in structural control problems [21–25]. In this study, an earthquake simulation containing the FEMA P-695 records, created in Matlab, was applied to a single degree of freedom structure model, and the properties of the isolator added to the structure base were optimized [26, 27]. The base isolator, optimized using the Flower Pollination Algorithm, was analyzed under a 30 cm displacement limit with damping limits of 10% and 40%. Optimum isolator parameters were obtained for the 30 cm displacement constraint at the two damping limits, representing low and high damping, so that the effect of the different damping levels on the optimum base isolator values could be compared. The displacement and total acceleration reduction performance of the system under the critical earthquake record, obtained using the optimum isolator properties, was then compared.
2 Method
In this section, the parameters and equations of motion of the seismic base isolator and the equations and objective function of the Flower Pollination Algorithm used in the optimization process are presented. Seismic isolators are generally placed at the base of the structure and designed to be rigid in the vertical direction and movable in the horizontal direction. Assuming the isolator acts as an additional floor with freedom of movement, its weight can be taken as the weight of one floor of the structure. The total mass (m_total) of the isolated structure is obtained from the isolator mass (m_b) and the structure mass (m_structure) as in Eq. 1.
m_total = m_b + m_structure   (1)
When a seismic isolator is placed on the base, the isolator and the structure can act together as a single degree of freedom system. In such systems, the structure and isolator share a common period, stiffness, and damping coefficient. Equations 2, 3, and 4 show the period (T_b), stiffness (k_b), and damping coefficient (c_b) of the whole system, respectively.
T_b = 2π / w_b   (2)
k_b = m_total × w_b²   (3)
c_b = 2 × ζ_b × m_total × w_b   (4)
In these equations, ζ_b represents the damping ratio of the system and w_b represents the natural angular frequency of the system. The equation of motion of the system with a single degree of freedom structure and a moving isolator is given in Eq. 5.
m_total Ẍ + c_b Ẋ + k_b X = −m_total Ẍ_g   (5)
In Eq. 5, X represents the displacement of the system, Ẋ its velocity, and Ẍ its acceleration; the ground acceleration is represented by Ẍ_g. Isolator ductility is one of the most important parameters in isolator design. In systems where the isolator and the structure move together, the energy of the dynamic load should be dissipated quickly, yet the displacement should not become too large, since the structure moves together with the isolator. Considering this, the isolator period and damping ratio, which affect ductility and damping, are the parameters optimized in isolator optimization. The objective function of the optimization is to reduce the maximum total acceleration of the system, as given in Eq. 6, while the displacement is introduced as a design constraint through the function shown in Eq. 7.
f(x) = max(|Ẍ|)   (6)
g(x) = max(|X|)   (7)
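To make Eqs. (1)–(7) concrete, the Python sketch below computes k_b and c_b for a candidate design (T_b, ζ_b), integrates the equation of motion (Eq. 5), and returns the two quantities the optimization works with: the maximum total acceleration (objective, Eq. 6) and the maximum displacement (constraint, Eq. 7). The masses follow the model described later in Sect. 3 (10 floors of 360 t, with the isolator mass assumed equal to one floor as noted in Sect. 1). Since the FEMA P-695 records are not reproduced here, a simple sinusoidal pulse stands in for the ground acceleration, and SciPy's general-purpose integrator replaces the authors' Matlab Simulink model; both substitutions are assumptions made only for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

def isolator_response(Tb, zeta_b, t, ag):
    """Max displacement and max total acceleration of the base-isolated
    SDOF system (Eq. 5) for a given period Tb [s] and damping ratio zeta_b."""
    m_structure = 10 * 360e3            # 10 floors x 360 t (kg)
    m_b = 360e3                         # isolator mass taken as one floor (kg)
    m_total = m_structure + m_b         # Eq. (1)
    wb = 2 * np.pi / Tb                 # from Eq. (2)
    kb = m_total * wb**2                # Eq. (3)
    cb = 2 * zeta_b * m_total * wb      # Eq. (4)

    ag_t = lambda ti: np.interp(ti, t, ag)   # ground acceleration record

    def rhs(ti, y):                     # state y = [X, X_dot]
        x, v = y
        return [v, -(cb * v + kb * x) / m_total - ag_t(ti)]

    sol = solve_ivp(rhs, (t[0], t[-1]), [0.0, 0.0], t_eval=t, max_step=0.01)
    x, v = sol.y
    total_acc = -(cb * v + kb * x) / m_total   # X_ddot + ag (absolute acceleration)
    return np.max(np.abs(x)), np.max(np.abs(total_acc))

# Illustrative ground motion: a 0.3 g, 1 Hz sine pulse (not a FEMA P-695 record)
t = np.linspace(0.0, 20.0, 4001)
ag = 0.3 * 9.81 * np.sin(2 * np.pi * 1.0 * t) * (t < 10.0)

max_disp, max_acc = isolator_response(Tb=2.69, zeta_b=0.40, t=t, ag=ag)
print(max_disp, max_acc)               # g(x) and f(x) for this candidate design
```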
In an optimization with metaheuristic algorithms, the properties and design constraints of the problem are first defined. The solution generation process begins with the introduction of the design factors of the algorithm. According to the algorithm equations used, new solutions are produced, and solutions that are better than the old ones replace them. The optimization is completed by repeating these operations for the defined number of iterations. The flower pollination algorithm is a metaheuristic algorithm developed by Yang, inspired by the pollination process of flowers [10]. Pollination in flowers happens in two ways: self-pollination, and pollination with the help of a pollinator. In an optimization with the flower pollination algorithm, the limits of the design variables to be optimized are introduced as in Eq. 8.
X_min < X_i,j < X_max,  i = 1, 2, …, nf,  j = 1, 2, …, n   (8)
In Eq. 8, n indicates the number of variables, nf the population size, and X_min and X_max the lower and upper limits of the variable. After the design limits are determined, the initial solutions are randomly assigned within the limit values as in Eq. 9, and the results are recorded in the solution matrix.
X_i,j = X_min + rand(X_max − X_min)   (9)
New solutions are produced by two processes, local and global search. The choice between them is made by the algorithm-specific design factor called the switch probability (sp): if a random number is greater than the switch probability value, the local search process is used, and if it is smaller, the global search process is used. In the local case, new solutions are produced with the help of the equation given in Eq. 10.
X_new = X_i,j + rand(0, 1)(X_k − X_t)   (10)
X new given here denotes the new solution matrix. A new solution matrix is obtained by using two randomly selected solutions (X k , X t ) from the previously created solution matrix. If the search process is global, the solution generation equation is used such as Eq. 11, where the best solution (X best ) in the previous solution matrix is produced with the help of the Lévy flight equation (L) in Eq. 12, which is derived from the flight of pollinators such as birds and bees. X new = X i, j + L(X best − X i, j )
(11)
$$L = \frac{1}{\sqrt{2\pi}}\,\varepsilon^{-1.5}\,e^{-\frac{1}{2\varepsilon}}$$
(12)
Each newly produced solution is compared with the old one, and inferior solutions are replaced. New solutions are generated for the defined number of iterations, and the optimum values are obtained through these successive updates.
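The complete search loop described above can be sketched as follows. This is a hedged illustration of the flower pollination algorithm rather than the authors' code: it follows the switch rule exactly as stated in the text (a random number greater than sp triggers local pollination), uses Eqs. 9–12, and assumes an `evaluate` function that returns a penalized objective for one candidate design; the sampling of ε in the Lévy step is an assumption.

```python
import numpy as np

def levy_step(size):
    """Approximate Lévy flight term as in Eq. 12 (epsilon drawn uniformly; assumed)."""
    eps = np.random.uniform(0.05, 1.0, size)
    return (1.0 / np.sqrt(2 * np.pi)) * eps ** -1.5 * np.exp(-1.0 / (2 * eps))

def fpa_optimize(evaluate, x_min, x_max, n_f=10, sp=0.5, iters=100):
    """Minimal flower pollination loop (Yang [10]) for a penalized objective."""
    n = len(x_min)
    X = x_min + np.random.rand(n_f, n) * (x_max - x_min)       # Eq. 9
    F = np.array([evaluate(x) for x in X])
    for _ in range(iters):
        best = X[np.argmin(F)]
        for i in range(n_f):
            if np.random.rand() > sp:                          # local pollination, Eq. 10
                k, t = np.random.choice(n_f, 2, replace=False)
                x_new = X[i] + np.random.rand() * (X[k] - X[t])
            else:                                              # global pollination, Eq. 11
                x_new = X[i] + levy_step(n) * (best - X[i])
            x_new = np.clip(x_new, x_min, x_max)               # keep within Eq. 8 limits
            f_new = evaluate(x_new)
            if f_new < F[i]:                                   # keep only improvements
                X[i], F[i] = x_new, f_new
    i_best = np.argmin(F)
    return X[i_best], F[i_best]
```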
3 Results
In this study, a seismic base isolator was added to a single-degree-of-freedom structure and its behavior under earthquake loading was investigated. FEMA far-fault earthquake records were applied to the structure through an earthquake simulation created in Matlab Simulink [26, 27]. The 10-story structure model, which moves as a single degree of freedom, is shown in Fig. 1. Each floor of the structure model weighs 360 tons, has a stiffness of 650 MN/m, and a damping coefficient of 6.2 MNs/m [28]. The period and damping ratio of the base-isolated structure, behaving as a single degree of freedom (SDOF) under earthquake load, were optimized by the flower pollination algorithm (FPA). The optimization was run with the period of the system between 1 and 5 s and the damping ratio between 1–10% and 1–40%.
Fig. 1 a SDOF structure model with base isolator b 3D view of the model
Table 1 Optimum results

Damping ratio limit | Tb (s)  | ζb (%)
For 10%             | 1.8240  | 10.00
For 40%             | 2.6866  | 40.00
The switch probability value is 0.5, the population size is 10, and the maximum number of iterations is 100. In the optimization, critical earthquake analyses were performed using the optimum values obtained for the 10% and 40% damping ratio limits and a 30 cm displacement limit.
Table 2 The displacement and total acceleration values obtained for the critical earthquake record

                     | Displacement (m) | Total acceleration (m/s²)
With isolator (10%)  | 0.1635305        | 1.9718045
With isolator (40%)  | 0.1245290        | 1.5483557
Without isolator     | 0.4101091        | 19.2833064
Fig. 2 Displacement and total acceleration graphs under critical earthquake analysis for a 10% damping ratio
The optimum system period and damping ratio obtained from the optimization are shown in Table 1. Critical earthquake analysis was then performed using the record that is critical for the structure without an isolator, with the optimum values obtained for the 10% and 40% damping ratio limits and the 30 cm displacement constraint. The resulting displacement and total acceleration values of the isolated structure are shown in Table 2. The displacement and total acceleration histories of the isolated system under the critical earthquake, for the 30 cm displacement limit, are shown in Fig. 2 for the 10% damping limit and in Fig. 3 for the 40% damping limit.
4 Discussion and Conclusion
This study focuses on the optimization, with the flower pollination algorithm (FPA), of an SDOF structure model with a seismic base isolator subjected to earthquake loads. Critical earthquake analysis was performed for the 10-story structure model behaving as an SDOF system. The displacement and total acceleration reduction percentages of the isolated structure under the critical earthquake record are given in Table 3.
Fig. 3 Displacement and total acceleration graphs under critical earthquake analysis for a 40% damping ratio
Table 3 Structure displacement and total acceleration reduction percentages with isolator for a 10-story structure

Damping ratio | Displacement (%) | Total acceleration (%)
10%           | 60.13            | 89.77
40%           | 69.64            | 91.97
When Table 3 is examined, it is seen that the highly damped isolator performs better in reducing displacement and in overall acceleration reduction than the low-damping case. For the same displacement limit, the highly damped system performs approximately 10% better in reducing the displacement than the low-damped one, while the two cases show very similar performance in reducing the total acceleration. For both damping limits, the isolated system reduced the structural displacement by 60% or more and the total acceleration by about 90%. Increasing the damping ratio of the isolator also lengthened the period of the optimized isolator system. In both cases, however, the optimum damping ratio settled at the defined upper limit. The analysis results in Table 3 were obtained for the critical earthquake record of the structure without an isolator; adding the base isolator changes which record is critical for the combined system. For the FEMA P-695 far-fault set, which includes 22 earthquakes with 2 components each, the displacement of the isolated system at the 10% and 40% damping limits is plotted against the record number in Fig. 4, and the total acceleration in Fig. 5. As Fig. 4 shows, for the 30 cm displacement limit the low-damped isolator caused the structural displacement to increase in several records, such as records 18, 28, and 30, whereas the highly damped system limited the movement of the structure and kept the displacement below the fixed-base value in almost all records. According to Fig. 5, the total acceleration of the isolated system was much lower than that of the fixed-base structure in all records for both damping levels. When all the results are examined, the following conclusions are reached:
Fig. 4 Structure displacement–earthquake record number graph with and without isolator, obtained from all earthquake records for the 10% and 40% damping ratio limit
Fig. 5 Structure total acceleration–earthquake record number graph with and without isolator, obtained from all earthquake records for the 10% and 40% damping ratio limit
• Increasing the damping ratio of the isolator system increased the displacement and total acceleration reduction performance of the seismic base isolator.
• The highly damped isolator provided approximately 10% better displacement control than the low-damped one; for a fixed displacement limit, the increase in damping ratio has remarkable effects on the displacement reduction performance of the structure.
• Increasing the damping ratio of the isolator system affects the optimization result and extends the period of the system.
• When the optimization results are examined, the damping ratio obtained for both the high- and low-damped isolators settles at the defined limit values. Based on this,
it can be said that the restrictiveness of the 30 cm displacement limit may explain why the upper limit defined for the damping ratio was selected as the optimum. In light of all the data, the optimization of isolator systems is effective in reducing both the structural displacement and the total acceleration.
References 1. Bao Y, Becker TC (2018) Inelastic response of base-isolated structures subjected to impact. Eng Struct 171:86–93 2. Sheikh H, Van Engelen NC, Ruparathna R (2022) A review of base isolation systems with adaptive characteristics. In Structures, vol 38. Elsevier, pp 1542–1555 3. Holland JH (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor, MI 4. Kennedy J, Eberhart RC (1995) Particle swarm optimization. In Proceedings of IEEE international conference on neural networks no. IV, 27 November–1 December Perth, IEEE Service Center, Piscataway, NJ, 1942–1948 5. Dorigo M, Maniezzo V, Colorni A (1996) The ant system: an autocatalytic optimizing process. IEEE Trans Syst Man Cybern B 26:29–41 6. Geem ZW, Kim JH, Loganathan GV (2001) A new heuristic optimization algorithm: harmony search. SIMULATION 76(2):60–68 7. Karabo˘ga D (2005) An idea based on honey bee swarm for numerical optimization, vol 200, pp 1–10. Technical report-tr06, Erciyes university, engineering faculty, computer engineering department 8. Yang XS (2010) A new metaheuristic bat-inspired algorithm. In: Nature-inspired cooperative strategies for optimization (NICSO 2010). Springer, Berlin, Heidelberg, pp 65–74 9. Rao RV, Savsani VJ, Vakharia DP (2011) Teaching-Learning-based optimization: a novel method for constrained mechanical design optimization problems. Comput Aided Des 43:303–315 10. Yang XS (2012) Flower pollination algorithm for global optimization, Lecture Notes in Computer Science. In: Durand-Lose J, Jonoska N, 27, Springer, London, 7445, pp 240–249 11. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61 12. Mirjalili S, Lewis A (2016) The whale optimization algorithm. Adv Eng Softw 95:51–67 13. Zou XK, Chan CM (2001) Optimal drift performance design of base-isolated buildings subject to earthquake loads. WIT Trans The Built Environ 54 14. Iemura H, Taghikhany T, Jain SK (2007) Optimum design of resilient sliding isolation system for seismic protection of equipment. Bull Earthq Eng 5(1):85–103 15. Jangid RS (2008) Equivalent linear stochastic seismic response of isolated bridges. J Sound Vib 309(3–5):805–822 16. Bucher C (2009) Probability-based optimal design of friction-based seismic isolation devices. Struct Saf 31(6):500–507 17. Dicleli M, Karalar M (2011) Optimum characteristic properties of isolators with bilinear forcedisplacement hysteresis for seismic protection of bridges built on various site soils. Soil Dyn Earthq Eng 31(7):982–995 18. Weber F, Distl H, Braun C (2016) Isolation performance of optimized triple friction pendulum. Int Ref J Eng Sci 5:55–69 19. Charmpis DC, Komodromos P, Phocas MC (2012) Optimized earthquake response of multistory buildings with seismic isolation at various elevations. Earthquake Eng Struct Dynam 41(15):2289–2310 20. Ocak A, Nigdeli SM, Bekda¸s G, Kim S, Geem ZW (2022) Optimization of seismic base isolation system using adaptive harmony search algorithm. Sustainability 14(12):7456
21. Nigdeli SM, Bekda¸s G, Yang XS (2016) Application of the flower pollination algorithm in structural engineering. In: Metaheuristics and optimization in civil engineering. Springer, Cham pp 25–42 22. Mergos PE (2021) Optimum design of 3D reinforced concrete building frames with the flower pollination algorithm. J Build Eng 44:102935 23. Yücel M, Bekda¸s G, Nigdeli SM (2022) Metaheuristics-based optimization of TMD parameters in time history domain. In: Optimization of tuned mass dampers. Springer, Cham, pp 55–66 24. Bekda¸s G, Kayabekir AE, Nigdeli SM, Toklu YC (2019) Transfer function amplitude minimization for structures with tuned mass dampers considering soil-structure interaction. Soil Dyn Earthq Eng 116:552–562 25. Ocak A, Bekda¸s G, Nigdeli SM (2022) A metaheuristic-based optimum tuning approach for tuned liquid dampers for structures. Struct Design Tall Spec Build 31(3):e1907 26. The MathWorks (2018) Matlab R2018a. Natick, MA 27. FEMA P-695, Quantification of building seismic performance factors. Washington 28. Singh MP, Singh S, Moreschi LM (2002) Tuned mass dampers for response control of torsional buildings. Earthquake Eng Struct Dynam 31(4):749–769
Chapter 25
The General View of Virtual Reality Technology in the Education Sector Ghaliya Al Farsi , Azmi bin Mohd. Yusof , Ragad Tawafak , Sohail Malik Iqbal , Abir Alsideiri , Roy Mathew , and Maryam AlSinani
1 Introduction
This paper reviews research on the use of VR in real work environments and analyzes the application areas of VR in practice and in education. Design, remote collaboration, and training are regarded as VR application areas with the potential to overcome practical obstacles. Virtual reality (VR) technology is one of the most important technologies of the modern era; although it is not a new technology, as some think, rapid technological development has greatly advanced it, since VR allows people to experience things that may be difficult to experience in the real world, or that may be completely fictional (Alfarsi et al. [3]; El-Masri and Tarhini [6]).

G. Al Farsi (B) · A. Mohd. Yusof · A. Alsideiri, College of Graduate Studies, Universiti Tenaga Nasional, Kajang, Malaysia, e-mail: [email protected]; A. Alsideiri, e-mail: [email protected]
G. Al Farsi · R. Tawafak · S. M. Iqbal · A. Alsideiri · R. Mathew · M. AlSinani, Al Buraimi University College, Muscat, Oman, e-mail: [email protected]; S. M. Iqbal, e-mail: [email protected]; R. Mathew, e-mail: [email protected]; M. AlSinani, e-mail: [email protected]
M. AlSinani, Universiti Kebangsaan Malaysia, Bangi, Malaysia
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_25
Virtual reality is a term given to a computer's simulation of environments that can be physically simulated in the real world. It is the experience of living in a reality that does not exist, or that exists somewhere but is not accessible for one reason or another. It is a reality built using technical devices and a computer processor, which allows the user to experience a three-dimensional world that can be very close to, and sometimes amazingly similar to, reality; entering this world now requires only simple tools that have become accessible to everyone. Virtual reality systems can provide a 3D experience for more than one participant; however, they still have limited capabilities for interaction between the participants.

Virtual reality technology is used in many fields. In the medical field, patient rehabilitation with VR is based on real-world simulation to meet many requirements for effective intervention, using a video game console together with a motion sensor. In collaboration and therapy, virtual reality has been used for psychotherapy, occupational therapy, and virtual rehabilitation; patients receiving VR therapy move through digitally created environments and often complete tasks designed specifically to treat a particular condition. In work practices and knowledge transfer, the technology ranges from a simple computer and keyboard to a modern virtual reality headset (a display worn on the head or as part of a helmet). In transfer assessment, VR is a widely used alternative form of exposure therapy.

Distance education has also raised several challenges that VR can help address. Loss of educational resources: because students cannot attend schools and universities, it has become difficult to access resources that support distance education, including libraries and reference materials, as well as technical facilities such as equipment, gyms, and musical instruments [1]. Increased responsibility of parents: one of the biggest challenges of distance education is that parents must help their children to bridge the gap left by the absent teacher, yet some of them lack sufficient qualifications, which puts great pressure on them; many have never used the programs employed in distance learning, and some have resorted to hiring private tutors, adding further expense [2, 8]. Providing a suitable place to study: many families cannot provide a room for each student, and with many jobs now also performed remotely at a daily pace, rooms are needed for parents as well, which is almost impossible since not everyone owns a large house [3].

VR applications help students develop their professional skills by allowing them to take various training courses in addition to e-learning; they also give older students a good opportunity to search for suitable jobs, to learn through coordination with one another, and to manage their time and develop their intellectual skills [27].
Practical and interactive learning benefits more students and helps them accept and understand information smoothly [4]. Virtual reality applications in education provide exactly this by creating an exceptional interactive learning experience for scientific and historical material: students can view and interact with historical monuments or outer space and carry out scientific experiments in the virtual world. VR also offers a secluded environment in which students and lecturers from different parts of the world can learn and discuss in an appropriate atmosphere without the need to travel, such as the Virtual Speech application for training communication and public-speaking skills. Virtual field trips to historical places and monuments, and experiences of learning and practicing languages in a dedicated virtual world, are among the most important applications in education; the ability to interact and discuss with teachers and lecturers in real time, and the ease of cooperation with colleagues in an isolated virtual world, are among the most important features of applying virtual reality technology in education [7, 9, 11, 14, 24].

The goal of such applications is to provide a variety of educational services that keep pace with continuous technological change, to facilitate the educational process and to give all its participants (teachers, students, academic coordinators, and system administrators) the capabilities they need [3, 21]. The development and application of VR technology has encouraged technological innovation in education and in the preservation of cultural heritage and the related cultural and creative industries; it has become an important direction of academic and industrial research, and the development of educational software based on digital technology is important for the development of community groups. The educational experience platform is divided into four sections: (1) learning in a virtual showroom, (2) building cognitive learning, (3) experiencing innovative design, and (4) outputting the model.

Virtual, imagined, or latent reality (VR) is a term that applies to computer simulations of environments that can be physically simulated in some places in the real world or in fictional worlds. Newer virtual reality environments are primarily visual experiences, displayed either on a computer screen or through a special stereoscopic display, but some simulations include additional sensory information such as sound through speakers or headphones. Some advanced tactile systems, providing haptic information commonly known as force feedback, are used in medical applications and electronic games [22]. Furthermore, virtual reality covers remote communication environments that give users a virtual presence based on telepresence concepts, either through standard input devices such as a keyboard and mouse or through multimedia devices such as wired gloves and omnidirectional treadmills.
The simulated environment may not resemble the real world, because of the difficulty of creating a truly lifelike experience, for example in fighter-pilot or training simulations, or it may differ significantly from reality by design, as in virtual reality games [17]. It is currently very difficult
to create high-resolution virtual reality, largely because of technical limitations on processing power, image resolution, and communication bandwidth; proponents of the technology hope that processors, imaging, communications, and information technologies will become more capable and cost-effective over time. To start, it is important to define the concept of virtual reality. Definitions of virtual reality indicate that it is a computer representation that creates a perception of a world that appears to our senses similar to the real world. Through virtual reality, information and experiences can be transferred to the mind in an attractive and more interactive way. It is also defined as a medium consisting of interactive computer simulations in which the user senses the place and the actions; it is used in games that make the person feel part of the game, as if what is happening on the screen were really happening to him: flying, for example, or climbing a tree, or driving a car at high speed and being able to slow it down. The idea of virtual reality systems is to convince the user that what is happening is real, even if it is not real or actual. Despite his optimism about the future of this technology and the diversity of its applications in libraries of all kinds, within the services they provide and the activities they perform, the author cited has warned of the need to maintain awareness when employing this technology, writing: "One of us should not look at these systems as a way to escape from the actual reality in which we live to another, alternative reality that we formulate ourselves. Rather, we must deal with them as a tool that enables us to create an artificial world that will increase our understanding and awareness of the dimensions and requirements of the real world in which we live and find optimal solutions to the difficulties and problems we face in it." The author largely agrees with Muhonen et al. [3].
2 Literature Review
Dembski et al. [16] recorded the interaction of participants in a virtual work environment and used the virtual reality recordings as research data, comparing them with factors that may have affected movement in real and virtual life. This, in turn, helps in estimating investment development, modernization, and economic effects, and their increase or decrease. The virtual motion recordings were obtained by Dembski et al. [16], where random numbers from a digital twin were used to order the runs. Rehabilitation using VR technology is based on virtual reality simulation to meet many requirements for effective intervention, using a video game console together with a motion sensor. Scientific studies have demonstrated the efficiency of this innovative technology in the rehabilitation of patients and the treatment of many conditions. Virtual reality is a safe technology and an effective alternative to traditional treatment for patients suffering from balance disorders; patients reported that they enjoyed the virtual reality treatment and did not suffer any side effects.
The case reported in this research, in terms of design effort, is considered an important contribution to raising the level of achievement and education in various fields of society and for all groups [20, 30]. Virtual reality is one of the important technological applications considered key to progress in various fields and to advancing digitization in all areas of life, together with its challenges and development [17–19]. According to a report by Orbis Research (2022), the global VR market was valued at $3.13 billion in 2022 and is expected to reach $49.7 billion by 2023. So far, it is being developed in various fields, especially in places where people with different disabilities live, such as hospitals for the elderly and disabled, prisons, and military barracks.

Augmented reality, by contrast, is the technology of projecting virtual objects and information into the user's real environment and community, providing additional information directed to the individual user, who can interact with that information and those virtual objects through devices such as a smart mobile phone, glasses, or even contact lenses; all of these devices use a tracking system that provides accurate projection. The most famous example of this is the game Pokemon Go. When augmented reality glasses are worn, the technology can be used in several areas such as training, shopping, entertainment, and education [29].

Non-immersive VR, on the other hand, is simply a virtual technology rather than a full virtual reality technology. In a non-immersive virtual environment, virtual reality produces near-real contexts that are clear and easy to understand; the virtual information is presented on a screen of a certain size, and users can interact only through common interfaces such as keyboards and mice. This is the type discussed earlier, and it is the one this article mainly revolves around. It is worth noting that VR glasses also add attractive audio: if you watch a horror scene through these glasses, you will feel danger and fear even while sitting reassured and safe at home. In general, a VR experience shows the interactions that occur between an individual and an object in a situation or environment (for example, an online or physical store). It is certainly easier to go directly to a virtual version of the Colosseum and obtain all the information needed, including the original and current position of each of the elements involved. Here it is important to discuss what the best way to interact with the virtual world is: do we need an interface like the ones used so far, or will that world become an interactive 3D environment for user experience design? We are therefore facing a new chapter in interaction with digital information content, and it is difficult to predict what will happen to the virtual reality network in the future. In teaching a particular topic, old media merge with the new in a surprising mixture, where people gather to explain an experience, teach, and create something new over a certain period of time [23, 25, 28].
3 Method
Many sources were used to gather information about virtual reality, most notably digital books, the Google search engine, and multiple databases such as the digital library, Google Chrome, the EBSCO database, and Google Scholar.
4 Result and Discussion
The benefits and applications of virtual reality (VR) have been explored in various fields. Virtual reality has a lot of potential, and its application in education has attracted considerable research interest; studies of how virtual reality helps students in higher education consider the use of both high-quality displays and head-mounted displays (HMDs). Among the most important aims pursued by those working with young people through virtual reality applications are the following:
• Adapting virtual games to develop the individual and social skills of young people and giving them experiences that improve their awareness of themselves and of the surrounding communities.
• Enhancing youth job skills by giving them opportunities to test different work environments, develop their capabilities, and thus support their career paths [11].
• Giving young people the opportunity to experience the problems of different societies, to encourage them to participate actively in their communities and support development.
• Providing a space to experience the barriers faced by different members of society, whether due to physical disabilities or to belonging to groups vulnerable to persecution and discrimination, which increases the ability to empathize and communicate [15, 21].
• Examining the lives of older adults and how their physical differences affect their daily lives.
• Supporting the mental health of young people by providing experiences that combine art, music, play, enjoyment, and learning.
• Encouraging young people to adopt better behaviors in various aspects of life, such as sports and recycling [25].
• Giving young people the opportunity to experience diverse worlds, such as the depths of the seas, space, or historical events, so that they develop deeper and more influential knowledge.
What are the pros and cons of using virtual reality in youth work? Controversy is increasing among specialized psychologists about whether simulation programs can be used for education; they question the appropriateness of virtual experiences for different age groups, note that immersion in the virtual world may lead to separation from the surrounding environment and culture, and point out that the high cost of the devices is a negative factor that needs to be weighed. The effect of relying on virtual reality cannot be achieved with other tools, however, and by exploiting this technology in youth work and directing it toward improving the capabilities and skills of young people, it is possible to overcome these negatives and communicate with young people on a deeper and better level.
Virtual reality technologies have a set of advantages, namely: transferring life experiences to the virtual world, allowing immersion in the experience and the appreciation of different points of view [10]; ease of use and assimilation, which leads to increased self-confidence and enthusiasm for learning [11, 26]; and an interactive environment that can be changed and extended, which develops problem-solving skills. It is the responsibility of those working with youth to ensure that the maximum benefit is obtained from these applications and that their possible negative consequences are avoided, by creating a safe environment for youth using virtual reality technologies and ensuring the best context for participating youth, both in terms of location and in terms of the surrounding people, so that everyone feels respected and no one is harassed or photographed without permission [12]. The application emphasizes three main points: the structure of the current field in terms of VR design elements and learning contents; learning theories; and VR design elements as a basis for successful virtual reality-based learning [5, 12]. The developmental gaps identified in various domains point to unexplored areas of VR education design for different groups, which can stimulate future work and development in this field of education [13].
5 Conclusion
Virtual reality technology differs from 3D technology in that the user can interact with all components of the virtual world through built-in sensory features: hearing through external speakers that provide 3D distributed sound, movement through motion-tracking sensors, and immersion in a realistic experience that provides 360-degree vision for highly interactive integration. Virtual reality can use remote communication tools that give users a virtual presence in a distant virtual environment, either through input devices such as a keyboard and mouse or through multimedia devices such as gloves and other sensors. An important point is that the simulated environment cannot fully match the real world and undoubtedly remains very different from reality. This virtual reality is created by the interaction of several tools that together produce a three-dimensional image, giving the viewer a high-quality view as if watching a realistic scene. Educators can now decide whether virtual reality solutions and applications are suitable for achieving the learning and development goals of their institutions, given the daily advance of technology and the availability of many alternatives; determine the aspects and tasks in which the advantages of virtual reality technology can be exploited; and research carefully to choose the most suitable professional establishment specialized in providing virtual reality (VR) solutions and applications.
References 1. Killingback C, Drury D, Mahato P, Williams J (2020) Student feedback delivery modes: a qualitative study of student and lecturer views. Nurse Educ Today 84:104237 2. Haarala-Muhonen A, Ruohoniemi M, Parpala A, Komulainen E, Lindblom-Ylänne S (2017) How do the different study profiles of first-year students predict their study success, study progress and the completion of degrees? Higher Educ 74(6):949–962 3. AlFarsi G, Yusof ABM, Rusli MEB, Tawafak RM, Malik SI, Mathew R (2021) The general view of virtual learning environment in education sector. In: 2021 22nd international arab conference on information technology (ACIT). IEEE, pp 1–6 4. Brent R, Felder R (2016) Why students fail tests: 1. Ineffective studying. Chem Eng Educ 50(2):151–152 5. Tawafak RM, Alfarsi G, Jabbar J (2021) Innovative smart phone learning system for graphical systems within COVID-19 pandemic. Contemporary Educ Technol 13(3):ep306 6. El-Masri M, Tarhini A (2017) Factors affecting the adoption of e-learning systems in Qatar and USA: extending the unified theory of acceptance and use of technology 2 (UTAUT2). Educ Technol Res Dev 65(3):743–763 7. Mehta A, Morris NP, Swinnerton B, Homer M (2019) The influence of values on e-learning adoption. Comput Educ 141:103617 8. Dakduk S, Santalla-Banderali Z, Van Der Woude D (2018) Acceptance of blended learning in executive education. SAGE Open 8(3):2158244018800647 9. Mohamad SNM, Sazali NSS, Salleh MAM (2018) Gamification approach in education to increase learning engagement. Int J Human, Arts Soc Sci 4(1):22–32 10. Doumanis I, Economou D, Sim GR, Porter S (2019) The impact of multimodal collaborative virtual environments on learning: a gamified online debate. Comput Educ 130:121–138 11. Alfarsi G, Yusof ABM, Tawafak RM, Malik SI, Mathew R, Ashfaque MW (2020) An overview of electronic games in the academic areas. In: 2020 IEEE international conference on advent trends in multidisciplinary research and innovation (ICATMRI). IEEE, pp 1–6 12. Sun JCY, Hsieh PH (2018) Application of a gamified interactive response system to enhance the intrinsic and extrinsic motivation, student engagement, and attention of English learners. J Educ Technol Soc 21(3):104–116 13. Tsay CHH, Kofinas A, Luo J (2018) Enhancing student learning experience with technologymediated gamification: an empirical study. Comput Educ 121:1–17 14. Al Azawi, Al Ghatarifi D, Ayesh A (2017) A higher education experiment to motivate the use. In: Future technologies conference (FTC), Muscat Oman 15. Haris DA, Sugito E (2015) Analysis of factors affecting user acceptance of the implementation of ClassCraft e-learning: case studies faculty of information technology of Tarumanagara university. In: 2015 international conference on advanced computer science and information systems (ICACSIS). IEEE, pp 73–78 16. Dembski F, Wössner U, Yamu C (2019) Digital twin. In: Virtual reality and space syntax: civic engagement and decision support for smart, sustainable cities: proceedings of the 12th international space syntax conference, Beijing, China, pp 8–13 17. Koivisto J, Malik A (2021) Gamification for older adults: a systematic literature review. Gerontologist 61(7):e360–e372 18. Laskin M, Lee K, Stooke A, Pinto L, Abbeel P, Srinivas A (2020) Reinforcement learning with augmented data. Adv Neural Inf Process Syst 33:19884–19895 19. Jang S, Aguero-Barrantes P, Christenson R (2022) Augmented and virtual reality resource infrastructure for civil engineering courses. In: 2022 ASEE annual conference & exposition. 20. 
Howard JP, Wood FA, Finegold JA, Nowbar AN, Thompson DM, Arnold AD, Rajkumar CA, Connolly S, Cegla J, Stride C, Sever P, Francis DP (2021) Side effect patterns in a crossover trial of statin, placebo, and no treatment. J Am Coll Cardiol 78(12):1210–1222 21. Tawafak RM, Romli A, Malik SI, Shakir M, Alfarsi GM (2019) A systematic review of personalized learning: Comparison between e-learning and learning by coursework program in Oman. Int J Emerg Technol Learn (Online) 14(9):93
22. Malik S, Al-Emran M, Mathew R, Tawafak R, AlFarsi G (2020) Comparison of e-learning, M-learning and game-based learning in programming education–a gendered analysis. Int J Emerg Technol Learn (iJET) 15(15):133–146 23. Al Farsi G, Yusof ABM, Romli A, Tawafak RM, Malik SI, Jabbar J, Bin Rsuli ME (2021) A review of virtual reality applications in an educational domain. Int J Interact Mobile Technol 15(22) 24. Hussin NH, Jaafar J, Downe AG (2011) Assessing educators’ acceptance of Virtual Reality (VR) in the classroom using the unified theory of acceptance and use of technology (UTAUT). In: International visual informatics conference. Springer, Berlin, Heidelberg, pp 216–225 25. Al Farsi G, Yusof ABM, Fauzi WJB, Rusli MEB, Malik SI, Tawafak RM, Mathew R, Jabbar J (2021) The practicality of virtual reality applications in education: limitations and recommendations. J Hun Univ Nat Sci 48(7) 26. Al Farsi G, Tawafak RM, Malik SI, Khudayer BH (2022) Facilitation for undergraduate college students to learn java language using e-learning model. Int J Interact Mobile Technol 16(8) 27. AlFarsi GA, Tawafak RM, Malik SI, Mathew R, Ashfaque MW (2022) A view of virtual reality in learning process. In: Innovations in electronics and communication engineering. Springer, Singapore, pp 423–428 28. Tawafak RM, Alfarsi G, Khudayer BH (2022) Artificial intelligence effectiveness and impact within COVID-19. In: ITM web of conferences, vol 42. EDP Sciences 29. Tawafak RM, Malik SI, Alfarsi G (2021) Impact of technologies during the COVID-19 pandemic for improving behavioral intention to use e-learning. Int J Inf Commun Technol Educ (IJICTE) 17(3):137–150 30. Alrimy T, Alhalabi W, Malibari AA, Alzahrani FS, Alrajhi S, Alhalabi M, Hoffman HG (2022) Virtual reality animal rescue world: pediatric virtual reality analgesia during just noticeable pressure pain in children aged 2–10 years old (crossover design). Front Psychol 13
Chapter 26
Detection of Sign Language Using TensorFlow and Convolutional Neural Networks Ayush Upadhyay , Parth Patel , Riya Patel , and Bansari Patel
1 Introduction
Pattern recognition of human gestures has long been viewed as a challenging problem. If a machine can interpret these human movement patterns, the essential information they carry can be recovered. The identification of static sign gestures used to represent letters of the alphabet is significant for people who are deaf or hard of hearing, since spoken language is the fundamental means of communication for the great majority of the population [1]. Without spoken language, a significant portion of the population cannot communicate, and even when spoken language is used, some people find it difficult to communicate with the majority [1]. The primary purpose of sign language is to express meaning through movement or gestures, and the intended message can be reconstructed by understanding these movement patterns. Static sign language used to represent alphabet letters and integers has been recognized successfully; the sign language recognition addressed here is based on ASL (American Sign Language). A system for recognizing and classifying American Sign Language was built using a Convolutional Neural Network (CNN) [2]. The main purpose of a hand gesture classification system is to provide a connection between such a CNN classification model and a human, allowing the recognized signs to be used to give input to a machine without physically touching dials and buttons [3]. Most people interact using oral communication, but sign language is essential for people who are deaf or have lost the ability to speak. Convolutional Neural Networks (CNNs), a subtype of machine learning, are a popular architecture for image identification and classification [2] and have proven to be very useful for this task.

A. Upadhyay (B) · P. Patel · R. Patel · B. Patel, Department of Computer Science and Engineering, Devang Patel Institute of Advance Technology and Research (DEPSTAR), Charotar University of Science and Technology (CHARUSAT), Anand, India, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_26
The goal of image identification is to classify images correctly; with simple linear classifiers, a huge number of parameters would be necessary. The image classification challenge therefore consists of learning previously unseen patterns rather than recognizing a single item, so that images can be separated into various groups. CNNs are a form of deep learning that extracts and combines high-level features from two-dimensional inputs using multiple hidden layers of neural networks, and they are now widely used to identify pictures [4].
2 Literature Review
In past years, there has been work on applying AI-based communication systems that use sign language. Using the Kinect sensor and convolutional neural networks, 20 Italian gestures were recognized with a detection rate of 91.7% [5]. A 30-word dictionary and an American Sign Language recognition system were developed [6], achieving an error rate of 10.91%. Other researchers used the Microsoft Kinect for Xbox to develop a gesture-translator model in which a Hidden Markov Model is used for interpretation; that study correctly identified BISINDO gestures with a 60% accuracy rate. In another work, 1000 Indonesian Sign Language samples were gathered, 500 of which were used as training data and the remaining 500 as testing data [5]; the efficiency of the model was 91.60%. The earlier work in this field is summarized in Table 1.
3 Methodology
Convolutional neural networks (CNNs) and other recognition strategies are used for tasks such as processing video signals, and CNNs have brought a notable improvement in image categorization. We use OpenCV for image analysis, and TensorFlow and Keras for classification. The steps for carrying out the project are listed below.
3.1 Detecting Hand Gestures
Several algorithms, such as Haar cascades and neural networks, can be used to detect hand movements. Haar cascades are a prominent technique originally popularized for face detection.
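A minimal OpenCV sketch of Haar-cascade detection on webcam frames is shown below. OpenCV ships face and eye cascades only, so the file `hand.xml` here stands for any externally trained hand cascade and is an assumption, not part of the library.

```python
import cv2

# Hypothetical pre-trained Haar cascade for hands (hand.xml is a placeholder).
hand_cascade = cv2.CascadeClassifier("hand.xml")

cap = cv2.VideoCapture(0)                     # integrated webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    hands = hand_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in hands:                # draw a box around each detection
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("hand detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```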
Table 1 Literature review

No | Title | Authors | Method | Limitation
1 | Indian sign language recognition system using SURF with SVM and CNN [6] | Katoch et al. [6] | SVM, CNN | The dataset contains little information; only isolated signs are understood
2 | ML based sign language recognition system [2] | Amrutha and Prabu [2] | Convex hull for feature extraction and kNN [2] | The dataset is modest; used to find and identify isolated, individual signs
3 | Vision-based hand gesture recognition using deep learning for the interpretation of sign language [7] | Sharma and Singh [7] | Deep learning | The error rate in real-time sign language recognition is relatively high
4 | A brief review of the recent trends in sign language recognition [8] | Nimisha and Jacob [8] | Feature extraction algorithm, You Only Look Once (YOLO) | Not mentioned
5 | A new improved obstacle detection framework using IDCT and CNN to assist visually impaired persons in an outdoor environment [9] | Singh et al. [9] | IDCT and CNN | It cannot be spatially invariant to the input data and does not encode an object's position or orientation
6 | Hypertuned deep convolutional neural network for sign language recognition [10] | Mannan et al. [10] | Deep CNN | Not mentioned
3.2 Neural Network
A neural network, also known as an artificial neural network, is a collection of artificial nodes called artificial neurons that is inspired by biological neural networks [11]. The first layer of a CNN can be used to match images against a predetermined template.
3.3 Deep Learning
Deep learning hides the intermediate stages of analysis inside a classification technique [12]. It uses a number of hidden layers, each of which refines the representation by using the output of the layer before it; this is advantageous for large-scale unsupervised learning from an unlabeled dataset. Deep learning was implemented using the Python libraries Keras and TensorFlow [13].
3.4 Python Packages for Keras and TensorFlow
Keras is a high-level deep learning library that can be used on top of Google's TensorFlow to model neural networks. TensorFlow is an open-source framework, created by the Google Brain Team, that is designed for building image classifiers in Python and is now widely used for machine learning. TensorFlow is implemented mostly in C++, Python, and CUDA, and it allows code to run on both the CPU and the GPU [13]; running on a GPU is considerably faster, and TensorFlow makes developing and deploying a model much quicker. First, the user provides inputs for processing the data; the data are then classified using the image dataset, and an output is generated based on the input [14]. This approach uses the TensorFlow framework and a Convolutional Neural Network (CNN) architecture to produce the classification model in Google Colaboratory [5]. When the CNN model receives an image, it processes it using a combination of pooling and convolutional layers [5], and fully connected layers then produce the desired result. Figure 1 illustrates the project's execution flow. Convolutional neural networks, or CNNs, are a type of artificial neural network [12] used in deep learning and are often employed for the recognition and classification of objects and images; through the use of a CNN, deep learning can thus identify objects, as illustrated in Fig. 2.
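A minimal Keras sketch of a CNN of the kind described (convolution and pooling layers followed by fully connected layers and a softmax output) is shown below; the layer sizes and the three-class setup are illustrative assumptions, not the exact architecture used in this study.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 3          # e.g. "thank you", "peace", "hello"

model = models.Sequential([
    layers.Input(shape=(256, 256, 3)),            # matches the preprocessed image size
    layers.Rescaling(1.0 / 255),                  # scale pixel values to [0, 1]
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),   # class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```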
3.5 Image Cropping
The images in the dataset depict hand sign gestures, but each image also includes parts of the background and the face in addition to the hand gesture. The dataset likewise provides the bounding-box values for the static hand gesture in each image. Cropping the photos to show only the region that needs to be emphasized may appear intuitive, but the justification is worth understanding as well.
Fig. 1 Block diagram of project execution flow
Convolutional neural networks (CNNs) evaluate an image's numerous attributes with a variety of layers; these attributes range from the most fundamental to the most complex. The initial layers can detect curves, colors, and other basic visual elements [1], while subsequent layers recognize the object's shape and the SoftMax layer finally classifies it. To remove the appearance of unwanted shapes, the images are cropped using a Python script [1].
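A hedged sketch of this cropping step is given below; the annotation file name and the column order (path, x, y, w, h) are assumptions about the dataset format, not the project's actual script.

```python
import csv
import cv2

# Assumed annotation format: image_path, x, y, w, h (bounding box of the hand).
with open("annotations.csv") as f:
    for image_path, x, y, w, h in csv.reader(f):
        x, y, w, h = map(int, (x, y, w, h))
        img = cv2.imread(image_path)
        crop = img[y:y + h, x:x + w]              # keep only the hand region
        cv2.imwrite(image_path.replace(".jpg", "_crop.jpg"), crop)
```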
3.6 Splitting into Sets
After being prepared, the images must be divided into training, testing, and validation sets. The percentages required for validation and testing are given as inputs [1]; this determines how many images remain in the training set. A deterministic procedure is employed to deliver a reliable image-splitting solution: a SHA-1 hash is generated from the file name [11], and the hash value is converted to a number and then to a percentage. If this percentage is lower than the prescribed validation percentage, the image is placed in the validation set; if it is lower than the sum of the testing and validation percentages, it belongs to the testing set; in all other situations, it is part of the training set. Any other splitting method could be used instead; this technique was chosen merely to provide a consistent splitting strategy [13].
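The hash-based assignment described above can be sketched as follows; the percentage values and the modulo bucketing are illustrative assumptions.

```python
import hashlib

def assign_split(file_name, validation_pct=10, testing_pct=10):
    """Deterministically assign an image to a set from a SHA-1 hash of its name."""
    h = hashlib.sha1(file_name.encode("utf-8")).hexdigest()
    pct = int(h, 16) % 100                    # hash -> number -> percentage bucket
    if pct < validation_pct:
        return "validation"
    if pct < validation_pct + testing_pct:
        return "testing"
    return "training"

print(assign_split("hello_left_03.jpg"))      # the same file always lands in the same set
```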
Fig. 2 Flow diagram of implementation
3.7 Training
To learn the new categories, the top layer was fully trained. We were able to obtain significant accuracy by setting the learning rate to 0.05 [1]; the efficiency, on the other hand, dropped drastically when the learning rate was increased or decreased. A Convolutional Neural Network (CNN) architecture is employed in this study [5]. Pooling layers are used to reduce the height and width of the feature maps while maintaining their depth. Max pooling was applied to retain the highest feature values in the area of the picture covered by the filter; since it is more successful at extracting dominant features, max pooling is considered more productive [5, 15].
3.8 Implementation
Dataset preprocessing. The first step in the implementation is to preprocess the dataset into a format that the CNN model can understand. The images in our test database have arbitrary dimensions and do not
have a 1:1 aspect ratio. To obtain a 1:1 aspect ratio, we first scale the photographs in the dataset to 256 × 256 pixels; this is the size that the CNN's input layer accepts. We use ImageMagick, an open-source image editing toolkit, to resize the photos in the dataset. Our database currently contains photos of the right hand only [13]; we flip the existing photographs horizontally so that our program also works with the left hand. The capture setup was as follows:

• Webcam: integrated webcam
• Mode: live video (integrated webcam)
• Frame rate: 30 FPS
• Image mode: RGB
• Video standard: HD
• Number of colors: 21,582
• Bitrate: 9.01 MB/s
• Signs used: 3, each with 30 outputs (15 left-hand, 15 right-hand)
• Total outputs: 3 × 30 = 90

CNN (Convolutional Neural Network). CNNs, a form of deep learning, are a well-liked technique for classifying and identifying images, and they have proven very important in decision-making and image recognition. Convolutional neural networks are made up of neurons with learnable weights and biases [16]; every neuron receives inputs and processes them with convolutional operations. After receiving an image, the CNN model applies a combination of pooling and convolutional layers. Figure 3 depicts the workflow of the convolutional neural network. Pooling layers reduce the complexity of feature maps, particularly their height and width, while maintaining the depth. Pooling layers are divided into two categories: average pooling and max pooling [17].
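For the resizing and mirroring step described at the start of this section, a hedged Python/OpenCV equivalent of the ImageMagick processing is sketched below; the directory layout (`dataset/right`, `dataset/left`) is an assumption used only for illustration.

```python
import glob
import os
import cv2

os.makedirs("dataset/left", exist_ok=True)

# Resize every right-hand image to 256x256 and create a mirrored copy so the
# model also sees left-hand versions (the flip effectively doubles the dataset).
for path in glob.glob("dataset/right/*.jpg"):
    img = cv2.imread(path)
    img = cv2.resize(img, (256, 256))              # enforce a 1:1 aspect ratio
    cv2.imwrite(path, img)
    mirrored = cv2.flip(img, 1)                    # flipCode=1 -> horizontal flip
    cv2.imwrite(os.path.join("dataset/left", os.path.basename(path)), mirrored)
```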
Fig. 3 Workflow of the CNN
Max pooling obtains the maximum value of the elements in the filtered area of the image [5]; because it is the most effective way to extract dominant characteristics, it is considered the most efficient choice.

Training the CNN. To train the CNN, the SGD algorithm will be used with learning rates of 0.001 and 0.01 and a decay of 1e-6. To begin, we train the CNN on 10 different sign categories with a batch size of 15, repeating the process for 50 epochs in total. To create sufficient randomness during the training phase, the training set is shuffled for each repetition of the training [4]. Additionally, we use a 20% validation split during training, with the CNN evaluated on the final 20% of each class during the learning process. To train the CNN on the whole ASL dataset, which comprises 36 signs and 2515 pictures, we recommend a batch size of 50 and training for around 200 epochs to allow for faster training; this results in a 98% accuracy rate. To boost accuracy, we increase the validation split by 5% and evaluate the model against newly acquired images.
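A hedged sketch of the training configuration described above (batch size 15, 50 epochs, 20% validation split, SGD optimizer), reusing the `model` from the earlier sketch; the directory layout is an assumption, and the 1e-6 decay mentioned in the text could be added through a learning-rate schedule in recent TensorFlow versions.

```python
import tensorflow as tf
from tensorflow.keras.optimizers import SGD

# Load labelled sign images from a class-per-folder directory (assumed layout).
train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", image_size=(256, 256), batch_size=15,
    validation_split=0.2, subset="training", seed=42)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", image_size=(256, 256), batch_size=15,
    validation_split=0.2, subset="validation", seed=42)

# `model` is the CNN defined in the earlier sketch (Sect. 3.4).
model.compile(optimizer=SGD(learning_rate=0.01),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
history = model.fit(train_ds, validation_data=val_ds, epochs=50)
```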
4 Result and Analysis
The models are then judged by how well they perform on real-time photographs. We use two metrics that have been heavily employed in prior CNN research. The first metric [18] is top-1 validation accuracy, which measures the percentage of instances in which the CNN model ranks the target label first. The top-5 validation accuracy metric shows the proportion of classifications in which the target label is among the five categories with the highest predicted likelihood. A convolution is a multiplication performed element by element [15], and the concept is easy to understand: the computer multiplies a filter, typically of size 3 × 3 or larger, with the corresponding piece of the image, and the outcome of this element-wise multiplication is a feature map [19]. The process is repeated until the entire image has been scanned; it should be observed that convolution reduces the size of the image. Figure 4 depicts the sign of gratitude predicted by the algorithm: the word "thank you". The system predicted the peace sign shown in Fig. 5, and Fig. 6 displays the greeting ("hello") sign predicted by the algorithm.
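The element-wise multiply-and-sum that produces a feature map, as described above, can be illustrated with a small NumPy sketch; the 3 × 3 filter and the input size are arbitrary.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution: slide the kernel, multiply element-wise, and sum."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)   # element-wise product, then sum
    return feature_map

image = np.random.rand(6, 6)
edge_filter = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]])   # simple 3x3 filter
print(conv2d(image, edge_filter).shape)   # (4, 4): the feature map is smaller than the input
```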
Fig. 4 Thank you sign
Fig. 5 Peace sign
Fig. 6 Hello sign
The models trained on the dataset had an initial accuracy of 10.07% on the training dataset and 26.67% on the validation dataset [20]. After three epochs, the prediction accuracy was 96.67% for the training dataset and 100% for the validation dataset [21]. The bar chart in Fig. 7 compares the correctly and incorrectly predicted outputs, and the line graph in Fig. 8 shows the training and validation accuracy.
Fig. 7 Sign prediction rate
Fig. 8 Training and validation accuracy
5 Conclusion
A convolutional neural network for the classification of American Sign Language (ASL) was built using TensorFlow. The forecasting approach in this paper recognizes American Sign Language (ASL) only. The models for this dataset have a first-epoch training accuracy of 10% and a validation accuracy of 26.67%. After five iterations, the accuracy for the training dataset was 96.67%, while the validation dataset reached 100%. Choosing the number of iterations and the batch size for the training is a difficult decision in this work. Using convolutional neural networks, we were able to recognize images of static sign language gestures correctly. Because it is a real-time system for hand gesture identification, it will benefit NGOs and other organizations that work with people with special needs. The proposed method can also be used to recognize American Sign Language (ASL). With better image and graphics capacity, the system's response time can be lowered. In the long term, we will employ this method to improve sign language recognition that includes visual movement and gestures. The model can be extended and trained in the future to classify more characters, and perhaps a variety of languages; it can also be trained to improve its accuracy and efficacy using the input dataset, which is available in a variety of forms.
References 1. Das A et al. (2018) Sign language recognition using deep learning on custom processed static gesture images. 2018 international conference on smart city and emerging technology (ICSCET). IEEE 2. Amrutha K, Prabu P (2021) ML based sign language recognition system. 2021 International conference on innovative trends in information technology (ICITIIT). IEEE
3. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25 4. Khan RZ, Ibraheem NA (2012) Hand gesture recognition: a literature review. Int J Artif Intell Appl 3(4):161 5. Kembuan O, Rorimpandey GC, Tengker SMT (2020) Convolutional neural network (CNN) for image classification of indonesia sign language using tensorflow. 2020 2nd international conference on cybernetics and intelligent system (ICORIS). IEEE 6. Katoch S, Singh V, Tiwary US (2022) Indian sign language recognition system using SURF with SVM and CNN. Array 14, 100141 7. Sharma S, Singh S (2021) Vision-based hand gesture recognition using deep learning for the interpretation of sign language. Expert Syst Appl 182:115657 8. Nimisha KP, Jacob A (2020) A brief review of the recent trends in sign language recognition. 2020 international conference on communication and signal processing (ICCSP). IEEE 9. Singh Y, Kaur L, Neeru N (2022) A new improved obstacle detection framework using IDCT and CNN to assist visually impaired persons in an outdoor environment. Wirel Personal Commun, 1–18 10. Mannan A et al. (2022) Hypertuned deep convolutional neural network for sign language recognition. Computational intelligence and neuroscience 2022 11. Jayswal D et al. (2023) Study and develop a convolutional neural network for MNIST handwritten digit classification. Proceedings of third international conference on computing, communications, and cyber-security. Springer, Singapore 12. Barczak ALC et al. (2011) A new 2D static hand gesture colour image dataset for ASL gestures 13. Garcia B, Viesca SA (2016) Real-time American sign language recognition with convolutional neural networks. Convolutional neural networks for visual recognition 2, 225–232 14. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444 15. Bohara M et al. (2021) An ai based web portal for cotton price analysis and prediction. 3rd international conference on integrated intelligent computing communication & security (ICIIC 2021). Atlantis Press 16. Cate H, Dalvi F, Hussain Z (2017) Sign language recognition using temporal classification. arXiv preprint arXiv:1701.01875 17. Tanibata N, Shimada N, Shirai Y (2002) Extraction of hand features for recognition of sign language words. International conference on vision interface 18. Dabre K, Dholay S (2014) Machine learning model for sign language interpretation using webcam images. 2014 international conference on circuits, systems, communication and information technology applications (CSCITA). IEEE 19. Ciregan D, Meier U, Schmidhuber J (2012) Multi-column deep neural networks for image classification. 2012 IEEE conference on computer vision and pattern recognition. IEEE 20. Tamura S, Kawasaki S (1988) Recognition of sign language motion images. Pattern Recogn 21(4):343–353 21. Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. European conference on computer vision. Springer, Cham
Chapter 27
Exploring Spatial Variation of Soils Using Self-organizing Maps (SOMs) and UHPLC Data for Forensic Investigation Nur Ain Najihah Mohd Rosdi, Loong Chuen Lee , Nur Izzma Hanis Abdul Halim, Jeevna Sashidharan, and Hukil Sino
1 Introduction Soils are earth materials composed of organic and inorganic matter, minerals, and man-made materials. Various factors contribute to the profile of a soil. Environmental factors include climate, topography, and parent materials, while human factors refer to human activities like agriculture and mining. All these factors affect the complexity of soil compositions. Consequently, soil composition can vary across a landscape or within a site [1, 2]. Soil can be encountered as trace evidence on a suspect's shoe or garments. It is usually transferred from the victim or the location of the crime. Hence, forensic soil analysis is typically performed to identify the origin or source of the questioned soil residue collected from a suspect. However, forensic analysis of soil is challenging, as soil from different locations of a site can demonstrate varying compositions or properties of unknown magnitude [3, 4]. Spatial variation of soil refers to the variability of soil properties across various localized points of a site. Therefore, an understanding of the spatial variability of a site is essential for a correct interpretation of forensic soil analysis. Several forensic works
N. A. N. Mohd Rosdi · L. C. Lee (B) · J. Sashidharan · H. Sino Forensic Science Program, Faculty of Health Sciences, CODTIS, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia e-mail: [email protected] L. C. Lee Institute IR4.0, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia N. I. H. Abdul Halim Analytical Department, Shimadzu Malaysia Sdn. Bhd, 47810 Kota Damansara, Selangor, Malaysia © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_27
have been devoted to assessing the spatial variability of soils [5–7]. On the other hand, forensic soil studies considering Malaysian soils are limited [8, 9]. Therefore, this study explores the spatial variability of soil collected from a site in the Nilai district, Negeri Sembilan, Malaysia, using an unsupervised self-organizing maps (SOMs) technique and the non-volatile organic profiles of the soils. It is worth mentioning that [10, 11] employed supervised SOMs in forensic soil analysis using infrared spectra and the physical properties of soils, respectively. In contrast, this work employs ultra-high performance liquid chromatography (UHPLC) data of soil.
2 Related Works Several recent works evaluating spatial variations of soils are briefly discussed herein. Morey [12] evaluated the potential of organic profiling for soil provenance using Western Australian soils. Soil samples were prepared by Soxhlet extraction and analyzed using gas chromatography–mass spectrometry (GC–MS). The between-site variation was evident based on the qualitative analysis of the combined alkane and fatty acid profiles (as demonstrated by the GC–MS data). On the other hand, Newland et al. [13] evaluated the quartz-recovered fine fraction of Perth sandy soils in Australia for discriminating between sandy soils. They proposed an approach utilizing multiple methods combined with chemometric techniques at each stage of the sequence to maximize the differentiation of soils based on inorganic profiles. Maione et al. [14] assessed soil variability in southeastern Oregon and northern Nevada, USA, based on the chemical elements of soil profiles. Inductively coupled plasma atomic emission spectrometry (ICP-AES) was applied to record the compositions of 40 elements present in the soil. The natural groupings of the 10,261 stream-sediment and soil samples were elucidated via cluster analysis methods. Based on the Hartigan-Wong implementation of K-means and the Euclidean distance, the soils were found to be highly sparse across three clusters. On the other hand, Mazzetto et al. [15] evaluated the potential of soil organic matter molecular chemistry for discriminating five sites in the Parana State in Brazil. Pyrolysis–gas chromatography/mass spectrometry was applied to determine the soil organic matter. Based on a factor analysis method, the composition of the soil organic matter in the five sites was determined to be different.
3 Methodology 3.1 Dataset A total of 11 soil samples were collected from four locations at Nilai, Negeri Sembilan, Malaysia, labelled N1 to N4. Specifically, for locations N3 and N4, more than one soil sample was collected from multiple proximate points according to the grid pattern procedure [16]. The GPS locations of the soils are summarized in Table 1. Ultra-high performance liquid chromatography (UHPLC) coupled with a UV-Vis detector was applied to acquire the nonvolatile organic profiles of the soils. Sample preparation and UHPLC parameters followed Lee et al. [9]. In brief, the soil samples were first dried and then sieved. The samples were then extracted using acetonitrile assisted by a sonicator. Eventually, a dataset composed of 45 UHPLC chromatograms (i.e., samples) and 2641 retention time points (i.e., variables) was obtained at 230 nm. Each chromatogram is described by 2641 absorbance values over retention times from 0 to 22 min. Table 1 GPS locations of the soil sampling locations
ID     GPS location                      Number of chromatograms
N1     2.8294874 N, 101.7778583 E        6
N2     2.8328542 N, 101.7789897 E        3
N3A    2.8277060 N, 101.7843686 E        6
N3B    2.8329114 N, 101.7793891 E        3
N3C    2.8328831 N, 101.7793917 E        6
N3D    2.8330603 N, 101.7800184 E        3
N4A    2.8251728 N, 101.7775559 E        3
N4B    2.8278226 N, 101.775376 E         6
N4C    2.8274458 N, 101.7757546 E        3
N4D    2.8271917 N, 101.7758859 E        3
N4E    2.8289300 N, 101.7743916 E        3
3.2 Statistical Data Analysis All the statistical analyses were executed using the R software [17]. Data preprocessing. Typically, chemical data need to be preprocessed before any modelling analysis [18]. As the raw chromatograms showed fluctuating baselines, the asymmetric least squares (AsLS) algorithm was first applied to correct the baseline. Next, the baseline-corrected data were treated by normalization to sum (NS) to remove variation caused by unequal initial sample quantities. Data interpretation. The preprocessed data were first averaged over the 11 soil sampling points using in-house R code. Then, the data were reduced using principal component analysis (PCA). PCA constructs new variables known as principal components (PCs) from the 2641 retention time points based on the variance computed from the absorbance values of the different samples. After inspecting the spatial clustering of the 45 chromatograms using the scores plot of PCA, the first five PCs were studied further using self-organizing maps (SOMs). The SOM is one of the most popular artificial neural networks [19] in chemometric analysis. It aims to generate a 2-D map that reveals the relative distances among the 45 chromatograms. The number of nodes in the 2-D map needs to be optimized carefully to achieve a meaningful output from the data. Each of the five PCs was assigned a unique weight value that was estimated iteratively.
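The analysis itself was carried out in R; purely as an illustration, a rough Python equivalent of the PCA-then-SOM workflow is sketched below using scikit-learn and the third-party minisom package. The placeholder data matrix, the 10 × 10 map size, the sigma, learning rate, and iteration count are assumptions, not the values used in this study.

```python
import numpy as np
from sklearn.decomposition import PCA
from minisom import MiniSom  # third-party package: pip install minisom

chromatograms = np.random.rand(45, 2641)   # placeholder for the AsLS-NS treated data

# Reduce the 2641 retention-time variables to the first five principal components.
pca = PCA(n_components=5)
scores = pca.fit_transform(chromatograms)

# Train a 2-D self-organizing map on the five PC scores (map size is illustrative).
som = MiniSom(10, 10, input_len=5, sigma=1.0, learning_rate=0.5, random_seed=1)
som.train_random(scores, num_iteration=1000)

# Map each chromatogram to its winning node to inspect the clustering.
nodes = [som.winner(s) for s in scores]
print(nodes[:5])
```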
4 Results and Discussions 4.1 Mean Chromatogram Figure 1 illustrates the mean chromatograms of the raw data and the two treated counterparts preprocessed by AsLS and AsLS-NS, respectively. Basically, the raw data presented an undesired baseline; after around 18 min, the baseline was associated with negative values. The chromatograms were significantly improved after data preprocessing. Generally, AsLS had a more significant positive impact than NS. In other words, the non-sample variation was mainly introduced by the fluctuating baseline. Hence, the AsLS-NS treated data was judged the most desirable sub-dataset. Although the mean chromatograms treated by AsLS-NS appeared to be highly similar to each other, several soil samples showed extra peaks eluting around 5, 10 and 15 min. Such minor variation is not clearly seen in the raw mean chromatograms. In brief, the mean chromatograms of the 11 soil samples indicated that the spatial variability is insignificant; only a minority of the samples presented slight variations.
Fig. 1 Mean UPLC chromatograms of soils obtained from a raw, b AsLS corrected, and c AsLS-NS corrected data
4.2 PCA The relative performances of the two treated sub-datasets and the raw counterpart were also inspected based on the scores plots of PCA (Fig. 2). Overall, the three sub-datasets presented similar clustering of the samples into three well-separated clusters. However, the two treated sub-datasets showed improved intra-sample variations, with the AsLS-NS treated one offering the most desirable intra-sample variation. Most importantly, the three clusters of the AsLS-NS data are each located within a unique
Fig. 2 Scores plot of PCA computed using the a raw, b AsLS corrected, and c AsLS-NS corrected chromatographic data
quadrant. Hence, the spatial variability of the samples was inspected according to Fig. 2c. Firstly, it was observed that the six chromatograms of soil N1 were split into two clusters. The same observation also applies to N3A, N3C and N4B. Basically, the six chromatograms were obtained from different batches of UHPLC analysis, i.e., different dates of analysis and samples prepared by different analysts. Despite the rather high inter-batch variation (i.e., weak reproducibility), replicate chromatograms of the samples are often highly similar (i.e., good repeatability). Next, soils collected from proximate locations, e.g., N3A to N3D, are not necessarily clustered together. The N3B soil formed a tight cluster with the N2, N3D and N4A soils, located in the quadrant associated with the negative side of PC1 and the positive side of PC2. The other two clusters were formed in a similar way. This indicates that the non-volatile organic profiles of the 11 soil samples are not homogeneous within the studied site.
4.3 SOMs The first five PCs computed using the AsLS-NS data were utilized as input data of SOMs. Figure 3 presents the SOMs outputs and the loading plots of the PCs. Based on Fig. 2, it is clearly seen that the clusters of N1, N3A, N3C, and N4B (located in the quadrant associated with the positive side of PC1 and PC2) have been split into two groups. Similarly, the cluster of N2, N3B, N3D and N4A is also divided into two separate groups according to SOMs. The further enhancement of the clustering of samples could be explained by the fact that SOMs have considered PC1-PC5 concurrently. In contrast, the scores plot of PCA only utilized the PC1 and PC2. At first glance, the majority of the nodes were mainly populated by only two soil samples. Notably, two nodes located at the bottom (Fig. 3a) were found to be rather heterogenous, i.e., composing more than two soil samples. The node dominated by the PC1 contained four soil samples; meanwhile, another heterogeneous node was populated by another four soil samples. Based on Fig. 3b, one of the nodes is dominated by PC1, governed by peaks eluted between 0 and 5 min. Hence, it seems sound that region 0–5 min demonstrates the lowest spatial variability.
5 Conclusion This study presents the first pilot work evaluating the spatial variability of Malaysian soil based on the nonvolatile organic profiles for forensic investigation purposes. Attributed to the high dimensionality of the UHPLC data, principal component analysis (PCA) and self-organizing maps (SOMs) have been applied in interpreting the data. The coupling of PCA and SOMs demonstrated great potential in processing the
Fig. 3 SOM plots (a, b) and loading plots of PCA (c–g) computed using the AsLS-NS corrected chromatographic data
Fig. 3 (continued)
UHPLC data. It is concluded that soils collected from the Nilai district showed relatively insignificant spatial variability. To gain a more insightful understanding of the spatial variability of Malaysian soil, future work will consider soil samples collected from more points in the Nilai district and other nearby districts, e.g., Seremban and Port Dickson. In addition, the UHPLC data shall be carefully optimized via proper data preprocessing techniques and modelled using supervised machine learning algorithms. Acknowledgements We would like to thank Ms Ang May Yen and Ms Jau Mei Hui from Shimadzu Malaysia Sdn Bhd for assisting in the acquisition of UHPLC data. This research was conducted with the support of CRIM-UKM (TAP-K016373).
References 1. Fitzpatrick RW (2013) Soil: Forensic analysis. In: Jamieson A, Moenssens AA (eds) Wiley encyclopedia of forensic science. Wiley, Chichester, pp 1–14 2. Murray RC (2012) Forensic examination of soils. In: Kobilinsky L (ed) Forensic chemistry handbook. Wiley, Danvers, MA, pp 109–130 3. McCulloch G, Dawson LA, Ross JM, Morgan RM (2018) The discrimination of geoforensic trace material from close proximity locations by organic profiling using HPLC and plant wax marker analysis by GC. Forensic Sci Int 288:310–326
4. McCulloch G, Morgan RM, Bull PA (2017) High performance liquid chromatography as a valuable tool for geoforensic soil analysis. Aust J Forensic Sci 49:421–448 5. Hay JGP, Oxley APA, Wos-Oxley MI, Hayes R, Pickles T, Roberts K, Conlan XA (2022) The cyclic nature of soil chemistry: forensic analysis with the aid of ultra-high performance liquid chromatography. Talanta Open 6:100126 6. De Caritat P, Woods B, Simpson T, Nicholas C, Hoogenboom L, Ilheo A, Aberle MG, Hoogewerff J (2021) Forensic soil provenancing in an urban/suburban setting: a sequential multivariate approach. J Forensic Sci 66:1679–1696 7. Suarez MD (2012) Evaluation of spatial variability of soils: application to forensic science. MSc thesis, University of California 8. Lee LC, Sino H, Mohd Noor NA, Mohd Ali S, Abdul Halim A (2022) Prediction of the geographical origin of soils using ultra-performance liquid chromatography (UPLC) fingerprinting and K-nearest neighbor (K-NN). In: Mathur G, Bundele M, Lalwani M, Paprzycki M (eds) Proceedings of 2nd international conference on artificial intelligence: advances and applications. Algorithms for Intelligent Systems, pp 47–56. Springer, Singapore 9. Lee LC, Ishak AA, Nai Eyan A, Zakaria AF, Kharudin NS, Mohd Noor NA (2022) Forensic profiling of non-volatile organic compounds in soil using ultra-performance liquid chromatography: a pilot study. Forensic Sci Res 7:761–773 10. Idrizi H, Najdoski M, Kuzmanovski I (2021) Classification of urban soils for forensic purposes using supervised self-organizing maps. J Chemom 35:e3328 11. Krongchai C, Funsueb S, Jakmunee J, Kittiwachana S (2017) Application of multiple selforganizing maps for classification of soil samples in Thailand according to their geographical origins. J Chemom 31:e2871 12. Morey BA (2018) Organic profiling of western australian soils for provenance determination. Thesis, Murdoch University, Australian, BSc 13. Newland TG, Pitts K, Lewis SW (2022) Multimodal spectroscopy with chemometrics for the forensic analysis of Western Australian sandy soils. Forensic Chem 28:100417 14. Maione C, da Costa NL, Barbosa FB Jr, Barbosa RM (2022) A cluster analysis methodology for the categorization of soil samples for forensic sciences based on elemental fingerprint. Appl Artif Intell 36:2010941 15. Mazzetto JMI, Melo VF, Bonfleur EJ, Vidal-Torrado P, Dieckow J (2019) Potential of soil organic matter molecular chemistry determine by pyrolsis-gas chromatography/mass spectrometry for forensic investigations. Sci Justice 59:635–642 16. Pye K (2007) Geological and soil evidence: forensic applications. CRC Press, Taylor & Francis Group, Boca Raton 17. R Core Team (2019) R: A language and environment for statistical computing. R version 3.6.2 (2019–12–12), R foundation for statistical computing, Vienna, Austria 18. Lee LC, Liong CY, Jemain AA (2017) A contemporary review on data preprocessing (DP) practice strategy in ATR-FTIR spectrum. Chemom Intell Lab Syst 163:64–75 19. Kohonen T (1990) The self-organizing map. Proc IEEE 78:1464–1480
Chapter 28
Application of Deep Learning for Wafer Defect Classification in Semiconductor Manufacturing Nguyen Thi Minh Hanh and Tran Duc Vi
1 Introduction Semiconductor manufacturing is one of the most important manufacturing industries for many countries, and recently many researchers and practitioners all over the world have tried to utilize Artificial Intelligence (AI) tools to achieve smart semiconductor manufacturing [1]. In semiconductor manufacturing, the wafer, specifically the silicon wafer, is the raw material for fabricating integrated circuits (ICs). According to P. A. Laplante [1], a wafer is a thin slice of semiconductor material on which semiconductor devices are made. It is also called a "slice" or "substrate" and is used in electronics for the making of ICs. The quality of the wafers and of the subsequent processes that make the IC chip directly affects the quality of the finished products [2]. The wafer map provides engineers with important information about defect locations on the wafer. Wafer defect classification helps to detect the root causes of failure in a semiconductor manufacturing process and to determine the stage of manufacturing at which the wafer pattern failure occurs. Improving wafer failure classification helps semiconductor manufacturers increase product quality, save time, and reduce production costs. Conventionally, engineers have to classify the wafer defects manually [2]. This approach takes time and requires expertise and experience from engineers, especially when many wafer maps need to be classified, and hence the company has to spend a significant amount of capital on the classification process [2]. To overcome this issue, machine learning approaches have been applied to classify the defects automatically and reduce the burden of labor costs for this process. However, machine learning has limitations in computing with big and unstructured data like
wafer maps [3]. Deep learning has therefore recently been considered a new method in this field because of its advantages in working with big and unstructured data. On the other hand, since wafer maps are collected from real semiconductor manufacturing, the samples of the failure classes account for a very small percentage of the total training data, creating so-called imbalanced data. Thus, the challenge is not only to classify the big and unstructured wafer map data accurately but also to handle the imbalanced nature of the wafer map datasets. In this study, we propose to use a deep learning architecture, the Convolutional Neural Network (CNN), for wafer defect pattern classification, together with data augmentation methods including cGAN and geometric transformation to handle the imbalanced wafer map data in semiconductor manufacturing. The major contribution of this study is to provide models and tools in an effort to achieve intelligent semiconductor manufacturing. This paper is organized as follows. Section 2 briefly reviews the previous studies. Section 3 describes our methodology. Section 4 presents the experiments and result analysis. The conclusion is given in Sect. 5.
2 Related Work In recent years, many approaches have been researched and proposed to classify wafer defect patterns automatically, including both traditional machine learning and deep learning techniques. For example, using a traditional machine learning model, Wu et al. [2] proposed a Support Vector Machine (SVM) model in combination with new feature extraction methods for the wafer failure classification problem. Two techniques were built to extract the important features: Radon-based features and geometry-based features. The proposed feature extractions have the advantage of reducing the noise of the datasets, which could improve the performance of the algorithms. However, these feature extraction methods did not effectively distinguish failure patterns that have similar shapes, such as Center with Donut, Local with Edge-Ring, and Random with Near-Full. Another machine learning model widely used in classification is the decision tree. Piao et al. [4] investigated that model, using the Radon transform to extract the geometric information of failure patterns in the wafer and applying a decision tree ensemble for classification. The proposed method performed better than logistic regression and SVM, with 90.05% accuracy. For deep learning techniques, thanks to the development of computer vision, deep learning has opened new approaches to solving classification problems. For wafer defect classification, CNNs have been successfully applied [5, 6]. The advantage of CNNs compared with traditional machine learning methods is automatic feature extraction [3]. For example, Saqlain et al. [5] offered a deep learning model for autonomous wafer defect diagnosis (CNN-WDI). The accuracy, recall, and F1-score average values of the CNN-WDI model for the balanced dataset were 96.24%, 96.24%, and 96.22%, respectively. Also, Yu et al. [6] proposed two CNN models for the detection and classification of wafer defect patterns. The paper reported a better result, with an average recognition rate of
93.25%, than other traditional machine learning methods such as SVM, decision tree, and joint local and nonlocal linear discriminant analysis (JLDA) [7]. Besides the selection of appropriate classifiers, handling imbalanced data should be considered as a factor for improving the yield of wafer failure classification. Saqlain et al. [5] applied classical data augmentation strategies such as random rotation, horizontal flipping, and width shift to improve the accuracy of the minority classes. The results of that paper showed that the accuracy, recall, and F1-score for the balanced data increased by approximately 5%, 9%, and 8%, respectively, compared with those for the imbalanced data. Another technique for data augmentation is the Generative Adversarial Network (GAN), which is known to generate new synthetic data. For example, Ji et al. [8] studied a combination of deep Convolutional Generative Adversarial Networks (DCGAN) and classical data augmentation methods for dealing with the imbalance. The results showed that the proposed method had the highest accuracy (98.3%) compared with other data augmentation methods. However, the samples generated by DCGAN are distributed randomly across the classes, so we cannot control the number of new samples we would like to create. From the above literature, there is little research applying deep learning techniques while handling imbalanced wafer defects in semiconductor manufacturing. Our study focuses on different techniques to handle imbalanced data and then applies a deep learning model for wafer defect classification, thus contributing to the knowledge of intelligent semiconductor manufacturing.
3 Methodology Figure 1 shows the proposed framework for our study, which is described in more detail as follows. First, the dataset is split into two subsets (training set and test set), with the test set making up 33% of the data. Since the dataset is highly imbalanced, we apply a combination of oversampling and undersampling techniques to handle the imbalance problem on the training set. Two data augmentation techniques, (i) geometric transformations (GEO) and (ii) conditional Generative Adversarial Networks (cGAN), are used to oversample the minority classes. Meanwhile, most of the None class is dropped randomly until it balances with the minority-class samples. The balanced training dataset is then used to train the CNN model for classification. Finally, the model performance is evaluated by three popular measures for imbalanced problems: recall, precision, and F1-score.
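A minimal sketch of the splitting and majority-class undersampling steps is given below, assuming NumPy arrays `wafer_maps` and `labels` with the None class encoded as 8; the helper name and the target size for the None class are illustrative, and the minority-class oversampling (GEO or cGAN) is described in the following subsections.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def split_and_undersample(wafer_maps, labels, none_label=8, target_size=3700, seed=0):
    # Stratified 67/33 split so the test set keeps the original class proportions.
    x_train, x_test, y_train, y_test = train_test_split(
        wafer_maps, labels, test_size=0.33, stratify=labels, random_state=seed)

    # Randomly drop "None" samples until roughly target_size remain (illustrative value).
    rng = np.random.default_rng(seed)
    none_idx = np.where(y_train == none_label)[0]
    keep_none = rng.choice(none_idx, size=target_size, replace=False)
    keep = np.concatenate([np.where(y_train != none_label)[0], keep_none])

    return x_train[keep], y_train[keep], x_test, y_test
```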
3.1 WM-811k Dataset The WM-811k dataset [2] consists of 811,457 wafer maps collected from real-world fabrication, and there were 172,950 (21.3%) labeled wafer maps in the
Fig. 1 Proposed framework
dataset. The labeled wafer maps were divided into nine defect classes by experienced experts. The None class is the majority class with 147,431 maps (85.2%), while the others are Center 4,294 (2.5%), Donut 555 (0.3%), Edge-Loc 5,189 (3%), Edge-Ring 9,680 (5.6%), Local 3,593 (2.1%), Random 866 (0.5%), Scratch 1,193 (0.7%), and Near-Full 149 (0.1%). Figure 2 shows the nine failure classes in the dataset. Also, the wafer map sizes are not constant and range from (6, 21) to (300, 202).
3.2 Conditional GAN (cGAN) Generative Adversarial Networks (GANs) are generative models that employ two networks: the Generator and the Discriminator [9]. The generator (G) creates fake images from a random noise vector. The discriminator (D) is used to classify the real and fake images. The interaction between the G and D models corresponds to a min-max two-player game [9]. The GAN model converges when both the G and D models reach the Nash equilibrium. However, the GAN cannot control the types of new samples that are generated. The conditional GAN is an extension of the GAN that can control the class for which we would like to generate new samples [10]. Before using the cGAN to generate fake wafer maps, we apply the median filter algorithm with a filter size of 1.5 × 1.5 to reduce the noise on the original wafer maps. The purpose of denoising is to help the cGAN model concentrate on learning the important features. Figure 3 shows the result of the median filter. The proposed cGAN architecture consists of two sub-models, the generator model and the discriminator model.
Fig. 2 Types of wafer defect patterns in WM−811 k Dataset
Fig. 3 The result of median filter
Table 1 The proposed cGAN architecture

Generator                      Discriminator
Input: Z (128)                 Input: (28, 28, 1)
CONVT (128, 4, 0), BN, ReLu    CONV (128, 3, 2), BN, ReLu
CONVT (64, 4, 2), BN, ReLu     CONV (256, 3, 2), BN, ReLu
CONVT (32, 4, 2), BN, ReLu     Output: 1
Output: (28, 28, 1)
Fig. 4 The fake wafer maps generated by cGAN after 200 epochs
As summarized in Table 1, the discriminator model has two convolutional layers with kernel size 3 × 3, while the generator model is built from transposed convolutional layers. Both sub-models have a batch normalization layer after each convolutional layer, and the activation function is ReLu. We denote by CONV (x, y, z) a convolution layer with filters = x, kernel = y × y, and stride = z, and by CONVT (x, y, z) a transposed convolution layer with filters = x, kernel = y × y, and stride = z. The Adam algorithm is used for optimization with a learning rate of 0.0002. Table 1 shows our cGAN architecture, and examples of fake images generated by the cGAN are shown in Fig. 4.
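The following Keras sketch loosely follows the layer listing in Table 1. Because the table does not specify paddings, the final single-channel mapping, or how the class label is injected, the label embeddings, the Dense projection that stands in for the first CONVT (128, 4, 0) block, and the output convolution are simplifying assumptions, not the authors' exact model.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES, LATENT_DIM = 9, 128

def build_generator():
    noise = layers.Input(shape=(LATENT_DIM,))
    label = layers.Input(shape=(1,), dtype="int32")
    # Assumed label conditioning: embed the class and concatenate with the noise.
    y = layers.Flatten()(layers.Embedding(NUM_CLASSES, LATENT_DIM)(label))
    x = layers.Concatenate()([noise, y])
    x = layers.Dense(7 * 7 * 128)(x)          # stands in for CONVT (128, 4, 0)
    x = layers.Reshape((7, 7, 128))(x)
    # CONVT (64, 4, 2) and CONVT (32, 4, 2): 7x7 -> 14x14 -> 28x28.
    x = layers.Conv2DTranspose(64, 4, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x); x = layers.ReLU()(x)
    x = layers.Conv2DTranspose(32, 4, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x); x = layers.ReLU()(x)
    out = layers.Conv2D(1, 3, padding="same", activation="tanh")(x)  # assumed output head
    return tf.keras.Model([noise, label], out)

def build_discriminator():
    img = layers.Input(shape=(28, 28, 1))
    label = layers.Input(shape=(1,), dtype="int32")
    # Assumed label conditioning: embed the class as an extra image channel.
    y = layers.Reshape((28, 28, 1))(layers.Embedding(NUM_CLASSES, 28 * 28)(label))
    x = layers.Concatenate()([img, y])
    # CONV (128, 3, 2) and CONV (256, 3, 2): 28x28 -> 14x14 -> 7x7.
    x = layers.Conv2D(128, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x); x = layers.ReLU()(x)
    x = layers.Conv2D(256, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x); x = layers.ReLU()(x)
    out = layers.Dense(1, activation="sigmoid")(layers.Flatten()(x))
    return tf.keras.Model([img, label], out)
```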
3.3 Geometric Transformations (GEO) Geometric transformation is one of the most common techniques for balancing image data [11]. In this study, we apply horizontal and vertical flips to create new
Fig. 5 The geometric transformation results of defect 5 (random)
images for minority classes. Figure 5 shows the geometric transformation result of defect 5 (random).
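A simple sketch of these flips is shown below; each flipped copy keeps the label of its source wafer map, and the random array is a placeholder.

```python
import numpy as np

def flip_augment(wafer_map):
    return [np.fliplr(wafer_map),   # horizontal flip
            np.flipud(wafer_map)]   # vertical flip

wafer_map = np.random.randint(0, 3, size=(28, 28))  # placeholder encoded wafer map
augmented = flip_augment(wafer_map)
```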
3.4 Convolutional Neural Networks (CNN) Our proposed CNN architecture consists of three convolutional layers, three max-pooling layers, and two fully connected layers. Table 2 shows the details of the proposed CNN architecture. All three convolutional layers use a kernel size of 3 × 3 with the Rectified Linear Unit (ReLu) activation function, and the stride is equal to one. A max-pooling layer with a kernel size of 2 × 2 is added after each convolutional layer to reduce the dimensionality of the representation and thus further reduce the number of parameters and the computational complexity of the model [12]. The two fully connected layers contain 512 and 256 nodes, respectively, with the ReLu function, and connect directly to the softmax layer, which calculates the final class probability result. The model is set up to run through 100 epochs with a batch size of 1024.
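The architecture of Table 2 can be written in Keras as sketched below; the optimizer and loss are assumptions, since they are not stated for the classifier in this section.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_wafer_cnn(input_shape=(28, 28, 1), num_classes=9):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Optimizer/loss are assumed here, not taken from the chapter.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage (placeholders): model.fit(x_train, y_train, epochs=100, batch_size=1024)
```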
4 Experiments and Results For the data preprocessing, we randomly collected only 51,885 (30%) labeled wafer maps from the raw dataset for the experiment implementation. Table 3 shows the number of selected samples of each class. All selected wafer maps were rescaled to a size of 28 × 28. Each defect name was assigned a number from 0 to 8 for computation. The experiment consists of three scenarios with two approaches for balancing the imbalanced data (geometric transformation and cGAN) and one classification model (CNN). In the first scenario, we use the imbalanced data for training. The second
Table 2 The proposed CNN architecture

Layer             Feature maps   Output shape    Filter shape   Activation
Input             1              (28, 28, 1)
Convolution 1     32             (28, 28, 32)    (3, 3)         ReLu
Maxpooling 1      32             (14, 14, 32)    (2, 2)
Convolution 2     64             (14, 14, 64)    (3, 3)         ReLu
Maxpooling 2      64             (7, 7, 64)      (2, 2)
Convolution 3     128            (7, 7, 128)     (3, 3)         ReLu
Maxpooling 3      128            (3, 3, 128)     (2, 2)
Fully connected   1              512                            ReLu
Fully connected   1              256                            ReLu
Output            1              9                              Softmax
Table 3 The number of samples in each scenario

Class   Defect name   Testing set   Training set (original)   Training set (cGAN)   Training set (GEO)
0       Center        453           835                       4175                  4175
1       Donut         62            104                       3640                  3640
2       Edge-Loc      523           1034                      4136                  4136
3       Edge-ring     977           1927                      3854                  3854
4       Loc           368           710                       4260                  4260
5       Random        83            177                       3717                  3717
6       Scratch       130           228                       3648                  3648
7       Near full     19            26                        3614                  3614
8       None          14,508        29,721                    3721                  3721
Total                 17,123        34,762                    34,765                34,765
scenario uses the data balanced by the cGAN model. The third scenario uses the data balanced by geometric transformation (horizontally and vertically flipping the original data) to create new images for the minority classes. Table 3 shows the number of samples of each class in the test set and in the training set of each scenario.
4.1 Model Evaluation In this study, we use several measures to evaluate the performance of the model: accuracy, precision, recall, and F1-score. The formula for each measure is defined as below:
Recall = TP / (TP + FN)    (1)

Precision = TP / (TP + FP)    (2)

F1-score = 2 × (Precision × Recall) / (Precision + Recall)    (3)

Accuracy = (TP + TN) / (TP + FP + FN + TN)    (4)
where TP, TN, FP, and FN denote true positive, true negative, false positive, and false negative, respectively. Since our training dataset is imbalanced, precision, recall, and F1-score are considered the main metrics to evaluate the performances, and accuracy is used as a reference [13].
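For reference, the four measures in Eqs. (1)-(4) can be computed with scikit-learn as sketched below; the label arrays are placeholders, and macro averaging is one reasonable choice for a multi-class, imbalanced problem.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 8, 8, 3, 6, 8, 0]   # placeholder ground-truth class indices
y_pred = [0, 8, 3, 3, 6, 8, 8]   # placeholder model predictions

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro", zero_division=0))
print("recall   :", recall_score(y_true, y_pred, average="macro", zero_division=0))
print("f1-score :", f1_score(y_true, y_pred, average="macro", zero_division=0))
```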
4.2 Results and Analysis Figure 6 presents the detailed results of the evaluation metrics (precision, recall, F1-score, and accuracy) in each scenario. Since the original dataset is highly imbalanced, the accuracy metric is not suitable [13]; recall and precision are therefore used for model evaluation. Precision measures the proportion of predicted positives that are actually positive [14], while recall measures the fraction of positive patterns that are correctly classified [14]. Although both measurements (precision and recall) are commonly used in imbalance problems, in this case recall is our priority metric, because the higher the recall value, the fewer false negatives the model produces. In the semiconductor
Fig. 6 The detailed result of each scenario
manufacturing, the cost of a false negative is higher than the cost of a false positive. As can be seen in Fig. 6, the recall value of the None class in scenario 1 is very high at 98%, but the performance on the minority classes is poor compared with the other scenarios due to the imbalance in the training dataset. In particular, the recall values of minority classes such as Donut, Edge-Loc, Loc, and Scratch are only 71%, 61%, 53%, and 8%, respectively. Scenario 3 has the highest recall, with an average of 80%, and second place belongs to scenario 2 with 75%, while the recall of scenario 1 is only 71%, the lowest among the three scenarios. Most of the recall values of the minority classes in scenarios 2 and 3 are higher than those of scenario 1. For instance, the recall of the Scratch defect class in scenarios 2 and 3 increases significantly, by 7% and 29%, respectively, in comparison with the first scenario. As a result of the experiment, we conclude that the data augmentation techniques improve the CNN model performance; in this study, the geometric transformation method gives a better result than the conditional GAN method. It can be seen in Fig. 7 that Loc and Scratch are the two classes with a high misclassification rate. The rates of misclassifying Loc as Edge-Loc, Scratch, and None are 7.9%, 10%, and 9%, respectively, while the Scratch class has the highest misclassification of all classes, with rates of 22% and 32% for the Loc and None classes, respectively. The explanation for this phenomenon is the presence of multiple defect patterns on the surface of some wafer maps. However, each wafer map in the original dataset has a single label; therefore, the model is confused when classifying wafer maps that have more than one defect pattern. Figure 8 presents the validation accuracy and loss of the three scenarios after 100 epochs. In Fig. 8a, scenario 2 has the highest accuracy while the lowest belongs to scenario 3. The validation accuracies of scenarios 2 and 3 both increase significantly from the first epoch, in contrast to the validation accuracy of scenario 1, which increases steadily. In Fig. 8b, the validation loss of scenario 2 is the lowest, while scenario 1 has the highest loss at epoch 100.
5 Conclusion In this study, the use of a deep learning model based on a convolutional neural network for wafer defect pattern classification was proposed. We used the WM-811k dataset, which includes nine defect classes. Unfortunately, this dataset is highly imbalanced. Thus, we applied two data augmentation methods to handle this issue. Our results indicate that the two data augmentation methods were associated with improvement of the CNN model performance, and the geometric transformation method had better performance than the cGAN method. Although the geometric transformation method
Fig. 7 The normalized confusion matrix of scenario 3
was found to be the better data augmentation technique here, it is effective only when the wafer defects have symmetric characteristics. Consequently, the cGAN method is still expected to be a potential method for generating synthetic data that contains the characteristic features of the original data. However, optimizing the quality of the wafer images created by the cGAN model is not addressed in this study. For future work, the improvement of the quality of the data generated by the cGAN model will be discussed.
Fig. 8 The performance results of three scenarios: a The accuracy of model, b The loss of model
References 1. Laplante PA (2005) Comprehensive dictionary of electrical engineering, 2nd edn. CRC Press, Boca Raton 2. Wu M-J, Jang J-SR, Chen J-L (2015) Wafer map failure pattern recognition and similarity ranking for large-scale data sets. IEEE Trans Semicond Manuf 28:1–12 3. Qiu J, Wu Q, Ding G et al (2016) A survey of machine learning for big data processing. EURASIP J Adv Signal Process 2016(1):1–16 4. Piao M, Jin CH, Lee JY, Byun JY (2018) Decision tree ensemble-based wafer map failure pattern recognition based on radon transform-based features. IEEE Trans Semicond Manuf 31(2):250–257 5. Saqlain M, Abbas Q, Lee JY (2020) A deep Convolutional neural network for wafer defect identification on an imbalanced dataset in semiconductor manufacturing processes. IEEE Trans
Semicond Manuf 33(3):436–444 6. Yu N, Xu Q, Wang H (2019) Wafer defect pattern recognition and analysis based on convolutional neural network. IEEE Trans Semicond Manuf 32(4):566–573 7. Yu J, Lu X (2015) Wafer map defect detection and recognition using joint local and nonlocal linear discriminant analysis. IEEE Trans Semicond Manuf 29(1):33–43 8. Ji Y, Lee JH (2020) Using GAN to improve CNN performance of wafer map defect type classification: Yield enhancement. In: 2020 31st annual SEMI advanced semiconductor manufacturing conference (ASMC), pp 1–6 9. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, ... Bengio Y (2020) Generative adversarial nets. Commun ACM 63(11):139–144 10. Mirza M, Osindero S (2014) Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 11. Shorten C, Khoshgoftaar TM (2019) A survey on image data augmentation for deep learning. J Big Data 6(1):1–48 12. O'Shea K, Nash R (2015) An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458 13. Grandini M, Bagli E, Visani G (2020) Metrics for multi-class classification: an overview. arXiv preprint arXiv:2008.05756 14. Hossin M, Sulaiman MN (2015) A review on evaluation metrics for data classification evaluations. Int J Data Mining Knowl Manag Process 5(2):1
Chapter 29
Helping the Farmer with the Detection of Potato Leaf Disease Classification Using a Convolutional Neural Network Surya Kant Pal , Vineet Roy, Rita Roy , P. S. Jha, and Subhodeep Mukherjee
1 Introduction In a world of heavy use of agricultural products, it is essential to prevent crops from rotting in order to avoid problems such as shortages in output [1]. This paper aims to support farmers who suffer significant financial losses every year due to several diseases affecting potato plants. The most common examples are Early Blight and Late Blight. Fungal infection causes early blight, while specific microorganisms cause late blight. Farmers could save a lot of waste and money if they discovered these diseases early and treated them adequately [2]. The treatment of the conditions mentioned above is somewhat difficult, time-consuming, and different from the usual methods, so it is essential to detect accurately which disease is present in the potato plant. We therefore propose using a Convolutional Neural Network (CNN) to detect the leaf disease using pictures of all three classes required to diagnose the threat.
S. K. Pal (B) · V. Roy Department of Mathematics, School of Basic Sciences and Research, Sharda University, Greater Noida 201306, India e-mail: [email protected] V. Roy e-mail: [email protected] R. Roy Department of Computer Science and Engineering, GITAM Institute of Technology, GITAM (Deemed to be University), Visakhapatnam, Andhra Pradesh, India P. S. Jha Department of Statistics, Patna College, Patna University, Bihar, India S. Mukherjee Department of Management, GITAM (Deemed to Be University), Visakhapatnam, Andhra Pradesh, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_29
Potatoes are among the most widely consumed vegetables and suffer from various diseases, which cause shortages in supply and financial losses for farmers [3]. Diseases like Early Blight and Late Blight are the most widely recognized and most frequently occurring [4, 5]. Early blight is caused by fungi, and late blight is caused by specific microorganisms [9]. They can be dealt with if recognized early and proper measures are taken by the farmers, saving a great deal of waste and preventing financial losses [6]. Because the treatments of early and late blight are different, it is vital to classify precisely which disease is present using images of the potato leaves [7, 10]. A Convolutional Neural Network is used to classify the leaves into the Early Blight, Late Blight, and Healthy categories, the most commonly occurring ones [8].
2 Review of Literature There is a clever picture dealing with computation subject to the candidate’s area of interest acknowledgement in blending in with simple gathering techniques to deal with recognizable sickness in wild circumstances [11, 12]. Another procedure incorporates the utilization of convolutional brain network models to perform the discovery of a few plant sicknesses, what’s more, distinguish the idea of disease by assessment of the side effects by utilizing a few pictures of leaves, which ought to incorporate the two its sound and unhealthy surfaces, where the conclusion has done through profound learning philosophies [13, 14]. Another methodology is where the caught picture of the impacted leaf is first preprocessed with a picture-handling calculation. Gaussian smoothing administrator is a 2-D complexity administrator for smoothened images and eliminating commotion. Ensuing to inspect all recently referenced frameworks and methods, we can derive that there are a couple of various ways by which we can perform areas of illnesses that occur in plants [15, 16]. Each has a couple of focal points similar to imprisonment [17]. Like this, there is a degree of progress in the flow of research. Picture getting ready is a strategy that works on all flow exploration and gives the speedy and exact outcome of plant ailments. In evaluating all recently referenced procedures and methods, we can reason that there are a couple of ways by which disclosure of disorder of plants ought to be conceivable. Picture planning is a methodology that further develops all investigations with a target reality and gives the fast and exact delayed consequence of plant disease. The profound learning method was viewed as the most accurate in anticipating the infection from the caught picture of a leaf [18, 19]. Different fashioners have encountered mechanized plant infection recognizing evidence by motorized visual insightful systems. One among them was an android application called Plantix. It is a versatile reap cautioning for farmers and nursery
laborers, which breaks down plant illnesses, bug hurt, and deficiencies affecting yields and offers related treatment measures [20, 21]. It utilizes brain networks for preparing the model. We should catch the engaged and clear picture of the plant leaf to be analyzed or, on the other hand, could be imported from the display. The application then, at that point, inquires for coordinates with their images from the dataset. Subsequently, the rancher has to physically distinguish the infection worried about the leaf by checking matches [22]. A strategy for the recognizable proof of early side effects of plant sicknesses. An assortment of mechanized visual symptomatic techniques was advanced from various investigations to address automatic plant sickness discovery [23, 24]. One among them was utilizing Mobile catch gadgets to analyze plant illnesses. This strategy used a picture-handling computation subject to acknowledgement of the applicant’s area of interest acknowledgement collectively with some quantifiable induction methodologies dealing with illness acknowledgement in inclement conditions [25]. Assessing plant infection is disengaged into three segments, be explicit, evaluate the event, earnest, and yield misfortune [26]. Notwithstanding the way that it is fundamental to recognize plant sicknesses in a starting period to stay away from yield misfortune, it is incomprehensible. It requires it is not yet developed to learn since the indications. Subsequently, there is a need to give plant contamination area instruments subject to picture dealing with. It provides comprehension into a teachable structure to recognize plant ailments supported in three European endemic wheat illnesses using based methodologies in the mix with measurable induction techniques [27, 28]. Plant disease detailed verification, as used here, fuses the confirmation of the likelihood that a particular affliction is Accessible. Despite the way that other sensor contraptions for prompt or atypical concealing assortment acknowledgement could give important information as earnestness and spread in the plant or yield, they don’t make an adequate number of data to examine a specific hurt in the plant, including biotic (infection, nuisance, and weeds) furthermore, abiotic gambles. Thus, there is a need to give a picture dealing with based plant disorder conspicuous confirmation, to investigate illnesses in their underlying progression stages to have the choice to answer on schedule with gather affirmation applications [29]. The mechanized plant illness location framework utilizes the Deep learning model and incorporates a convolutional brain network as a learning apparatus. CNN is prepared with detailed pictures of leaves containing sound and infection in a research facility and under field conditions [30, 31]. This model was created as a development adaptation of the model planned by Mohanty, which contrasted just two structures of CNN and 26 plant illnesses and a database with images of 14 plants. Still, the problem was that even though it gave a 99.35 progress rate, it couldn’t involve constant circumstances as the analysis was led distinctly on research center pictures [32]. Here data set incorporates around 87,848 photos which include sound and contaminated plants in the two circumstances where it contains 25 plant species, and this information base is utilized for preparing, what’s more, trying. 
The information base is again ordered as 58 classes, where each category includes a couple of meanings of plant and illness and a few classes incorporate concrete plants. This information base is
again separated into two sections: the preparing set and the testing set (partitioned according to 80/20 proportion). In this way, two CNN models were created: one was prepared on research facility pictures and tried with field ones, and the other was trained on the field or real-time cultivation pictures and tested on research facility pictures.
3 Proposed Methodology 3.1 Data Collection Our data creators work with farmers, going towards the fields and asking farm owners for photographs of leaves, or they might take pictures themselves and categories them with the help of farmers. We would use readymade information from Kaggle in this research work.
3.2 Data Preprocessing Preprocessing the data is the second most challenging stage of our data science project for the purposes mentioned earlier. In this Deep Learning Project, the dataset includes 2176 images.
3.3 Convolutional Neural Network (CNN) In deep learning, a convolutional neural network (CNN) is a class of deep neural network mainly deployed in image analysis and recognition. Figure 1 shows the proposed framework. Convolution layers. The convolution layer is the CNN's initial layer, and its job is to extract various characteristics from the input pictures. Pooling layer. The pooling layer is used to reduce the size of the convolved features, resulting in a reduction in the required processing power. This is accomplished by reducing the dimensionality of the feature maps.
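As an illustration of this convolution-plus-pooling structure, a small Keras sketch ending in a three-way softmax (Early Blight, Late Blight, Healthy) is given below. The layer sizes, the 256 × 256 input resolution, and the optimizer are assumptions; the chapter does not list its exact architecture here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

potato_cnn = models.Sequential([
    layers.Input(shape=(256, 256, 3)),             # assumed input resolution
    layers.Rescaling(1.0 / 255),                   # rescaling, as mentioned in the conclusion
    layers.Conv2D(32, (3, 3), activation="relu"),  # convolution layer: feature extraction
    layers.MaxPooling2D((2, 2)),                   # pooling layer: shrinks the feature maps
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(3, activation="softmax"),         # Early Blight / Late Blight / Healthy
])
potato_cnn.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
```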
Fig. 1 Proposed deep learning structure
4 Model Evaluation To check the accuracy of our model, i.e., the classification using the convolutional network, the dataset is split into three parts: 80% for training, 10% for testing, and 10% for validation. Table 1 shows the comparison between accuracy and loss for the dataset. A machine learning algorithm is optimized using a loss function: the loss is calculated from the model's performance on the training and validation sets as the total error produced over the examples in each set. A model's loss value indicates how well or poorly it performs after each optimization cycle. The algorithm's performance is also evaluated using an easily interpretable accuracy metric [33]. A model's accuracy is often assessed after setting the input parameters and is expressed as a percentage; it measures how closely the model's forecasts match the actual data. The accuracy and loss values for the training, test, and validation sets are shown in Table 1. Accuracy is a method for gauging a classification model's effectiveness (Fig. 2). It is frequently stated as a percentage and counts the forecasts where the anticipated value matches the actual value, a binary outcome (true or false) for each sample. During the training phase, accuracy is frequently graphed and tracked, although the reported number usually refers to the overall or final model accuracy [34]. Loss is more difficult to interpret than accuracy. A loss function, often referred to as a cost function, considers a prediction's probabilities or uncertainty based on how much the forecast deviates from the actual value.
Table 1 Model performance of the dataset

Dataset      Accuracy   Loss
Training     0.9913     0.0222
Test         0.9960     0.0098
Validation   0.9948     0.0098
Fig. 2 Visualizing the training and validation datasets
This gives us a more detailed understanding of how the model is doing. We often see that accuracy rises as loss falls, but this is not always the case: loss and accuracy measure different things and have distinct definitions, and there is no direct mathematical connection between the two, even though they frequently appear to be inversely proportional (Fig. 3). The model's web application is built using Streamlit, an open-source app framework in Python. The web application is deployed on Heroku, a platform similar to AWS that is also used for model deployment.
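A hedged sketch of such a Streamlit front end is shown below; the model file name, class names, image size, and preprocessing are placeholders rather than the authors' deployed application.

```python
import numpy as np
import streamlit as st
import tensorflow as tf
from PIL import Image

CLASS_NAMES = ["Early Blight", "Late Blight", "Healthy"]

st.title("Potato Leaf Disease Classifier")
uploaded = st.file_uploader("Upload a potato leaf image", type=["jpg", "png"])

if uploaded is not None:
    image = Image.open(uploaded).convert("RGB").resize((256, 256))
    st.image(image, caption="Uploaded leaf")
    model = tf.keras.models.load_model("potato_cnn.h5")   # placeholder model file
    batch = np.expand_dims(np.array(image), axis=0)       # shape (1, 256, 256, 3)
    probs = model.predict(batch)[0]
    st.write(f"Prediction: {CLASS_NAMES[int(np.argmax(probs))]} "
             f"({100 * float(np.max(probs)):.1f}% confidence)")
```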
5 Conclusion and Future Work The model is built using resizing, rescaling, random flips, etc., but a better choice would be to use an image data generator. The model gives results above 99%, which is good, but it was not built on the full set of 2152 images in addition to the augmented data. In the future, the model can be built on the whole dataset. Since photographing every single leaf is a tiresome and lengthy process on big farm fields, drones with cameras can be used to speed up detection, which is a much better choice than photographing each leaf one at a time. The model was built using a CPU;
Fig. 3 Testing the model’s learning
to speed up the process, better RAM and a GPU should be used. One approach could be to use VGG16, VGG19, or AlexNet and to apply transfer learning, which can reduce the computation time in the training phase and yield better testing accuracy.
References 1. Geetharamani G, Pandian A (2019) Identification of plant leaf diseases using a nine-layer deep convolutional neural network. Comput Electr Eng (2019) 2. Saleem MH, Potgieter J, Arif KM (2019) Plant disease detection and classification by deep learning. Plants
3. Kamilaris A, Prenafeta-Boldú FX (2018) Deep learning in agriculture: a survey. Comput Electron Agric 4. Mukherjee S, Baral MM, Pal SK, Chittipaka V, Roy R, Alam K (2022) Humanoid robot in healthcare: a systematic review and future research directions. International conference on machine learning, big data, cloud and parallel computing (COM-IT-CON), Vol 1, pp 822–826. IEEE 5. Roy R, Baral MM, Pal SK, Kumar S, Mukherjee S, Jana B (2022) Discussing the present, past, and future of Machine learning techniques in livestock farming: a systematic literature review. International conference on machine learning, big data, cloud and parallel computing (COM-IT-CON). Vol. 1, pp 179–183. IEEE 6. Vishnoi VK, Kumar K, Kumar B (2021) Plant disease detection using computational intelligence and image processing. J Plant Diseases Protect 7. Mukherjee S, Chittipaka V, Baral MM, Pal SK, Rana S (2022) Impact of artificial intelligence in the healthcare sector. Artif Intell Indus 4 8. Vallabhajosyula S, Sistla V, Kolli VK (2022) Transfer learning-based deep ensemble neural network for plant leaf disease detection. J Plant Diseases Protect 9. Dhaka VS, Meena SV, Rani G, Sinwar D, Ijaz MF, Wo´zniak M (2021) A survey of deep convolutional neural networks applied for prediction of plant leaf diseases. Sensors 10. Hassan SM, Maji AK, Jasi´nski M, Leonowicz Z, Jasi´nska E (2021) Identification of plant-leaf diseases using CNN and transfer-learning approach. Electronics 11. Lu J, Tan L, Jiang H (2021) Review on convolutional neural network (CNN) applied to plant leaf disease classification. Agriculture 12. Mukherjee S, Chittipaka V, Baral MM, Srivastava SC (2022) Can the supply chain of Indian SMEs adopt the technologies of industry 4.0?. In: Advances in mechanical and industrial engineering. CRC Press 13. Atila Ü, Uçar M, Akyol K, Uçar E (2021) Plant leaf disease classification using Efficient Net deep learning model. Ecol Inf 14. Baral MM, Mukherjee S, Chittipaka V, Srivastava SC, Pal SK (2022) Critical components for food wastage in food supply chain management. In: Advances in mechanical and industrial engineering. CRC Press 15. Johannes A, Picon A, Alvarez-Gila A, Echazarra J, Rodriguez-Vaamonde S, Navajas AD, OrtizBarredo A (2017) Automatic plant disease diagnosis using mobile capture devices, applied on a wheat use case. Comput Electron Agric 16. Mukherjee S, Baral MM, Chittipaka V, Pal SK, Nagariya R (2022) Investigating sustainable development for the COVID-19 vaccine supply chain: a structural equation modelling approach. J Humanitarian Logist Supply Chain Manag 17. Too EC, Yujian L, Njuki S, Yingchun L (2019) A comparative study of fine-tuning deep learning models for plant disease identification. Comput Electron Agric 18. Ferentinos KP (2018) Deep learning models for plant disease detection and diagnosis. Comput Electron Agric 19. Pal SK, Baral MM, Mukherjee S, Venkataiah C, Jana B (2022) Analyzing the impact of supply chain innovation as a mediator for healthcare firms’ performance. Mater Today Proc 20. Sladojevic S, Arsenovic M, Anderla A, Culibrk D, Stefanovic D (2016) Deep neural networks based recognition of plant diseases by leaf image classification. Comput Intell Neurosci 21. Mukherjee S, Chittipaka V (2021) Analysing the adoption of intelligent agent technology in food supply chain management: empirical evidence. FIIB Bus Rev 22. Bangari S, Rachana P, Gupta N, Sudi PS, Baniya KK (2022) A survey on disease detection of a potato leaf using CNN. 
Second international conference on artificial intelligence and smart energy (ICAIS) IEEE 23. Chen W, Chen J, Zeb A, Yang S, Zhang D (2022) Mobile convolution neural network for the recognition of potato leaf disease images. Multimedia Tools Appl 24. Jha K, Doshi A, Patel P, Shah M (2019) A comprehensive review on automation in agriculture using artificial intelligence. Artif Intell Agric
29 Helping the Farmer with the Detection …
349
25. Agarwal M, Sinha A, Gupta SK, Mishra D, Mishra R (2020) Potato crop disease classification using convolutional neural network. InSmart Syst IoT Innov Comput. Springer, Singapore 26. Mahum R, Munir H, Mughal ZU, Awais M, Sher, Khan F, Saqlain M, Mahamad S, Tlili I (2022) A novel framework for potato leaf disease detection using an efficient deep learning model. Human Ecol Risk Assess Int J 27. Sarker MR, Borsha NA, Sefatullah M, Khan AR, Jannat S, Ali H (2022) A deep transfer learning-based approach to detect potato leaf disease at an earlier stage. Second international conference on advances in electrical, computing, communication and sustainable technologies (ICAECT). IEEE 28. Barbedo JG (2019) Plant disease identification from individual lesions and spots using deep learning. Biosyst Eng 29. Chakraborty KK, Mukherjee R, Chakroborty C, Bora K (2022) Automated recognition of optical image-based potato leaf blight diseases using deep learning. Physiol Molecular Plant Pathol 30. Rashid J, Khan I, Ali G, Almotiri SH, AlGhamdi MA, Masood K (2021) Multi-level deep learning model for potato leaf disease recognition. Electronics 31. Baral MM, Mukherjee S, Nagariya R, Patel BS, Pathak A, Chittipaka V (2022) Analysis of factors impacting firm performance of MSMEs: lessons learnt from COVID-19. Benchmarking Int J 32. Lee TY, Lin IA, Yu JY, Yang JM, Chang YC (2021) High efficiency disease detection for potato leaf with convolutional neural network. SN Comput Sci 33. Tutuncu K, Cinar I, Kursun R, Koklu M (2022) Edible and poisonous mushrooms classification by machine learning algorithms. 11th mediterranean conference on embedded computing (MECO). IEEE 34. Rahman H, Faruq MO, Hai TB, Rahman W, Hossain MM, Hasan M, Islam S, Moinuddin M, Islam MT, Azad MM (2022) IoT enabled mushroom farm automation with Machine Learning to classify toxic mushrooms in Bangladesh. J Agric Food Res
Chapter 30
Analysis of ML-Based Classifiers for the Prediction of Breast Cancer
Bikram Kar and Bikash Kanti Sarkar
1 Introduction
1.1 Background
A report published by WHO states that, globally, about 627,000 women died of breast cancer in 2018, which is almost 15% of all female cancer deaths, and it is the most common form of cancer in Indian women [1–3]. Structured and opportunistic screening programs significantly reduce breast cancer-related mortality in industrialized nations [4]. Classification techniques based on machine learning can help detect and diagnose breast cancer and plan its treatment or surgical therapy.
1.2 Related Works
Many researchers have applied data mining to various clinical datasets to efficiently predict breast cancer, and ML-based algorithms are widely used for such complicated tasks because they offer strong classification results. A few important studies in this area are summarized below. Asri et al. [5] compared four widely used classifiers (SVM, Naïve Bayes, C4.5, and k-NN) to evaluate their efficiency and effectiveness, and SVM demonstrated its efficacy in the identification of cancer cells. Amrane et al. [6] customized Naïve Bayes and k-NN classifiers to detect cancerous and non-cancerous tumors, with performance evaluated by accuracy. Ganggayah et al. [7] developed a prediction model based on the DT, Random
Forest, Neural Network (NN), Extreme Boost, Logistic Regression, and SVM classifiers to identify breast cancer cells; the clinical data were collected from the Malaya Medical Centre, Kuala Lumpur. Alzu'bi et al. [8] created an NLP-based method to extract crucial information about breast cancer from clinical records. The extracted characteristics were used to create a breast cancer medical dictionary, and different machine learning algorithms were then applied to the extracted data to detect breast cancer recurrence in patients. A Multiverse Optimization (MVO) algorithm-based segmentation approach was proposed by Kar et al. [9] for breast lesion detection. Hazra et al. [10] applied NB, SVM, and Ensemble classifiers on selected features, where feature selection was performed with Pearson Correlation Coefficient-based and PCA-based methods. Omondiagbe et al. [11] integrated an LDA-based feature selection approach with SVM (radial basis kernel), ANN, and NB classifiers to identify the most suitable approach. Alghodhaifi et al. [12] utilized a CNN to identify and categorize invasive ductal carcinoma in breast histopathology images, reaching approximately 88% accuracy. Mohammed et al. [13] tested three classifiers (NB, SVM, and J48), with the data first discretized and then missing and unwanted values eliminated from the dataset; the instances were resampled with a resample filter to maintain a uniform class distribution. Saoud et al. [14] evaluated six machine learning algorithms, namely Bayes Net (BN), SVM, ANN, k-NN, Decision Tree (C4.5), and Logistic Regression (LR), with the simulations performed in the Weka tool. A comparative study to predict the recurrence of breast cancer using classification and clustering was performed by Ojha et al. [15], where the accuracy of the classification approaches was greater than that of clustering. Pritom et al. [16] applied NB, C4.5 Decision Tree, and SVM classifiers on the WBC dataset, using an efficient feature selection algorithm to improve the accuracy. A data mining and ensemble-based learning approach was used by Mohammed et al. [17] to predict the various forms of breast cancer; NB, SVM, GRNN, and J48 were applied on the Wisconsin Breast Cancer dataset. Kar et al. [18] proposed a hybrid feature selection approach based on Information Gain and Correlation Coefficient for medical domain datasets; this approach was applied on seventeen medical domain datasets and in most cases it improved the results.
1.3 Objectives
This paper aims to identify cancerous and non-cancerous cells from a breast cancer dataset using a machine learning classification approach. A comparative study is performed between six classifiers: Logistic Regression, SVM, k-NN, Random Forest, Naïve Bayes, and Decision Tree. The empirical results reveal that Logistic Regression and SVM provide better results than the other classifiers.
2 Methodology
The proposed model has four parts: data collection, data preprocessing, feature extraction, and classification of cancer cells. The related details are described below.
2.1 Data Collection
The first part of this study is data collection. For this study, the Wisconsin Diagnostic Breast Cancer dataset was taken, which is available in the UCI Data Repository. It comprises 569 occurrences and 32 characteristics with no missing values, as shown in Fig. 1. The outcome variable is either benign (357 observations) or malignant (212 observations) (Table 1).
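For readers who want to reproduce this step, the same UCI WDBC data is also bundled with scikit-learn (this loader is not mentioned by the authors; it drops the ID and diagnosis columns, exposing the 30 numeric features and a binary target). A minimal sketch:

```python
from sklearn.datasets import load_breast_cancer

# Wisconsin Diagnostic Breast Cancer: 569 samples, 30 numeric features,
# binary outcome (malignant = 212, benign = 357).
data = load_breast_cancer()
X, y = data.data, data.target

print(X.shape)             # (569, 30)
print(data.target_names)   # ['malignant' 'benign']
```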
2.2 Data Preprocessing
The second part of this study is data preprocessing, which helps to produce high-quality, error-free data. Data quality affects classification accuracy, so the data must be unambiguous, accurate, and complete. Data preprocessing is used to fill in missing values and remove inconsistencies from the dataset. Because data are usually collected from different sources, they may contain various kinds of redundant and irrelevant entries, and a data cleaning step is used to remove them.
Fig. 1 Bar plot of feature importance based on extra tree classifier
Table 1 Brief information of the dataset
Dataset                                       Total characteristics   Total occurrences   Classes
Wisconsin Diagnostic Breast Cancer (WDBC)     32                      569                 2
If there is any missing value in the dataset, it is filled with the attribute mean. Note, however, that the Wisconsin Diagnostic Breast Cancer dataset contains no such errors.
2.3 Feature Selection
The ExtraTrees Classifier, or Extremely Randomized Trees Classifier [19, 20], is an ensemble tree-based classifier from the scikit-learn package. It is used to compute impurity-based feature importance, which is then utilized to eliminate unnecessary features. Based on this Extra Tree Classifier feature selection step, the six least important features are removed from the dataset to obtain a better result.
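A minimal sketch of this feature selection step, assuming scikit-learn's ExtraTreesClassifier and a pandas DataFrame X of features with labels y (the estimator settings shown here are illustrative; the authors do not report their exact parameters):

```python
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier

def drop_least_important(X: pd.DataFrame, y, n_drop: int = 6):
    """Rank features by impurity-based importance and drop the n_drop weakest."""
    forest = ExtraTreesClassifier(n_estimators=100, random_state=42)
    forest.fit(X, y)
    importance = pd.Series(forest.feature_importances_, index=X.columns)
    weakest = importance.nsmallest(n_drop).index      # the six least important features
    return X.drop(columns=weakest), importance.sort_values(ascending=False)

# X_selected, ranking = drop_least_important(X, y)
# `ranking` can be plotted as a bar chart like the one in Fig. 1.
```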
2.4 Classification
Classification is a predictive modeling task in machine learning that predicts a class label for a given sample of input data. In this work, six well-known and established classifiers are employed; a brief description of each is given below.
Logistic regression (LR) is a supervised learning approach used to estimate the probability of a given outcome. It is used when the data are linearly separable and the result is binary or dichotomous, so logistic regression is often applied to binary classification tasks, i.e., predicting a discrete output that falls into one of two classes.
SVM is a supervised learning method that may be applied to both regression and classification problems, although it is mostly employed for classification. Each data sample is represented as a point in n-dimensional space, with the value of each feature corresponding to a particular coordinate. Classification is then completed by finding the hyper-plane that clearly separates the two classes.
k-NN is a supervised learning technique used for classification and regression. It uses the K nearest neighbors (data points) to predict the class or continuous value of a new data point.
Random Forest classifiers contain a number of decision trees, collect forecasts from each tree, and predict the final output based on the majority vote of those predictions.
Naïve Bayes is a supervised learning technique for classification problems that is based on the Bayes theorem. It is mainly utilized in text classification tasks involving large training datasets.
Decision tree is one of the most popular methods for classification and prediction. In a decision tree, each internal node stands in for an
attribute test, each branch indicates the test’s outcome, and each leaf node (terminal node) contains a class label. The structure is similar to a flowchart.
3 Proposed Framework
The framework proposed in this study is capable of detecting breast cancer at the primary stage. As a result, treatment can be started early, which can reduce the death rate due to breast cancer to a great extent. The entire framework was implemented in Python 3.9. First, data collection is performed, then the important features are selected from the dataset using the feature selection method, and the entire dataset is split into two parts: train and test. The train set is used to train the classifiers, and the test set is used for testing. Different classification algorithms are then compared to determine which one gives better results. A pictorial view of the proposed framework is given in Fig. 2.
In the proposed method (refer to Fig. 2), we used six classifiers implemented in Python 3.9: Logistic Regression, Support Vector Machine, k-NN, Random Forest, Naïve Bayes, and Decision Tree. The performance of these classifiers depends on the features of the dataset (i.e., the inputted data), so feature selection is a significant issue in this field. With the Extra Tree classifier used for feature selection, better results are obtained in most cases. In this study, we identify the best classifier among the six. Although the machine learning approach is already widely used in the healthcare domain, especially in disease prediction, using the Extra Tree classifier for feature selection can yield better results. The performance of the six classifiers is measured using Accuracy, Sensitivity, and F1-score.
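A compact scikit-learn sketch of this pipeline is given below (the classifier hyperparameters are library defaults, as the chapter does not report the exact settings used, so the printed numbers are indicative rather than a reproduction of Table 2):

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, recall_score, f1_score

# X_selected, y: features kept by the Extra Tree step and the diagnosis labels.
X_train, X_test, y_train, y_test = train_test_split(
    X_selected, y, test_size=0.2, random_state=42, stratify=y)   # 80%-20% split

classifiers = {
    "Logistic regression": LogisticRegression(max_iter=5000),
    "SVM": SVC(),
    "k-NN": KNeighborsClassifier(),
    "Random forest": RandomForestClassifier(),
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    print(f"{name}: accuracy={accuracy_score(y_test, pred):.4f}, "
          f"sensitivity={recall_score(y_test, pred):.2f}, F1={f1_score(y_test, pred):.2f}")
```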
4 Results and Discussion
The whole study was implemented on a machine with an Intel Core i5 10th-generation processor and 16 GB of RAM. We utilized the open-source Python machine learning packages NumPy, pandas, and scikit-learn, and the program was executed in Google Colab, an open-source online notebook environment. All the classifiers were tested in a train-test split (80%-20%) setting and compared on the basis of accuracy (%), sensitivity (recall), and F1-score. The confusion matrix is used to determine the classification models' efficacy. Table 2 shows the accuracy percentage, sensitivity, and F1-score of all six classifiers.
Confusion matrix: The confusion matrix is a chart that shows the difference between actual and predicted values. It is a table-like structure that assesses the performance of a machine learning classification model (refer to Fig. 3).
Fig. 2 Framework of the proposed model
Table 2 Six classifiers and their reported measured values
Classifiers           Accuracy (%)   Sensitivity   F1-score
Logistic regression   97.36          0.97          0.97
SVM                   97.36          0.97          0.97
k-NN                  96.18          0.96          0.96
Random forest         95.61          0.96          0.96
Naïve Bayes           96.49          0.96          0.96
Decision tree         93.85          0.94          0.94
TP (True Positive): positive samples that are correctly predicted as positive.
FP (False Positive): negative samples that are incorrectly predicted as positive, also known as Type I error.
FN (False Negative): positive samples that are incorrectly predicted as negative, also known as Type II error.
TN (True Negative): negative samples that are correctly predicted as negative.
Accuracy: the proportion of samples, across both classes (positive and negative), that are predicted correctly; higher values are better.
Fig. 3 Graphical representation of confusion matrix
Sensitivity or recall: recall measures how many of the actual positive samples are correctly predicted as positive.
F1-score: the F1-score is the harmonic mean of precision and recall; it provides a combined picture of these two measures and reaches its maximum when precision equals recall.
After comparing all six classifiers (Figs. 4, 5 and 6), it can be seen that the Logistic Regression and SVM-based classifiers provide the maximum accuracy (97.36%) compared to the other classifiers. So, Logistic Regression and SVM can be used to develop an efficient prediction model for breast cancer diagnosis. From the results obtained in Table 3, it can be concluded that the Extra Tree classifier can work as an important feature selection algorithm in this form of disease classification.
Fig. 4 Graphical representation of six classifiers based on classification accuracy (%)
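For reference, the three reported measures correspond to the standard confusion-matrix formulas (stated only verbally in the text):

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

Sensitivity (Recall) = \frac{TP}{TP + FN}

F1\text{-score} = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}, \quad where \ Precision = \frac{TP}{TP + FP}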
Fig. 5 Graphical representation of six classifiers based on the sensitivity
Fig. 6 Graphical representation of six classifiers based on F1-score
Table 3 Comparison of results before and after feature selection
Classifiers           Accuracy (%)      Sensitivity       F1-score
                      (before, after)   (before, after)   (before, after)
Logistic regression   96.87, 97.36      0.96, 0.97        0.96, 0.97
SVM                   96.16, 97.36      0.96, 0.97        0.96, 0.97
k-NN                  95.83, 96.18      0.96, 0.96        0.96, 0.96
Random forest         95.14, 95.61      0.95, 0.96        0.95, 0.96
Naïve Bayes           95.12, 96.49      0.95, 0.96        0.95, 0.96
Decision tree         92.21, 93.85      0.92, 0.94        0.92, 0.94
In this case, it appears that, most of the time, the classifiers show better results on the selected features (i.e., after applying the chosen feature selection algorithm).
5 Conclusion
Breast cancer is one of the leading causes of mortality in women compared to all other cancers. As a result, early identification of breast cancer is critical for decreasing death rates, and advanced machine learning algorithms can support this early detection of breast cancer cells. This study shows that the Logistic Regression and SVM-based classifiers achieve the highest accuracy of 97.36% compared to the other five classifiers. This work can be further extended by developing Logistic Regression-based ensemble classifiers, which may provide even higher accuracy using all prominent features. Constructing customized and computationally efficient classifiers for medical applications is a difficult problem in the machine learning and data mining fields. It is extremely difficult to identify the many medical conditions of a breast cancer patient using machine learning methods, and the prediction of such conditions is also crucial. One of the project's future goals is to see how these classification methods perform on large datasets. Furthermore, in the not-too-distant future, diagnosing a specific stage of breast cancer may become possible.
References 1. https://www.who.int/cancer/prevention/diagnosis-screening/breast-cancer/en/, Accessed 16 Sep 2022 2. Ferlay J et al. (2013) GLOBOCAN 2012 v1. 0, cancer incidence and mortality worldwide: IARC. Cancer Base 11 3. Bray F et al. (2013) Global estimates of cancer prevalence for 27 sites in the adult population in 2008. Int J Cancer 132(5):1133–1145 4. Shetty, Mahesh K (ed): Breast cancer screening and diagnosis: a synopsis. Springer 5. Asri, Hiba et al. (2016) Using machine learning algorithms for breast cancer risk prediction and diagnosis. Procedia Comput Sci 83:1064–1069 6. Amrane M et al. (2018) Breast cancer classification using machine learning. In: 2018 electric electronics, computer science, biomedical engineering’s meeting (EBBT), pp 1–4, IEEE 7. Darshini M et al (2019) Predicting factors for survival of breast cancer patients using machine learning techniques. BMC Med Inform Decis Making 19(1):48 8. Alzu’bi A et al. (2021) Predicting the recurrence of breast cancer using machine learning algorithms. Multimedia Tools Appl 80(9):13787–13800 9. Kar B et al. (2021) Breast DCE-mri segmentation for lesion detection using clustering with multi-verse optimization algorithm. In: Sharma TK, Ahn CW, Verma OP, Panigrahi BK (eds) Soft computing: theories and applications. Advances in Intelligent Systems and Computing, vol 1381. Springer, Singapore 10. Hazra A et al. (2016) Study and analysis of breast cancer cell detection using Naïve Bayes, SVM and ensemble algorithms. Int J Comput Appl 145(2):39–45 11. Omondiagbe DA et al. (2019) Machine learning classification techniques for breast cancer diagnosis. IOP conference series: materials science and engineering 495 (1), IOP Publishing
12. Alghodhaifi H et al. (2019) Predicting invasive ductal carcinoma in breast histology images using convolutional neural network. 2019 IEEE national aerospace and electronics conference (NAECON). IEEE 13. Mohammed SA et al. (2020) Analysis of breast cancer detection using different machine learning techniques. International conference on data mining and big data, pp 108–117, Springer, Singapore 14. Saoud H et al. (2018) Application of data mining classification algorithms for breast cancer diagnosis. In: Proceedings of the 3rd international conference on smart city applications, pp 1–7 15. Ojha U, and Goel S (2017) A study on prediction of breast cancer recurrence using data mining techniques. 2017 7th international conference on cloud computing, data science & engineering-confluence, pp 527–530, IEEE 16. Pritom AI et al. (2016) Predicting breast cancer recurrence using effective classification and feature selection technique. 2016 19th international conference on computer and information technology (ICCIT), pp 310–314 17. Silva J et al. (2019) Integration of data mining classification techniques and ensemble learning for predicting the type of breast cancer recurrence. International conference on green, pervasive, and cloud computing, pp 18–30, Springer, Cham 18. Kar B, Sarkar BK (2022) A hybrid feature reduction approach for medical decision support system. Mathemat Problems Eng 19. Baranidharan B et al (2019) cardiovascular disease prediction based on ensemble technique enhanced using extra tree classifier for feature selection. Int J Recent Technol Eng (IJRTE) 8(3):3236–3242 20. https://scikitlearn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html. Accessed 16 Sep 2022
Chapter 31
Covid-19 Detection Using Deep Learning and Machine Learning from X-ray Images–A Hybrid Approach
Afeefa Rafeeque and Rashid Ali
1 Introduction
In the past few years, Covid-19 has caused numerous deaths. The World Health Organization reports that Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is the cause of Covid-19, and SARS-CoV-2 threatens health and human life worldwide. Wuhan was the first city found with Covid-19 cases, in December 2019, and the virus subsequently spread to various countries of the world [1]. On January 30, 2020, WHO declared Covid-19 a public health emergency of international concern. In most cases, people infected with this virus face mild to moderate illness and can get better without special treatment; however, in some cases, an infected person can get seriously ill and will require proper treatment. As of August 2022, a total of 591,683,619 confirmed cases and 6,443,306 deaths due to Covid-19 had been reported by WHO, and this virus is still around us. One of the significant keys to stopping the spread of this life-threatening virus, which has caused millions of deaths worldwide, is to detect and diagnose Covid-19. RT-PCR is considered the optimal way to detect this virus [2]. However, the wide range of the outbreak, insufficient testing kits, areas with a high rate of occurrence, and countries with limited testing reagents for performing RT-PCR on millions of suspected patients make this a significant problem that requires an alternative approach. Previous studies indicate that imaging technologies, including chest X-ray and Computed Tomography (CT), have higher sensitivity for assessing Covid-19. Thus, a chest X-ray can be taken as a
substitute for RT-PCR [3]. We require an alternative that takes less time in screening for Covid-19, since testing is often repeated to ensure reliable results. X-ray imaging takes less time than CT imaging, which ultimately improves the speed of Covid-19 screening. In addition, X-ray devices are affordable and handy, and most institutions and clinics already have X-ray equipment. Considering all these factors, we selected chest X-ray images for this study. The same image may be interpreted differently by different doctors or experts, and the conclusions drawn might differ due to many factors; fatigue caused by workload could be one of them. Thus, there is a need to interpret the images faster and more accurately, and this is where a computer-aided diagnosis system comes into the picture. Over the last few years, Artificial Intelligence (AI) has made remarkable progress, mainly in the medical field, in detecting various diseases. Well-trained AI models can give precise and rapid diagnoses and can also assist doctors and experts. AI models can also make early predictions, from training data, of which patients infected with this virus are at higher risk. Machine learning in disease detection performs on par with healthcare professionals [4]. Xiaoxuan Liu et al. (2019) [4] found a pooled sensitivity of 87% for deep learning models versus 86.4% for healthcare professionals, and the same paper concluded that the pooled specificity was 90.50% for healthcare professionals and 92.50% for deep learning models. Although their interpretation is that machine learning performs on par with healthcare professionals, the percentages for deep learning models are slightly higher, which suggests that machine learning can even outperform healthcare professionals on such tasks. Machine learning has proven its role in the medical field and will remain a priority in this field in the coming years. Machine learning methods will play a key role in improving healthcare worldwide as more patient datasets and electronic records (E-records) become available (Jeremy Goecks et al. [5]). The key contributions of this study are as follows: 1. We have used a pre-trained version of AlexNet [6] as a feature extractor and three machine learning classifiers for classification. To the best of our knowledge, we are the first to use AlexNet as a feature extractor, and we also compare it with the Xception model implemented in [7] for feature extraction. 2. We have used three different evaluation parameters (FNR, FPR, and Accuracy) to evaluate the performance of the implemented models. The rest of the paper is organized as follows: the second section introduces related work, and the methodology is discussed in the third section. The fourth section is about the experimental process, and the fifth section discusses the experimental results. Finally, the last section summarizes the research.
2 Related Work
Covid-19 has opened new research areas for academicians, who widely use machine learning and deep learning methods to analyze various aspects of Covid-19. Ahammed et al. [8] achieved 94.03% accuracy with a CNN. The authors used chest X-ray images to train their model to classify each image into the pneumonia, normal, or COVID-19 class. The limitation of this paper is the dataset: they used only 285 images for their work, and such a small dataset is not suitable for training with deep learning. Wang et al. [9] proposed a system called COVID-Net while establishing a larger dataset, COVIDx, comprising 13,800 chest X-ray images, for classifying the images as normal, pneumonia, or COVID-19, with a diagnostic accuracy of 92.4% for COVID-19. Abbas et al. [10] achieved 93.1% accuracy and 100% sensitivity by introducing a deep CNN, DeTraC (Decompose, Transfer, and Compose), to detect COVID-19 from chest X-ray images, and they also validated it; they proposed a decomposition mechanism for checking irregularities in the dataset through class boundary investigation. El-Din Hemdan et al. [11] compared standard deep learning models and pre-trained models (trained on the ImageNet database [12]) to differentiate between healthy and COVID-19 patients. The authors chose a small dataset of 50 images, in which 24 were normal and 25 were COVID-19. Among the selected models, the authors found similar performance for VGG19 and DenseNet, with F1-scores of 0.89 for normal and 0.91 for COVID-19. Yoo et al. [13] introduced a decision-tree classifier based on deep learning to detect Covid-19 from chest X-ray images. Their classifier compared three binary decision trees in the PyTorch framework; it categorized the X-ray images as normal or abnormal and achieved 95% average accuracy with the third decision tree. Apostolopoulos et al. [14] performed experiments with transfer learning models and concluded that VGG-19 outperformed other CNN models in terms of accuracy. Minaee et al. [15] detected Covid-19 from chest X-ray images by proposing a deep learning-based framework. The authors used four fine-tuned models: ResNet50, ResNet18, DenseNet-121, and SqueezeNet. Their model attained a sensitivity of 98% and a specificity of 90% after increasing the number of samples through augmentation.
3 Method
3.1 A Transfer Learning Approach to Extract Features
A deep learning model needs a large dataset to be trained, and in the medical imaging field it is quite challenging to obtain such a large dataset. With only a small number of images, deep learning models cannot achieve good results [16, 17].
Table 1 Comparison between Xception and AlexNet in terms of the number of convolution layers and the number of floating point operations (FLOPs)
Model      Input size    No. of convolution layers   FLOPs
Xception   224×224×3     36                          8 billion [19]
AlexNet    224×224×3     5                           725 million [20]
3.2 Machine Learning Algorithm as a Classifier Machine learning algorithm needs human intervention to identify and hand-code the applied features depending on the type of data. In contrast, Deep learning, without human intervention, tries to learn those required features. Pin Wang et al. [17] presented an analysis of standard machine learning and deep learning for image processing. In their study, they found CNN with better results on the MNIST dataset (large dataset) whereas on using a small sample (COREL1000 dataset) SVM
Fig. 1 A system architecture to detect Covid-19
On a small sample (the COREL1000 dataset), however, SVM performs better than CNN (Convolutional Neural Network). Their experiment therefore shows that traditional machine learning handles small datasets better than deep learning. Since the dataset used in this research is small, we use a machine learning technique as the classifier, which is indeed the best choice in this case, and deep learning as the feature extractor, as it learns the required features without human intervention. Machine learning is all about generalization, and K-fold cross-validation is one way to assess it. It is one of the most widely used methods, in which the data is divided so that the dataset can be used effectively for building a model. The main aim is to obtain a model that performs well on unseen data. A model can reach 100% accuracy (zero error) on the training data and still fail on unseen data; such models are not suitable because they overfit the data, and a model's performance can only be measured on data points that were never used during training. For K-fold cross-validation, the dataset is first split into training and testing parts. The training dataset is then divided into K folds; k-1 folds are used for training, the one remaining fold is used for validation, and this whole process is repeated k times.
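A minimal illustration of this splitting scheme with scikit-learn's KFold (k = 10, matching the 10-fold validation used later; `features` and `labels` are assumed to be NumPy arrays):

```python
from sklearn.model_selection import KFold

kf = KFold(n_splits=10, shuffle=True, random_state=42)
for train_idx, val_idx in kf.split(features):
    # k-1 folds train the model, the held-out fold validates it,
    # and the loop repeats k times so that every fold is held out exactly once.
    X_tr, X_val = features[train_idx], features[val_idx]
    y_tr, y_val = labels[train_idx], labels[val_idx]
```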
4 Dataset and Experimental Setup
This section describes the experimental setup we followed to detect Covid-19.
4.1 Dataset
Currently, no Covid-19 dataset of suitable size is available. We therefore used the same number of images as Dingding Wang et al. [7], who also worked on a mixture of machine learning and deep learning. Since the dataset [21] they used has changed, we cannot ensure that the images we used are identical to theirs. The number of images used is 537 for Covid-19 and 565 for Normal. The split ratio followed in this study is the same as in [7]: 70% for training and 30% for testing.
4.2 Pre-processing
Images may vary in size and other properties because different devices may have been used to capture them, so some pre-processing steps are applied to make all the images comparable. (i) Re-scaling: first, we resize all the X-ray images to 224 × 224 pixels, as the images in the dataset may differ in acquisition parameters and pixel dimensions. (ii) Image normalization: the intensity values of all images are normalized to ensure that each input parameter (i.e., pixel) has a similar data distribution; one benefit of normalization is that the model's computation becomes faster.
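A minimal sketch of these two steps (the authors do not specify the exact normalization, so simple 0-1 intensity scaling is assumed here):

```python
import numpy as np
from PIL import Image

def preprocess(path):
    # (i) Re-scale: resize every chest X-ray image to 224 x 224 pixels.
    img = Image.open(path).convert("RGB").resize((224, 224))
    # (ii) Normalize: bring pixel intensities into a common [0, 1] range.
    return np.asarray(img, dtype=np.float32) / 255.0
```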
4.3 Features Extraction Using Deep Learning
In this study, we evaluate the performance of the AlexNet model. The pre-trained version of AlexNet is used to extract the features from the training dataset; to the best of our knowledge, AlexNet has not previously been used as a feature extractor. Xception has been used by Dingding Wang et al. [7] as a feature extractor, but since the images have changed considerably, we re-implemented their Xception model as a feature extractor for comparison on the same dataset used for AlexNet in this study. The following steps are followed for this task:
(i) The pre-trained version of the model is loaded from its library.
(ii) Next, the trainable parameters are frozen, so the model is used only to extract features.
(iii) Then the last classification layer of the model is removed, as only the features are needed.
(iv) Finally, the extracted features are stored by passing the training dataset through the pre-trained model.
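A minimal PyTorch sketch of steps (i)-(iv) is shown below; it assumes torchvision >= 0.13 (older versions use `pretrained=True` instead of the `weights` argument) and is an illustrative implementation, not the authors' code:

```python
import torch
import torch.nn as nn
from torchvision import models

# (i) Load the pre-trained AlexNet (ImageNet weights).
alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)

# (ii) Freeze the parameters so the network acts as a fixed feature extractor.
for p in alexnet.parameters():
    p.requires_grad = False

# (iii) Remove the last classification layer; the 4096-d penultimate output is kept.
alexnet.classifier = nn.Sequential(*list(alexnet.classifier.children())[:-1])
alexnet.eval()

# (iv) Pass the pre-processed training images through the network and store the features.
def extract_features(batch):                  # batch: float tensor of shape (N, 3, 224, 224)
    with torch.no_grad():
        return alexnet(batch).cpu().numpy()   # features of shape (N, 4096)
```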
5 Covid-19 Detection Using Machine Learning Classifier
Once the features are extracted, the next step is classification, so the model is evaluated with different machine learning classifiers (SVM, Random Forest, and Bagging). The extracted features obtained from the training dataset are given as input to the machine learning classifiers for training. All the machine learning classifiers are then generalized with 10-fold cross-validation. Once a model is generalized, it is tested on the testing dataset, which is a separate set from the training dataset.
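A sketch of this classification stage with scikit-learn (default hyperparameters are shown, as the chapter does not report the exact classifier settings):

```python
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

# train_feats / test_feats: AlexNet (or Xception) features; *_y: 0 = Normal, 1 = Covid-19.
classifiers = {
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(),
    "Bagging": BaggingClassifier(),
}

for name, clf in classifiers.items():
    cv_acc = cross_val_score(clf, train_feats, train_y, cv=10).mean()  # 10-fold CV
    clf.fit(train_feats, train_y)                                      # fit on full training set
    test_acc = clf.score(test_feats, test_y)                           # held-out test accuracy
    print(f"{name}: CV accuracy={cv_acc:.4f}, test accuracy={test_acc:.4f}")
```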
6 Performance Evaluation Metrics
This study uses the following three metrics to evaluate the implemented models; for these three metrics we use the True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) values:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}   (1)

False Positive Rate (FPR) = \frac{FP}{FP + TN}   (2)

False Negative Rate (FNR) = \frac{FN}{TP + FN}   (3)
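Equations (1)-(3) can be computed directly from the binary confusion matrix; a short sketch (y_true and y_pred denote the test labels and predictions, with Covid-19 as the positive class):

```python
from sklearn.metrics import confusion_matrix

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)   # Eq. (1)
fpr = fp / (fp + tn)                         # Eq. (2)
fnr = fn / (tp + fn)                         # Eq. (3)
```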
We used Google Colaboratory with an Nvidia Tesla K80 GPU to perform all experiments, as it provides the Keras and PyTorch libraries.
7 Results and Discussion
The three machine learning classifiers combined with AlexNet are generalized and validated on the training set with 10-fold cross-validation. The average of the accuracies achieved over the folds is shown in Table 2.
Table 2 Results of cross-validation of different deep learning models with different machine learning classifiers
Model                Validation accuracy (%)
AlexNet + RF         93.88
Xception + RF        94.15
AlexNet + Bagging    96.13
Xception + Bagging   94.65
AlexNet + SVM        96.38
Xception + SVM       94.53
From the table, it can be seen that AlexNet achieved higher validation accuracy with Bagging and SVM. For a comparative study, the Xception model of [7] was also implemented, and its results are shown in the same table. The two pre-trained models with the three machine learning classifiers were then tested on the test dataset. AlexNet performs well on the unseen (test) dataset: AlexNet with the Random Forest and SVM classifiers gives the highest accuracy of 99.33%. We observed that the SVM classifier performs best with both models (Xception and AlexNet), achieving the highest accuracy obtained in this study; with the Random Forest classifier, however, AlexNet performs better than Xception. A summary of these results is shown in Table 3. Although the maximum accuracy achieved is the same for both models with two ML classifiers, comparing the minimum accuracies shows that Xception drops to 98.65%, whereas AlexNet's lowest value is 98.99%, which is higher in this case. It can be seen in the confusion matrices (Fig. 3) that the maximum number of misclassifications by AlexNet is three (with Bagging), whereas for Xception it is four (with Random Forest). We also evaluated the models with the False Positive Rate (FPR) and False Negative Rate (FNR), shown in Table 3, because for Covid-19, which spreads from person to person, misclassification would lead to an increase in the number of cases. AlexNet performs well, achieving an FNR of 0.0 with all three machine learning classifiers; since the FNR obtained is 0.0, a graph for the False Negative Rate is not shown (Fig. 2 reports accuracy and FPR). Although the Xception model of [7] achieved similar accuracy, we observed that Xception, with its thirty-six convolution layers, took
Table 3 Results of comparison of the performance of two deep learning models with three machine learning classifiers using different evaluation parameters on the test dataset
Model                Testing accuracy (%)   FPR      FNR
AlexNet + RF         99.33                  0.0146   0.0
Xception + RF        98.65                  0.0292   0.0
AlexNet + Bagging    98.99                  0.0219   0.0
Xception + Bagging   99.33                  0.0146   0.0
AlexNet + SVM        99.33                  0.0146   0.0
Xception + SVM       99.33                  0.0146   0.0
Fig. 2 Graphical Representation of Accuracy and FPR
(Fig. 3 shows six 2 × 2 confusion matrices, one per model-classifier pair: a AlexNet + RF, b Xception + RF, c AlexNet + Bagging, d Xception + Bagging, e AlexNet + SVM, f Xception + SVM.)
Fig. 3 Confusion Matrix of AlexNet and Xception with three Machine Learning Classifiers
more time for feature extraction because of the complexity of its network architecture, which requires 8 billion floating point operations, whereas AlexNet provides comparable accuracy with only five convolution layers and extracts features much faster, since it requires only 725 million floating point operations; the complexity of AlexNet is about ten times lower than that of Xception [20] (Fig. 3).
8 Conclusion
This study combines the transfer learning approach with machine learning classifiers to detect Covid-19. Features are extracted by removing the last layer from the pre-trained version of a deep learning model, since deep learning models extract features
from images automatically. AlexNet is used as the feature extractor and, because machine learning classifiers have been shown to outperform deep learning on small datasets, they are combined with the transfer learning model for classification. We also made a comparison with the Xception model. Both models perform well on unseen data, while the best validation accuracy on the training dataset is obtained with AlexNet as the feature extractor and SVM as the classifier. The best accuracy in this study is 99.33%, with an FPR and FNR of 0.0146 and 0.0, respectively; both AlexNet and Xception reach these values with different machine learning classifiers. However, the lowest accuracy achieved by Xception is 98.65%, with an FPR and FNR of 0.0292 and 0.0, respectively, whereas AlexNet with Bagging leads by achieving 98.99% accuracy with an FPR and FNR of 0.0219 and 0.0, respectively. Overall, AlexNet, with the minimum number of layers and roughly 11× fewer FLOPs, performs better while providing comparable accuracy.
9 Limitations and Future Work
This study still has limitations because the dataset is small; although it shows good results, training on a larger dataset is needed in the future to improve robustness. This study addresses binary classification only; in the future, we plan to extend it to other settings such as three-way classification (Covid-19 vs. Viral vs. Normal, or Covid-19 vs. Bacterial vs. Normal) or four-way classification (Covid-19 vs. Viral vs. Bacterial vs. Normal).
References 1. Wang C, Horby PW, Hayden FG, Gao GF (2020) A novel coronavirus outbreak of global health concern. The lancet 395(10223):470–473 2. Huang P, Liu T, Huang L, Liu H, Lei M, Xu W, Liu B (2020) Use of chest CT in combination with negative RT-PCR assay for the 2019 novel coronavirus but high clinical suspicion. Radiology. 3. Ng MY, Lee EY, Yang J, Yang F, Li X, Wang H, Kuo MD (2020) Imaging profile of the COVID-19 infection: radiologic findings and literature review. Radiol: Cardiothorac Imaging 2(1). 4. Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 1251−1258. 5. Liu X, Faes L, Kale AU, Wagner SK, Fu DJ, Bruynseels A, Denniston AK (2019) A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health 1(6): e271-e297. 6. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, vol 25. 7. Mooney P (2020) Kaggle chest x-ray images (pneumonia) dataset 8. Ahammed K, Satu MS, Abedin MZ, Rahaman MA, Islam SMS (2020) Early detection of coronavirus cases using chest X-ray images employing machine learning and deep learning approaches. MedRxiv. 9. Wang L, Lin ZQ, Wong A (2020) Covid-net: a tailored deep convolutional neural network design for detection of covid-19 cases from chest x-ray images. Sci Rep 10(1):1–12
10. Abbas A, Abdelsamea MM, Gaber MM (2021) Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network. Appl Intell 51(2):854–864 11. Hemdan EED, Shouman MA, Karar ME (2020) Covidx-net: a framework of deep learning classifiers to diagnose covid-19 in x-ray images. arXiv preprint arXiv:2003.11055 12. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition. pp 248−255. 13. Yoo SH, Geng H, Chiu TL, Yu SK, Cho DC, Heo J, Lee H (2020) Deep learning-based decisiontree classifier for COVID-19 diagnosis from chest X-ray imaging. Frontiers in medicine, 7, 427. (2020) 14. Apostolopoulos ID, Mpesiana TA (2020) Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks. Phys Eng Sci Med 43(2):635– 640 15. Minaee S, Kafieh R, Sonka M, Yazdani S, Soufi GJ (2020) Deep-COVID: Predicting COVID-19 from chest X-ray images using deep transfer learning. Med Image Anal 65:101794 16. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 2818−2826. 17. Wang D, Mo J, Zhou G, Xu L, Liu Y (2020) An efficient mixture of deep and machine learning models for COVID-19 diagnosis in chest X-ray images. PloS one 15(11):e0242535 18. Goecks J, Jalili V, Heiser LM, Gray JW (2020) How machine learning will transform biomedicine. Cell 181(1):92–101 19. Ibrahem H, Salem ADA, Kang HS (2021) Real-time weakly supervised object detection using center-of-features localization. IEEE Access 9:38742–38756 20. Wang P, Fan E, Wang P (2021) Comparative analysis of image classification algorithms based on traditional machine learning and deep learning. Pattern Recognition Letters 141:61–67 21. Dong X, Huang J, Yang Y, Yan S (2017) More is less: a more complicated network with less inference complexity. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 5840−5848.
Chapter 32
Detecting COVID-19 in Inter-Patient Ultrasound Using EfficientNet
Amani Al Mutairi, Yakoub Bazi, and Mohamad Mahmoud Al Rahhal
1 Introduction
COVID-19 is an infectious disease caused by a type of coronavirus. The word "coronavirus" takes its name from the Latin word "corona," which means crown. According to the World Health Organization (WHO), the number of people infected with this virus has been increasing rapidly; 116 million cases and over 2.5 million fatalities had already been confirmed according to statistics made public on March 12, 2021 [1]. The most well-known symptoms and indicators of COVID-19 are a dry cough, fatigue, and fever; other less frequent signs and symptoms include discomfort, a stuffy nose, and loss of taste and smell. Elderly people and those with underlying health issues are more at risk of major complications. Numerous techniques for CT image analysis are reported in the COVID-19 literature [2–8]. For instance, Silva et al. [2] developed a voting-based strategy, in which the images from a given patient are classified using a voting method. The authors of [3] used differential evolution methods to construct a bidirectional classification approach. The authors of [4] suggested a contrastive learning approach for jointly learning on heterogeneous datasets. To improve detection accuracy, the authors of [5] proposed a multiscale feature fusion technique. A method for segmenting and
identifying the diseased portions of images from various sources was proposed by Zhou et al. in [6]. A deep-guided forest method based on adaptive feature selection is described in another study [7]. Additional techniques have also been developed for COVID-19 detection using X-ray imaging [9–14]. As an illustration, the authors of [9] developed a methodology to categorize images into three groups: pneumonia, COVID, and non-COVID. The authors of [10] studied several classifiers with a variety of pre-trained Convolutional Neural Network (CNN) architectures to categorize the retrieved characteristics, and found that the best results were obtained by using MobileNet as the pre-trained CNN in conjunction with an SVM classifier. The authors of [11] proposed a transfer learning model based on decomposition techniques to determine class borders. The authors of [12] created a capsule network with four convolutional layers and three capsule layers to address class imbalance concerns. The authors of [13] presented a Covid data-based network that combines segmentation and data augmentation to improve detection accuracy. Finally, the authors of [14] proposed pre-processing the images with a bilateral low-pass filter and a histogram equalization technique; a pseudo-color image made from the original and filtered images is then provided to a CNN model for categorization. Recently, ultrasound imaging has also been used for illness screening because of its accessibility, safety, and quick results. The authors of [15] proposed a spatial transformer network that concurrently predicts disease severity and offers weakly supervised localization of pathological artifacts, and they also provided a method based on uninorms for frame-score aggregation at the video level. The objective of this work is to employ deep learning models for identifying the impact of the coronavirus in ultrasound images. Because we have a very small amount of training data, we do not train new models from scratch; instead, we focus on fine-tuning a model that has already been pre-trained. More precisely, we employ a pre-trained model from the EfficientNet family of models [16]. In particular, we test the effectiveness of these models in a binary classification task where the goal is to determine whether a patient's image is Covid-positive (carries the virus) or Covid-negative. This article is organized as follows: in Sect. 2, we present a detailed description of the deep learning model used for this problem. Section 3 is dedicated to the description of the collected dataset. In Sects. 4 and 5, we present the experimental results attained using the suggested methodology and conclude.
2 Methodology
The recent trend in image classification problems is to train deep neural network architectures, also called Convolutional Neural Networks (CNNs or ConvNets), to
predict the labels of test images. A CNN is made up of a series of convolutional layers with trainable biases and weights; convolution is applied to each layer's input, optionally followed by a nonlinear operation. CNNs have been successfully employed in a number of research disciplines, including medical image analysis, following the success of AlexNet [17] on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Training deeper architectures, that is, network designs with many convolutional layers arranged in different ways, such as GoogLeNet [18], ResNet [19], and, more recently, GPipe [20], has considerably enhanced the performance of ConvNets. These architectures grow ConvNets by increasing the depth, the width (the number of channels within each layer), or the input image resolution. EfficientNet [16] is the only model that currently scales in all three dimensions in a principled way. For COVID-19 diagnosis, a variety of techniques are available, including deep learning methodologies, optimization strategies, and others [21–25]. The authors of EfficientNet observed that ConvNets perform better when they are scaled in one of the three dimensions (depth, width, and image resolution), but that the benefit quickly reaches saturation as the network grows larger. They presented a compound scaling strategy to get around this problem, which uniformly scales network depth, width, and resolution using fixed scaling factors. Additionally, they created a mobile-size baseline network called EfficientNet-B0 to demonstrate the importance of balancing a network in all dimensions (Fig. 1). Then, using this baseline network as a starting point, they applied the suggested scaling strategy to produce eight different EfficientNet model versions. The suggested models greatly outperform other ConvNet designs on the ImageNet classification problem while using fewer parameters and processing data at a faster rate during inference, an important characteristic for real-time applications like the one considered in this work. The networks' learned features are also transferable and produce outstanding results on a variety of datasets. The mobile inverted bottleneck layer, which is composed of an inverted residual block combined with squeeze-and-excitation (SE) blocks, serves as the primary building block of the baseline network (EfficientNet-B0) (Fig. 2). An inverted residual block first projects the input feature map onto a higher-dimensional space and applies depth-wise convolution in that space; point-wise convolution (1 × 1 convolution) with linear activation is then used to project the resulting feature map back into a low-dimensional space, and the output feature map is created by adding a residual connection from the input of the point-wise convolution to its output. SE blocks, on the other hand, learn to adaptively weight the channels of an input feature map: the input is first turned into a feature vector with a size equal to the number of channels (c) and fed to a two-layer neural network, which generates a vector of size c as output; this vector is used to scale each channel according to its significance. The baseline network itself is built using a multi-objective neural architecture search that considers both accuracy and real-world latency on mobile devices. The authors then use the compound scaling method to create seven further models (EfficientNet-B1 to B7) from this baseline network.
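Concretely, the compound scaling rule reported in the EfficientNet paper [16] ties all three dimensions to a single compound coefficient \phi:

depth: d = \alpha^{\phi}, \quad width: w = \beta^{\phi}, \quad resolution: r = \gamma^{\phi}, \quad subject\ to\ \alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2,\ \alpha, \beta, \gamma \ge 1

where a small grid search on the B0 baseline gives \alpha = 1.2, \beta = 1.1, \gamma = 1.15, and increasing \phi then yields the B1 to B7 variants.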
376
A. Al Mutairi et al.
Fig 1 Architecture of the EfficientNet-B0 network
Fig. 2 A mobile inverted bottleneck layer that is a part of the family of EfficientNet models. (FC−Fully connected layer; BN−Batch normalization layer)
3 Results
3.1 Dataset Description
We use the POCOVID-Net ultrasound dataset from [17] in our experiments. This dataset includes 59 images and 202 lung ultrasound videos from 216 different people. Samples from COVID-19 patients, patients with bacterial pneumonia, patients with viral pneumonia, and healthy controls are included in this dataset, as shown in Table 1. Figure 3 displays several positive and negative frames extracted with convex and linear probes from different recordings.
Table 1 Video and image numbers for each class
                      Convex               Linear               Total
                      #Video    #Image     #Video    #Image
COVID-19              64        18         6         4          92
Bacterial pneumonia   49        20         2         2          73
Viral pneumonia       3         –          3         –          6
Normal                66        15         9         –          90
Total                 182       53         20        6          261
Fig. 3 Examples of ultrasound images derived from various records: a Convex sensor, b Linear Sensor
3.2 Results
We fine-tune EfficientNet-B2 so that it can recognize COVID-19 from ultrasound images. Specifically, we fine-tune only the final 100 layers of this network and freeze the remaining layers. We use the ADAM optimizer (a stochastic gradient descent method), with 20 training epochs, a mini-batch size of 20, and a learning rate of 0.0001.
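A minimal Keras sketch of this fine-tuning setup is given below; the classification head and the three-class output (normal, pneumonia, COVID-19, following Table 3) are assumptions, as the authors do not describe the head explicitly:

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import EfficientNetB2

NUM_CLASSES = 3  # normal, pneumonia, COVID-19 (per Table 3)

# EfficientNet-B2 pre-trained on ImageNet, without its original classification head.
base = EfficientNetB2(include_top=False, weights="imagenet",
                      input_shape=(260, 260, 3), pooling="avg")

# Freeze everything except the final 100 layers, which are fine-tuned.
for layer in base.layers[:-100]:
    layer.trainable = False

model = models.Sequential([base, layers.Dense(NUM_CLASSES, activation="softmax")])

model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),   # ADAM, lr = 0.0001
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(train_frames, train_labels, epochs=20, batch_size=20)
```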
Table 2 Number of images in train and test in the five trials
           Train   Test
Trial 1    2312    613
Trial 2    2388    537
Trial 3    2415    510
Trial 4    2345    580
Trial 5    2206    719
Table 3 Accuracy per class (in %) of the five trials
                 Normal          Pneumonia      COVID-19       Overall accuracy
Trial 1          100             99.20          100            99.83
Trial 2          100             100            100            100
Trial 3          100             100            100            100
Trial 4          100             100            100            100
Trial 5          61.44           87.30          99.41          84.41
Average ± std.   99.28 ± 19.27   97.30 ± 6.22   99.88 ± 0.29   96.79 ± 7.07
Table 2 presents the number of images in train and test in the five trials. The proposed architecture's overall accuracy is 100% in three of the five trials and 99.83% in the first. The fifth trial, which had an overall accuracy of 84.41%, was the sole exception. It is worth noting that the system's COVID-19 detection accuracy is outstanding, at 99.88% on average and 100% in four of the five trials. Table 3 displays the accuracy of the five trials for each class (in %).
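A minimal sketch of the fine-tuning setup described in Sect. 3.2, using the torchvision implementation of EfficientNet-B2; the three-class head, the data loader, and the approximation of "the final 100 layers" by the last 100 parameter tensors are our assumptions, not details taken from the paper.

import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 3  # e.g. COVID-19, pneumonia, normal (illustrative labeling)

model = models.efficientnet_b2(weights=models.EfficientNet_B2_Weights.IMAGENET1K_V1)
in_feats = model.classifier[1].in_features
model.classifier[1] = nn.Linear(in_feats, NUM_CLASSES)   # replace the 1000-class head

# Freeze everything, then unfreeze a trailing portion of the network as a
# rough stand-in for "fine-tune only the final 100 layers".
params = list(model.parameters())
for p in params:
    p.requires_grad = False
for p in params[-100:]:
    p.requires_grad = True

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train(model, loader, epochs=20):
    # loader is assumed to yield (images, labels) batches of size 20
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()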
4 Conclusion

A deep learning technique for detecting COVID-19 patients from ultrasound images is proposed in this paper. More precisely, we fine-tuned the EfficientNet-B2 model, a member of the well-known family of EfficientNet models, using transfer learning. The experimental findings show that EfficientNet-B2 achieves acceptable performance even when trained from scratch. For future work, we suggest using more ultrasound samples with other deep learning models.
References 1. WHO Director-General’s opening remarks at the media briefing on COVID-19−10 April 2020.” https://www.who.int/dg/speeches/detail/who-director-general-s-opening-rem arks-at-the-media-briefing-on-covid-19---10-april-2020. Accessed 10 Apr 2020. 2. Silva P et al (2020) COVID-19 detection in CT images with deep learning: a voting-based scheme and cross-datasets analysis. Inform Med Unlocked 20:100427. https://doi.org/10.1016/ j.imu.2020.100427 3. Pathak Y, Shukla PK, Arya KV (2020) Deep bidirectional classification model for COVID-19 disease infected patient. IEEE/ACM Trans Comput Biol Bioinform: 1–1. https://doi.org/10. 1109/TCBB.2020.3009859. 4. Wang Z, Liu Q, Dou Q (2020) Contrastive cross-site learning with redesigned net for COVID-19 CT classification. IEEE J Biomed Health Inform 24(10):2806–2813. https://doi.org/10.1109/ JBHI.2020.3023246 5. Rahhal MMA, Bazi Y, Jomaa RM, Zuair M, Ajlan NA (2021) Deep learning approach for COVID-19 detection in computed tomography images. Comput, Mater & Contin 67(2). https:// doi.org/10.32604/cmc.2021.014956. 6. Zhou L et al (2020) A rapid, accurate and machine-agnostic segmentation and quantification method for CT-based COVID-19 diagnosis. IEEE Trans Med Imaging 39(8):2638–2652. https://doi.org/10.1109/TMI.2020.3001810 7. Sun L et al (2020) Adaptive feature selection guided deep forest for COVID-19 classification with chest CT. IEEE J Biomed Health Inform 24(10):2798–2805. https://doi.org/10.1109/JBHI. 2020.3019505 8. Al Rahhal MM et al (2022) COVID-19 detection in CT/X-ray imagery using vision transformers. J Pers Med 12(2). Art. no. 2. https://doi.org/10.3390/jpm12020310. 9. Arias- JD, Gómez-García JA, Moro-Velázquez L, Godino-Llorente JI (2020) Artificial intelligence applied to chest X-ray images for the automatic detection of COVID-19. A thoughtful evaluation approach. IEEE Access 8:226811–226827. https://doi.org/10.1109/ACCESS.2020. 3044858 10. Ohata EF et al (2021) Automatic detection of COVID-19 infection using chest X-ray images through transfer learning. IEEE/CAA J Autom Sin 8(1):239–248. https://doi.org/10.1109/JAS. 2020.1003393 11. Abbas A, Abdelsamea MM, Gaber MM (2021) Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network. Appl Intell 51(2):854–864. https:// doi.org/10.1007/s10489-020-01829-7 12. Afshar P, Heidarian S, Naderkhani F, Oikonomou A, Plataniotis KN, Mohammadi A (2020) COVID-CAPS: a capsule network-based framework for identification of COVID-19 cases from X-ray images. Pattern Recognit Lett 138:638–643. https://doi.org/10.1016/j.patrec.2020. 09.010 13. Tabik S et al (2020) COVIDGR dataset and COVID-SDNet methodology for predicting COVID-19 based on chest X-ray images. IEEE J Biomed Health Inform 24(12):3595–3605. https://doi.org/10.1109/JBHI.2020.3037127 14. Heidari M, Mirniaharikandehei S, Khuzani AZ, Danala G, Qiu Y, Zheng B (2020) Improving the performance of CNN to predict the likelihood of COVID-19 using chest X-ray images with preprocessing algorithms. Int J Med Inform 144:104284. https://doi.org/10.1016/j.ijmedinf. 2020.104284 15. Roy S et al (2020) Deep Learning for Classification and Localization of COVID-19 Markers in Point-of-Care Lung Ultrasound. IEEE Transactions on Medical Imaging 39(8):2676–2687. https://doi.org/10.1109/TMI.2020.2994459 16. Tan M, Le QV (2019) EfficientNet: rethinking model scaling for convolutional neural networks. In: Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA. 
vol 97, pp 6105–6114. [Online]. http://proceedings. mlr.press/v97/tan19a.html
17. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012, Lake Tahoe, Nevada, United States. pp 1106–1114. [Online]. http://pap ers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks 18. Szegedy C et al (2015) Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA, June 7-12, 2015. pp 1–9. https://doi.org/10.1109/CVPR.2015.7298594. 19. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016. pp 770–778. https://doi.org/10.1109/CVPR.2016.90. 20. Huang Y et al (2019) GPipe: efficient training of giant neural networks using pipeline parallelism. In: Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8-14 December 2019, Vancouver, BC, Canada. pp 103–112. [Online]. http://papers.nips.cc/paper/8305-gpipe-efficient-trainingof-giant-neural-networks-using-pipeline-parallelism. 21. Mohammed MA, Al-Khateeb B, Yousif M, Mostafa SA, Kadry S, Abdulkareem KH, GarciaZapirain B (2022) Novel crow swarm optimization algorithm and selection approach for optimal deep learning COVID-19 diagnostic model. Comput Intell Neurosci. 22. Saeed M, Ahsan M, Saeed MH, Rahman AU, Mehmood A, Mohammed MA, Jaber MM, Damaševiˇcius R (2022) An optimized decision support model for COVID-19 diagnostics based on complex fuzzy hypersoft mapping. Mathematics 10(14):2472 23. Hameed Abdulkareem K, Awad Mutlag A, Musa Dinar A, Frnda J, Abed Mohammed M, Hasan Zayr F, Lakhan A, Kadry S, Ali Khattak H, Nedoma J (2022) Smart healthcare system for severity prediction and critical tasks management of COVID-19 patients in IoT-fog computing environments. Comput Intell Neurosci. 24. Dinar AM, Raheem EA, Abdulkareem KH, Mohammed MA, Oleiwie MG, Zayr FH, Al-Boridi O, Al-Mhiqani MN, Al-Andoli MN (2022) Towards automated multiclass severity prediction approach for COVID-19 infections based on combinations of clinical data. Mob Inf Syst. 25. Nagi AT, Awan MJ, Mohammed MA, Mahmoud A, Majumdar A, Thinnukool O (2022) Performance analysis for COVID-19 diagnosis using custom and state-of-the-art deep learning models. Appl Sci 12(13):6364
Chapter 33
Expert System for Medical Diagnosis and Consultancy Using Prediction Algorithms of Machine Learning L. M. R. J. Lobo and Dussa Lavanya Markandeya
1 Introduction

Today's healthcare industries produce and collect large amounts of data daily. The proposed system is designed to save patients' time by addressing their needs. Disease remains the primary cause of human death. Raw data from the healthcare industry must be gathered and stored in an organized format before it can be used for the early detection of human illness. By including symptoms connected to patients' conditions and behaviors, which is achieved through data analysis, the suggested system is able to forecast disease. In addition to disease prediction, the system suggests the best doctors for a given disease. The datasets from the list of doctors are used to check for symptoms and make disease predictions, and the system also provides information on institutes or hospitals that offer free treatment for the disease. With this application, patients can obtain an initial indication of which disease they may have without first visiting a doctor. Figure 1 shows how high-level knowledge is found in low-level data in databases, a process called knowledge discovery (KDD). It is an iterative process that includes phases such as selecting the data, pre-processing the chosen data, transforming the data into the proper form, data mining to extract the required information, and interpreting/evaluating the data. The selection stage gathers heterogeneous data from several sources for processing. Real-world medical data may be sparse, convoluted, noisy, inconsistent, and/or irrelevant, necessitating a selection process that captures the pertinent information from which knowledge is to be extrapolated.
Fig. 1 KDD process
The pre-processing step carries out the fundamental activities of removing noisy data, locating missing data or devising a strategy for handling them, detecting or removing outliers, and resolving inconsistencies in the data. In the transformation process, tasks including aggregation, smoothing, normalization, generalization, and discretization are carried out to turn the data into forms that are suitable for mining. Data reduction tasks shrink the data while still producing comparable analytical results. One of the primary steps in the KDD process is data mining: the process of selecting one or more data mining algorithms and applying them to extract new, potentially useful information from the data in the database. This involves selecting the models, algorithms, and parameters that might be appropriate and reconciling a particular data mining technique with the overall requirements of the KDD process. A clustering algorithm is used for grouping similar symptoms together. Clustering is the process of grouping a set of data objects into meaningful clusters so that the objects within one cluster are similar to one another and distinct from the objects in other clusters. The prediction system analyzes the symptoms provided by the patient and forecasts the illness. Predictive analytics algorithms are used to discover knowledge and find the best solution; predictive analytics is the practice of obtaining data from numerous datasets in order to make predictions and estimates about future outcomes. These data mining algorithms can be used to implement the proposed system, and the specific algorithm can be varied as needed in the future.
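As a hedged illustration of the selection, pre-processing, and transformation steps described above, the following pandas sketch shows how a raw patient table might be cleaned and discretized before mining; the file name and column names are hypothetical placeholders, not the system's actual schema.

import pandas as pd

# Hypothetical raw patient table; column names are illustrative only.
raw = pd.read_csv("patients.csv")                 # e.g. columns: age, symptoms, disease

# Selection / pre-processing: drop duplicates, handle missing values, remove outliers.
data = raw.drop_duplicates()
data = data.dropna(subset=["disease"])            # unlabeled records are unusable
data["age"] = data["age"].fillna(data["age"].median())
data = data[data["age"].between(0, 120)]          # crude outlier filter

# Transformation: discretize age and normalize symptom text before mining.
data["age_group"] = pd.cut(data["age"], bins=[0, 18, 40, 65, 120],
                           labels=["child", "young", "adult", "senior"])
data["symptoms"] = data["symptoms"].str.lower().str.split(",")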
2 Related Work

Mohapatra et al. [1] showed how hidden information can be extracted from datasets. The end-user is supported by a smart healthcare management system, which also enables online user guidance for health difficulties. The need for well-organized approaches for evaluating, predicting, and detecting diseases makes this research in the area of medical sciences highly relevant. Data mining applications are used to manage smart health care. Shailaja et al. [2] showed that machine learning is a modern and sophisticated technology that has become a major trend in industry. Machine learning is ubiquitous and is widely used in various applications. It is significant in a variety of fields, including security, medicine, and finance, and it provides an excellent ability to find patterns in medical data sources and to diagnose diseases. Costa et al. [3] observed that existing data mining methods and techniques are often used only to predict future values for datasets, without in-depth analysis of the output of such techniques, leaving out information that could be more useful. Their paper presents an overview of available representation and visualization techniques and can be used to highlight the possibilities related to data mining techniques, especially more complex ones such as artificial neural networks and support vector machines, with the aim of clarifying and raising awareness of the underlying data mining techniques. Zeng et al. [4] remark that Inductive Matrix Completion (IMC) is one of the most dependable models for predicting gene–disease associations, owing to its well-grounded formulation and strong performance; their experimental results show that deep collaborative filtering (DCF) remains satisfactory for ranking novel disease phenotypes as well as for mining unexplored associations. Khaled et al. [5] in their research work evaluated three blood cancer classifiers, k-nearest neighbor (k-NN), decision tree (DT), and support vector machine (SVM). In the healthcare domain, leukemia affects blood status and can be detected using the Complete Blood Count (CBC). Their study aims to predict the presence of leukemia by determining the relationships between blood properties and leukemia together with the gender, age, and health status of patients using data mining techniques. Gandhi et al. [6] focused on data mining classification techniques used for knowledge discovery. The purpose of this effort was to employ data mining techniques to investigate the various aspects of using healthcare data to assist people. Dewan and Sharma [7] used a BP neural network, data mining, a genetic algorithm, and heart disease prediction; their paper proposed an efficient genetic algorithm hybridized with the back propagation technique for heart disease prediction. Sidiq and Aaqib [8] describe data mining as the process of extracting information or finding usable data from complex databases. There is a huge amount of heterogeneous data in the medical field which is converted into useful information using data mining techniques, and this information is then used by doctors to diagnose various diseases with satisfactory accuracy. In their work, the authors focus on diagnosing thyroid disease using the neural network, vote ensemble, and stacking
ensemble methods, where the classifiers are compared in terms of accuracy, precision, recall, and error rate. Experimental results show that the stacking ensemble method has higher accuracy than any other method and also predicts more accurately than the relevant existing work. Kohli and Arora [9] discuss the growing use of machine learning in the field of medical diagnosis. This can largely be attributed to advancements in illness classification and recognition systems, which can give medical professionals information to aid in the early detection of life-threatening diseases and thus considerably raise patient survival rates. In their work, they use various classification algorithms, each of which has a benefit on one of the three distinct disease databases (heart, breast cancer, and diabetes) available in the UCI repository for disease prognosis. Backward modeling using the P-value test was used to determine the features of each dataset. The study's findings support the concept of using machine learning for the early detection of disease. Bharat et al. [10] show how machine learning is commonly employed in medical applications to identify different types of cancer cells. Breast cancer is a main cause of death each year; it is the most prevalent form of cancer and the leading global cause of mortality for women. There are two types of cancer cells: benign (B) and malignant (M). Some of the algorithms used to categorize and predict breast cancer include Support Vector Machine (SVM), Decision Tree (CART), Naive Bayes (NB), and K-Nearest Neighbor (KNN). SVM is applied to the Wisconsin Breast Cancer Dataset in this study. The dataset is used to train KNN, Naive Bayes, and CART, and the prediction accuracy of each algorithm is compared. Alanazi et al. [11] address diabetes, a chronic disease whose risk is rising so quickly that it is damaging human health. Their approach combines two machine learning algorithms, random forests and support vector machines, to predict diabetes using data obtained from Security Force Primary Health Care. The proposed model's ROC was 99% and its accuracy was 98%. The results show that the random forest method performs better in terms of accuracy than the support vector machine. Sarwar et al. [12] in their paper discuss the many machine learning techniques that are used for predictive analysis of big data from different fields. Prediction in health care is a challenging task but can ultimately help practitioners make data-informed decisions about a patient's health and treatment in a timely manner. Six distinct machine learning algorithms are employed in this research to examine predictive analysis in health care. For the experiment, a dataset of patients' medical records is obtained and the six machine learning algorithms are applied to it. The efficiency and accuracy of the applied algorithms are discussed and compared. Sangari and Qu [13] comment that in recent years, machine learning algorithms have become more and more widely used in the healthcare industry, especially in research areas involving human participation where clinical trials and data collection are very expensive.
This research project has conducted a comparative study on three well-known machine learning methods: Logistic Regression (LR), Support Vector Machine (SVM), and Naïve Bayes (NB) on the same dataset to predict breast cancer to improve clinical trials. The experiment results provide a comprehensive view
of patient risk levels and risk factors that lead to benefits in effective and efficient treatment. This research has also demonstrated that there may be differences in both performance and accuracy in machine learning algorithms versus similar datasets for breast cancer prognosis. Neelaveni and Devasana [14] showed how Alzheimer’s disease is one of the neurodegenerative disorders. Even if the symptoms start out mild, they eventually get worse. A form of dementia is Alzheimer’s disease. Because there is no known treatment for this illness, it presents a problem. However, the diagnosis of the illness comes much later. Therefore, if the disease has previously been predicted, its advancement or symptoms may be delayed. In this study, machine learning algorithms are used to forecast the onset of Alzheimer’s disease utilizing psychological indicators such as age, the number of visits, the MMSE, and educational level. Shouman and Turner [15] show that the leading cause of death worldwide during the preceding ten years was heart disease. Several data mining techniques have been developed by researchers to aid medical practitioners in the identification of cardiac disease. One data mining method that has had a lot of success in the diagnosis of heart disease is naive Bayes. One of the most widely used clustering methods is K-means clustering, however, its outcomes are significantly influenced by the initial centroid selection. This work illustrates how k-means clustering, an unsupervised learning technique, can be used to enhance naive Bayes, a supervised learning technique. It looks into how to diagnose people with heart disease by combining K-means clustering and Naive Bayes. Additionally, it looks into range, inlier, outlier, random attribute values, and random row approaches for the first centroid selection of the K-means clustering in the diagnosis of heart disease patients. The findings demonstrate that the accuracy of naive Bayes in diagnosing heart disease patients may be improved by combining k-means clustering with naive Bayes and using a modified initial centroid selection. Additionally, it showed that the two clusters random row initial centroid selection approach could diagnose heart disease patients with 84.5% accuracy, outperforming other first centroid selection methods.
3 Methodology

Figure 2 shows how the proposed system operates. To help prevent disease, we have proposed an efficient healthcare system in which symptoms and diseases can be added in a structured format; in many existing healthcare systems, the databases are organized in a disorganized manner. Symptoms can be added according to the patient's condition and are stored in the symptoms database. The system then applies a suitable data mining technique to predict the disease based on these symptoms. After this technique is applied, the sorted and validated data can be arranged in a proper manner, whereas the data is initially saved in the database in an unstructured, noisy state as big data. In the proposed work, symptoms are entered according to the patient's condition or behavior. To do this, we used the disease dataset to filter out noisy data and missing values. The entered symptoms are stored in the symptoms database.
Fig. 2 Working of proposed system
These symptoms are then mapped against the disease database. Based on this symptom mapping, the system assesses the patient's current condition and displays the predicted disease with its associated symptoms. A number of symptoms can be entered, and the system narrows down the disease as the number of symptoms increases. After this, the proposed system also selects a suitable doctor from the doctor database for the specific disease and shows the relevant doctor.
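Purely as an illustration of this symptom-mapping step, the sketch below ranks candidate diseases by symptom overlap; the disease–symptom associations shown are hypothetical placeholders, and in the actual system they would come from the disease database.

# Hypothetical symptom-to-disease mapping used only to illustrate the matching step.
DISEASE_SYMPTOMS = {
    "Arrhythmia": {"palpitation", "dizziness"},
    "Atrial Fibrillation": {"palpitation", "fatigue", "shortness of breath"},
    "Pulmonary Embolism": {"palpitation", "chest pain", "shortness of breath"},
}

def candidate_diseases(patient_symptoms):
    """Rank diseases by how many of the patient's symptoms they share."""
    s = set(patient_symptoms)
    scores = {d: len(s & syms) for d, syms in DISEASE_SYMPTOMS.items()}
    return sorted((d for d, n in scores.items() if n > 0),
                  key=lambda d: scores[d], reverse=True)

print(candidate_diseases(["palpitation", "shortness of breath"]))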
3.1 Algorithm

The K-means clustering algorithm is utilized for processing. The centroids are calculated, and the process is repeated until the optimal centroids are found. The algorithm presumes that the number of clusters is known in advance; K in K-means refers to the number of clusters the algorithm identifies in the data. K-means is one of the earliest and most popular clustering methods. The name "K-means" reflects the fact that each of the K clusters is represented by the mean of
the points within that cluster; this point is called the centroid. The K-means algorithm works as follows:
INPUT: D, the database with n data objects, and K, the desired number of clusters.
OUTPUT: A set of K clusters.
STEPS:
(1) Choose K data points at random from dataset D to serve as the initial cluster centers.
(2) Repeat steps 3 and 4.
(3) Calculate the distance between each data object and each of the K cluster centers and assign each object to the closest cluster.
(4) Recalculate the cluster center of each cluster.
(5) Stop when the cluster centers no longer change.
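A compact NumPy sketch of these steps follows; the random seed, the maximum iteration cap, and the illustrative random input data are our additions and are not part of the algorithm as stated above.

import numpy as np

def kmeans(data, k, max_iter=100, seed=0):
    """K-means following the listed steps: random initial centers, assign each
    point to its nearest center, recompute centers, stop when they no longer move."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]        # step 1
    for _ in range(max_iter):                                           # step 2
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)                                   # step 3
        new_centers = np.array([data[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])     # step 4
        if np.allclose(new_centers, centers):                           # step 5
            break
        centers = new_centers
    return labels, centers

# usage on encoded symptom vectors (illustrative random data):
labels, centers = kmeans(np.random.rand(200, 5), k=3)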
4 Experimental Setup and Results

Figure 3 shows the login and registration page. If a user wants to use the developed system, the user first needs to create an account. A doctor also needs to register; after registration, both the user and the doctor can log in to the page with their username and password. After logging in successfully, patients are able to use the developed system. Figure 4 shows the "Add symptoms" page, where patients add symptoms such as palpitation; the page then displays the diseases related to palpitation, such as Arrhythmia, Atrial Flutter, Pulmonary Edema, Atrial Fibrillation, Catecholaminergic Polymorphic Ventricular Tachycardia, and Pulmonary Embolism. Further symptoms are added separated by commas to refine the displayed diseases. Additional
Fig. 3 Login and registration page
symptoms may change the predicted disease. The system displays the diseases that are related to the symptoms added by the patient, and after clicking on a particular disease, it displays the doctors that can treat it. Figure 5 shows the disease name; clicking on the icon of that disease shows the names of the doctors related to it. Figure 6 shows the appointment page where the patient can book an appointment with a doctor. After knowing the doctor's details, the patient can get better treatment for that disease. Figure 7 shows the doctor's page, where a doctor can view their profile and the appointment list of patients with full details. From here, doctors can approve or reject an appointment for a given date.
Fig. 4 Add symptoms
Fig. 5 Symptoms of atrial fibrillation disease
Fig. 6 Appointment page
Fig. 7 Doctors page
Figure 8 shows an analysis of the achieved results for various algorithms, namely Naive Bayes, Decision Tree, and K-means. Our results compare favorably with what others have achieved: 81% accuracy for Naive Bayes and 78% accuracy for Decision Tree [15], against 90% accuracy for the K-means algorithm. In other literature, different techniques have been applied to specific diseases: pattern extraction and trend detection with neural networks for conventional pathology data; artificial neural network models for chest disease; and Bayesian classification for liver disease.
Fig. 8 Graph showing accuracy of prediction of disease using different data mining algorithms
Decision tree algorithms such as ID3 have been used to build prediction models for coronary heart disease; genetic-algorithm-based classification of medical data has been applied to diabetes; Apriori-based disease prediction has been used for chronic disease; ensemble approaches have been used to distinguish disease subtypes in lymphoma and lung cancer; and improved Naive Bayes classification has been applied to coronary heart disease. We used the K-means clustering algorithm specifically because it predicts the disease very specifically from the symptoms provided as input to the system by the user/patient. In contrast to the other papers, our system predicts any of the diseases that we have added to the dataset, and our dataset is not fixed, since data for a particular disease can be added or deleted.
5 Conclusion

The proposed system analyzes the symptoms provided by the patients, predicts the diseases, suggests a doctor related to each disease, and also advises people to take precautions, saving time and effort. Clustering and prediction algorithms are used to predict disease and to suggest doctors. The system also helps patients understand their health condition: after adding symptoms to our website, patients learn which disease they are likely suffering from without first needing a doctor, and the system also shows the names of doctors related to that disease. If patients are unable to go to the hospital, they can use the developed system to identify their disease and perform their own preliminary check-ups at home. We have achieved an accuracy of 90% using the K-means algorithm, which is better than the other algorithms we evaluated and better than the results reported in the literature we surveyed.
Acknowledgements I would like to say a “Thank You” to my Institution for all the support they gave me during my work and documentation.
References 1. Mohapatra S, Patra P, Mohanty S, Pati B (2018) Smart health care system using data mining. In: International Conference on Information Technology (ICIT). pp 44−49 2. Shailaja K, Seetharamulu B, Jabbar M (2018) Machine learning in healthcare: a review. In: Second International Conference on Electronics, Communication and Aerospace Technology (ICECA). pp 910−914. 3. Costa D, Portela F, Santos M (2018) An overview of data mining representation techniques. In: 7th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW). pp 90−95 4. Zeng X, Lin Y, He Y, Lü L, Min X, Rodríguez-Patón A (2020) Deep collaborative filtering for prediction of disease genes. IEEE/ACM Trans Comput Biol Bioinformatic 17(5): 1639−1647. 5. Khaled A, Daqqa A, Sarraj W (2017) Prediction and diagnosis of leukemia using classification algorithms. In: 8th International Conference on Information Technology (ICIT). pp 638−643. 6. Gandhi M, Singh S (2015) Predictions in heart disease using techniques of data mining. In: International Conference on Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE). pp 520−525. 7. Dewan, A., Sharma, M.: Prediction of heart disease using a hybrid technique in data mining classification. In: 2nd International Conference on Computing for Sustainable Global Development (INDIACom), pp. 704–706. (2015) 8. Sidiq U, Mutahar Aaqib S (2019) Disease diagnosis through data mining techniques. In: International Conference on Intelligent Computing and Control Systems (ICCS). pp 275−280. 9. Kohli P, Arora S (2018) Application of machine learning in disease prediction. In: 4th International Conference on Computing Communication and Automation (ICCCA). pp 1−4. 10. Bharat A, Pooja N, Reddy A (2018) Using machine learning algorithms for breast cancer risk prediction and diagnosis. In: 3rd International Conference on Circuits, Control, Communication and Computing (I4C). pp 1−4. 11. Alanazi A, Mezher M (2020) Using machine learning algorithms for prediction of diabetes mellitus. In: International Conference on Computing and Information Technology (ICCIT1441). pp 1−3. 12. Sarwar M, N. Kamal N, Hamid W, Shah M (2018) Prediction of diabetes using machine learning algorithms in healthcare. In: 24th International Conference on Automation and Computing (ICAC), pp 1–6. 13. Sangari N, Qu Y (2020) A comparative study on machine learning algorithms for predicting breast cancer prognosis in improving clinical trials. In: International Conference on Computational Science and Computational Intelligence (CSCI). pp 813−818. 14. Neelaveni J, Devasana M (2020) Alzheimer disease prediction using machine learning algorithms. In: 6th International Conference on Advanced Computing and Communication Systems (ICACCS). pp. 101−104. 15. Shouman M, Turner T (2012) Integrating naive bayes and K-Means clustering with different initial centroid selection methods in the diagnosis of heart disease patients. In: International Conference of Data Mining & Knowledge Management Process (CKDP).
Chapter 34
Dense Monocular Depth Estimation with Densely Connected Convolutional Networks Adeeba Ali, Rashid Ali, and M. F. Baig
1 Introduction

The objective of estimating the distance relative to the camera, that is, depth, is to obtain a geometrical representation of the scene and to recover the appearance and 3D shape of the objects present in the RGB image. Therefore, the estimation of the depth of objects present in the surrounding environment is one of the fundamental requirements of many emerging technologies [1–3]. A predicted depth map of high-quality resolution can be very useful in many vision-based applications like scene understanding and reconstruction, autonomous navigation of mobile robots, semantic segmentation [3], camera pose estimation, refocusing of images [2], and augmented reality [1]. Earlier, researchers and developers generally made use of lidar, laser sensors, etc. to obtain the depth of the scene. However, the use of these sensors increases the cost and weight of the application, which decreases the efficiency of the system. To solve these issues, researchers proposed the stereo camera-based depth estimation technique, which leverages pairs of images to obtain depth using the stereo matching method. This method therefore requires multiple images for depth calculation, which increases the response time of the application. With the advent of deep learning these limitations have been resolved, as depth can easily be estimated from the depth features extracted from a single RGB image using algorithms based on deep convolutional neural networks. Furthermore, current developments in deep learning-based depth estimation are focusing on
using deep convolutional neural networks for the transformation of images from 2D to 3D. Despite the growing applications and performance of these techniques, there are still some issues with the resolution and quality of depth maps predicted from monocular RGB images. Many emerging applications in the domains of artificial depth-of-field, augmented reality, and other image effects need a high-quality 3D reconstruction of scenes in a minimal amount of time. Therefore, for such applications the reconstructed depth maps should be free from the large gaps or discontinuities that are generally introduced by convolutional neural networks in the estimation results. The problem of deep learning-based depth estimation has been reduced to the problem of minimizing the depth regression error. Computer vision researchers are still working on the development of deep learning-based depth estimation algorithms with the objective of minimizing the difference between the depth estimated by the CNN model and the one obtained using a depth camera, and of improving the quality of predicted depth maps in terms of obstacle boundaries, shape, resolution, etc. The aim of the presented work is to develop an efficient monocular depth estimation model following an encoder-decoder architecture which can regress denser depth maps of high quality with sharp boundaries and minimal blurriness. After analyzing the existing benchmark CNN architectures and training methods [4–8], we decided on a simpler architecture that can easily be fine-tuned and modified in the future. In this paper, we propose a transfer learning-based approach using the pretrained DenseNet-161 model as a dense feature extractor to solve the ill-posed problem of monocular depth estimation. The objectives and contributions of this research are as follows: • In the presented research, we use a compact encoder-decoder architecture with few training parameters and low computational complexity. • In order to verify the effectiveness of the proposed approach, we present the evaluation results obtained after testing the proposed model on the NYU validation dataset consisting of 654 RGB images and conclude that the Root Mean Square Error (RMSE) obtained in our case is lower than the RMSE values calculated using all the mentioned state-of-the-art methods for deep learning-based monocular depth estimation. • The results show that despite the simplicity of the network, the depth images generated by it are of both high quality and high resolution, with a clear visual representation of sharp object boundaries as compared to existing methods (Fig. 3). • To achieve these results, we rely only on the pretrained DenseNet-161 [9] model originally developed for image classification and trained on the ImageNet [10] dataset; in our model, it is used as a feature extracting encoder. By designing such a modular architecture, future developments in deep learning can be easily transferred to the problem of depth estimation with the help of transfer learning. • The paper demonstrates the use of the pretrained DenseNet-161 model as the encoder of the depth estimating network. The encoder is responsible for extracting features
from the RGB image and reducing its size, while the decoder is made up of upsampling layers and utilizes the encoded features to generate depth maps of a size equal to that of the ground-truth depth image. To the best of our knowledge, we are the first to use the pretrained DenseNet-161 model as an encoder in a depth estimation model.
2 Related Work The reconstruction of a 3D scene from an RGB image can be seen as an obscured task due to certain limitations imposed by surroundings like scale ambiguities, scene capturing camera, textureless walls, reflective materials, illumination effects, and lack of scene coverage, which is the limited field of view. These limitations are liable for ambiguities which create problems in deriving the geometry of the objects present in the scene from appearance. Usually, the techniques that can successfully capture the scene’s depth without any kind of ambiguity are assisted with some hardware tools like stereo camera, lidar, or IR-based sensors or the alternative solution that can be used without any hardware assistance is to capture a large number of images of the scene using a high-quality camera, but this approach requires a long and expensive reconstruction process which makes the onboard processing and scene reconstruction difficult with limited computation power. However, in the past few years, some researchers have also proposed deep learning-based solutions for depth estimation and 3D scene reconstruction that leverage deep convolutional neural networks and can estimate results after taking one or two RGB images as input. In this section, we have mentioned the recent research studies that have used deep learning-based techniques for deeper and denser depth prediction from RGB images. The deep learning-based techniques for the estimation of depth from RGB images are classified into the following three categories.
2.1 Depth Estimation from an RGB Image

Many researchers have formulated this task as a regression problem in which a model generates a depth map from a single RGB image [4, 6–8, 11, 12]. The performance of such convolutional neural networks has not improved dramatically over traditional depth estimation techniques; there is still much work to do in this domain in order to achieve much better results without the aid of any hardware support. The depth estimation results presented by Eigen and Fergus [13] demonstrate that the performance of such models can be improved by applying to monocular depth estimation different kinds of deep CNN architectures that have performed well on other computer vision tasks. In their work, they proposed a two-stack CNN, in which one stack is used for a coarse, upscaled prediction and the other is leveraged for making refinements in
the local details. Similarly, depth estimation using a deep residual network with 50 layers (ResNet-50) is proposed by Laina et al. [14], and their experimental results illustrate that they achieve better predictions than all the previous depth estimating techniques.
2.2 Image-Based 3D Reconstruction 3D reconstruction of the scene using convolutional neural networks leverages a pair of images captured through a stereo camera or multiple images of the same scene from multiple views captured through a monocular camera. Previous research in stereo depth estimation and scene reconstruction generally considered the solutions that make use of image pairs [15], three consecutive frames [16], etc. However, the approach of multi-view scene reconstruction using convolutional neural networks has recently been proposed [17] by a few researchers. Zhou et al. 2018 [18] proposed an approach for dense camera tracking and depth estimation using joint keyframes. In this work, we are concerned about using single image-based depth estimation for achieving better performance and minimal time complexity, and we hope that the use of the features extracted by monocular depth estimation would improve the results of multi-view-based scene reconstruction techniques.
2.3 Depth Estimating CNN with Encoder-Decoder Architecture

Encoder-decoder CNN architectures have made a notable contribution to computer vision, providing much better results than traditional approaches on problems such as depth estimation, optical flow estimation [19], image segmentation [20], and image restoration [21]. Moreover, the recent deployment of these architectures in both supervised and unsupervised learning-based depth estimation methods [15, 17, 18, 22–24] has shown remarkable results. Typically, these methods use one or more encoder-decoder pairs to construct the depth estimation network. In order to minimize the complexity of the model, Alhashim and Wonka [25] use only a single encoder-decoder pair, where the encoder is a DenseNet-169 [9] and the decoder is made up of basic upsampling blocks. Following their approach, we propose a single encoder-decoder network that leverages transfer learning by using the pretrained DenseNet-161 model as the encoder.
3 Proposed Method In this section, we describe our proposed approach for single image-based depth estimation. First, the details of the proposed depth estimation framework and the used encoder-decoder architecture are presented, and then the learning parameters and performance of the used pretrained model are discussed. Next, the description of the loss function used in training the network is illustrated, and then finally, the augmentation techniques used in the pre-processing of the dataset are described.
3.1 Design Framework for the Proposed Method The detailed process of the proposed depth estimation methodology is discussed as follows: • First, the input dataset consisting of RGB and depth images is partitioned into training and test datasets, where the training dataset consists of 48,000 images and the test dataset contains 654 images. • Next, the task of image normalization and data augmentation is carried out. In the normalization process, the input images are converted in the (0−255) range of pixel values, and data augmentation is performed for getting diversity in the dataset used for training the model. • Then, the pre-processed images are fed to the pretrained DenseNet-161, which extracts the dense depth features and downsamples the input RGB image to 1 × 1. • The downsampled image is supplied to the bilinear upsampling blocks of the decoder, which increase the resolution using the deconvolution layers and finally predict the required depth image. Figure 1 Represents the demonstrated algorithm in the form of a flowchart.
3.2 Network Architecture The pretrained Densenet [9] block is used as an encoder in the proposed depth estimation network. In order to predict depth using a single RGB image, first the RGB image of size 640 × 480 is given as an input to the encoder which is responsible for extracting features from the given image and then encoding it into a feature vector. In our network (shown in Fig. 2), we have used the Densenet-161 network with pretrained weights that were obtained for the ImageNet dataset as an encoder. In the next step, the encoded feature vector is fed to the decoder for generating a depth map of resolution 304 × 228. The decoder used in our case is made up of five upsampling blocks [21] where each upsampling block is made up of two layers
Fig. 1 Flowchart for the proposed depth estimation algorithm
Fig. 2 The visual representation of the proposed depth estimation network
of bilinear upsampling followed by two convolutional layers with kernel size equal to 3 × 3, and the number of output filters is set to half the number of input filters. Out of these two convolutional layers, the first one is applied to the concatenated output of the previous layer and the pooling layer of the encoder of the same spatial dimensions. Except for the last upsampling block, the output of each upsampling block is fed to the leaky ReLU activation function [26] with learning parameter α = 0.2. For simplicity purposes, we don’t use any batch normalization [27] layer in the network unlike the standard state-of-the-art depth estimation methods [8, 11].
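A simplified sketch of the encoder-decoder just described, using the torchvision DenseNet-161 feature extractor and five bilinear upsampling blocks, is shown below; the skip connections to the encoder's pooling layers and the exact output-resolution handling are omitted here, so this is an illustrative skeleton rather than the full network.

import torch
import torch.nn as nn
from torchvision import models

class UpBlock(nn.Module):
    """Bilinear upsampling followed by a 3x3 convolution that halves the channels,
    with the leaky ReLU (slope 0.2) used in the paper."""
    def __init__(self, in_ch):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.conv = nn.Conv2d(in_ch, in_ch // 2, kernel_size=3, padding=1)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x):
        return self.act(self.conv(self.up(x)))

class DepthNet(nn.Module):
    """DenseNet-161 feature extractor followed by a simple upsampling decoder.
    The encoder-decoder skip connections described above are omitted for brevity."""
    def __init__(self):
        super().__init__()
        weights = models.DenseNet161_Weights.IMAGENET1K_V1
        self.encoder = models.densenet161(weights=weights).features   # 2208 channels out
        self.decoder = nn.Sequential(
            UpBlock(2208), UpBlock(1104), UpBlock(552), UpBlock(276), UpBlock(138),
            nn.Conv2d(69, 1, kernel_size=3, padding=1))                # 1-channel depth map

    def forward(self, x):
        return self.decoder(self.encoder(x))

# depth = DepthNet()(torch.randn(1, 3, 480, 640))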
3.3 Performance of the Proposed Network The proposed depth estimation network with a pretrained Densenet-161 model as an encoder and a simple decoder made up of five upsampling blocks and one transpose convolutional layer has achieved better performance for monocular depth estimation than the existing state-of-the-art complex depth estimating architectures. We compare the performance of our model with the methods that have used other pretrained models including ResNet-50 [28] as the encoder of their network and thus the evaluation results imply that simply by using the less complex network, we can achieve state-of-the-art results that are comparable to existing benchmark and complex depth estimation models. Furthermore, we perform different experiments by changing the number of upsampling and transposing convolutional layers, however, we conclude that further increment in the number of upsampling blocks beyond a certain limit doesn’t affect the performance of the model, instead, it makes the network more complex. Therefore, we analyze that the simple decoder composed of five upsampling blocks followed by a transpose convolutional layer can generate depth maps of high-quality resolution.
3.4 Loss Function

We train our depth estimation network with the Mean Absolute Error (MAE) loss function and compare its evaluation results with those of existing state-of-the-art monocular depth estimation methods. The MAE loss function is generally used for optimization in supervised regression problems. The MAE loss minimizes the L1 norm of the difference between the predicted and ground-truth values,
$L_{\mathrm{MAE}}(\hat{y}, y) = \frac{1}{n}\sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|$.
For our problem, MAE performs very well, and the depth maps generated by the model trained with this loss function have sharp boundaries rather than a smooth, low-quality visual representation of edges. An advantage of the MAE loss function is that it does not penalize outliers in the training dataset heavily, as all of the errors (differences between predicted and ground-truth values) are weighted on a single linear scale.
3.5 Data Augmentation In order to reduce overfitting and to get a better-generalized performance of the model, researchers usually augment the dataset used for training by carrying out various geometric and photometric transformations. In our case, it will be inappropriate to apply all geometric transformations to the image since the proposed model is designed to generate a depth map of the entire RGB image, therefore the distortions in the visual representation of the image do not always lead to logical interpretations on
the corresponding ground-truth depth image. After taking into account these issues, we devised an augmentation policy for our depth estimation dataset that includes five steps: Scaling, Rotation, Color jittering, Normalization, and Flipping. The RGB images of both training and validation datasets are scaled by a number ‘s’ randomly selected from the range 1 to 1.5, that is, s ∈ [ 1, 1.5 ], and corresponding pixel values of the ground-truth depth images are divided by the same number s. Next, rotation is performed on both the RGB and depth images, where both get rotated with a random degree r ∈ [−5, 5]. After rotation, the color images go through color jittering where the values for contrast, saturation, and brightness are each scaled by ki ∈ [0.6, 1.4]. Then, the next step in the used data augmentation strategy is color normalization where each RGB image is normalized by subtracting the mean values of pixels of the image from each pixel value and then dividing the result by the standard deviation value of the image. Then, the last transformation applied to color and depth images is flipping (mirroring) where both kinds of images are flipped horizontally at a probability of 0.5.
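The following sketch implements the paired augmentation policy described above using torchvision's functional transforms; the interpolation defaults and the exact per-image normalization are our assumptions, not details specified in the paper.

import random
import torchvision.transforms.functional as TF

def augment(rgb, depth):
    """Paired augmentation: rgb and depth are PIL images of the same size."""
    # Scaling: enlarge both images by s in [1, 1.5] and divide depth values by s.
    s = random.uniform(1.0, 1.5)
    w, h = rgb.size
    rgb = TF.resize(rgb, [int(h * s), int(w * s)])
    depth = TF.resize(depth, [int(h * s), int(w * s)])
    # Rotation: same random angle in [-5, 5] degrees for both images.
    r = random.uniform(-5.0, 5.0)
    rgb, depth = TF.rotate(rgb, r), TF.rotate(depth, r)
    # Color jittering on the RGB image only, each factor in [0.6, 1.4].
    rgb = TF.adjust_brightness(rgb, random.uniform(0.6, 1.4))
    rgb = TF.adjust_contrast(rgb, random.uniform(0.6, 1.4))
    rgb = TF.adjust_saturation(rgb, random.uniform(0.6, 1.4))
    # Horizontal flip with probability 0.5, applied to both images.
    if random.random() < 0.5:
        rgb, depth = TF.hflip(rgb), TF.hflip(depth)
    # To tensors; depth values are rescaled by 1/s, RGB is normalized per image.
    rgb_t, depth_t = TF.to_tensor(rgb), TF.to_tensor(depth) / s
    rgb_t = (rgb_t - rgb_t.mean()) / (rgb_t.std() + 1e-6)
    return rgb_t, depth_t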
4 Experimental Results In this section, we first discussed the dataset used and the implementation details of the presented technique for monocular depth estimation. Next, we present a brief description of the evaluation metrics to validate our experiments, and then finally we compare the performance of our transfer learning-based encoder-decoder network with the existing state-of-the-art depth estimation methods.
4.1 Dataset Used

The NYU V2 depth [29] dataset consists of color (RGB) and ground-truth depth images of 464 different indoor scenes captured with a Microsoft Kinect RGB-D camera at a resolution of 640 × 480. Of these 464 scenes, we use 249 for training the model and 215 for validation and testing, which is the official division of the dataset. The training dataset consists of approximately 48,000 synchronized depth-RGB image pairs evenly sampled from the raw video sequences of the official training split. For evaluating the performance of the model, the small labeled dataset of 654 images is used. The resolution of the depth maps generated by our encoder-decoder depth estimation network is 304 × 228. The RGB images used for training are fed to the model at their original resolution, that is, 640 × 480, whereas the ground-truth depth images are downsampled and center-cropped to a resolution of 304 × 228.
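As an illustrative sketch of how the RGB-depth pairs described above could be wrapped for training, the following PyTorch Dataset downsamples and center-crops the ground-truth depth to 304 × 228 while leaving the RGB input at 640 × 480; the class name, file handling, and interpolation choices are our assumptions.

from torch.utils.data import Dataset
import torchvision.transforms.functional as TF
from PIL import Image

class NYUDepthPairs(Dataset):
    """Illustrative wrapper around (rgb_path, depth_path) pairs: RGB stays at
    640x480, ground-truth depth is resized and center-cropped to 304x228."""
    def __init__(self, pairs):
        self.pairs = pairs

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        rgb_path, depth_path = self.pairs[idx]
        rgb = TF.to_tensor(Image.open(rgb_path).convert("RGB"))      # 3 x 480 x 640
        depth = Image.open(depth_path)
        depth = TF.center_crop(TF.resize(depth, 240), [228, 304])    # H x W = 228 x 304
        return rgb, TF.to_tensor(depth)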
4.2 Implementation Details The proposed depth estimation network is implemented using PyTorch [13] and trained on the benchmark NYU V2 depth dataset using a GeForce RTX 2070 GPU with 16 GB memory. The encoder used in the network is Densenet-161 pretrained on the ImageNet dataset. However, random weights are assigned to the decoder. During the training of the network, the ADAM optimizer is used with a learning rate of 0.001, and the values used for the learning parameters β1 and β2 are 0.9 and 0.999, respectively. The small batch size of 8 is used and the model is trained for 20 epochs, where for each epoch the network is trained on an entire dataset of about 48,000 images and images fed to the model in small batches of 8.
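A minimal training-loop sketch matching the implementation details above (Adam with learning rate 0.001, β1 = 0.9, β2 = 0.999, batches of 8, 20 epochs, MAE loss); the device handling and the optional resolution alignment are our additions.

import torch
import torch.nn as nn

def train_depth_model(model, train_loader, epochs=20, lr=1e-3, device="cuda"):
    """Trains with the Adam optimizer and the MAE (L1) loss from Sect. 3.4."""
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, betas=(0.9, 0.999))
    criterion = nn.L1Loss()                        # mean absolute error
    model.train()
    for epoch in range(epochs):
        for rgb, depth in train_loader:            # loader built with batch_size=8
            rgb, depth = rgb.to(device), depth.to(device)
            optimizer.zero_grad()
            pred = model(rgb)
            # Align resolutions if the decoder output differs from the target size.
            pred = nn.functional.interpolate(pred, size=depth.shape[-2:],
                                             mode="bilinear", align_corners=False)
            loss = criterion(pred, depth)
            loss.backward()
            optimizer.step()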
4.3 Evaluation Metrics We evaluate our depth estimation model with the following metrics: (1) RMSE The root mean square error is calculated as the square root of the mean of the square value of the difference between the predicted value and ground-truth value: r mse =
√
mean y − yˆ 2
(1)
(2) REL The mean absolute error is the mean of the absolute value of the relative difference between the predicted and ground-truth value of depth: r el = mean abs y − yˆ /y
(2)
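A small sketch of Eqs. (1) and (2) in NumPy, assuming flattened ground-truth and predicted depth arrays with strictly positive ground-truth values.

import numpy as np

def rmse(y, y_hat):
    """Root mean square error, Eq. (1)."""
    return np.sqrt(np.mean((y - y_hat) ** 2))

def rel(y, y_hat):
    """Mean absolute relative error, Eq. (2); y is the ground-truth depth."""
    return np.mean(np.abs(y - y_hat) / y)

# usage on flattened depth maps:
# print(rmse(gt.ravel(), pred.ravel()), rel(gt.ravel(), pred.ravel()))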
4.4 Performance Comparison In this section, we present the evaluation results of our monocular depth estimation model trained on the NYU V2 depth dataset. In Table 1, the evaluation results of the state-of-the-art monocular depth estimation methods are compared with ours. This comparison illustrates that the proposed encoder-decoder architecture-based CNN model outperforms all the existing state-of-the-art [13, 14, 30–32] depth estimation methods for the RMSE evaluation metric. Our test results that are presented in Table 1 are obtained from the model which is trained with MAE loss. For a visual representation of the results, Figure 3 shows a gallery of depth images estimated by our model
Table 1 Comparison of the performance of the proposed model trained with the MAE loss function with the state-of-the-art methods

Method            | RMSE  | REL
Eigen et al. [13] | 0.641 | 0.158
Laina et al. [14] | 0.573 | 0.127
Roy et al. [30]   | 0.744 | 0.187
Ma et al. [31]    | 0.514 | 0.143
Lan et al. [32]   | 0.557 | 0.128
Proposed model    | 0.505 | 0.142
along with those generated by state-of-the-art depth estimation techniques. From this comparison, it can be concluded that our proposed approach produces high-quality depth maps with sharp depth edges and relatively little blurriness, in some cases appearing even slightly cleaner than the ground-truth depth images.
5 Conclusion and Future Work

In this paper, we have demonstrated a transfer learning-based monocular depth estimation network and obtained state-of-the-art experimental results. Leveraging recent developments in deep learning, we design a modular encoder-decoder network architecture which exploits the pretrained dense feature extractor, DenseNet-161, to learn a mapping between the features present in the RGB image and the ground-truth depth map. The proposed network comprises two parts, an encoder and a decoder. The encoder is a pretrained CNN model, DenseNet-161, responsible for dense feature extraction, and the upsampling and deconvolution layer-based decoder upsamples the stack of depth features to generate the high-resolution depth image. Despite such a simple architecture, the proposed depth estimation model performs better than the existing state of the art. However, in some estimated depth images, certain artifacts and blurriness are observed; this issue is due to the high sparsity of the ground-truth depth images. As future work, we plan to train the network with a new loss function that combines MAE and a photometric reconstruction loss to improve the performance of the model. Furthermore, we will explore other benchmark depth datasets, such as KITTI and Make3D, and other pretrained CNN models for designing a more robust and efficient depth estimation network.
Fig. 3 Comparison of depth maps on NYU-Depth-v2 dataset. From left to right: a RGB images; b ground-truth depth; c prediction of the proposed model; d prediction of Karaman et al. [28]
References 1. Lee W, Park N, Woo W (2011) Depth-assisted real-time 3D object detection for augmented reality. ICAT11 2:126−132. 2. Moreno-Noguer F, Belhumeur P, Nayar S (2007) Active refocusing of images and videos. ACM Trans Graph 26(99):67. 3. Hazirbas C, Ma L, Domokos C, Cremers D (2016) FuseNet: incorporating depth into semantic segmentation via fusion-based CNN architecture. ACCV. 4. Eigen D, Puhrsch C, Fergus R (2014) Depth map prediction from a single image using a multi-scale deep network. NIPS. 5. Li B, Shen C, Dai Y, van den Hengel A, He M (2015) Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs. CVPR 6. Laina I, Rupprecht C, Belagiannis V, Tombari F, Navab N (2016) Deeper depth prediction with fully convolutional residual networks. In: 4th International Conference on 3D Vision (3DV). pp 239–248. 7. Xu D, Ricci E, Ouyang W, Wang X, Sebe N (2017) Multi-scale continuous CRFs as sequential deep networks for monocular depth estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 5354−5362. 8. Fu H, Gong M, Wang C, Batmanghelich K, Tao D (2018) Deep ordinal regression network for monocular depth estimation. In: IEEE Conference on Computer Vision and Pattern Recognition. pp 2002−2011 9. Huang G, Liu Z, van der Maaten L, Weinberger K (2016) Densely connected convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp 2261−2269. 10. Deng J, Dong W, Socher R, L-J, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition. pp 248−255. 11. Hao Z, Li Y, You S, Lu F (2018) Detail preserving depth estimation from a single image using attention guided networks. In: Proceedings−2018 International Conference on 3D Vision, 3DV. Institute of Electrical and Electronics Engineers, pp 304−313. 12. Xu D, Wang W, Tang H, Liu H, Sebe N, Ricci E (2018) Structured attention guided convolutional neural fields for monocular depth estimation. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp 3917−3925. 13. Eigen D, Fergus R (2015) Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In: Proceedings of the IEEE International Conference on Computer Vision. pp 2650−2658. 14. Laina I, Rupprecht C, Belagiannis V, Tombari F, Navab N (2016) Deeper depth prediction with fully convolutional residual networks. In: 4th IEEE International Conference on 3D Vision (3DV). pp 239−248. 15. Ummenhofer B, Zhou H, Uhrig J, Mayer N, Ilg E, Dosovitskiy A, Brox T (2017) DeMoN: depth and motion network for learning monocular stereo. In: IEEE Conference on Computer Vision and Pattern Recognition. pp 5622−5631. 16. Godard C, mac Aodha O, Firman M, Brostow G (2018) Digging into self-supervised monocular depth estimation. CoRR. 17. Huang PH, Matzen K, Kopf J, Ahuja N, Huang JB (2018) DeepMVS: learning multiview stereopsis. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp 2821−2830. 18. Zhou H, Ummenhofer B, Brox T (2018) DeepTAM: deep tracking and mapping. In: Proceedings of the European Conference on Computer Vision (EECV). pp 822−838. 19. Fischer P, Dosovitskiy A, Ilg E, Häusser P, Hazırba¸s C, Golkov van der Smagt D (2015) FlowNet: learning optical flow with convolutional networks. In: IEEE International Conference on Computer Vision (ICCV). pp 2758−2766.
20. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention-MICCAI. Springer International Publishing, pp 234−241. 21. Lehtinen J, Munkberg J, Hasselgren J, Laine S, Karras T, Aittala M, Alia T (2018) Noise2Noise: learning image restoration without clean data. In: Proceedings 35th International Conference on Machine Learning. pp 2965−2974. 22. Godard C, mac Aodha O, Brostow G (2017) Unsupervised monocular depth estimation with left-right consistency. In: IEEE Conference on Computer Vision and Pattern Recognition. pp 6602−6611. 23. Maas A, Hannun A, Ng A (2013) Rectifier nonlinearities improve neural network acoustic models. In: Proceedings ICML. vol 30, p 3. 24. Collobert R, Kavukcuoglu K, Farabet C (2011) Torch7: a matlab-like environment for machine learning. In: BigLearn, NIPS Workshop. 25. Alhashim I, Wonka P (2018) High quality monocular depth estimation via transfer learning. CoRR. arXiv preprint arXiv:1812.11941. 26. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. ICML. 27. Owen A (2007) A robust hybrid of lasso and ridge regression. Contemp Math 443: 59−72. 28. He K, Zhang X, Ren S, Sun J (2015) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 770−778. 29. Silberman N, Hoiem D, Kohli P, Fergus R (2012) Indoor segmentation and support inference from RGBD images. In: Proceedings of the IEEE Conference on Computer Vision. pp 746−760. 30. Roy A, Todorovic S (2016) Monocular depth estimation using neural regression forest. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 5506−5514. 31. Ma F, Karaman S (2018) Sparse-to-dense: depth prediction from sparse depth samples and a single image. In: IEEE International Conference on Robotics and Automation (ICRA). pp 4796−4803. 32. Lan L, Zhang Y, Yang Y (2021) Monocular depth estimation via convolutional neural network with attention module. J Phys: Conf Ser 2025(1):12–62
Chapter 35
Learning Automata Based Harmony Search Routing Algorithm for Wireless Sensor Networks Karthik Karmakonda, M. Swamy Das, and Bandi Rambabu
1 Introduction Wireless Sensor Networks (WSNs) link tiny sensor nodes with limited battery capacity, transmission power, and processing capability to monitor their surroundings. Because the nodes are typically deployed in unattended environments, their batteries cannot be recharged. A sensor node placed in a specific area senses local data and packages it for single-hop or multi-hop transmission to the sink node. With advances in technology and hardware, components have become smaller and cheaper, so sensor node memory and processing power are no longer the main constraint; battery technology, by comparison, has not improved at the same pace. Elhabyan et al. [1] implemented a two-level PSO-based clustering and routing mechanism, but the algorithm leads to unequal energy consumption. Rambabu et al. [2] proposed a hybrid ABC-MBO-based cluster head selection scheme for predominant cluster head selection; ABC-MBO addresses the poor global search ability of ABC, which causes sensor nodes to die quickly during cluster head selection.
K. Karmakonda (B) · M. Swamy Das CSE, CBIT, Hyderabad, India e-mail: [email protected] M. Swamy Das e-mail: [email protected] B. Rambabu CSE, CVR College of Engineering, Hyderabad, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_35
Yang et al. [3] developed a multipath routing protocol (MRP) to decrease energy consumption and optimise network lifespan. It derives multiple paths between the cluster head and the sink node using ACO, and in the last step the load-balancing cluster head chooses the optimum forwarding route. Mohemmed and Sahoo [4] suggested a hybrid PSO/noising approach to solve the network shortest-path problem; it used cost-priority encoding/decoding to incorporate network-specific heuristics in route creation. Liu et al. [5] created QoS-PSO, an agent-assisted QoS-based routing scheme using PSO in WSNs. The objective function is formed by considering bandwidth, packet loss, and latency, and agents monitor network packet flow, routing status, and topology. In this approach, each node must keep global information, which is not viable in large-scale wireless sensor networks. In [6], the Harmony Memory Considering Rate (HMCR) parameter controls convergence in each iteration, but it is not used in the local search. This method considers only the hop count from the sensor node to the sink node and selects the shortest path initially; in subsequent searches the new harmony vector cannot exceed the previous longest path, which reduces the possibility of finding a near-optimal path. Wang et al. [7] proposed AMRA, an adaptive multi-path routing algorithm based on harmony search, which uses dynamic parameters to regulate the local and global search processes. It uses HLAR to reduce the probability of being trapped in local optima and to improve convergence precision. Rao et al. [8] implemented a PSO-based clustering scheme in which the fitness function is designed using parameters such as the sink distance of all cluster heads, residual energy, and inter-cluster distance. The algorithm applies PSO-based meta-heuristics for cluster head selection; as a result, cluster heads are not uniformly distributed in the sensing area, which leads to unequal energy consumption. This paper proposes a new HS scheme called the LAHSR (Learning Automata based Harmony Search Routing) algorithm to maximise the network lifetime and improve energy efficiency. The major contributions can be listed as follows: 1. Learning automata are used to fine-tune control parameters, namely the Harmony Memory Consideration Rate (HMCR), Pitch Adjustment Rate (PAR), and Length Adjustment Rate (LAR). 2. The Harmony Search algorithm is used to find a forwarding path from the sensor node to the sink node. The paper is structured as follows. In Sect. 2, a learning automata-based next hop selection approach to initialise the harmony memory and the LAHSR algorithm are proposed. Section 3 shows the simulation results and comparisons. Section 4 offers concluding thoughts, suggestions, and future work.
2 Related Work 2.1 Problem Statement Energy-efficient routing is an NP-hard problem in large-scale sensor networks. It is evident from the literature that the use of meta-heuristic algorithms in routing can drastically reduce energy consumption [9]. However, in the surveyed works the authors considered only the energy consumption and hop count of nodes to determine the fitness of a path from the sensor node to the sink node. Routing parameters such as link quality and delay, which play a major role in improving the energy efficiency of a routing algorithm, are not considered.
2.2 Learning Automata Based Routing Mechanism in WSNs Using the Harmony Search Algorithm In this section, the Wireless Sensor Network (WSN) model is introduced. A learning automata-based next hop selection method is proposed to create a forwarding path from source to destination. The harmony search algorithm is then introduced, and finally the learning automata based harmony search algorithm is proposed.
2.3 WSN Model A WSN can be modelled as a graph G(V ∪ S, E), as shown in Fig. 1, where V = {v_1, v_2, v_3, ..., v_n} is a set of n sensor nodes, E = {e_1, e_2, e_3, ..., e_m} is a set of communication links, and S is the sink node where data processing is done. Every node estimates the selection probability of each of its neighbours via learning automata by considering their residual energy and distance. LAHSR uses these selection probabilities to find energy-efficient, dependable routes.
2.4 Learning Automata Based Mechanism The proposed learning automaton model is a 5-tuple, represented as <A, α, P, β, L>. The set of automata is represented as A = {a_1, a_2, ..., a_d}, and an automaton is stationed at every node (where d is the degree of the node). The set of actions is represented as α = {α_1, α_2, ..., α_d}, where α corresponds to the set of d neighbouring nodes and the actions are executed on a random environment.
Fig. 1 Illustration of multiple paths from a sensor node to the sink node in a WSN
2.5 Learning Automata Based Harmony Search Routing Algorithm
2.5.1 Fine Tuning the HSA Algorithm for WSNs
In order to adapt the Harmony Search algorithm to WSNs, we allow harmony memories of different sizes. As shown in Eq. 1, HM_1, HM_2, ..., HM_i, ..., HM_HMS are harmony memories, each of which represents a forwarding route from the source node to the destination node in the network. The number of nodes on a forwarding route may vary, so the harmony memories have different lengths (Fig. 2).
The harmony search algorithm considers three control parameters. Pitch Adjustment Rate (PAR): it is chosen by the learning automata from the range 0.5 to 0.9. When a new harmony (s, n_1, n_2, n_3, ..., n_i, ..., d) is created in each iteration, node n_i is handled based on PAR: if PAR ≤ p_2, the node is included in the new harmony; otherwise a node from Neighbor(n_{i−1}) is chosen randomly. Length Adjustment Rate (LAR): it is chosen by the learning automata from the range 0.1 to 0.5. If p_2 ≤ PAR ∧ p_3 ≥ LAR, node n_{i,j} in the new harmony is replaced with one of its neighbours; if p_2 ≤ PAR ∧ p_3 ≤ LAR, a neighbour of node n_{i,j} is inserted into the new harmony; otherwise nothing is done.
The harmony search algorithm maintains a set of harmony vectors, each of which represents a near-optimal solution. Initially the HM vectors are initialised randomly. The overall procedure is:
1. Initialise the learning automata and the Harmony Memory (HM).
2. Derive a new harmony (forwarding path).
3. Apply a greedy method to replace the lowest-fitness harmony in HM with the new harmony if it has higher fitness.
4. Continue until the maximum number of iterations is reached.
The classic HS technique for continuous optimization problems requires all solution vectors (harmonies) in HM to have the same dimension. WSN routing, in contrast, is a discrete optimization problem with an unknown number of sensor nodes, so the dimension of the harmonies in HM may vary.
Fig. 2 Learning automata based harmony search
2.5.2 Initialization of Harmony Memory
In this step, the harmony memory is initialised using the learning automata based next hop selection method, which uses each node's score, link quality, energy, and distance. The probability of selecting node j as the next hop of node i is calculated using the equation given below. This process starts from the source node and continues until the sink node, i.e., the destination node, is reached.

HM = [HM_1; ...; HM_HMS] = [(s, n_{1,2}, n_{1,3}, ..., n_{1,l}, ..., d); ...; (s, n_{HMS,2}, n_{HMS,3}, ..., n_{HMS,n}, ..., d)]    (1)

where s is the source node, n_{i,j} is an intermediate node on the path from source to destination, HMS is the harmony memory size (equal to the number of possible paths from source to destination), d is the destination node, and HM_i is the ith harmony memory, i.e., the ith forwarding path. Each harmony memory in Eq. 1 is initialised as follows: each node calculates the probability of choosing its next hop neighbour using
P_{i,j} = (Σ_{k∈N_i} h_k − h_j) / ((|N_i| − 1) · Σ_{k∈N_i} h_k)  if j ∈ N_i, and P_{i,j} = 0 otherwise,

where N_i is the set of neighbours of node i, |N_i| is its cardinality, and h_k is the score of neighbour k.
A harmony memory HM_i = (s, n_{i,2}, n_{i,3}, ..., n_{i,j}, ..., d) is then obtained as follows. Initially, the source node selects the next hop neighbour with the highest probability value P_{i,j}. That neighbour becomes the current node, and the current node in turn selects its next hop neighbour with the highest probability value. This process continues until the sink node is reached.
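A minimal Python sketch of this initialisation step, assuming the node scores h_k are held in a `scores` mapping and the topology in a `neighbors` mapping, is given below. These names, and the loop-avoidance check, are illustrative additions rather than details taken from the paper.

```python
def next_hop_probabilities(scores, neighbors, i):
    """Selection probability P(i, j) for each neighbour j of node i (equation above)."""
    nbrs = list(neighbors[i])
    if len(nbrs) == 1:
        return {nbrs[0]: 1.0}
    total = sum(scores[k] for k in nbrs)
    denom = (len(nbrs) - 1) * total
    return {j: (total - scores[j]) / denom for j in nbrs}

def init_harmony(source, sink, scores, neighbors):
    """Build one forwarding path (harmony) by greedily taking the most probable next hop.
    Sampling in proportion to P(i, j) instead would give a more diverse initial memory."""
    path, current = [source], source
    while current != sink:
        probs = next_hop_probabilities(scores, neighbors, current)
        candidates = {j: p for j, p in probs.items() if j not in path}  # avoid loops
        if not candidates:
            break  # dead end: in practice this path would be discarded and rebuilt
        current = max(candidates, key=candidates.get)
        path.append(current)
    return path

def init_harmony_memory(hms, source, sink, scores, neighbors):
    """Initialise HMS forwarding paths from source to sink."""
    return [init_harmony(source, sink, scores, neighbors) for _ in range(hms)]
```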
2.5.3 Generate a New Harmony Memory
After initialization of the harmony memory vectors, i.e., multiple route paths from source to sink, a new harmony HM_new = (s, n_1, n_2, n_3, ..., n_i, ..., d) is generated from the existing harmony memory as follows:

n_i ∈ {n_{1,i}, n_{2,i}, ..., n_{HMS,i}},  if p_1 ≤ HMCR ∧ directNBR(n_{i−1}, n_i) = 1
n_i ∈ Neighbor(n_{i−1}),  otherwise

where p_1 is a random probability in the range 0 to 1 and HMCR is the harmony memory consideration rate, also in the range 0 to 1. directNBR(node1, node2) = 1 if node2 is a direct neighbour of node1 (i.e., node2 is within the communication range of node1); otherwise directNBR(node1, node2) = 0. Neighbor(n_i) is the set of all direct neighbours of node n_i. This process continues until the selected next node is the sink node.
2.5.4 New Harmony Length Adjustment Rate (LAR)
In the new harmony (forwarding path), one node n_{i,j} is selected at random and adjusted by replacing it with, or inserting, a new neighbour. This prevents the search from being trapped in local optima and from always converging to shorter paths, enabling all possible neighbours to be explored and a near-optimal solution to be found:

n_{i,j} = replace,     if p_2 ≤ PAR ∧ p_3 ≥ LAR
n_{i,j} = insert,      if p_2 ≤ PAR ∧ p_3 ≤ LAR
n_{i,j} = do nothing,  otherwise

where p_2 and p_3 are random variables in the range 0 to 1 and PAR is the pitch adjustment rate.
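The improvisation and adjustment steps can be sketched as follows. This is not the authors' implementation: the choice of candidate nodes for the replace/insert operations is not fully specified in the text, so the common-neighbour heuristic used here is an assumption, and loop handling is omitted for brevity.

```python
import random

def generate_new_harmony(hm, neighbors, source, sink, hmcr, max_hops=50):
    """Improvise a new path: with probability HMCR reuse the node stored at the same
    position in one of the memorised paths (if it is a direct neighbour of the previous
    node), otherwise pick a random neighbour of the previous node."""
    path = [source]
    for i in range(1, max_hops):
        prev = path[-1]
        if prev == sink:
            break
        stored = [p[i] for p in hm if len(p) > i and p[i] in neighbors[prev]]
        if stored and random.random() <= hmcr:
            path.append(random.choice(stored))                 # memory consideration
        else:
            path.append(random.choice(list(neighbors[prev])))  # random re-selection
    return path

def length_adjust(path, neighbors, par, lar):
    """Pitch/length adjustment of one randomly chosen intermediate node."""
    if len(path) <= 2:
        return path
    j = random.randrange(1, len(path) - 1)
    p2, p3 = random.random(), random.random()
    if p2 <= par and p3 >= lar:
        candidates = list(neighbors[path[j - 1]] & neighbors[path[j + 1]])
        if candidates:
            path[j] = random.choice(candidates)                # replace node j
    elif p2 <= par and p3 <= lar:
        candidates = list(neighbors[path[j]] & neighbors[path[j + 1]])
        if candidates:
            path.insert(j + 1, random.choice(candidates))      # insert an extra hop
    return path
```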
After the adjustment, the fitness of the new forwarding path is evaluated. Following the greedy step described above, if the new path has better fitness than the worst path in the HM, the worst path is replaced with the new one.
2.6 Evaluating the New Harmony Using the Fitness Function and Replacing the Worst Harmony The energy dissipated by a communication in the wireless network is estimated in this study using the approximate energy model in [6]. This paper assumes a basic radio model in which the transmit amplifier spends E_amp = 0.1 nJ/bit/m² to attain an acceptable signal, and the transmitter and receiver circuitry dissipate E_elec = 50 nJ/bit. The radio thus uses the following energy to transmit a k-bit message over d metres:

E_Tx(k, d) = E_elec · k + E_amp · k · d²    (2)

To receive the message, the radio spends energy

E_Rx(k) = E_elec · k    (3)

Assume that a path X = (s, x_1, x_2, ..., x_i, ..., d) has L nodes (L elements); in this case, the cost of sending a k-bit message via this path is computed using

Engr(X) = 2 · (L − 1) · E_elec · k + E_amp · k · Σ_{i=1}^{L−1} d²_{i,i+1}    (4)

Each harmony memory HM_i, i.e., each forwarding path in Eq. 1, is evaluated using a fitness function. The fitness function is derived from parameters such as the hop count, the Minimum Residual Energy (MRE), and the Average Residual Energy (ARE) of the path:

f(HM_i) = (Engr(X) · L) / (Engr_min · Engr_avg)    (5)

In the above equation, Engr(X) is the energy dissipated along path X for sending a k-bit message, Engr_min is the residual energy of the node with the minimum energy on path X, and Engr_avg is the average residual energy of all nodes on path X.
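A compact sketch of the energy model and fitness evaluation (Eqs. 2–5), together with the greedy replacement step, is shown below. The function and variable names are illustrative, and the assumption that lower fitness values indicate better paths follows from the energy term appearing in the numerator; the paper does not state this explicitly.

```python
E_ELEC = 50e-9    # J per bit, electronics energy (50 nJ/bit)
E_AMP = 100e-12   # J per bit per m^2, amplifier energy (0.1 nJ/bit/m^2)

def path_energy(path, dist, k_bits):
    """Energy cost of forwarding a k-bit packet along `path` (Eq. 4).
    `dist(a, b)` returns the distance in metres between two nodes."""
    hops = len(path) - 1
    radio = 2 * hops * E_ELEC * k_bits  # transmit + receive electronics
    amp = E_AMP * k_bits * sum(dist(path[i], path[i + 1]) ** 2 for i in range(hops))
    return radio + amp

def fitness(path, dist, residual_energy, k_bits):
    """Fitness of a forwarding path (Eq. 5); lower is assumed to be better here."""
    energies = [residual_energy[n] for n in path]
    e_min, e_avg = min(energies), sum(energies) / len(energies)
    return path_energy(path, dist, k_bits) * len(path) / (e_min * e_avg)

def greedy_update(hm, new_path, dist, residual_energy, k_bits):
    """Replace the worst harmony in HM if the new path is better."""
    key = lambda p: fitness(p, dist, residual_energy, k_bits)
    worst = max(hm, key=key)
    if key(new_path) < key(worst):
        hm[hm.index(worst)] = new_path
    return hm
```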
3 Simulation and Results This section presents the simulation experiment findings. To examine the energy consumption of the sensor nodes and the lifespan of the network, Matlab R2022a was used on a PC with a 2.3 GHz Intel Core i7-3610QM CPU and 8 GB RAM. Based on the effective communication radius, ten random small-scale scenarios were produced. The number of nodes in each scenario ranged from 10 to 100 in increments of 10, and the simulated area ranged from 200 × 200 m² (10 nodes) and 300 × 300 m² (20 nodes) up to 1100 × 1100 m² (100 nodes). Two well-known routing algorithms, EEHSBR and IHSBEER, were compared with the proposed method in order to validate the effectiveness of the LAHSR method. Four metrics were used to evaluate the algorithms in each scenario: Average Residual Energy (ARE), Standard Deviation of Residual Energy (SDRE), Minimum Residual Energy (MRE), and Network Lifetime (NL). Thorough tests were conducted on the following instances to compare the proposed routing method with the IHSBEER and EEHSBR algorithms. The initial energy level of all nodes is set to 10 J, and in every scenario all packets are sent from the source nodes. Table 1 lists the relevant parameters. The experimental results were obtained by averaging 10 simulation runs. Figure 3a–c shows the average, minimum, and standard deviation of residual energy after 600 packets are received by the sink node, and Fig. 3d shows how many rounds the networks can sustain before nodes are depleted. Lower energy consumption leaves more residual energy. Figure 3a shows that LAHSR outperforms IHSBEER and EEABR; LAHSR saves more energy than EEHSBR, IHSBEER, and EEABR for packets of the same size. Balanced energy use is a critical component of WSN energy efficiency: a smaller standard deviation indicates balanced energy consumption, whereas a large one indicates that the nodes' remaining energy is used unevenly.
Table 1 Parameters Items’ Packet Size’ Communication’ Radius’ HMS’ HMCR’ in IHSBEER HMCR’, PAR’ in EEHSBR HMCR’, PAR’ in LAHS Evaluation times Initial Energy E elec E amp
Parameters’ 4098 bits 150 m 5 H MC Rmin = 0.2 , H MC Rmax = 0.9 HMCR=0.7,PAR = 0.2 H MC Rmin = 0.2 , H MC Rmax = 0.9. P A Rmin = 0.2 , P A Rmax = 0.5 500 10 J 50 nJ/bits 100 pJ/bits
Fig. 3 Performance of wireless sensor network
Figure 3b illustrates how the three strategies balance the network energy use for scenario 1. In each case, all events and packets come from the same source node. Network lifetime increases as the node with the least energy retains more energy. Figure 3c shows that LAHSR and IHSBEER surpass EEABR, and Fig. 3d shows that LAHSR and IHSBEER extend the network lifetime more than EEABR. We also timed 10 runs of LAHSR when determining the best source-to-sink forwarding paths of Fig. 3: because 10 runs take 40 s, the base station needs about 4 s to find the best source-to-sink path, so the proposed routing is practical. Simulations with 100 nodes further demonstrated LAHSR's superiority. When the sink node received 300, 400, or 1200 packets, the average, standard deviation, and minimum residual energy were determined; Figs. 4, 5, and 6 compare the results. They show that LAHSR outperforms IHSBEER and EEHSBR, demonstrating that LAHSR saves more energy and balances the network energy use better for SWSNs. Figure 5 shows that the standard deviation of residual energy for IHSBEER and EEHSBR increases with the number of packets received by the sink node; EEHSBR and IHSBEER therefore use more network energy than LAHSR.
Fig. 4 Average residual energy
Fig. 5 Standard deviation of residual energy
4 Conclusion and Future Scope Energy efficiency is a key limitation of WSNs. Using the harmony search algorithm, this research proposes a learning automata-based energy-efficient routing scheme for WSNs. The learning automata consider a node's residual energy and the lowest energy level along a path to determine the best next node on a routing path, and harmony search is used to find the forwarding routes. Using learning automata, we fine-tuned the harmony search control parameters and examined the HMCR, PAR, and LAR ranges. The proposed LAHSR performs better than IHSBEER and EEHSBR. In the future, we may deploy LAHSR on actual wireless sensor networks and test its QoS impact.
Fig. 6 Minimum residual energy
References 1. Elhabyan RSY, Yagoub MCE (2015) Two-tier particle swarm optimization protocol for clustering and routing in wireless sensor network. J Netw Comput Appl 52:116–128 2. Rambabu B, Reddy A, Janakiraman S (2022) Hybrid Artificial Bee Colony and Monarchy Butterfly Optimization Algorithm (HABC-MBOA)-based cluster head selection for WSNs. J King Saud Univ Comput Inf Sci:1895–1905 3. Yang J, Xu M, Zhao W, Xu B (2010) A multipath routing protocol based on clustering and ant colony optimization for wireless sensor networks. Sensors 10(5):4521–4540 4. Mohemmed AW, Sahoo NC (2007) Efficient computation of shortest paths in networks using particle swarm optimization and noising metaheuristics. Discrete Dyn Nat Soc 5. Liu M, Xu S, Sun S (2012) An agent-assisted qos-based routing algorithm for wireless sensor networks. J Netw Comput Appl 35(1):29–36 6. Bing Zeng YD (2016) An improved harmony search based energy-efficient routing algorithm for wireless sensor networks. Appl Soft Comput 41(6):135–147 7. Wang X, Wang W, Li X, Wang C, Qin C (2017) Adaptive multi-hop routing algorithm based on harmony search in wsns, in: 2017 9th International conference on advanced infocomm technology (ICAIT), pp 189–194 8. Rao PCS, Jana PK, Banka H (2016) A particle swarm optimization based energy efficient cluster head selection algorithm for wireless sensor networks. Wireless 23:2005–2020 9. Bandi R, Ananthula VR, Janakiraman S (2021) Self adapting differential search strategies improved artificial bee colony algorithm-based cluster head selection scheme for WSNs. Wireless Pers Commun:2251–2272
Chapter 36
Financial Option Pricing Using Random Forest and Artificial Neural Network: A Novel Approach Prem Vaswani , Padmaja Mundakkad , and Kirubakaran Jayaprakasam
1 Introduction Valuation of financial derivative instruments like option contract is a tortuous task involving diverse chaotic and non-stationary data factors. Option pricing is one of its rapidly growing but complex and mystified fields among the derivative instruments market, primarily used to hedge, to speculate, or to engage in arbitrage. Option contracts are financial contracts that give right to buy (call) or sell (put) the underlying asset at a pre-defined rate (strike price). Option contracts have gained popularity as these are traded in lots and do not require large initial investment payoffs. An option’s resultant payoffs depend upon the value of an underlying asset on the maturity date, i.e., the option’s moneyness (Figs. 1 and 2). There is extensive literature on the valuation of option instruments like the binomial model, Monte Carlo simulation, and the Black–Scholes model. Traditionally, the Black–Scholes equation is appraised as the most cardinal and fundamental mechanism in financial mathematics. Option pricing using these equations is based on definite rigid assumptions like asset price following geometric Brownian motion with constant drift, volatility, and risk-neutral probability (Black and Scholes [1]; Laws [2]; Merton [3]; Nielsen [4]). Despite abundant research in this area and various mathematical models available to gain better results and returns, researchers, financial traders, and analysts still perceive that determining option prices is one of the most grueling tasks.
P. Vaswani · P. Mundakkad (B) Department of Humanities and Social Sciences, National Institute of Technology, Tiruchirappalli, Tamil Nadu, India e-mail: [email protected] K. Jayaprakasam Department of Management Studies, National Institute of Technology, Tiruchirappalli, Tamil Nadu, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_36
Fig. 1 Net payoff call option (Source: Author)
Fig. 2 Net payoff put option (Source: Author)
The key variables used as input in the Black–Scholes model are the spot price of the underlying asset (S), the strike price (X), and time to maturity (t) which are readily available. However, variables like the risk-free interest rate(r) and the volatility (σ) are based on historical values. On a particular trading day, options of different strike prices with different expiration dates are available, depicting the future price volatility expectations of the investor and traders for the underlying asset. Time to maturity is another crucial variable in determining the option’s theoretical value, and it is the time in days between the date of determination and the option exercise date. The time to maturity accounts for the time value of money of the option contract. In the real world, the theoretical value varies from the actual value due to various market factors, such as Open Interest of options and underlying assets, structural breaks, speculative shocks, and changes in the expected rate of return., same is also advocated by Lemmon and Ni [5]. As a result, complex advanced modeling would enhance the accuracy in the prediction of the option prices. The data-driven AI-based machine learning models are highly appreciated in the field of predictive modeling. The traditional option pricing models regenerate the empirical relations based on rudimentary assumptions. However, ML models do not presume anything and determine a functional relationship between the inputs and output while simultaneously minimizing a given cost or error function. These methods generalize well for high-dimensional data (Singh and Srivastava [6]). Most
of the existing studies use autoregressive variables in option valuations using RNN and LSTM techniques. However, in case of financial option instruments, the option writers can introduce an option anytime in the market based on their profit expectation and demand for that option and each option contract is unique based on strike price and expiry date, hence continuous time series modeling methods could not be very efficient; as on a particular day, a number of options exist with different strike prices and different expiry dates for a same underlying asset. Therefore, our study focusses on exogenous and discrete variables as they correspond to the efficient market hypothesis of all known information by the investors. This paper assesses the results of basic but powerful ML models such as random forest and artificial neural network (ANN), which are used to determine option prices. It compares the results of the models in terms of Root Mean Squared Error (RMSE), Mean Squared Error (MSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) as performance evaluators. Anders et al. [7] also used similar approach for call options on German stock Index DAX, with the indication of statistical inference of superior out-of-sample performance compared to Black/Scholes model. This paper contributes to the existing literature by efficiently determining the value of the options using Random Forest and ANN models, which may supersede traditional models. To the best of our knowledge, no other study has used ML models to determine the prices of European call and put options using the variables like Open Interest and change in Open Interests that we have used in our study. The rest of the paper is sectioned as follows. Section 2 discusses the literature overview of the models used along with the parameter setting used in the study. Section 3 offers the brief of the research data of the study. Section 4 briefly discusses the proposed methodology. Section 5 reports the results of the empirical evaluation of the proposed methods and their comparisons. Finally, Sect 6 presents the conclusion and the future research scope of the paper.
2 Literature Overview 2.1 Artificial Intelligence (AI) Models The attribute learning potential of AI models has expanded the scope of application to domains such as decision support using image and language recognition (He et al. [8]; Jean et al. [9], LeCun et al. [10]). These approaches have been successfully used with structured datasets in decision support (Fan et al. [11]); Baek and Kim [12] used AI models in financial prediction modeling and volatility with Long Short-Term Memory (LSTM) and Recurrent Neural Network (RNN). Kim and Park [13] reviewed the application of AI algorithms in Eco-environmental modeling. AI modeling has also been widely used in the predictive analysis of the financial market domain, especially the stock market, as extensively reviewed by Gandhmal and Kumar [14]; Huang et al. [15]; and Jiang [16].
These advanced modeling techniques can play a vital role in the valuation of financial instruments. They can recognize the non-linear relationship between various inputs and the prediction of daily stock exchange index prices, The evolving literature shows the application of ML methods, such as support vector regression attempted by Pan et al. [17], XGBoost used by Ivascu [18] on WTI crude oil future contracts data to determine call option prices of commodity options, and ANN used by Chen et al. [19] on Taiwan Stock Index. The literature points out that it is the design of the architecture of the AI models that avoids overfitting. A Plethora of research is available on applying AI modeling to predict stock price and return movement. Kim et al. [20] have examined the potential of deep learning methods in financial risk assessment and its ability to effectively manage the risk through better hedging decisions. Atsalakis [21] challenged the efficient market hypothesis using a neurofuzzy system on the data of five stocks listed on the Athens Stock Exchange and five stocks in NYSE and obtained solid superior prediction accuracy performance to predict the next day’s trend. Guresen et al. [22] used multiple ANN models and used GARCH to add new features into the model and compared the output of daily value of the NASDAQ index using MSE and MAE. They also argued that historical values could be used only to forecast in a shorter period, and fundamental analysis will be required for a more extended period. Moghaddam et al. [23] predicted the movement of daily NASDAQ prices using ANN. Chen et al. [19] used a probabilistic neural network to evaluate the movement and direction of expected returns on the market index of the Taiwan Stock Exchange, suggesting high returns superiority of the PNN model strategy over others. Goel and Singh [24] reported 93% accuracy using ANN model to predict BSE SENSEX closing prices. The application of ML models for stock market prediction is not a new concept. However, the ML models like Random Forest and ANN are not yet widely used in option pricing segment of financial markets. Some of the known works include Hutchison [25] assessing the nonparametric approach using learning network for estimating out-of-sample price and delta hedge option derivatives. Bennell and Sutcliffe [26] used FTSE100 options for comparing ANN and Black–Scholes. Yao et al. [27] used Nikkei 225 index futures and obtained different accuracy results by grouping data differently and argued that ANN outperforms the traditional Black–Scholes model for volatile markets. Mostafa and Dillon [28] claimed that the ANN model performed better when trained on implied volatility based on the Black–Scholes formula. However, the application of the AI models is not well explored in case of financial derivative options.
Fig. 3 Random forest structure (Source Author)
3 Proposed Methodology 3.1 Random Forest Regression Random Forest is a widely recognized supervised learning algorithm model that uses the ensemble learning method. Random Forest is a combination or ensemble of decision trees Biau [29]. The ensemble learning method combines predictions from multiple machine learning algorithms to more accurately predict the output than a single algorithm model. Breiman [30–32] demonstrated that ensembles of trees could achieve a substantial gain of accuracy. Random Forests form decision trees based on an arbitrary vector λ such that the tree predictor f (x, λ) comes equivalent to labels. The training set is generated independently from the random vector Y, X distribution for the numerical output values (Yang et al. [33]). The Random Forest predictor is framed as mean over k of the trees {f (x, λ k)} (Fig. 3).
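As a rough illustration of how such a model can be fitted to the option data described later in the paper, the following scikit-learn sketch trains a Random Forest regressor and reports error metrics of the kind used in this study. The hyper-parameter values are placeholders, not the settings used by the authors.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

def fit_random_forest(X, y):
    """Fit a Random Forest regressor on option data and report test-set errors.
    X: feature matrix (strike, time to maturity, underlying OHLC, OI, change in OI);
    y: option close price."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    rf = RandomForestRegressor(n_estimators=200, n_jobs=-1, random_state=0)
    rf.fit(X_tr, y_tr)
    pred = rf.predict(X_te)
    mse = mean_squared_error(y_te, pred)
    mae = mean_absolute_error(y_te, pred)
    return rf, {"MSE": mse, "RMSE": mse ** 0.5, "MAE": mae}
```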
3.2 Artificial Neural Network The influential paper published on the backpropagation algorithm by Rumelhart et al. [34] presenting the training of artificial neurons has unleashed the power of ANN. After this, Multi-Layer Perceptron (MLP), also referred to as Artificial Neural Network, Convolutional Neural Network (CNN), and Recurrent Neural Network
Fig. 4 Structure of ANN (Image Source: Author)
(RNN), were developed for different applications. Every node uses a non-linear activation function, such as ReLU¹ or tanh², to convert its inputs into an output and pass it to the following layer. The layers between the input and output layers are called hidden layers (Mezofi and Szabo [42]). ANN training is an iterative approach to finding the weight coefficients that minimize a cost function, say MAE, MSE, or RMSE, and it starts with learning meaningful relationships between variables. The neural network model aims to build a mathematical formulation that mimics human brain behavior to extract meaningful information from incoming data. The visualized structure of an ANN is shown in Fig. 4. In our study, we use Random Forest and ANN models to predict the NIFTY option prices, considering their wide and successful recognition in the financial domain and their high generalization and approximation abilities to generate results close to the real data (Hiransha et al. [36]).
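A comparable ANN baseline can be set up with a multi-layer perceptron; the sketch below uses scikit-learn's MLPRegressor with ReLU hidden units. The layer sizes, feature names, and other settings are assumptions for illustration, not the architecture reported in the paper.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical feature names; the actual dataset fields may differ.
FEATURES = ["strike", "days_to_expiry", "open", "underlying_open",
            "underlying_high", "underlying_low", "underlying_close",
            "open_interest", "change_in_oi"]

def build_ann():
    # Feature scaling matters for gradient-based training; ReLU hidden units as in the text.
    return make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(64, 32), activation="relu",
                     learning_rate_init=1e-3, max_iter=500, random_state=0),
    )

# Usage (X_train/X_test are DataFrames with the columns above, y the option close price):
# model = build_ann().fit(X_train[FEATURES], y_train)
# predictions = model.predict(X_test[FEATURES])
```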
3.3 Performance Evaluation This study uses a grid-search cross-validation assessment that reduces the chances of overfitting by testing the model twice: on out-of-sample validation folds while adjusting the parameters, and then on held-out test data, thereby making the model ready for noisy real-life data.
¹ ReLU stands for rectified linear unit and is used in neural networks as a non-linear activation function. ReLU is more efficient than other functions because only a certain number of neurons are activated at a time while transferring values from one layer to another (Sharma and Sharma [35]).
² The tanh function is the hyperbolic tangent function. It is preferred over the sigmoid function as its gradients are not constrained and are zero centred; it is continuous and differentiable, and its values lie in the range −1 to 1 (Sharma and Sharma [35]).
Generally, cross validation minimizes the bias of random sampling by splitting the dataset into training and test datasets. However, we use grid-search cross validation, where a model with multiple parameter-setting combinations runs on the training dataset. The model gets trained and validated on the training dataset; the validation error reduces synchronously at the beginning of the learning process. Under the proposed method, for each algorithmic model the algorithm takes a hyper-parameter setting combination, trains the model, and evaluates it through k-fold cross validation. Each model's performance is stored, and the process is then repeated with the next parameter-setting combination of the same model. After all combinations have been tried, the best parameter settings are selected based on the performance score. The selected model yields the best hyper-parameter values with robust application of k-fold cross validation for each combination. The best model parameters, acquired from the minimum-error parameter setting, are then used to value the output of the test (holdout) dataset. The performance evaluation scores give a superior and neutral picture of the performance of the model. As in various prior studies, the performance evaluation scores MSE, MAE, RMSE, and MAPE provide an accurate and unbiased view of the model's performance. The flow graph and model process of the proposed methodology are presented in Fig. 5.

Fig. 5 Structure of model process (Image source: Author)

The model performance is then evaluated using the Diebold-Mariano test [37] to determine whether the two forecasts are significantly different. Let l_i and m_i be the residuals of the two forecasts f and g of the series y:

l_i = y_i − f_i    (1)

m_i = y_i − g_i    (2)

The loss differential d_i related to the MSE error statistic is d_i = l_i² − m_i², and its mean is

D = (1/n) Σ_{i=1}^{n} d_i    (3)

For n > k ≥ 1, the autocovariance of the loss differential is

∂_k = (1/n) Σ_{i=k+1}^{n} (d_i − D)(d_{i−k} − D)    (4)

For h ≥ 1, the Diebold-Mariano statistic is defined as

DM = D / √([∂_0 + 2 Σ_{k=1}^{h−1} ∂_k] / n)    (5)

There is a significant difference between the forecasts if |DM| > Z_crit, where Z_crit is the two-tailed critical value of the standard normal distribution, i.e.,

Z_crit = SND(1 − α/2, TRUE)    (6)
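The grid-search selection and the DM comparison described above can be sketched as follows. The parameter grid and the use of squared-error loss in the loss differential are illustrative assumptions consistent with the equations above, not the authors' exact settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

def tune_random_forest(X_train, y_train):
    """Grid-search with k-fold CV over an illustrative hyper-parameter grid."""
    grid = {"n_estimators": [100, 200, 400], "max_depth": [None, 10, 20]}
    search = GridSearchCV(RandomForestRegressor(random_state=0), grid,
                          scoring="neg_mean_squared_error", cv=5, n_jobs=-1)
    search.fit(X_train, y_train)
    return search.best_estimator_

def diebold_mariano(y, f, g, h=1):
    """DM statistic for two forecasts f and g of y (numpy arrays), squared-error loss."""
    l, m = y - f, y - g
    d = l ** 2 - m ** 2                      # loss differential d_i
    n = len(d)
    D = d.mean()                             # Eq. (3)

    def gamma(k):                            # autocovariance of d, Eq. (4)
        return np.sum((d[k:] - D) * (d[:n - k] - D)) / n

    var = (gamma(0) + 2 * sum(gamma(k) for k in range(1, h))) / n
    return D / np.sqrt(var)                  # Eq. (5)
```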
4 Research Data Our study uses daily NSE NIFTY 50 and BANK NIFTY Index option prices data from August 2020 through August 2022 published in the NSE website as these two options construed the highest number of transactions in Indian Financial market. The dataset reports options of different strike prices with multiple expiration dates listed on the NSE. National Stock Exchange (NSE) is the highest volume traded exchange of India. The dataset reports option prices of different expiration dates and multiple strike prices listed on the exchange. The Underlying Price column represents the open price of the underlying stock. Time to maturity is calculated as the number of days between the day of calculation and the option’s expiration date. The study aims to determine the option’s theoretical value as the option’s expected closing price, which is then compared to the actual listed closing option prices. The input variables include time to maturity in days, i.e., number of days between date of calculation and the option’s expiration date, Strike price, Option’s open price, Underlying price, i.e., NIFTY open, high, low and close price, Open Interest (OI), and Change in OI. The data obtained from the NSE website for the Call and Put Index option prices, Index Prices, merged into a single file and filtered for the options with the non-zero Open Interest. The datasets had 464261 observations of NIFTY Index options and 328288 observations of BANKNIFTY Index option after applying the said filter. The inputs used in this study are correlated and have an effect on the closing price of the options.
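A possible data-preparation step along these lines is sketched below; the file layout and column names are hypothetical and would need to be mapped to the actual fields published on the NSE website.

```python
import pandas as pd

def load_option_data(option_csv, index_csv):
    """Merge option and index files, keep non-zero OI rows, and build the input features."""
    opt = pd.read_csv(option_csv, parse_dates=["date", "expiry"])
    idx = pd.read_csv(index_csv, parse_dates=["date"])
    df = opt.merge(idx, on="date", suffixes=("", "_underlying"))
    df = df[df["open_interest"] > 0]                            # non-zero Open Interest only
    df["days_to_expiry"] = (df["expiry"] - df["date"]).dt.days  # time to maturity in days
    features = ["strike", "days_to_expiry", "open", "open_underlying",
                "high_underlying", "low_underlying", "close_underlying",
                "open_interest", "change_in_oi"]
    return df[features], df["close"]                            # X, y (option close price)
```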
5 Results The results presented in this paper are very promising and demonstrate the superiority of the data-driven methods of Random Forest and ANN. The performance evaluation results of the models are presented in Table 1. A possible reason for the superior performance of the Random Forest algorithm is that the procedure adapts to sparsity, as the number of strong features decides the rate of convergence rather than the presence of noise variables, as contended by Biau [29]. Artificial neural networks are also effective universal approximators with high generalization potential. The results of the DM test statistics clearly indicate that these AI-driven machine learning models have higher forecasting accuracy as compared to the BSM model. The graphical comparison of the actual and predicted option close prices is presented in Figs. 6–13; the comparisons clearly indicate the precise prediction abilities of both models, generating output close to the real data. The predictions using large samples and complex algorithms give highly accurate results due to the weights fixed by the optimized function of the neural network. The results are consistent with Anders et al. [7], Mitra [38], and Saxena [39]. ANNs can perform parallel computation and are hence suitable for large databases and spurious data. Although the training of an ANN can be quite time-consuming, once the ANN function is finalized the valuation of option prices is extremely fast.

Table 1 Comparative performance evaluation results

                           MSE        RMSE     MAE      MAPE
NIFTY Call (212643 Obs)
  RF                       7665.20    87.55    39.86    0.40
  ANN                      19226.49   138.66   57.89    3.95
  DM test statistic        18.32a
NIFTY put (251618 Obs)
  RF                       4943.75    70.31    28.81    0.43
  ANN                      12192.87   110.42   46.27    3.75
  DM test statistic        23.04a
BankNIFTY-call (165498 Obs)
  RF                       58194.11   241.24   118.22   0.51
  ANN                      99544.06   315.51   120.74   2.42
  DM test statistic        13.09a
BankNIFTY-put (162790 Obs)
  RF                       43493.87   208.55   95.02    0.56
  ANN                      68066.01   260.90   102.74   4.49
  DM test statistic        12.97a
Note: The performance evaluation results are on out-of-sample (test) data. a represents significance at 1%
Fig. 6 NIFTY call option (RF)
Fig. 7 NIFTY call options (ANN)
Fig. 8 BANKNIFTY call option (RF)
Fig. 9 BANKNIFTY call option (ANN)
6 Conclusion The world has witnessed tremendous technological advancement at exceptional speed since the groundbreaking work of Nobel laureates Black–Scholes and Merton in option pricing. The computational power and data processing have seen exponential growth, particularly over the past decade. The valuation of derivatives instruments with such accuracy was unimaginable back then when theoretical option pricing models primarily stood on the rudiments of stochastic calculus. This paper provided an assessment of the data-driven approach using supervised learning algorithms,
Fig. 10 NIFTY put option (RF)
Fig. 11 NIFTY put options (ANN)
particularly Random Forest and ANN. These methods have higher explaining power and support efficient hedging decisions, which may supersede conventional methods like the Black–Scholes model. The consistent results for two panels, i.e., NSE NIFTY Index options and NSE BANKNIFTY Index options using the same set of inputs, validate the robustness of our approach. The approach discussed in this paper is an addition to the existing toolset for financial engineers, financial analysts, and traders in effectively hedging their financial risks by comparing the calculated day-end close value with the quoted value. The assessment of difference between the calculated
Fig. 12 BANKNIFTY put option (RF)
Fig. 13 BANKNIFTY put option (ANN)
price and the actual price of options can give us a key indication of investors’ attitude toward financial market conditions, which can be used for determining whether the security is overvalued or undervalued. The results of DM test statistics clearly indicate that Random Forest models have higher forecasting accuracy as compared to ANN. The study is not free from limitations. We have applied the selected models only on the Index options, i.e., NSE NIFTY Index options and NSE BANKNIFTY Index options and that too only on 2 years option pricing data. For future research work, the Random Forest and ANN method along with other popular methods like support
vector regression and XGBoost may be used to value the prices of options of individual stock companies whose options are publicly traded on Stock Exchange to assess the effectiveness of these models for such options. The datasets may further be partitioned into subsets based on call or put and moneyness of options to assess the impact of such partitioning on error results. Although deep learning models might not work efficiently for such dataset, unsupervised learning techniques may also be used to value the option prices as Salvador et al. [40] used unsupervised learning with artificial neural networks for financial option valuation and Tali [41] used reinforcement learning for delta hedging.
References 1. Black F, Scholes M (2019) The pricing of options and corporate liabilities. In: World Scientific Reference on Contingent Claims Analysis in Corporate Finance. Vol 1, Foundations of CCA and Equity Valuation, pp 3−21. 2. Laws J (2018) Introduction to options. In: Essentials of Financial Management. Liverpool: Liverpool University Press, pp 147−166. http://www.jstor.org/stable/j.ctvt6rjjs.12. Accessed 13 July 2021. 3. Merton R (1973) Theory of rational option pricing. Bell J Econ Manag Sci 4(1):141–183. https://doi.org/10.2307/3003143 4. Nielsen LT (1992) Understanding N (d1) and N (d2): Risk Adjusted Probabilities in the Blackscholes Model 1. INSEAD. 5. Lemmon M, Ni S (2014) Differences in trading and pricing between stock and index options. Manag Sci 60(8):1985–2001 6. Singh R, Srivastava S (2017) Stock prediction using deep learning. Multimed Tools Appl 76(18):18569–18584 7. Anders U, Korn O, Schmitt C (1998) Improving the pricing of options: a neural network approach. J Forecast 17(5–6):369–388 8. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp 770−778. 9. Jean N, Burke M, Xie M, Davis WM, Lobell DB, Ermon S (2016) Combining satellite imagery and machine learning to predict poverty. Science 353(6301):790–794 10. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324 11. Fan MH, Chen MY, Liao EC (2019) A deep learning approach for financial market prediction: Utilization of Google trends and keywords. Granul Comput. 1−10. 12. Baek Y, Kim HY (2018) ModAugNet: A new forecasting framework for stock market index value with an overfitting prevention LSTM module and a prediction LSTM module. Expert Syst Appl 113:457–480 13. Kim KS, Park JH (2009) A survey of applications of artificial intelligence algorithms in ecoenvironmental modelling. Environ Eng Res 14(2):102–110 14. Gandhmal DP, Kumar K (2019) Systematic analysis and review of stock market prediction techniques. Comput Sci Rev 34:100190 15. Huang J, Chai J, Cho S (2020) Deep learning in finance and banking: a literature review and classification. Front Bus Res China 14:1–24 16. Jiang W (2021) Applications of deep learning in stock market prediction: recent progress. Expert Syst Appl. 115537. 17. Pan Y, Xiao Z, Wang X, Yang D (2017) A multiple support vector machine approach to stock index forecasting with mixed frequency sampling. Knowl-Based Syst 122:90–102
18. Ivas, cu CF (2021) Option pricing using machine learning. Expert Syst Appl 163:113799 19. Chen AS, Leung MT, Daouk H (2003) Application of neural networks to an emerging financial market: forecasting and trading the Taiwan Stock Index. Comput & Oper Res 30(6):901–923 20. Kim A, Yang Y, Lessmann S, Ma T, Sung MC, Johnson JE (2020) Can deep learning predict risky retail investors? a case study in financial risk behavior forecasting. Eur J Oper Res 283(1):217–234 21. Atsalakis GS, Valavanis KP (2009) Forecasting stock market short-term trends using a neurofuzzy based methodology. Expert Syst Appl 36(7):10696–10707 22. Guresen E, Kayakutlu G, Daim TU (2011) Using artificial neural network models in stock market index prediction. Expert Syst Appl 38(8):10389–10397 23. Moghaddam AH, Moghaddam MH, Esfandyari M (2016) Stock market index prediction using artificial neural network. J Econ, Financ Adm Sci 21(41):89–93 24. Goel H, Singh NP (2021) Dynamic prediction of Indian stock market: an artificial neural network approach. Int J Ethics Syst. 25. Hutchinson JM, Lo AW, Poggio T (1994) A nonparametric approach to pricing and hedging derivative securities via learning networks. J Financ 49(3):851–889 26. Bennell J, Sutcliffe C (2004) Black-Scholes versus artificial neural networks in pricing FTSE 100 options. Intell Syst Account, Financ & Manag: Int J 12(4):243–260 27. Yao J, Li Y, Tan CL (2000) Option price forecasting using neural networks. Omega 28(4):455– 466 28. Mostafa F, Dillon T (2008) A neural network approach to option pricing. WIT Trans Inf Commun Technol 41:71–85 29. Biau G (2012) Analysis of a random forests model. J Mach Learn Res 13:1063–1095 30. Breiman L (1996) Bagging predictors. Mach Learn 24(2): 123−140 31. Breiman L (2000) Some infinity theory for predictor ensembles. Technical Report 579, Statistics Department. UCB. 32. Breiman L (2001) Random forests. Mach Learn 45(1): 5−32 33. Yang L, Liu S, Tsoka S, Papageorgiou LG (2017) A regression tree approach using mathematical programming. Expert Syst Appl 78:347–357 34. Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088): 533−536 35. Sharma S, Sharma S (2017) Activation functions in neural networks. Towar Data Sci 6(12):310– 316 36. Hiransha M, Gopalakrishnan EA, Menon VK, Soman KP (2018) NSE stock market prediction using deep-learning models. Procedia Comput Sci 132:1351–1362 37. Diebold FX, Mariano RS (2002) Comparing predictive accuracy. J Bus & Econ Stat 20(1):134– 144 38. Mitra SK (2012) An option pricing model that combines neural network approach and black scholes formula. Glob J Comput Sci Technol. 39. Saxena A (2008). Valuation of S&P CNX Nifty options: comparison of black-scholes and hybrid ANN model. In Proceedings SAS Global Forum. 40. Salvador B, Oosterlee CW, van der Meer R (2020) Financial option valuation by unsupervised learning with artificial neural networks. Mathematics 2021(9):46 41. Tali R (2020) Delta hedging of financial options using reinforcement learning and an impossibility hypothesis (Doctoral dissertation, Utah State University). 42. Mezofi B, Szabo K (2018) Beyond black-scholes: a new option for options pricing.
Chapter 37
Hosting an API Documentation Portal Using Swagger and Various AWS Anjali Sinha and K. Nagamani
1 Introduction The Application Programming Interface (API) [1] is a software interface that enables communication between two applications without any need for human involvement. APIs must be understood correctly to be used effectively. To accomplish this, APIs often include some form of documentation that users may utilize to become comfortable with the capabilities of the API. Information from API documentation is given as text, code, graphics, etc., in documents, most frequently in the form of web pages. The purpose of documentation is to clarify the API’s assumptions and limitations, provide developers with instructions for using the API, and help them avoid problems. It is essential for a positive API consumption experience [2]. Nowadays, software is usually reused, and it is uncommon to write brand-new software. Generally, SDKs (software development kits) and APIs are available to solve a problem and can be utilized to develop new software [3]. The digital environment is massively dependent on APIs. They are giving users and consumers a way to enhance the functionality of already existing products. Third-party developers must grasp the ways to use an API before using it to solve their present issues. Even if the most comprehensive API is available, it won’t be used as efficiently if the users don’t know how to utilize it. API documentation is useful in situations like these. Concise and easily understandable documentation also puts less pressure on the customer support teams.
A. Sinha (B) · K. Nagamani Department of Electronics and Telecommunication Engineering, R.V. College of Engineering, Bengaluru, Karnataka, India e-mail: [email protected] K. Nagamani e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_37
Understanding the target audience is the first step in writing API documentation, so it becomes necessary to understand the types of users being targeted, the value the content provides them, and how they use it. There are two main types of users to consider while writing API documentation. The first group consists of the users who will be interacting with the API and are thus looking for tutorials and code examples; this group is primarily composed of developers. The second group consists of the users who will assess the API's pricing, rate limits, and security to determine how well it fits their business needs and goals; this group primarily consists of CTOs and product managers, with some exceptions. Both groups need to be kept in mind to make sure that all users can utilize the documentation effectively [4]. Although it is inherently desirable to provide high-quality API documentation, it is not an easy task. Nowadays, most documentation either lacks the information that the intended audience would anticipate or makes it difficult to obtain that information. Furthermore, there is no simple tool or method for evaluating created documentation and identifying missing elements or elements that could be improved. As a result, low-quality documentation is one of the most significant barriers developers face when learning an API. This paper examines the research and publications that are currently available on API documentation and outlines the key components of effective API documentation that give readers the knowledge they need to utilize the API efficiently. These elements can be used as a guide to creating high-quality API documentation. Some standard API documentation tools are also discussed. Finally, a method to host an API documentation portal using Swagger/OpenAPI [5] is presented. The objective of this API documentation portal is to provide any customer with information about all the calls and services that are available, as well as the input needed to operate those calls, and to illustrate how the output would look. The paper is structured as follows: Sect. 1 introduces what API documentation is, its need, and the target audience. Section 2 describes the essential elements of API documentation. Section 3 talks about some available tools for creating the documentation. Section 4 reviews available research and journals on API documentation. Section 5 presents the methodology to host the portal. Section 6 addresses methods for evaluating the quality of API documentation. Sections 7 and 8 discuss the results and conclusions. Section 9 talks about the future scope.
2 Basic Elements of API Documentation API documentation is like the entry to an API. This documentation typically combines several kinds of data that function together. The following section presents a list of the fundamental components that all API documentations must have [6]. • Resources−An API’s data is referred to as “resources.” Most APIs offer a variety of resource types or information categories that can be returned.
• Request Verbs−The verbs used in requests specify what should be done with the information collected from the server. The system is told to obtain or acquire the data from the server by the browser using a GET request. There are many request verbs like GET, PUT, POST, and DELETE. • Request Headers−Headers are the various additional pieces of information transmitted with the data; they also contain different formats in which the data must be retrieved. The various authentication and authorization techniques are also included here. • Request Body−The request body contains information about where the new data should be added to the web server. Data is typically sent in the request when a POST request is made to the server. • Response Body−The response body contains the details of the response obtained from the request sent. • Response Status code−Response status codes relate to a code number in the response header that identifies the general category of the response, such as whether the request was successful (200), produced a server error (500), or had permission problems (403).
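To make these elements concrete, the following Python snippet issues two requests against a hypothetical endpoint (the URL, token, and fields are purely illustrative) and marks where each element appears.

```python
import requests

BASE_URL = "https://api.example.com/v1"   # hypothetical API base URL

resp = requests.get(                                   # request verb: GET
    f"{BASE_URL}/users/42",                            # resource: a single user
    headers={"Accept": "application/json",             # request headers
             "Authorization": "Bearer <token>"},
    timeout=10,
)
print(resp.status_code)                                # response status code, e.g. 200
print(resp.json())                                     # response body

new_user = {"name": "Asha", "role": "admin"}           # request body for the new record
resp = requests.post(f"{BASE_URL}/users",              # request verb: POST
                     json=new_user, timeout=10)
```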
3 Standard API Documentation Tools It used to be necessary to manually test and document each endpoint of an API before publishing it. This process has been simplified by the Open API spec [5] although web API developers also have other options. There are numerous approaches to describing and representing APIs that should be considered. • Open API Specification/Swagger—The most prominent framework for developing across the entire API lifecycle, including documentation, design, testing, and deployment, is Swagger. It provides a shared vocabulary for constructing APIs. Without any implementation logic in place, anyone can visualize and interact with the API resources using Swagger UI. It was the first tool to deliver API documentation with a built-in test client. • Slate—Slate [7] has a clean and intuitive design. With Slate, all code examples are on the right side of the documentation, while the API definition is on the left. It is based on the API documentation for Stripe and PayPal. Slate is responsive, so it looks excellent in print as well as on tablets and phones. • API Blueprint—It is a solid high-level language for designing web APIs. Everyone participating in the API design lifecycle can use API Blueprint [8] because it is straightforward and easily accessible. Although short, its syntax is expressive. With API Blueprint, one can quickly prototype and model APIs that will create or describe APIs that have already been deployed. • Google Cloud Endpoints—Endpoints is an API management tool that employs the same infrastructure Google has for its APIs. The Cloud Endpoints Portal allows
438
A. Sinha and K. Nagamani
to build a developer portal; a website API users can access to browse documentation and engage with the API after the APIs are deployed to the Endpoints [9].
4 Related Works In this section, related works are discussed, which are about the available research and journals on API documentation tools and studies on ways to improve a documentation. Siyuan Jiang [10] presents a prototype toolset called Docio to generate I/O examples as the present documentation tools only document the static information of I/O like parameter types and names. However, they generally miss out on the dynamic information, such as I/O examples. This is a valid point as the actual values of the inputs that generate specific outputs are a crucial part of an API documentation. Docio constitutes three programs: funcWatch collects I/O values; ioSelect chooses the appropriate I/O examples from a group of I/O values; and ioPresent inserts I/O examples into the documentation. Sohan’s [11] SpyREST is an automatic software as a service instrument for API documentation. By processing HTTP traffic involved in API requests, SpyREST uses an HTTP proxy server to intercept simple API calls and automatically collect and create RESTful API [12] documentation. It automates the whole process and therefore provides several benefits over other existing API documentation tools. Donald M. Leslie [13] investigates a method for enhancing the quality of API reference content and its incorporation into product documentation sets by combining an XML infrastructure with Javadoc, the primary mechanism for producing Java API documentation. Several other similar tools exist for other programming languages. Jeffrey Stylos’s [14] Jadeite is an improvement to Donald M. Leslie’s [13] Javadoclike API documentation system. The main idea behind Jadeite is that a documentation should also contain the classes and methods that users expect to exist. Users can add new “pretend” classes or methods listed in the actual API documentation using Jadeite’s “placeholders,” which can be tagged with the proper APIs to use in their place. Lee Wei Mar [15] proposes a methodology called PropER-Doc, which suggests appropriate code samples for documentation needs. PropER-Doc accepts API developers’ queries and uses code search engines (CSEs) to gather potential candidates for corresponding code examples. The API documentation is improved using eMoose by Dekel and Herbsleb [16] by adding decorations to method calls whose targets are associated with usage guidelines, such as rules, that invoking code authors must be informed. Pandita et al. [17] proposed to infer formal specifications from natural language text in a similar effort. Other researches have been done on identifying and avoiding problems in the API documentation. DOCREF is a method that Zhong and Su [18] presented. It combines code analysis with natural language processing techniques to find and highlight errors
37 Hosting an API Documentation Portal Using Swagger and Various AWS
439
in documentation. The authors successfully found more than thousand documentation problems using DOCREF. AdDoc is a system developed by Dagenais and Robillard [19] that automatically identifies documentation patterns or cohesive groups of code elements that are described collectively and that alerts developers to violations of these patterns as the code and documentation change. Previous research has successfully located natural language text that may be significant for a programmer utilizing a specific API type. Using word patterns, Chhetri and Robillard [20] classified text pieces in API documentation according to whether they contained necessary, valuable information, or neither. A method to find tutorial parts that describe a particular API type was put out by Petrosyan et al. [21]. They successfully attained great precision when classifying fragmented tutorial sections using supervised classification of text based on linguistic and structural factors. The existing works don’t introduce about creating an API documentation portal which is easily accessible through a custom URL and has proper security standards. But the methodology introduced in the paper takes care of all the necessary aspects like hosting the documentation, storing it securely, and creating an endpoint to access the portal. It also discusses a method to automate the entire process in the future scope.
5 Methodology The flow of the proposed methodology begins by defining the APIs on Amazon API Gateway [22] manually. The portal has been set up using Swagger. Since API Gateway doesn’t provide any type of endpoint, the Swagger website needs to be stored on an S3 bucket [23] and integrated with the API Gateway. The Amazon S3 websites don’t support HTTPS endpoints; therefore, to use HTTPS endpoint for higher security, AWS Certificate Manager [24] and AWS CloudFront [25] are used. The proposed methodology is depicted in Fig. 1.
5.1 Design Concepts The fundamental ideas of the employed tools and technologies are described in this part. AWS API Gateway: A client and different backend services are connected via an API management tool known as an API gateway. It is one of the system’s only points of entry. API Gateway encapsulates the core system architecture. Its duties include protocol translation, composition, request routing, request shaping and management, monitoring, load balancing, authentication, caching, and managing static responses. The API Gateway processes each request made by the client and then directs them to proper resources [22].
440
A. Sinha and K. Nagamani
Fig. 1 Flow of the methodology
Amazon S3: Simple Storage Service is referred to as S3. It is one of the earliest services created by AWS. Industry-leading scalability, data availability, security, and performance are provided by the object storage service Amazon S3. The files can be kept there safely. In S3, the data is stored in containers called buckets [23]. AWS Route 53: Amazon Route 53 [26] is a Domain Name System (DNS) web service. It offers programmers a quick and reliable way to link people to online services. It is highly scalable, easy to use, cost-effective, and secure. It accomplishes three critical tasks: • It registers a name for a website if one is required (domain name). • When a user enters a domain name, Route 53 aids in establishing a connection between the browser and the website. • By making automatic requests over the Internet to a resource, Route 53 examines the health of those resources. AWS Certificate Manager: An SSL certificate must be obtained to connect a custom domain name to a CloudFront distribution. A viable option for users of Route 53 and other AWS services is AWS Certificate Manager (ACM), which offers certificates without charge. Secure Sockets Layer (SSL)/Transport Layer Security (TLS) certificates encrypt network traffic, verify the identification of resources on private networks, and authenticate websites on the Internet. AWS Certificate Manager eliminates the time-consuming manual process of purchasing, uploading, and renewing SSL/TLS certificates. AWS CloudFront: Amazon CloudFront is a content delivery network (CDN) run by Amazon web services. It provides users with content, high security, desirable performance, and developer convenience. An Amazon S3 website performs better as a result. Access controls, traffic encryption, and the free usage of AWS Shield Standard to guard against DDoS attacks contribute to increased security. The website files are made accessible from data centers worldwide (known as edge locations) using CloudFront [25].
37 Hosting an API Documentation Portal Using Swagger and Various AWS
441
5.2 Implementation Creating S3 Bucket: The first step is to create an S3 bucket and make it ready to store the website. To achieve this, first, an S3 bucket is created in AWS. The bucket is configured to enable static website hosting, and the bucket policy is changed to include whitelisting of certain IP addresses. After doing this, only the whitelisted IPs will have access to the portal hosted on the bucket. The required files are uploaded to the bucket from the Swagger repository using the CLI command. The JSON file from the API Gateway also needs to be uploaded to connect it with the bucket. After following these steps, the document portal becomes available at the bucket’s AWS region-specific website endpoint [23]. Generating custom URL: The first step in hosting a public website is to register unique domain names so that users are not restricted to using their default S3 bucket names as websites’ URLs. Through the Route 53 service, AWS gives consumers the option to register domain names. Generating a custom URL for a website includes creating hosted (sub) zones for the already registered domain in the AWS Route 53 service of the AWS console. A hosted zone is a container for records, where records describe the traffic-routing preferences for a certain domain. The Nameserver (NS) records of the new hosted (sub) zone need to be added to the hosted zone in AWS Route53. A new bucket with the same name as the URL should be made with proper setup and a record with the same name as the S3 bucket needs to be created [27]. Acquiring an SSL Certificate: One of the most important components of keeping a website safe and secure is having an SSL certificate. The little data files known as SSL certificates act as a secure connection between a website’s information and a digital key. A new SSL certificate needs to be created in the AWS Certificate Manager. To connect the certificate with the domain, the domain name needs to be mentioned. The certificate needs to be validated by choosing the DNS validation method, and a new record needs to be created [28]. Creating CloudFront Distribution: Users must utilize AWS CloudFront to secure the website now that it is publicly accessible and provide quick access by employing the cache in edge locations worldwide. The necessary parameters must be specified to set up the CloudFront distribution correctly. The issued certificate needs to be added to the distribution. Next, the record in the Route53 for the new hosted (sub) zone is modified to a Canonical Name record (CNAME) to point to the ID of the CloudFront distribution. The Canonical Name record (CNAME) instructs visitors to a subdomain to use the same DNS records as another domain or subdomain. This is useful when running multiple services from the same IP address. With this, the HTTPS endpoint will be ready [29].
442
A. Sinha and K. Nagamani
Fig. 2 Architecture of the workflow
5.3 Workflow This section discusses the sequence of processes through which utilization of the documentation by a user passes from initiation to completion. When a user visits the portal, the first thing it’s going to hit is the Route 53, which routes all the traffic, and then it’s going to get to the CloudFront, which delivers data or content to users with very low latency and at the same time an SSL Certificate is going to come into the picture to make sure that the request is in HTTPS. It encrypts the user data before it requests the server to prevent hacking. Finally, CloudFront will make the request to an S3 bucket where the web page content is stored, and then it will serve it back to the user so that the user can see the API documentation on the portal. The architecture of the workflow is shown in Fig. 2.
6 Metrics and Measurements Metrics and measurement discuss how to assess the quality of API documentation and monitor development. The quality checklist can be used to analyze key documentation elements and assess how well API docs do. A checklist can be used to study, analyze, and question documentation from a different angle and find ways to improve it [30]. A proper documentation should include all the basic elements. The criteria typically used to evaluate documentation quality are findability, readability, clarity, accuracy, organization, conciseness, and context. The documentation should be easily accessible when a user searches for the product name and a few key tasks. It should also include a visible timestamp of the most recent edit so that users can determine how current it is.
37 Hosting an API Documentation Portal Using Swagger and Various AWS
443
Fig. 3 The final generated documentation portal
7 Results and Discussion The portal contains the documentation for the calls present in Petstore API. Petstore is a sample API that mimics a pet store management server. The API provides access to Petstore data through a series of individual calls. There are three endpoint types: Pet, Store, and User. Each call under this portal is originally executable. Under each endpoint type, there are several request verbs like GET, POST, PUT, and DELETE. Any change to the portal UI can be made by editing the index.html file in the S3 bucket. Figure 3 shows the final API documentation portal generated on Swagger. It includes a timestamp to indicate how recently it has been updated. The description of the API can be seen on top. An enlarged POST call for the Petstore API and the responses can be seen in Figs. 4 and 5, respectively. In Fig. 5, summary is present next to the POST call header, the description of the input parameter next to it, and the input JSON body format below it. In the response section, the status codes with their description and output JSON body structure are displayed. This portal has the following benefits: relatively lower costs as the S3 bucket is the only resource being used, more control for changes in the hands of the developer, and changes to the API documentation can be updated in the portal easily.
444
A. Sinha and K. Nagamani
Fig. 4 Enlarged pets POST call
Fig. 5 Responses for the pets POST call
8 Conclusions In this paper, we discussed how to host an API Documentation portal using Swagger and by connecting the S3-hosted static site to a CloudFront distribution, using Route 53 for DNS routing and Certificate Manager for SSL certification. The portal constitutes all the fundamental components of API documentation as discussed in Section 2 like description of the call, different request verbs with their summary, parameters, response status codes, and sample codes. This approach of generating documentation portals improves the performance and robustness of modern documentation projects with the help of various services provided by Amazon by introducing the ability to
37 Hosting an API Documentation Portal Using Swagger and Various AWS
445
retrieve the portal fast from the S3 bucket and securing it using CloudFront and SSL certificate. The only disadvantage is that the calls in API Gateway must be defined manually at first.
9 Future Scope It is crucial to keep the documentation up to date. One of the major problems of API users is outdated documentation. Several days after an update has been released, developers frequently write about it, sometimes writing only a few phrases. This occurs because there is no clear procedure for updating documents. Therefore, the future scope can be to automate the entire process of updating the documentation. This can be done by writing Terraform [31] code to set up all the static resources like S3, modifying and creating certificates, route53 hosted zones and records, and creating a CloudFront distribution. After this, a CI/CD pipeline [32] needs to be set up that gets triggered to automatically update the portal whenever updates or changes are made to the API Gateway documentation.
References 1. Kopecký J, Fremantle P, Boakes R (2014) A history and future of Web APIs. Inf Technol 56. https://doi.org/10.1515/itit-2013-1035. 2. Inzunza S, Juárez-Ramírez R, Jiménez S (2018) API documentation-a conceptual evaluation model. In: WorldCIST´ 18 2018, AISC 746, pp 229–239, 2018. 3. Nybom K, Ashraf A, Porres I (2018) A systematic mapping study on API documentation generation approaches. In: 2018 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA). pp 462−469. 4. Watson R, Stamnes M, Jeannot-Schroeder J, Spyridakis (2013) API documentation and software community values: a survey of open-source API documentation. In: SIGDOC 2013-Proceedings of the 31st ACM International Conference on Design of Communication. 5. Li F, Vincenzes M, Wang A (2018) Use swagger to document and define RESTful APIs. IBM Developer. 6. Sujan Y, Shashidhara DB, Nagapadma R (2020) Survey paper: framework of REST APIs. Int Res J Eng Technol (IRJET) 7(6). 7. Michael M, Steinhardt S, Schubert A Application programming interface documentation: what do software developers want?. J Tech Writ Commun 48: 295–330. 8. Hossein (2012) Introduction to API blueprint. J Lang Technol Comput Linguist 9. Krishnan & Gonzalez SPT, Jose (2015) Google cloud endpoints. In: Building Your Next Big Thing with Google Cloud Platform, pp 309−330. 10. Jiang S, Armaly A, McMillan C, Zhi Q, Metoyer R (2017) Docio: documenting API input/output examples. In: 2017 IEEE/ACM 25th International Conference on Program Comprehension (ICPC). pp 364−367. https://doi.org/10.1109/ICPC.2017.13. 11. Sohan SM, Anslow C, Maurer F (2015) SpyREST in action: an automated RESTful API documentation tool. In: 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE). pp 813−818. 12. Kulkarni C, Takalikar M (2018) Analysis of REST API implementation. Int J Sci Res Comput Sci, Eng Inf Technol © 2018 IJSRCSEIT 3(5). ISSN: 2456−3307
446
A. Sinha and K. Nagamani
13. Leslie D (2002) Using javadoc and XML to produce API reference documentation. In: Proceedings of the 20st annual international conference on Documentation, SIGDOC 2002, Toronto, Ontario, Canada, October 20−23. 14. Stylos J, Faulring A, Yang Z, Myers B (2009) Improving API documentation using API usage information. In: 2009 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), pp 119−126. https://doi.org/10.1109/VLHCC.2009.5295283. 15. Mar LW, Wu YC, Jiau HC (2011) Recommending proper API code examples for documentation purpose. In: 2011 18th Asia-Pacific Software Engineering Conference. 16. Dekel U, Herbsleb JD (2009) Improving API documentation usability with knowledge pushing. In: 2009 IEEE 31st International Conference on Software Engineering. pp 320−330. 17. Pandita R, Xiao X, Zhong H, Xie T, Oney S, Paradkar A (2012) Inferring method specifications from natural language API descriptions. In: 34th International Conference on Software Engineering. pp 815–825 18. Zhong H, Su Z (2013) Detecting API documentation errors. In: International Conference on Object Oriented Programming Systems Languages and Applications. pp 803–816. 19. Dagenais B, Robillard MP (2014) Using traceability links to recommend adaptive changes for documentation evolution. IEEE Trans Softw Eng 40(11):1126–1146 20. Chhetri YB, Robillard MP (2014) Recommending reference API documentation. Empir Softw Eng 20(6):1558–1586 21. Petrosyan G, Robillard MP, de Mori R (2015) Discovering information explaining API types using text classification. In: 37th International Conference on Software Engineering. pp 869– 879. 22. Zhao J, Jing S, Jiang L (2018) Management of API gateway based on micro-service architecture. J Phys: Conf Ser. 23. Mishra A (2019) Amazon S3. In: Machine Learning in the AWS Cloud. pp 181−200. 24. Vianco Paul (2008) AWS breaks new ground with soldering specification. Weld J 87:53–54 25. Chauhan S, Cuthbert D, Devine J, Halachmi A, Lehwess M, Matthews N, Morad S, Walker D (2018) Amazon CloudFront. In: AWS Certified Advanced Networking Official Study Guide. pp 207−231. 26. Piper B, Clinton D (2019) The domain name system and network routing: amazon route 53 and amazon CloudFront. AWS Certified Solutions Architect Study Guide. pp 169−188. 27. Configuring a static website using a custom domain registered with Route 53. Amazon Simple Storage Service User Guide, 2022 28. Maheswaran A, Kanchana R (2012) Web application security using SSL certificates. In: International Conference on Computing and Communication Technology. 29. Schäffler F (2015) Create a CloudFront distribution. AWS Blog with focus on aws-cli. 30. Johnson T (2012) “Documenting APIs: a guide for technical writers and engineers. I’d Rather Be Writing blog. 31. de Carvalho LR, Patricia Favacho de Araujo A (2020) Performance comparison of terraform and cloudify as multicloud orchestrators. In: 20th IEEE/ACM International Symposium on Cluster Cloud and Internet Computing (CCGRID). pp 380−389. 32. Arachchi SAIB, & Perera I (2018) Continuous integration and continuous delivery pipeline automation for agile software project management. MERCon.
Chapter 38
Bi-Level Linear Fuzzy Fractional Programming Problem Under Trapezoidal Fuzzy Environment: A Solution Approach Sujit Maharana and Suvasis Nayak
1 Introduction In many practical situations of hierarchical organizations, numerous decision making problems can be mathematically modeled as bi-level programming problem (BLPP) which consists of two sequential levels namely upper and lower. Each one of the upper and lower level decision makers (ULDM and LLDM) independently controls a set of decision variables. As the decisions of both level DMs are mutually affected and sometimes causing situation of decision deadlock, decisions of both level are emphasized and considered in a cooperative environment. Linear Fractional programming (LFP) [17] has many applications in science, engineering, industry, production, business, management, economics, health care, etc. Some common examples of fractional objectives are profit/cost, debit/equity, risk assets/capital, inventory/sale, doctor/patient, etc. Thirwani [19] proposed bilevel linear fractional programming problem (BLFPP). Calvete and Gale [5] developed an approach to solve BLFPP and Mishra [13] proposed a method based on weighting sum approach to solve BLFPP. Malhotra and Arrora [10], Pramanik and Pratim Dey [15] developed solution algorithms for BLFPP using goal programming approach. Arrora and Gupta [2] used fuzzy goal programming approach for solving bi-level programming problem. Toksari [8] used Taylor series approximation to solve BLFPP. Veeramani and Sumathi [20] proposed a solution approach based on goal programming to solve LFPP with triangular fuzzy numbers. Borza and Rambely [4] used fuzzy α-cut and max-min operator technique to solve LFPP with triangular fuzzy coefficients. Arya et al. [3] developed a method to solve multi-objective LFPP in fully fuzzy environment of triangular fuzzy numbers. Abo-Sinna [1] solved a bi-level nonlinear multi-objective decision making problem under fuzziness.
S. Maharana · S. Nayak (B) Department of Mathematics, School of Applied Sciences, KIIT Deemed to be University, Bhubaneswar 751024 Odisha, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_38
447
448
S. Maharana and S. Nayak
In this paper, a solution approach is developed to generate a compromise solution of a bi-level linear fuzzy fractional programming problem (BL-LFFPP) with trapezoidal fuzzy numbers. Change of variable method [7], Chakraborty and Gupta’s method [6], a linearization method of Pal et al. [14] with small modification, fuzzy α, β-cuts, and fuzzy goal programming are combinedly used to develop the proposed method. This paper is organized as follows: Sect. 2 contains the basic fuzzy concepts, interval valued as bi-objective optimization and change of variable method whereas Sect. 3 incorporates a linearization process of fractional functions. Section 4 provides the mathematical formulation of BL-LFFPP and Sect. 5 includes the proposed solution approach with an algorithmic presentation. Section 6 incorporates a numerical example and its solution along with some comparative result analysis. The concluding remarks are discussed in Sect. 7.
2 Preliminaries Definition 1 [21] Let X be the universal set and F ⊆ X then the fuzzy set F˜ on X is defined as a set of ordered pairs as follows, F˜ = {(x, μ F˜ (x)) : x ∈ X } where μ F˜ : X → [0, 1] is defined as the membership function and μ F˜ (x) is the degree ˜ of membership associated with the element x ∈ X in F. Definition 2 [21] A fuzzy set F˜ is said to be a fuzzy number if it possesses the following properties: (i) It is defined over the set of real numbers R. (ii) F˜ must be a normal fuzzy set (i.e., there exists at least one a0 ∈ R such that μ F˜ (a0 )=1). (iii) It is a convex fuzzy set i.e., ∀ x, y ∈ R and λ ∈ [0, 1], μ F˜ (λx + (1 − λ) y) ≥ min{μ F˜ (x), μ F˜ (y)} ˜ = {x ∈ R|μ F˜ (x) > 0} is bounded. (iv) Support of F˜ in R i.e., support( F) (v) Its membership function μ A˜ (x) is piecewise continuous. Definition 3 [21] A trapezoidal fuzzy number has four components which is defined as T˜ = {t1 , t2 , t3 , t4 }, where t1 ≤ t2 ≤ t3 ≤ t4 . The associated membership function is mathematically defined as follows, and it is graphically interpreted by the following Fig. 1.
38 Bi-Level Linear Fuzzy Fractional Programming Problem …
449
Fig. 1 Trapezoidal fuzzy membership function
⎧ ⎪ 0, ⎪ ⎪ ⎪ x−t1 ⎪ ⎪ ⎨ t2 −t1 , μT˜ (x) = 1, ⎪ ⎪ ⎪ t4 −x , ⎪ t4 −t3 ⎪ ⎪ ⎩0,
x < t1 t1 ≤ x ≤ t2 t2 ≤ x ≤ t3 t3 ≤ x ≤ t4 x > t4
(1)
Definition 4 [21] Fuzzy α-cut of the fuzzy set F˜α consists of the elements x ∈ X whose degree of membership μ F˜ (x) is greater than or equal to α ∈ [0, 1] i.e., F˜α = {(x ∈ X ) : μ F˜ (x) ≥ α}. The α-cut of a trapezoidal fuzzy number can be defined as T˜α = [t1 + α(t2 − t1 ), t4 − α(t4 − t3 )]. Definition 5 (Arithmetic operations on trapezoidal fuzzy numbers) [21] Let A˜ = {a (1) , a (2) , a (3) , a (4) } and B˜ = {b(1) , b(2) , b(3) , b(4) } be two trapezoidal fuzzy numbers in R+ then the arithmetic operations on fuzzy numbers can be defined as follows: (i) (ii) (iii) (iv) (v)
A˜ ⊕ B˜ = {a (1) + b(2) , a (2) + b(2) , a (3) + b(3) , a (4) + b(4) } A˜ B˜ = {a (1) − b(4) , a (2) − b(3) , a (3) − b(2) , a (4) − b(3) } - A˜ = {−a (4) , −a (3) , −a (2) , −a (1) } A˜ ⊗ B˜ = {a (1) b(1) , a (2) b(2) , a (3) b(3) , a (4) b(4) } (1) (2) (3) (4) A˜ = { ab(4) , ab(3) , ab(2) , ab(1) } B˜
2.1 Interval Valued and Bi-Objective Optimization Problems Consider the following optimization problem in which the objective function remains in interval-valued form of functions. max [ f L (x), f U (x)] x∈Δ
(2)
450
S. Maharana and S. Nayak
The interval valued optimization model (2) is equivalent to the following bi-objective optimization problem [18]. max { f L (x), f U (x)}
(3)
x∈Δ
2.2 Change of Variable Method Charnes and Cooper [7] developed this method to solve LFPP using an approach of variable transformation by generating an extra constraint. Consider the following linear fractional programming problem, ∑n ci xi + γ max Z (x) = ∑i=1 n i=1 di x i + δ (4)
subject to ∑n ai j xi ≤ b j , j = 1, 2, 3....m, xi ≥ 0 i=1
Let ∑n 1di xi +δ = z and xi z = yi then the LFPP (4) is equivalently transformed into i=1 the following LPP. max Z (y, z) =
n ∑
ci yi + γz
i=1
subject to n ∑ i=1
di yi + δz = 1,
(5) n ∑
ai j yi ≤ b j z, j = 1, 2, 3....m, yi ≥ 0, z > 0
i=1
Theorem 1 [17] If (yi∗ , z ∗ ) is an optimal solution of (5) then x ∗ = sponding optimal solution of (4).
yi∗ z∗
is the corre-
Theorem 2 [9] Two optimization problems P1 : max Z 1 (x) subject to x ∈ Δ1 and P2 : max Z 2 (x) subject to x ∈ Δ2 are said to be equivalent iff ∃ an one-to-one mapping Z 3 : Δ1 → Δ2 such that Z 1 (x) = Z 2 (Z 3 (x)) ∀ x ∈ Δ1 Corollary 1 [17] The transformation xi z = yi establishes an one-to-one correspondence in between the feasible regions formed by (4) and(5).
3 Linearization Process of Fractional Function Pal et al. [14] proposed a process to linearize the fractional membership functions of the objectives of a multi-objective LFPP. Consider the following membership function of a multi-objective LFPP of maximization type associated with one of its objective function z(x).
38 Bi-Level Linear Fuzzy Fractional Programming Problem …
μz˜ (x) =
cx + γ z(x) − l , wher e l ≤ z(x) = ≤u dx + δ u −l
451
(6)
As μz˜ (x) ∈ [0, 1], the fuzzy goal of the membership functions can be formulated as follows. z(x) − l + d− − d+ = 1 (7) u −l where d − and d + represent the under and over deviations from the aspiration level respectively. The fuzzy goal (7) with fractional function z(x) can be further simplified as follows. 1 u −l ' ' m(cx + γ) + d − (d x + δ) − d + (d x + δ) = m (d x + δ) wher e m = 1 + ml nx + D − − D + = p mz(x) − ml + d − − d + = 1, wher e m =
'
'
wher e n = mc − m d, p = m δ − mγ, D − = (d x + δ)d − , D + = (d x + δ)d + (8) As the LFPP is of maximization type, for non zero achievement of the degree of membership it involves d − ≤ 1 which imposes the constraint −d x + D − ≤ δ. In our proposed solution approach, we incorporate a small change by omitting over deviational variables as over deviation represents the state of complete achievement for a maximization problem. Hence, the fuzzy membership goal (7) can be expressed as, z(x) − l + d− ≥ 1 (9) u −l The linearization of (9) can now be formulated as, nx + D − ≥ p, − d x + D − ≤ δ '
'
wher e n = mc − m d, p = m δ − mγ, D − = (d x + δ)d − , 1 ' , m = 1 + ml m= u −l
(10)
4 Bi-Level Fuzzy FPP: Problem Formulation In hierarchical organizations, bi-level optimization often arises to suitably fit numerous practical problems. The uncertainty of information in such optimization models can be tackled using fuzzy numbers. Consider the following bi-level fuzzy FPP in a cooperative environment which comprises all its coefficients and constants characterized as trapezoidal fuzzy numbers.
452
S. Maharana and S. Nayak
∑n
j=1 c˜1 j x j
+ γ˜ 1 ˜ ˜ j=1 d1 j x j + δ1
(U L D M) : max Z 1 = ∑n X1
∑n
j=1 c˜2 j x j
+ γ˜ 2 ˜ ˜ j=1 d2 j x j + δ2
(L L D M) : max Z 2 = ∑n X2
(11)
subject to ∑n ˜ = m˜ jk x j ≤ n˜ k , x j ≥ 0, k = 1, 2, ..., q Δ j=1 ˜ = M1 X 1 + M˜ 2 X 2 ≤ N˜ ; X 1 , X 2 ≥ 0 where X 1 , X 2 are independently controlled by the upper and lower level DMs respectively and x = (X 1 , X 2 ) = (x1 , x2 , ..., xn ) ∈ Rn . c˜i j , d˜i j , γ˜ i , δ˜i (i = 1, 2), m˜ jk , n˜ k ∈ T R F N (R+ ) i.e., the set of trapezoidal fuzzy numbers defined on R+ . Assume that ∑ n ˜ ˜ ˜ j=1 di j x j + δi > 0, i = 1, 2 ∀ x ∈ Δ. (1) (2) (3) (4) (1) (2) (3) (4) (1) (2) (3) (4) c˜1 j = (c˜1 j , c˜1 j , c˜1 j , c˜1 j ), c˜2 j = (c˜2 j , c˜2 j , c˜2 j , c˜2 j ), d˜1 j = (d˜1 j , d˜1 j , d˜1 j , d˜1 j ), (1) (2) (3) (4) (1) (2) (3) (4) (1) (2) (3) (4) d˜2 j = (d˜ , d˜ , d˜ , d˜ ), γ˜ 1 = (γ˜ , γ˜ , γ˜ , γ˜ ), γ˜ 2 = (γ˜ , γ˜ , γ˜ , γ˜ )
1 1 1 1 2 2 2 2 2j 2j 2j 2j (1) (2) (3) (4) (1) (2) (3) (4) (1) (2) (3) (4) δ˜ 1 = (δ˜ 1 , δ˜ 1 , δ˜ 1 , δ˜ 1 ), δ˜ 2 = (δ˜ 2 , δ˜ 2 , δ˜ 2 , δ˜ 2 ), m˜ jk = (m˜ jk , m˜ jk , m˜ jk , m˜ jk ), (1) (2) (3) (4) n˜ k = (n˜ k , n˜ k , n˜ k , n˜ k )
(12)
5 Proposed Solution Approach In BLFFPP (11), using fuzzy α- and β-cuts in the objective functions and the constraints respectively the fuzzy parameters can be equivalently transformed into intervals and the objective functions at upper and lower level can be formulated as follows. max Z 1 = ∑n (1) (2) (1) (4) (4) (3) (1) (2) (1) (4) (4) (3) j=1 [c1 j + α(c2 j − c1 j ), c1 j − α(ai j − ai j )]x j + [γ1 + α(γ1 − γ1 ), γ1 − α(γ1 − γ1 )] ∑n (1) (2) (1) (4) (4) (3) (1) (2) (1) (4) (4) (3) j=1 [d1 j + α(d1 j − d1 j ), di j − α(c1 j − c1 j )]x j + [δ1 + α(δ1 − δ1 ), δ1 − α(δ1 − δ1 )]
(1) (2) (1) (1) (2) (1) ∑n (4) (4) (3) (4) (4) (3) j=1 (c1 j +α(c1 j −c1 j ))x j +γ1 +α(γ1 −γ1 ), j=1 (c1 j −α(c1 j −c1 j ))x j +γ1 −α(γ1 −γ1 )
∑n
= ∑ ∑n
(1) (2) (1) (1) (2) (1) ∑n (4) (4) (3) (4) (4) (3) n j=1 (d1 j +α(d1 j −d1 j ))x j +δ1 +α(δi −δi ), j=1 (d1 j −α(d1 j −d1 j ))x j +δ1 −α(δ1 −δ1 )
(1) (2) (1) (1) (2) (1) j=1 (c1 j +α(c1 j −c1 j ))x j +γ1 +α(γ1 −γ1 ) (4) (4) (3) (4) (4) (3) (d −α(d −d ))x +δ −α(δ −δ j j=1 1 j 1 1 1 ) 1j 1j
=
∑n
=
[Z 1L ,
Z 1U ]
∑n
, ∑nj=1
(c1(4)j +α(c1(4)j −c1(3)j ))x j +γ1(4) +α(γ1(4) −γ1(3) )
(1) (2) (1) (1) (2) (1) j=1 (d1 j −α(d1 j −d1 j ))x j +δ1 −α(δ1 −δ1 )
(13)
38 Bi-Level Linear Fuzzy Fractional Programming Problem … Table 1 Pay off table X i∗
Z iL (x)
Z iU (x)
Z iL (X ) ∗ Z iL (X Ui )
Z iU (X L i ) ∗ Z iU (X Ui )
L i∗
L i∗
X ∗ X Ui
453
∗
Similarly, max Z 2 = [Z 2L , Z 2U ]. The constraints can be simplified as follows. ∑n (2) (1) (4) (4) (3) [m (1) jk + β(m jk − m jk ), m jk − β(m jk − m jk )]x j j=1 ≤ [n (1) k ∑n
(1) (4) (4) (3) + β(n (2) k − n k ), n k − β(n k − n k )]
[m Ljk , m Ujk ]x j j=1
≤
[n kL , n Uk ],
(14)
x j ≥ 0, j = 1, 2, ..., n, k = 1, 2, ..., m
In order to maintain a wider feasible region, the constraints can be expressed into the following linear inequalities [11]. ∑n j=1
m Ljk x j ≤ n kL ,
∑n j=1
m Ujk x j ≤ n Uk , k = 1, 2, ..., m
(15)
The BLFFPP (11) can be expressed in the following form with interval valued objective functions at both level. (U L D M) : max Z 1 = [Z 1L , Z 1U ] X1
(L L D M) : max Z 2 = [Z 2L , Z 2U ] X2
subject to ∑n ∑n m Ljk x j ≤ n kL , j=1
j=1
(16) m Ujk x j ≤ n Uk ,
x j ≥ 0, k = 1, 2, ..., m The interval valued objective functions at each level are equivalently transformed into bi-objective forms based on the concept described in Sect. 2.1 and problem as a bi-level bi-objective optimization problem with the objective (16) is expressed L functions Z i , Z iU at level i = 1, 2. Using Change of variable method [7], the individual optimal solutions of Z iL and Z iU are calculated at each level i = 1, 2 in ∗ ∗ ∗ ∗ isolation. Let X L 1 , X U1 , X L 2 , X U2 be the individual optimal solutions of Z 1L , Z 1U , U L Z 2 and Z 2 respectively. Pay off Table 1 is separately constructed at each level for i = 1, 2 to determine the aspired and acceptable levels of each objective function as follows. ∗ ∗ ∗ ∗ i.e., Z iL (X Ui ) ≤ Z iL (x) ≤ Z iL (X L i ), Z iU (X L i ) ≤ Z iU (x) ≤ Z iU (X Ui ), i = 1, 2. Subsequently, the fuzzy linear membership functions associated with each objective function at both level (upper and lower) are constructed as follows.
454
S. Maharana and S. Nayak
∗
Z iL (x) − Z iL (X Ui ) ∗ ∗ , Z iL (X L i ) − Z iL (X Ui )
μ Z iL (x) =
∗
Z U (x) − Z iU (X L i ) μ Z iu (x) = U i U ∗ ∗ , i = 1, 2 Z i (X i ) − Z iU (X L i )
(17)
To determine the aspiration level of the decision variables X 1 controlled by ULDM, the following bi-objective optimization problem arising at upper level is solved using the method of solution proposed by Chakraborty and Gupta [6]. max {Z 1L , Z 1U } subject to ∑n ∑n m Ljk x j ≤ n kL , j=1
j=1
m Ujk x j ≤ n Uk ,
(18)
x j ≥ 0, k = 1, 2, ..., m Let X ∗ = (X 1∗ , X 2∗ ) be the compromise solution of (18) and the aspiration level of X 1 is assumed as X 1∗ . Finally to determine the compromise solution of the bi-level fuzzy FPP (11), the following model is formulated using fuzzy goal programming approach [12]. min
∑2
−
i=1
diL +
∑2 i=1
diU
−
+ (d1− + d1+ )
subject to −
−
μ Z iL (x) + diL ≥ 1, μ Z iU (x) + diU ≥ 1, i = 1, 2 X 1∗ + d1− − d1+ = X 1∗ ∑n ∑n m Ljk x j ≤ n kL , j=1
j=1
(19) m Ujk x j ≤ n Uk ,
x j ≥ 0, k = 1, 2, ..., m −
−
diL , diU , d1− , d1+ ≥ 0, d1− .d1+ = 0 Model (19) is solved to obtain the compromise solution by linearizing the fractional membership functions μ Z iL (x), μ Z iu (x) using the process proposed in Sect. 3 with elimination of over deviations. If DMs are not satisfied with this solution, the values of α, β-cuts ∈ [0, 1] can be altered to generate another compromise solution.
38 Bi-Level Linear Fuzzy Fractional Programming Problem …
455
5.1 Algorithm Step 1. Use fuzzy α- and β-cuts respectively for the trapezoidal fuzzy numbers involved in the objective functions and the constraints of BL-FFPP (11). Step 2. Express the objective functions in interval valued forms and linearize the constraints as (16). Step 3. Consider Z iL and Z iU as bi-objective functions at each level i = 1, 2. Step 4. Construct the fuzzy membership functions μ Z iL (x) and μ Z iU (x) for each objective function Z iL and Z iU , i = 1, 2 following the proposed process as explained in (17). Step 5. Determine the aspiration level X 1∗ of the decision variables X 1 controlled by ULDM on solving (18). Step 6. Use fuzzy goal programming method to formulate the model (19). Step 7. Linearize the fractional membership functions μ Z iL (x) and μ Z iU (x) in (19) using the proposed linearization process and solve it to find the compromise solution of the bi-level fuzzy FPP (11). Step 8. If DMs are not satisfied with the obtained compromise solution, reformulate and solve the model (11) by changing α, β ∈ [0, 1] and aspiration level X ∗ .
6 Numerical Example Consider the following bi-level linear fuzzy FPP which is initially solved by Saad and Hafez [16] in deterministic form. (U L D M) : max Z 1 = x1
(L L D M) : max Z 2 = x2
˜ 2 ˜ 1 + 3x 2x ˜ 2 + 6˜ ˜ 1 + 4x 1x ˜ 2 ˜3x1 + 4x ˜ 1 + 4x ˜ 2 + 3˜ 6x
subject to ˜ 3x ˜ 1 + 1x ˜ 2 ≤ 5, ˜ 1 + 1x ˜ 2 ≤ 10, ˜ x1 , x2 ≥ 0 1x wher e, 1˜ = (0, 0.5, 4, 7), 2˜ = (0.5, 1, 5, 9), 3˜ = (1, 2, 9, 12), 4˜ = (1, 3, 8, 12), 5˜ = (2, 4, 10, 18), ˜ = (4, 7, 15, 25). 6˜ = (1, 2, 12, 20), 10
(20)
456
S. Maharana and S. Nayak
Table 2 Pay off table X∗ L ∗1
X ∗ X U1 X∗ ∗ X L2 ∗ U X 2
Z 1L (X ∗ )
Z 1U (X ∗ )
0.0921 0.0540 Z 2L (X ∗ ) 0.1416 0.0697
4.0552 6.7471 Z 2U (X ∗ ) 3.8621 4.5902
(0.5, 1, 5, 9)x1 + (1, 2, 9, 12))x2 (0, 0.5, 4, 7)x1 + (1, 3, 8, 12)x2 + (1, 2, 12, 20) (1, 2, 9, 12)x1 + (1, 3, 8, 12)x2 (L L D M) : max Z 2 = x2 (1, 2, 12, 20)x1 + (1, 3, 8, 12)x2 + (1, 2, 9, 12) subject to (0, 0.5, 4, 7)x1 + (0, 0.5, 4, 7)x2 ≤ (2, 4, 10, 18) (1, 2, 9, 12)x1 + (0, 0.5, 4, 7)x2 ≤ (4, 7, 15, 25) x1 , x2 ≥ 0 (U L D M) : max Z 1 = x1
(21)
Solution: Using fuzzy α, β-cuts in the objective functions and the constraints with α = β = 0.5, problem (20) is formulated as the following model.
0.75x + 1.5x 7x1 + 10.5x2 1 2 , x1 5.5x1 + 10x2 + 16 0.25x1 + 2x2 + 1.5
10.5x1 + 10x2 1.5x1 + 2x2 (L L D M) : max Z 2 = , x2 16x1 + 10x2 + 10.5 1.5x1 + 2x2 + 1.5 subject to 0.25x1 + 0.25x2 ≤ 3, 5.5x1 + 5.5x2 ≤ 14,
(U L D M) : max Z 1 =
(22)
1.5x1 + 0.25x2 ≤ 5.5, 10.5x1 + 5.5x2 ≤ 20, x1 , x2 ≥ 0 Interval valued objective function at each level is equivalently considered as biobjective functions. Using change of variable method, each objective function is individually maximized and their upper and lower bounds are computed using the following pay off Table 2. So the membership functions of the linear fractional objectives at each level are formulated as follows.
38 Bi-Level Linear Fuzzy Fractional Programming Problem … 0.75x1 +1.5x2 5.5x1 +10x2 +16
μ Z 1L (x) =
− 0.0540
0.0381 − 4.0552
7x1 +10.5x2 0.25x1 +2x2 +1.5
μ Z 1U (x) =
2.6919 − 0.0697
1.5x1 +2x2 16x1 +10x2 +10.5
μ Z 2L (x) =
457
μ Z 2U (x) =
(23)
0.0719 − 3.8621
10.5x1 +10x2 1.5x1 +2x2 +1.5
0.7281
To determine the aspiration level of the decision variables controlled by the ULDM, the compromise solution of the upper level problem is obtained as X ∗ = (0, 2.5455) resulting x1 ≈ 0 (aspiration level). The fuzzy goals of (20) are formulated as follows. μ Z iL (x) + diL− − diL+ = 1, μ Z iU (x) + diU − − diU + = 1, i = 1, 2 x1 + d − − d + = 0
(24)
Using goal programming method and the proposed linearization process of the fractional membership functions, (24) can be formulated as follows. −
−
−
−
min (D1L + D2L + D1U + D2U ) + (d − + d + ) subject to −
0.2434x1 + 0.5790x2 + D1L ≥ 1.4736 5.3132x1 − 2.9942x2 + D1U
−
≥ 10.1206
− − 0.7656x1 + 0.5840x2 + D2L ≥ 1.4869 − 3.6147x1 + 0.8196x2 + D2U ≥ 6.8853 − − 0.2096x1 − 0.3810x2 + D1L ≤ 0.6096 − − 0.6730x1 − 5.3838x2 + D1U ≤ 4.0378 − − 1.1504x1 − 0.7190x2 + D2L ≤ 0.7550 − − 1.0922x1 − 1.4562x2 + D2U ≤ 1.0922 x1 + d − − d + = 0
(25)
0.25x1 + 0.25x2 ≤ 3, 5.5x1 + 5.5x2 ≤ 14, 1.5x1 + 0.25x2 ≤ 5.5, 10.5x1 + 5.5x2 ≤ 20 −
−
x1 , x2 ≥ 0, D1L ≥ 0, D2L ≥ 0, D1U
−
≥ 0, D2U
−
≥ 0, d − , d + ≥ 0, d − .d + = 0
On solving (25), the compromise solution of the BL-LFFPP (20) is obtained as x1 = 1.903934, x2 = 0.001581340 and the corresponding optimal objective values are Z 1 = [0.0540, 6.7424] and Z 2 = [0.0698, 4.5898] at upper and lower level respectively.
458
S. Maharana and S. Nayak
Fig. 2 Crisp and interval valued optimal objective values
6.1 Result Analysis Saad and Hafez [16] obtained the optimal solution of BL-LFPP (20) in deterministic form as x = (3, 1) and the corresponding crisp optimal objective values are Z 1 = 0.6923 and Z 2 = 0.5200. In this paper, BL-LFPP (20) is considered in fuzzy environment which causes the generation of the optimal objective values in form of specific range instead of fixed values i.e., Z 1 = [0.0540, 6.7424] and Z 2 = [0.0698, 4.5898] are obtained using the proposed solution approach. It is clearly observed that the crisp optimal objective values of the deterministic BL-LFPP (20) due to [16] lie within the interval valued optimal objective values of fuzzy BL-LFPP (20) due to the proposed method i.e., Z 1 = 0.6923 ∈ [0.0540, 6.7424] and Z 2 = 0.5200 ∈ [0.0698, 4.5898] which justifies the feasibility of the proposed solution approach. In this context, the following Fig. 2 interpretes the crisp and interval valued (lower and upper bound) optimal objective values of the objective functions as the result analysis of the numerical problem.
7 Conclusions In hierarchical organizations, bi-level optimization plays an important role in solving various practical problems. Fractional objectives are often formulated to suitably fit the optimization models of the practical problems. The uncertainty and ambiguity in the available data can be addressed using fuzzy numbers. This paper develops an efficient solution methodology to determine a compromise solution of a bi-level fuzzy fractional programming problem with trapezoidal fuzzy numbers. LINGO software is used for the computational works of the optimization problems in the numerical section. BL-LFFPP with linear, triangular, pentagonal etc. fuzzy numbers can also be solved using the proposed solution approach. Numerical example with comparative result analysis illustrates the proposed method and justifies its feasibility. As the
38 Bi-Level Linear Fuzzy Fractional Programming Problem …
459
future scope of this work, bi-level multi-objective and multi-level multi-objective FPP can be further studied under different fuzzy environments to develop the efficient solution algorithms.
References 1. Abo-Sinna MA (2001) A bi-level non-linear multi-objective decision making under fuzziness. Opsearch 38(5):484–495 2. Arora SR, Gupta R (2009) Interactive fuzzy goal programming approach for bilevel programming problem. Eur J Oper Res 194(2):368–376 3. Arya R, Singh P, Kumari S, Obaidat MS (2020) An approach for solving fully fuzzy multiobjective linear fractional optimization problems. Soft Computing 24(12):9105–9119 4. Borza M, Rambely AS (2022) An approach based on alpha-cuts and max-min technique to linear fractional programming with fuzzy coefficients. Iran J Fuzzy Syst 19(1):153–168 5. Calvete HI, Galé C (1999) The bilevel linear/linear fractional programming problem. Eur J Oper Res 114(1):188–197 6. Chakraborty M, Gupta S (2002) Fuzzy mathematical programming for multi objective linear fractional programming problem. Fuzzy Sets Syst 125(3):335–342 7. Charnes A, Cooper WW (1962) Programming with linear fractional functionals. Naval Res Logistics Q 9(3–4):181–186 8. Toksari MD (2010) Taylor series approach for bi-level linear fractional programming problem. Selçuk J Appl Math 11(1):63–69 9. Hirche J, Craven BD (1989) Fractional programming. Berlin, Heldermann Verlag 1988, 145 p, DM 48. ISBN 3-88538-404-3 (Sigma Series in Applied Mathematics 4). Zeitschrift Angewandte Mathematik und Mechanik, 69(10), 371–371 (1989) 10. Malhotra N, Arora SR (2000) An algorithm to solve linear fractional bilevel programming problem via goal programming. Opsearch 37(1):1–13 11. Mehra A, Chandra S, Bector CR (2011) Acceptable optimality in linear fractional programming with fuzzy coefficients. Fuzzy Optim Decis Making 6(1):5–16 12. Miettinen K (2012) Nonlinear multiobjective optimization (Vol 12). Springer Science & Business Media 13. Mishra S (2007) Weighting method for bi-level linear fractional programming problems. Eur J Oper Res 183(1):296–302 14. Pal BB, Moitra BN, Maulik U (2003) A goal programming procedure for fuzzy multiobjective linear fractional programming problem. Fuzzy Sets Syst 139(2):395–405 15. Pramanik S, Dey PP (2011) Bi-level linear fractional programming problem based on fuzzy goal programming approach. Int J Comput Appl 25(11):34–40 16. Saad OM, Hafez MS (2011) An algorithm for solving bi-level integer linear fractional programming problem based on fuzzy approach. General Math Notes 3:86–99 17. Stancu-Minasian IM (2012) Fractional programming: theory, methods and applications (Vol 409). Springer Science & Business Media 18. Stanojevi´c B, Dzitac S, Dzitac I (2020) Fuzzy numbers and fractional programming in making decisions. Int J Inf Technol Decis Making 19(04):1123–1147 19. Thirwani D, Arora SR (1993) Bi-level linear fractional programming problem. Cahiers du Centre d’études de recherche opérationnelle 35(1–2):135–149 20. Veeramani C, Sumathi M (2016) A new method for solving fuzzy linear fractional programming problems. J Intell Fuzzy Syst 31(3):1831–1843 21. Zimmermann HJ (2011) Fuzzy set theory and its applications. Springer Science & Business Media
Chapter 39
Comparative Assessment of Runoff by SCS-CN and GIS Methods in Un-Gauged Watershed: An Appraisal of Denwa Watershed Papri Karmakar, Aniket Muley, Govind Kulkarni, and Parag Bhalchandra
1 Introduction Water is such a necessary natural resource, devoid of which any life cannot stay alive. Water requirement is increasing day by day with gigantic population growth. Water is necessary in every sphere of life counting individual use, cattle consumption, agriculture, industry, etc., it becomes really significant to supervise this primary resource in a sustainable way. Suitable management will not only copped soil erosion but also recharge groundwater [1, 2]. In modern times, a noticeable enthusiasm has been amplified towards watershed approach. This refers to such a geographical area from which runoff ensuing precipitation accumulates in a single channel to debouch either into large streams, rivers, lakes or oceans [3]. It is a very dynamic unit to supply rainwater at a common point and has been acknowledged as an essential component for planning and execution of the remedial, defensive and amelioration agenda. Therefore, it is considered as a perfect entity for proper management of natural resources to facilitate sustainable development by lessening the adverse effect of natural disasters. The important facet compulsory for planning and development of a watershed involves the analysis regarding its terrain configuration or topography, geomorphology, soil, drainage system, land use/land cover and existing P. Karmakar (B) Department of General and Applied Geography, Dr. HGC University, Sagar, M.P, India e-mail: [email protected] A. Muley Department of Stats, School of Mathematical Sciences, S.R.T.M. University, Nanded, MS, India G. Kulkarni Department of Computer Sciences, LLDMM, Beed, Parli-Vaijnath, MS, India P. Bhalchandra School of Computational Science, S.R.T.M. University, Nanded, MS, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 M. S. Uddin and J. C. Bansal (eds.), Proceedings of International Joint Conference on Advances in Computational Intelligence, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-99-1435-7_39
461
462
P. Karmakar et al.
water resources[4]. Rainfall-runoff method is a mathematical model revealing rainfall–runoff associations within a watershed [3]. The disparity in temporal and spatial distribution and highly nonlinear [5] nature of rainfall converted into surface runoff represents a very complex part of hydrological cycle. Conventional techniques to calculate runoff are helpful only in those watersheds where gauging station is installed to measure the discharge rate of river channel. However, these measurements usually are very costly, time-consuming and complicated. Runoff method could be exercised for the prioritization of watersheds [3, 6]. Remote sensing and GIS are the most promising techniques for evaluation on prioritization of watershed administration and improvement. Progresses in advance modeling, convenience of spatial data, analyzing pattern employed a possibility to envisage the runoff correctly utilizing minimum data [7, 8]. The primarily significant spatial data are mandatory. A superior runoff model incorporates appropriate analysis of parameters such as rainfall, soil types and land use/land-cover. There are comprehensive hydrologic models based on physical and spatial distribution of basins that are capable to assimilate both RS and GIS technologies for measurement and monitoring of surface water flow to compute series of runoff for a particular rainfall episode. At present, there are several approaches to calculate approximate runoff from ungauged basin. Rainfall–runoff models are frequently applied in this context. Among them, Curve Number method (SCS-CN) is the most effective approach due to its capability and flexibility [2, 6, 9]. The United States Soil Erosion Service (SES) was formed in 1933 to supervise ongoing soil conservation projects. The agency altered to Soil Conservation Service (SCS) after the Soil Conservation Act of 1935. The SCS become conscious about the necessity of acquiring hydrologic data and simple procedure for assessing runoff rates. Contemporary research in [10–12] leads the way to build up the unique method for assessing direct runoff from storm rainfall. A variety of researches and analyses have been studied from various regions of India as seen in [13, 14] recommended that the runoff values estimated by curve number technique can be categorized as more, moderate and low classes and thus it became very easy to compute the months [2, 15]. The Soil Conservation Service curve number method is a comprehensive multipurpose procedure for quick runoff estimation and rather uncomplicated to furnish satisfactory in small database like United States Department of Agriculture (USDA) as seen in [10, 16, 17, 18]. In recent times, the emerging accessibility of remote sensing data and fast progress in computational techniques have made feasible to couple SCS curve number in spatial domain coupled with Satellite Information System (GIS) [12]. It is generally employed as initial input in different hydrologic models together with erosion and water quality models, including SWAT [19], EPIC [20] CREAMS [21] and AGNPS [22]. This model is more suitable for small watershed ( 15%). For PMV ≥ −
682
Fig. 1 Auditorium view from the stage
Fig. 2 Auditorium view from the audience
E. Conceição et al.
55 Control Application to a Heating System Used in an Auditorium …
683
0.7, the HVAC remains off. For more details, see the study presented in Conceição et al. [19].
4 Results and Discussion The results presented here refer to those obtained on the sixth day of the simulation. The previous 5 days were simulated so that the quantities involved stabilized their values.
4.1 Air Temperature The daily variation of the indoor air temperature in the auditorium when the HVAC system has no automatic control system implemented and when it has a PMV control is presented, respectively, in Figs. 3 and 4. Both Figs. 3 and 4 present the daily variation of the air temperature outside the auditorium. With the use of the HVAC system without automatic control system, during occupation, the average air temperature inside the auditorium is about 13.2 °C. The air 25
Indoor
Outdoor
20
15
10
5
0 0
2
4
6
8
10
12
14
16
18
20
22
24
time (h)
Fig. 3 Daily variation of the air temperature (°C) inside and outside the auditorium when the HVAC has no control system
684
E. Conceição et al.
25
Indoor
Outdoor
20
15
10
5
0 0
2
4
6
8
10
12
14
16
18
20
22
24
time (h) Fig. 4 Daily variation of the air temperature (°C) inside and outside the auditorium when the HVAC has a PMV control
temperature inside the auditorium is higher, on average, by about 8.3 °C in relation to the outdoor temperature. By using the HVAC with PMV control, during occupation, the average air temperature inside the auditorium is about 22.5 °C. The indoor air temperature is higher, on average, by about 17.7 °C in relation to the outdoor temperature. Comparing the use of HVAC with and without control system, the indoor air temperature increases, on average, by about 112%. By using the HVAC with PMV control, during occupation, it is feasible to have the air temperature inside the auditorium relatively constant, slightly above 22.5 °C and situated within the range of values, between 20 and 25 °C, recommended by the standards. Note that, after connecting the HVAC system, the indoor air temperature takes about 65 min to stabilize due to the size of the auditorium and its high occupancy.
4.2 PMV Index The evolution of the PMV index inside the auditorium when HVAC has no automatic control system implemented and when it has a PMV control is shown, respectively, in Figs. 5 and 6. By using the HVAC without automatic control system, during occupation, the occupants’ thermal comfort level is unacceptable due to PMV negative values. During
55 Control Application to a Heating System Used in an Auditorium …
685
0.7
0
-0.7
-1.4
-2.1
-2.8
-3.5 0
2
4
6
8
10
12
14
16
18
20
22
24
time (h)
Fig. 5 Evolution of PMV index inside the auditorium when the HVAC has no control system. The shaded zone delimits the thermal comfort zone within category C of ISO 7730 [6] 0.7
0
-0.7
-1.4
-2.1
-2.8
-3.5 0
2
4
6
8
10
12
14
16
18
20
22
24
time (h)
Fig. 6 Evolution of the PMV index inside the auditorium when the HVAC has a PMV control. The shaded zone delimits the thermal comfort zone defined by category C of ISO 7730 [6]
686
E. Conceição et al.
the occupancy period, the PMV average value is −3.24. This high value is due to the HVAC being always on. By using HVAC with PMV control, during occupation, after the interior thermal conditions have stabilized (about 9.30 am), the occupants’ thermal comfort level is acceptable by PMV negative values in line with category C [6], around the acceptable threshold of −0.7. During the occupancy period, the PMV average value is −0.91. If we consider from the moment that the interior thermal conditions stabilize, the average value of the PMV increases to −0.83. The oscillations that occur in the PMV index are due to the action of the control system in defining HVAC operating periods. Comparing the use of HVAC with and without control system, the PMV increases, on average, by about 256%. In this way, it is possible to ensure levels of thermal comfort for the occupants by means of PMV negative values around the acceptable limit proposed by category C [6].
4.3 Thermal Power The evolution of the thermal power of the HVAC system has a control system based on the PMV index which is presented in Fig. 7. As can be seen in Fig. 7, at the start of the HVAC, the highest value of the thermal power is obtained, approximately 41.7 kW. It is also verified that the HVAC remains on, with power values between that value and 39.2 kW until the thermal conditions 50
Q (kW)
45 40 35 30 25 20 15 10 5 0 0
2
4
6
8
10
12
14
16
18
20
time (h)
Fig. 7 Evolution of the thermal power (Q) of HVAC when it has a PMV control
22
24
55 Control Application to a Heating System Used in an Auditorium …
687
inside the auditorium stabilize around 9.30 am. After this moment, the HVAC system will turn on when the PMV is below the value of −0.7 and will turn off when the PMV is above the value of −0.7. When connecting the HVAC, the values of the thermal power verified are generally around 39.0 kW. In the early afternoon, it is also noted that the HVAC system is switched on for about 15 min with a thermal power of around 39.0 kW. When the auditorium has no occupancy, the HVAC is off. In this case, HVAC is on for 4.6 h (corresponding to 57.5% of the occupancy time), about 3.0 h in the morning and about 1.6 h in the afternoon. In this way, it is possible to manage energy consumption by the HVAC more efficiently and, at the same time, to ensure acceptable levels of thermal comfort for the occupants.
5 Conclusions In this numerical work, done for winter conditions, an application of a power system and a heating energy control system in an auditorium with complex topology was presented. The heating energy system is controlled by the PMV. The auditorium is occupied by 140 persons. A BTB software was used to assess the occupants’ thermal comfort considering that the control system can be used or not. By using the control system, the temperature of the indoor air rises to relatively constant values around 22.5 °C, which is therefore within the limits recommended by the standards. In the meantime, the PMV rises to values around −0.7, therefore, the thermal comfort of the occupants is within the acceptable limits in line with category C of ISO 7730 [6]. Finally, the necessary operating time of the HVAC system is reduced by 42.5%, thus obtaining a decrease in energy consumption by the HVAC system. Acknowledgements The authors would like to acknowledge to the project (SAICTALG/39586/2018) from Algarve Regional Operational Program (CRESC Algarve 2020), under the PORTUGAL 2020 Partnership Agreement through the European Regional Development Fund (ERDF) and the National Science and Technology Foundation (FCT).
References 1. Building research establishment environmental assessment methodology. https://www.breeam.com/. Accessed 22 Feb 2022 2. Brostrom M, Howell G (2008) The challenges of designing and building a net zero energy home in a cold high-latitude climate. In: Proceedings of the 3rd international solar cities congress, Adelaide, Australia, pp 17–21 3. Liang J, Du R (2007) Design of intelligent comfort control system with human learning and minimum power control. Energy Convers Manag 49:517–528 4. Conceição E, Lúcio M, Ruano A, Crispim E (2009) Development of a temperature control model used in HVAC systems in school spaces in Mediterranean climate. Build Environ 44(5):871–877
5. Fanger P (1970) Thermal comfort: analysis and applications in environmental engineering. Danish Technical Press, Copenhagen, Denmark 6. ISO 7730 (2005) Ergonomics of the thermal environments—analytical determination and interpretation of thermal comfort using calculation of the PMV and PPD indices and local thermal comfort criteria. International Standard Organization, Geneva, Switzerland 7. Homod Z, Sahari K, Almurib H, Nagi F (2012) RLF and TS fuzzy model identification of indoor thermal comfort based on PMV/PPD. Build Environ 49:141–153 8. Ferreira P, Ruano A, Silva S, Conceição E (2012) Neural networks based predictive control for thermal comfort and energy saving in public buildings. Energy Build 55:238–251 9. Ruano A, Pesteh S, Silva S, Duarte H, Mestre G, Ferreira P, Horta R (2016) The IMBPC HVAC system: a complete MBPC solution for existing HVAC systems. Energy Build 120:145–158 10. Ku K, Liaw J, Tsiai M, Liu T (2015) Automatic control system for thermal comfort based on predicted mean vote and energy saving. IEEE Trans Autom Sci Eng 12:378–383 11. Xu Z, Hu G, Spanos C, Schiavon S (2017) PMV-based event-triggered mechanism for building energy management under uncertainties. Energy Build 152:73–85 12. Conceição E, Santiago C, Lúcio M, Awbi H (2018) Predicting the air quality, thermal comfort and draught risk for a virtual classroom with desk-type personalized ventilation systems. Buildings 8(2):35 13. Conceição E, Nunes A, Gomes J, Lúcio M (2010) Application of a school building thermal response numerical model in the evolution of the adaptive thermal comfort level in Mediterranean environment. Int J Vent 9(3):287–304 14. Conceição E, Lúcio M (2010) Numerical study of the influence of opaque external trees with pyramidal shape on the thermal behaviour of a school building in summer conditions. Indoor Built Environ 19(6):657–667 15. Conceição E, Gomes J, Awbi H (2019) Influence of the airflow in a solar passive building on the indoor air quality and thermal comfort levels. Atmosphere 10(12):766 16. Conceição E, Awbi H (2021) Evaluation of integral effect of thermal comfort, air quality and draught risk for desks equipped with personalized ventilations systems. Energies 14(11):3235 17. Conceição E, Silva M, André J, Viegas D (2000) Thermal behaviour simulation of the passenger compartment of vehicles. Int J Veh Des 24(4):372–387 18. ANSI/ASHRAE Standard 62-1 (2016) Ventilation for acceptable indoor air quality. American Society of Heating, Refrigerating and Air-Conditioning Engineers, Atlanta, GA, USA 19. Conceição E, Gomes J, Ruano A (2018) Application of HVAC systems with control based on PMV index in university buildings with complex topology. IFAC (Papers Online) 51(10):20–25
Chapter 56
Mathematical Analysis of Effect of Nutrients on Plankton Model with Time Delay
Rakesh Kumar and Navneet Rana
1 Introduction
The plankton are micro-organisms that live in water. They have very light bodies and move freely in the water; they are also termed 'marine-drifters'. The plankton are classified into two main forms, phytoplankton (plants) and zooplankton (animals). The phytoplankton make a very large contribution toward aquatic life as well as terrestrial life. The phytoplankton are green in color and produce their food by the process of photosynthesis. They absorb large amounts of carbon dioxide from the atmosphere and supply oxygen to the atmosphere. The zooplankton eat phytoplankton as their food. The growth of the phytoplankton depends upon the nutrients present in water, like nitrate, phosphate, etc. Sometimes, overgrowth of the phytoplankton in water causes harmful algal blooms, which decrease the amount of oxygen dissolved in water. The aquatic micro-organisms die due to this process, which in turn leads to an imbalance in the aquatic ecosystem. The mathematical modeling of the plankton population is a very useful method to learn about the physical and biological processes of plankton conservation. The biological processes in the plankton system can be represented by differential equations and analyzed for their stability and periodic solutions, as explained in [1–4]. Researchers in [5, 6] have discussed phytoplankton and zooplankton models mathematically to learn the dynamical behavior of these tiny organisms. The plankton model in the presence of nutrients has been studied by authors [7–10] in their research. The stability analysis along with bifurcation analysis of delayed innovation diffusion models has been explained by authors in [11–14]. The plankton nutrient model with delay has been presented by
researchers [15–19], where they study the stability of delay models, nutrient cycling, toxicity and harvesting. In [20], Singh et al. have studied the effect of diffusion and time delay on a nutrient-phytoplankton model with toxicity. The stability and Hopf bifurcation of a delayed diffusive plankton model are analyzed by Liang and Jia in [21]. The effect of seasonality on a nutrient-plankton system with toxicity in the presence of refuge and additional food has been discussed by Tiwari et al. in [22]. Kaur et al. in [23] have described the nutrient-plankton model with delay and toxicity. The remainder of the paper is arranged as follows: the mathematical expression of the model describing the biological process is defined in the next section. The system of differential equations is analyzed for positive solutions and bounded behavior in the third section. The various kinds of equilibrium points and their stability analysis are presented in the fourth section. In the fifth section, the model is studied without time delay and with time delay, and the parameter τ is considered as a bifurcation parameter to explain the occurrence of Hopf bifurcation in the system. In the last section, the analytical results are confirmed using numerical simulations.
2 Mathematical Expression for Nutrient-Plankton Model
Let n(t) represent the nutrient concentration at any time t in water. Let p(t) and z(t), respectively, denote the densities of the phytoplankton population and the zooplankton population at any time t. Let τ be the delay in time in zooplankton predation. Then, the system of delay differential equations describing the mathematical model is as follows:

dn/dt = ⌃ − c n(t) − ∈ n(t) p(t),
dp/dt = r p(t)(1 − p(t)/K) + β n(t) p(t) − a p(t) z(t − τ)/(1 + α p(t)) − δ1 p(t),    (1)
dz/dt = a1 p(t) z(t)/(1 + α p(t)) − δ2 z(t).

The set of parameters involved in these mathematical expressions are explained as follows:
⌃ → the initial concentration of nutrients.
c → the absorption rate of nutrients.
∈ → the rate at which phytoplankton absorb the nutrients.
r → the rate at which phytoplankton grow naturally.
K → the phytoplankton's carrying capacity.
β → the rate at which nutrients are converted for the growth of phytoplankton.
a → the rate at which zooplankton capture phytoplankton.
a1 → the rate at which phytoplankton are converted to zooplankton.
δ1 → the rate at which phytoplankton die naturally.
δ2 → the rate at which zooplankton die naturally.
α → the handling time.
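For readers who want to experiment with system (1), its right-hand side translates directly into code. The sketch below is only an illustration of the equations above (it is not code from the paper); the delayed zooplankton density z(t − τ) is passed in explicitly so that any delay-handling scheme can be plugged in.

```python
# Right-hand side of system (1); params holds the rates listed above.
def nutrient_plankton_rhs(n, p, z, z_delayed, params):
    Lam, c, eps = params["Lam"], params["c"], params["eps"]
    r, K, beta = params["r"], params["K"], params["beta"]
    a, a1, alpha = params["a"], params["a1"], params["alpha"]
    d1, d2 = params["delta1"], params["delta2"]

    dn = Lam - c * n - eps * n * p
    dp = (r * p * (1 - p / K) + beta * n * p
          - a * p * z_delayed / (1 + alpha * p) - d1 * p)
    dz = a1 * p * z / (1 + alpha * p) - d2 * z
    return dn, dp, dz
```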
The initial conditions for system (1) are defined as follows: n(ν) = ρ1(ν), p(ν) = ρ2(ν), z(ν) = ρ3(ν), with ρ1(ν) ≥ 0, ρ2(ν) ≥ 0, ρ3(ν) ≥ 0 for ν ∈ [−τ, 0] and ρ1(0) > 0, ρ2(0) > 0, ρ3(0) > 0, where ρ1(ν), ρ2(ν), ρ3(ν) ∈ C([−τ, 0], R3+), the Banach space of continuous functions defined on the interval [−τ, 0] into R3+, where R3+ = {(x1, x2, x3): xi > 0, i = 1, 2, 3}.
3 Analysis of Model for Positive and Bounded Solution
As we are dealing with a system of biological existence, it is very important to check the system (1) for positive and bounded solutions.
Theorem 1 With reference to the above-defined initial conditions, the solution of the system (1) is positive for all t ≥ 0.
Proof Let (n(t), p(t), z(t)) be the solution of the system (1). Rewriting the first equation of system (1) as
dn(t)/n(t) ≥ −{c + ∈ p(t)} dt.
By integrating over the interval [0, t], we get the result as
n(t) ≥ ρ1(0) exp( −∫_0^t (c + ∈ p(t)) dt ) > 0.
Rephrasing the second equation of system (1) as
dp(t)/p(t) = { r(1 − p(t)/K) + β n(t) − a z(t − τ)/(1 + α p(t)) − δ1 } dt,
dp/p = G1 dt,
where G1 = r(1 − p(t)/K) + β n(t) − a z(t − τ)/(1 + α p(t)) − δ1.
Integrating over [0, t] and using G11 = ∫_0^t G1 dt, we have
p(t) = ρ2(0) exp(G11) > 0.
Similarly, the third equation gives the result as
z(t) = ρ3(0) exp(G22) > 0,
where G22 = ∫_0^t G2 dt and G2 = a1 p(t)/(1 + α p(t)) − δ2. Thus, we get the positive solution of the system (1).
Theorem 2 All the solutions of the system (1), which start in R3+, are uniformly bounded.
Proof Let (n(t), p(t), z(t)) be the solution of the system (1) with reference to the above-defined initial conditions. Rewriting the first equation of the system as
dn(t)/dt + c n(t) ≤ ⌃.
Using the comparison lemma, we have lim sup_{t→∞} n(t) ≤ ⌃/c.
The second equation of the system gives
dp(t)/dt ≤ r p(t)(1 − p(t)/K) + β n(t) p(t).
Using the comparison lemma, we have lim sup_{t→∞} p(t) ≤ K + Kβn/r, i.e.,
lim sup_{t→∞} p(t) ≤ K + Kβ⌃/(rc).
Considering the function
B(t) = (β/∈) n(t) + p(t) + (a/a1) z(t),
we get
dB/dt + kB = r p + β⌃/∈ − r p²/K, where k = min(c, δ1, δ2),
dB/dt + kB ≤ r p + β⌃/∈.
Using the result of the theorem on differential inequalities as mentioned in [24], we obtain
lim sup_{t→∞} B(t) ≤ (1/k)( r p + β⌃/∈ ),
i.e.,
lim sup_{t→∞} B(t) ≤ (1/k)( r (K + Kβ⌃/(rc)) + β⌃/∈ ).
Thus, for all t > 0, all the solutions of the system (1) are ultimately bounded.
4 Existence of Equilibrium Points and Their Steadiness Analysis
The given system of differential equations has the following balance points:
(i) The boundary steady state E1(⌃/c, 0, 0), which occurs when the phytoplankton and zooplankton population densities become zero in the system.
(ii) The zooplankton-free steady state E2(n1, p1, 0), which occurs when the zooplankton population density becomes zero in the system, where
n1 = ⌃/(c + ∈ p1) and p1 = (K/r)(r + β n1 − δ1).
(iii) The coexisting steady state E3(n∗, p∗, z∗), where
n∗ = ⌃(a1 − αδ2)/(c a1 − c αδ2 + ∈ δ2);
p∗ = δ2/(a1 − αδ2);
z∗ = ((1 + αp∗)/a)( r − r p∗/K + β n∗ − δ1 ).
n∗ and p∗ are positive if a1 > αδ2, and z∗ is positive if r + β n∗ > r p∗/K + δ1.
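Since E3 is given in closed form, it can be evaluated directly from the parameters; the snippet below simply codes the expressions above (it is an illustration, not code from the paper).

```python
def coexisting_equilibrium(Lam, c, eps, r, K, beta, a, a1, alpha, d1, d2):
    # n*, p*, z* from the closed-form expressions for E3 given above.
    n_star = Lam * (a1 - alpha * d2) / (c * a1 - c * alpha * d2 + eps * d2)
    p_star = d2 / (a1 - alpha * d2)
    z_star = (1 + alpha * p_star) / a * (r - r * p_star / K + beta * n_star - d1)
    return n_star, p_star, z_star

# With the parameter values used later in Sect. 6 this evaluates to roughly
# (5.26, 1.20, 4.20), in line with the reported E3(5.264, 1.198, 4.198).
print(coexisting_equilibrium(5, 0.485, 0.388, 0.15, 5, 0.2, 0.31, 0.237, 0.35, 0.25, 0.2))
```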
4.1 Steadiness Analysis of Equilibrium Points
In this subsection, we are going to discuss the steadiness conditions for the different equilibrium points. The Jacobian matrix is calculated at the equilibrium points. The condition for which the eigenvalues of the Jacobian matrix have negative real parts is used to analyze local stability around the equilibrium point.
Theorem 3 The boundary steady point E1(⌃/c, 0, 0) is stable if r + βn < δ1.
Proof The steady point E1(⌃/c, 0, 0) has the Jacobian matrix, defined as follows:
J_E1 = [ −c  −∈n  0 ; 0  r + βn − δ1  0 ; 0  0  −δ2 ].
Its eigenvalues are −c, r + βn − δ1 and −δ2. The boundary steady state E1(⌃/c, 0, 0) is stable if r + βn − δ1 < 0, i.e., r + βn < δ1.
Theorem 4 The zooplankton-free steady point E2(n1, p1, 0), where n1 = ⌃/(c + ∈p1) and p1 = (K/r)(r + βn1 − δ1), is stable if
a1 p1/(1 + αp1) < δ2 and r + βn1 < c + ∈p1 + 2r p1/K + δ1.
Proof The zooplankton-free equilibrium point E2(n1, p1, 0) has a Jacobian matrix, defined as
J_E2 = [ B11  B12  0 ; B21  B22  0 ; 0  0  B33 ],
where
B11 = −c − ∈p1,
B12 = −∈n1,
B21 = βp1,
B22 = r − 2r p1/K + βn1 − δ1,
B33 = a1 p1/(1 + αp1) − δ2.
The characteristic equation of J_E2 is given by
(B33 − λ)(λ² − (B11 + B22)λ + B11 B22 − B12 B21) = 0.
The eigenvalues of this characteristic equation have negative real parts if B33 < 0 and B11 + B22 < 0, i.e., a1 p1/(1 + αp1) < δ2 and r + βn1 < c + ∈p1 + 2r p1/K + δ1.
Next, we present the stability conditions of the coexisting equilibrium point E3 and show that stability disappears through Hopf bifurcation, as a result of which instability takes place. The bifurcation parameter is the time delay parameter τ. The Jacobian matrix at the equilibrium point E3(n∗, p∗, z∗) is given as
J_E3 = [ D11  D12  0 ; D21  D22  D23 ; 0  D32  0 ],
where
D11 = −c − ∈p∗,
D12 = −∈n∗,
D21 = βp∗,
D22 = r − 2r p∗/K + βn∗ − a z(t − τ)/(1 + αp∗)² − δ1,
D23 = −a p∗/(1 + αp∗) e^{−λτ},
D32 = a1 z∗/(1 + αp∗)².
The Jacobian matrix J_E3 has the characteristic equation, defined as
Δ(λ, τ)|_{E3} = (λ³ + Ω1 λ² + Ω2 λ + Ω3) + (Ω4 + Ω5 λ) e^{−λτ} = 0,    (2)
where
Ω1 = c + ∈p∗ − r + 2r p∗/K − βn∗ + a z(t − τ)/(1 + αp∗)² + δ1,
Ω2 = (c + ∈p∗)( −r + 2r p∗/K − βn∗ + a z(t − τ)/(1 + αp∗)² + δ1 ) + β∈ n∗ p∗,
Ω3 = 0,
Ω4 = (c + ∈p∗) a1 a p∗ z∗/(1 + αp∗)³,
Ω5 = a1 a p∗ z∗/(1 + αp∗)³.
It is noted that Ω1, Ω2, Ω3, Ω4, Ω5 are continuous and differentiable functions, which appear as the coefficients of Eq. (2).
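The Ω coefficients can likewise be evaluated numerically once E3 is known. The helper below is a sketch under the assumption that the delayed argument is evaluated at the steady state, i.e., z(t − τ) ≈ z∗; it is not code from the paper.

```python
def omega_coefficients(c, eps, r, K, beta, a, a1, alpha, d1, n_star, p_star, z_star):
    """Coefficients of the characteristic equation (2) at E3, with z(t - tau) ~ z*."""
    pred = a * z_star / (1 + alpha * p_star) ** 2
    core = -r + 2 * r * p_star / K - beta * n_star + pred + d1
    O1 = c + eps * p_star + core
    O2 = (c + eps * p_star) * core + beta * eps * n_star * p_star
    O3 = 0.0
    O5 = a1 * a * p_star * z_star / (1 + alpha * p_star) ** 3
    O4 = (c + eps * p_star) * O5
    return O1, O2, O3, O4, O5

# With the Sect. 6 parameters and E3 ~ (5.264, 1.198, 4.198) this returns
# approximately (0.715, 0.266, 0, 0.123, 0.129), matching the values quoted there.
print(omega_coefficients(0.485, 0.388, 0.15, 5, 0.2, 0.31, 0.237, 0.35, 0.25,
                         5.264, 1.198, 4.198))
```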
5 Analysis of Model Without Time Delay and with Time Delay
5.1 Analysis of the Delay-Free Model
Here we are going to check the system with no delay, i.e., τ = 0. The system (1) changes for τ = 0 to

dn/dt = ⌃ − c n(t) − ∈ n(t) p(t),
dp/dt = r p(t)(1 − p(t)/K) + β n(t) p(t) − a p(t) z(t)/(1 + α p(t)) − δ1 p(t),    (3)
dz/dt = a1 p(t) z(t)/(1 + α p(t)) − δ2 z(t).

So, the transcendental equation (2) for τ = 0 becomes
λ³ + Ω1 λ² + (Ω2 + Ω5)λ + (Ω3 + Ω4) = 0,    (4)
where the values of Ω1 > 0, Ω2 + Ω5 > 0 and Ω3 + Ω4 > 0 are defined above. By the Routh-Hurwitz criterion [25], the coexisting steady point E3(n∗, p∗, z∗) without any delay is locally asymptotically stable if the characteristic equation has roots with negative real parts, i.e., if
(T1): Ω1(Ω2 + Ω5) − (Ω3 + Ω4) > 0
holds.
5.2 Analysis of Model with Time Delay and Hopf Bifurcations
In the present subsection, we will study the system with time delay around E3. First, we state the two results as follows:
Lemma 1 ([26]) The conditions for roots of the cubic equation z³ + az² + bz + c = 0 are as follows:
1. There exists at least one positive root of the equation if c < 0.
2. There exists no positive root of the equation if c ≥ 0 and a² − 3b ≤ 0.
3. There exist positive roots of the equation if c ≥ 0 and a² − 3b ≥ 0.
Lemma 2 ([27, 28]) The system (1) has the coexisting steady state E3(n∗, p∗, z∗), which is
(i) absolutely stable if and only if the two conditions are fulfilled: (a) the steady point E3(n∗, p∗, z∗) is asymptotically stable; (b) for any τ > 0, the characteristic equation (2) does not have purely imaginary roots.
(ii) conditionally stable if and only if the two conditions are fulfilled: (a) at τ = 0, all the roots of the characteristic equation (2) have negative real parts; (b) a pair of complex roots which are conjugate of each other, ±i∈, exist for the characteristic equation (2) for some positive value of τ.
Theorem 5 For any time delay τ > 0, if the condition (T1) holds for the system (1), where (T1): Ω1(Ω2 + Ω5) − (Ω3 + Ω4) > 0, then the coexisting steady state E3(n∗, p∗, z∗) is conditionally stable.
Proof In this theorem, we are going to find the roots of the characteristic equation (2) which are purely imaginary, occur pairwise and are conjugate to each other. Let a root of the characteristic equation (2) be given by λ = i∈ for some τ > 0. Then,
−i∈³ − Ω1 ∈² + Ω2 i∈ + Ω3 + (Ω4 + Ω5 i∈) e^{−i∈τ} = 0.
By using the conventional method as mentioned in [29, 30] for separating real and imaginary parts, we get
Ω4 cos ∈τ + Ω5 ∈ sin ∈τ = Ω1 ∈² − Ω3,    (5)
−Ω5 ∈ cos ∈τ + Ω4 sin ∈τ = Ω2 ∈ − ∈³.    (6)
The Eqs. (5) and (6) are solved by squaring and adding, and using ∈² = x, we get
F(x) = x³ + d1 x² + d2 x + d3 = 0,    (7)
where d1 = Ω1² − 2Ω2, d2 = Ω2² − 2Ω1 Ω3 − Ω5², d3 = Ω3² − Ω4².
Thus, Eq. (7) has at least one positive root x = ∈0², i.e., ∈ = ∈0 satisfies (5) and (6). Then, the corresponding roots of the characteristic equation (2) are given by ±i∈0, which is a pair of purely imaginary values. This results in the occurrence of Hopf bifurcation. Putting ∈ = ∈0 in Eqs. (5) and (6) and calculating the value of τ, we have
τn∗ = (1/∈0) sin⁻¹[ ((Ω1 Ω5 − Ω4)∈0³ + (Ω2 Ω4 − Ω3 Ω5)∈0) / (Ω5² ∈0² + Ω4²) ] + 2nπ/∈0,  n = 0, 1, 2, 3, …    (8)
If the condition (T1) holds, then, for τ = 0, the characteristic equation (2) has all its roots with negative real parts. Hence, by the result of Lemma 2, the coexisting steady state is conditionally stable. The existence of Hopf bifurcation will depend upon the two conditions, as discussed in the theory of Hopf bifurcation in [31, 32]:
(i) A pair of eigenvalues ±i∈0 exists, which are purely imaginary and conjugate to each other. The real parts of all the other eigenvalues are negative.
(ii) The transversality condition is satisfied.
Next, for the verification of the transversality condition of Hopf bifurcation, we show that d(Reλ)/dτ ≠ 0 at Reλ = 0. Let the root of the characteristic equation (2) near τ = τ0∗ be λ(τ) = μ(τ) + i∈(τ), with the conditions that μ(τ0∗) = 0 and ∈(τ0∗) = ∈0. Substituting this value of λ(τ) into (2) and differentiating w.r.t. the time delay parameter τ, we get the result as
g1 dμ/dτ − g2 d∈/dτ = h1,    g2 dμ/dτ + g1 d∈/dτ = h2,    (9)
where
g1 = −3∈0² + Ω2 + Ω5 cos(∈0 τ0∗) − τ0∗ Ω4 cos(∈0 τ0∗) − ∈0 τ0∗ Ω5 sin(∈0 τ0∗),
g2 = 2∈0 Ω1 − Ω5 sin(∈0 τ0∗) + τ0∗ Ω4 sin(∈0 τ0∗) − ∈0 τ0∗ Ω5 cos(∈0 τ0∗),
h1 = ∈0 (Ω4 sin(∈0 τ0∗) − ∈0 Ω5 cos(∈0 τ0∗)),
h2 = ∈0 (Ω4 cos(∈0 τ0∗) + ∈0 Ω5 sin(∈0 τ0∗)).
Solving Eq. (9), we have
[dμ/dτ]_{μ=0} = (g1 h1 + g2 h2)/(g1² + g2²).
This equation can also be written as
[dμ/dτ]_{μ=0} = [∈0²/(g1² + g2²)] [dF(x)/dx]_{x=∈0²} ≠ 0.
Thus, the condition of transversality of Hopf bifurcation holds at τ = τ0∗. Hence, the conditions for the occurrence of Hopf bifurcation [33] are satisfied. On the basis of these analyses, we can state the results in the form of a theorem as follows.
Theorem 6 If the condition (T1) holds for the coexisting steady state E3(n∗, p∗, z∗) of the system (1), then
1. the coexisting steady state E3(n∗, p∗, z∗) is locally asymptotically stable (LAS) for 0 < τ < τ0∗;
2. the coexisting steady state E3(n∗, p∗, z∗) is unstable for τ > τ0∗;
3. at τ = τ0∗, the system undergoes Hopf bifurcation about the coexisting steady state E3.
6 Numerical Simulations In the present section, the dynamical behavior of the system will be discussed by means of numerical simulations. In this direction, the assumed set of parametric values for the system (1) are defined as follows: ⌃ = 5; c = 0.485; ∈ = 0.388; r = 0.15; K = 5; β = 0.2; a = 0.31; a1 = 0.237; α = 0.35; δ1 = 0.25; δ2 = 0.2.
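The simulations in this section were produced with MATLAB's DDE23 solver. As a rough, self-contained stand-in, the fixed-step Euler scheme below with a constant history buffer reproduces the same qualitative behavior; the step size, history values and run length are assumptions for illustration only.

```python
def simulate(tau, t_end=1000.0, dt=0.01, n0=0.03, p0=0.01, z0=0.13):
    # Parameter values quoted above.
    Lam, c, eps, r, K, beta = 5.0, 0.485, 0.388, 0.15, 5.0, 0.2
    a, a1, alpha, d1, d2 = 0.31, 0.237, 0.35, 0.25, 0.2

    lag = int(round(tau / dt))
    n, p, z = n0, p0, z0
    z_hist = [z0] * (lag + 1)            # constant history on [-tau, 0]
    trajectory = []
    for k in range(int(t_end / dt)):
        z_delayed = z_hist[0]            # approximates z(t - tau)
        dn = Lam - c * n - eps * n * p
        dp = (r * p * (1 - p / K) + beta * n * p
              - a * p * z_delayed / (1 + alpha * p) - d1 * p)
        dz = a1 * p * z / (1 + alpha * p) - d2 * z
        n, p, z = n + dt * dn, p + dt * dp, z + dt * dz
        z_hist.pop(0)
        z_hist.append(z)
        trajectory.append((k * dt, n, p, z))
    return trajectory

# For tau = 0 the state should approach E3 ~ (5.264, 1.198, 4.198), as in Fig. 1;
# larger delays produce the oscillatory and periodic behavior of Figs. 3 and 4.
print(simulate(0.0)[-1])
```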
Fig. 1 Solution curves indicating the local stability of the nutrient-plankton model with no time delay τ
Fig. 2 Convergence of solution trajectories to E 3 (5.264, 1.198, 4.198) at τ = 1.62 < τ0∗
With the parametric values as defined above, the condition of no time delay, i.e., τ = 0, and the initial values defined as n(t) = 0.03, p(t) = 0.01 and z(t) = 0.13, the system (1) converges to the asymptotically stable coexisting balance point E3(5.264, 1.198, 4.198), which is clearly shown in Fig. 1. With the help of the DDE23 package of MATLAB, the system is integrated with the delay parameter τ, and it achieves steadiness for τ = 1.62. The system shows stable dynamical behavior about the equilibrium point E3 at τ = 1.62, as shown in Fig. 2. The oscillatory behavior of the nutrient-plankton model at τ0∗ = 1.92 near the coexisting equilibrium point E3 is demonstrated in Fig. 3. But when we fix all the parameters and then slowly raise the value of the time delay parameter τ, the system exhibits small periodic orbits, and this results in the existence of Hopf bifurcation in the given system. Figure 4 shows the occurrence of a stable periodic solution of the system (1) at τ = 4.2.
Fig. 3 Oscillatory behavior of the nutrient-plankton model of system around E3 is shown at τ0∗ = 1.92
Fig. 4 Stable periodic solutions around E 3 for the nutrient-plankton model at τ = 4.2
Numerically, with the help of the parametric values as defined above, we can find the values of Ω1 = 0.7149, Ω2 = 0.2662, Ω3 = 0, Ω4 = 0.12275 and Ω5 = 0.1292. The condition for the Routh-Hurwitz criterion, (T1): Ω1(Ω2 + Ω5) − (Ω3 + Ω4) = 0.1599 > 0, is satisfied, i.e., there exists a unique positive root of Eq. (7) and a purely imaginary root i∈0 with ∈0 = 0.4251. From Eq. (8), we get τ = τ0∗ = 1.63. The equilibrium point E3(n∗, p∗, z∗) loses its stability when τ crosses the critical value τ0∗ = 1.63 (keeping all other parameters fixed). Also, to check the condition for Hopf bifurcation, we have g1 = −0.38764, g2 = 0.58416, h1 = 0.01536 and h2 = 0.05506. Therefore,
[dμ/dτ]_{μ=0} = (g1 h1 + g2 h2)/(g1² + g2²) = 0.05332.
This equation can also be written as
[dμ/dτ]_{μ=0} = [∈0²/(g1² + g2²)] [dF(x)/dx]_{x=∈0²} = 0.053 ≠ 0.
Thus, the condition of transversality for the occurrence of Hopf bifurcation is verified at τ = τ0∗ .
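The reported values ∈0 ≈ 0.425 and τ0∗ ≈ 1.63 follow from Eqs. (7) and (8); the short numerical check below recomputes them from the quoted Ω coefficients. It is an illustration in Python, not the authors' MATLAB code.

```python
import numpy as np

O1, O2, O3, O4, O5 = 0.7149, 0.2662, 0.0, 0.12275, 0.1292

# Condition (T1) for the delay-free case.
print("T1:", O1 * (O2 + O5) - (O3 + O4))                 # ~0.16 > 0

# Positive root of F(x) = x^3 + d1 x^2 + d2 x + d3 = 0, Eq. (7).
d1 = O1 ** 2 - 2 * O2
d2 = O2 ** 2 - 2 * O1 * O3 - O5 ** 2
d3 = O3 ** 2 - O4 ** 2
roots = np.roots([1.0, d1, d2, d3])
x0 = max(r.real for r in roots if abs(r.imag) < 1e-7 and r.real > 0)
w0 = np.sqrt(x0)                                          # ~0.425

# Critical delay from Eq. (8) with n = 0.
s = ((O1 * O5 - O4) * w0 ** 3 + (O2 * O4 - O3 * O5) * w0) / (O5 ** 2 * w0 ** 2 + O4 ** 2)
tau0 = float(np.arcsin(s) / w0)                           # ~1.63
print(w0, tau0)
```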
7 Conclusion
The nutrient-plankton model is of great importance, as the level of nutrients in water greatly affects the growth of phytoplankton. Phytoplankton absorb carbon dioxide and produce about 50% of the oxygen in the atmosphere. The phytoplankton are the food for zooplankton, which are further consumed by fish. Plankton are an important ecological tool and also have industrial and biotechnological potential, which can be used in commercial products. In this paper, we have presented a plankton model with nutrients described by a system of delay differential equations. Firstly, the system is analyzed for positive and bounded solutions. The existence of all the steady states and their steadiness analysis have been discussed. The boundary steady state E1(⌃/c, 0, 0) is stable if r + βn < δ1. The stability conditions of the zooplankton-free equilibrium point E2(n1, p1, 0) and the coexisting equilibrium point E3(n∗, p∗, z∗) have also been explained. The bifurcation parameter is taken to be the time delay parameter τ. The system is stable for delay values 0 < τ < τ0∗ = 1.63; beyond this critical value the system loses stability and Hopf bifurcation occurs. Thus, it is summarized that the stability of the nutrient-plankton system depends upon the delay parameter, and instability arises in the system for higher values of the delay parameter. The novelty of the present paper lies in the fact that we have taken the time delay in the predation of phytoplankton by the zooplankton population together with the Holling Type-II functional response. To the best of our knowledge, considering these kinds of terms is new, and it provides innovative ideas for future reference. This article is recommended to students, researchers and mathematicians working in the field of biological/ecological modeling, to apply the concept of nutrients with delay and the Holling-type functional responses in their future endeavors.
References 1. Dieudonne J (2013) Foundation of modern analysis. Read Books Ltd. 2. Kuang Y (1993) Delay differential equations with applications in population dynamics. Academic Press 3. Gopalsamy K (2013) Stability and oscillations in delay differential equations of population dynamics, vol 74. Springer Science and Business Media 4. Hale JK (1969) Ordinary differential equations. Wiley, New York
5. Saha T, Bandyopadhyay M (2009) Dynamical analysis of toxin producing phytoplanktonzooplankton interactions. Nonlinear Anal Real World Appl 10(1):314–332 6. Rehim M, Imran M (2012) Dynamical analysis of a delay model of phytoplankton-zooplankton interaction. Appl Math Model 36(2):638–647 7. Fan A, Han P, Wang K (2013) Global dynamics of a nutrient-plankton system in the water ecosystem. Appl Math Comput 219(15):8269–8276 8. Zhang T, Wang W (2012) Hopf bifurcation and bistability of a nutrient phytoplankton zooplankton model. Appl Math Model 36(12):6225–6235 9. Pardo O (2000) Global stability for a phytoplankton-nutrient system. J Biol Syst 8(2):195–209. https://doi.org/10.1142/S0218339000000122 10. Huppert A, Olinky R, Stone L (2004) Bottomup excitable models of phytoplankton blooms. Bull Math Biol 66(4):865–878. https://doi.org/10.1016/j.bulm.2004.01.003 11. Kumar R, Sharma AK, Agnihotri K (2018) Stability and bifurcation analysis of a delayed innovation diffusion model. Acta Math Sci 38(2):709–732 (2018). https://doi.org/10.1016/ S0252-9602(18)30776-8 12. Kumar R, Sharma AK (2021) Stability and Hopf bifurcation analysis of a delayed innovation diffusion model with intra-specific competition. Int J Bifurc Chaos 31(14):2150213. https:// doi.org/10.1142/S0218127421502138 13. Kumar R, Sharma AK, Agnihotri K (2020) Bifurcation behaviour of a nonlinear innovation diffusion model with external influences. Int J Dyn Syst Differ Equ 10(4):329–357 (2020). https://doi.org/10.1504/IJDSDE.2020.109107 14. Kumar R, Sharma AK, Agnihotri K (2022) Hopf bifurcation analysis in a multiple delayed innovation diffusion model with Holling II functional response. Math Methods Appl Sci 43(4):2056–2075. https://doi.org/10.1002/mma.6032 15. Ruan O (1995) The effect of delays on stability and persistence in plankton models. Nonlinear Anal Theory Methods Appl 24(4):575–585. https://doi.org/10.1016/0362-546X(95)93092-I 16. Das K, Ray S (2008) Effect of delay on nutrient cycling in phytoplankton zooplankton interactions in estuarine system. Ecol Model 215(1–3):69–76. https://doi.org/10.1016/j.ecolmodel. 2008.02.019 17. Chattopadhayay J, Sarkar RR, El Abdllaoui A (2002) A delay differential equation model on harmful algal blooms in the presence of toxic substances. Math Med Biol A J IMA 19(2):137– 161 18. Rehim M, Zhang Z, Muhammadhaji A (2016) Mathematical analysis of a nutrient-plankton system with delay. Springerplus 5(1):1055. https://doi.org/10.1186/s40064-016-2435-7 19. Meng XY, Wang JG, Huo HF (2018) Dynamical behaviour of a nutrient-plankton model with holling type IV, delay and harvesting. Discret Dyn Nat Soc 2018:9232590, 19 pages. https:// doi.org/10.1155/2018/9232590 20. Singh R,Tiwari SK, Ojha A, Thakur NK (2022) Dynamical study of nutrient-phytoplankton model with toxicity: effect of diffusion and time delay. Math Methods Appl Sci. https://doi. org/10.1002/mma.8523 21. Liang Y, Jia Y (2022) Stability and Hopf bifurcation of a diffusive plankton model with timedelay and mixed nonlinear functional responses. Chaos Solitons Fractals 163:112533. https:// doi.org/10.1016/j.chaos.2022.112533 22. Tiwari PK, Roy S, Misra AK, Upadhyay RK (2022) Effect of seasonality on a nutrient plankton system with toxicity in the presence of refuge and additional food. Eur Phys J Plus 137(3):368. https://doi.org/10.1140/epjp/s13360-022-02566-1 23. Kaur RP, Sharma A, Sharma AK (2021) Dynamics of a nutrient-plankton model with delay and toxicity. J Math Comput Sci 11(2):1076–1092. https://doi.org/10.28919/jmcs/5294 24. 
Birkhoff G, Rota G (1989) Ordinary differential equations. Ginn, Boston 25. Luenberger DGDG (1979) Introduction to dynamic systems: theory, models and applications 26. Song Y, Wei J, Han M (2005) Stability and Hopf bifurcation analysis on a simplified BAM neural network with delays. Phys D Nonlinear Phenom 200(3–4):185–204. https://doi.org/10. 1016/j.physd.2004.10.010
27. Boonrangsiman S, Bunwong K, Moore EJ (2016) A bifurcation path to Chaos in a time-delay fisheries predator-prey model with prey consumption by immature and mature predators. Math Comput Simul 124:16–29 28. Sharma A, Sharma AK, Agnihotri K (2014) The dynamic of plankton-nutrient interaction with delay. Appl Math Comput 231:503–515 29. Li F, Li H (2012) Hopf bifurcation of a predator-prey model with time delay and stage structure for the prey. Math Comput Model 55(3):672–679 30. Song Y, Han M, Wei J (2004) Local and global Hopf bifurcation in a delayed hematopoiesis model. Int J Bifurc Chaos 14(11):3909–3919 31. Edelstein-Keshet L (1988) Mathematical models in biology, vol 46. SIAM 32. Kuznetsov YA (2004) Elements of applied bifurcation theory, 3rd ed. Applied mathematical sciences, vol 112, Springer, New York 33. Hassard BD, Kazarinoff BD, Wan YH (1981) Theory and applications of Hopf bifurcation. CUP Archive, vol 41
Chapter 57
Enhanced Dragonfly-Based Secure Intelligent Vehicular System in Fog via Deep Learning
Anshu Devi, Ramesh Kait, and Virender Ranga
1 Introduction New technologies are most prevalent in the service and communications industries. Recent years have seen significant growth in the automation industry. Automobiles have progressed from mere modes of transportation to designing objects with personal, public, and social spaces. This represents a significant advancement in the design of automobiles. A vehicular ad-hoc network, also known as a VANET, is comprised of mobile vehicles that are both equipped with on-board processing units (OBPUs) and roadside units (RSUs). Both of these types of units offer assistance to other vehicles while they are operating on the road [1]. In Intelligent Transportation Systems, V2V and V2I network issues have been studied by several researchers. As the prevalence of green communication increases, there has been an increase in the energy required for remote communications. It makes use of the sending probability of the transmission power. Additionally, the nodes receive regular updates based on the convictions made. Other protocols reduce the number of dead nodes and save more energy, just like in a large-scale Wireless Sensor Network. This investigation seeks to determine the level of energy efficiency and base inactivity within a particular network that suffers from data loss. This protocol's primary goal is to reduce the amount of energy used by the WSN and, as a result, increase the amount of time it can remain operational. Data transfer using
moving vehicles is analysed by selecting the best path from an optimization and clustering model to transmit data to and from the source and destination nodes. The remainder of the sections is organized in the following fashion. The second section discusses related work. The proposed flowchart is discussed in detail in Sect. 3. Section 4 presents detailed simulation results and an analysis of simulation findings. Section 5 concludes with a discussion of potential future research.
2 Related Survey The following are the topics covered in this section of VANET-related work: The authors [2] used CAVDO in conjunction with MA-DTR and then compared the results to the baseline progressive techniques ant colony optimization (ACO), as well as the comprehensive learning particle swarm optimization (CLPSO). Several experiments were carried out with transmission dynamics for stable topology in mind. In many cases, CAVDO worked better because it gave the fewest clusters possible for the current channel condition. Clustering time, re-clustering delay, dynamic transmission range, direction, and speed are all critical clustering process parameters. According to these metrics, CAVDO outperforms ACO-based clustering and CLPSO in various network configurations. CAVDO also performs better than CLPSO overall. In addition, the incorporation of the features and capabilities of the next-generation network infrastructure is made possible through the use of architecture that is enabled for 5G communications. The authors [3] have proposed a new clustering-based optimization technique that will improve V2V communication even more. The K-Medoid clustering model is used to cluster vehicle nodes in this paper, and the results are then applied to the problem of increasing energy efficiency. An energy-efficient communication strategy is developed using a meta-heuristic algorithm. Simulated results show that the new method is more efficient in terms of time and energy consumption. The authors [4] present new intrusion detection models for fog-based IoT, taking into account a preliminary process, the extraction of features, and the detection of intrusions. Initial data normalization (considered pre-processing) is performed on the raw data. Following that, the pre-processed data is subjected to a feature extraction phase. This phase extracts features based on entropy, gain, and gain ratio. The extracted features are then subjected to an attack detection phase, in which an intruder is detected using attributed data by the Iterative Deep Neural Network (DNN). This stage occurs after the features have been extracted. The optimal training of a DNN is achieved by optimizing the weights. A Modified Electric Fish Optimization Algorithm performs this optimization (MEFO). Finally, the superiority of the proposed developed model is evaluated compared to existing techniques. The authors [5] suggested the intrusion detection system combines two algorithms to learn the vehicle’s boundary behaviour and detect intrusive behaviour. The accuracy and effectiveness of the model were evaluated using data taken from real vehicles. Experiments have demonstrated that both of these technologies can accu-
rately identify abnormal boundary behaviour. Time-based back propagation updates model parameters iteratively. This study’s model has a 96% detection accuracy rate. Furthermore, the authors [6] investigated a new algorithm for clustering using safety, density, and speed metrics. With NS3, SUMO, and MOVE simulation tools, the proposed solution was validated and compared to recent proposed works in this field (MADCCA and CAVDO). It also demonstrated that the design achieves superior node connectivity and cluster stability compared to competing protocols. Dragonflies, which exhibit novel swarming behaviours such as dynamic and static swarming, inspired the author [7] to create this work. FANET nodes use the knowledge from dragonflies to find a neighbour and maintain connectivity. In addition, they employ separation and alignment to prevent collisions and expand the coverage area. Using machine learning (ML) and the dragonfly algorithm, an isolated drone tries to rejoin the network if it becomes disconnected (DA). Expanding the coverage area while reducing the number of drones in isolation is the goal of the proposed plan. Improved network intelligence through DA-facilitated learning makes communication possible even when the network topology is unstable or changing. Using Deep Neural Networks (DNN) and Bat Algorithms (BA), the author [8] has devised a new method for managing vehicular traffic. The former is used to reroute vehicles around heavily congested routes to improve efficiency while simultaneously reducing the average time spent waiting at intersections. This latter method, combined with the Internet of Things, analyses the traffic congestion status between network nodes by utilizing Virtual Area Networks (VANETs). This experimental study aims to look at how DNN-IoT-BA performs in a variety of machine learning and deep learning algorithms applied to VANETs. DNN-IoT-BA validation requires packet delivery ratio, latency, and packet error rate. The authors [9] provide an in-depth examination of DA and its new variants, which are classified as modified or hybrid. Aside from that, it explains how DA can be used in many different areas, from machine learning to neural networks to image processing to robotics. The authors proposed [10] an architecture that considers the fact that the back propagation algorithm will eventually converge and consists of a standard CNN in addition to an error term that is fundamental to the problem. In the meantime, a probabilistic representation is used to provide a theoretical analysis of the proposed CNN-based deep architecture’s convergence. It is used to test the method’s accuracy by running it on the testbed. The authors [11] proposed a deep neural network-based anomaly detection system for vehicles. In order to distinguish standard vehicle data from abnormal vehicle data, employ a method known as sequence reconstruction.
3 Proposed Work In this paper, IoV-DF (Internet of Vehicle-based Dragonfly) is a mobile module connecting vehicle units to infrastructure units within VANETs. This module is included in the proposed architecture for the VANET. The IoV-DF (Internet of Vehicle-based Dragonfly) is in charge of uploading vehicle information to the DNN to facilitate data
packet distribution from the source to the sink node via cooperative DF (Internet of Vehicle-based Dragonfly) and the selective routing track. Because IoV-DF (Internet of Vehicle-based Dragonfly) contains a diverse data set, each system has its own identity. The fog stores current vehicle data as it passes through a road segment, and DNN (Deep Neural Network), a packet transmission indicator, defines the routing path. Lastly, the data storage area contains all the information from the vehicle's various units. In this case, the DNN (Deep Neural Network) routing path is discovered in the RSU (Road Side) Infrastructure Unit. Figure 1 depicts the nodes that have to be deployed. Attributes of vehicles, such as location and power consumption (how much power a node consumes when it transfers a packet), are generated. After that, the source and destination are chosen, and data has to be sent from the source to the destination. If the destination node is in range, the data is sent directly; otherwise, a route request is broadcast. The responding nodes are put into k groups by using the Dragonfly clustering algorithm. Every respondent has to provide its network share to remain in the network. The fog server then computes the network key from the respondents' network shares using E-Lagrange interpolation. It preserves a single global key for the whole network; as a result, the global key is used to identify each vehicle. Because it is insecure to distribute the global key across the vehicles, the vehicles use a shared system. When a vehicle asks a fog server for information, directly or through an RSU, the fog server will either ask any vehicle in the network for three shares or choose two at random; the requesting vehicle will be one of the three shares considered. For the following calculation, the fog server employs the Lagrange polynomial and computes the predicted share, which is compared with the bounds B1 and B2. If the share lies within the range, the route node is added; if it does not, the route node is dropped. After that, the next respondent is picked and the check is repeated each time; once the destination is reached, there is no need to pick the next one, and the route is complete. After that, the quality of service parameters are evaluated and stored in the fog repository. Faulty and non-faulty node segregation is then done using the fog repository: to discriminate between faulty and non-faulty nodes, cosine similarity and Euclidean distance are evaluated to find the correlation among nodes. Faulty and non-faulty nodes are stored in group 1 and group 2. Then these nodes are trained using training credentials from the deep neural network.
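The exact E-Lagrange construction, the share format, and the acceptance bounds B1/B2 are not spelled out above, so the sketch below only illustrates the general idea with a standard Shamir-style 2-of-n scheme: the fog server reconstructs the network key by Lagrange interpolation through two known shares and accepts a requesting vehicle only if its share lies within an assumed tolerance band around the value predicted by the same polynomial. All names and numbers here are illustrative assumptions.

```python
def lagrange_value(points, x):
    """Evaluate the Lagrange interpolating polynomial through `points` at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def verify_share(known_shares, candidate, band=(0.95, 1.05)):
    """Accept the candidate share if it lies within the band B1..B2 of the value
    predicted by the polynomial through the known shares (assumed check)."""
    predicted = lagrange_value(known_shares, candidate[0])
    b1, b2 = band
    return b1 * predicted <= candidate[1] <= b2 * predicted

# Toy shares of a degree-1 polynomial f(x) = key + 3x with key = 42.
shares = [(1, 45.0), (2, 48.0)]
network_key = lagrange_value(shares, 0)                   # reconstructs 42
print(network_key, verify_share(shares, (3, 51.0)), verify_share(shares, (3, 70.0)))
```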
3.1 Network Architecture for VANET In the method that has been proposed, VANETs collaborate with the IoV-DA (Internet of the vehicle with dragonfly algorithm), which helps the infrastructure unit and integrates with the vehicle unit for routing. At the network level, the infrastructure of the network is located, and robust routing paths can be determined using DNNs based on the position and speed of the vehicle (DNN architecture with optimization for enhanced Dragonfly algorithm in Fig. 2).
Fig. 1 Enhanced dragonfly based secure VANET communication in Fog via deep learning
The workflow of the infrastructure unit is depicted in Fig. 2.
3.2 Dragonfly Algorithm Optimization methods [13–15], such as the improved dragonfly algorithm, are employed to lower the overall energy consumption. The dragonfly algorithm is based on the traits of dragonflies, which act like predators by chasing away smaller insects. The DA algorithm is built on dynamic and static swarming particles. These two
Fig. 2 Deep learning workflow (initialize the DNN's parameters; train the DNN with training data from the fog repository; calculate until the optimal DNN model parameters are achieved)
swarming practices optimize by exploiting and exploring meta-heuristics. In the same way, the improved function ignores the neighbourhood's ideal; it is used to reduce the likelihood of being caught. The following formula can be used to initialize the dragonflies or vehicle nodes [3, 12] (Fig. 3):
D_j = D_1, D_2, D_3, …, D_m, where j = 1, 2, 3, …, m.
Figure 4 depicts the five basic primitive principles utilized to design the swarm intelligence of the dragonfly. Let us assume that P denotes the current vehicle's position, P_j is the position of the jth neighbouring vehicle, and M is the number of neighbouring vehicles. Mathematically, the primitive principles can be defined as follows [9]:
1. Separation: It represents collision avoidance in a static fashion that can be followed by each vehicle to avoid collisions with others in the neighbourhood of the vehicles. Mathematically, separation can be defined as in Eq. 1:
Fig. 3 The static and dynamic swarming behaviours of dragonflies [9]
Fig. 4 Primitive corrective patterns between dragonflies in a swarm [9]
S_i = −Σ_{j=1}^{M} (P − P_j)    (1)
2. Alignment: It specifies the velocity matching of individuals among other neighbouring vehicles of the same group. Mathematically, it can be represented as in Eq. 2:
A_i = (Σ_{j=1}^{M} V_j)/M    (2)
where V_j denotes the velocity of the jth vehicle.
3. Cohesion: Cohesion is the tendency towards the swarm group, as described in Eq. 3:
C_i = (Σ_{j=1}^{M} P_j)/M − P    (3)
4. Attraction: Attraction towards the food source can be defined as in Eq. 4:
F_i = FP − P    (4)
where F_i = the food source of the ith vehicle and FP = the position of the food source.
5. Distraction from the enemies: Distraction from the enemies can be defined as in Eq. 5:
E_i = EP + P    (5)
where E_i = the position of the enemy of the ith vehicle and EP = the enemy's position.
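Equations (1)-(5) translate directly into code. The sketch below computes the five behaviour vectors for one vehicle from the positions and velocities of its M neighbours; it is an illustration of the formulas above, not the authors' implementation.

```python
import numpy as np

def dragonfly_primitives(P, neighbour_positions, neighbour_velocities, food_pos, enemy_pos):
    """Separation, alignment, cohesion, attraction and distraction, Eqs. (1)-(5)."""
    P = np.asarray(P, dtype=float)
    NP = np.asarray(neighbour_positions, dtype=float)   # M x d positions
    NV = np.asarray(neighbour_velocities, dtype=float)  # M x d velocities
    S = -np.sum(P - NP, axis=0)                         # Eq. (1) separation
    A = NV.mean(axis=0)                                 # Eq. (2) alignment
    C = NP.mean(axis=0) - P                             # Eq. (3) cohesion
    F = np.asarray(food_pos, dtype=float) - P           # Eq. (4) attraction to food
    E = np.asarray(enemy_pos, dtype=float) + P          # Eq. (5) distraction from enemy
    return S, A, C, F, E

# Toy example with two neighbours in 2-D.
print(dragonfly_primitives([0, 0], [[1, 0], [0, 1]], [[0.5, 0], [0, 0.5]],
                           food_pos=[2, 2], enemy_pos=[-1, -1]))
```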
4 Result Analysis
The simulation setup is depicted in Fig. 5 and listed in Table 1. As shown in Table 1, we used the MATLAB network simulator to build a network with 50 nodes and a simulation area of 1000 * 1000 m. The average speed of a node is 50–100 m/s, according to the best estimates, depending on its location. Throughput, energy consumption, and jitter are three quality of service parameters that are measured and reported, as shown in Fig. 6. In this particular instance, five simulations were run, yielding 300,000 packets. Once each route has been evaluated, the throughput, jitter, and energy consumption of five different routes are calculated based on the evaluation results.
Fig. 5 Finding optimized route
Table 1 Simulation parameters
Node_id | Value
Total node | 50
Simulation area | 1000 * 1000 m
Simulator | MATLAB
Node speed | 50−100 m/s
Packets injection rate | 300,000 per second
Fig. 6 Deep neural network (throughput, PDR, and energy consumption)
5 Conclusion
This paper demonstrates DNN-IoV-DA (a deep neural network in the Internet of Vehicles with the Dragonfly algorithm), which allows for the adequate transportation of vehicles on highways and other high-speed roads. The DNN-IoV-DA is an effective method for reducing wasteful energy use. It investigates the entire network to determine the moving state of each vehicle connected to the VANETs and optimizes routing decisions based on its inputs and higher network traffic congestion. The DNN can make routing decisions more quickly and effectively; it also provides a way to set up the best routes, which helps reduce network congestion faster. The use of the routing table for performing routine vehicle updates at RSUs guarantees both the most effective choice of vehicles and consistent routing decisions. DNN-IoV-DA offers faster routing decisions. In addition, these results demonstrate a lower likelihood of a connection being severed when using DNN-IoV-DA, allowing more data to be transmitted. Continuous DNN-IoV-DA monitoring has optimized traffic density, ensuring efficient packet delivery to destination nodes. In the near future, traffic management will use alternative deep learning techniques to manage vehicles with varied traffic information.
References 1. Devi A, Kait R, Ranga V (2019) Security challenges in fog computing. In: Handbook of research on the IoT, cloud computing, and wireless network optimization. IGI Global, pp 148–164. https://doi.org/10.4018/978_1_5225_7335_7.ch008 2. Aadil F, Ahsan W, Rehman ZU, Shah PA, Rho S, Mehmood I (2018) Clustering algorithm for internet of vehicles (IoV) based on dragonfly optimizer (CAVDO). J Supercomput 74(9):4542– 4567. https://doi.org/10.1007/s11227_018_2305_x 3. Chen JIZ, Hengjinda P (2021) Enhanced dragonfly algorithm based K-medoid clustering model for VANET. J ISMAC 3(01):50–59. https://doi.org/10.36548/jismac.2021.1.005 4. Abdussami AA (2021) Incremental deep neural network intrusion detection in fog based IoT environment: an optimization assisted framework. Indian J Comput Sci Eng 12(6):1847–1859. https://doi.org/10.21817/indjcse/2021/v12i6/211206191 5. Fei L, Jiayan Z, Jiaqi S, Szczerbicki E (2020) Deep learning-based intrusion system for vehicular ad-hoc networks. CMC-Comput Mater Contin 65:653–681. https://doi.org/10.32604/cmc. 2020.011264 6. Gasmi R, Aliouat M (2020) A weight based clustering algorithm for internet of vehicles. Autom Control Comput Sci 54(6):493–500. https://doi.org/10.3103/S0146411620060036 7. Hameed S, Minhas QA, Ahmad S, Ullah F, Khan A, Khan A, Hua Q (2022) Connectivity of drones in FANETs using biologically inspired dragonfly algorithm (DA) through machine learning. Wirel Commun Mobile Comput. https://doi.org/10.1155/2022/5432023 8. Kannan S, Dhiman G, Natarajan Y, Sharma A, Mohanty SN, Soni M, Gheisari M (2021) Ubiquitous vehicular ad-hoc network computing using deep neural network with IoT-based bat agents for traffic management. Electronics 10(7):785. https://doi.org/10.3390/electronics10070785 9. Meraihi Y, Ramdane-Cherif A, Acheli D, Mahseur M (2020) Dragonfly algorithm: a comprehensive review and applications. Neural Comput Appl 32(21):16625–16646. https://doi.org/ 10.1007/s00521_020_04866_y
10. Nie L, Ning Z, Wang X, Hu X, Cheng J, Li Y (2020) Data-driven intrusion detection for intelligent internet of vehicles: a deep convolutional neural network-based method. IEEE Trans Netw Sci Eng 7(4):2219–2230. https://doi.org/10.1109/TNSE.2020.2990984 11. Alladi T, Agrawal A, Gera B, Chamola V, Sikdar B, Guizani M (2021) Deep neural networks for securing IoT enabled vehicular ad-hoc networks. In: ICC 2021-IEEE international conference on communications. IEEE, pp 1–6. https://doi.org/10.1109/ICC42927.2021.9500823 12. Rahman CM, Rashid TA (2019) Dragonfly algorithm and its applications in applied science survey. Comput Intell Neurosci 13. Josephen WF, Warnars HLHS, Abdurrachman E, Assiroj P, Kistijantoro AI, Doucet A (2021) Dragonfly algorithm in 2020. Commun Math Biol Neurosci (2021) 14. Kumar V, Dhurandher SK, Tushir B, Obaidat MS (2016) Channel allocation in cognitive radio networks using evolutionary technique. In: International conference on wireless networks and mobile systems, vol 2. SCITEPRESS, pp 106–112 15. Phull N, Singh P, Shabaz M, Sammy F (2022) Performance enhancement of cluster-based Ad Hoc on-demand distance vector routing in vehicular Ad Hoc networks. Sci Program
Chapter 58
Towards Ranking of Gene Regulatory Network Inference Methods Based on Prediction Quality
Softya Sebastian and Swarup Roy
1 Introduction The various cellular processes that take place within an organism are the result of intricate interactions between biological molecules like genes, mRNAs, and proteins. A biomolecular interaction network or graph is frequently used to visualize these interactions. A biomolecule is represented by each node, whereas an interaction or association between two or even a complex mixture of biomolecules is represented by an edge [1, 2]. Gene regulatory networks, transcription regulatory networks, protein interaction networks, metabolic networks, signaling networks, and hybrid networks are examples of biomolecular networks [3]. All cell systems have these networks, which carry out their fundamental and crucial functions in both creating and maintaining new life. By generating a vast amount of information regarding interactions, networks, functional modules [4, 5], routes, and pathways, high-throughput experimental approaches have made it possible to analyze biomolecular networks. Gene Regulatory Networks (GRNs) are crucial for understanding the activities of genes and how they interact with other genes because they provide insight into the workings of intricate biological systems. As a result, numerous methods for inferring GRNs from expression profiles of genes have been devised [3, 6–16]. A massive amount of gene expression data from hundreds of genes having multiple time-series expressions or samples is necessary for the challenging task of GRN inference. Most GRN inference methods that have different or even the same inference approaches with minor variations were proposed keeping in mind the methods already proposed, seeking to outperform them in terms of running time and accuracy parameters such as the Area Under Receiver Operating Characteristic (AUROC) and Area Under Precision and Recall (AUPR). However, these metrics rarely comprehensively
assess the inferred network’s quality. As a result, we know little about the prediction quality of GRN inference methods. Therefore, in this paper, we take ten different GRN inference methods, perform an extensive comparative analysis of their prediction quality, and report them in detail. To the best of our knowledge, no study has done such an extensive analysis of GRN inference methods in terms of inference quality.
2 GRN Inference Methods
We chose ten GRN inference methods having disparate approaches for inference, with different underlying mathematical rationales, as seen in Fig. 1, and whose working implementation codes are available. While two methods infer directed networks, the rest infer undirected networks. They are briefly introduced next. The Context Likelihood of Relatedness (CLR) algorithm [6] first computes the mutual information (MI) between genes, then determines the statistical likelihood of the computed MI values, and finally uses an adaptive background correction to remove false edges and give the network. The popular Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE) [7] estimates the MI for each pair of genes and filters it subsequently to obtain the network, out of which the indirect interactions are eliminated using the Data Processing Inequality (DPI). The BC3NET (Bagging C3NET) [10] algorithm is an improved version of the C3NET algorithm [17]. First, a dataset of s samples is sampled using a non-parametric bootstrap to create an ensemble of separate bootstrap datasets. Then, C3NET infers a network for each dataset generated in the ensemble. The final binary undirected network is formed by creating one weighted network from this group of networks to obtain the statistical significance of the relationship between gene pairs.
Fig. 1 Types of GRN inference approaches
GeneNet [13] statistically infers the network by first obtaining a partial correlation graph from an inferred correlation network and then building a directed acyclic causal network from the graph. MRNET [8] uses the maximum relevance/minimum redundancy (MRMR) feature selection technique for gene selection. An edge in the network generated means that one of the corresponding two genes is a well-ranked predictor of the other. MRNETB (MRNET Backward) [9] is an improvisation that attempts to overcome MRNET’s drawbacks by first utilizing a backward selection technique and then by sequential replacement. MutRank [11] uses Pearson’s method to compute the correlation between pairs of genes and then ranks them so as to score the similarity between them. Since the scores are asymmetric, the geometric mean is used to finally assign the confidence score between two genes in the network. MINE (Maximal Information-Based Nonparametric Exploration) [15] uses a symmetric correlation measure that ranges in [0, 1] to show the relationship strength between genes. It tends to 0 for statistically independent data while it approaches 1 for noiseless functional relationships. PCOR [12] builds the network by computing the partial correlations between all pairs of genes. It measures the relationship between two genes while controlling for the effect of other genes. G1DBN [14] utilizes the low-order conditional dependence graph concept which is extended to Dynamic Bayesian Networks (DBN) that are an extension of Bayesian Networks. It infers the uncertainties in interactions among genes using probabilistic graphical models which tells us about the dependencies between genes qualitatively. Table 1 presents the summary of the methods discussed above.
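As a concrete illustration of the mutual-information family summarized above, the sketch below estimates a pairwise MI matrix by histogram binning and applies a CLR-style background z-scoring. It is a simplified stand-in written in Python for illustration, not the minet/bc3net implementations listed in Table 1.

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Histogram-based MI estimate between two expression profiles."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def clr_like_scores(expr):
    """expr: genes x samples. Each MI value is z-scored against the background
    of both genes and combined, in the spirit of CLR."""
    g = expr.shape[0]
    mi = np.zeros((g, g))
    for i in range(g):
        for j in range(i + 1, g):
            mi[i, j] = mi[j, i] = mutual_information(expr[i], expr[j])
    mu = mi.mean(axis=1, keepdims=True)
    sd = mi.std(axis=1, keepdims=True) + 1e-12
    z = np.maximum((mi - mu) / sd, 0.0)
    return np.sqrt(z ** 2 + z.T ** 2)

rng = np.random.default_rng(0)
expr = rng.normal(size=(5, 100))          # 5 genes, 100 samples (toy data)
print(clr_like_scores(expr).round(2))
```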
3 Experimental Evaluation In this section, we first describe the datasets used, followed by the metrics used to assess the prediction quality of the methods, and then present the results and a brief discussion thereon.
3.1 Datasets In order to quantify the prediction quality of the popular GRN inference methods chosen, we utilize simulated datasets generated using Gene Net Weaver (GNW) [18], which is a Java tool that helps generate realistic in silico benchmarks. The underlying true networks of the synthetic expression profiles generated are compared with the networks inferred by the GRN inference methods from those expression profiles to determine their performance. The different synthetic expression profiles are detailed in Table 2.
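To make the comparison against the underlying true networks concrete, the sketch below shows one way to turn a gold-standard edge list into a binary adjacency matrix that an inferred network can be scored against. The three-column, tab-separated layout, the file name and the helper name are assumptions made for illustration and are not a description of GNW's actual export format; adjust the parsing to the files you generate.

```python
import numpy as np
import pandas as pd

def load_gold_standard(path, genes):
    """Binary adjacency matrix from an edge list with columns: regulator, target, flag (1 = true edge)."""
    idx = {g: i for i, g in enumerate(genes)}
    adj = np.zeros((len(genes), len(genes)), dtype=int)
    edges = pd.read_csv(path, sep="\t", header=None, names=["src", "dst", "flag"])
    for _, row in edges.iterrows():
        if int(row["flag"]) == 1:
            adj[idx[row["src"]], idx[row["dst"]]] = 1
    return adj

# hypothetical usage for the smallest dataset:
# genes = [f"G{i}" for i in range(1, 101)]
# gold = load_gold_standard("Ec6_goldstandard.tsv", genes)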
Table 1 Summary of the GRN inference methods

Inference method | Inference approach | Network | Implementation code
CLR | Mutual information with background correction | Undirected | https://www.bioconductor.org/packages/minet/
ARACNE | Mutual information, DPI | Undirected | https://www.bioconductor.org/packages/minet/
BC3NET | Bagging C3NET (maximal mutual information) | Undirected | https://CRAN.R-project.org/package=bc3net
GeneNet | Directed acyclic causal network from a partial correlation network | Directed | https://CRAN.R-project.org/package=GeneNet
MRNET | Supervised feature selection | Undirected | https://www.bioconductor.org/packages/minet/
MRNETB | Backwards MRNET, sequential replacement | Undirected | https://www.bioconductor.org/packages/minet/
MutRank | Ranked correlation, geometric mean | Undirected | https://www.bioconductor.org/packages/netbenchmark/
MINE | Symmetric correlation based on maximal information | Undirected | https://CRAN.R-project.org/package=minerva
PCOR | Partial correlation | Undirected | https://CRAN.R-project.org/package=ppcor
G1DBN | Low-order conditional dependence graph extended to DBN | Directed | https://CRAN.R-project.org/package=G1DBN

Table 2 Features of the GNW synthetic datasets

Dataset name | Species | #Genes | #Time points | Size of matrix | #True edges | #Total interactions
Yt1 | Yeast | 4000 | 510 | 2,040,000 | 22,628 | 15,996,000
Yt2 | Yeast | 3000 | 510 | 1,530,000 | 16,046 | 8,997,000
Yt3 | Yeast | 2000 | 510 | 1,020,000 | 10,496 | 3,998,000
Ec4 | E. coli | 1000 | 510 | 510,000 | 4018 | 999,000
Ec5 | E. coli | 500 | 510 | 255,000 | 2638 | 249,500
Ec6 | E. coli | 100 | 510 | 51,000 | 314 | 9,900
Table 3 Performance metric scores

Performance metric | Definition | Eq.
Sensitivity | TP/(TP + FN) | (1)
Specificity/recall | TN/(TN + FP) | (2)
PPV | TP/(TP + FP) | (3)
NPV | TN/(TN + FN) | (4)
Detection rate | TP/(TP + FP + TN + FN) | (5)
Detection prevalence | (TP + FP)/(TP + FP + TN + FN) | (6)
Prevalence | (TP + FN)/(TP + FP + TN + FN) | (7)
Precision | TP/(TP + FP) | (8)
F1 | 2 · (Precision · Recall)/(Precision + Recall) | (9)
Balanced accuracy | (Sensitivity + Specificity)/2 | (10)
Observed accuracy | (TP + TN)/(TP + FP + TN + FN) | (11)
Expected accuracy | [Prevalence · Detection Prevalence] + [(1 − Detection Prevalence) · (1 − Prevalence)] | (12)
Cohen's Kappa, κ | (Predicted Accuracy − Expected Accuracy)/(1 − Expected Accuracy) | (13)
3.2 Performance Evaluation Metrics
We use twelve standard statistical metrics to quantify the prediction quality of the GRN inference methods. We first compute the True Positives (TP), False Negatives (FN), True Negatives (TN), and False Positives (FP) and then compute the evaluation metrics in the context of GRNs. We use Sensitivity to evaluate the ability of a method to infer true edges correctly; Specificity or Recall to evaluate the ability of a method to infer absent edges correctly; Positive Predictive Value (PPV) for the proportion of edges inferred as present that are correct; Negative Predictive Value (NPV) for the proportion of edges inferred as absent that are correct; Detection Rate for the proportion of all network edges in which true edges are detected correctly; Detection Prevalence for the number of edges predicted as true as a proportion of all edges; Prevalence for the fraction of all edges that are actually true; Precision for the ratio of correctly inferred true edges to the total number of edges inferred as true; and Observed Accuracy for the number of edges inferred correctly as a fraction of the total number of inferred edges. These metrics alone are not enough to describe the quality of the output networks because, in the GRN context, the datasets are imbalanced: there are far fewer true edges than absent edges. Therefore, we additionally use the F1 score, the harmonic mean of precision and recall; Balanced Accuracy, the arithmetic mean of sensitivity and specificity; and Cohen's Kappa score, which compares the predicted accuracy, i.e. the observed accuracy (11), with the expected accuracy that any random prediction could attain. The definitions of all these metrics are given in Table 3.
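For readers who want to recompute these scores, a direct transcription of Table 3 into code is given below. It is a plain illustration of the formulas above rather than any package's implementation; note that, following Table 3, "recall" here denotes specificity, which differs from the more common convention of recall meaning sensitivity.

```python
def grn_metrics(tp, fn, tn, fp):
    """Scores of Table 3 computed from the confusion counts of one inferred network."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)            # called "recall" in Table 3
    ppv = precision = tp / (tp + fp)
    npv = tn / (tn + fn)
    detection_rate = tp / total
    detection_prevalence = (tp + fp) / total
    prevalence = (tp + fn) / total
    recall = specificity                     # paper's convention (Table 3)
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (sensitivity + specificity) / 2
    observed_accuracy = (tp + tn) / total
    expected_accuracy = (prevalence * detection_prevalence
                         + (1 - detection_prevalence) * (1 - prevalence))
    kappa = (observed_accuracy - expected_accuracy) / (1 - expected_accuracy)
    return dict(sensitivity=sensitivity, specificity=specificity, ppv=ppv, npv=npv,
                detection_rate=detection_rate, detection_prevalence=detection_prevalence,
                prevalence=prevalence, f1=f1, balanced_accuracy=balanced_accuracy,
                observed_accuracy=observed_accuracy, kappa=kappa)

# toy counts for a sparse gold standard: few true edges, many absent edges
print(grn_metrics(tp=120, fn=880, tn=95_000, fp=4_000))
```

With counts like these, the observed accuracy stays high even though the sensitivity is poor, which is exactly the imbalance effect discussed above.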
Fig. 2 The comparison of the TPs, TNs, FPs, and FNs obtained on implementing all the GRN inference methods for the different networks
3.3 Assessment of the Prediction Quality of GRN Inference Methods
We inferred networks for all the datasets with all the algorithms except G1DBN, which did not execute for datasets with more than 2000 genes. After obtaining the inferred networks, we compared them with their respective gold standard networks and obtained the TPs, FNs, TNs, and FPs for all the networks, as shown in Fig. 2. We then computed the metrics (1-13) for all the methods and report them individually for each dataset in Fig. 3. We also average the scores over all the datasets and then rank the methods from best-performing to worst-performing based on the average scores.
The averaged scores for the metrics for all methods, in order of their ranks, are reported in Table 4. From Figs. 2 and 3 and Table 4, it is clear that the methods perform dismally. The values of observed accuracy are high for some methods, but this does not reflect the actual accuracy because the datasets are highly imbalanced, i.e., the number of true edges is tiny compared to the number of absent edges (refer to Table 2), resulting in a very low prevalence for all datasets. We therefore rely on the other metrics to draw meaningful conclusions. The sensitivity of most methods is extremely low. This is because
Fig. 3 The comparison of the prediction quality of all the GRN inference methods for the different networks using the performance metrics (1–13)
these methods infer only 1–40% of the true edges, as seen from Fig. 2. This is also why the precision or PPV of all methods is negligible. Even though the methods infer scant true edges, the number of FPs is very high, so the detection prevalence of all methods is dominated by false edges. This is a serious disadvantage: if these methods are applied to real networks, the majority of the interactions inferred as present will in fact be absent, which renders the inferred network useless. The specificity or recall is, in contrast, very high for most methods, but in the context of GRNs this is not significant, because even if most absent edges are correctly inferred while the true edges remain sparse, the network is still of little use. High specificity with low sensitivity is of practically no value for GRNs, and this is what we observe for most methods. The high NPV of most methods is the result of high
specificity and the overwhelming number of absent edges. High specificity and NPV show that most methods are exceptionally good at not inferring absent edges as true edges, whereas the whole purpose of inference is to detect the true interactions correctly. The imbalanced datasets are a major reason for the mediocre performance of most methods, and we therefore turn to the F1 score and balanced accuracy to account for the imbalance. The F1 score of all methods approaches 0, while the balanced accuracy of all but four methods is below 0.55; this is a consequence of the very low sensitivity. The four methods with modest balanced accuracy are CLR, MRNETB, MRNET, and G1DBN. Overall, the methods consistently underperform. Even for the last metric, Cohen's Kappa score, we see consistently low values for all methods; a low value of κ signifies a low level of agreement between the inferred network and the gold standard network. The methods thus show consistently poor performance on most of the metrics intended to capture the capability of an inference method to predict the true edges accurately.
3.4 Ranking the GRN Inference Methods Based on Prediction Quality
For each of the eleven metrics, the mean score across all datasets is computed for each of the ten methods, and the methods are ranked from 1 to 10 on that metric: the score closest to one is ranked 1, while the score closest to zero is ranked 10. To determine the overall ranking of the methods, the ranks attained over the 11 evaluation metrics are averaged. Given the vector of rank scores {R_1, R_2, ..., R_N} of a method i over the N = 11 metrics, the overall rank score of i is

Rank_i = (Σ_{j=1}^{N} R_j)/N.    (1)

After ranking the methods, we report their ranking scores, along with their averaged performance metric scores, in Table 4. The methods MINE, CLR, G1DBN, and BC3NET are thus adjudged the best-performing methods, while all the other algorithms can be said to show below-average performance. Researchers will find this ranking table, along with Table 1, helpful when choosing between the inference methods.
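The ranking step itself reduces to a few lines of code; the sketch below assumes the averaged metric scores are collected in a small methods-by-metrics table, and the numbers shown are illustrative rather than taken from Table 4.

```python
import pandas as pd

# rows = methods, columns = averaged metric scores (illustrative values only)
scores = pd.DataFrame(
    {"sensitivity": [0.05, 0.55], "specificity": [0.98, 0.60], "f1": [0.08, 0.02]},
    index=["MINE", "CLR"],
)

# For every metric, the score closest to one gets rank 1 and the score closest
# to zero gets the worst rank, as described above.
per_metric_ranks = scores.rank(ascending=False, method="min")

# Overall ranking score: the mean of a method's ranks across all metrics.
ranking_score = per_metric_ranks.mean(axis=1)
print(ranking_score.sort_values())
```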
4 Conclusion In this paper, we perform an extensive comparative analysis of the prediction quality of ten popular GRN inference methods using synthetic expression profiles from GNW
Table 4 The rankings of the ten GRN inference methods and their averaged performance metrics scores

Rank | Method | Ranking score | Sensitivity | Specificity/recall | PPV | NPV | F1 | Detection rate | Detection prevalence | Balanced accuracy | Observed accuracy | Cohen's Kappa
1 | MINE | 4.0909 | 0.0451 | 0.9804 | 0.0441 | 0.9918 | 0.0839 | 0.0010 | 0.0200 | 0.5128 | 0.9734 | 0.0239
2 | CLR | 4.5455 | 0.5466 | 0.6019 | 0.0103 | 0.9925 | 0.0197 | 0.0047 | 0.3988 | 0.5742 | 0.6020 | 0.0033
3 | G1DBN | 4.8182 | 0.2511 | 0.8682 | 0.0200 | 0.9892 | 0.0387 | 0.0036 | 0.1329 | 0.5597 | 0.8623 | 0.0149
4 | BC3NET | 5.0000 | 0.0331 | 0.9890 | 0.0246 | 0.9915 | 0.0477 | 0.0004 | 0.0112 | 0.5111 | 0.9810 | 0.0185
5 | MRNETB | 5.1818 | 0.5352 | 0.6034 | 0.0101 | 0.9923 | 0.0195 | 0.0045 | 0.3972 | 0.5693 | 0.6032 | 0.0030
6 | GeneNet | 5.7273 | 0.1280 | 0.9048 | 0.0117 | 0.9916 | 0.0228 | 0.0006 | 0.0953 | 0.5164 | 0.8973 | 0.0035
7 | MRNET | 5.9091 | 0.5327 | 0.6052 | 0.0101 | 0.9923 | 0.0194 | 0.0045 | 0.3954 | 0.5689 | 0.6049 | 0.0029
8 | ARACNE | 6.0909 | 0.0206 | 0.9899 | 0.0178 | 0.9914 | 0.0295 | 0.0003 | 0.0102 | 0.5052 | 0.9818 | 0.0098
9 | MutRank | 6.8182 | 0.3379 | 0.7156 | 0.0096 | 0.9917 | 0.0186 | 0.0025 | 0.2846 | 0.5268 | 0.7117 | 0.0018
9 | PCOR | 6.8182 | 0.5630 | 0.4544 | 0.0093 | 0.9923 | 0.0175 | 0.0055 | 0.5460 | 0.5087 | 0.4563 | 0.0013
to rank them effectively. We find that only four methods (MINE, CLR, G1DBN, and BC3NET) perform satisfactorily and emerge as the best-performing methods, in that order. Although the selected methods follow different inference approaches with distinct underlying mathematical rationales, these dissimilar methods show similarly poor performance in predicting the true edges accurately. The imbalanced datasets are partly responsible for the dismal performance of most methods, but the low percentage of recovered true edges ultimately indicates that the methods are simply incapable of accurate inference. These findings serve as a caution to researchers who choose simply to parallelize existing methods in order to speed up execution time. Beyond improving execution time, it is very important to focus on improving the quality of the networks inferred by the methods; there is no use for a fast-executing method if it does not infer at least 80% of the true edges. Hence, future research in the area of GRN inference needs to focus on inference methods that are not only scalable but also capable of reconstructing accurate networks from expression profiles.
Acknowledgements This research was funded by the Department of Science & Technology (DST), Govt. of India, under the DST-ICPS Data Science program [DST/ICPS/Cluster/Data Science/General], and carried out at NetRA Lab, Sikkim University.
References 1. Guzzi PH, Roy S (2020) Biological network analysis: trends, approaches, graph theory, and algorithms. Academic Press 2. Roy S, Bhattacharyya DK, Kalita JK (2014) BMC Bioinf 15(7):S10 3. Butte AJ, Kohane IS (1999) Biocomputing 2000. World Scientific, pp 418–429 4. Roy S, Bhattacharyya DK, Kalita JK (2015) Microarray data analysis. Springer, pp 91–103 5. Sharma P, Ahmed HA, Roy S, Bhattacharyya DK (2015) Netw Model Anal Health Inf Bioinf 4(1):1 6. Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) PLoS Biol 5(1):e8 7. Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) BMC Bioinf (BioMed Central) 7:S7 8. Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) EURASIP J Bioinf Syst Biol 2007:8 9. Meyer P, Marbach D, Roy S, Kellis M (2010) BioComp, pp 700–705 10. de Matos Simoes R, Emmert-Streib F (2012) PloS One 7(3) 11. Obayashi T, Kinoshita K (2009) DNA Res 16(5):249 12. Kim S (2015) Commun Stat Appl Methods 22(6):665 13. Opgen-Rhein R, Strimmer K (2007) BMC Syst Biol 1(1):1 14. Lèbre S (2009) Stat Appl Genet Mol Biol 8(1) 15. Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, Lander ES, Mitzenmacher M, Sabeti PC (2011) Science 334(6062):1518 16. Sebastian S, Ali A, Das A, Roy S et al Innovations in computational intelligence and computer vision. Springer, pp 85–92 17. Altay G, Emmert-Streib F (2010) BMC Syst Biol 4(1):132 18. Schaffter T, Marbach D, Floreano D (2011) Bioinformatics 27(16):2263
Chapter 59
Role of Viral Infection in Toxin Producing Phytoplankton and Zooplankton Dynamics: A Mathematical Study Rakesh Kumar
and Amanpreet Kaur
1 Introduction
Plankton, the diverse collection of single-celled drifting microorganisms in water, are highly sensitive to environmental change, water temperature, salinity, pH and nutrient availability. The most significant groups of plankton are phytoplankton and zooplankton. Phytoplankton are microscopic algae found in oceans and fresh water and form the fundamental base of the marine food web. Zooplankton consume phytoplankton; their energy is passed on to small fish that feed on the zooplankton, and these small fish are in turn devoured by larger fish. Like terrestrial plants, phytoplankton use sunlight and chlorophyll for photosynthesis and produce a substantial amount of the oxygen available for respiration: in the course of photosynthesis, phytoplankton consume almost 50% of carbon dioxide and deliver approximately 70% of the oxygen to the environment. Phytoplankton transfer the chemical energy derived from photosynthesis to other marine species, and the whole of aquatic life depends upon them. Phytoplankton are usually found in low concentrations, but they can rapidly multiply into dense aggregations of cells near the water surface, known as blooms; a bloom is a sudden, rapid expansion and subsequent decline of the phytoplankton population. Das et al. [1], Sharma et al. [2] and Brussaard et al. [3] have explained in their papers
that highly nutritious and favourable surroundings play a significant role in the enormous growth of algae, whereas low-nutrient and unfavourable conditions restrict their progress. Blooms of phytoplankton are linked to the level of nutrients in the water. The interaction of plankton species with available nutrients, and the relationship of their growth with high nutrient levels, has been examined by Sharma et al. [2]. Blooms cause a sudden rise in bacteria, which deoxygenates the water and harms aquatic plants and animals, as explained in [4–6]. There are hundreds of species of phytoplankton, but only a few of them generate toxin. Over the last few years, a gradual increase in Harmful Algal Blooms (HAB) has been reported by many authors [7–12]. HABs can harm the environment by affecting human health, commercial fishing and tourism. Not all phytoplankton blooms are harmful: some are necessary for the functioning of the ocean system, while others damage other species. Several authors [13–16] have explored the toxic substances produced by toxin-producing phytoplankton (TPP), which influence zooplankton growth and hence the other marine creatures too. In fact, phytoplankton that produce toxins are capable of biologically controlling plankton blooms and have contributed to the cessation of blooms.
Viruses, which occur in the oceans in abundance, are part of the natural cycle of life and death there. When marine viruses infect the microalgal phytoplankton at the base of the marine food web, the effects reach the earth's climate and environment [17, 18]. Although viruses are among the most prevalent biological agents in the water, they are crucial to the survival, interaction and disappearance of the plankton population. Numerous studies [1, 5, 11, 13, 17–24] have examined the importance of viruses in plankton dynamics. Beltrami and Carroll [5] outlined how viral diseases contribute to recurrent phytoplankton blooms and investigated how a very small amount of an infectious agent can destabilize the system even when the phytoplankton and its producer are present in a stable trophic composition. Various authors [6, 9, 14] have discussed the impact of viruses on TPP with respect to different parameters such as the toxin liberation rate, the rate of transmission of infection and, most importantly, the delay period. Several researchers [1, 9, 18, 21], while considering the role of infection in the plankton system, have assumed that infected individuals can become susceptible again because of immunity or vaccination strategies, and have shown that the force of infection must follow the law of mass action for the coexistence of all species. Panja [8, 25, 26] discussed the impact of cholera disease on plankton dynamics and explored the conditions for stability and coexistence of the species under the influence of toxin generation and the infection transmission parameter.
The focus of this work is the dynamical behaviour of a virally infected phytoplankton-zooplankton system in the presence of toxin synthesis by the phytoplankton against the zooplankton. The paper is organized as follows. In Sect. 2, a virally infected toxin-generating phytoplankton-zooplankton population model is formulated.
Section 3 includes the basic preliminaries like positivity, boundedness. Section 4 explores all possible equilibrium points along with stability conditions of
the model. Section 5 investigates the Hopf bifurcation, taking the rate of infection as the bifurcation parameter. Section 6 supports the scientific findings through numerical simulation. Section 7 concludes the paper with the important outcomes and a discussion.
2 Model Formulation
A mathematical model has been developed to analyse the dynamics of toxin-producing phytoplankton and zooplankton under the influence of virus infection. The plankton population is divided into phytoplankton (P) and zooplankton (Z). Since only the phytoplankton population is assumed to be affected by the viral infection, the phytoplankton are further split into a susceptible class (S_p) and an infected class (I_p). It is assumed that the infected class regrows with less capability than the susceptible one. The phytoplankton grow with intrinsic growth rate r and carrying capacity K. Susceptible individuals enter the infected class through interaction with the virus, with infected and susceptible phytoplankton interacting at rate C. Because of the virus, infected phytoplankton are presumed to be more vulnerable to predation, although zooplankton predate both the susceptible and the infected classes. Here b_1 and h denote the growth rates of zooplankton due to predation of susceptible and infected phytoplankton, respectively, whereas b and e are the corresponding predation rates on the susceptible and infected populations. The interaction between susceptible and infected phytoplankton is modelled with a Holling type II functional response with half-saturation constant k_1. δ_1 denotes the death rate of the infected population due to infection, and δ_2 the natural mortality rate of zooplankton. The zooplankton population also declines through consumption of the toxin generated by the phytoplankton at rate θ, again under a Holling type II functional response. A recovery rate γ accounts for infected phytoplankton that recover and re-enter the susceptible class [2, 12, 20]. Under these assumptions, the virally infected toxin-producing phytoplankton-zooplankton model is formulated as

dS_p/dt = r S_p (1 − (S_p + I_p)/K) − C S_p I_p/(S_p + k_1) + γ I_p − b S_p Z,
dI_p/dt = −δ_1 I_p + C S_p I_p/(S_p + k_1) − γ I_p − e I_p Z,
dZ/dt = b_1 S_p Z + h I_p Z − δ_2 Z − θ (S_p + I_p) Z/(α + S_p + I_p),    (1)
along with initial conditions S_p(0) > 0, I_p(0) ≥ 0 and Z(0) > 0.
3 Basic Preliminaries
In this section, the basic preliminaries of the model are established. The positivity and boundedness of the model are investigated first, after which the system's equilibrium points and the dynamics of the system around them are discussed.
3.1 The Positivity and Boundedness of the System
Biologically, positivity ensures that all species remain in the positive octant, whereas boundedness means that, owing to limited resources, no species can grow without bound for an extended period of time.

Theorem 1. All solutions of system (1) that initiate in R³₊ are uniformly bounded, provided eb_1 ≥ hb and a positive constant υ ≤ min{δ_1, δ_2} is chosen.

Proof. To investigate the positivity and boundedness of the system, define X(t) = S_p(t) + I_p(t) + (b/b_1) Z(t). Then

dX/dt = dS_p/dt + dI_p/dt + (b/b_1) dZ/dt ≤ r S_p (1 − S_p/K) − δ_1 I_p − (e − hb/b_1) I_p Z − (b δ_2/b_1) Z.

Introducing a positive number υ with υ ≤ min{δ_1, δ_2} and using eb_1 ≥ hb, we have

dX/dt + υX ≤ S_p [υ + r(1 − S_p/K)] − (δ_1 − υ) I_p − (e − hb/b_1) I_p Z − [b(δ_2 − υ)/b_1] Z ≤ S_p [υ + r(1 − S_p/K)],    (2)

so that

dX/dt + υX ≤ K(r + υ)²/(4r).    (3)

Applying the differential inequality theorem to Eq. (3), we obtain 0 < X < [K(r + υ)²/(4rυ)](1 − e^{−υt}) + X_0 e^{−υt}, where X_0 = X(0) is determined by (S_p(0), I_p(0), Z(0)). As t → ∞, this gives 0 < X < K(r + υ)²/(4rυ).

Hence the region ψ = {(S_p, I_p, Z) ∈ R³₊ : S_p + I_p + (b/b_1) Z ≤ K(r + υ)²/(4rυ) + ε, ε > 0} contains all solutions of the system. Therefore, the solutions of the virally infected model are positive and uniformly bounded.
4 Equilibrium Points and Stability Analysis
The stability of the system at all possible equilibrium points is explored in this section, following the approach of Kumar et al. [27, 28].

4.1 Equilibrium Points
The equilibria of the model system (1) satisfy

r S_p (1 − (S_p + I_p)/K) − C S_p I_p/(S_p + k_1) + γ I_p − b S_p Z = 0,
−δ_1 I_p + C S_p I_p/(S_p + k_1) − γ I_p − e I_p Z = 0,
b_1 S_p Z + h I_p Z − δ_2 Z − θ (S_p + I_p) Z/(α + S_p + I_p) = 0.    (4)
The system admits the following possible equilibrium points:
(a) The trivial equilibrium E_0 = (0, 0, 0).
(b) The boundary equilibrium on the first octant's edge, E_1 = (K, 0, 0).
(c) The infection-free equilibrium E_2 = (S̄_p, 0, Z̄), where S̄_p = [−(b_1α − δ_2 − θ) + √Δ]/(2b_1) with Δ = (b_1α − δ_2 − θ)² + 4b_1αδ_2, and Z̄ = (r/b)(1 − S̄_p/K). Thus E_2 exists if S̄_p < K and b_1α < δ_2 + θ.
(d) The zooplankton-free equilibrium E_3 = (Ŝ_p, Î_p, 0), where Ŝ_p = k_1(δ_1 + γ)/(C − δ_1 − γ) and Î_p = r k_1(δ_1 + γ)[KC − (K + k_1)(δ_1 + γ)] / {(C − δ_1 − γ)[r k_1(δ_1 + γ) + δ_1 K(C − δ_1 − γ)]}. Hence E_3 exists if C > δ_1 + γ and Ŝ_p < K.
(e) The non-trivial interior equilibrium E* = (S*_p, I*_p, Z*), whose components satisfy Eq. (4). Here S*_p corresponds to a positive root of the quadratic equation f(S*_p) = S*_p² + A S*_p + B = 0, where A = [b_1(I*_p + α) + h I*_p − δ_2 − θ]/b_1 and B = [(h I*_p − δ_2)(I*_p + α) − θ I*_p]/b_1, so that S*_p = [−A ± √(A² − 4B)]/2. The remaining components follow from the first two equations of (4): Z* = (1/e)[C S*_p/(S*_p + k_1) − γ − δ_1], and I*_p is obtained by substituting this expression for Z* into the first equation of (4). At least one root of the quadratic is positive if any one of the following conditions holds: (a) A < 0, B < 0; (b) A > 0, B < 0; (c) A < 0, B > 0 and A² − 4B > 0. One of these conditions, together with S*_p < K and C S*_p/(S*_p + k_1) > γ + δ_1, must therefore be met for the interior equilibrium E* to exist.
4.2 Stability Analysis
This section explores the local behaviour of the system near the existing equilibrium points through the eigenvalues of the variational matrix.

(i) The eigenvalues of the variational matrix at E_0 are r, −δ_1 − γ and −δ_2. Since all parameters of the model are assumed positive, the eigenvalue r is always positive even though the other two are negative, so the trivial equilibrium point is a saddle point.

(ii) The eigenvalues of the variational matrix at E_1 are −r, −δ_1 + CK/(K + k_1) − γ and b_1K − δ_2 − θK/(K + α). For stability, all eigenvalues must be negative; thus E_1 is locally asymptotically stable if CK/(K + k_1) < δ_1 + γ and b_1 < δ_2/K + θ/(K + α).

(iii) One eigenvalue of the variational matrix at E_2 is −δ_1 − γ + C S̄_p/(S̄_p + k_1) − e Z̄; the other two are the roots of the characteristic equation of a 2 × 2 submatrix whose trace is −r S̄_p/K < 0 and whose determinant is b S̄_p Z̄ [b_1 − θα/(S̄_p + α)²]. Both roots of this submatrix are negative if b_1 > θα/(S̄_p + α)². Therefore, E_2 is locally asymptotically stable if C S̄_p/(S̄_p + k_1) < e Z̄ + δ_1 + γ and b_1 > θα/(S̄_p + α)².

(iv) One eigenvalue of the variational matrix at E_3 is b_1 Ŝ_p + h Î_p − δ_2 − θ(Ŝ_p + Î_p)/(Ŝ_p + Î_p + α); the other two are the roots of the characteristic equation of a 2 × 2 submatrix with trace C Ŝ_p Î_p/(Ŝ_p + k_1)² − r Ŝ_p/K − γ Î_p/Ŝ_p and determinant [C k_1 Î_p/(Ŝ_p + k_1)²](r Ŝ_p/K + δ_1) > 0. Therefore, E_3 is locally asymptotically stable if b_1 Ŝ_p + h Î_p < δ_2 + θ(Ŝ_p + Î_p)/(Ŝ_p + Î_p + α) and C Ŝ_p Î_p/(Ŝ_p + k_1)² < r Ŝ_p/K + γ Î_p/Ŝ_p.
(v) At the interior equilibrium E*, the characteristic equation of the variational matrix takes the cubic form λ³ + M_1λ² + M_2λ + M_3 = 0. By the Routh-Hurwitz criteria, the necessary and sufficient conditions for stability around E* are M_i > 0 for i = 1, 2, 3 and M_1M_2 − M_3 > 0.

Theorem 2. The model is locally asymptotically stable around E* if the following conditions hold: (a) C S*_p I*_p/(S*_p + k_1)² < r S*_p/K + γ I*_p/S*_p; (b) θα/(S*_p + I*_p + α)² < min{b_1, h}; (c) a further inequality involving b, C, k_1, h, S*_p and the quantity b_1 − θα/(S*_p + I*_p + α)², which ensures that M_3 > 0 and M_1M_2 − M_3 > 0.
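Because the closed-form coefficients M_1, M_2 and M_3 are lengthy, a quick numerical check of these conditions is often convenient. The sketch below (in Python, whereas the simulations of Sect. 6 use MATLAB) locates an equilibrium of system (1) by root finding and inspects the eigenvalues of a finite-difference Jacobian there; the parameter values follow set (8) introduced in Sect. 6, while the initial guess and function names are arbitrary choices for the example.

```python
import numpy as np
from scipy.optimize import fsolve

# parameter set (8) of Sect. 6 (C = 0.6)
p = dict(r=0.215, K=80, C=0.6, k1=10, e=0.45, h=0.1, b=0.2,
         b1=0.15, gamma=0.005, d1=0.01, d2=0.1, theta=0.079, alpha=8)

def rhs(x, p):
    """Right-hand side of system (1)."""
    S, I, Z = x
    return np.array([
        p["r"]*S*(1 - (S + I)/p["K"]) - p["C"]*S*I/(S + p["k1"]) + p["gamma"]*I - p["b"]*S*Z,
        -p["d1"]*I + p["C"]*S*I/(S + p["k1"]) - p["gamma"]*I - p["e"]*I*Z,
        p["b1"]*S*Z + p["h"]*I*Z - p["d2"]*Z - p["theta"]*(S + I)*Z/(p["alpha"] + S + I),
    ])

def jacobian(x, p, h=1e-6):
    """Central-difference Jacobian of the right-hand side at the state x."""
    J = np.zeros((3, 3))
    for j in range(3):
        d = np.zeros(3); d[j] = h
        J[:, j] = (rhs(x + d, p) - rhs(x - d, p)) / (2*h)
    return J

# initial guess is illustrative; check that the returned state is positive
# before interpreting it as the interior equilibrium E*
x_star = fsolve(rhs, x0=[1.0, 0.5, 1.0], args=(p,))
eigvals = np.linalg.eigvals(jacobian(x_star, p))
print(x_star, eigvals)   # E* is locally stable when all eigenvalue real parts are negative
```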
5 Hopf Bifurcation Analysis
The bifurcation point is the critical value of a parameter at which the dynamics of the system undergo a qualitative change. Our primary goal here is to investigate the stability of the system as the parameters vary; the infection transmission rate C is taken as the bifurcation parameter.

Theorem 3. The model undergoes a Hopf bifurcation around the non-trivial equilibrium point E* when the parameter C crosses a critical value C_r if and only if the following conditions hold: (a) M_i(C_r) > 0 for i = 1, 2, 3; (b) M_1(C_r)M_2(C_r) − M_3(C_r) = 0; (c) Re(dλ/dC)|_{C = C_r} ≠ 0.
Proof: If the conditions of Theorem 2 fail, the stability of the system around the equilibrium point E* is lost, i.e. ψ(C) = M_1(C)M_2(C) − M_3(C) = 0 for some value C = C_r. For a Hopf bifurcation to occur at C = C_r, the characteristic equation must be of the form

λ³(C) + M_1(C)λ²(C) + M_2(C)λ(C) + M_3(C) = 0.    (5)

It must have two purely imaginary roots and one negative real root, so λ_1(C) = i√M_2(C), λ_2(C) = −i√M_2(C) and λ_3(C) = −M_1(C) at C = C_r. For the transversality condition of the Hopf bifurcation at C = C_r, we substitute the two complex eigenvalues λ_i(C) = μ(C) ± iν(C), i = 1, 2, into Eq. (5) and obtain

ψ_1(C)μ'(C) − ψ_2(C)ν'(C) + u(C) = 0    (6)

and

ψ_2(C)μ'(C) + ψ_1(C)ν'(C) + η(C) = 0,    (7)

where

ψ_1(C) = 3(μ² − ν²) + 2M_1μ + M_2,  ψ_2(C) = 6μν + 2M_1ν,
u(C) = M_1'(μ² − ν²) + M_2'μ + M_3',  η(C) = 2μνM_1' + νM_2'.

Hence

dλ/dC = −(λ²M_1' + λM_2' + M_3')/(3λ² + 2λM_1 + M_2) = −[(μ + iν)²M_1' + (μ + iν)M_2' + M_3']/[3(μ + iν)² + 2(μ + iν)M_1 + M_2],

and therefore

Re(dλ/dC)|_{C = C_r} = −[ψ_1(C_r)u(C_r) + ψ_2(C_r)η(C_r)]/[(ψ_1(C_r))² + (ψ_2(C_r))²] ≠ 0,

because at C = C_r, λ_{1,2} = ±i√M_2 and λ_3 = −M_1. Thus the transversality condition Re(dλ_i(C)/dC)|_{C = C_r} ≠ 0 of the Hopf bifurcation holds at C = C_r.
6 Numerical Simulation
To validate the scientific findings on the dynamic behaviour of the model, numerical simulation has been performed in MATLAB. For the simulation, we take the following set of parameters: r = 0.215, K = 80, C = 0.6, k1 = 10, e = 0.45, h = 0.1, b = 0.2, b1 = 0.15, γ = 0.005, δ1 = 0.01, δ2 = 0.1, θ = 0.079, α = 8.
(8)
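The integration itself can be reproduced with any standard ODE solver. The sketch below uses Python's SciPy as a stand-in for the MATLAB runs reported here, with the right-hand side transcribed from system (1), the parameter set (8), and the initial state (0.3, 0.51, 0.71) used in the text; re-running it with other values of C or θ lets one explore the transitions discussed below. The variable names and the integration horizon are choices made for the example.

```python
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

# parameter set (8); C is the infection transmission rate varied in the text
p = dict(r=0.215, K=80, C=0.6, k1=10, e=0.45, h=0.1, b=0.2,
         b1=0.15, gamma=0.005, d1=0.01, d2=0.1, theta=0.079, alpha=8)

def model(t, x, p):
    """Right-hand side of system (1)."""
    S, I, Z = x
    dS = p["r"]*S*(1 - (S + I)/p["K"]) - p["C"]*S*I/(S + p["k1"]) + p["gamma"]*I - p["b"]*S*Z
    dI = -p["d1"]*I + p["C"]*S*I/(S + p["k1"]) - p["gamma"]*I - p["e"]*I*Z
    dZ = p["b1"]*S*Z + p["h"]*I*Z - p["d2"]*Z - p["theta"]*(S + I)*Z/(p["alpha"] + S + I)
    return [dS, dI, dZ]

x0 = [0.3, 0.51, 0.71]                       # initial state used in the text
sol = solve_ivp(model, (0, 1000), x0, args=(p,), dense_output=True, rtol=1e-8)

t = np.linspace(0, 1000, 2000)
S, I, Z = sol.sol(t)
labels = ("susceptible phytoplankton", "infected phytoplankton", "zooplankton")
for series, label in zip((S, I, Z), labels):
    plt.plot(t, series, label=label)
plt.xlabel("time"); plt.legend(); plt.show()
```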
Now, for these parameter values, model (1) with initial values (0.3, 0.51, 0.71) settles at the equilibrium point (1.446, 0.51, 1.762), which is found to be locally asymptotically stable. The phase space and time series graphs of the three species in Fig. 1 show the stability of the system. Taking C = 0.78 and keeping the other parameters as in Eq. (8), system (1) exhibits unstable behaviour with very small oscillations; this can be observed from the phase space and time series graphs of all three species in Fig. 2. Thus the critical value of C lies below 0.78, and it has been detected as C_r = 0.75. After C crosses its critical value C_r = 0.75, the stability of the system is disrupted and small oscillations appear in the dynamics of the susceptible phytoplankton and zooplankton species. For C ∈ (C_r, 4.62), the system behaves in an unstable manner with small periodic oscillations in susceptible phytoplankton and zooplankton, whereas the infected phytoplankton remain at their initial value. The behaviour of the model for various values of the parameter C, with the other parameters of Eq. (8) unchanged, reveals the structural alterations in the three species shown in Table 1. From Table 1, it is noticeable that as the value of the parameter C increases, the susceptible phytoplankton and zooplankton initially increase whereas the infected phytoplankton remain constant. Beyond the parameter value C = 4, the susceptible
Fig. 1 Phase space graph (a) and time series graph (b) of three species shows convergence to stability for C = 0.6
Fig. 2 Phase space graph (a) and time series graph (b) for C = 0.78 > C r = 0.75 exhibits unstable situation with small oscillations
Table 1 Description of changes in the population with changes in the infection transmission parameter

Infection transmission parameter C | Susceptible phytoplankton S_p | Infected phytoplankton I_p | Zooplankton Z
0.4 | 1.434 | 0.51 | 1.751
0.75 | 1.458 | 0.51 | 1.773
1.2 | 1.49 | 0.51 | 1.801
4 | 1.221 | 0.51 | 1.219
4.6 | 0.9801 | 0.51 | 1.202
4.62 | 0.9726 | 0.5133 | 1.119
4.8 | 0.9104 | 0.6689 | 0.71
5.1 | 0.82 | 0.6296 | 0.71
phytoplankton and zooplankton start to decrease, and after C crosses 4.62 the infected population starts to rise. Raising the value of C beyond C = 4.62, the system exhibits stability of all three species with a stable limit cycle; this can easily be verified from Fig. 3 for the value C = 4.7. Let us now consider the behaviour of the system with respect to the toxin liberation parameter θ. For this, consider the following set of parameters: r = 0.215, K = 80, C = 0.6, k1 = 10, e = 0.45, h = 0.38, b = 0.2, b1 = 0.17, γ = 0.005, δ1 = 0.01, δ2 = 0.1, θ = 4.2, α = 8.
(9)
The phytoplankton produce toxin to avoid predation by zooplankton, but under the influence of the virus infection, infected phytoplankton are more exposed to predation and have less capability to grow. As a result, the dynamics of all three species are significantly influenced by the toxin liberation rate. For the parameter values given in Eq. (9), system (1) has the equilibrium point (54.98, 23.89, 9.664), which is locally asymptotically stable. A rise in the value of θ promotes the susceptible phytoplankton community and suppresses the zooplankton. For θ = 4.2, the equilibrium point is (32.52, 13.33, 6.724). The phase space graph and the time series graph in Fig. 4 represent the stability of the system around the equilibrium point
Fig. 3 Phase space and time series graph of three species for C = 4.7 produced stability of the solution curves
Fig. 4 Phase space and time series graph of three species for θ = 4.2 shows convergence to stability of the system
Fig. 5 Phase space (a) and time series graph (b) of three species for θ = 4.45 steps towards instability of the system
E* for θ = 4.2. As the toxin liberation rate θ increases further, the system's stability is disrupted: the phase space and time series graphs of all three species in Fig. 5 show that as θ increases from 4.2 to 4.45, the system begins to move towards instability. When the infection transmission parameter is increased, the toxin liberation parameter plays a significant role in stabilization. For C = 1.6 and θ = 2.3, with the other parameters as in Eq. (9), the solution shows quasi-periodic behaviour, clearly visible in the phase space and time series graphs of Fig. 6. It has been examined whether the toxicant can suppress this quasi-periodic behaviour through a progressive increase in θ; this is depicted in Fig. 7 for θ = 3.4. Thus, stability of the system has been recovered by increasing the toxin liberation rate.
7 Discussion The dynamical behaviour of the toxin-producing phytoplankton and zooplankton system under the influence of viral infection has been explored in this paper. Several
Fig. 6 Phase space (a) and time series graph (b) of three species for θ = 2.3, C = 1.6 showing quasi-periodic behaviour
Fig. 7 Phase space (a) and time series graph (b) of three species for θ = 3.4, C = 1.6 shows convergence of solution curves
parameters have been considered in the analysis, but the infection transmission parameter and the toxin liberation rate prove decisive for the stability of the system. Because of the impact of the viral infection on phytoplankton, the phytoplankton community has been classified into two categories, susceptible and infected. Both classes are preyed upon by zooplankton; however, because of the infection, infected phytoplankton are more vulnerable to becoming prey for the zooplankton. We have examined the stability structure of all three species around all feasible equilibrium points. The analysis shows that for small values of the infection transmission rate the system remains stable, but as this rate rises, and for different values of the toxin liberation rate, quasi-periodic behaviour appears. It has also been shown that the toxin liberation parameter can control the unstable behaviour which can give rise to blooms; thus, the toxin liberation parameter counteracts the infection transmission rate and allows the persistence of all three species. Hence, the toxin liberation parameter and the rate of infection transmission play a significant role in maintaining the phytoplankton and zooplankton populations and are responsible for regulating blooms.
References 1. Das KP, Roy P, Karmakar P, Sarkar S (2020) Role of viral infection in controlling planktonic blooms-conclusion drawn from a mathematical model of phytoplankton-zooplankton system. Differ Equ Dynam Syst 28(2):381–400 2. Sharma A, Sharma AK, Agnihotri K (2014) The dynamic of plankton–nutrient interaction with delay. Appl Math Comput 231:503–515 3. Brussaard CPD, Gast GJ, Van Duyl FC, Riegman R (1996) Impact of phytoplankton bloom magnitude on a pelagic microbial food web. Mar Ecol Prog Ser 144:211–221 4. De SenerpontDomis LN, Elser JJ, Gsell AS, Huszar VL, Ibelings BW, Jeppesen E, Lürling M (2013) Plankton dynamics under different climatic conditions in space and time. Freshw Biol 58(3):463–482 5. Beltrami E, Carroll TO (1994) Modeling the role of viral disease in recurrent phytoplankton blooms. J Math Biol 32(8):857–863 6. Chattopadhyay J, Sarkar RR, Mandal S (2002) Toxin-producing plankton may act as a biological control for planktonic blooms—field study and mathematical modelling. J Theor Biol 215(3):333–344 7. Gakkhar S, Negi K (2006) A mathematical model for viral infection in toxin producing phytoplankton and zooplankton systems. Appl Math Comput 179(1):301–313 8. Panja P, Mondal SK, Jana DK (2017) Effects of toxicants on phytoplankton–zooplankton–fish dynamics and harvesting. Chaos, Solitons Fractals 104:389–399 9. Thakur NK, Srivastava SC, Ojha A (2021) Dynamical study of an eco-epidemiological delay model for plankton systems with toxicity. Iran J Sci Technol, Trans A: Sci 45(1):283–304 10. Upadhyay RK, Chattopadhyay J (2005) Chaos to order: role of toxin producing phytoplankton in aquatic systems. Nonlinear Anal: Model Control 10(4):383–396 11. Agnihotri K, Kaur H (2019) The dynamics of viral infection in toxin producing phytoplankton and zooplankton systems with time delay. Chaos, Solitons Fractals 118:122–133 12. Khare S, Misra OP, Dhar J (2010) Role of toxin producing phytoplankton on a plankton ecosystem. Nonlinear Anal Hybrid Syst 4(3):496–502 13. Gakkhar S, Singh A (2010) A delay model for viral infection in toxin producing phytoplankton and zooplankton system. Commun Nonlinear Sci Numer Simul 15(11):3607–3620 14. Sarkar RR, Chattopadhyay J (2003) The role of environmental stochasticity in a toxic phytoplankton–non-toxic phytoplankton–zooplankton system. Environmetrics: Off J Int Environmetrics Soc 14(8):775–792 15. Chakraborty K, Das K (2015) Modeling and analysis of a two-zooplankton one-phytoplankton system in the presence of toxicity. Appl Math Model 39(3–4):1241–1265 16. Bum BK, Pick FR (1996) Factors regulating phytoplankton and zooplankton biomass in temperate rivers. Limnol Oceanogr 41(7):1572–1577 17. Xiao Y, Chen L (2001) Modeling and analysis of a predator–prey model with disease in the prey. Math Biosci 171(1):59–82 18. Chattopadhyay J, Pal S (2002) Viral infection on phytoplankton–zooplankton system—a mathematical model. Ecol Model 151(1):15–28 19. Singh BK, Chattopadhyay J, Sinha S (2004) The role of virus infection in a simple phytoplankton zooplankton system. J Theor Biol 231(2):153–166 20. Auger P, Mchich R, Chowdhury T, Sallet G, Tchuente M, Chattopadhyay J (2009) Effects of a disease affecting a predator on the dynamics of a predator–prey system. J Theor Biol 258(3):344–351 21. Chattopadhyay J, Arino O (1999) A predator-prey model with disease in the prey. Nonlinear Anal 36:747–766 22. Zhou X, Shi X, Song X (2009) Analysis of a delayed prey-predator model with disease in the prey species only. J Korean Math Soc 46(4):713–731 23. 
Bairagi N, Roy PK, Chattopadhyay J (2007) Role of infection on the stability of a predator–prey system with several response functions—a comparative study. J Theor Biol 248(1):10–25
24. Wang S, Song X, Ge Z (2011) Dynamics analysis of a delayed viral infection model with immune impairment. Appl Math Model 35(10):4877–4885 25. Panja P (2020) Plankton population and cholera disease transmission: a mathematical modeling study. Int J Bifurc Chaos 30(04):1–16 26. Panja P, Mondal SK (2015) Stability analysis of coexistence of three species prey–predator model. Nonlinear Dyn 81(1):373–382 27. Kumar R, Sharma AK, Agnihotri K (2018) Stability and bifurcation analysis of a delayed innovation diffusion model. Acta Math Sci 38(2):709–732 28. Kumar R, Sharma AK, Sahu GP (2022) Dynamical behavior of an innovation diffusion model with intra-specific competition between competing adopters. Acta Math Sci 42(1):364–386
Author Index
A Abdul Halim, Nur Izzma Hanis, 317 Ahmad, Gayas, 553 Ajith, P. M., 41 Alam, Sanwar, 503 Ali, Adeeba, 393 Ali, Rashid, 361, 393 Alsideiri, Abir, 295 AlSinani, Maryam, 295 Amle, R. B., 91 Anand, Aishwarya, 623 Anchana, P., 41 Ankita, 525 Aremu, Fatina Mosunmola, 125 Ariyo, Funso Kehinde, 125 Asaletha, R., 511 Asirvatham, David, 273 Awbi, Hazim, 679 Ayanlade, Samson Oladayo, 125
C Chakrabarty, Sugato, 539 Chakraborty, Sonali, 491 Chavan, Sahil Sanjay, 623 Chavan, Sharvay Shashikant, 623 Chebbi, Imen, 251 Chowdhury, Lomat Haider, 643 Conceição, Eusébio, 679 Conceição, Mª Inês, 679
D Dayana, A. Mary, 19 Devi, Anshu, 705 Dhanya, Y., 11 Dornberger, Rolf, 99 Duddu, Mahesh Chandra, 655 Durga Bhavani, S., 655
E Emmanuel, W. R. Sam, 19 B Baig, M. F., 393 Bazi, Yakoub, 373 Bekda¸s, Gebrail, 285 Ben Ayed, Leila, 251 Ben Younes, Ahlem, 251 Bhalchandra, Parag, 461 Bharathi, G. P., 61 Bhatlawande, Shripad, 221 Bhoir, Smita Vinit, 623 Biksham, V., 263
F Fairy, 525 Farin, Nusrat Jahan, 643 Farsi Al, Ghaliya, 295
G Goel, Amit Kumar, 31 Gomes, João, 679 Govekar, Shripad, 221
744 H Habeeb, Riyaz Ahamed Ariyaluran, 273 Hanh, Nguyen Thi Minh, 327 Hanne, Thomas, 99 Himanshi, 233
I Iqbal, Sohail Malik, 295 Islam, Salekul, 643
J Jadhav, Anuja R., 207 Jain, Kirti, 479 Jain, Pratik, 199 Jain, Samyak, 31 Jaiswal, Swati, 207 Jalnekar, Rajesh, 221 Jayaprakasam, Kirubakaran, 419 Jeevitha, J. K., 61 Jha, P. S., 341 Jimoh, Abdulrasaq, 125 Jimoh, Abdulsamad Bolakale, 125 Johnson, Rithika F., 539
K Kachhoria, Renu, 207 Kait, Ramesh, 525, 705 Kar, Bikram, 351 Karmakar, Papri, 461 Karmakonda, Karthik, 407 Kaur, Amanpreet, 729 Khadse, Chetan, 207 Korra, Sampath, 263 Kulkarni, Govind, 461 Kumar, Akhilesh, 553 Kumar, Arabind, 187 Kumar, Arun, 669 Kumar, Harshata S., 11 Kumar, Rakesh, 689, 729
L Lee, Loong Chuen, 317 Lobo, L. M. R. J., 381 Lúcio, Mª Manuela, 679
M Maharana, Sujit, 447 Mallikarjuna, Basetty, 31 Maneesha, 669
Author Index Marjani, Mohsen, 273 Markandeya, Dussa Lavanya, 381 Mathew, Roy, 295 Matin, Abdul, 139 Mehta, Sonam, 75 Meier, Patrick, 99 Mofakh Kharul Islam, A. S. M., 565 Mohammed, S. Asif, 511 Mohd Rosdi, Nur Ain Najihah, 317 Mukherjee, Subhodeep, 341 Muley, Aniket, 461 Mundakkad, Padmaja, 419 Murshed, Mohammad N., 503 Mutairi Al, Amani, 373 Mzili, Ilyass, 113 Mzili, Toufik, 113
N Nagabotu, Vimala, 51 Nagamani, K., 435 Naib, Bharat Bhushan, 31 Namburu, Anupama, 51 Natarajan, Sowmya, 175 Nayak, Suvasis, 447 Nguyen, Quoc Hung, 597 Nigdeli, Sinan Melih, 285
O Ocak, Ayla, 285 Ogunwole, Emmanuel Idowu, 125 Opu, Md. Nahidul Islam, 581
P Pal, Surya Kant, 341 Pandey, Praveen Kant, 669 Patel, Bansari, 305 Patel, Parth, 305 Patel, Riya, 305 Ponnusamy, Vijayakumar, 175 Pratibha, N., 11 Prithvi, Kishan, 11
R Rafeeque, Afeefa, 361 Raha, Rishita, 539 Rahhal Al, Mohamad Mahmoud, 373 Rakesh Datta, V., 1 Rama Devi, K., 1 Rambabu, Bandi, 407 Rana, Navneet, 689
Author Index Ranga, Virender, 705 Ranjithkumar, K., 1 Rashmi, M., 539 Riffi, Mohammed Essaid, 113 Roy, Animesh Chandra, 565, 581 Roy, Rita, 341 Roy, Swarup, 717 Roy, Vineet, 341
S Sailaja, V., 607 Sarkar, Bikash Kanti, 351 Sarker, Shaswato, 139 Sashidharan, Jeevna, 317 Sebastian, Softya, 717 Shafiq, Dalia Abdulkareem, 273 Shahid, Mohammad, 553 Shanmugam, Aditi, 11 Shilaskar, Swati, 221 Shukla, Pragya, 75 Singh, Kuldeep, 157 Sinha, Anjali, 435 Sino, Hukil, 317 Srinivas, Kalyanapu, 1 Stawiski, Marc, 99 Subhashini, R., 61 Sunil Prakash, M., 607 Surwade, A. U., 91
745 Swamy Das, M., 407
T Tapasvi, Niranjan, 221 Tawafak, Ragad, 295 Tiwari, Nidhi, 199 Truong, Viet Phuong, 597 Tyagi, Anmol, 157
U Uniyal, Aditi, 31 Upadhyay, Ayush, 305 Usha Sravani, G., 607
V Vashishtha, Jyoti, 233 Vaswani, Prem, 419 Vimala Kumari, G., 607 Vi, Tran Duc, 327
Y Yadav, Mukesh Kumar, 199 Yadav, Sanjay, 187 Yevale, Pallavi, 207 Yusof, Azmi bin Mohd., 295