English Pages 297 [298] Year 2023
Advances in Intelligent Systems and Computing 1446
Siddhartha Bhattacharyya Gautam Das Sourav De Leo Mrsic Editors
Recent Trends in Intelligence Enabled Research Selected Papers of Fourth Doctoral Symposium, DoSIER 2022
Advances in Intelligent Systems and Computing Volume 1446
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Győr, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. Indexed by DBLP, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST). All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose ([email protected]).
Editors
Siddhartha Bhattacharyya, Rajnagar Mahavidyalaya, Birbhum, India and Algebra University College, Zagreb, Croatia
Gautam Das, Cooch Behar Government Engineering College, Cooch Behar, India
Sourav De, Cooch Behar Government Engineering College, Cooch Behar, India
Leo Mrsic, Algebra University College, Zagreb, Croatia and Rudolfovo Scientific and Technological Center, Novo mesto, Slovenia
ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-981-99-1471-5 ISBN 978-981-99-1472-2 (eBook) https://doi.org/10.1007/978-981-99-1472-2 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Siddhartha Bhattacharyya would like to dedicate this volume to his loving wife Rashni. Sourav De would like to dedicate this volume to his loving wife Debolina Ghosh, his beloved son Aishik De, and his sister Soumi De. Leo Mrsic would like to dedicate this volume to his son Noah.
DoSIER 2022 Symposium Technical Committee
Chief Patron
Dr. Prabal Deb, Principal, Cooch Behar Government Engineering College, Cooch Behar, India
General Chairs
Dr. Siddhartha Bhattacharyya, Rajnagar Mahavidyalaya, Birbhum, India and Algebra University College, Zagreb, Croatia
Dr. Goutam Das, Cooch Behar Government Engineering College, Cooch Behar, India
Dr. Sourav De, Cooch Behar Government Engineering College, Cooch Behar, India
Dr. Leo Mrsic, Algebra University College, Zagreb, Croatia and Rudolfovo Scientific and Technological Center, Novo mesto, Slovenia
Program Chairs
Dr. Sudip Kumar Adhikari, Cooch Behar Government Engineering College, Cooch Behar, India
Dr. Aritra Acharyya, Cooch Behar Government Engineering College, Cooch Behar, India
Dr. Sushovan Chatterjee, Cooch Behar Government Engineering College, Cooch Behar, India
Dr. Hrvoje Jerković, Algebra University College, Croatia
Organizing Secretaries
Dr. Shyamal Ghosh, Cooch Behar Government Engineering College, Cooch Behar, India
Mr. Sukhendu Shekhar Mondal, Cooch Behar Government Engineering College, Cooch Behar, India
Mr. Atanu Maji, Cooch Behar Government Engineering College, Cooch Behar, India
International Advisory Committee
Dr. Vincenzo Piuri, Università degli Studi di Milano, Italy
Dr. Debotosh Bhattacharjee, Jadavpur University, India
Dr. Vaclav Snasel, VSB Technical University of Ostrava, Czech Republic
Dr. Aboul Ella Hassanien, Cairo University, Egypt
Dr. Ujjwal Maulik, Jadavpur University, India
Dr. Wei-Chang Yeh, National Tsing Hua University, Taiwan
Dr. Elizabeth Behrman, Wichita State University, USA
Dr. Mario Koeppen, Kyushu Institute of Technology, Japan
Dr. Mita Nasipuri, Jadavpur University, India
Dr. Xiao-Zhi Gao, University of Eastern Finland, Finland
Dr. Robert Kopal, Algebra University College, Croatia
Dr. Goran Klepac, Hrvatski Telekom, Croatia
Technical Program Committee
Dr. Zlatan Morić, Algebra University College, Croatia
Dr. Zdravko Kunić, Algebra University College, Croatia
Dr. Danijel Bele, Algebra University College, Croatia
Dr. Predrag Šuka, Algebra University College, Croatia
Dr. Vanja Šebek, Algebra University College, Croatia
Dr. Mario Fraculj, Algebra University College, Croatia
Dr. Jayanta Chandra, RKMGEC, India
Dr. Sourav De, Cooch Behar Government Engineering College, India
Dr. Jyoti Prakash Singh, NIT Patna, India
Dr. Rik Das, Xavier Institute of Social Service, India
Dr. Dipak Kole, Jalpaiguri Government Engineering College, India
Dr. Jisan Mehedi, Jalpaiguri Government Engineering College, India
Dr. Goutam Kumar Panda, Jalpaiguri Government Engineering College, India
Dr. Koushik Mondal, IIT Dhanbad, India
Dr. Kousik Dasgupta, Kalyani Government Engineering College, India
Dr. Amitava Ray, Jalpaiguri Government Engineering College, India
Dr. Soumyajit Goswami, IBM, India
Dr. Indradeep Banerjee, University Institute Technology, Burdwan University, India
Dr. Debashis De, MAKAUT, India
Dr. Mihaela Albu, Politehnica University of Bucharest, Romania
Dr. Rajarshi Mahapatra, IIIT Naya Raipur, India
Ms. Koyel Chakraborty, Supreme Knowledge Foundation Group of Institutions, India
Ms. Tulika Dutta, Assam University, India
Dr. Hrishikesh Bhaumik, RCC Institute of Information Technology, India
Dr. Debarka Mukhopadhyay, CHRIST (Deemed to be University), India
Dr. Pijush Samui, NIT Patna, India
Mr. Abhidhan Bardhan, NIT Patna, India
Ms. Pampa Debnath, RCC Institute of Information Technology, India
Dr. Swarup Ghosh, Sister Nivedita University—Techno India Group, Kolkata, India
Mr. Swalpa K. Roy, Jalpaiguri Government Engineering College, India
Dr. Debajyoty Banik, KIIT University, India
Dr. Soham Sarkar, RCC Institute of Information Technology, India
Dr. Sandip Dey, Sukanta Mahavidyala, India
Dr. Abhishek Basu, RCC Institute of Information Technology, India
Mr. Debanjan Konar, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Germany
Dr. Anirban Mukherjee, RCC Institute of Information Technology, Kolkata, India
Dr. Indrajit Pan, RCC Institute of Information Technology, Kolkata, India
Dr. Amlan Chatterjee, California State University, USA
Dr. Jyoti Sekhar Banerjee, Bengal Institute of Technology, Kolkata, India
Dr. Balachandran Krishnan, CHRIST (Deemed to be University), Bangalore, India
Dr. Abhijit Das, RCC Institute of Information Technology, Kolkata, India
Mr. Arpan Deyasi, RCC Institute of Information Technology, Kolkata, India
Dr. Pijush Barthakur, KLS Gogte Institute of Technology, India
Dr. Shibakali Gupta, University Institute Technology, Burdwan University, India
Mr. Soumyadip Dhar, RCC Institute of Information Technology, India
Dr. Hiranmoy Roy, RCC Institute of Information Technology, India
Dr. Jan Platos, VSB Technical University of Ostrava, Czech Republic
Dr. Pankaj Pal, RCC Institute of Information Technology, Kolkata, India
Dr. Sabyasachi Mukhopadhyay, Bharatiya Vidya Bhavan Institute of Management Science Kolkata, Kolkata, India
Dr. S. K. Hafizul Islam, IIIT Kalyani, Kalyani, India
Dr. Ayan Das, BIT Mesra, India
Dr. Subhas Barman, Cooch Behar Government Engineering College, Cooch Behar, India
Publicity and Sponsorship Chairs
Dr. Siddhartha Bhattacharyya, Rajnagar Mahavidyalaya, Birbhum, India
Mr. Sukhendu Shekhar Mondal, Cooch Behar Government Engineering College, Cooch Behar, India
Dr. Andrea Blažičević, Algebra University College, Croatia
Local Hospitality Chairs
Mr. Prasenjit Das, Cooch Behar Government Engineering College, Cooch Behar, India
Mr. Rajib Das, Cooch Behar Government Engineering College, Cooch Behar, India
Finance Chair
Mr. Saptaparna Basu Roy Chowdhury, Cooch Behar Government Engineering College, Cooch Behar, India
Organizing Committee
Mr. Arnab Gain, Cooch Behar Government Engineering College, Cooch Behar, India
S. K. Rabiul Hossain, Cooch Behar Government Engineering College, Cooch Behar, India
Mr. Tanumay Halder, Cooch Behar Government Engineering College, Cooch Behar, India
Md. Asif S. K., Cooch Behar Government Engineering College, Cooch Behar, India
Mr. Biren Gurung, Cooch Behar Government Engineering College, Cooch Behar, India
Mr. Prasenjit Das, Cooch Behar Government Engineering College, Cooch Behar, India
Dr. Gopal Ghosh, Cooch Behar Government Engineering College, Cooch Behar, India
Dr. Madhumita Dhar, Cooch Behar Government Engineering College, Cooch Behar, India
Mr. Tanmay Choudhury, Cooch Behar Government Engineering College, Cooch Behar, India
Mrs. Apu Mondal, Cooch Behar Government Engineering College, Cooch Behar, India
Dr. Goran Đambić, Algebra University College, Croatia
Dr. Danijel Kučak, Algebra University College, Croatia
Dr. Hrvoje Jerković, Algebra University College, Croatia
Dr. Mislav Balković, Algebra University College, Croatia
Dr. Srđan Skok, Algebra University College, Croatia
Technical Sponsors
Asia-Pacific Artificial Intelligence Association (AAIA), Kolkata Branch
IEEE Computational Intelligence Society, Kolkata Chapter
Acuminous Research Foundation (AcuminoR), Kolkata
Preface
Computational Intelligence has taken the helm when it comes to infusing intelligence into existing processes and systems. The present world has seen a wide manifestation of computational intelligence techniques in almost all spheres of human civilization, thanks to the evolution of novel intelligent algorithms and architectures. Being capable of yielding robust and failsafe solutions to critical real-life problems, Computational Intelligence has made its presence felt in several scientific and engineering disciplines, including signal processing, smart manufacturing, predictive control, robot navigation, smart cities, and sensor design, to name a few. In tune with the developments across the world in this direction, the Government of India has also put emphasis on the promotion and dissemination of computational intelligence techniques in various fields. The Fourth Doctoral Symposium on Intelligence Enabled Research (DoSIER 2022) is the fourth initiative in this direction.
DoSIER 2022, a two-day event, was held at Cooch Behar Government Engineering College, Cooch Behar, India, during December 22–23, 2022. The symposium was held in collaboration with the Algebra University College, Croatia, and in association with the Acuminous Research Foundation (AcuminoR), providing doctoral students and early career researchers an opportunity to interact with eminent scientists from across the globe working on foundations, techniques, tools, and applications of computational intelligence. The event was technically sponsored by the Asia-Pacific Artificial Intelligence Association (AAIA), Kolkata Branch and the IEEE Computational Intelligence Society, Kolkata Chapter.
The main goals of the symposium were to provide the participants with independent and constructive feedback on their current research and future research directions, to develop a supportive community of scholars and a spirit of collaborative research, to provide an opportunity for student participants to interact with established researchers and practitioners in relevant fields of research and to open up the possibility of a closed research forum for mutual exchange of knowledge base and know-how.
DoSIER 2022 received a good number of quality submissions from countries including India, Ethiopia, Liberia, Russia, and Portugal. Each submission underwent a rigorous review process followed by quality checks per the standard norms. Finally, twenty-four submissions were accepted for presentation at the symposium. DoSIER 2022 featured keynotes from eminent scientists from across the globe: (i) Dr. Leo Mrsic, Algebra University College, Zagreb, Croatia, (ii) Prof. (Dr.) Amit Konar, Jadavpur University, Kolkata, India, (iii) Prof. (Dr.) Ashish Mani, Amity University Noida, India, and (iv) Prof. (Dr.) Jagdish Chand Bansal, South Asian University New Delhi, India. The accepted papers were presented in two technical sessions chaired by (i) Prof. (Dr.) Debashis De, Maulana Abul Kalam Azad University of Technology, Kolkata, India, (ii) Dr. Abhishek Basu, RCC Institute of Information Technology, Kolkata, India, (iii) Dr. Rajarshi Mahapatra, IIIT Naya Raipur, Naya Raipur, India, and (iv) Dr. Manojit Chattopadhyay, IIM Raipur, Raipur, India.
This volume comprises 23 contributed chapters emanating from the papers accepted and presented at the symposium, covering different facets of intelligence-enabled research perspectives and frameworks.
The widespread introduction of artificial intelligence systems in all areas of human activity imposes certain requirements on these systems. In particular, systems based on artificial neural network (ANN) models that operate in critical areas such as health care, economics, and security should have an explanatory apparatus, so that one can evaluate not only the familiar recognition, prediction, or recommendation accuracy but also see how the neural network arrived at its result. In the chapter “Rule Extraction Methods from Neural Networks”, the authors propose investigating fuzzy-logic-based methods for extracting rules from ANNs.
They consider several possible methods for extracting rules, such as the mining clustering algorithm and decision trees.
The river Ganges is the primary source of surface water in India, which is used for drinking, agricultural, religious, and industrial purposes. However, Ganges water is becoming more and more polluted due to changes in climate and anthropogenic activities. The aim of this study is to analyse the quality of the Ganges river water and extract the key features of the existing water management systems. From the literature, it has been observed that dissolved oxygen (DO) is an essential water component. The water quality is good if it has a sufficient amount of DO. In the chapter “Quality Analysis of the Ganges River Water Utilizing Machine Learning Technologies”, various machine learning approaches, i.e. support vector regressor, random forest, decision tree, multi-layer perceptron (MLP), and Naïve Bayes, are used to assess the amount of DO present in the Ganges water. The simulation results show that the MLP performs best among the compared machine learning methods for estimating the quantity of DO in the Ganges water.
Eye-tracking has been a topic of interest in research in recent years because it provides convenience to a wide range of applications. It is acknowledged as an important non-traditional method of human–computer interaction. Eye-tracking is a useful tool for determining where and when people devote visual attention to a scene and helps to understand cognitive functioning. Nowadays, eye-tracking technology
is making its way from the laboratory to the real world, collecting more data at a faster rate and with a greater variety of data kinds. Eye-tracking will move closer to big data if the current trend continues. A real-time model is created using machine learning methodology, which tests a high-accuracy hypothesis. Eye-tracking with parameters looks into a participant’s eye movements while presenting them with a variety of options. Machine learning analyses eye movements and extracts attributes to assess eye behaviour. K-nearest neighbour, Naive Bayes, decision trees, and random forests are machine learning algorithms that produce models with improved accuracy. In the chapter “Eye-Tracking Movements—A Comparative Study”, the authors review different eye-tracking technologies to obtain eye movement parameters and classifiers for categorization, such as machine learning and deep learning, towards recognition of cognitive processes involved in learning.
In higher education, there is a growing need to protect data. Without adequate threat management, serious data breaches have already happened and are likely to happen again. By reviewing the recent literature on acknowledged assets, risk occurrences, risk actors, and vulnerabilities in higher education, it is possible to deduce the effect of cyber-attacks on society. The chapter “Cybersecurity Imminent Threats with Solutions in Higher Education” reviews studies and projects to increase understanding of the primary cybersecurity threat domains. The top spot has shifted due to the paucity of empirical research on cybersecurity threats in higher education and large literature gaps. Despite this issue, some of the examined publications spend too much time explaining cybersecurity-related concepts. 
The chapter combines a top-level view of mission-critical assets, regular risk events, a widely accepted risk model, and an overview of common cybersecurity vulnerabilities. It also summarizes strategic cyber-risks, with descriptions of frequency distributions and starting points for protection researchers in higher education.
A two-level fuzzy model for filtering complex signals such as automatic dependent surveillance-broadcast (ADS-B) is presented in the chapter “A Two-Level Fuzzy Model for Filtering Signals of the Automatic Dependent Surveillance-Broadcast”. Each of the two levels of the fuzzy model consists of three operations: fuzzification, fuzzy composition, and defuzzification. The input variables of both levels are given by trapezoidal membership functions that are formed automatically, depending on the characteristics of the complex signal. A singleton function gives the output function at the first level, and the defuzzification is carried out using a simplified centre of gravity model. The proposed two-level fuzzy model makes it possible to increase the sensitivity of the ADS-B signal receiver and correctly detect the received signal.
The challenge of identifying and monitoring multiple types of solar panels has not been studied. Solar panels can be single, double, or double with a water heater on top. Some are packed closely together. Due to installation requirements, additional solar panels may have a random orientation. When combined with the difficulties of detecting different types of panels, this arbitrary orientation reduces the effectiveness of deep learning algorithms by producing false positives and erroneous panel classifications. Furthermore, no research on the identification of various solar panel types has been done yet. In the chapter “Toward More Robust Multiclass Aerial Solar Panel Detection and Classification”, the authors concentrate on two
key problems: first, the detection of different types of solar panels, and second, the arbitrary orientation of these panels. The proposed method does not rely on horizontal bounding boxes alone; rather, it leverages them to generate rotated bounding boxes at training time. Using this method, the authors could precisely identify three types of solar panels with various orientations. The authors also compare the identification of the three types of solar panels, namely water heater photovoltaic (WPV), farm-type photovoltaic (FPV), and single photovoltaic (SPV), in terms of box loss, objectness loss, classification loss, precision, and recall.
Classification and identification of plants are necessary from the perspective of agricultural specialists as well as botanical research. The traditional methods of finding information on a specific plant consume time and effort. Deploying a machine learning algorithm can play a vital role in identifying as well as classifying plants. As such, in the chapter “PlantML: Some Aspects of Investigation on Deployment of Machine Learning Algorithm for Detection and Classification of Plants”, the authors propose a novel model, based on a machine learning algorithm, that can be deployed to identify flowers and fruits. The chapter presents the experimental arrangement of the proposed model, referred to as PlantML, along with the use case and the activity diagram of the system. A comparative analysis of the machine learning algorithms applicable to PlantML is discussed. In this work, deep network knowledge is used to train the data sets considering the features of the ImageNet model of deep neural networks. The TensorFlow framework is used for deployment. The viability of the work is evaluated to find evidence that PlantML is suitable and can act as a supplementary tool for agricultural as well as botanical research. 
The study concludes that the proposed model can recognize the different types of flowers and fruits at a higher accuracy.
In the realm of computer science, recommender systems (RSs) are a set of tools and methods for making useful product recommendations to end-users. Telecoms provide a wide range of offerings to maintain footholds in a competitive industry. It is challenging for a client to choose the best-fit product from the huge bouquet of products available. It is possible to increase suggestion quality by using the large amounts of textual contextual data detailing item qualities that are accessible alongside rating data in various recommender domains. Users have a hard time making purchases in the telecom industry. A fresh strategy for improving recommendation systems in the telecommunications industry is proposed in the chapter “Big Data Analytics-Based Recommendation System Using Ensemble Model”. Users may choose the recommended services which are loaded onto their devices. Using a recommendation engine is a simple way for telecoms to increase trust and the customer satisfaction index. The suggested recommendation engine allows users to pick and choose the services they need. The present study compares two distinct recommendation frameworks: a single-algorithm model and an ensemble-algorithm model. Experiments are conducted to compare the efficacy of the separate algorithms and the ensemble algorithm. Interestingly, the ensemble algorithm-based recommendation engine provides better recommendations than the individual algorithms.
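As a rough illustration of how an ensemble recommender can combine individual algorithms, the following minimal sketch averages the min-max normalized scores of several recommenders and ranks the items. It is not the chapter's model: the function names, the weighting scheme, and the sample plan names are all assumptions made for illustration.

```python
def normalize(scores):
    """Min-max scale a {item: score} dict to [0, 1]; constant scores map to 0.0."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {item: (s - lo) / span if span else 0.0 for item, s in scores.items()}


def ensemble_rank(model_scores, weights=None, top_n=3):
    """Weighted average of normalized scores from several recommenders.

    model_scores: list of {item: score} dicts, one per individual algorithm
    (hypothetical inputs; each algorithm may use its own score scale).
    Returns the top_n items, best first; ties are broken alphabetically.
    """
    if weights is None:
        weights = [1.0] * len(model_scores)
    total = sum(weights)
    combined = {}
    for scores, w in zip(model_scores, weights):
        for item, s in normalize(scores).items():
            combined[item] = combined.get(item, 0.0) + w * s / total
    ranked = sorted(combined, key=lambda item: (-combined[item], item))
    return ranked[:top_n]


# Hypothetical scores from two individual recommenders on different scales.
content_based = {"plan_a": 0.9, "plan_b": 0.2, "plan_c": 0.5}
collaborative = {"plan_a": 3.0, "plan_b": 5.0, "plan_c": 4.5}
top = ensemble_rank([content_based, collaborative])
```

Here each individual algorithm favours a different plan, while the averaged ranking puts `plan_c` first because both algorithms rate it reasonably well. Normalizing before averaging keeps one algorithm's larger score scale from dominating the result.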
The chapter “Uncertainty Management in Brain Data for Olfactory Perceptual-Ability Assessment of Human Subjects by General Type-2 Fuzzy Reasoning” introduces an interesting approach to assessing olfactory perceptual ability and its gradual degradation over months for both healthy persons and people suffering from early olfactory ailments. Functional near-infrared spectroscopic (f-NIRs) data acquired from the experimental subjects’ brains are pre-processed and then fed to a novel general type-2 fuzzy regression unit to predict the subjective olfactory perceptual ability. The model parameters are corrected using subjective feedback about the olfactory stimuli concentration. During the test phase, the model is used to predict perceptual degradation in olfaction for patients suffering from olfactory ailments. The prediction error computed with respect to the subject’s self-assessment of stimulus concentration is used as the performance metric of the proposed prediction algorithm. The original contribution of the work lies in the formulation to handle uncertainty in multi-trial, multi-session experimental brain data using general type-2 fuzzy logic. The proposed technique outperforms traditional type-2 fuzzy techniques with respect to the per cent success rate and run-time complexity.
The cloud computing framework has been growing in importance in recent times. The public cloud is more challenging than the private cloud: it has inherent challenges of security, service reliability, and time-constrained requirements for providing service on demand. Resource management is an important aspect of the public cloud. The efficient allocation of resources across different service peers or servers is an important step towards error management and fault management. 
The chapter “Checkpoint-Based Round-Robin Scheduling Technique Toward Fault-Tolerant Resource Management in Cloud Computing” proposes a modified round-robin resource scheduling mechanism with multiple checkpoints that take care of error handling and fault management. The proposed method has been tested on different benchmarks, which established its robustness.
Effective identification of a small set of nodes within a network which can potentially cover many nodes in the remaining network is known as the influence spread process. Influence spreading among a maximum number of nodes is called the influence maximization process. The influence maximization task is computationally hard, involving promising seed set selection and estimating the maximum influence spread throughout the network. A community detection algorithm to figure out an effective seed set for influence maximization within an acceptable execution time is the key essence of the chapter “Cost Optimized Community-Based Influence Maximization”. The proposed community-based identification method involves three stages. The first stage detects communities in the given network, the second stage analyses community structure to select candidate nodes within the communities, and the third stage identifies promising influential members from the candidate set to make a target set. Ultimately, average influence spread is measured through the Monte Carlo simulation technique. The proposed algorithm has been rigorously tested on two real-world social network data sets to establish its usefulness and efficacy.
Artificial intelligence (AI) and machine learning (ML) have slowly but steadily become integral to human lives. While much remains to be learned, they have surely contributed to research works beyond the human mind’s comprehension. The chapter
“A Novel Model for Automated Identification of Terrestrial Species” focuses on the concept of standard machine learning techniques used for terrestrial species identification using 3D images. A novel methodology is proposed for the extraction of the characteristics and structural properties of these species based on images using convolutional neural networks (CNNs). This study mainly contributes to disseminating awareness and knowledge to research groups in similar fields of study. It has been observed that, despite the advancement of science and technology, people still cannot differentiate a buffalo from a bison. In addition, machine learning is still new to the field of species studies. The emergence of AI and ML is slowly changing educational entities and industrial services, and rightly so. The proposed methodology can also act as a supplementary tool for restructuring traditional higher education, which is still prevalent in India.
Object detection is one of the most popular areas of research in computer vision. Monitoring marine ecosystems is now necessary for saving nature. Sending human beings to observe the marine environment can be dangerous for some tasks. Multiple works are in progress to improve autonomous underwater vehicles (AUVs) for monitoring the underwater environment. In aquaculture fisheries, fish monitoring by AUV has gained great importance. Information on different types of fish must be collected to maintain the marine ecosystem. The chapter “Fish Detection from Underwater Images Using YOLO and Its Challenges” concentrates on using real-time object detectors like YOLOv3 and YOLOv4. This work uses the Roboflow fish detection data set to validate the proposed method. 
In the chapter “Design and Realization of an IoT-Based System for Real-Time Monitoring of Embryo Culture Conditions in IVF CO2 Incubators”, the authors have designed and implemented an Internet-of-things (IoT)-based real-time embryo culture conditions monitoring and alerting system for CO2 incubators used in the in vitro fertilization (IVF) process. Industry-grade sensors capable of accurately sensing concentrations of carbon dioxide (CO2) and total volatile organic compound (VOC), temperature, and humidity have been placed inside the incubator chamber under a culture environment. These sensor data are sent periodically to the microcontroller unit outside the incubator through a flat flexible cable (FFC) for efficient data acquisition and further processing. A minicomputer unit and a touch screen display are used to display the real-time culture conditions, i.e. CO2, VOC, temperature, and humidity. Complete local desktop access to the proposed system can be obtained from any personal computer or smartphone using a cross-platform screen-sharing system. The minicomputer unit is capable of storing real-time culture data in an appropriate format in the cloud. The minicomputer unit is registered to a specific cloud service so that its desktop can be fully accessed from anywhere in the world through the Internet. SMS and e-mail alerts are sent immediately to one or more pre-registered recipients via an application programming interface (API) bulk SMS service and e-mail server if one or more culture parameters cross the predefined allowable limit. Further test runs have been performed for a prolonged period (eighteen months) to determine the period after which the recalibration of the sensors is required.
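The alerting rule described above, where an alert is raised when one or more culture parameters cross a predefined allowable limit, can be sketched roughly as follows. This is only an illustrative sketch: the parameter names and limit values are hypothetical and not taken from the chapter, and the actual SMS and e-mail delivery is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    low: float
    high: float


# Hypothetical allowable bands, for illustration only (not from the chapter).
LIMITS = {
    "co2_percent": Limit(5.0, 7.0),
    "voc_ppb": Limit(0.0, 500.0),
    "temperature_c": Limit(36.5, 37.5),
    "humidity_percent": Limit(85.0, 100.0),
}


def out_of_range(reading, limits=LIMITS):
    """Return {parameter: value} for every reading outside its allowed band."""
    return {
        name: value
        for name, value in reading.items()
        if name in limits and not (limits[name].low <= value <= limits[name].high)
    }


def alert_message(reading, limits=LIMITS):
    """Build the alert text, or return None when all parameters are in range."""
    violations = out_of_range(reading, limits)
    if not violations:
        return None
    parts = [
        f"{name}={value} (allowed {limits[name].low}-{limits[name].high})"
        for name, value in sorted(violations.items())
    ]
    return "Incubator alert: " + "; ".join(parts)


# One in-range reading and one with the temperature above its band.
ok = {"co2_percent": 6.0, "voc_ppb": 120.0,
      "temperature_c": 37.0, "humidity_percent": 95.0}
bad = dict(ok, temperature_c=38.2)
```

In a deployed system the returned message would be handed to whatever SMS or e-mail gateway is in use; keeping the limit check separate from message delivery makes the rule easy to test on its own.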
Multiprocessor system-on-chip (MP-SoC) architectures with increasing numbers of cores benefit from network-on-chip (NoC) communication methods, which are robust and scalable. Ultimately, router performance depends on the microarchitecture of the router and its routing algorithm, which determine its throughput and latency properties. Routing algorithms are therefore crucial NoC router design options, since they govern how packets travel through the network. In look-ahead routing, the router first computes the desired output port, which establishes its local blockage scenario, and then transfers the flits to the neighbouring router’s desired output ports. The chapter “FPGA Implementations and Performance Analysis of Different Routing Algorithms for the 2D-Mesh Network-On-Chip” proposes a synchronous predictive routing computation for distributed scalable predictable interconnect networks (DSPIN), along with an RTL implementation and analysis of some popular routing algorithms for the 2D mesh network topology. Parametrized Verilog HDL has been used for the RTL implementations, and Xilinx Vivado has been used for the PAD (power, area, and delay) analysis. Power dissipation for the design has been estimated with the help of an Xpower analyser and Xpower estimator.

Dependence on artificial intelligence (AI) decision-making methods has increased recently. Attention centres on convolutional networks for efficient feature extraction and accurate generation of valuable insights. Understanding the internal workings of such complex models, to ensure trust in the resulting systems, is challenging for developers and end-users. A growing body of work on explainability methods with visualizations focuses on model interpretation. Class activation mapping (CAM), in this context, is used for visualizing the discriminative image regions that CNNs use for classification.
Existing gradient-based CAMs like Grad-CAM, Grad-CAM++, and XGrad-CAM have certain disadvantages such as saturation and false confidence. In the chapter “A Comparative Analysis of Non-gradient Methods of Class Activation Mapping”, the authors perform a comparative analysis of some recently developed non-gradient CAM approaches like Eigen CAM, Score CAM, and Ablation CAM. The efficiency of these methods has been analysed using the remove and debias evaluation metrics on benchmark data sets.

The performance of the multi-layer perceptron (MLP) degrades on high-resolution images due to vanishing gradients and overfitting. However, the performance of an MLP can be improved with efficient weight initialization techniques. The chapter “Deep Learning-Based Weight Initialization on Multi-layer Perceptron for Image Recognition” puts forward a systematic deep neural network (DNN)-based weight initialization strategy for an MLP to enhance its classification accuracy. Moreover, the training of an MLP may not converge due to the presence of many local minima. Local minima can be avoided by initializing the weights properly instead of randomly. A restricted Boltzmann machine (RBM) has been used in this chapter to pretrain the MLP. The MLP is pretrained layer by layer, covering the weights between each pair of neighbouring layers up to the penultimate layer. The whole network is then fine-tuned after the pretraining by reducing the mean square error (MSE). To compare the performance of
the proposed initialization of weights and random weight initialization, two standard image classification data sets, (i) CIFAR-10 and (ii) STL-10, are used. The outcomes of the simulation showcase the enhancement of the performance of the MLP with the proposed weight initialization. Furthermore, the proposed method yields a better convergence speed than a standard MLP.

Due to the spread of the novel coronavirus all over the world in January 2020, India underwent a lockdown for three consecutive months and thereafter partial lockdowns as and when necessary. In spite of several devastating effects of the lockdown, there was certainly a positive impact on ambient air quality. The chapter “Assessment of Air Quality as a Positive Impact of COVID-19 Lockdown with Reference to an Industrial City and a Populated City of India” investigates how the quality of air improved from the pre-lockdown period to the lockdown period and how it degraded again during the post-lockdown period. In this study, several air quality parameters were collected from the Central Pollution Control Board, Government of India, and plotted to analyse their impact on the environment. Two different kinds of cities, an industrial city (Bhiwadi, Rajasthan) and a highly populated city (Delhi), were considered for the study.

A large volume of data is produced by the digital transformation, with the extensive use of the Internet and global communication systems. Big data denotes this enormous mass of data, which cannot be managed by traditional data handling methods and techniques. This data is generated every few milliseconds in structured, semi-structured, and unstructured forms. Big data analytics is extensively used in enterprises and plays an important role in various fields of application.
The chapter “Applications of Big Data in Various Fields: A Survey” presents applications of big data in various fields, such as healthcare systems, social media data, e-commerce applications, agriculture applications, smart city applications, and intelligent transport systems. The chapter also focuses on the characteristics and storage technology of big data in these applications.

A healthcare information management system produces a huge volume of healthcare data. The information related to each patient may vary according to his/her health situation. This health information for all patients needs to be stored in a database for future use. The patient information is huge in volume, varied, and unstructured in nature. It is very difficult to normalize and store such data in a traditional RDBMS. Hence, there is a need to use NoSQL to store such data in a big data environment. NoSQL can handle unstructured medical data, where a row id identifies each particular patient and varieties of medical information can be stored in particular column families. In the chapter “Performance Analysis of Healthcare Information in Big Data NoSql Platform”, the authors present big data characteristics in the healthcare system, focusing particularly on NoSQL databases. They propose an architectural model that uses HBase as a NoSQL store on top of the Hadoop platform and show how the performance of query execution differs according to the data volume stored in HBase.

An artificial intelligence (AI)-based smart Internet protocol (IP) camera system for examination monitoring has been designed, realized, and rigorously tested under a realistic environment in the chapter “IoT-Based Smart Internet Protocol Camera for Examination Monitoring”. The purpose of the proposed IP cameras is to carry out
real-time remote surveillance of the examination hall as well as to store the corresponding video recordings in cloud storage. Internet-of-things (IoT) technology is used to remotely access the IP cameras and stream video in real time to the web browser of any computer/smartphone connected to either an internal or an external network; a dedicated cloud service (named “remoteit”) is used for this purpose. In practice, video captured at 640 × 480 pixels and 30–35 frames per second (fps) is used for online streaming, whereas compressed video at 320 × 240 pixels and 30–35 fps is used for recording and storing in the cloud. In addition, a custom TensorFlow Lite model has been built to detect undesirable chaos in the examination hall. A bulk short messaging service (SMS) application programming interface (API) is used to send a real-time alert to the competent authority so that the necessary action can be initiated. In future, the proposed system may be customized for various other sectors for security and alerting purposes.

Designing a low-complexity massive multiple-input multiple-output (MIMO) receiver with optimal performance is challenging. Several low-complexity approaches to massive MIMO detection are reported in the literature. However, when the ratio of receiving to transmitting antennas is lower than four, conventional linear detectors do not perform well. The recently developed large-MIMO approximate message passing algorithm (LAMA) shows near-optimal detection performance. However, its complexity is still high. In the chapter “Efficient Low-Complexity Message Passing Algorithm for Massive MIMO Detection”, the authors have proposed an efficient approach to updating the mean and variance in LAMA. A termination condition is added to reduce unnecessary computations. Simulation results show that the error performance of the proposed algorithm is almost identical to that of the conventional method.
Also, a significant complexity reduction is achieved by the proposed method.

Influence maximization refers to reaching the most promising part of a network by selecting a small number of nodes. Computing the maximum influence spread throughout a network based on influential seed sets is a computationally hard problem. The chapter “An Effective Community-Based Greedy Approach Over Selected Communities for Influence Maximization” attempts to minimize the execution time and maximize the influence spread. The technique is community-based and comprises three steps: (1) identify communities within a network, (2) locate the most significant communities and candidate vertices, and (3) trace the most influential nodes among candidate vertices. Experiments have been conducted on two real-world social network data sets to understand the efficacy and usefulness of the proposed algorithm.

This volume is a novel attempt to enrich the existing knowledge base on intelligence-based research foundations and applications. The editors would feel
rewarded if the volume comes to the benefit of budding researchers exploring the field further to unearth indigenous intelligent models and frameworks for the future.

Siddhartha Bhattacharyya, Birbhum, India/Zagreb, Croatia
Gautam Das, Cooch Behar, India
Sourav De, Cooch Behar, India
Leo Mrsic, Zagreb, Croatia/Novo mesto, Slovenia
December 2022
Contents

Rule Extraction Methods from Neural Networks
Sergey Yarushev, Alexey Averkin, and Vladimir Kosterev

Quality Analysis of the Ganges River Water Utilizing Machine Learning Technologies
Prasenjit Dey, Sudip Kumar Adhikari, Arnab Gain, and Suman Koner

Eye-Tracking Movements—A Comparative Study
Sunny Saini, Anup Kumar Roy, and Saikat Basu

Cybersecurity Imminent Threats with Solutions in Higher Education
Mahendra Kumar Gourisaria, Abtsega Tesfaye Chufare, and Debajyoty Banik

A Two-Level Fuzzy Model for Filtering Signals of the Automatic Dependent Surveillance-Broadcast
Bobyr Maxim, Arkhipov Alexander, and Milostnaya Natalia

Toward More Robust Multiclass Aerial Solar Panel Detection and Classification
Indrajit Kar, Sudipta Mukhopadhyay, and Bijon Guha

PlantML: Some Aspects of Investigation on Deployment of Machine Learning Algorithm for Detection and Classification of Plants
Gavel D. Kharmalki, Gideon D. Kharsynteng, Narisha Skhemlon, Abhijit Bora, and Gypsi Nandi

Big Data Analytics-Based Recommendation System Using Ensemble Model
Devika Rani Roy, Sitesh Kumar Sinha, and S. Veenadhari

Uncertainty Management in Brain Data for Olfactory Perceptual-Ability Assessment of Human Subjects by General Type-2 Fuzzy Reasoning
Mousumi Laha and Amit Konar

Checkpoint-Based Round-Robin Scheduling Technique Toward Fault-Tolerant Resource Management in Cloud Computing
Jayanta Datta, Subhamita Mukherjee, and Indrajit Pan

Cost Optimized Community-Based Influence Maximization
Mithun Roy, Subhamita Mukherjee, and Indrajit Pan

A Novel Model for Automated Identification of Terrestrial Species
Pradei Sangkhro, Phidawanhun Pyngrope, Bangermayang, and Gypsy Nandi

Fish Detection from Underwater Images Using YOLO and Its Challenges
Pratima Sarkar, Sourav De, and Sandeep Gurung

Design and Realization of an IoT-Based System for Real-Time Monitoring of Embryo Culture Conditions in IVF CO2 Incubators
Sukanya Bose, Swarnava Ghosh, Subhadeep Dhang, Prasenjit Dey, and Aritra Acharyya

FPGA Implementations and Performance Analysis of Different Routing Algorithms for the 2D-Mesh Network-On-Chip
Vulligadla Amaresh, Rajiv Ranjan Singh, Rajeev Kamal, and Abhishek Basu

A Comparative Analysis of Non-gradient Methods of Class Activation Mapping
Mrittika Chakraborty, Sukanya Sardar, and Ujjwal Maulik

Deep Learning-Based Weight Initialization on Multi-layer Perceptron for Image Recognition
Sourabrata Mukherjee and Prasenjit Dey

Assessment of Air Quality as a Positive Impact of COVID-19 Lockdown with Reference to an Industrial City and a Populated City of India
Anurag Nayak, Tunnisha Dasgupta, Amit Shiuly, and Suman Koner

Applications of Big Data in Various Fields: A Survey
Sukhendu S. Mondal, Somen Mondal, and Sudip Kumar Adhikari

Performance Analysis of Healthcare Information in Big Data NoSql Platform
Sukhendu S. Mondal, Somen Mondal, and Sudip Kumar Adhikari

IoT-Based Smart Internet Protocol Camera for Examination Monitoring
Sukanya Bose, Swarnava Ghosh, Subhadeep Dhang, Rohan Karmakar, Prasenjit Dey, and Aritra Acharyya

Efficient Low-Complexity Message Passing Algorithm for Massive MIMO Detection
Sourav Chakraborty, Salah Berra, Nirmalendu Bikas Sinha, and Monojit Mitra

An Effective Community-Based Greedy Approach Over Selected Communities for Influence Maximization
Mithun Roy, Subhamita Mukherjee, and Indrajit Pan

Author Index
About the Editors
Dr. Siddhartha Bhattacharyya [FRSA, FIET (UK), FIEI, FIETE, LFOSI, SMIEEE, SMACM, SMAAIA, SMIETI, LMCSI, LMISTE] is currently the Principal of Rajnagar Mahavidyalaya, Birbhum, India. He is also serving as a scientific advisor at Algebra University College, Zagreb, Croatia. Prior to this, he was a Professor at CHRIST (Deemed to be University), Bangalore, India. He also served as the Principal of RCC Institute of Information Technology, Kolkata, India. He has served VSB Technical University of Ostrava, Czech Republic as a Senior Research Scientist. He is the recipient of several coveted national and international awards. He received the Honorary Doctorate Award (D.Litt.) from The University of South America and the SEARCC International Digital Award ICT Educator of the Year in 2017. He was appointed as the ACM Distinguished Speaker for the tenure 2018–2020. He has been appointed as the IEEE Computer Society Distinguished Visitor for the tenure 2021–2023. He is a co-author of six books and the co-editor of 98 books and has more than 400 research publications in international journals and conference proceedings to his credit.

Dr. Gautam Das is a professor in the ECE department of Cooch Behar Government Engineering College, West Bengal. He completed his B.Tech. and M.Tech. at the Institute of Radio Physics and Electronics, Calcutta University, and subsequently completed his Ph.D. at NBU. Dr. Das has more than 20 years of teaching and research experience. He has authored and co-authored many journal and conference papers and has participated in and organized national and international conferences. His areas of interest include System-on-Chip Testing and Design of Smart City.

Dr. Sourav De did his B.E. (IT) and M.E. (IT) in 2002 and 2005, respectively. He completed his Ph.D. in Computer Science and Technology at IIEST in 2015. He is currently an Associate Professor in the CSE Department of Cooch Behar Government Engineering College, West Bengal, India. He is a co-author of one book and the co-editor of 12 books and has more than 59 research publications in internationally reputed journals, international edited books, and international IEEE conference proceedings, as well as five patents and two copyrights, to his credit. He has also served as a reviewer for different reputed international journals. He has been a member of the organizing and technical program committees of several national and international conferences. His research interests include soft computing, pattern recognition, image processing, and data mining. Dr. De is a Senior member of IEEE and also a member of ACM, Institute of Engineers (IEI), CSTA, IAENG, Hong Kong. He is a life member of ISTE, India.

Dr. Leo Mrsic is the Vice Dean for Science and Research, Algebra LAB SME digital incubator manager, Algebra LAB EDIH manager and Digital Transition Consultant with a focus on projects of digitization and digital entrepreneurship for public and private sector clients. As part of the research team at Algebra University, he focuses on strategy, processes, technology, and governance of activities related to digitally empowered solutions. Leo’s experience is based on supporting the ecosystem (clients and stakeholders) to select, define, implement and manage appropriate processes and methodologies to meet their goals. He specializes in providing strategic and operational advisory with a primary focus on increasing the efficiency of processes and services provided in the public or private sector and designing monitoring and reporting tools for organizations. His experience includes process and data analysis, business process redesign, organizational improvements, change management, design and implementation of monitoring and evaluation systems, specification and project management. He has extensive experience in supporting public stakeholders in Croatia, where he was an instrumental member in managing and coordinating teams on various projects, focusing on ensuring excellence of projects’ performance (scope, cost, schedule, quality). He holds a BBA degree in the field of insurance, an M.Sc. degree in the field of business statistics/economics, and a Ph.D. degree in the field of data science. He is a permanent court expert in the fields of finance, accounting and computer science. He is also a Registered Consultant at GOPA Consulting Group Germany in the areas of business consulting, application of analytical/statistical methods, labor market analysis and support for educational policy-making, and a member of the ESCO Maintenance Committee of the European Commission in Brussels in the third convocation (2018–2022).
Rule Extraction Methods from Neural Networks
Sergey Yarushev, Alexey Averkin, and Vladimir Kosterev
Abstract The widespread introduction of artificial intelligence systems in all areas of human activity imposes certain requirements on these systems. In particular, systems based on artificial neural network (ANN) models that operate in critical areas such as health care, economics, and security should have an explanatory apparatus, so that one can evaluate not only the familiar accuracy of recognition, prediction, or recommendation, but also see how the neural network arrived at its result. In this paper, we investigate methods for rule extraction from ANN that are based on fuzzy logic. We consider several possible methods for extracting rules, such as the mountain clustering algorithm and decision trees.

Keywords Fuzzy logic · Decision trees · Machine learning · Explainable artificial intelligence · Neural networks
S. Yarushev
Departments of Informatics, Plekhanov Russian University of Economics, Moscow, Russia

A. Averkin
Educational and Scientific Laboratory of Artificial Intelligence, Neuro-Technologies and Business Analytics, Plekhanov Russian University of Economics, Moscow, Russia

V. Kosterev
National Research Nuclear University (MEPhI), Moscow, Russia

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_1

1 Introduction

Research in the field of explainable artificial intelligence has been actively conducted since the mass introduction and application of artificial intelligence methods, in particular artificial neural networks (ANN), began. The reason was the inability of an ANN to answer a simple question: how was the result achieved? Artificial neural networks, including deep learning networks, already surpass human capabilities in many tasks, for example image recognition or the search for regularities. But the mass introduction of such technologies was followed by a number of problems: for a result to be trusted, it must be possible to determine on what basis the ANN made a particular decision. Legal restrictions also apply, because decisions made by artificial intelligence cannot be legally justified. To solve these problems, scientists around the world are working on the creation of explainable artificial intelligence (XAI). One approach to creating explainable models based on ANN is to extract rules from them using fuzzy logic methods, genetic algorithms, or decision trees [1]. In this paper, we consider the operation of decision trees in the problem of extracting rules from ANN.

One of the most important fields of application of neural networks is time series forecasting, where it is extremely important to understand how an artificial neural network made a particular decision. One way of solving this problem is to apply fuzzy logic theory. To begin with, let us say a few words about time series and the importance of rule extraction from ANN when working with them. A time series contains two things: time and events. These events carry some uncertainty. Each point in a time series is associated with a fuzzy variable that has a main membership function.
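The association between a time series point and a fuzzy variable can be illustrated with a minimal sketch. It assumes triangular membership functions and three hypothetical linguistic terms ("low", "medium", "high") on a series scaled to [0, 1], and it reads the "main" membership function as the term with the highest membership degree; none of these choices is specified in the text.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical linguistic terms for a series scaled to [0, 1] (an assumption).
TERMS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

def fuzzify(value):
    """Return the term with the highest membership and all membership degrees."""
    degrees = {name: triangular(value, *abc) for name, abc in TERMS.items()}
    best = max(degrees, key=degrees.get)
    return best, degrees

series = [0.1, 0.45, 0.9]
labels = [fuzzify(v)[0] for v in series]  # one linguistic label per time point
```

Each observation is thus replaced by a linguistic label plus its membership degrees, which is the form in which fuzzy rules can later refer to it.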
2 A Short Survey of Rule Extraction Algorithms

Fuzzy logic theory is well suited to explaining how artificial neural networks work. In this section, we present a few possible algorithms for rule extraction from ANN: the mountain clustering algorithm, which is used to form a base of fuzzy rules and the linguistic terms of fuzzy variables, and fuzzy cognitive maps.

Mountain clustering, or clustering without specifying the number of clusters in advance, is not itself fuzzy, but it is often used when generating the linguistic terms of fuzzy variables and fuzzy rules. The algorithm proceeds in several stages. At the first stage, the element with the maximum value is found; by default, this element is taken as the first cluster found [2–4]. At the second stage, the data are normalized, i.e., all values of the source data are brought into a unit hypercube. At the third stage, potential cluster centers are formed. Their number should be finite, and the clustering objects themselves can act as centers. The potential of a candidate center depends on the distances between it and the other clustering objects. The point with the highest potential is designated as the center of a cluster; in most cases, it is surrounded by other fairly high peaks. For this reason, choosing the next cluster center simply as the point with the maximum potential among the remaining candidates would produce many clusters close to one another. To search for the next cluster, it is necessary to exclude the influence of the cluster already found on the remaining candidates. Therefore, the potential of the found cluster is subtracted from the potentials of the other candidates [5, 6]. At the fourth stage, the potentials of the remaining candidates are recalculated. Steps 3 and 4 are repeated as long as the maximum remaining potential exceeds the specified threshold. After the cluster centers are formed, the source data and the cluster centers are denormalized. The cluster centers found serve as the fuzzy terms of the linguistic variables. The width of these linguistic terms depends on the radius of the clusters found [7, 8].

The fuzzy cognitive map (FCM) is another method for developing explainable systems based on fuzzy logic and cognitive modeling. It was proposed by B. Kosko and is used to model the relationships between the concepts of a subject area. An FCM is a fuzzy directed graph with feedback whose nodes are fuzzy sets. The directed edges of the graph reflect relationships between the nodes and numerically characterize the degree of influence (weight) between the connected nodes. Unlike in simple cognitive maps, the strength of influence between factors is set using linguistic values selected from an ordered set of possible influence strengths, while the value of a factor and its increments are specified as linguistic variables whose values are selected from ordered sets of the values the factor can take and of its possible increments (the factor scale and the increment scale) [9–11]. From the point of view of artificial intelligence, FCMs are neural networks trained with supervision: the more data is available for modeling a problem, the better the FCM adapts and develops a suitable solution. Thus, FCMs are well suited to problems of finding a solution among a variety of alternatives.
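To make the fuzzy cognitive map description more concrete, the following is a minimal sketch of one common inference scheme: the activation of each concept is recomputed from the weighted influence of the other concepts and squashed into [0, 1]. The sigmoid squashing function, the numeric mapping of linguistic influence strengths, and the three-concept example are all assumptions made for illustration and are not taken from the chapter.

```python
import math

# Hypothetical mapping from linguistic influence strengths to numeric weights.
STRENGTH = {"none": 0.0, "weak": 0.3, "strong": 0.8}

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fcm_step(state, weights):
    """One update: concept j receives the weighted activations of all concepts i.

    weights[i][j] is the influence of concept i on concept j.
    """
    n = len(state)
    return [
        sigmoid(sum(state[i] * weights[i][j] for i in range(n)))
        for j in range(n)
    ]

def fcm_run(state, weights, steps=50, tol=1e-6):
    """Iterate until the activations stop changing or the step limit is hit."""
    for _ in range(steps):
        new_state = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(state, new_state)) < tol:
            return new_state
        state = new_state
    return state

# Three illustrative concepts; the last one inhibits the first.
W = [
    [STRENGTH["none"], STRENGTH["strong"], STRENGTH["none"]],
    [STRENGTH["none"], STRENGTH["none"], STRENGTH["weak"]],
    [-STRENGTH["strong"], STRENGTH["none"], STRENGTH["none"]],
]
final_state = fcm_run([1.0, 0.0, 0.0], W)
```

The feedback edge from the third concept to the first is what distinguishes this from a simple feed-forward computation: the state has to be iterated rather than evaluated once.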
3 Extraction of Fuzzy Production Rules Based on Mountain Clustering

Automated training of ANFIS (adaptive neuro-fuzzy inference system) assumes automatic extraction of fuzzy production rules from the training set. In the current version of the developed software library, this is done on the basis of mountain clustering. A distinctive feature of mountain clustering is that the number of clusters does not have to be set in advance. The adjustable parameters of mountain clustering are:

• radii: admissible cluster radius;
• sqshFactor: cluster suppression factor;
• acceptRatio: cluster acceptance ratio;
• rejectRatio: cluster rejection ratio.
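The role of these four parameters can be illustrated with a short sketch written in the style of subtractive clustering, a well-known variant of mountain clustering in which the data points themselves are the candidate centers. This is not the authors' library code: the default values, the potential formulas, and the grey-zone acceptance test are assumptions taken from the usual textbook formulation.

```python
import math

def normalize(points):
    """Scale every coordinate to [0, 1] (unit hypercube)."""
    dims = range(len(points[0]))
    lo = [min(p[d] for p in points) for d in dims]
    hi = [max(p[d] for p in points) for d in dims]
    span = [(hi[d] - lo[d]) or 1.0 for d in dims]
    return [[(p[d] - lo[d]) / span[d] for d in dims] for p in points], lo, span

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mountain_clustering(points, radii=0.5, sqshFactor=1.25,
                        acceptRatio=0.5, rejectRatio=0.15):
    data, lo, span = normalize(points)
    alpha = 4.0 / radii ** 2                  # neighbourhood of a candidate
    beta = 4.0 / (sqshFactor * radii) ** 2    # neighbourhood that gets suppressed
    # Potential of each point: high when many neighbours are close.
    pot = [sum(math.exp(-alpha * sqdist(p, q)) for q in data) for p in data]
    first = max(pot)
    centers = []
    while True:
        k = max(range(len(data)), key=pot.__getitem__)
        ratio = pot[k] / first
        if ratio < rejectRatio:
            break                             # nothing significant is left
        if centers and ratio <= acceptRatio:
            # Grey zone: accept only if far enough from the existing centers.
            dmin = min(math.sqrt(sqdist(data[k], c)) for c in centers)
            if dmin / radii + ratio < 1.0:
                pot[k] = 0.0
                continue
        centers.append(data[k])
        peak = pot[k]
        # Subtract the influence of the new center from all potentials.
        pot = [v - peak * math.exp(-beta * sqdist(p, data[k]))
               for v, p in zip(pot, data)]
    # Denormalize the centers back to the original units.
    return [[c[d] * span[d] + lo[d] for d in range(len(c))] for c in centers]

points = [(0, 0), (0.2, 0.1), (0.1, 0.2), (10, 10), (10.2, 9.9), (9.9, 10.1)]
centers = mountain_clustering(points)
```

In this sketch a smaller radii yields more, narrower clusters, while sqshFactor controls how strongly points near an accepted center are discouraged from becoming centers themselves.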
The mountain clustering algorithm is shown in Fig. 1, and it is part of the developed software library. The ANFIS output linguistic variables implemented on the basis of the Takagi–Sugeno algorithm are described as first-order polynomials:
Fig. 1 Mountain clustering visualization
1. mf1: −1.4572 − 0.13 * input1 + 70.9998 * input2 − 65.5718 * input3;
2. mf2: 23.6074 + 0.0008 * input1 + 1201.5427 * input2 − 1763.0497 * input3;
3. mf3: 371.0949 − 0.9062 * input1 − 2.4704 * input2 − 5.9843 * input3;
4. mf4: 3.2298 + 0.7281 * input1 − 0.5894 * input2 + 0.323 * input3;
where mf1, mf2, mf3, and mf4 are the names of the linguistic terms of the output variable. After automated learning, the ANFIS production model can be used to solve problems in the subject area.
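As an illustration of how such first-order consequents are typically used, the sketch below evaluates the four polynomials listed above and combines them by the firing-strength-weighted average that is standard for Takagi–Sugeno systems. The firing strengths passed in are hypothetical; in ANFIS they would come from the input membership functions, which are not given in the text.

```python
# Coefficients (bias, input1, input2, input3) copied from the polynomials above.
CONSEQUENTS = {
    "mf1": (-1.4572, -0.13, 70.9998, -65.5718),
    "mf2": (23.6074, 0.0008, 1201.5427, -1763.0497),
    "mf3": (371.0949, -0.9062, -2.4704, -5.9843),
    "mf4": (3.2298, 0.7281, -0.5894, 0.323),
}

def consequent(name, inputs):
    """Evaluate one first-order polynomial for the given three inputs."""
    bias, *coeffs = CONSEQUENTS[name]
    return bias + sum(c * x for c, x in zip(coeffs, inputs))

def sugeno_output(inputs, firing):
    """Weighted average of rule consequents; `firing` maps term name to strength."""
    total = sum(firing.values())
    if total == 0:
        raise ValueError("no rule fired")
    return sum(w * consequent(n, inputs) for n, w in firing.items()) / total

# Hypothetical firing strengths for one input vector.
y = sugeno_output((1.0, 0.0, 0.0), {"mf1": 0.25, "mf4": 0.75})
```

Because each rule contributes a linear function of the inputs, the extracted model stays readable: every rule can be inspected as a condition plus an explicit formula.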
4 Rule Extraction Algorithm Based on Decision Trees

The best-known disadvantage of an ANN is that it is impossible to determine how the network arrived at its result. An ANN essentially works as follows: it receives a training sample, is trained, receives input data, processes it, and returns a result. What happens inside the neural network, and how it makes a particular decision, remains unclear. As mentioned in the previous sections, this imposes significant restrictions on the use of artificial intelligence algorithms in real tasks, especially tasks that are critically important for a person or their health.

A large number of algorithms have been developed for knowledge extraction from ANN; most of them work with the ANN weights and impose significant restrictions on the neural network architecture [12]. Next, we consider a rule extraction algorithm based on decision trees that places no specific requirements on the neural network model. This algorithm builds decision trees from a trained ANN. Its advantage is that it can work with an artificial neural network of any architecture, regardless of the presence of feedback connections, the training method, or the types of input and output data. The decision tree obtained from the neural network is a classification tree.

Next, let us look at the main characteristics of decision trees. Suppose there is a database in which each record is represented by a set of attributes. One attribute is the target: its value must be predicted from the values of the other attributes. The target attribute must be measured on a nominal scale, that is, its set of values must represent some kind of classification. A decision tree is a plan, presented in the form of an acyclic graph (tree), that is used to classify objects described by a set of attributes according to a target attribute. In the task of establishing the authorship of a text, the objects are texts, and the target attribute is the surname of the author. In this paper, the frequency characteristics of the text are used as the attributes describing it.

Each internal node of the tree represents a branching on one of the attributes. If the attribute is nominal, the node has as many branches as the attribute has values. If the attribute takes integer or real values, the node has only two branches, corresponding to the attribute value being less than, or greater than or equal to, a certain threshold. The leaves of the tree contain values of the target attribute. Following the tree in accordance with the attribute values of an arbitrary object, we arrive at one of the leaves. The value of the target attribute in this leaf is the value predicted by the tree. Given the stated requirement on the target attribute, it is clear that decision trees are a tool for classifying input data. They are therefore well suited to the task of determining the authorship of texts, which is nothing more than the classification of a text (or rather, of statistical characteristics extracted from the text) by the attribute “Author”.
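The traversal just described, and the way a root-to-leaf path reads as an IF-THEN rule, can be shown with a small sketch. The tree, attribute names, thresholds, and author labels below are invented purely for illustration and do not reproduce the tree obtained in the study.

```python
# A node is either a leaf (a class label string) or a dict holding a test.
# Attribute names, thresholds and author labels are hypothetical.
TREE = {
    "attr": "sentences_len_34", "threshold": 2,
    "below": {
        "attr": "sentences_len_48", "threshold": 4,
        "below": "Author A",
        "above": "Author B",
    },
    "above": "Author C",
}

def classify(node, record, path=()):
    """Follow the branches until a leaf; return (label, conditions used)."""
    while isinstance(node, dict):
        attr, thr = node["attr"], node["threshold"]
        if record[attr] < thr:
            path += (f"{attr} < {thr}",)
            node = node["below"]
        else:
            path += (f"{attr} >= {thr}",)
            node = node["above"]
    return node, path

label, conds = classify(TREE, {"sentences_len_34": 1, "sentences_len_48": 6})
rule = "IF " + " AND ".join(conds) + f" THEN author = {label}"
```

Every leaf corresponds to exactly one such rule, so a tree with a handful of leaves yields a rule base small enough to be read by a domain expert.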
4.1 Rule Extraction Algorithm Using decision trees to extract rules from ANN allows you to extract a humanunderstandable structure from a neural network and convert the output result of the network into a hierarchical sequence of IF-THEN rules [13]. The algorithm classifies the input data in the neural network and analyzes the network itself, thereby extracting classification rules from it separately for each class. The architecture of the ANN used is shown in Fig. 2. The algorithm for constructing a decision tree approximating the operation of a trained artificial neural network looks like this: 1. Build and train the original neural network—“oracle” [14]. 2. Calculate the value q = max (0, minSamples − S), where minSamples is the minimum number of training examples used in each node of the tree, S is the current training sample (respectively, S is its volume). Thus, q is the number of additional examples that need to be generated.
Fig. 2 The structure of the ANN
3. Based on an estimate of the distribution of the features in S, randomly generate q new training examples.
4. Use the “oracle” to assign a class to both the new examples and the old examples from the set S.
5. Add the generated examples to the set S.
6. Split the set S into subsets, as in a conventional decision-tree construction algorithm.
7. For each of the resulting subsets, recurse from step 2 until a local or global stopping criterion of the algorithm is met.

Structured knowledge can be extracted in this way not only from simplified artificial neural networks but also from arbitrary classifiers, which makes it possible to apply the described algorithm to a wide range of practical tasks [15].
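The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the oracle is any function that returns a class label (here a toy rule stands in for a trained network), new examples are drawn by resampling each feature independently from the current set S, splits are chosen by Gini impurity, and a depth limit serves as the global stopping criterion. All function and parameter names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return the (feature, threshold) pair with the lowest weighted Gini, or None."""
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] < t]
            right = [lab for row, lab in zip(X, y) if row[f] >= t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(S, oracle, min_samples, rng, depth=0, max_depth=5):
    # Step 2: number of additional examples needed in this node.
    q = max(0, min_samples - len(S))
    # Step 3 (assumption): resample every feature independently from S.
    new = [[rng.choice(S)[f] for f in range(len(S[0]))] for _ in range(q)]
    # Steps 4-5: enlarge S and let the oracle label every example.
    S = S + new
    y = [oracle(x) for x in S]
    split = best_split(S, y) if depth < max_depth else None
    if split is None:  # pure node or depth limit reached: make a leaf
        return {"leaf": Counter(y).most_common(1)[0][0]}
    f, t = split
    # Steps 6-7: split S and recurse on each subset.
    left = [x for x in S if x[f] < t]
    right = [x for x in S if x[f] >= t]
    return {"feature": f, "threshold": t,
            "lt": build_tree(left, oracle, min_samples, rng, depth + 1, max_depth),
            "ge": build_tree(right, oracle, min_samples, rng, depth + 1, max_depth)}


# Toy oracle standing in for a trained network (hypothetical decision rule).
def oracle(x):
    return "A" if x[0] < 2 else "B"


S0 = [[0, 5], [1, 7], [3, 6], [4, 8]]
tree = build_tree(S0, oracle, min_samples=10, rng=random.Random(0))
```

Because each resampled coordinate comes from an example that already reached the node, the generated examples satisfy the same threshold conditions as the node itself. On this toy data the tree recovers a single split on feature 0, which mirrors the oracle's rule up to the spacing of the observed values.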
4.2 Rule Extraction from ANN Based on the Decision Trees

A pruned (thinned) tree is much more compact and simpler than an unpruned one. In addition, the pruned tree has greater classification accuracy. We therefore draw all substantive conclusions on the basis of the pruned tree. The first conclusion, apparent at first glance, is that the work of Sergei Dovlatov differs very clearly in style (here, by style we mean the frequency distribution of sentence lengths) from that of the three other writers: to make this decision, it was enough to know that the text fragment under study contains fewer than 2 sentences that are 34 words long and fewer than 4 sentences that are 48 words long. Linguistics experts will have to judge why exactly these figures emerged, and such data may be of great value to them. In contrast to the work of S. Dovlatov, the work of A.I. Kuprin is the least clearly distinguished from the others. We can see this from the way the set of records describing his works gradually breaks up into parts, often close in size. However, this feature may be caused not only by the peculiarities of the writer’s work but also by the small number of his works (18 in total) included in the study. Note that the resulting decision tree does not include all the attributes that were present in the training sample. This leads to an important conclusion: decision trees also make it possible to select the most significant characteristics and discard
unnecessary ones. This observation is very important for further research on choosing the characteristics most relevant to an individual author’s style. Having established that pruning (thinning) gives a more compact decision tree, it is natural to ask which threshold level of significance gives the most compact decision tree without reducing the accuracy of prediction. To answer this question, various threshold levels of significance from the range [0; 1] were checked, where the level 0 corresponds to the most “strict” pruning and the level 1 corresponds to no pruning at all. Figure 3 shows the dependence of the prediction accuracy on the text length. From this graph, it can be seen that after pruning, with a successful choice of the threshold level of significance (the choice can be made using the same technique as was used during testing), the percentage of errors is significantly reduced. The presented algorithm has an additional parameter reflecting the minimum number of training examples for each node. The dependence of the accuracy of the ANN on the number of training data points is shown in Fig. 4. Figure 4 demonstrates an increase in the accuracy of the ANN up to 89%. Classical algorithms based on decision trees do not achieve such accuracy; the horizontal line shows the level of accuracy that classical algorithms reach.

Fig. 3 ANN prediction efficiency in dependence of text length
Fig. 4 ANN prediction accuracy in dependence of training data points
Fig. 5 Extracted decision tree from ANN
Figure 5 illustrates how the extracted decision tree can simulate the ANN. In this example, the ANN [16] was trained for 10 epochs.
5 Discussion

We have demonstrated the possibility of extracting rules from an ANN using decision trees. As an example, we tried to determine the authorship of classical works of Russian writers with an ANN, and decision trees were chosen as the rule extraction method. The decision tree was built on the basis of the trained network. Two more approaches to rule extraction and to the development of explainable artificial intelligence systems, fuzzy cognitive maps [17] and a mining clustering algorithm, were also considered. In summary, all the presented algorithms require additional testing and research into their effectiveness as methods of extracting rules and knowledge from neural networks, but they can be considered possible ways of developing explainable artificial intelligence systems.
Acknowledgements The study was supported by the Russian Science Foundation, grant No. 22-71-10112, https://rscf.ru/en/project/22-71-10112/.
References

1. Siau, K., Wang, W.: Building trust in artificial intelligence, machine learning, and robotics. Cut. IT. J. 31(2), 47–53 (2018)
2. Averkin, A., Yarushev, S.: Hybrid neural networks and time series forecasting. Artif. Intell. Commun. Comput. Inf. Sci. 934, 230–239 (2018)
3. Pilato, G., Yarushev, S.A., Averkin, A.N.: Prediction and detection of user emotions based on neuro-fuzzy neural networks in social networks. In: Proceedings of the Third International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’18), Advances in Intelligent Systems and Computing. Sochi, Russia, vol. 875, pp. 118–126 (2018)
4. Averkin, A.N., Pilato, G., Yarushev, S.A.: An approach for prediction of user emotions based on ANFIS in social networks. In: Second International Scientific and Practical Conference on Fuzzy Technologies in the Industry. FTI 2018—CEUR Workshop Proceedings. Ostrava–Prague, Czech Republic, pp. 126–134 (2018)
5. Jin, X.-H.: Neurofuzzy decision support system for efficient risk allocation in public-private partnership infrastructure projects. J. Comput. Civ. Eng. 24(6), 525–538 (2010)
6. Jin, X.-H.: Model for efficient risk allocation in privately financed public infrastructure projects using neuro-fuzzy techniques. J. Constr. Eng. Manage. 1003–1014 (2011)
7. Borisov, V.V., Fedulov, A.S., Zernov, M.M.: Fundamentals of hybridization of fuzzy models. In: Series “Fundamentals of Fuzzy Mathematics”. Book 9, 100 p. Hot Line—Telecom, Moscow (2017)
8. Rudkovskaya, D.: Neural networks, genetic algorithms and fuzzy systems. In: D. Rudkovskaya; per. with Floor. I. D. Rudinsky, 452 p. M.: Hotline-Telecom (2008). Rajab, S., Sharma, V.: A review on the applications of neuro-fuzzy systems in business. Artif. Intell. Rev. 49, 481–510 (2018)
9. Mitra, S., Hayashi, Y.: Neuro-fuzzy rule generation: survey in soft computing framework. IEEE Trans. Neural Netw. 11(3), 748–768 (2000)
10. Vieira, J., Morgado-Dias, F., Mota, A.: Neuro-fuzzy systems: a survey. WSEAS Trans. Syst. 3(2), 414–419 (2004)
11. Kim, J., Kasabov, N.: HyFIS: adaptive neuro-fuzzy inference systems and their application to nonlinear dynamical systems. Neural Netw. 12(9), 1301–1319 (1999)
12. Batyrshin, I.Z., Nedosekin, A.O., Stetsko, A.A., Tarasov, V.B., Yazenin, A.V., Yarushkina, N.: Fuzzy hybrid systems. In: Yarushkinoy, N.G. (eds). Theory and Practice. FIZMATLIT, 208 p. Moscow (2007). Viharos, Z.J., Kis, K.B.: Survey on neuro-fuzzy systems and their applications in technical diagnostics and measurement. Measurement 67, 126–136 (2015)
13. Aliev, R.A., Pedrycz, W., Guirimov, B.G., Aliev, R.R., Ilhan, U., Babagil, M., Mammadli, S.: Type-2 fuzzy neural networks with fuzzy clustering and differential evolution optimization. Inf. Sci. 181(9), 1591–1608 (2011). https://doi.org/10.1016/j.ins.2010.12.014
14. Craven, M., Shavlik, J.: Extracting tree-structured representations of trained networks. In: Advances in Neural Information Processing Systems, vol. 8, pp. 24–30 (1995)
15. Gridin, V.N., Solodovnikov, V.I., Evdokimov, I.A., Filippkov, S.V.: Building decision trees and extracting rules from trained neural networks. Artif. Intell. Decis. Mak. 4, 26–33 (2013)
16. Shevelev, O.G., Petrakov, A.V.: Classification of texts using decision trees and feedforward neural networks, vol. 290. Bulletin of Tomsk State University (2006)
17. Yarushev, S.A., Averkin, A.N.: Time series analysis based on modular architectures of neural networks. Procedia Comput. Sci. 123, 562–567 (2018). https://doi.org/10.1016/j.procs.2018.01.085
Quality Analysis of the Ganges River Water Utilizing Machine Learning Technologies

Prasenjit Dey, Sudip Kumar Adhikari, Arnab Gain, and Suman Koner
Abstract The river Ganges is a primary source of surface water in India, used for drinking, agricultural, religious, and industrial purposes. However, the water of the Ganges is becoming increasingly polluted due to changes in climate and anthropogenic activities. The aim of this study is to analyze the quality of the Ganges River water and extract the key features of the existing water management systems. The literature indicates that dissolved oxygen (DO) is an essential water quality component: the water quality is good if it contains a sufficient amount of DO. In this work, various machine learning approaches, i.e., support vector regressor, random forest, decision tree, multi-layer perceptron (MLP), and Naïve Bayes, are used to estimate the amount of DO present in the Ganges water. The simulation results show that the MLP performs best among the compared machine learning methods for estimating the quantity of DO in the Ganges water.

Keywords Dissolved oxygen · Ganges River water quality · Multi-layer perceptron · Neural networks · Support vector regressor
P. Dey (B) · S. K. Adhikari · A. Gain Coochbehar Government Engineering College, Coochbehar, West Bengal, India e-mail: [email protected] S. K. Adhikari e-mail: [email protected] A. Gain e-mail: [email protected] S. Koner Jalpaiguri Government Engineering College, Jalpaiguri, West Bengal, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_2
1 Introduction

Most civilizations have grown up around riverbanks because rivers have been the most accessible and readily available source of water [1, 2]. The quality of river water plays a vital role in meeting the drinking, agricultural, and industrial needs of the surrounding areas. In recent decades, river contamination has increased rapidly due to climate change, urbanization, industrialization, etc. Rising pollutant levels, noisy river data, numerous variables, time series patterns specific to the geological location, and a lack of funding have all made monitoring and controlling river water quality a challenging task [3]. In order to create more effective management strategies and cutting-edge early warning systems, research over the past few decades has focused heavily on water quality (WQ) prediction in river basins, risk assessment, and methodologies for classifying pollutants. Water quality can be characterized in physicochemical, hydro-morphological, and biological terms [4]. Among these, physicochemical water quality is frequently used for river water quality modeling [5]. The physical and chemical variables have synergistic impacts on water quality that can result in unpredictable WQ findings [3]. Well-known indicators used to measure river water quality include dissolved oxygen (DO), biochemical oxygen demand (BOD), total solids, nitrogen compounds, water temperature, electrical conductivity (EC), chemical oxygen demand (COD), and potential of hydrogen (pH). Unsafe water refers to contaminated water, which is the cause of water-associated health issues [6]. Water quality sampling, testing, and data handling require a significant amount of money and time in terms of chemicals, equipment, and manpower. Traditional methods make it harder to complete water-related projects.
Unpredictable natural changes, interdependent relationships, human influence, nonlinearity and nonstationarity in water data, and noisy and missing values make river water quality prediction a challenging task. To overcome these restrictions, an effective and economical methodology for water quality estimation based on robust advanced techniques is needed [3]. In recent years, machine learning (ML) techniques have shown promising results on these types of data. Moreover, ML techniques are also good at dealing with missing data. This paper presents various machine learning techniques to estimate the amount of dissolved oxygen (DO) present in the water of the river Ganges. We have applied support vector regressor (SVR), random forest (RF), decision tree (DT), multi-layer perceptron (MLP), and Naïve Bayes regressor (NB) to predict water quality, in the form of the DO of the Ganges River water, from other parameters: temperature, pH, conductivity, BOD, fecal coliform, and total coliform. The simulation results show that MLP predicts the dissolved oxygen of the water more accurately than the other ML algorithms. The rest of this paper is organized as follows: the literature is reviewed in Sect. 2. In Sect. 3, the proposed method is explained, which uses five machine learning techniques, viz. SVR, RF, DT, MLP, and NB, to predict the amount of dissolved oxygen using a freely accessible data set collected from the website
of the Central Pollution Control Board, Govt. of India. The experiments, evaluated with four metrics, namely mean absolute error (MAE), mean absolute percentage error (MAPE), mean squared error (MSE), and R2_score, are reported in Sect. 4. Section 5 concludes the paper.
2 Related Works

AI models for evaluating and forecasting the quality of coastal water, and their main contributions, were summarized in [7]. That study recognized the utility of many AI models for water quality modeling, including knowledge-based systems, genetic algorithms (GA), neural networks (NN), and fuzzy inference systems (FIS). However, research on river water was not taken into account. A summary of data-driven models and evolutionary optimization for managing river basins was provided by Solomatine and Ostfeld [8]. Their review found a number of shortcomings. The authors looked at models that may generate findings on hydrology-related data classification and flood-related maps, using the Lake Kinneret watershed and the Nepalese Bagmati River basin as the two water sources. This study did not use advanced model information and examined only a few models. Nicklow et al. [9] investigated the use of evolutionary algorithms (EAs) in groundwater management, WQ variable detection systems, and the planning and management of water supplies. The authors in [10] reviewed the application of support vector machine (SVM) models to the assessment of groundwater quality. Voza and Vuković [11] studied clustering and discriminant analysis (DA) as two analytical techniques for reducing the initial data set and identifying the main sources of pollutants. Their screening improves the effectiveness of the artificial neural network (ANN) model, which may then provide a comprehensive and reliable approach to managing water quality. Che Osmi et al. [12] studied fuzzy-based algorithms used to model river data. Their study covered model design, membership functions, operations, and a viewpoint on how fuzzy models might be integrated with other models. A recurrent ANN was used by Jeong et al. [13] to forecast chlorophyll-a (Chl-a) from a variety of inputs, including meteorological, hydrological, and environmental data.
The model is capable of delivering an accurate result using input data with a three-day time lag. Niroobakhsh et al. [14] carried out a comparative study to evaluate the predictive power of MLP and radial basis function network (RBFN) models. The outcomes showed that the RBFN efficiently manages enormous amounts of data and predicts total dissolved solids (TDS). The prediction abilities of generalized regression neural network (GRNN), backpropagation neural network (BPNN), recurrent neural network (RNN), and multiple linear regression (MLR) models were evaluated in [15]. The results show that the RNN model is the most effective of the three ANN models and outperforms MLR in predicting DO. When comparing ANN models based on the back propagation (BP) and Levenberg-Marquardt (LM) algorithms with MLR and adaptive neuro-fuzzy inference system (ANFIS) models for predicting Chl-a, Grbić et al. [16] found that nonstationary variables are
better managed by the ANN model. According to Ahmed and Shah [17], both the feed-forward neural network (FFNN) and the RBFN yield acceptable results for predicting DO, with the FFNN showing better overall performance. The authors in [18] employed a ward neural network (wNN) system and a self-organizing map to preprocess the data and extract its important features. The wNN model creates two parallel hidden layers for identifying various WQ data patterns, which improves the accuracy of the model compared to GRNN. The authors in [19] applied RBFN and MLP models to forecast dissolved oxygen 72 h in advance. Experimental results show that these models achieved good accuracy, but the forecasting accuracy decreased as the forecast horizon increased. The authors in [20] used a trial-and-error method to assess the effect of each input feature on the WQ models: SVM, k-nearest neighbor (kNN), and probabilistic neural network (PNN). They observed that the PNN works well when only a few variables are left out, kNN performs well when all features are considered, and SVM works well when few variables are used. Asadollahfardi et al. [21] employed the Box-Jenkins autoregressive integrated moving average (ARIMA) time series model and MLP to estimate total dissolved solids. The MLP model, which used ten hidden layers, the LM optimizer, and the Tansig transfer function, outperformed the other models. The authors in [22] investigated MLR, ANN, and ANFIS models for forecasting dissolved oxygen. The outcomes demonstrated that the performance of the ANFIS and MLR models was inferior to that of the ANN because of their over-estimation and under-estimation, respectively. Kogekar et al. [23] used a historical water data set of the river Ganga and presented advanced processing and monitoring approaches using deep learning techniques. They proposed a deep hybrid model that combines a convolutional neural network (CNN), gated recurrent units, and support vector regression to forecast water quality. Li et al.
[24] proposed another water quality prediction model on the water data set of the Yangtze River economic zone. They employed a segmented regression model and identified the percentage of impervious surface area (PISA) as a useful WQ indicator across watershed spatial scales. Cai et al. [25] used in situ hyperspectral reflectance data to assess the comprehensive WQ of urban rivers using ML algorithms. They retrieved the water quality index (WQI) of 382 hyperspectral data from urban rivers using RF and CNN. Aslam et al. [26] developed algorithms for the WQI and assessed the WQ in Northern Pakistan. They developed WQI prediction models using ML techniques such as random trees, RF, M5P, and reduced error pruning tree, as well as hybrid data mining algorithms.
3 Proposed Method

In the present work, we have used five machine learning algorithms to predict the DO present in the water of the river Ganges. The freely accessible data set WATER_QUALITY_OF_RIVER_GANGA_2013 was collected from the website of the Central Pollution Control Board, Govt. of India [27]. The five ML algorithms are SVR, RF, DT, MLP, and NB. This data set contains many parameters, and
Table 1 Results of the Pearson correlation for the selected five parameters

                 T(a)    D.O     pH      C(b)    BOD     Fecal coliform   Total coliform
T(a)             1.00   −0.52   −0.39    0.23    0.40     0.24             0.21
D.O             −0.52    1.00    0.41   −0.27   −0.36    −0.70            −0.68
pH              −0.39    0.41    1.00   −0.12   −0.09    −0.28            −0.25
C(b)             0.23   −0.27   −0.12    1.00   −0.09     0.05             0.05
BOD              0.40   −0.36   −0.09   −0.09    1.00     0.27             0.26
Fecal coliform   0.24   −0.70   −0.28    0.05    0.27     1.00             0.96
Total coliform   0.21   −0.68   −0.25    0.05    0.26     0.96             1.00

(a) Temperature, (b) Conductivity
some of them may be irrelevant or redundant for the prediction of DO. Thus, we have performed the Pearson correlation test to select only those input parameters that show high correlation with the DO parameter. The selected input parameters are temperature, pH, conductivity, BOD, fecal coliform, and total coliform. The Pearson correlation values for the selected parameters are given in Table 1.

SVR is a variant of the SVM classifier used for regression and thus follows the principle of SVM. SVR fits a hyperplane that minimizes the average error with respect to all input samples. To create the hyperplane, SVM selects the extreme points and vectors, i.e., the support vectors, which give the algorithm its name. The performance of the SVR depends on three parameters: gamma, the C value, and the kernel function. The kernel functions frequently used in SVR are the linear kernel, the polynomial kernel, and the RBF kernel; among them, we have used the RBF kernel. To determine the values of C and gamma, we have considered different configurations of C = {1, 5, 10, 15, 100} and gamma = {0.0001, 0.001, 0.05, 0.01, 0.1, 1}.

Random forest is an ensemble ML technique that uses an ensemble of DTs. It divides the data set into a number of subsets and trains each DT on one subset, using a different subset of the features present in the data set. Each DT first estimates the value of DO present in the water, and the RF then takes the mean of the estimates of all the DTs as the output of the RF model.

A decision tree employs a flowchart-like tree structure that models decisions and all of their potential outcomes. Each decision is represented by branches, which correspond to a condition being true or false. The performance of the DT depends on the error criterion of the model and the tree depth.
In this work, we have performed a grid search over various error criteria, i.e., squared_error, friedman_mse, absolute_error, and Poisson, and depth values from 1 to 30 to select the optimal error criterion and depth. Simulation results show that the DT performs best on the training data set when the error criterion is absolute_error and the depth of the tree is 5.

MLP is a multi-layer feed-forward neural network model that learns the weights of the network by back-propagating the errors generated by the model in the output layer. In this work, we have considered an MLP with a single hidden layer. To select the optimal
number of nodes in the hidden layer, we have tried different numbers of nodes = [5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100]. In initial experiments, the model performed best on the training data set with 80 hidden nodes. Therefore, we have used 80 hidden nodes to evaluate the performance of the model on the test data set.

The Naïve Bayes regression model follows the principle of Bayes conditional probability and approximates the model outcome based on likelihood values. In this study, regression is carried out using kernel density estimators and the probability distribution of the DO values. We have used the NB regressor model to approximate the amount of DO present in the water of the Ganges River.
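The hyperparameter selection described in this section (C and gamma for the SVR, error criterion and depth for the DT, number of hidden nodes for the MLP) follows the same exhaustive pattern, which can be sketched generically in Python. The helper `grid_search` and the stand-in error function `toy_error` are hypothetical; in practice the error function would train the model with the given configuration and return its measured error, and the optimum used here is made up for the demonstration.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every combination in `grid` and return the one with the lowest error."""
    names = list(grid)
    best_params, best_error = None, float("inf")
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        error = evaluate(params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Candidate values quoted in the text for the SVR with the RBF kernel.
svr_grid = {
    "C": [1, 5, 10, 15, 100],
    "gamma": [0.0001, 0.001, 0.05, 0.01, 0.1, 1],
}


def toy_error(params):
    # Hypothetical stand-in for "train the model and measure its error";
    # the minimum at C=10, gamma=0.01 is invented for this demonstration.
    return abs(params["C"] - 10) + abs(params["gamma"] - 0.01)


best_params, best_error = grid_search(toy_error, svr_grid)
```

The same function would cover the DT search by passing a grid with the four error criteria and the depths 1 to 30, or the MLP search with the listed hidden-node counts.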
4 Experimental Results

To compare all the algorithms in an identical experimental setup, we have used the same seed value for splitting the data set into training and test sets. Here, 80% of the data is used for training and 20% for testing the model. We have then used tenfold cross-validation to evaluate model performance, where the average of the 10 runs is taken as the output of the model. Table 2 shows the experimental results of all the ML algorithms used in this paper. To evaluate the performance of the models, we have used four evaluation metrics: mean absolute error (MAE), mean absolute percentage error (MAPE), mean squared error (MSE), and R2_score. The best value for each metric is marked with an asterisk in the table. Table 2 demonstrates that the MLP regression model performs best among all the ML models with respect to three evaluation metrics: MAPE, MSE, and R2_score. However, for MAE, the performance of RF is better than that of all the other ML models. In Figs. 1, 2, 3, 4, and 5, the bar charts show the differences between the actual and the predicted DO values in the test data set for the five ML models, i.e., the SVR, MLP, DT, RF, and NB models, respectively. From the comparison of Figs. 1, 2, 3, 4, and 5, it is evident that the MLP model predicts the DO amount better than the other ML models.

Table 2 MAE, MAPE, MSE, and R2_score values of the ML algorithms on the WATER_QUALITY_OF_RIVER_GANGA_2013 data set

Algorithms   MAE     MAPE    MSE     R2_score
SVR          0.30    0.04    0.19    0.74
RF           0.10*   0.06    0.25    0.65
DT           0.57    0.09    0.58    0.19
MLP          0.24    0.03*   0.07*   0.90*
NB           0.47    0.07    0.34    0.53
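For reference, the four metrics reported in Table 2 can be computed as in the following plain-Python sketch. The sample values are hypothetical and are not taken from the data set; MAPE is returned as a fraction rather than a percentage, which is an assumption about how the values in Table 2 are scaled.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (undefined if a true value is 0)."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted DO values, for illustration only.
y_true = [8.0, 7.0, 6.0, 9.0]
y_pred = [7.5, 7.0, 6.5, 9.0]

results = {
    "MAE": mae(y_true, y_pred),
    "MAPE": mape(y_true, y_pred),
    "MSE": mse(y_true, y_pred),
    "R2_score": r2_score(y_true, y_pred),
}
```

Lower values are better for MAE, MAPE, and MSE, while an R2_score closer to 1 indicates a better fit, which is why the best entries in Table 2 are the minima of the first three columns and the maximum of the last.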
Fig. 1 Bar plot of actual values versus predicted values for SVR model
Fig. 2 Bar plot of actual values versus predicted values for MLP model
Fig. 3 Bar plot of actual values versus predicted values for DT model
Fig. 4 Bar plot of actual values versus predicted values for RF model
Fig. 5 Bar plot of actual values versus predicted values for NB model
5 Conclusion

In this paper, we have analyzed different parameters affecting the DO amount in the Ganges River water. For this purpose, we have performed the Pearson correlation test between the different parameters and the DO parameter of the water. We have then selected only those parameters that show high correlation with DO and used them to estimate the amount of DO present in the water with different machine learning algorithms, i.e., SVR, RF, DT, MLP, and NB. Experimental results and bar chart visualizations show that MLP is the best of the five ML models at predicting the amount of DO in the water of the Ganges River [24].
References

1. Mustafa, A.S., Sadeq, S.O., Shahooth, S.H.: Application of QUAL2K for Water Quality Modeling and Management in the lower reach of the Diyala river. Iraqi J. Civ. Eng. 11, 66–80 (2017)
2. Viessman, W., Hammer, M.J., Perez, E.M., Chadik, P.A.: Water supply and pollution control (1998)
3. Tung, T.M., Yaseen, Z.M.: A survey on river water quality modelling using artificial intelligence models: 2000–2020. J. Hydrol. 585, 124670 (2020)
4. Tchobanoglous, G., Schroeder, E.E.: Water quality: characteristics, modeling, modification (1985)
5. Mohtar, W.H.M.W., Maulud, K.N.A., Muhammad, N.S., Sharil, S., Yaseen, Z.M.: Spatial and temporal risk quotient based river assessment for water resources management. Environ. Pollut. 248, 133–144 (2019)
6. Vörösmarty, C.J., McIntyre, P.B., Gessner, M.O., Dudgeon, D., Prusevich, A., Green, P., Davies, P.M.: Global threats to human water security and river biodiversity. Nature 467(7315), 555–561 (2010)
7. Chau, K.: A review on integration of artificial intelligence into water quality modelling. Marine Pollut. Bull. 52(7), 726–733 (2006)
8. Solomatine, D.P., Ostfeld, A.: Data-driven modelling: some past experiences and new approaches. J. Hydroinf. 10(1), 3–22 (2008)
9. Nicklow, J., Reed, P., Savic, D., Dessalegne, T., Harrell, L., Chan-Hilton, A., Karamouz, M., Minsker, B., Ostfeld, A., Singh, A., Zechman, E.: State of the art for genetic algorithms and beyond in water resources planning and management. J. Water Resour. Plan. Manag. 136(4), 412–432 (2010)
10. Raghavendra, S.N., Deka, P.C.: Support vector machine applications in the field of hydrology: a review. Appl. Soft. Comput. 19, 372–386 (2014)
11. Voza, D., Vuković, M.: The assessment and prediction of temporal variations in surface water quality—a case study. Environ. Monit. Assess. 190(7), 1–16 (2018)
12. Osmi, S.F.C., Malek, M.A., Yusoff, M., Azman, N.H., Faizal, W.M.: Development of river water quality management using fuzzy techniques: a review. Int. J. River Basin Manage. 14(2), 243–254 (2016)
13. Jeong, K.S., Joo, G.J., Kim, H.-W., Ha, K., Recknagel, F.: Prediction and elucidation of phytoplankton dynamics in the Nakdong River (Korea) by means of a recurrent artificial neural network. Ecol. Modell. 146, 115–129 (2001)
14. Niroobakhsh, M., Musavi-Jahromi, S.H., Manshouri, M., Sedghi, H.: Prediction of water quality parameter in Jajrood River basin: application of multi layer perceptron (MLP) perceptron and radial basis function networks of artificial neural networks (ANNs). Afr. J. Agric. Res. 7, 4131–4139 (2012)
15. Antanasijević, D., Pocajt, V., Povrenović, D., Perić-Grujić, A., Ristić, M.: Modelling of dissolved oxygen content using artificial neural networks: Danube River, North Serbia, case study. Environ. Sci. Pollut. Res. 20(12), 9006–9013 (2013)
16. Grbić, R., Kurtagić, D., Slišković, D.: Stream water temperature prediction based on Gaussian process regression. Expert Syst. Appl. 40(18), 7407–7414 (2013)
17. Ahmed, M.A.A., Shah, S.M.A.: Application of adaptive neuro-fuzzy inference system (ANFIS) to estimate the biochemical oxygen demand (BOD) of Surma River. J. King Saud Univ.-Eng. Sci. 29(3), 237–243 (2017)
18. Antanasijević, D., Pocajt, V., Perić-Grujić, A., Ristić, M.: Multilevel split of high-dimensional water quality data using artificial neural networks for the prediction of dissolved oxygen in the Danube River. Neural Comput. Appl. 32(8), 3957–3966 (2020)
19. Salim, H.: Simultaneous modelling and forecasting of hourly dissolved oxygen concentration (DO) using radial basis function neural network (RBFNN) based approach: a case study from the Klamath River, Oregon, USA. Modeling Earth Syst. Environ. 2(3), 1–18 (2016)
20. Dezfooli, D., Hosseini-Moghari, S.M., Ebrahimi, K., Araghinejad, S.: Classification of water quality status based on minimum quality parameters: application of machine learning techniques. Modeling Earth Syst. Environ. 4(1), 311–324 (2018)
21. Asadollahfardi, G., Zangooi, H., Asadi, M., Tayebi Jebeli, M., Meshkat-Dini, M., Roohani, N.: Comparison of Box-Jenkins time series and ANN in predicting total dissolved solid at the Zāyandé-Rūd River, Iran. J. Water Supply: Res. Technol.-Aqua 67(7), 673–684 (2018)
22. Abba, S.I., Hadi, S.J., Abdullahi, J.: River water modelling prediction using multi-linear regression, artificial neural network, and adaptive neuro-fuzzy inference system techniques. Procedia Comput. Sci. 120, 75–82 (2017)
23. Kogekar, A.P., Nayak, R., Pati, U.C.: A CNN-GRU-SVR based deep hybrid model for water quality forecasting of the River Ganga. In: 2021 International Conference on Artificial Intelligence and Machine Vision (AIMV) (2021)
24. Li, Z., Peng, L., Wu, F.: The impacts of impervious surface on water quality in the urban agglomerations of middle and lower reaches of the Yangtze River economic belt from remotely sensed data. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 14, 8398–8406 (2021)
25. Cai, J., Chen, J., Dou, X., Xing, Q.: Using machine learning algorithms with in situ hyperspectral reflectance data to assess comprehensive water quality of urban rivers. IEEE Trans. Geosci. Remote Sens. 60, 1–13 (2022)
26. Aslam, B., Maqsoom, A., Cheema, A.H., Ullah, F., Alharbi, A., Imran, M.: Water quality management using hybrid machine learning and data mining algorithms: an indexing approach. IEEE Access 10, 119692–119705 (2022)
27. CPCB ENVIS: https://cpcb.nic.in/wqm/2013/RIVERWATER%20DATA%202013_5.htm. Last accessed 15 Dec 2022
Eye-Tracking Movements—A Comparative Study
Sunny Saini, Anup Kumar Roy, and Saikat Basu
Abstract Eye tracking has attracted growing research interest in recent years because it supports a wide range of applications. It is recognized as an important non-traditional method of human–computer interaction. Eye tracking is a useful tool for determining where and when people devote visual attention to a scene, and it helps in understanding cognitive functioning. Eye-tracking technology is now moving from the laboratory to the real world, collecting more data at a faster rate and in a greater variety of forms. If this trend continues, eye tracking will come to resemble a big-data problem. Machine learning can be used to build real-time models and to test hypotheses with high accuracy. Parameter-based eye tracking examines a participant's eye movements while presenting them with a variety of options, and machine learning is used to analyze those movements and extract attributes that characterize eye behavior. In the studies reviewed, K-nearest neighbor, Naive Bayes, decision tree, and random forest classifiers produced models with improved accuracy. In this paper, we review different eye-tracking technologies for obtaining eye movement parameters, together with the machine learning and deep learning classifiers used for categorization, with a view to recognizing the cognitive processes involved in learning.
Keywords Eye tracking · Feature extraction · Machine learning (ML) · Deep learning (DL)
S. Saini · S. Basu (B) Maulana Abul Kalam Azad University of Technology, Kolkata, West Bengal, India e-mail: [email protected]
A. K. Roy Indian Institute of Technology, Kharagpur, West Bengal, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_3

1 Introduction
Eye tracking is a sensor technology that detects the presence of a person and follows their gaze in real time. Eye movements are converted into a stream of information that contains parameters such as fixation rate, saccadic rate, fixation duration, number of fixations, saccade duration, and number of saccades. In essence, the technology reads eye movements and turns them into insights that can be used for many different purposes or as a secondary input modality. Machine learning (ML) and deep learning (DL) technologies help to make sense of large datasets. Three types of eye movements have been studied: fixation, saccade, and smooth pursuit [1]. In the 1800s, eye movement studies relied on direct observation.
As we move through rich and complicated settings in everyday life, the retina is inundated with tremendous volumes of visual information. Through selective attention, the brain chooses and focuses on key scene regions for cognitive and visual processing. As a result, eye movements provide a direct estimate of spatial attention and are regarded as a window into the mind and brain. The research community is interested in eye movements because they have a variety of uses, ranging from the examination of visual material, design analysis, mobility, and customer behavior research to driving, gaming, and medical research with gaze-based interaction [2].
K-nearest neighbor (KNN), support vector machine (SVM), decision tree, random forest, linear regression, Naive Bayes, K-means clustering, and convolutional neural network (CNN) are the methods discussed in this paper for tracking eye movements. Among the machine learning algorithms, KNN, Naive Bayes, decision tree, and random forest provide the higher-accuracy models in the surveyed studies, while CNN is the most accurate deep learning model. Recent research in eye tracking has accordingly focused on developing ML- and DL-based eye-tracking algorithms that do not need user calibration.
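To make the parameters listed in this section concrete, the following minimal sketch derives counts, rates, and durations from a list of already labeled events. The event format (a label plus a duration in seconds), the label names, and the function name `summarize_events` are assumptions made for illustration; they are not taken from any of the surveyed studies or from a particular eye tracker's export format.

```python
from collections import defaultdict


def summarize_events(events, recording_seconds):
    """Summarize labeled eye-movement events.

    `events` is a list of (label, duration_in_seconds) tuples. The labels
    "fixation" and "saccade" are hypothetical names chosen for this sketch.
    """
    counts = defaultdict(int)
    total_duration = defaultdict(float)
    for label, duration in events:
        counts[label] += 1
        total_duration[label] += duration

    summary = {}
    for label in ("fixation", "saccade"):
        n = counts[label]
        summary[label] = {
            "count": n,                                  # number of events
            "rate_per_s": n / recording_seconds,         # events per second
            "total_duration_s": total_duration[label],   # summed duration
            "mean_duration_s": total_duration[label] / n if n else 0.0,
        }
    return summary


# Toy recording of 2 seconds with two fixations and two saccades.
events = [("fixation", 0.30), ("saccade", 0.05),
          ("fixation", 0.20), ("saccade", 0.05)]
summary = summarize_events(events, recording_seconds=2.0)
```

Real recordings would first need an event detection step to produce the labeled events; this sketch only covers the aggregation that follows it.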
2 Related Works/Literature Review
Using eye tracking, researchers study the movements of a user's eyes during various activities, which provides information about cognitive processes. Eye-tracking technology improves the usability of other technology, and communicating with machines through gaze helps to automate manual tasks. It provides data that is unbiased, objective, and quantifiable. Eye trackers are unobtrusive and allow normal tasks to be completed. Eye-tracking technology can be applied in almost any situation or environment. It monitors eye movement by measuring parameters such as pupil position, fixation rate, saccadic rate, saccade duration, number of saccades, fixation duration, and number of fixations. Depending on the device and software used, the results can provide a very high level of granularity for deep analysis. ML and DL techniques are then applied to classify the data, with the aim of achieving the highest model accuracy.
2.1 Eye-Tracking Features
According to Nuraini et al. [3], an event detection algorithm attempts to categorize signals from the eye and differentiate between individual eye movements. Three threshold-based algorithms exist for detecting eye movement: velocity and dispersion threshold identification (I-VDT), velocity and movement pattern identification (I-VMP), and velocity and velocity threshold identification (I-VVT). Using two threshold values, the I-VVT algorithm distinguishes smooth pursuit, saccade, and fixation eye movements. If the velocity of the eye movement exceeds the saccade velocity threshold, that portion of the signal is classified as a saccade. If the velocity is below the smooth pursuit velocity threshold, it is classified as a fixation, and velocities between the two thresholds are classified as smooth pursuit. I-VMP and I-VVT use the saccade velocity threshold value to identify the saccade. Machine learning outperforms threshold-based algorithms when the training data is more varied. The K-nearest neighbors (KNN) technique is used to detect eye movements. Using data from several users with different kinds of eye motions, this method achieved detection rates of up to 92%.
Akshay et al. [4] proposed a method for extracting smooth pursuit, saccades, and fixations from raw eye-tracking data using machine learning algorithms. A single dataset of 1075 observations was recorded with the web-based eye tracker WebGazer. The raw, unlabeled eye-tracking data is the input to the proposed system. The K-means algorithm is first used to group this unlabeled data into fixations and saccades, after which the observations are labeled using the I-DT algorithm, so that each of the 1075 observations is classified as a fixation or a saccade. Several supervised ML algorithms, including random forest, decision tree, SVM, and KNN, are then run on the dataset to validate the labeling procedure.
Fikri et al. [5] labeled the data and then separated them into four groups: noise, post-saccadic oscillation, saccade, and fixation. The data was split into training (75%) and validation (25%) sets. Random forests and decision trees achieved 98% accuracy and were considered the best choice for event detection because of their ability to prevent overfitting. This study also suggested that event detection should ideally be performed without using parameters. In DL, convolutional neural networks are used to classify eye movements. One such technique identifies eye movements in raw data in real time without the need for segmentation or human involvement, and it was evaluated on a fresh dataset with several participants. The CNN beats various binary approaches and also does well in multi-class situations.
In 2017, Roy et al. [6] provided a computational framework that takes raw eye-tracking data through to cognitive model development. The models automatically predicted the different objects of a bi-stable image. They extracted 46 unique features and compared different classifiers on them. Another study by the same group provided a framework for deriving signature markers of attentional variation over ambiguous images [7]. In 2022, Nasreen et al. identified different parameters that can indicate an individual's little-c creativity [8].
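The two-threshold rule behind I-VVT can be sketched in a few lines. The threshold values (70 and 20 degrees per second), the sample format, and the function names below are assumptions chosen for illustration only; the surveyed papers do not report them here, and suitable values depend on the eye tracker and its sampling rate.

```python
import math


def point_velocities(samples):
    """Point-to-point velocity (deg/s) from (t_seconds, x_deg, y_deg) samples."""
    velocities = []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        velocities.append(math.hypot(x1 - x0, y1 - y0) / (t1 - t0))
    return velocities


def classify_ivvt(velocities, saccade_threshold=70.0, pursuit_threshold=20.0):
    """Label each velocity sample using two thresholds (assumed values)."""
    labels = []
    for v in velocities:
        if v > saccade_threshold:
            labels.append("saccade")          # faster than the saccade threshold
        elif v < pursuit_threshold:
            labels.append("fixation")         # slower than the pursuit threshold
        else:
            labels.append("smooth_pursuit")   # between the two thresholds
    return labels


# Four gaze samples at 100 Hz: a nearly still eye, a slow drift, then a jump.
samples = [(0.00, 0.00, 0.0), (0.01, 0.05, 0.0),
           (0.02, 0.45, 0.0), (0.03, 2.45, 0.0)]
labels = classify_ivvt(point_velocities(samples))
# labels == ['fixation', 'smooth_pursuit', 'saccade']
```

A practical detector would usually also smooth the velocity signal and merge consecutive samples with the same label into events; both steps are omitted to keep the thresholding logic visible.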
2.2 Machine Learning and Deep Neural Network in Eye Tracking
Shojaeizadeh et al. [1] proposed the use of eye movement data to detect task demand in an unobtrusive and automatic manner. To test this claim, they created a machine learning workload detection system based on eye tracking. Their findings revealed that the most important predictor of task demand was pupil data, particularly the pupil dilation ratio during saccades and fixations, and that the workload detector could detect task demand accurately and quickly.
Çetintaş et al. [9] evaluate the types of texts read by incorporating eye movement research into daily reading activities. The data is derived from recordings of a 10-minute reading procedure completed by 20 participants. Participants had access to documents from the humor online newspaper and text categories. Spectrograms are used to create visual representations of the change in pupil size over time as a function of participant characteristics. These spectrograms are examined using AlexNet, ResNet, and other architectures, in addition to the classification phases, to determine whether satisfactory results can be obtained. AlexNet employs ReLU as its activation function together with max-pooling layers. The ResNet considered, on the other hand, has 152 layers. 70% of the spectrogram image data is set aside for training, while 30% is set aside for testing. Several machine learning approaches are applied to categorize the dataset. To summarize, the random forest and decision tree algorithms have a success rate of 98% when working with raw eye-tracking data.
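As an illustration of how a pupil-size recording can be turned into a spectrogram, the sketch below computes a short-time discrete Fourier transform with the Python standard library only. The frame length, hop size, absence of a window function, and the synthetic signal are all assumptions for this example; the surveyed study does not specify its spectrogram settings in the text summarized above.

```python
import cmath
import math


def spectrogram(signal, frame_length=16, hop=8):
    """Return one list of DFT magnitudes (bins 0..frame_length//2) per frame."""
    frames = []
    for start in range(0, len(signal) - frame_length + 1, hop):
        frame = signal[start:start + frame_length]
        mean = sum(frame) / frame_length
        centered = [x - mean for x in frame]   # remove the baseline pupil size
        magnitudes = []
        for k in range(frame_length // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_length)
                for n, x in enumerate(centered)
            )
            magnitudes.append(abs(coeff))
        frames.append(magnitudes)
    return frames


# Synthetic pupil trace: 3 mm baseline with an oscillation of
# 2 cycles per 16 samples (values invented for this sketch).
pupil = [3.0 + 0.2 * math.sin(2 * math.pi * 2 * n / 16) for n in range(64)]
spec = spectrogram(pupil)
```

Each row of `spec` corresponds to one time frame and each column to one frequency bin, so the result can be rendered as an image and passed to an image classifier. For the synthetic trace, the energy concentrates in frequency bin 2 of every frame.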
2.3 Uses of Eye Tracking
Eye-movement tracking is used in a variety of studies and research projects. Precise eye tracking and identification are regarded as critical to the advancement of human–computer interaction, to the production of attentive user interfaces, and to the investigation of human affective states. This section discusses some of the studies related to eye-tracking movements.
Joseph et al. [10] conducted eye-tracking research on 50 people ranging in age from 20 to 60 years. The goal of the study was to examine the effects of aging on the mobile phone user experience during complicated activities. Under realistic conditions, participants performed five activities on an Android phone while wearing eye-tracking glasses. The ocular parameters were measured for each user's left and right eyes. According to the findings, participants aged 50 to 60 found the tasks harder to complete and showed increased cognitive effort.
Feng et al. [11] note that far-infrared (FIR) therapy is becoming more popular in clinical settings, but that more research into its ability to alleviate visual fatigue is needed. Twenty healthy people took part in the experiment over the course of two days. Each subject was shown a visual stimulus program before the subjective rating and eye-tracking assessments, and eye relief was then provided with a thermal eye mask based on infrared technology. The eye-tracking and subjective rating assessments were then carried out again. When the measurements before and after eye relief with the FIR treatment mask were compared, the saccade amplitude (SA), fixation frequency (FF), subjective score (SS), and eye blink frequency (BF) were significantly different, indicating that SA and FF were useful in assessing visual fatigue.
Kokanova et al. [12] discuss the findings of a pilot eye-tracking study of sight translation and reading. Seventeen people with a year of sight translation experience were tasked with reading and sight translating two texts from B (English) to A (Spanish). The written material has both independent and dependent factors. Using eye-tracking methods to study the translation process may aid in comprehending the problems associated with sight translation as a form of communicative translation from one language to another.
3 Objectives
The eye behavior of an individual provides important information related to emotional states and cognitive functions. In this study, we aim
• To find the eye parameters used in learning and perceptual processes.
• To understand the requirements for conducting an eye-tracking experiment.
• To understand the procedure for processing and classifying eye-tracking data.
4 Methodology
This research examined studies that use eye-tracking methods to investigate the cognitive processes that occur during learning. By examining available studies from the last five years, we aim to reveal how the eye movement method has been used in research on various subjects, as well as which eye movement indicators are used. We shortlisted 25 research papers for a comparative study on the basis of the parameters used in eye-tracking movement. Smooth pursuit, saccade, and fixation are the three types of eye movements studied. Fixations occur when our eyes stop and keep looking at a point. Saccades are rapid, ballistic eye motions that quickly change the point of focus. Smooth pursuit movements are slow tracking motions of the eye that are intended to keep a moving stimulus on the fovea.
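The definition of a fixation given above, the eyes resting on one point, is what dispersion-threshold detection (commonly called I-DT) exploits: a run of gaze samples counts as a fixation when the points stay within a small spatial spread for a minimum number of samples. The sketch below is a simplified version under assumed settings; the dispersion limit, the minimum window size, and the function names are illustrative and are not taken from the surveyed papers.

```python
def dispersion(points):
    """Spread of a window of (x, y) gaze points: x-range plus y-range."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(points, max_dispersion=1.0, min_samples=3):
    """Return (start_index, end_index_inclusive) pairs of detected fixations."""
    fixations = []
    start = 0
    while start + min_samples <= len(points):
        end = start + min_samples             # window is points[start:end]
        if dispersion(points[start:end]) <= max_dispersion:
            # Grow the window while the points stay close together.
            while (end < len(points)
                   and dispersion(points[start:end + 1]) <= max_dispersion):
                end += 1
            fixations.append((start, end - 1))
            start = end                       # continue after the fixation
        else:
            start += 1                        # drop the first point and retry
    return fixations


gaze = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),   # stable gaze
        (5.0, 5.0),                                       # rapid jump
        (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]         # stable gaze again
fixations = detect_fixations(gaze)
# fixations == [(0, 3), (5, 7)]
```

Samples that fall outside every fixation window, such as index 4 here, are the candidates for saccades. Separating saccades from smooth pursuit requires an additional velocity criterion.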
5 Results and Discussion
In the papers discussed above, we observed various classification techniques used to track eye movements and their severity. EEG, fMRI, the OpenCV library, Raspberry Pi, WebGazer, ARKit technology, and other tools were used by the researchers to collect data. 3D eye models are used in some cases. Three eye movements have been studied thus far: fixation, saccade, and smooth pursuit. Each researcher used a different technique for eye-movement tracking, gaze estimation, and event detection.
In [1], a workload monitoring method based on eye tracking is created by framing the task as a classification problem: the authors set out to develop a classifier that determines whether the recorded eye movements reflect low or high task demand. In [13], eye tracking is combined with fMRI and with EEG. In many ways, the combination of eye tracking with fMRI is less obvious than the combination of eye tracking and EEG. In [14], both supervised and unsupervised learning approaches are used for classification. Classification accuracy for the KNN and Naive Bayes algorithms was 42.3% and 57.6%, respectively. The k-means algorithm yielded a ratio of 75.2%.
In [2, 5, 15–17], the convolutional neural network is employed. The CNN is the deep neural network most commonly used for eye gaze estimation. A CNN comprises several convolutional layers, non-linearities, and pooling layers, followed by a fully connected layer and an output layer. The first three deal with extracting features from the image, whereas the fully connected layer manages categorization. In [9], eye movement analysis is performed during daily reading. AlexNet, ResNet, and other architectures are used in the spectrogram analysis and classification stages. In [18], a software application for the iPad and iPhone is under development. The software includes an easy-to-use text presentation interface and uses ARKit technology to collect eye-tracking data from respondents.
In [3, 16], SVM and K-nearest neighbors (KNN) techniques are used to detect eye movement. The SVM model was trained and tested, giving an average accuracy of 95.68% with a standard deviation of 3.43, whereas KNN achieves an eye movement detection rate of up to 92%. In [4, 5], several supervised ML algorithms are run on the dataset for eye movement classification. Random forests and decision trees achieved 98% accuracy due to their ability to avoid overfitting. In [19], event detection was performed with a random forest classifier on eye movement features. The authors use the random forest implementation in the scikit-learn package and the LUNARC Aurora computer cluster. In [20], a Raspberry Pi, a Raspberry Pi camera, and a laptop are the main components of the wearable device. Input footage of the eye is captured by the camera while the device is worn.
By examining available studies from the last five years of research, we aim to reveal how the eye movement method is used in research on various subjects. We selected 25 papers for a comparative study based on eye-tracking movement parameters. Using the eye parameters fixation rate, saccadic rate, and pupil dilation, one study found that the daily mobile phone activity of the elderly is reduced [10]. Far-infrared (FIR) therapy had the potential to accelerate visual fatigue relief, as measured with fixation frequency and saccade amplitude [11]. Reading speed, reading distance, and pupil size are all measured to capture readers' efforts to improve performance [22]. Another study investigates the interdependence of eye movement criteria in a projection-based stereoscopic display [23]. When eye movements are used to determine an image's emotional category, KNN and Naive Bayes algorithms have the best accuracy rates [14]. The machine learning SVM method has had a lot of success in mostly text-heavy documents [9]. A machine learning system scans massive amounts of eye movement information to characterize the viewer's eye movements while consuming academic and other material [18] (Table 1). The Naive Bayes algorithm produced the best model, which can be used to develop a flexible multimodal learning system that takes students' learning styles into account [26]. People who have low self-esteem look at faces for a longer period of time [34]. In wearable contexts, pupil data can be used instead of gaze data for saccade and fixation detection [32].
We considered the ML and DL algorithms for eye-tracking event detection that give the best classification accuracy. Furthermore, eye tracking is a versatile technology that is used to address a diverse set of research questions in a variety of fields. The goal of this study is to give a brief summary of how eye tracking can be used in research.
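Since KNN accuracy is reported repeatedly in the studies compared in this section, a minimal sketch of the classifier and of the accuracy measure may help in reading those figures. The two features (mean fixation duration and saccade rate), the "low" and "high" workload labels, and all numeric values below are invented for illustration and do not come from any surveyed dataset.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)


# Hypothetical features: (mean fixation duration in s, saccades per second).
train_x = [(0.20, 3.0), (0.22, 3.2), (0.25, 2.8),
           (0.40, 1.5), (0.45, 1.2), (0.42, 1.4)]
train_y = ["low", "low", "low", "high", "high", "high"]

test_x = [(0.21, 3.1), (0.43, 1.3)]
test_y = ["low", "high"]

predictions = [knn_predict(train_x, train_y, q) for q in test_x]
score = accuracy(predictions, test_y)
```

The features are used unscaled here, so the saccade rate dominates the Euclidean distance; with real data, features would normally be standardized first, and accuracy would be estimated on a held-out split such as the 75%/25% or 70%/30% divisions mentioned in the surveyed papers.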
6 Conclusion and Future Scope
In this paper, we have surveyed different eye-tracking systems. Some of the papers apply classification to eye movements and focus on ML and DL methods. The remaining papers indicate that eye-tracking features carry key information for understanding cognitive processes. The ML approach is included in this paper to understand how eye tracking-based automatic predictive models are built. The ML approach tests a high-accuracy hypothesis and creates a real-time model. Eye tracking with parameters investigates a participant's eye movements across a wide set of options. ML is used in eye tracking to extract features from eye movements for evaluating eye behavior. Among the ML algorithms, KNN, Naive Bayes, decision trees, and random forests provide the higher-accuracy models.
Future work could focus on compiling balanced and substantial datasets for calibration-free gaze estimation. A future study might concentrate on creating network designs that are low in cost, computationally cheap, and compatible across devices, which could be essential for analyzing eye movements. The collection of well-balanced and noise-free datasets should be the subject of upcoming research.
Table 1 Objectives, design, results, and conclusions of the included studies
| Author | Objective | Participant | Conclusions |
| --- | --- | --- | --- |
| Joseph et al. [10] | Fixation rate, saccadic rate, pupil dilation, and aging | Aged people had a higher cognitive workload | The elderly's daily mobile phone activities have been reduced |
| Prasetyo et al. [21] | Eye gaze accuracy, time of first fixation, fixation duration, and number of fixations | The fixations are the most accurate predictor of eye gaze accuracy | The findings will be useful to engineers who specialize in human aspects and creators of virtual reality |
| Feng et al. [11] | Subjective score (SS), blink frequency (BF), saccade amplitude (SA), and fixation frequency (FF) | Visual fatigue was assessed using FF and SA | Far-infrared (FIR) therapy had the potential to accelerate visual fatigue relief |
| Miranda et al. [22] | Pupil size, convergence distance, fixation duration, and saccade amplitude | Pupil size, reading distance, and reading speed are all measured using various devices | Readers' efforts to improve performance are reflected in the devices they use |
| Lin et al. [23] | Eye gaze accuracy (AC), number of fixation (NF), time to first fixation (TFF), fixation duration (FD), and eye gaze movement time (EMT) | Lowering parallax, extending EMT, increasing NF, extending FD, and extending TFF all contributed to higher AC | The investigation of the interdependence criteria to allow for eye mobility in a stereoscopic presentation using projection |
| Wei et al. [24] | Duration of fixations, number of fixations, number of saccades, and duration of saccades | Users concentrate their attention on different areas of images of varying quality | When deciding between high and low-quality images, users consider the overall content and locals of the images |
| Lin et al. [25] | Focus radius, maximum saccade distance, and total fixation time duration of the inner circle | The average precision score decreased after the fatigue task | Eye movement behavior more appropriately recognizes visual tiredness |
| Tamuly et al. [14] | Scene valence, fixation frequency, saccade frequency | The KNN and Naive Bayes algorithms have accuracy rates of 42.3 and 57.6%, respectively | Using eye movements to determine an image's emotional category |
| Çetintaş et al. [9] | Pupil sizes, average gaze times, blink, saccades, fixation | AlexNet architecture and SVM classification yield the best results | The machine learning method has had a lot of success in mostly text-heavy documents |
| Anisimov et al. [18] | Dispersion of gaze coordinates, gaze movement speed | On the model of distinct textual presentation, the detection of the reading strategy was implemented | The machine learning system scans massive amounts of information about eye movement to tell aspects of the viewer while consuming academic and other material |
| Nuraini et al. [3] | Saccade velocity and smooth pursuit velocity | K-nearest neighbors (KNNs) detection rates of up to 92% were achieved | Creating and verifying new eye-tracking characteristics to increase the accuracy of event detection |
| Akshay et al. [4] | Fixation, saccades, and smooth pursuit | In machine learning methods, a confusion matrix was used to assess and determine the dataset's usefulness | Eye-tracking events can be used in a variety of applications. Web page analyzers and uses for medicine are such examples |
| Zemblys et al. [19] | Fixation, saccades, and smooth pursuit | The other ten machine learning algorithms were outperformed by random forest | Machine learning techniques outperform current cutting-edge activity recognition algorithms |
| Pritalia et al. [26] | Number of fixation, saccade duration, and average fixation duration | The Naive Bayes algorithm produced the best model, which had 71% accuracy, 60% sensitivity, and 75% specificity | These findings can be used to develop a multimodal learning system that takes the learning methods of pupils into account |
| Fuhl et al. [27] | Fixation, saccades, and smooth pursuit | Learning many kinds of eye motions and detecting pupil reading mistakes automatically | A unique machine learning technique based on rules for developing detectors that use marked or generated data is proposed |
| Vortmann et al. [28] | Pupil dilation, blinks, saccades, and fixations | For various aspects of the attention state, the CNN picture features outshine | Current eye-tracking-based attentional state classifiers can be improved by modifying the set of features |
| Skaramagkas et al. [29] | Fixations, saccades, blinks, and pupil size variation | Artificial intelligence and machine learning methods were put to the test for classification accuracy | The goal of this research is to present the most important eye/pupil metrics from existing literature in order to develop a robust emotional or cognitive computational model |
| Oyekunle et al. [30] | Fixation count, fixation length, and saccade length | Users' needs are better met by the educational-based site than by the e-commerce site | This study looked at fixation values for item search tasks on both e-commerce and educational Websites and discovered differences in design choices for Website content, color, and picture elements |
| Sharvashidze et al. [31] | Fixation durations and saccade amplitudes | Art viewers had the greatest fixations and the lowest saccade amplitudes | During art viewing, the task clearly influences eye movements. Fixation lengths and saccade amplitudes differed |
| Wang et al. [32] | Saccade and fixation | Saccade/fixation identification based on pupil data outperforms detection using gaze data by 8.6% | In wearable contexts, pupil data can be used instead of gaze data with less effort and more precision |
| Latifzadeh et al. [33] | Microsaccades amplitude, blink latency, fixations saccade velocity, saccade length, and pupil dilation | The cognitive load imposed by multimedia learning is assessed using eye movement data | Eye-tracking data analysis was used to assess cognitive load in multimedia language learning |
| Potthoff et al. [34] | Gaze duration and fixation count | People who have low self-esteem look at faces for a longer period of time and may be more critical | The eye-tracking research on self-face seeing in a mirror. The duration of one's gaze was related to one's self-esteem |
| Zhu et al. [35] | Fixations, saccades, and smooth pursuits | The hidden Markov model (HMM) effectively separates the types of eye movement | The suggested method's efficacy and stability are demonstrated by outperforming state-of-the-art techniques |
| Baharom et al. [36] | Fixation, saccade, and scan path gaze | K-means clustering reduces the size of large eye-tracking datasets | The study's goal was to discover the link between data attributes and visual consumer preferences |
| Zandi et al. [37] | Fixation, saccade, and gaze point | The random forest classifier achieves high accuracy throughout all epoch durations | The study results in the creation of new technology for assessing alertness, providing early warning of drivers' tiredness and sleepiness |
References 1. Shojaeizadeh, M., Djamasbi, S., Paffenroth, R.C., Trapp, A.C.: Detecting task demand via an eye tracking machine learning system. Decis. Support Syst. 116, 91–101 (2019) 2. Valliappan, N., Dai, N., Steinberg, E., He, J., Rogers, K., Ramachandran, V., Xu, P., Shojaeizadeh, M., Guo, L., Kohlhoff, K., et al.: Accelerating eye movement research via accurate and affordable smartphone eye tracking. Nat. Commun. 11(1), 1–12 (2020) 3. Nuraini, A., Murnani, S., Ardiyanto, I., Wibirama, S.: Machine learning in gaze based interaction: a survey of eye movements events detection. In: 2021 International Conference on Computer System, Information Technology, and Electrical Engineering (COSITE), pp. 150– 155. IEEE (2021) 4. Akshay, S., Megha, Y., Shetty, C.B.: Machine learning algorithm to identify eye movement metrics using raw eye tracking data. In: 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), pp. 949–955. IEEE (2020) 5. Fikri, M.A., Santosa, P.I., Wibirama, S.: A review on opportunities and challenges of machine learning and deep learning for eye movements classification. In: 2021 IEEE International Biomedical Instrumentation and Technology Conference (IBITeC), pp. 65–70. IEEE (2021) 6. Roy, A.K., Akhtar, M.N., Mahadevappa, M., Guha, R., Mukherjee, J.: A novel technique to develop cognitive models for ambiguous image identification using eye tracker. IEEE Trans. Affective Comput. 11(1), 63–77 (2017) 7. Roy, A.K., Nasreen, S., Majumder, D., Mahadevappa, M., Guha, R., Mukhopadhyay, J.: Development of objective evidence in Rorschach ink blot test: an eye tracking study. In: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1391-1394. IEEE (2019) 8. Nasreen, S., Roy, A.K., Guha, R.: Exploring ‘little-c’ creativity through eyeparameters. In: 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pp. 1078–1081. 
IEEE (2022)
32
S. Saini et al.
9. Çetinta¸s, D., Firat, T.T.: Eye-tracking analysis with deep learning method. In: 2021 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), pp. 512–515. IEEE (2021) 10. Joseph, A.W., Jeevitha Shree, D., Saluja, K.P.S., Mukhopadhyay, A., Murugesh, R., Biswas, P.: Eye tracking to understand impact of aging on mobile phone applications. In: International Conference on Research into Design, pp. 315–326. Springer (2021) 11. Feng, Y., Wang, L., Chen, F.: An eye-tracking based evaluation on the effect of far-infrared therapy for relieving visual fatigue. In: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 313–316. IEEE (2019) 12. Kokanova, E.S., Lyutyanskaya, M.M., Cherkasova, A.S.: Eye tracking study of reading and sight translation. In: SHS Web of Conferences, vol. 50, p. 01080. EDP Sciences (2018) 13. Carter, B.T., Luke, S.G.: Best practices in eye tracking research. Int. J. Psychophysiol. 155, 49–62 (2020) 14. Tamuly, S., Jyotsna, C., Amudha, J.: Tracking eye movements to predict the valence of a scene. In: 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1–7. IEEE (2019) 15. Akinyelu, A.A., Blignaut, P.: Convolutional neural network-based methods for eye gaze estimation: a survey. IEEE Access 8, 142581–142605 (2020) 16. Koochaki, F., Najafizadeh, L.: Predicting intention through eye gaze patterns. In: 2018 IEEE Biomedical Circuits and Systems Conference (BioCAS), pp. 1–4. IEEE (2018) 17. Arsenovic, M., Sladojevic, S., Stefanovic, D., Anderla, A.: Deep neural network ensemble architecture for eye movements classification. In: 2018 17th International Symposium INFOTEH-JAHORINA (INFOTEH), pp. 1–4. IEEE (2018) 18. 
Anisimov, V., Chernozatonsky, K., Pikunov, A., Shedenko, K., Zhigulskaya, D., Arsen, R.: Mlbased classification of eye movement patterns during reading using eye tracking data from an apple ipad device: perspective machine learning algorithm needed for reading quality analytics app on an ipad with built-in eye tracking. In: 2021 International Conference on Cyberworlds (CW), pp. 188–193. IEEE (2021) 19. Zemblys, R., Niehorster, D.C., Komogortsev, O., Holmqvist, K.: Using machine learning to detect events in eye-tracking data. Behav. Res. Methods 50(1), 160–181 (2018) 20. Caya, M.V.C., Mendez, B.A.Q., Sanchez, B.J.S., Santos, G.F., Chung, W.Y.: Development of a wearable device for tracking eye movement using pupil lumination comparison algorithm. In: 2018 IEEE 10th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management (HNICEM). pp. 1– 6. IEEE (2018) 21. Prasetyo, Y.T., Widyaningrum, R., Lin, C.J.: Eye gaze accuracy in the projection based stereoscopic display as a function of number of fixation, eye movement time, and parallax. In: 2019 IEEE international conference on industrial engineering and engineering management (IEEM), pp. 54–58. IEEE (2019) 22. Miranda, A.M., Nunes-Pereira, E.J., Baskaran, K., Macedo, A.F.: Eye movements, convergence distance and pupil-size when reading from smartphone, computer, print and tablet. Scand. J. Optometry Vis. Sci. 11(1), 1–5 (2018) 23. Lin, C.J., Prasetyo, Y.T., Widyaningrum, R.: Eye movement parameters for performance evaluation in projection-based stereoscopic display. J. Eye Movement Res. 11(6) (2018) 24. Wei, H., Lin, S., Chen, W., Chen, J., Zheng, Y.: Non-invasive image quality assessment based on eye-tracking. In: 2021 7th International Conference on Computer and Communications (ICCC), pp. 1802–1806. IEEE (2021) 25. 
Lin, H.J., Chou, L.W., Chang, K.M., Wang, J.F., Chen, S.H., Hendradi, R.: Visual fatigue estimation by eye tracker with regression analysis. J. Sens. 2022 (2022) 26. Pritalia, G.L., Wibirama, S., Adji, T.B., Kusrohmaniah, S.: Classification of learning styles in multimedia learning using eye-tracking and machine learning. In: 2020 FORTEI-International Conference on Electrical Engineering (FORTEI-ICEE), pp. 145–150. IEEE (2020) 27. Fuhl, W., Castner, N., Kasneci, E.: Rule-based learning for eye movement type detection. In: Proceedings of the Workshop on Modeling Cognitive Processes from Multimodal Data, pp. 1–6 (2018)
Cybersecurity Imminent Threats with Solutions in Higher Education
Mahendra Kumar Gourisaria, Abtsega Tesfaye Chufare, and Debajyoty Banik
Abstract Higher education has a growing need for data protection. Without adequate threat management, serious data breaches have already happened and are likely to happen again. By reviewing the recent literature on recognized assets, risk events, threat actors, and vulnerabilities in higher education, it is possible to infer the effect of cyberattacks on society. The review covered studies as well as projects in order to increase our understanding of the primary cybersecurity threat domains. Empirical research on cybersecurity threats in higher education is scarce and the literature has large gaps; in addition, some of the publications examined spend too much time explaining cybersecurity-related concepts. This document combines a top-level view of mission-critical assets, common risk events, a widely accepted risk model, and an overview of common cybersecurity vulnerabilities. It summarizes strategic cyber risks with descriptions of frequency distributions and offers starting points for security researchers in higher education.
Keywords Higher education societies · Phishing and SQL injection attacks · Email filtration
1 Introduction
A crucial component of a safe and supportive environment is the ability to interact securely with digital systems. This is particularly true in institutions of higher education, where more and more students are pursuing their education online, where teachers, staff, and visitors are continuously accessing and exchanging data online, and where more infrastructure and facility elements are managed online. Schools and universities operate robust information technology networks and multi-layered infrastructure architectures with multiple levels of access and connectivity to sustain their collaborative culture.

M. K. Gourisaria · A. T. Chufare · D. Banik (B) Kalinga Institute of Industrial Technology, Patia, Bhubaneswar 751024, India e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_4
2 Background of Study
With the development of the Internet, people can now benefit from both the physical world and the virtual world. With the help of new platforms, it is now comparatively easy to track down problems and find solutions to them. On the other hand, cybercrime can have a detrimental impact on Internet users in an expanding cyberspace, so these problems must be corrected promptly to limit their impact. Developing cybersecurity awareness among Internet users is crucial: because cybercrime can occur anywhere and can affect any person, business, or place, cybersecurity education is essential. Cybersecurity is the state of having one's electronic data safeguarded from misuse and illegal activity, or the measures taken to achieve that state. The rapid development of information and communication technologies (ICT) has drastically altered our lives. The World Wide Web makes it simple for people and organizations to navigate various types of information, yet it can endanger people's lives if used maliciously.
3 Proposed Methodology
The research methodology used in this cybersecurity study constrains the work by defining explicit controls on its processes. Research methods also ensure that the work is managed and developed thoughtfully. The types of research methods commonly used in this area include experimental, investigative, correlational, and review methods. The research covers operations and several facility adjustments that can lead to usability changes. The work is shaped by a process of action research, online data, and research analysis. Online data and action research help to collect secondary data that support the analysis and define how the information is developed. An essential topic of study in cybersecurity is data integrity. Current cyberattacks generate large amounts of traffic to interrupt services and attempt to defeat authorization and authentication. Cybersecurity is defined as the protection of cyber components and processes from risks to confidentiality, integrity, and availability. As a result, security standards must be followed to protect users from hacking and other malicious activities.
Threats to networks are a problem that is spreading throughout the world and growing worse and more numerous every day, because attackers can exploit them to carry out cyberattacks. These threats include hardware failure, existing vulnerabilities, program bugs, malicious code, and network intrusion by a local or remote hacker.
4 The Vulnerability of Higher Education
Even though virtually every major industry faces substantial cybersecurity risks, higher education is particularly exposed for several significant reasons. As stated in [1], one reason is the special culture of academia, which enjoys a degree of openness and transparency that most other industries do not. Accordingly, universities have placed a high priority on ensuring that their employees, students, audience, and benefactors can communicate effortlessly. Another factor is the history of faculties and institutions, in particular how long they have been online. Universities were among the first places to have Internet access, and once Internet access exists, people start to see how far they can get. As the Education Dive article cited in [2] notes, universities have historically been among the top targets for cyberattacks. Cybercriminals use modern tools and techniques to attack academic networks. According to [3], because colleges and universities adopted digital tools and interfaces so early, many of them still rely, for practical, economic, and other reasons, on legacy systems that are particularly susceptible to attack. The same piece states that many institutions use content that was published years ago. Simply put, cyber attackers infiltrate university systems using cutting-edge strategies and methods. Moreover, IT systems at universities are decentralized and often characterized by ad hoc structures that attackers can easily exploit. A university literature department, for example, is likely to have different technical requirements from other departments. This kind of fragmented structure creates a clear weakness in information security. Among dozens of departments, at least one is likely to have outdated equipment, unpatched operating systems, poor email filtering, incomplete data backups, or inadequate user training and policy, or a combination of these.

Table 1 lists significant malware-related behaviors that users, specifically college students, exhibit while using the Internet. Among the behaviors listed, willingness to open suspicious sites, links, and email attachments is common and makes users vulnerable to exploitation. The table does not explicitly differentiate how malware attacks users, because that behavior is dynamic. Most of the causes fall under the willingness to open broken sites and links. Malware acts on the user's computer as soon as it gets a chance, and it can be very difficult to monitor and delete once it is hosted on the computer. Therefore, the most common and wisest solution is to take care before entering malicious websites and following links from external sites, as described in Table 1.

Table 1 Users' cybersecurity behavior on malware
No.  Items                                            Agree (%)  Don't know (%)  Disagree (%)
1    Willing to open emailed URL from strangers       20.34      8.6             71.06
2    Scan removable drives before plugging            28.28      17.30           54.42
3    Detect problem if the computer runs slow         54.12      12.23           33.65
4    Aware of anti-virus in personal computer         67.43      4.21            28.36
5    Downloading freeware on the Internet             23.43      19.53           57.04
6    Interested to open URL with multiple extension   34.19      3.89            61.92

As described in [2], although the problem is not specific to higher education, colleges continue to face a significant challenge in finding cybersecurity talent. Organizations frequently pay a premium for cybersecurity expertise because demand for cybersecurity personnel vastly exceeds supply. Universities that attempt to recruit such people must compete with lucrative employment in the private sector. Having understood the rationale behind cybersecurity vulnerabilities in higher education institutions, we can now examine how these vulnerabilities are exploited, as per [4]. Cyberattacks can change their methods constantly. Therefore, members of the sector must be aware of potential attacks and of the different kinds of intrusion into the educational system. They must be aware of the frauds and ruses that can expose them to vulnerabilities. They must be knowledgeable about the different kinds of dangerous software and the defenses against them. Advanced topics, including the secure use of social networking sites and GPS-enabled mobile devices, must also be covered in the curriculum. Additionally, they need to be familiar with concepts like:
• Hardware/desktop security
• Wired and wireless security
• File/folder-level security with password protection
• Malicious software (worms, viruses, Trojans, botnets, spyware, adware, shareware, and zombies) and social networking
5 Techniques Attackers Use to Exploit Vulnerabilities
Malicious actors launch cyberattacks using a range of tools and techniques. Two such tactics are particularly notable. As mentioned in [5], this list is not exhaustive or specific to schools and universities, but it does provide additional knowledge about how hackers exploit cybersecurity vulnerabilities and helps to prevent such attacks in the future.

Fig. 1 Types of attacks common in higher education [6]

Figure 1 [6] shows the attacks that are common in higher education. Denial of Service and phishing account for the largest share of attack techniques and of user vulnerability. Alongside Denial of Service, SQL injection (SQLi) and phishing are common, and they are the focus of this paper. Phishing statistics show how easy it has become to be duped over the Internet. Organizations should consider investing in phishing prevention software to safeguard both individuals and the institution. A phishing prevention service evaluates emails to see whether they contain phishing attacks. Some platforms go a step further and use artificial intelligence to build a model of the organization's communication network by analyzing communications across several email platforms. Table 2, from [5], lists the sectors in which phishing attacks are common. Among these, sites about education and consumer goods have the most vulnerabilities. In the past year, about 29% of users in education organizations, more than twice the rate of the general population, fell for phishing scams masquerading as business communications. Spear phishing attacks target a specific person or organization, as opposed to standard phishing attacks, which try to catch any willing victim. They rely on tailored emails, which frequently impersonate well-known individuals and contain information particular to the target, to persuade the recipient that a request for sensitive information or a wire transfer is legitimate.
Table 2 Exposed sites for phishing attacks
No.  Exposed sites          Percentage
1    Consumer goods         24
2    Education              20
3    Professional services  23
4    Hospitality            20
5    Health care            13

5.1 SQL Injection
SQL injection (SQLi), to put it simply, is an attack on the database that supports an application. It has been described as possibly the heaviest burden on Internet applications, and it is often used to get around password security. The language used for managing and interacting with databases is called Structured Query Language (SQL). SQL injection is the practice of forcing a database to reveal sensitive data by taking advantage of flaws in the code behind input pages, such as login pages that ask for a username and password. For instance, if a login page with username and password fields is shown to an attacker, the attacker can insert SQL code into the password box. If the code that builds the database query is weak, the injected SQL can modify the query and force the database-driven application to give the attacker access (Fig. 2).

Fig. 2 SQL injection types in higher education [7]

Figure 2 [7] shows the primary taxonomy of SQL injection attacks. SQL manipulation, code injection, function call injection, and buffer overflows all fall under database attacks; the classes differ in how they exploit the database system and in which database-centered areas they can attack. As stated in [8], numerous password-protected online programs used by colleges and universities, ranging from student grade reviews to employment data, are theoretically vulnerable to SQL injection. SQL injection is therefore one attack that targets higher education. SQL injections will probably remain prevalent, and very simple for hackers to use, as long as institutions of higher learning continue to have vulnerabilities baked into their underlying databases.
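The login-page scenario described above can be reproduced in a few lines. This is a minimal sketch rather than code from any real university system: it assumes an in-memory SQLite database, a hypothetical `users` table, and a hypothetical `unsafe_login` function that builds its query by string concatenation. Passwords are stored in plain text only to keep the demonstration short.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding one hypothetical account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def unsafe_login(conn, username, password):
    """Vulnerable on purpose: user input is pasted into the SQL text."""
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE username = '{username}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0


conn = build_demo_db()

# A wrong password is rejected, as expected.
print(unsafe_login(conn, "alice", "wrong-password"))  # False

# The injected text closes the quoted string and appends OR '1'='1',
# so the WHERE clause is true for every row and the check is bypassed.
print(unsafe_login(conn, "alice", "' OR '1'='1"))  # True
```

The attacker never learns the real password; the injected fragment simply changes what the query asks the database.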
5.2 Phishing
Phishing attacks, which were mentioned at the beginning of this article, use emails or websites to trick people into divulging personal information such as passwords or credit card details. Typically, the email message is sent to a huge number of people whose email addresses the phisher has obtained by copying them from lists and web pages found online. The message, which is typically well written and official looking, may claim to be from a financial institution, a company, or another business whose services the recipient is thought to use. Often, the recipient is asked to provide the information by clicking a website link in the email. Although the link may look legitimate, the link that is displayed is not the actual website you visit when you click on it. Phishing is a very common cyberattack and has been encountered in many areas. Higher education is among the sectors where its effect can be observed at a glance. The user statistics in Table 3 show how extensively people are misled by phishing tricks and leak their private information. Although the techniques noted above have proven effective, they can be prevented.

Table 3 Statistical value on cybersecurity phishing behavior
No.  Items                                                        Agree (%)  Don't know (%)  Disagree (%)
1    Enhancing phishing knowledge by reading                      22.30      10.6            67.1
2    Interested to access confidential email                      68.87      2.30            28.83
3    Willing to transmit sensitive information                    10.12      12.23           77.65
4    Check URL before proceeding transaction                      43.43      4.21            52.36
5    Not a target of phishing attacks because of being a student  54.43      10.53           35.04

The behaviors listed in Table 3 illustrate the significant ways in which phishing can take control of a user's identity. Receiving suspicious emails, trusting email messages that announce contests or prizes, and being willing to provide confidential information are the most common behaviors in the majority of higher education societies. This should not be a shock to anyone. The facts about phishing demonstrate how simple it is compared to other types of cyberattack: phishers do not need to hack a system or look for infrastructure flaws. Instead, they merely aim at an organization's people, who are its weakest link. Tricking a single employee out of hundreds or even thousands into opening an email or selecting a link or attachment is all that is required.
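The mismatch described above, where the link that is displayed is not the site actually visited, can be checked mechanically. The sketch below is illustrative only: it assumes the email body is available as HTML, uses hypothetical names (`LinkCollector`, `mismatched_links`) and made-up `.example` domains, and flags an anchor only when its visible text itself looks like a URL whose host differs from the host in `href`.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def hostname(url):
    """Return the lower-cased host of a URL, tolerating a missing scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlparse(url).hostname or "").lower()


def mismatched_links(html):
    """Return (shown text, real href) pairs whose hosts disagree."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    flagged = []
    for href, text in parser.links:
        shown_as_url = text.startswith(("http://", "https://", "www."))
        if shown_as_url and hostname(text) != hostname(href):
            flagged.append((text, href))
    return flagged


body = (
    '<p>Update your details at '
    '<a href="http://login.attacker.example/verify">'
    'https://portal.university.example</a></p>'
)
print(mismatched_links(body))
```

A check like this catches only one phishing trick; links hidden behind text such as "click here" pass through it and still depend on user vigilance.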
6 Prevention Through Better Code and Vigilance
Several techniques exist for preventing cyberattacks, as described in [9]. Some involve techniques that higher education IT professionals need to employ themselves, while others involve practices that everyone in the higher education community, including end users, needs to follow. It is widely acknowledged that preventing SQL injection attacks is not particularly difficult, because a great deal has been written about how to accomplish it. The Open Web Application Security Project (OWASP) provides an overview of SQLi attack prevention measures.
6.1 Prepared Statement
Higher education institutions must build the queries to their underlying databases with prepared statements. According to OWASP, even if an attacker injects SQL commands, prepared statements ensure that the commands cannot be used to change the intent of a query. Essentially, whatever a user enters is handled as data and cannot be turned into executable SQL instructions.
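As an illustration of the idea, here is a minimal sketch using Python's built-in `sqlite3` module, where `?` placeholders play the role of a prepared statement. The `users` table, the `safe_login` function, and the plain-text password are assumptions made for the example, not details from the source.

```python
import sqlite3


def safe_login(conn, username, password):
    """Parameterized query: values are bound by the driver, not parsed as SQL."""
    query = "SELECT COUNT(*) FROM users WHERE username = ? AND password = ?"
    return conn.execute(query, (username, password)).fetchone()[0] > 0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "s3cret"))

print(safe_login(conn, "alice", "s3cret"))       # True
# The classic injection string is now just an incorrect password.
print(safe_login(conn, "alice", "' OR '1'='1"))  # False
```

Because the query text is fixed before any user input arrives, the input can only ever fill the two value slots. A production system would also store salted password hashes rather than plain text.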
6.2 Database Procedures
According to the OWASP documentation, the primary difference between stored procedures and prepared statements is that the SQL code for a stored procedure is defined and stored in the database itself and then called from the application. Stored procedures are not always effective in defending against SQLi attacks, but when created and implemented properly they can be a viable option for colleges and universities, as noted by OWASP and others. Adding a further layer of security to the database, such as multi-factor authentication, is also recommended, as stated in [10]. Because attackers would need access to the other authentication factors, such as a physical device or a biometric, they would struggle to circumvent security mechanisms even if credentials were compromised. Certificate-based authentication, such as client certificates, provides an additional option. Thanks to public key infrastructure (PKI), authorized users can connect to their accounts with this password-free method of authentication without having to remember or enter any passwords that could be hacked.
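The source does not say which second factor to use. As one possible illustration, the sketch below implements a time-based one-time password (TOTP, RFC 6238, built on HOTP from RFC 4226), the scheme behind many authenticator apps, using only the standard library. The function names and the demo secret are assumptions for the example; a real deployment would also need secure secret storage, rate limiting, and tolerance for clock drift.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) with HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def totp(secret: bytes, now=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password: HOTP over the current time window."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)


def verify_second_factor(secret: bytes, submitted: str, now=None) -> bool:
    """Compare in constant time to avoid leaking how many digits matched."""
    return hmac.compare_digest(totp(secret, now), submitted)


# Shared secret from the RFC 4226 test vectors (demo value only).
demo_secret = b"12345678901234567890"
print(hotp(demo_secret, 0))  # 755224
```

With a check like this in place, a stolen password alone is not enough, because the attacker would also need the device that holds the shared secret.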
6.3 Input Validation
SQL injection attacks target databases and programs that do not cross-reference and validate the data entered, as stated in [11]. A logical first step in stopping those attacks is therefore to make sure that input validation is required for any database application being built. Microsoft also lists input validation as a crucial technique for preventing SQLi attacks in its ASP.NET web development platform. A web application running in the background may occasionally trigger a malicious or input validation attack, but most of the time it is an individual who corrupts the system's behavior by entering the data. As discussed in [12], a program or user entering data as part of a user input attack can leave a computer open to unauthorized modifications and damaging commands. Plain text, malicious code, and large-scale data are all dangerous inputs that can be inserted into a system. Before releasing an application, test its input validation, as this is the way to defend against these attacks.
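A common way to apply the validation described above is an allow-list: accept only input that matches an explicit pattern and reject everything else. The sketch below is a minimal example under assumed rules; the field names, patterns, and length limits are hypothetical and would need to match each institution's real data formats.

```python
import re

# Hypothetical rules: what a valid value is allowed to look like.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{3,32}")
STUDENT_ID_PATTERN = re.compile(r"[0-9]{8}")


def validate(value, pattern, field_name):
    """Return the value unchanged if it fully matches, else raise ValueError."""
    if not isinstance(value, str) or pattern.fullmatch(value) is None:
        raise ValueError(f"invalid {field_name}")
    return value


def validate_login_form(username, student_id):
    """Validate every field before it reaches any database code."""
    return (
        validate(username, USERNAME_PATTERN, "username"),
        validate(student_id, STUDENT_ID_PATTERN, "student ID"),
    )


print(validate_login_form("a.tesfaye", "20231234"))

try:
    validate_login_form("' OR '1'='1", "20231234")
except ValueError as error:
    print("rejected:", error)
```

Validation of this kind complements prepared statements rather than replacing them: it also limits oversized input, since the patterns cap the accepted length.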
6.4 Preventing Phishing via Training and Heightened Suspicion
Compared with injection attacks, which can be handled through internal technical remedies, phishing schemes depend largely on academics, staff, and students. Schools and institutions must take a number of measures to ensure that every student is vigilant. Hosting a variety of training sessions on different cyberattacks for university members, and giving them advice from experts, plays a great role in sustaining users' privacy online. Universities should take a number of precautions to guard against phishing. They must stay abreast of the most recent phishing tactics and ensure that their security solutions and policies can counteract threats as they emerge. Making sure that staff members are aware of potential attacks, risks, and defense strategies is equally crucial. In defending against phishing attacks, informing personnel and maintaining systems adequately are essential. Universities should encourage students to enroll in courses that teach them about attack methods such as phishing and raise awareness of the precautions needed to stay safe. Some businesses offer this service, and institutions that are knowledgeable are more likely to devote the necessary time and money to educating their students and employees properly, as stated in [13]. This training needs to be repeated frequently and should expose users to a wide range of phishing attempts, as mentioned in an editorial in Info Security Magazine. Awareness can also be raised by giving examples of actual attacks and by compiling archives of them. The above techniques are not all-encompassing and may not protect against every attack, but with very simple steps they can provide enormous benefits in combating potential cyber threats in higher education, as per [14].
6.5 Email Filtration
As an initial step, stated in [15], education sites should install an email filter that sends suspicious non-university emails to users' junk email folders. It is not a foolproof solution, but it is a critical first step in preventing malicious email from reaching its destination. The requirements are slightly different if you enable software that watches for phishing emails. While the aforementioned criteria may automatically place an email on a content filter's radar, additional warning signs will set off a phishing filter. Phishing emails frequently contain a link or an attachment that leads to malware. A phishing filter may also be triggered by senders that have been identified as having a history of spreading malware.
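The filtering rules described above can be sketched as a small routing function. Everything specific here is an assumption for illustration: the `university.example` domain, the blocked-sender list, the risky file extensions, and the rule that an external message goes to junk only when it also carries a link or a risky attachment. A real mail gateway would use far richer signals.

```python
from email.message import EmailMessage
from email.utils import parseaddr

UNIVERSITY_DOMAIN = "university.example"          # hypothetical domain
BLOCKED_SENDERS = {"prizes@lottery.example"}      # hypothetical malware history
RISKY_EXTENSIONS = (".exe", ".js", ".scr", ".vbs")


def sender_address(msg):
    """Extract the bare, lower-cased address from the From header."""
    return parseaddr(msg.get("From", ""))[1].lower()


def is_external(address):
    """True when the sender is outside the university domain."""
    domain = address.rpartition("@")[2]
    inside = domain == UNIVERSITY_DOMAIN or domain.endswith("." + UNIVERSITY_DOMAIN)
    return not inside


def has_link_or_risky_attachment(msg):
    """Look for the two phishing warning signs named in the text."""
    for part in msg.walk():
        filename = (part.get_filename() or "").lower()
        if filename.endswith(RISKY_EXTENSIONS):
            return True
        if part.get_content_maintype() == "text" and not filename:
            body = part.get_content().lower()
            if "http://" in body or "https://" in body:
                return True
    return False


def route(msg):
    """Return the folder name a message should be delivered to."""
    address = sender_address(msg)
    if address in BLOCKED_SENDERS:
        return "junk"
    if is_external(address) and has_link_or_risky_attachment(msg):
        return "junk"
    return "inbox"


internal = EmailMessage()
internal["From"] = "Registrar <registrar@university.example>"
internal.set_content("Timetables are at https://portal.university.example")

external = EmailMessage()
external["From"] = "IT Support <support@helpdesk.example.net>"
external.set_content("Confirm your password at http://helpdesk.example.net/login")

print(route(internal), route(external))  # inbox junk
```

Keeping the rules in separate small functions makes it easy to tighten or relax one criterion without touching the others, which matters because an overly strict filter will also divert legitimate external mail.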
6.6 Integrating Deep Neural Network in Threatened Areas
Deep neural networks (DNNs) have emerged as a potent method for tackling long-standing supervised and unsupervised AI tasks in areas including computer vision, speech recognition, and natural language processing. The classification of Android malware, event detection, and fraud detection are the three major cybersecurity use cases to which we seek to apply DNNs in this study. Each use case's dataset includes real samples of known malicious and benign activity. The effective DNN architecture is selected by conducting numerous experimental trials over network parameters and network structures. In every use case, the DNN performs better than traditional machine learning classifiers, and the same architecture outperforms them in every application. With further training or a few extra layers added to the current designs, the reported DNN results could be improved significantly. Artificial neural networks (ANNs) are mathematical models consisting of a collection of artificial neurons, typically referred to as units, connected by edges. They were inspired by the properties of biological brain networks, where nodes stand in for biological neurons and edges for synapses. A feed-forward network (FFN) is one example of an ANN: a set of units connected by edges in a single direction, without forming a cycle. FFNs are the most widely used and most straightforward type. A subclass of FFN known as the multilayer perceptron (MLP) consists of three or more layers of artificial neurons: an input layer, one or more hidden layers, and an output layer. When the data is complex, the number of hidden layers can be increased, as noted in [16]. The number of hidden layers is therefore a parameter that depends on the complexity of the data. Together, these components form an acyclic graph that transmits data or signals forward from layer to layer without any feedback connections.
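The layer-to-layer flow described above can be made concrete with a small forward pass. This is a sketch of the general MLP structure only, not the architecture used in the study: the layer sizes, the ReLU and sigmoid activations, and the random untrained weights are all assumptions, and the four input features are placeholders for whatever a malware or fraud dataset would supply.

```python
import math
import random


def relu(x):
    return x if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in, n_out, rng):
    """One fully connected layer: a weight row per output unit, plus biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_mlp(n_inputs, hidden_sizes, n_outputs, seed=0):
    """The number of hidden layers is a parameter: one per entry in hidden_sizes."""
    rng = random.Random(seed)
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    return [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(layers, features):
    """Pass signals strictly forward, layer by layer, with no cycles."""
    activations = list(features)
    for index, (weights, biases) in enumerate(layers):
        is_output = index == len(layers) - 1
        activation = sigmoid if is_output else relu
        activations = [
            activation(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Four placeholder features, two hidden layers, one output score in [0, 1].
mlp = build_mlp(n_inputs=4, hidden_sizes=[8, 8], n_outputs=1)
print(forward(mlp, [0.1, 0.5, 0.0, 1.0]))
```

Adding an entry to `hidden_sizes` deepens the network, which mirrors the point that the number of hidden layers is chosen according to the complexity of the data. The printed score is meaningless until the weights are trained on labeled samples.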
7 The Keys to a Secure Future
To build a more secure and financially stable future for higher education, it is crucial to understand vulnerabilities, as described in [17], how common cyberattacks work, as discussed in [18], and how to protect against them. Because cyber threats are always changing, however, there is no guarantee that current dangers, and the methods employed to counter them, will resemble future threats. What we ought to do is fundamentally change the way software is written today. We need to strengthen software development so that people are aware of security risks and vulnerabilities. Another key to meeting future cybersecurity challenges, briefly mentioned above, is to employ a strong and stable team of professionals in the field. Of course, that is easier said than done, because college budgets are tight and scarce, as noted in [19]. Still, there are ways around this problem, including hiring skilled freelancers or contractors who want to work remotely. Cybersecurity challenges in higher education are vast and the costs of solving them are high; however, the potential financial and reputational risks associated with poor defenses, described in [6], may be even greater. Institutions throughout the higher education sector may also find that effective cybersecurity solutions pay for themselves in the long run.
8 Discussions
We evaluated many forms of online fraud, malware, phishing, and password threats. The results show that the respondents' behavior is extremely vulnerable in all respects and would put them at risk from cybersecurity threats. The study's conclusions agree with those of earlier research on several cybersecurity behaviors. Student information systems, with names like Banner and Canvas, among others, are common in colleges. They are essential components of the college process for students, administrators, and teachers alike. As stated in [20], students, administrators, and faculty frequently use these systems to pay for, enroll in, change, and access courses, enter grades, access housing assignments, update contact information, and perform a variety of other tasks. Naturally, this makes them a desirable target for attackers. As with other application software, these programs have their own weaknesses.
9 Recommendations
The preceding sections demonstrate how these respondents' cybersecurity practices can render them susceptible to threats. If people were aware of these problems, some of the hazards might be eradicated or at least diminished. One step the concerned parties might take to safeguard such groups from changing cybersecurity dangers is to provide knowledge that improves their comprehension of these concerns. It is therefore crucial to educate people about cybersecurity in order to safeguard Internet users against future cybercrimes and evolving online threats. In accordance with the studies, the researchers strongly urge the introduction of formal cybersecurity education to address the escalating cybersecurity challenges.
10 Future Works
Due to time constraints, many modifications, tests, and experiments have been postponed (experiments with real data are usually very time-consuming and can require days to finish a single run). Future research will focus on more in-depth examination of certain mechanisms, fresh proposals for trying out new techniques, or simple curiosity. In describing and building an understanding of the basic cyberattacks in higher education, there are a few things we would have liked to attempt. This paper focuses mainly on the common and severe attacks that are frequently attempted on the higher education community. Because attacks are dynamic and change their models as technology is upgraded and modified, the information in this paper has to cover those changes as well. The paper therefore needs to be updated regularly with the new attacks to be considered in higher education. These ideas could be tested in future work as new information is gathered.
11 Conclusion
An overview of a few pieces of literature reveals that the IT sector has been monitoring hackers and cyber criminals for years. Therefore, in the near future, we need a cybersecurity curriculum that builds on today's young people's understanding of cybersecurity and, ultimately, supplies the IT industry with educated, security-certified specialists. Cybersecurity education will help users recognize the potential problems they may experience while using different online tools in day-to-day activities. However, teaching cybersecurity presents a number of challenges. These include teachers' lack of skill, finance, and resources, as well as their degree of knowledge. Through cybersecurity education in schools, all stakeholders work together to determine the best ways to protect children from online crime and bullying. Media such as radio and television also play a significant role in educating society through cybersecurity education efforts, because people find such advertising more engaging. Therefore, it is necessary to establish and, more significantly, implement an effective cybersecurity policy at all levels. In the future, the role of
government and the involvement of the education system in assuring cybersecurity awareness will lead to a highly secure nation.
References
1. Adams, A., Blanford, A.: Security and online learning: to protect and prohibit. In: Usability Evaluation of Online Learning Programs, pp. 331–359. IGI Global, Hershey, PA, USA (2003)
2. Kaspersky: Digital education: The cyber risks of the online classroom (2020)
3. Banik, D., Ekbal, A., Bhattacharyya, P.: Statistical machine translation based on weighted syntax–semantics. Sādhanā 45(1), 1–12 (2020)
4. Mello, S.: Data Breaches in higher education institutions. In: Honors Theses and Capstones. University of New Hampshire, Durham, NH, USA (2018)
5. Elgelany, A., Gaoud, W.: Cloud computing: empirical studies in higher education a literature review. Int. J. Adv. Comput. Sci. Appl. 8(10), 121–127 (2017). https://doi.org/10.14569/IJACSA.2017.081017
6. ISO/IEC 27002:2013 Information technology - Security techniques - Code of practice for information security controls. Standard; International Organization for Standardization, Geneva, Switzerland (2014)
7. Banik, D., Ekbal, A., Bhattacharyya, P., Bhattacharyya, S., Platos, J.: Statistical-based system combination approach to gain advantages over different machine translation systems. Heliyon 5(9), e02504 (2019)
8. Pardeshi, V.H.: Cloud Computing for Higher Education Institutes
9. Banik, D.: Phrase table re-adjustment for statistical machine translation. Int. J. Speech Technol. 24(4), 903–911 (2021)
10. Ahmed, A.E.A., Badawy, M., Hefny, H.: Exploring and measuring the key performance indicators in higher education institutions. Int. J. Intell. Comput. Inf. Sci. 18, 37–47 (2018)
11. Wangen, G.: Quantifying and analyzing information security risk from incident data; graphical models for security, pp. 129–154. In: Albanese, M., Horne, R., Probst, C.W. (eds.) Springer, Cham, Switzerland, London, UK (2019)
12. Whitman, M.: Management of Information Security. Cengage Learning, Inc., Boston, MA, USA (2018). ISBN 9780357691205
13. Banik, D., Ekbal, A., Bhattacharyya, P.: Wuplebleu: the wordnet-based evaluation metric for machine translation. In: 15th International Conference on Natural Language Processing, p. 104 (2018)
14. Ulven, J.: High level information security risk in higher education. Master's thesis, Norwegian University of Science and Technology, Trondheim, Norway (2020)
15. Beaudin, K.: College and university data breaches: regulating higher education cybersecurity under state and federal law. J. Coll. Univ. Law 41, 657–693 (2015)
16. Banerjee, A., Banik, D.: Pooled hybrid-spectral for hyperspectral image classification. In: Multimedia Tools and Applications, pp. 1–13 (2022)
17. ISO/IEC 27002:2013 Information technology - Security techniques - Information security risk management; standard. International Organization for Standardization, Geneva, Switzerland (2018)
18. ISO/IEC 27002:2013 Information Technology - Security techniques - Code of practice for information security controls. Standard; International Organization for Standardization
19. Wangen, G., Halstensen, C., Snekkenes, E.: A framework for estimating security risk assessment method completeness
20. Internet World Stats: Internet penetration in Asia December 31, 2013. In: Jones, H.B., Heinrichs, R.L. (eds.) (2012) Do business students practice smartphone security? J Comput Inf Syst 22–30. 14. Kshetri, N. (2010)
A Two-Level Fuzzy Model for Filtering Signals of the Automatic Dependent Surveillance-Broadcast
Bobyr Maxim, Arkhipov Alexander, and Milostnaya Natalia
Abstract A two-level fuzzy model for filtering complex signals such as automatic dependent surveillance-broadcast is presented in the article. The first and second levels of the fuzzy model consist of three operations: fuzzification, fuzzy composition and defuzzification. Input variables of two levels are given by trapezoidal membership functions that are formed automatically, depending on the characteristics of the complex signal. The output function at the first level is given by a singleton function, and the defuzzification is carried out using a simplified center of gravity model. The proposed two-level fuzzy model makes it possible to increase the sensitivity of the ADS-B signal receiver and correctly detect the received signal. Keywords ADS-B · Fuzzy logic · Filtering complex signals · Fuzzy filter
1 Introduction
Fuzzy logic is used to solve various problems, e.g., data recognition [1, 2], CNC machine control [3] and signal filtering [4]; it makes it possible not only to increase the sensitivity of such systems but also to correct some errors. Fuzzy logic also makes it possible to detect messages correctly even when the signal-to-noise ratio approaches unity [5, 6], for instance, for automatic dependent surveillance-broadcast (ADS-B). A search for articles on https://www.sciencedirect.com/ showed that fuzzy logic is practically not used in these systems. ADS-B systems transmit a large quantity of information, such as identifier, position, heading, aircraft speed and altitude, weather conditions and textual flight advice. The ADS-B signal in Mode-S is transmitted at a frequency of 1090 MHz and is subject to strong noise, especially if the receiver is located at a considerable distance from the aircraft. ADS-B message receivers apply various analog and digital filters to process the signal. The signal uses pulse frequency modulation in the Manchester code, is 120 bits long and lasts 120 μs, while 112 bits of useful information are transmitted, including check bits. An example of an ADS-B signal in Mode-S is shown in Fig. 1. Figure 2 shows its structure, including a preamble and a data block. The received signal is highly distorted and is shown in Fig. 3. This article presents an approach based on fuzzy logic that allows filtering out noise, detecting a signal and correcting errors during binarization.
With the above in mind, the novelty of the article is as follows. We propose a method for converting a digitized complex analog signal into a binary code based on a two-level fuzzy model for filtering signals. The two-level fuzzy model is used to correct the errors that remain after digitization and filtering of the input signal. In order to reduce the complexity of computational operations, the proposed method uses the same trapezoidal input membership function for both levels of the fuzzy model. This solution makes it possible to use the same computing modules in the practical implementation of the proposed method on FPGAs. The effectiveness of the proposed solutions is confirmed by the experimental calculation presented in the fourth section.
Fig. 1 ADS-B signal (1090 MHz) in Mode-S mode
Fig. 2 Structure of the ADS-B signal
Fig. 3 Real received ADS-B signal
B. Maxim (✉) · A. Alexander · M. Natalia, Southwest State University, Kursk 305040, Russian Federation, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_5
2 Signal Filtering Various analog and digital filters are used to filter signals such as the ADS-B signal. Digital filters include averaging [7], median [8], Kalman [9], FIR [10], IIR [11], fuzzy [12], etc. Each filter has its own advantages and disadvantages.
2.1 Average Filter
This filter performs the averaging of several signal samples and is given by the formula:
$$y_i = \frac{1}{n}\sum_{k=i-\frac{n}{2}+1}^{i+\frac{n}{2}} x_k \tag{1}$$
where $x_k$ is the input signal, $n$ is the number of samples to average, $i$ is the current sample number, and $y_i$ is the averaged signal value. The averaging filter reduces noise and improves the signal-to-noise ratio. The signal-to-noise ratio will double when using this filter. The input signal, the 4-sample filtered signal (n = 4) and the re-filtered signal are shown in Fig. 4. The disadvantage of the filter is the smoothing of the signal edges, which leads to a change in the duration of the pulses during binarization.
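The averaging in Eq. (1) can be illustrated with a minimal Python sketch. The function name `average_filter`, the assumption that n is even, and the clamping of indices at the signal borders are illustrative choices; the text does not specify how the edges are handled.

```python
def average_filter(x, n=4):
    """Moving average in the spirit of Eq. (1): mean of x_k for k = i-n/2+1 .. i+n/2."""
    half = n // 2  # assumes an even n, as in the n = 4 example of the text
    last = len(x) - 1
    y = []
    for i in range(len(x)):
        # Indices outside the signal are clamped to the nearest valid sample
        # (an assumption; Eq. (1) does not define the border behavior).
        window = [x[min(max(k, 0), last)] for k in range(i - half + 1, i + half + 1)]
        y.append(sum(window) / len(window))
    return y
```

A single narrow pulse passed through this sketch is spread over n output samples, which is the edge smoothing that the text names as the filter's disadvantage.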
2.2 Median Filter
The median filter selects the central value from a group of samples sorted in ascending order and is calculated by the formula:
$$y_i = \operatorname{midpoz}\left(\operatorname{sort}\left(x_{i-\frac{n}{2}+1}, \ldots, x_{i+\frac{n}{2}}\right)\right) \tag{2}$$
where $x_i$ is the input signal, $n$ is the number of samples in the window, $i$ is the current sample number, sort() is the sort function, midpoz() is the middle element function, and $y_i$ is the median signal value. The disadvantage of the median filter is the increase in computational complexity due to the sorting required for each output sample, whose cost depends on the number of elements to be sorted.
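A minimal Python sketch of Eq. (2) follows. The name `median_filter`, the border clamping, and the choice of the upper middle element for an even-sized window are assumptions made for illustration; the text does not define midpoz() for even n.

```python
def median_filter(x, n=4):
    """Median filter in the spirit of Eq. (2) over the window i-n/2+1 .. i+n/2."""
    half = n // 2
    last = len(x) - 1
    y = []
    for i in range(len(x)):
        # Border samples are repeated (assumption, not stated in the text).
        window = sorted(x[min(max(k, 0), last)] for k in range(i - half + 1, i + half + 1))
        # For an even window there is no single middle element; the upper one
        # is taken here as a hypothetical stand-in for midpoz().
        y.append(window[len(window) // 2])
    return y
```

The call to `sorted` for every output sample is the extra computational cost that the text refers to.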
Fig. 4 Result of ADS-B signal processing by the averaging filter
2.3 Kalman Filter
The Kalman filter is a recursive filter that evaluates the state of the system at each step, based on the current measurement and the previous state of the system. The output value of the filter is calculated by the formula:
$$y_i = \alpha\left(x_i + k\left(x_i - x_{i-n}\right)\right) + (1-\alpha)\,y_{i-1} \tag{3}$$
where $x_i$ and $x_{i-n}$ are the input signal at the current position $i$ and at position $i-n$, $\alpha$ and $k$ are filter coefficients, and $y_i$ and $y_{i-1}$ are the values of the filtered signal at the current and previous positions. The disadvantage of the filter is the difficulty of selecting its coefficients.
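The recursion in Eq. (3) can be sketched in a few lines of Python. The function name `recursive_filter`, the default coefficient values, and the initialization (the previous output starts at the first sample, and samples before the start repeat the first sample) are assumptions for illustration only; the text gives no initial conditions or coefficient values.

```python
def recursive_filter(x, alpha=0.5, k=0.1, n=1):
    """Recursive filter following Eq. (3):
    y_i = alpha * (x_i + k * (x_i - x_{i-n})) + (1 - alpha) * y_{i-1}."""
    y = []
    prev = x[0] if x else 0.0  # assumed initial state y_{-1} = x_0
    for i, xi in enumerate(x):
        # Before n samples are available, the first sample is reused (assumption).
        x_lag = x[i - n] if i >= n else x[0]
        yi = alpha * (xi + k * (xi - x_lag)) + (1 - alpha) * prev
        y.append(yi)
        prev = yi
    return y
```

With `k=0` the sketch reduces to simple exponential smoothing, which gives a convenient starting point when tuning the two coefficients.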
2.4 FIR Filter
A Finite Impulse Response (FIR) filter is a non-recursive linear digital filter. The resulting value is calculated as the sum of several samples, each of which is multiplied by its own coefficient:
$$y_i = \sum_{k=0}^{n-1} x_{i-k}\, s_k \tag{4}$$
where $x_{i-k}$ is the input signal at position $i-k$, $s_k$ is the filter coefficient for the $k$-th element, and $y_i$ is the filtered signal value. The results of FIR filter processing of weakly and strongly noisy signals are shown in Fig. 5. The disadvantages of the FIR filter include the difficulty of selecting the coefficients. The output of each of the filters described above is a non-binary function (see Fig. 6).
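Eq. (4) is a direct-form convolution and can be sketched as follows. The name `fir_filter` and the treatment of samples before the start of the signal as zero are assumptions; the coefficient values in the usage note are arbitrary and are not the ones used by the authors.

```python
def fir_filter(x, s):
    """Direct-form FIR filter following Eq. (4): y_i = sum_{k=0}^{n-1} x_{i-k} * s_k."""
    y = []
    for i in range(len(x)):
        acc = 0.0
        for k, coeff in enumerate(s):
            # Samples before the start of the signal are treated as zero (assumption).
            if i - k >= 0:
                acc += x[i - k] * coeff
        y.append(acc)
    return y
```

For example, `fir_filter(samples, [0.25, 0.25, 0.25, 0.25])` is a 4-tap causal average, while unequal coefficients shape the frequency response, which is why their selection is the hard part.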
Fig. 5 Result of ADS-B signal processing by the median filter
Fig. 6 Result of processing weakly and strongly noisy signals
2.5 Fuzzy Filter
The fuzzy filter converts the signal received from the ADS-B system into a binary function and corrects errors that occurred during binarization. In general, the fuzzy filter is calculated by the formula:
$$y_i = F(x_{i-n}, \ldots, x_i) \tag{5}$$
where $x_{i-n}, \ldots, x_i$ are the input signals at positions $i-n$ through $i$, $F()$ is the fuzzy function, and $y_i$ is the filtered signal value. The advantage of a fuzzy filter is that it transforms a complex signal into a binary code based on a combination of fuzzy rules.
3 A Two-Level Fuzzy Model for Filtering Signals
Before the two-level fuzzy model is applied, the ADS-B signal is smoothed by one of the existing filters; in our work, a FIR filter was used. The FIR filter receives a 14-bit integer signal at a frequency of 25 MHz. The output of the FIR filter is also a 14-bit integer. A binary signal is taken from the output of the fuzzy filter (see Fig. 7).
Fig. 7 Filtering complex signals based on the two-level fuzzy model. Blocks shown: input data; averaging/median/FIR filter; the first level of the fuzzy model (membership functions block, fuzzy composition block, defuzzifier); the second level of the fuzzy model (membership functions block, fuzzy composition block, block of logic inference); binary signal in the Manchester code; decoder
The first level of the fuzzy model consists of three blocks: an automatic membership function generator, a compositional inference block and a defuzzifier. The second level of the model also consists of three blocks: the membership function generator, the compositional inference block and the defuzzifier. The output binary data is sent to the device, which decodes the input binary signal into an ADS-B message.
3.1 First Level of the Fuzzy Model
First, at the first level of the fuzzy model, an input trapezoidal membership function consisting of two variables $DX_1$ and $DX_2$ is automatically formed [13–15]:
$$DX_1 = \begin{cases} 1, & \text{if } x \in [0; P_0], \\ \dfrac{P_1 - x}{P_1 - P_0}, & \text{if } x \in [P_0; P_1], \\ 0, & \text{if } x \in [P_1; 5000], \end{cases} \qquad DX_2 = \begin{cases} 0, & \text{if } x \in [0; P_0], \\ \dfrac{x - P_0}{P_1 - P_0}, & \text{if } x \in [P_0; P_1], \\ 1, & \text{if } x \in [P_1; 5000], \end{cases} \tag{6}$$
where $P_0 = m_1(\alpha_{max} - \alpha_{min})$, $P_1 = m_2(\alpha_{max} - \alpha_{min})$, $x$ is the input variable, $\alpha_{max}$ and $\alpha_{min}$ are the maximum and minimum values of the sample, and $m_1$ and $m_2$ are adjustment coefficients; by default $m_1 = 0.36$ and $m_2 = 0.65$. The input membership function is shown graphically in Fig. 8. The output membership function is a singleton function; it is determined by formula 7 and shown in Fig. 9.
$$F_1(x) = \min\left(DX_1, DX_2, \mathrm{Edge}\right) \tag{7}$$
Fig. 8 Input membership function of the first level of the fuzzy model
Fig. 9 Output membership function of the first level of the fuzzy model
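The automatic construction of the input membership functions in Eq. (6) can be sketched in Python as follows. The function names are illustrative. The `binarize` helper is a hypothetical decision rule added only to show how the two memberships could be compared; it is not the output function of formula 7, whose Edge term is not defined in the text.

```python
def thresholds(samples, m1=0.36, m2=0.65):
    """P0 and P1 of Eq. (6), derived from the spread of the observed samples."""
    span = max(samples) - min(samples)
    return m1 * span, m2 * span

def dx1(x, p0, p1):
    """Membership DX_1: 1 below P0, falling linearly to 0 at P1."""
    if x <= p0:
        return 1.0
    if x >= p1:
        return 0.0
    return (p1 - x) / (p1 - p0)

def dx2(x, p0, p1):
    """Membership DX_2: 0 below P0, rising linearly to 1 at P1."""
    if x <= p0:
        return 0.0
    if x >= p1:
        return 1.0
    return (x - p0) / (p1 - p0)

def binarize(x, p0, p1):
    """Hypothetical rule (an assumption, not from the text): 1 when DX_2 exceeds DX_1."""
    return 1 if dx2(x, p0, p1) > dx1(x, p0, p1) else 0
```

Because the two memberships are complementary on $[P_0; P_1]$, they always sum to one, so a single comparison is enough to decide which of them dominates.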
The output values obtained at the first level of the fuzzy model are transferred to the second level.
3.2 Second Level of the Fuzzy Model
At the second level of the fuzzy model, false positives that may be obtained at the first level are excluded. To do this, a trapezoidal membership function is formed that includes two input variables $DX_0$ and $DX_1$ and takes into account several values obtained at the first level, in three positions before and after the current sample:
(8)
The value of the variable $DX_1$ is 1 if the number of samples equal to one in the selected range is greater than two. Conversely, the value of the variable $DX_0$ is 1 if the number of samples equal to one in the selected range is less than or equal to one. The input membership function for the second level is shown in Fig. 10. An example of the operation of the logical inference block and the area for calculating logical variables is shown in Fig. 11. This block analyzes the current reading and the three previous and three subsequent values obtained at the output of the first level of the fuzzy model.
Fig. 10 Input membership function of the second level of the fuzzy model
Fig. 11 Logic inference (zones a, b, c and d over positions i-3, i-2, i-1, i, i+1, i+2, i+3)
Seven variables are analyzed in logic inference. Boolean variable a takes on a value of one if the number of ones in the previous three readings is more than one (see Fig. 11, green zone). Boolean variable b is equal to one if the current sample is equal to one (see Fig. 11, brown zone). Boolean variable c takes on a value of one if the number of ones in the next three samples is more than one (see Fig. 11, blue zone). Boolean variable d takes on a value of one if the central sample and its two neighbors are all equal to one (see Fig. 11, yellow zone). At the output of the logical inference block, the logical value of the function $F_2(x)$ is calculated according to formula 9. It takes a value equal to one in four cases:
• when the variables b and a are equal to one, i.e., when the current sample is equal to one and at least two of the three previous ones are equal to one;
• when the variables b and c are equal to one, i.e., when the current sample is equal to one and at least two of the three subsequent ones are equal to one;
• when the variables a and c are equal to one, i.e., when at least two of the three previous samples are equal to one and at least two of the next three are equal to one;
• when the variable d is equal to one, i.e., when the central sample, the one before it and the one after it are all equal to one.
$$F_2(x) = \max\left[\min(a, b);\ \min(b, c);\ \min(a, c);\ d\right] \tag{9}$$
where
$$a = \begin{cases} 1, & \text{if } \sum_{k=i-3}^{i-1} DX_{1k} \ge 2, \\ 0, & \text{otherwise}; \end{cases} \qquad b = DX_1; \qquad c = \begin{cases} 1, & \text{if } \sum_{k=i+1}^{i+3} DX_{1k} \ge 2, \\ 0, & \text{otherwise}; \end{cases} \qquad d = \begin{cases} 1, & \text{if } \sum_{k=i-1}^{i+1} DX_{1k} = 3, \\ 0, & \text{otherwise}. \end{cases}$$
The proposed logic function (see Eq. 9) makes it possible to accurately determine the start and end of the pulse and to remove false positives.
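The logic inference of Eq. (9) operates on the binary sequence produced by the first level and can be sketched in Python as follows. The function name `second_level` and the treatment of positions outside the sequence as zeros are assumptions for illustration; the FPGA implementation described by the authors is not reproduced here.

```python
def second_level(bits):
    """Apply F2 = max[min(a, b); min(b, c); min(a, c); d] from Eq. (9) to a 0/1 sequence."""
    n = len(bits)

    def at(k):
        # Positions outside the sequence are treated as 0 (assumption).
        return bits[k] if 0 <= k < n else 0

    out = []
    for i in range(n):
        a = 1 if at(i - 3) + at(i - 2) + at(i - 1) >= 2 else 0  # previous three samples
        b = at(i)                                               # current sample
        c = 1 if at(i + 1) + at(i + 2) + at(i + 3) >= 2 else 0  # next three samples
        d = 1 if at(i - 1) + at(i) + at(i + 1) == 3 else 0      # center and both neighbors
        out.append(max(min(a, b), min(b, c), min(a, c), d))
    return out
```

In this sketch an isolated one surrounded by zeros is suppressed, because none of the four conditions holds, while a single zero inside a run of ones is restored through the min(a, c) term.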
4 Experiment Results The proposed two-level fuzzy model for filtering signals was implemented on the basis of the Xilinx XC7A35T FPGA. The results of signal processing with a good signal-to-noise ratio are shown in Fig. 12. The output signal obtained from FIR filter is shown in Fig. 12a. The binarized signal at the output of the first level of the fuzzy model is shown in Fig. 12b. The corrected signal at the output of the second level of the fuzzy model is shown in Fig. 12c.
Fig. 12 Result of processing a low-noise signal by a two-level fuzzy filter
The results of signal processing with a bad signal-to-noise ratio are shown in Fig. 13. The output signal obtained from the FIR filter is shown in Fig. 13a. The binarized signal at the output of the first level of the fuzzy model is shown in Fig. 13b. The corrected signal at the output of the second level of the fuzzy model is shown in Fig. 13c. The FPR (False Positive Rate) coefficient was used as the evaluation index of the proposed model:
$$FPR = \frac{FN}{TP + FN} \tag{10}$$
where True Positive (TP) refers to the number of ADS-B messages that are correctly decoded and False Negative (FN) refers to the number of ADS-B messages that are incorrectly decoded. Figure 14 shows a comparison of the FPR indicator when decoding ADS-B messages with and without the proposed two-level fuzzy model, at different signal-to-noise ratios (SNR). The number of correctly recognized ADS-B messages is greater when using the two-level fuzzy model than without it (see Fig. 14). This conclusion is based on the calculation of the FPR coefficient.
Fig. 13 Result of processing a high-noise signal by a two-level fuzzy filter
Fig. 14 Results of a comparative analysis of the work of a two-level fuzzy model
5 Conclusion A two-level fuzzy model that allows converting weakly and strongly noisy signals into a binary code is considered in the article. The model can be used to filter complex signals such as ADS-B messages in small spacecraft modules. The proposed fuzzy filtering model makes it possible to increase the sensitivity of the ADS-B signal receiver and correctly detect the received signal. Acknowledgements The work was prepared as part of the implementation of the RSF project No. 23-21-00071. The authors are grateful to the foundation for their support.
References
1. Ghosh, S.K., Ghosh, A., Bhattacharyya, S.: Recognition of cancer mediating biomarkers using rough approximations enabled intuitionistic fuzzy soft sets based similarity measure. Appl. Soft Comput. 124, 109052 (2022). https://doi.org/10.1016/j.asoc.2022.109052
2. Piegat, A.: Fuzzy Modelling and Control. Physica-Verlag, Heidelberg (2001). https://doi.org/10.1007/978-3-7908-1824-6
3. Bobyr, M., Yakushev, A., Dorodnykh, A.: Fuzzy devices for cooling the cutting tool of the CNC machine implemented on FPGA. Measur: J Int Measur Confederation 152, 107378 (2020). https://doi.org/10.1016/j.measurement.2019.107378
4. Bobyr, M., Milostnaya, N., Bulatnikov, V.: The fuzzy filter based on the method of areas' ratio. Appl. Soft Comput. 117 (2022). https://doi.org/10.1016/j.asoc.2022.108449
5. Ganjeh-Alamdari, M., Alikhani, R., Perfilieva, I.: Fuzzy logic approach in salt and pepper noise. Comput. Electr Eng 102, 108264 (2022). https://doi.org/10.1016/j.compeleceng.2022.108264
6. Rizaner, A., Ulusoy, A.H., Amca, H.: Adaptive fuzzy assisted detector under impulsive noise for DVB-T systems. Optik 127(13), 5196–5199 (2016). https://doi.org/10.1016/j.ijleo.2016.02.079
7. Lin, Y.D., Tan, Y.K., Tian, B.: A novel approach for decomposition of biomedical signals in different applications based on data-adaptive Gaussian average filtering. Biomed. Signal Process. Control 71 PA, 103104 (2022). https://doi.org/10.1016/j.bspc.2021.103104
8. Tay, D.: Sensor network data denoising via recursive graph median filters. Signal Process. 189, 108302 (2021). https://doi.org/10.1016/j.sigpro.2021.108302
9. Sharma, S., Kulkarni, R., Ajithaprasad, S., Gannavarpu, R.: Fringe pattern normalization algorithm using Kalman filter. Results Opt. 5, 100152 (2021). https://doi.org/10.1016/j.rio.2021.100152
10. Patali, P., Kassim, S.: High throughput and energy efficient linear phase FIR filter architectures. Microprocessors and Microsystems 87, 104367 (2021). https://doi.org/10.1016/j.micpro.2021.104367
11. Bui, N., Nguyen, T., Park, S., Choi, J., Vo, T., Kang, Y., Oh, J.: Design of a nearly linear-phase IIR filter and JPEG compression ECG signal in real-time system. Biomed. Signal Process. Control 67, 102431 (2021). https://doi.org/10.1016/j.bspc.2021.102431
12. Bobyr, M., Milostnaya, N., Bulatnikov, V.: The fuzzy filter based on the method of areas' ratio. Appl. Soft Comput. 117, 108449 (2022). https://doi.org/10.1016/j.asoc.2022.108449
13. Bobyr, M., Kulabukhov, S., Milostnaya, N.: Fuzzy control system of robot angular attitude. In: 2nd International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM), pp. 1–6 (2016). https://doi.org/10.1109/ICIEAM.2016.7910970
14. Bobyr, M., Titov, V., Belyaev, A.: Fuzzy system of distribution of braking forces on the engines of a mobile robot. MATEC Web Conf. 79. EDP Sciences (2016). https://doi.org/10.1051/matecconf/20167901052
15. Bobyr, M., Kulabukhov, S.: Simulation of control of temperature mode in cutting area on the basis of fuzzy logic. J. Mach. Manuf. Reliab. 46, 288–295 (2017). https://doi.org/10.3103/S1052618817030049
Toward More Robust Multiclass Aerial Solar Panel Detection and Classification
Indrajit Kar, Sudipta Mukhopadhyay, and Bijon Guha
Abstract The challenge of identifying and monitoring multiple types of solar panels has not been studied. Solar panels can be single, double, or double with a water heater on top. Some are packed closely together. Due to installation requirements, solar panels may also have an arbitrary orientation. When combined with the difficulty of detecting different types of panels, this arbitrary orientation degrades the effectiveness of deep learning algorithms by producing false positives and erroneous panel classifications. Furthermore, no research on the identification of various solar panel types has been done yet. In this study, we concentrate on two key problems: first, the detection of different types of solar panels; and second, the arbitrary orientation of these panels. Our method does not rely on horizontal bounding boxes for detection; rather, it starts from horizontal bounding boxes and generates rotated bounding boxes at training time. Using our method, we were able to precisely identify three types of solar panels with various orientations. We compare the identification of the three types of solar panels, namely water heater photovoltaic (WPV), farm type photovoltaic (FPV), and single photovoltaic (SPV), in terms of box loss, objectness loss, classification loss, precision, and recall.
Keywords Aerial imagery · Arbitrary orientation · Oriented region of proposal · Multiclass tiny object detection and solar panel
I. Kar · S. Mukhopadhyay (B) · B. Guha Siemens Technology and Services Private Limited, Bangalore, India e-mail: [email protected] I. Kar e-mail: [email protected] B. Guha e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_6
1 Introduction
From 1992 to 2018, the growth of photovoltaics on a global scale was exponential. During this period, photovoltaics (PV), also known as solar PV, transitioned from a niche field of small-scale applications to a widespread energy source [1]. Several nations initiated incentive programs, such as feed-in tariffs, when solar PV systems were identified as a potential renewable energy source. These initiatives were designed to provide financial incentives for the installation of solar PV systems. For many years, expanding economies, particularly Japan and several pioneering European nations, were the primary drivers of progress. According to the high-resilience scenario of the LEA, more than half of the solar photovoltaic capacity will be installed in China and India by 2050, which will make solar energy the world's largest source of energy [2]. Every year, more solar power plants are installed around the world. Automated diagnostic methods are required to examine these plants and find faults inside their photovoltaic panels. Typically, unmanned aerial vehicles (UAVs) equipped with imaging sensors perform the inspection. Locating the solar panels in the resulting images is the main task in the entire process. For many years, however, scientists relied solely on inefficient image processing methods. Owing to factors such as changes in the intensity of sunlight, changes in the UAV flight path and arbitrary panel orientations, image processing-based detectors fail to identify solar panels during UAV navigation. Thermal object detection is one method for addressing these problems. However, identification is challenging in thermal images because of inconsistencies caused by weed shadowing, sunlight reflection [3], or hot areas, and because not all panel edges are visible [4].
With the advancement of artificial intelligence-based applications in recent years, researchers have proposed or investigated various image processing and object detection methods for solar panel detection from aerial images [5]. One of the mainstays of computer vision is object detection, the process of determining the classes of object instances in scenes and simultaneously marking their locations. The success of deep convolutional neural networks (DCNNs) in object detection in natural scenes has motivated researchers in the field of aerial imagery to seek a solution to the difficult ground object recognition problem. Images captured by optical sensors from a significant height and distance have properties that set them apart from those captured by regular consumer cameras. Objects may lie on the ground at arbitrary angles, making them difficult to capture with the horizontal bounding boxes typically deployed in generic object detection frameworks. Detection algorithms are also challenged by the abundance of tiny, closely packed objects in aerial data. Either enlarging the input images or decreasing the downsampling rate of the CNN to keep high-quality features increases the effective resolution of the resulting feature map and hence improves tiny object recognition. However, increasing the detail of feature maps can incur substantial computing costs. Several studies [6–8] proposed reusing the multiscale feature maps from different levels of a CNN to build a feature pyramid as a solution to this issue. High-level features are often used
to detect large objects, while low-level features are used to detect tiny objects. It is computationally expensive to keep high-resolution feature maps from the network's surface to its depths. Detection heads that use low-level features still have a high computational complexity. The overhead view creates an inherently unpredictable orientation of objects, which has two effects on the object detection process and is the primary difference between natural images and aerial images. First, most existing deep learning network models cannot employ rotation-invariant feature representations, which are preferred when identifying objects that may be placed in any orientation. The problem is far from being solved, even though rotation-invariant convolutional neural networks (CNNs) have been used in approaches such as those developed in [9–11]. We target the following challenges and take it as our principal objective to propose a novel solution: (1) The features that emphasize small objects are typically contaminated, especially if the objects belong to many classes, due to the down-sampling procedures at the core of convolutional neural networks (CNNs). (2) The receptive field of low-resolution features may not be proportionate to the size of tiny objects, as discussed in [3]. (3) It is more difficult to localize small objects of differently oriented classes than it is to localize large objects, because a slight change in the bounding box can have a substantial impact on the intersection over union (IoU) measure. Since multisolar panel detection has not been explored in computer vision until now, our contribution is a solution for the detection of three types of arbitrarily oriented solar panels: water heater photovoltaic (WPV), farm type photovoltaic (FPV), and single photovoltaic (SPV).
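Challenge (3) can be made concrete with a short Python sketch that computes the IoU of a box with a copy of itself shifted by the same number of pixels, once for a small box and once for a large one. The box sizes and the shift are arbitrary illustrative values, not data from the study, and the helper names `iou` and `shifted` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def shifted(box, dx):
    """Return the box translated horizontally by dx pixels."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1, x2 + dx, y2)

small = (0, 0, 10, 10)      # illustrative tiny object, 10 x 10 pixels
large = (0, 0, 100, 100)    # illustrative large object, 100 x 100 pixels
iou_small = iou(small, shifted(small, 2))
iou_large = iou(large, shifted(large, 2))
```

With the same 2-pixel localization error, the small box loses a much larger share of its overlap than the large one, so a fixed IoU threshold penalizes tiny objects more heavily.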
2 Literature Reviews
2.1 Multisolar Panel Detection
A few research papers on solar panel detection are discussed below. To the best of our knowledge, there are no research papers on multiclass solar panel detection, and we are the first to have conceived and carried out an experimental study using data containing different classes of solar panels. In the study [12], a customized version of EfficientNet-B7 was trained by transfer learning to distinguish between solar and non-solar tiles in satellite images. EfficientNet-B7 achieves state-of-the-art ImageNet results of 84.4% top-1 and 97.1% top-5 accuracy. The authors fine-tuned EfficientNet-B7 using a training set that included 1295 images without solar panels and 668 images with solar panels. To evaluate the efficacy of the optimization results, the model is compared against a validation dataset consisting of 324 tiles without solar panels and 168 tiles with solar panels. Precision and recall, on average, are both 0.98 for
the classification model, while accuracy is at 0.98 overall. The paper, however, does not address the issue of arbitrary orientation or focus on different types of solar panels. Using the Google Maps Static API, the researchers also created their own dataset. Algorithms for identifying solar panels have been developed in a number of recent studies [13–17]. One such effort that made significant progress in this area is DeepSolar [13], which was trained on more than 350 thousand images and achieved recall and precision of 90% for solar panel recognition and a mean relative error of 2.1% for size estimation of solar panels. While DeepSolar has advanced the field, its methods are still unusable in the multisolar panel and bounding box orientation scenarios.
2.2 Deep Learning
Object detection networks come in several types, one-stage and two-stage as well as one-shot and two-shot detectors, which have various trade-offs in terms of speed and performance [18]. The one-stage network [19–22] has a CNN backbone, a feature pyramid, and detection heads. The two-stage network [23–25] has a CNN backbone, a region proposal network (RPN), and a detection head. Unlike two-shot detectors, a one-shot detector looks only once to detect an object. There are also models with anchors [19–22] and without anchors [23–28]. The models we have used are explained well in the paper Survey of Modern Deep Learning-based Object Detection Models; as a result, we will only discuss a few SOTA models that were not covered in the study [29], along with a few major reasons why the SOTA models failed to model the solar panel data. These state-of-the-art models performed very poorly on the solar panel dataset because they are not optimized for detecting multiple tiny objects [30] and cannot handle arbitrary orientations. Hence, these SOTA models needed to be customized at various stages to fit the solar panel dataset.
2.2.1 Rotated Region of Interest
One common technique for effectively representing objects of arbitrary orientations is to rotate the predicted bounding boxes by specified angles, with those angles derived from the object features driven by a set of anchors. Because angle prediction is highly nonlinear, it is difficult to obtain exact angles from the horizontal proposals. A few studies [31, 32] attempt a solution by first designing rotated anchors and then regressing them to rotated regions of interest (RRoIs). With the progress made by these techniques, rotational region proposals have also been used for ship detection. In the framework, Liu introduced rotation bounding box regression and pooling of a rotated region of interest (RRoI) [33, 34]. Yang
combined a fast R-CNN, RRPN, and dense feature pyramid network to get excellent results. To identify ship parts for rotational region proposals, Zhou applied a semantic segmentation network [35]. Although techniques using a rotation bounding box have been suggested in recent years, particularly for ships [32] and automobiles, the same technology has not been used to identify solar panels in aerial images. The goal of region proposal is to find regions with a high likelihood of holding important data. For instance, an object detection algorithm may include a region proposal system that proposes regions of interest with a high likelihood of containing objects of interest. The overall recognition system then concentrates on those proposed regions. Object proposal is the process of making a set of predictions or guesses that have a high chance of being accurate about the identity of an unknown object. Region proposal for object detection does not depend on what the objects are: without knowing the potential identities of those objects, the region proposal system must locate all regions that contain at least one object. Various angle-based and polygon bounding box encodings have been devised to address object orientation in aerial images, and a strategy for extracting rotated regions was provided to further improve the detection precision. Rotation bounding boxes were originally introduced and studied in the context of text detection, which must also be capable of recognizing text in any orientation. Rotational region convolutional neural networks (R2CNN) [36] employ skew non-maximum suppression to select region proposals and add skewed angle data to the CNNs used for detection and regression. To improve the quality of proposals made in the RPN stage and hence the performance of the R2CNN, Rotation RPN (RRPN) [37] creates rotation anchors.
Thus, a completely modified architectural design and approach must be proposed for reliable aerial detection of many tiny solar panels in arbitrary orientations.
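Skew non-maximum suppression and rotated region proposals both rely on an IoU computed between rotated boxes. The following is a minimal, dependency-free sketch of such a skew IoU, assuming boxes are given as (cx, cy, w, h, θ) with θ in radians; it uses Sutherland-Hodgman polygon clipping and is an illustration only, not the implementation used by the cited detectors. The box sizes in the example are hypothetical.

```python
import math


def obb_corners(cx, cy, w, h, theta):
    """Corners of a rotated box in counter-clockwise order (theta in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]


def polygon_area(poly):
    """Shoelace formula (absolute value)."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def clip(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` by a convex CCW `clipper`."""
    output = subject
    for i in range(len(clipper)):
        if not output:
            break
        ax, ay = clipper[i]
        bx, by = clipper[(i + 1) % len(clipper)]

        def side(p):
            # Positive when p lies to the left of edge a->b, i.e. inside.
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        inp, output = output, []
        for j in range(len(inp)):
            p, q = inp[j], inp[(j + 1) % len(inp)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # edge p->q crosses the clipping line here
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def rotated_iou(box_a, box_b):
    """IoU of two rotated boxes given as (cx, cy, w, h, theta)."""
    inter_poly = clip(obb_corners(*box_a), obb_corners(*box_b))
    inter = polygon_area(inter_poly) if len(inter_poly) >= 3 else 0.0
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union


# Hypothetical boxes: the same 10 degree angle error on an elongated
# and on a square box.
thin = (0.0, 0.0, 40.0, 8.0, 0.0)
square = (0.0, 0.0, 10.0, 10.0, 0.0)
tilt = math.radians(10)
iou_thin = rotated_iou(thin, thin[:4] + (tilt,))
iou_square = rotated_iou(square, square[:4] + (tilt,))
print(f"40x8 box, 10 degree tilt:  IoU = {iou_thin:.3f}")
print(f"10x10 box, 10 degree tilt: IoU = {iou_square:.3f}")
```

The elongated box loses far more IoU than the square one for the same angle error, which is why high-aspect-ratio targets such as panel rows are particularly sensitive to angle prediction.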
3 Methodology
3.1 Dataset Preparation
For this paper, we mainly used web-scraped solar panel images, which contain solar farms and solar water heaters; we then removed the background and pasted these solar panels onto varied backgrounds. One advantage of the selected data is that the solar panels are varied and representative. The authors claimed to have picked their solar panels carefully so that they form a diversified and representative (in terms of power capacity) group of genuine solar panels placed in different environments. The solar panel dataset created by us in this paper consists of 4000 images, each containing multiple solar panels. The water heater photovoltaic (WPV), farm type photovoltaic (FPV), and single photovoltaic (SPV) classes all appear in the same image. The data was further subdivided into 3000 annotated images for training and 1000 for testing.
3.2 Experimental Design
We changed the state-of-the-art neural network architectures significantly: for two-stage networks, we introduce a rotated region proposal network; for anchor-based networks, we introduce the offline rotated bounding box generation of Sect. 3.2.1; and for anchor-free networks, the oriented bounding box vectors of Sect. 3.2.2. We further found that introducing an objectness loss improves detection of multiple types of solar panels with varied orientations. A rotated bounding box has to be close to the object; however, that is not always the case when annotations are made, and the issue persists even when bounding box augmentation is applied. This leads to objects or features that are not aligned to the image axes. The inability of the four parameters to precisely characterize the object's contour is a common problem with anchor-based detectors, which can cause them to make incorrect detections.
3.2.1 Offline Generation of Rotated Bounding Boxes from Horizontal Bounding Boxes
All the state-of-the-art models listed above use horizontal bounding boxes. A horizontal bounding box can be used to detect the majority of objects in natural photos; however, in our oriented multiclass setting, solar panels cannot be precisely located this way because of their distinct shapes. A horizontal bounding box cannot determine the orientation of a PV panel. Recognizing tightly grouped inshore ships based on region proposals is extensively discussed in a recent paper [38], where the authors compare and contrast the use of horizontal bounding boxes with the use of rotated bounding boxes. Here, rotated bounding boxes are used as they depict the randomly oriented solar PV targets more realistically than a horizontal bounding box would. The IoU, on the other hand, is sensitive to variations in angle. In our experiments, we found that with proper learning targets, the angle can be learned accurately even without using a rotated anchor box. We used a recent rotated bounding box representation technique for this research. Rotated bounding boxes were generated from the existing horizontal rectangle annotations: within each horizontal bounding box, a polygon mask of the solar panel was generated using the watershed algorithm. Once we have the polygon mask, we find the minimum enclosing rotated rectangle for the solar PV cells using the standard OpenCV smallest enclosing rectangle routine in Python (Fig. 1).
Fig. 1 Rotated bounding box generation from normal bounding box, a represents bigger bounding box, b represents bounding box generated by following algorithm 1
Algorithm 1: Generate rotated bounding box annotations
Input: images and rectangular bounding box annotations
While not all images are iterated
    Read image
    While not all bounding boxes are iterated
        Read bounding box
        Find watershed (cropped image enclosed by bounding box)
        Fit the minimum enclosing rotated rectangle to the returned contours with OpenCV (e.g. cv2.minAreaRect)
        Find centre x, y, and w, h of the rectangle, and θ from the x-axis
    End
End
Return rotated bounding boxes [x, y, w, h, θ]
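The rectangle-fitting step of Algorithm 1 can be sketched without any dependency as follows. This is only an illustration of what a minimum-area rectangle fit computes (convex hull, then the tightest rectangle aligned with each hull edge); in practice the OpenCV routine would be used, and its angle convention may differ from the one below. The contour points are hypothetical and stand in for the contour that the watershed step would return.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points):
    """Return (cx, cy, w, h, theta_deg) of the smallest enclosing rotated rectangle.

    theta_deg is the direction of the hull edge the rectangle is aligned with,
    measured from the x-axis; w is the extent along that edge.
    """
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y2 - y1, x2 - x1)
        c, s = math.cos(theta), math.sin(theta)
        # Project the hull onto the edge direction (u) and its normal (v).
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        w, h = max(us) - min(us), max(vs) - min(vs)
        if best is None or w * h < best[0]:
            uc, vc = (max(us) + min(us)) / 2, (max(vs) + min(vs)) / 2
            cx, cy = uc * c - vc * s, uc * s + vc * c  # back to image axes
            best = (w * h, cx, cy, w, h, math.degrees(theta))
    return best[1:]


# Hypothetical contour: corners of a 40x10 panel rotated by 30 degrees
# around (50, 30), plus one interior point.
a = math.radians(30)
local = [(-20, -5), (20, -5), (20, 5), (-20, 5), (3, 1)]
contour = [(50 + x * math.cos(a) - y * math.sin(a),
            30 + x * math.sin(a) + y * math.cos(a)) for x, y in local]
cx, cy, w, h, theta = min_area_rect(contour)
print(f"centre=({cx:.1f}, {cy:.1f}) size={w:.1f}x{h:.1f} theta={theta:.1f}")
```

Which side is reported as w and which as h depends on the hull edge that yields the minimum, so downstream code should normalize the (w, h, θ) convention before writing annotations.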
3.2.2 Training Time Generation of Rotated Bounding Box from Horizontal Bounding Box
Disparity is most pronounced for rectangular items or those with a high aspect ratio. The distance between an object and its bounding box can be reduced by using an additional parameter, the object's angle θ with respect to the vertical axis. The minimum x and y coordinates, width, height, and θ of an object can now be defined. To capture oriented bounding boxes (OBBs) in anchor-free scenarios, it is sufficient to calculate their width (w), height (h), and angle (θ) with respect to the origin. This is the “Center + wh + θ” representation, which was proposed in [39]. There are a few downsides to this method. First, small shifts in angle have a large impact on the IoU between the predicted box and the ground-truth box but have a minimal impact on the total training loss. Second, because of its exceptional rotational features, OBB's x, y, and z coordinates are additionally rotated with respect to the y-axis. As a result, the network has a hard time learning coordinated box parameters for all the objects. Similar to [39], we also
propose describing the OBB with vectors that take the box boundaries into account, which we call boundary box aware vectors (BBAV) in this work. The BBAVectors keep track of the a (top), b (right), c (bottom), and d (left) vectors that come from the objects' axes of symmetry. The design uses the four cardinal directions of Cartesian space as symbols for the four vector classes. Having all objects, regardless of orientation, use the same coordinate system would improve the model's ability to generalize. The proposed boundary box vector parameter is defined as $box_p = [a, b, c, d, width_{ext}, height_{ext}]$. Unlike [39], $width_{ext}$ and $height_{ext}$ are the external width and height obtained when the oriented bounding box is measured horizontally. The box parameter map $Box_m$ holds the 4 vectors and the 2 external size parameters and is denoted by $Box_m \in \mathbb{R}^{N \times \frac{width}{s} \times \frac{height}{s}}$, where N is the total number of channels and s is a standardizing scalar.
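As an illustration of the box parameter described above, the sketch below derives the four boundary vectors and the external size from an oriented box given as (w, h, θ). The labelling rule is an assumption of this sketch, not the paper's: in image coordinates with y pointing down, "top" and "bottom" are the edge-midpoint vectors with the smallest and largest y, and the remaining two are split into "right" and "left" by x. The function name and dictionary keys are hypothetical.

```python
import math


def bbav_parameters(w, h, theta):
    """Sketch of box_p = [a, b, c, d, width_ext, height_ext] for one OBB.

    Vectors are relative to the box centre, in image coordinates (y down);
    theta is in radians.
    """
    c, s = math.cos(theta), math.sin(theta)
    half_w = (w / 2 * c, w / 2 * s)    # centre -> edge midpoint along the width axis
    half_h = (-h / 2 * s, h / 2 * c)   # centre -> edge midpoint along the height axis
    vectors = [half_w, half_h,
               (-half_w[0], -half_w[1]), (-half_h[0], -half_h[1])]
    # Labelling rule (assumption of this sketch, see lead-in).
    top = min(vectors, key=lambda v: v[1])
    bottom = max(vectors, key=lambda v: v[1])
    rest = [v for v in vectors if v is not top and v is not bottom]
    right = max(rest, key=lambda v: v[0])
    left = min(rest, key=lambda v: v[0])
    # External size: extent of the four corners along the image axes.
    corners = [(sx * half_w[0] + sy * half_h[0], sx * half_w[1] + sy * half_h[1])
               for sx in (-1, 1) for sy in (-1, 1)]
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return {"a_top": top, "b_right": right, "c_bottom": bottom, "d_left": left,
            "width_ext": max(xs) - min(xs), "height_ext": max(ys) - min(ys)}


# Hypothetical 40x10 panel, axis-aligned and tilted by 30 degrees.
flat = bbav_parameters(40.0, 10.0, 0.0)
tilted = bbav_parameters(40.0, 10.0, math.radians(30))
print(flat)
print(tilted)
```

For a square box near 45 degrees the y values of two vectors nearly tie, so the labels can swap under a tiny rotation; this is the kind of ambiguity that the corner-case handling in the next subsection is meant to address.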
Detection of Corners Multiclass Arbitrary Oriented Tiny Solar Panel
In practice, we see that detection fails when objects are approximately parallel to the x and y axes. The reason is that distinguishing between vector types becomes challenging around the quadrant boundary. These situations are termed “corner cases.” As a solution, we classify OBBs into two distinct groups and handle them independently in this study. Specifically, we distinguish between horizontal bounding boxes (HBB) and rotation bounding boxes (RBB), with the latter including every rotated box except the horizontal ones. The upside of this categorization is that it allows us to convert otherwise intractable corner cases into manageable horizontal ones. The orientation category and the external size help the network capture the correct OBB when it encounters corner cases. A sigmoid function is used as the activation function in the output map of Boxm. The predicted orientation is Boxoripred; we define it as follows:
$$
Box_{oripred} = \begin{cases} 1 \ (\text{oriented bounding box}), & \text{if } IoU < 0.95 \\ 0 \ (\text{horizontal bounding box}), & \text{otherwise} \end{cases}
$$
where IoU measures the similarity between the OBB and the HBB.
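The orientation label can be computed in closed form under one assumption that this sketch makes explicit: the HBB is taken to be the enclosing horizontal box of the OBB. Because the OBB then lies entirely inside the HBB, their IoU reduces to the ratio of the two areas. The function name and the example sizes are hypothetical.

```python
import math


def orientation_class(w, h, theta, threshold=0.95):
    """Return (label, iou): label 1 for a rotated box (RBB), 0 for a horizontal one (HBB).

    Assumes the HBB is the enclosing axis-aligned box of the OBB, so that
    IoU(OBB, HBB) = area(OBB) / area(HBB). theta is in radians.
    """
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    w_ext = w * c + h * s   # external (horizontal) width
    h_ext = w * s + h * c   # external (horizontal) height
    iou = (w * h) / (w_ext * h_ext)
    return (1 if iou < threshold else 0), iou


# Hypothetical 40x10 panel at several angles.
for deg in (0, 0.5, 2, 30, 90):
    label, iou = orientation_class(40.0, 10.0, math.radians(deg))
    print(f"theta={deg:>4} deg  IoU={iou:.3f}  label={label}")
```

Boxes that are almost axis-aligned, including those rotated by about 90 degrees, fall into the horizontal class, so the network does not have to decide between vector types near the quadrant boundary.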
4 Results and Discussions
Loss and performance metrics are:
4.1 Objectness Loss
Each box prediction has a prediction called “objectness” attached to it. Because it is multiplied by the class score to give the absolute class confidence, it takes the place that, in earlier detectors such as R-CNN, was held by the confidence that a region proposal includes an object. Contrary to what one might expect, that prediction is an IoU prediction, that is, how well the network believes the box covers the object. While the coordinate loss teaches the network to predict a better box, the objectness loss term trains the network to predict a proper IoU. All spatial cells contain only the best-fitted boxes.
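One common way to train an objectness output toward an IoU target is a binary cross-entropy on the sigmoid of the raw prediction. The sketch below assumes that form; the text above does not state the exact loss used, so this is an illustration rather than the authors' formulation, and the logits and IoU targets are hypothetical.

```python
import math


def objectness_loss(pred_logits, target_ious):
    """Mean binary cross-entropy between sigmoid(logit) and IoU targets in [0, 1]."""
    total = 0.0
    for z, t in zip(pred_logits, target_ious):
        p = 1.0 / (1.0 + math.exp(-z))
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred_logits)


# Hypothetical targets: one box overlapping its ground truth with IoU 0.8,
# one background box with IoU 0.
targets = [0.8, 0.0]
good = objectness_loss([math.log(0.8 / 0.2), -6.0], targets)  # predictions near the targets
bad = objectness_loss([-2.0, 3.0], targets)                   # predictions far from the targets
print(f"good predictions: {good:.3f}, bad predictions: {bad:.3f}")
```

Because the target is the IoU itself rather than a hard 0/1 label, the loss is minimized when the predicted objectness equals the overlap of the box with the object.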
4.2 Box Loss
The box loss metric is a measure of how well an algorithm can estimate the bounding box around an object and how well it can locate the object's center.
4.3 Classification Loss
Classification losses are computationally feasible loss functions that represent the cost of inaccurate predictions in classification problems. The classification loss indicates how well the algorithm predicts the correct class of a given object. A learning rate scheduler is used with a starting value of 0.01 and a weight decay of 0.0001; training ran for 1000 epochs and was later extended to 3000 epochs to check performance (Fig. 2).
5 Conclusion
Table 1 shows the metrics for 13 different custom-rotated models (YOLO, the single stage detector, Masked R-CNN, Faster R-CNN, Yolo v7, EfficientDet, ResDet, RetinaNext, RetinaNet, MobileNet, CenterNet, DetectoRS, and FCOS) to illustrate their performance on the dataset. Our primary objective was to propose a rotated region of interest and to accurately detect corner cases by proposing oriented bounding boxes at training time. On the test set, both accuracy and recall were significantly improved, and the loss plots were smoother than those of other competitive networks. The CenterNET model achieves a better balance between box loss and objectness loss, leading to more accurate and resilient solar panel classification, despite the other networks' poorer classification compared to Faster R-CNN.
Fig. 2 Outputs of different classes

Table 1 Results of various deep learning models used for results comparison

| Method | Architecture type | Box loss | Objectness loss | Classification loss | mAP_0.5 | mAP 0.5–0.95 | Inference time |
|---|---|---|---|---|---|---|---|
| YOLO v4 | Two stage single shot with anchor | 0.025 | 0.0005 | 0.000 | 0.810 | 0.450 | 38 |
| SSD | Single stage single shot | 0.400 | 0.5000 | 0.250 | 0.630 | 0.610 | 25 |
| Masked R-CNN | Two stage two shot with anchor | 0.006 | 0.0081 | 0.000 | 0.750 | 0.000 | 24.5 |
| Faster R-CNN | Two stage two shot with anchor | 1.150 | 1.4500 | 0.500 | 0.690 | 0.600 | 43.2 |
| Yolo V7 | Two stage single shot with anchor | 0.023 | 0.0003 | 0.012 | 0.850 | 0.550 | 30 |
| EfficientDet | Single stage single shot anchor free | 0.050 | 0.0025 | 0.006 | 0.125 | 0.452 | 26.5 |
| ResDeT | Single stage single shot with anchor | 0.042 | 0.0561 | 0.002 | 0.256 | 0.316 | 21 |
| Retinanet | Single stage single shot anchor free | 0.320 | 0.0032 | 0.052 | 0.395 | 0.287 | 36.3 |
| MobileNet | Single stage single shot with anchor | 0.100 | 0.0036 | 0.078 | 0.442 | 0.325 | 19.8 |
| CenterNET | Two stage two shot anchor free | 0.020 | 0.0460 | 0.057 | 0.498 | 0.156 | 38.5 |
| DetectoRS | Two stage two shot anchor free | 0.025 | 0.0562 | 0.015 | 0.524 | 0.367 | 36.7 |
| FCOS | Single stage two shot anchor free | 0.350 | 0.0085 | 0.256 | 0.650 | 0.480 | 41.7 |

References 1. Lorenzoni A.: The support schemes for the growth of renewable energy (2010) 2. bloomberg.: Transition in energy, transport—predictions for 2019 (2019) 3. Liao, K.C., Lu, J.H.: Using UAV to detect solar module fault conditions of a solar power farm with ir and visual image analysis. Appl. Sci. 11(4), 1835 (2021) 4. Gallardo-Saavedra, S., Hernández-Callejo, L., Duque-Perez, O.: Technological review of the instrumentation used in aerial thermographic inspection of photovoltaic plants. Renew. Sustain. Energy Rev. 93, 566–579 (2018) 5. Almalki, F.A., Albraikan, A.A., Soufiene, B.O., Ali, O.: Utilizing Artificial intelligence and lotus effect in an emerging intelligent drone for persevering solar panel efficiency. Wirel. Commun. Mobile Comput. (2022) 6. Cai, Z., Fan, Q., Feris, R.S., Vasconcelos N.: A unified multi-scale deep convolutional neural network for fast object detection. In ECCV. Springer (2016) 7. Lin, T.Y., Dollar, P., Girshick, R., He K, Hariharan B, Belongie, S.: Feature pyramid networks for object detection. In CVPR (2017) 8. Liu, W., Anguelov, D., Erhan D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: Ssd: Single shot multibox detector. In ECCV. Springer (2016) 9. Cheng, G., Han, J., Zhou, P., Xu, D.: Learning rotation-invariant and fisher discriminative convolutional neural networks for object detection. IEEE TIP 28(1), 265–278 (2018) 10. Cheng, G., Zhou, P., Han, J.: Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Trans. Geosci. Remote Sens. 54(12), 7405–7415 (2016) 11. Zhou, Y., Ye, Q., Qiu, Q, Jiao, J.: Oriented response networks. In: CVPR, pp. 4961–4970. IEEE (2017) 12. Parhar, P., Sawasaki, R., Todeschini, A., Vahabi, H., Nusaputra, N., Vergara, F.: HyperionSolarNet: solar panel detection from aerial images. arXiv preprint arXiv:2201.02107 (2022) 13. 
Yu, J., Wang, Z., Majumdar, A., Rajagopal, R.: DeepSolar: a machine learning framework to efficiently construct a solar deployment database in the United States. Joule 2(12), 2605–2617 (2018) 14. Camilo, J., Wang, R., Collins, L.M., Bradbury, K., Malof, J.M.: Application of a semantic segmentation convolutional neural network for accurate automatic detection and mapping of solar photovoltaic arrays in aerial imagery. arXiv preprint arXiv:1801.04018 (2018)
15. Zhuang, L., Zhang, Z., Wang, L.: The automatic segmentation of residential solar panels based on satellite images: a cross learning driven U-Net method. Appl. Soft Comput. 92, 106283 (2020) 16. Wani, M.A., Mujtaba, T.: Segmentation of satellite images of solar panels using fast deep learning model. Int. J. Renew. Energy Res. (IJRER) 11(1), 31–45 (2021) 17. Golovko, V., Kroshchanka, A., Mikhno, E., Komar, M., Sachenko, A.: Deep convolutional neural network for detection of solar panels. In: Data-Centric Business and Applications, pp. 371–389. Springer, Cham (2021) 18. Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., Fischer, I., Wojna, Z., Song, Y., Guadarrama, S. et al.: Speed/accuracy trade-offs for modern convolutional object detectors. In: Proceedings of CVPR, pp. 7310–7319 (2017) 19. Jiang, Z., Zhao, L., Li, S., Jia, Y.: Real-time object detection method based on improved YOLOv4-tiny. arXiv preprint arXiv:2011.04244 (2020) 20. Cao, G., Xie, X., Yang, W., Liao, Q., Shi, G., Wu, J.: Feature-fused SSD: Fast detection for small objects. In: Ninth international conference on graphic and image processing (ICGIP 2017), vol. 10615, pp. 381–388. SPIE (2018) 21. Sanjay, N.S., Ahmadinia, A.: MobileNet-Tiny: a deep neural network-based real-time object detection for rasberry Pi. In: 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 647–652. IEEE (2019) 22. Cheng, M., Bai, J., Li, L., Chen, Q., Zhou, X., Zhang, H., Zhang, P.: Tiny-RetinaNet: a onestage detector for real-time object detection. In: Eleventh International Conference on Graphics and Image Processing (ICGIP 2019), vol. 11373, pp. 195–202. SPIE (2020) 23. Sumit, S.S., Watada, J., Roy, A., Rambli, D.R.A.: In object detection deep learning methods, YOLO shows supremum to Mask R-CNN. J. Phys. Conf. Ser. 1529(4), 042086 (2020). IOP Publishing 24. 
Yang, J., Li, S., Wang, Z., Yang, G.: Real-time tiny part defect detection system in manufacturing using deep learning. IEEE Access 7, 89278–89291 (2019) 25. Xu, X., Liang, W., Zhao, J., Gao, H.: Tiny FCOS: A lightweight anchor-free object detection algorithm for mobile scenarios. Mobile Netw. Appl. 26(6), 2219–2229 (2021) 26. Wang, J., Yang, W., Guo, H., Zhang, R., Xia, G.S.: Tiny object detection in aerial images. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 3791–3798. IEEE (2021) 27. Yang, L., Rakin, A.S., Fan, D.: Rep-Net: efficient on-device learning via feature reprogramming. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12277–12286 (2022) 28. Law, H., Deng, J.: Cornernet: Detecting objects as paired keypoints. In: Proceedings of the European conference on computer vision (ECCV), pp. 734–750 (2018) 29. Kisantal, M., Wojna, Z., Murawski, J., Naruniec, J., Cho, K.: Augmentation for small object detection. arXiv preprint arXiv:1902.07296 (2019) 30. Tong, K., Wu, Y.: Deep learning-based detection from the perspective of small or tiny objects: a survey. Image Vis. Comput., 104471 (2022) 31. Yu, Y., Yang, X., Li, J., Gao, X.: A cascade rotated anchor-aided detector for ship detection in remote sensing images. IEEE Trans. Geosci. Remote Sens. 60, 1–14 (2020) 32. Xiao, X., Zhou, Z., Wang, B., Li, L., Miao, L.: Ship detection under complex backgrounds based on accurate rotated anchor boxes from paired semantic segmentation. Remote Sens. 11(21), 2506 (2019) 33. Koo, J., Seo, J., Jeon, S., Choe, J., Jeon, T.: RBox-CNN: Rotated bounding box based CNN for ship detection in remote sensing image. In: Proceedings of the 26th ACM SIGSPATIAL international conference on advances in geographic information systems, pp. 420–423 (2018) 34. Li, M., Guo, W., Zhang, Z., Yu, W., Zhang, T.: Rotated region based fully convolutional network for ship detection. 
In: IGARSS 2018–2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 673–676. IEEE (2018) 35. Liu, Z., Hu, J., Weng, L., Yang, Y.: Rotated region based CNN for ship detection. In: 2017 IEEE International Conference on Image Processing (ICIP), pp. 900–904. IEEE (2017)
36. Zhou, Q., Yu, C.: Point RCNN: an angle-free framework for rotated object detection. Remote Sens. 14(11), 2605 (2022) 37. Azimi, S.M., Vig, E., Bahmanyar, R., Körner, M., Reinartz, P.: Towards multi-class object detection in unconstrained remote sensing imagery. In: Asian Conference on Computer Vision, pp. 150–165. Springer, Cham (2018) 38. Deshmukh, S., Moh, T.S.: Fine object detection in automated solar panel layout generation. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1402–1407. IEEE (2018) 39. Yi, J., Wu, P., Liu, B., Huang, Q., Qu, H., Metaxas, D.: Oriented object detection in aerial images with box boundary-aware vectors. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2150–2159 (2021)
PlantML: Some Aspects of Investigation on Deployment of Machine Learning Algorithm for Detection and Classification of Plants
Gavel D. Kharmalki, Gideon D. Kharsynteng, Narisha Skhemlon, Abhijit Bora, and Gypsi Nandi
Abstract Classification and identification of plants are necessary from the perspective of agricultural specialists as well as botanical research. Traditional methods of finding information about a specific plant consume time and effort. Machine learning algorithms can play a vital role in identifying and classifying plants. We therefore propose a novel model, based on a machine learning algorithm, that can be deployed to identify flowers and fruits. We call it PlantML. This work presents the experimental arrangement of PlantML as well as the use case and activity diagrams of the system, and discusses a comparative analysis of the machine learning algorithms applicable to PlantML. In this work, deep network knowledge is used to train on the datasets, considering the features of the ImageNet model of a deep neural network. The TensorFlow framework is used to deploy it. The study also highlights that, in the domain of image classification, impressive results can be obtained with the latest convolutional neural network techniques. The viability of the work is evaluated to find evidence that PlantML is suitable as a supplementary tool for agricultural as well as botanical research. From the study, it can be concluded that the proposed model can recognize different types of flowers and fruits with high accuracy.
Keywords Classification · TensorFlow · Convolutional neural network · Keras · Machine learning
G. D. Kharmalki · G. D. Kharsynteng · N. Skhemlon · A. Bora (B) · G. Nandi, Department of Computer Applications, Assam Don Bosco University, Guwahati, India, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_7
1 Introduction
In the computational and botanical communities, identification and classification of plants can bridge the botanical taxonomic gap. Identifying a plant's species can highlight significant characteristics and information about the plant [1]. Interpretation through visual and manual observation is inaccurate as it may
contain erroneous information. Sampling of digital images can help in identifying specific pattern and texture features [2]. As machine learning technology develops, complex models for automatic plant identification have been proposed. With the emerging services of smartphones, millions of digital plant images can be acquired. Automatic plant identification through mobile applications can help in ecological surveillance, monitoring of exotic plants, and the popularization of ecological science [3]. Advances in technology have steadily found their way into plant identification, with the help of different learning techniques. Machine learning techniques, particularly deep learning technologies, have improved significantly as a result of the recent increase in data availability and computing power. Convolutional neural networks (CNNs) are the foundation of robust and effective automated plant identification because they enable applications to achieve higher recognition performance. Deep CNNs have demonstrated accuracy levels comparable to human performance on tests requiring general object recognition and fine-grained species identification [4]. As the number of plant species increased, the task became more difficult each year, yet identification performance improved every year through better machine learning prediction models. It is crucial to assess an application's accuracy before wider use. Recognition algorithms are evaluated on plant pictures and compared using different datasets on different platforms [5]. However, performance assessment in actual scenarios is lacking. Image classification can play a vital role in achieving high identification success. As such, to close the gap in supplementary botanical tools, the proposed work deploys a novel architecture that can identify and classify flowers and fruits and can extract relevant information about the identified plants.
2 Related Work
The popularity of machine learning models for predicting patterns is increasing gradually. Different models are used for detection and classification [2, 3, 6–12]. In 2006, Nilsback et al. discussed three major models [13]. In the first model, feature extraction is carried out using a deep CNN. Different machine learning algorithms are used to classify the target and enhance the performance of the flower image classification method. In the second, an image expansion technique was deployed to enhance the performance. Multilayer perceptron (MLP), SVM, KNN and random forest models were compared to evaluate the machine learning classifiers. Using a support vector machine (SVM) on the Oxford dataset, the researchers found an accuracy of 97.2%. An accuracy of 98% was observed using MLP on the Oxford-17 dataset. Abu et al. used a deep neural network for image classification of five types of flowers and obtained an average result of 90% [14]. The random forest classifier method was deployed to develop the system. In 2017, Albadarneh et al. observed a high result on the Oxford-17 dataset using image classification together with texture, color, and shape features [15]. In 2018, Kamilaris et al. observed [4] that the use of deep learning approaches gave the
highest accuracy on the challenges of different farming industries. In 2018, Lakesar utilized the Inception-v3 model to increase flower classification accuracy [16]. The backpropagation learning algorithm was used to train the multilayer networks. In 2021, Islam et al. used the Oxford-102 flower dataset to train and test the model [17]. In 2022, Habib et al. emphasized the use of a segmentation model with the color and shape features of flowers [18]. The proposed work differs from existing work in that emphasis is given to simulating a classification model for a dataset of botanical aspects. The study illustrates close findings of boundary recognition when using a big image dataset. The techniques involved in this work take a long time to train, since the data is vast and optimal processing requires more time and system resources, which are inherently accessible only to adequately funded groups or individuals. In this work, we implement the model using the resources of a generic machine such as a personal computer, to show that it can be implemented on a system with minimal computational requirements; however, such a machine lacks the processing capability for a more efficient implementation of the model (with emphasis on processing speed).
3 Objective and Implementation Methodology
Critical knowledge of biodiversity is essential for the conservation of nature. Qualified experts can play a vital role in countering the global loss of habitats and species by enforcing measures for the protection of flora. Undoubtedly, the public's awareness of species has recently been declining. Plant blindness, the inability to recognize plants in the world around us and to identify the features of each, has increased among young people as well as in the upcoming society. The proposed system addresses this problem using machine learning techniques. This work explores the prediction of plants using various datasets of flowers and fruits. Learning-based classification techniques are used to predict the flower. To implement the proposed system, a pre-trained model is used, an approach also known as transfer learning. Using a pre-trained CNN model allows us to train the model in a short time, since a pre-trained CNN is much better optimized to learn complex patterns than a sequential model written from scratch, which would take a lot of training and testing time to reach the required accuracy. With this implementation, we put emphasis on simplicity and on the ability to implement the model without complex independent code. This ensures that any researcher can reproduce and compare the empirical findings of this paper. Here, a pre-trained model named EfficientNetB7 from Keras is used along with pre-trained checkpoints named “noisy-student” to train our model so as to achieve a higher accuracy. Fig. 1 shows the architecture of EfficientNet-B7. The software and hardware used for the proposed work are shown in Table 1. Figure 2 shows the structural design of the proposed PlantML model and how the classification model works.
Fig. 1 Architecture of EfficientNet-B7

Table 1 Software and hardware configuration of PlantML

| Categories | Tools |
|---|---|
| Software IDE | Android studio version 2021.3.1.16; Jupyter notebook/Google collab; Anaconda navigator (2022.05) |
| Operating system | Windows 8/10/11 64-bit |
| Library | Python Numpy (1.23.3), tensor flow (2.10.0), Keras (3.7.2) |
| Framework | Flutter (3.3.2) |
| RAM | 8 GB |
| Processor | Ryzen 5 4800H (4.0 GHz) |
| Storage | 12 GB |
| Data store | 104 Flowers: Garden of Eden [9], Fruits-262 [10] |

Fig. 2 Architecture of the structural design for PlantML
3.1 Datasets
A few publicly available flower and fruit datasets have been used by researchers. In the proposed model, we use the datasets 104 Flowers: Garden of Eden [19] and Fruits-262 [20], which are available on Kaggle. Together they comprise 104 different categories of flowers and 262 different categories of fruits. The data is divided into training and testing sets in an 80:20 ratio: 80% of the data is used for training, and the remaining 20% for testing. A sample of the flower dataset is shown in Fig. 3. In the proposed system, we use data augmentation to create variations of the images, which enhances the ability of our model to generalize what it has learned to new images. The primary goal of using data augmentation in Keras is to increase the generalizability of our model. Data augmentation is not used when testing the model. For classifying the images, the pre-trained model (EfficientNetB7) is loaded with pre-trained weights, and we use the datasets from Kaggle to train it so that the model becomes more accurate in distinguishing the different types of plants.
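The 80:20 division described above can be sketched as follows. The split is done per class (stratified) so that every category appears in both sets; whether the original split was stratified is not stated, so this is an assumption of the sketch. The file names and class labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction=0.8, seed=0):
    """Split (path, label) pairs into train and test lists, class by class."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        cut = round(len(items) * train_fraction)
        train.extend(items[:cut])
        test.extend(items[cut:])
    return train, test


# Hypothetical dataset: 10 images for each of three classes.
samples = [(f"{label}_{i:03d}.jpg", label)
           for label in ("rose", "daisy", "apple") for i in range(10)]
train, test = stratified_split(samples)
print(len(train), "training images,", len(test), "test images")
```

Augmentation would then be applied to the training list only, in line with the statement that it is not used at test time.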
Fig. 3 Flower sample dataset
Fig. 4 Accuracy assessment for the test set
Pre-processing is built into the model itself, so images need no separate pre-processing step. Once an image is passed to the model, the model classifies it and outputs the predicted class name along with its accuracy score. Fig. 4 shows the accuracy of our system: training accuracy rose from 0.38 to 0.54, and validation accuracy rose from 0.55 to 0.75. Fig. 5 shows the decline in model loss, from 2.7 to 0.54 on the training data and from 1.86 to 0.87 on the validation data. The use case diagram of PlantML is shown in Fig. 6. The user can identify plants, namely flowers and fruits, with the app. The system provides two options, one for identifying flowers and one for identifying fruits. The user can upload an image or take one with the phone camera, and must then crop it. The system shows the identified type and offers options to add the result to favorites, view the history, and view more information. Figure 7 shows the activity diagram of PlantML, which illustrates the flow of activities between the user and the system while the app is in use.
Fig. 5 Model loss accuracy
Fig. 6 Use case diagram of the proposed system
Fig. 7 Activity diagram of PlantML

4 Conclusion
The proposed work discussed the classification of flowers and fruits using datasets available on Kaggle, covering 104 types of flowers and 262 types of fruits. Using transfer learning, the model pre-processes the images and performs data augmentation on them to extract the features of flowers and fruits. The prediction model uses a CNN to classify the images into categorical multi-class labels. The proposed model achieves fairly good accuracy, but more training time would likely have improved it. Because the time available for this work was limited, training was cut short; accuracy improved gradually with the number of training and testing sessions, so we ran as many sessions as the schedule allowed. A comparison among different CNN models and an evaluation of the optimal training time are planned as future work.
References
1. Wäldchen, J., Mäder, P.: Plant species identification using computer vision techniques: a systematic literature review. Arch. Comput. Methods Eng. 25, 507–543 (2018). https://doi.org/10.1007/s11831-016-9206-z
2. Aradhya, V.N.M., Mahmud, M., Guru, D.S., et al.: One-shot cluster-based approach for the detection of COVID-19 from chest X-ray images. Cogn. Comput. 13, 873–881 (2021). https://doi.org/10.1007/s12559-020-09774-w
3. Bhapkar, H.R., Mahalle, P.N., Shinde, G.R., Mahmud, M.: Rough sets in COVID-19 to predict symptomatic cases. In: Santosh, K., Joshi, A. (eds.) COVID-19: Prediction, Decision-Making, and its Impacts. Lecture Notes on Data Engineering and Communications Technologies, vol. 60. Springer, Singapore (2021). https://doi.org/10.1007/978-981-15-9682-7_7
4. Kamilaris, A., Prenafeta-Boldú, F.X.: Deep learning in agriculture: a survey. Comput. Electron. Agric. 147, 70–90 (2018). https://doi.org/10.1016/j.compag.2018.02.016
5. Chithra, P.L., Bhavani, P.: A study on various image processing techniques. Int. J. Emerg. Technol. Innov. Eng. 5(5) (2019)
6. Shahrin, F., Zahin, L., Rahman, R., Hossain, A.J., Kaf, A.H., Azad, A.K.M.: Agricultural analysis and crop yield prediction of Habiganj using multispectral bands of satellite imagery with machine learning. Int. Conf. Electr. Comput. Eng., 21–24 (2020). https://doi.org/10.1109/ICECE51571.2020.9393066
7. Ren, C., Kim, D.-K., Jeong, D.: A survey of deep learning in agriculture: techniques and their applications. J. Inf. Process. Syst. 16(5), 1015–1033 (2020). https://doi.org/10.3745/JIPS.04.0187
8. Singh, G., Sethi, G.K., Singh, S.: Survey on machine learning and deep learning techniques for agriculture land. SN Comput. Sci. 2, 487 (2021). https://doi.org/10.1007/s42979-021-00929-6
9. Condran, S., Bewong, M., Islam, M.Z., Maphosa, L., Zheng, L.: Machine learning in precision agriculture: a survey on trends, applications and evaluations over two decades. IEEE Access 10, 73786–73803 (2022). https://doi.org/10.1109/ACCESS.2022.3188649
10. Treboux, J., Genoud, D.: Improved machine learning methodology for high precision agriculture. In: 2018 Global Internet of Things Summit (GIoTS), pp. 1–6 (2018). https://doi.org/10.1109/GIOTS.2018.8534558
11. Kavitha, R., Kavitha, M., Srinivasan, R.: Crop recommendation in precision agriculture using supervised learning algorithms. In: 2022 3rd International Conference for Emerging Technology, pp. 1–4 (2022). https://doi.org/10.1109/INCET54531.2022.9824155
12. Gehlot, A., Sidana, N., Jawale, D., Jain, N., Singh, B.P., Singh, B.: Technical analysis of crop production prediction using machine learning and deep learning algorithms. Int. Conf. Innov. Comput. Intell. Commun. Smart Electr. Syst., pp. 1–5 (2022). https://doi.org/10.1109/ICSES55317.2022.9914206
13. Nilsback, M.E., Zisserman, A.: A visual vocabulary for flower classification. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1447–1454 (2006)
14. Abu, M.A., Indra, N.H., Abd Rahman, A.H., Sapiee, N.A., Ahmad, I.: A study on image classification based on deep learning and TensorFlow. Int. J. Eng. Res. Technol. 12(4), 563–569 (2019)
15. Albadarneh, A., Ahmad, A.: Automated flower species detection and recognition from digital images. IJCSNS Int. J. Comput. Sci. Netw. Secur. 17(4), 144–151 (2017)
16. Lakesar, A.L.: A review on flower classification using neural network classifier. Int. J. Sci. Res. 7(5), 1644–1646 (2018)
17. Islam, T., Absar, N., Adamov, A.Z., Khandaker, M.U.: A machine learning driven android based mobile application for flower identification. In: Mahmud, M., Kaiser, M.S., Kasabov, N., Iftekharuddin, K., Zhong, N. (eds.) Applied Intelligence and Informatics. AII 2021. Communications in Computer and Information Science, vol. 1435 (2021). https://doi.org/10.1007/978-3-030-82269-9_13
18. Habib, M.T., Raza, D.M., Islam, M.M., Victor, D.B., Arif, M.A.I.: Applications of computer vision and machine learning in agriculture: a state-of-the-art glimpse. Int. Conf. Innov. Trends Inf. Technol., 1–5 (2022). https://doi.org/10.1109/ICITIIT54346.2022.9744150
19. Kaggle for dataset. https://www.kaggle.com/datasets/msheriey/104-flowers-garden-of-eden. Accessed 2022/09/23
20. Kaggle for dataset. https://www.kaggle.com/datasets/f9472b258bbdab0dbc8cc773ad8c78a2fa1b997fa0cd88a476f184b78b93338c. Accessed 2022/09/21
Big Data Analytics-Based Recommendation System Using Ensemble Model
Devika Rani Roy, Sitesh Kumar Sinha, and S. Veenadhari

Abstract In the realm of computer science, a recommender system (RS) is a set of tools and methods for making useful product recommendations to end users. To maintain a foothold in a competitive industry, telecoms provide a wide range of offerings, and it is challenging for a client to choose the best-fit product from the huge bouquet available. Users therefore have a hard time making purchases in the telecom industry. Suggestion quality can be increased by using the large amounts of textual contextual data describing item qualities that accompany rating data in various recommender domains. Here, a fresh strategy for improving recommendation systems in the telecommunications industry is proposed. Users may choose the recommended services, which are loaded onto their devices. A recommendation engine is a simple way for telecoms to increase trust and the customer satisfaction index, and the suggested engine allows users to pick and choose the services they need. The present study compared two distinct recommendation frameworks: a single-algorithm model and an ensemble algorithm model. Experiments were conducted to compare the efficacy of the separate algorithms and the ensemble algorithm. The ensemble algorithm-based recommendation engine provided better recommendations than the individual algorithms.

Keywords Big data analytics · Recommender system · Euclidean and Manhattan · Minkowski · Collaborative filtering
D. R. Roy (B) · S. K. Sinha · S. Veenadhari
Rabindranath Tagore University, Bhopal, Madhya Pradesh 464993, India
e-mail: [email protected]
S. Veenadhari
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_8

1 Introduction
Wireless data traffic has been growing at an astounding rate in recent years. The widespread use of smartphones and the growth of Internet access throughout the globe are major factors in this trend. Thanks to the rising popularity and shrinking dimensions of smartphones, nano and picocells are now a practical reality [1]. As a consequence, not only have the quantity and variety of smartphones increased rapidly, but so have the services required to monitor and manage them so that consumers receive a consistently high quality of service. Since the vast majority of new data is produced by smartphones and sent over telecom networks, it can be difficult for these companies to sift through the deluge of information and determine which pieces are most relevant to anticipating their customers' behavior. Based on the criteria used to generate suggestions, recommender systems are often divided into three groups [1–3]: (1) collaborative, in which items liked by users with the same choices and preferences are suggested; (2) content-based, in which items similar to those the user previously liked are suggested; and (3) hybrid, which combines the content-based and collaborative approaches. The most popular method, collaborative filtering, predicts what a particular client would want by analyzing the preferences of others who share their interests and behaviors. A key component of collaborative filtering [2] is discovering commonalities across consumers. A problem arises when there is a paucity of information on elderly clients, forcing businesses to rely on general demographic information to make recommendations. In contrast, the content-based approach highlights the product's unique qualities.
To better serve the consumer, it takes the user's preference for a certain product and suggests other items that share that product's features. Hybrid approaches combine the aforementioned methods, and there are several ways to put this into action. One example is to make predictions using content-based and collaborative filtering methods separately and then combine the findings. This can also mitigate the cold-start issue that has long been a concern for recommender systems [4]. One study suggests that the Euclidean distance [5] between the user's usage vector and the product vector may be used to decide which products suit an individual user. Averaging linear combinations of Pearson and Euclidean similarity is one method [6] employed by traditional collaborative filtering. Despite the widespread use of several correlation and similarity measures, each has its limitations. With both the Euclidean and Manhattan distances, features with larger values tend to dominate smaller ones [7, 8]. In a similar vein, cosine similarity does not account for differences in magnitude between vectors that share the same angle [7, 9]; it gives such vectors the same similarity value. Given the assumptions on which Pearson's correlation is based (a linear connection, a cause-and-effect relationship, etc.), its results are very sensitive to extreme values and outliers [7, 10]. As a result of our literature review, we settled on a methodology that combines several strategies to maximize their advantages while minimizing their drawbacks [11]. We look at how customers use various forms of communication technology. One of the key drawbacks of recommender systems is the "usage of consumer demographic data" for suggestions, which means that many customized recommender systems must cope with inadequate personal data when making recommendations [12]. To meet the growing demand for high-quality, dependable, and cost-effective data and voice services, telecommunications companies must invest in modernizing their IT and connection infrastructure. New risks enabled by new technologies have emerged, making network security a top responsibility for telecoms. The goal is to find the most effective mobile recommender systems that can be used to boost sales and address the issue of poor product adoption [12]. In this study, we implement a unique ensemble algorithm-based recommendation system. Recharge recommendations are ranked by matching usage profiles against product profiles, and these rankings are used to create the recharge pack recommendation. The rankings produced by the individual similarity measures are then improved with an ensemble algorithm consisting of cosine similarity, Minkowski similarity, Manhattan similarity, Euclidean similarity, and the recharge history index. With this concept, prepaid telecommunications companies may retain their customers for longer by rewarding their loyalty, and attract new ones.
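Two of the limitations noted in this introduction can be checked numerically. The sketch below is illustrative only: the vectors and feature names are hypothetical and are not taken from the study's data. It shows that cosine similarity ignores magnitude, and that an unscaled feature with large values dominates a squared Euclidean distance.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical usage vectors: same proportions, ten times the volume.
light_user = [1.0, 2.0]
heavy_user = [10.0, 20.0]
same_direction = cosine_similarity(light_user, heavy_user)

# Hypothetical unscaled features: [data usage in MB, SMS count].
usage = [5000.0, 10.0]
product = [5600.0, 200.0]
squared_diffs = [(u - p) ** 2 for u, p in zip(usage, product)]
# Share of the squared Euclidean distance contributed by the data feature.
data_share = squared_diffs[0] / sum(squared_diffs)

print(round(same_direction, 6), round(data_share, 3))
```

Although the SMS count differs by a factor of twenty and the data usage by only 12%, the data feature accounts for most of the squared distance, which is why the vectors are normalized before the distances are computed.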
2 Motivation
By using recommendation engines to provide customers with the services they want, rather than pushing services they do not need, businesses can win back the trust of their clientele and enhance the overall quality of their services. Our motivation for using advanced collaborative algorithms and their combination is the need for telecom companies to keep improving their recommendation systems to survive in today's rapidly evolving telecom market. Machine learning, data mining, approximation theories, and AI are just a few of the methods that can be used to boost the quality of recommendations. Content-based filtering and item content are frequently used in research.
3 Proposed Method
Both the individual algorithm-based and the ensemble algorithm-based recommendation models use past purchases and a matrix of product characteristics to rank the recharge packs to recommend. Using this data, the models recommend the best individualized pack for each customer. The following considerations are made in the proposed method:
Use Case I: Experiments 1 and 2 each considered a stratified sample of 1,000,000 users who had been labeled as either Target Group (TG) or Control Group (CG).
Experiment 1: The individual model was applied, and key performance indicators (KPIs) were tracked every month.
Experiment 2: The ensemble model was applied, and the same KPIs were tracked every month.
3.1 Input Data
Two groups were identified and labeled within the user population as a whole. For the Target Group (TG), product recommendations were made using either an individual algorithm or an ensemble of algorithms. For the Control Group (CG), on the other hand, the telecom provider made suggestions based on more conventional procedures. The inputs included users' consumption history and recharge patterns; the recharge history was left unchanged. Benefit description, product ID, and price, as well as the time and date of the transaction, were among the inputs. The identified KPIs were total 4G data usage, total decrement, overall total data revenue, gross ARPU, total data usage, local voice MoU, total roaming OG, offnet voice MoU, total 3G data usage, total local voice revenue, and total offnet voice revenue. Two machine learning algorithms, principal component analysis and random forest, were used for feature selection. With these two methods, only the characteristics that significantly improve the prediction are retained. Both the individual algorithms and the ensemble algorithm model benefit from this automated removal of noise and unnecessary features.
3.2 Individual Algorithm-Based Recommender Model
Manhattan similarity, Minkowski similarity, Euclidean similarity, cosine similarity, and the recharge history index are used in the recommender model's similarity calculation module for user and product profiles. Initially, the data were separated by recharge pack. Using a product-profile matrix, we determined the typical consumption of each key performance indicator for each recharge product. Using a min–max scaler, we normalized the usage profile and product-profile vectors. Next, the algorithms shown in Fig. 1 were applied to the data, where X is the usage profile vector and Y is the product-profile vector, to determine the order in which the recharge products should be ranked.

Fig. 1 Block diagram of the individual algorithms used to rank the suggested recharge products. The system's three primary components are the product purchase/profile matrix, the customer usage profile, and the number of days since the last recharge. The approach evaluates these factors and then recommends the best recharge packs for the individual user

(a) Cosine similarity
Cosine similarity measures the closeness of two vectors in an inner product space. It is the cosine of the angle between the vectors, and so indicates whether they point in roughly the same direction. We used the formula

$$\text{Similarity}(X, Y) = \cos(\theta) = \frac{X \cdot Y}{\lVert X \rVert \, \lVert Y \rVert} \tag{1}$$

We then used min–max normalization to standardize the cosine similarity across all available recharges.

(b) Euclidean similarity
In geometry, the distance between two points is measured by their Euclidean distance. In other words, in Euclidean space, the distance between two points equals the length of the line segment connecting them [14]. The Euclidean distance is also called the Pythagorean distance, since it can be calculated from the coordinates using Pythagoras' theorem. For points a = (a1, ..., an) and b = (b1, ..., bn), we calculated the Euclidean distance as

$$\text{Euclidean distance} = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2} \tag{2}$$

We then derived the Euclidean similarity by subtracting the min–max normalized distance from 1:
$$\text{Euclidean similarity} = 1 - (\text{min-max normalized Euclidean distance}) \tag{3}$$
(c) Manhattan similarity
The Manhattan distance between a pair of points is the sum of the absolute differences of their Cartesian coordinates. In two dimensions, it is the sum of the absolute differences of the x- and y-coordinates [15]. The Manhattan distance was computed as

$$\text{Manhattan distance}(c, d) = \sum_{i=1}^{n} \lvert c_i - d_i \rvert \tag{4}$$

Subtracting the min–max normalized distance from 1 yields the Manhattan similarity:

$$\text{Manhattan similarity} = 1 - (\text{min-max normalized Manhattan distance}) \tag{5}$$
As a result, the telecom operator can offer more variety in its pack suggestions by making more appropriate packs available to a given user based on the rankings.

(d) Minkowski similarity
As a metric in a normed vector space, the Minkowski distance (or Minkowski metric) is a generalization of the Euclidean and Manhattan distances [15]. For two vectors A and B of length n, the Minkowski measure of order P is computed as

$$\left( \sum_{i=1}^{n} \lvert A_i - B_i \rvert^{P} \right)^{1/P} \tag{6}$$
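The four measures of this subsection can be sketched together in standard-library Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names, the toy profile vectors (assumed to be min–max scaled already), and the Minkowski order `p=3` are all hypothetical, and the recharge history index is left out because the text does not specify how it enters the score.

```python
import math


def cosine_similarity(x, y):
    """Eq. 1: cosine of the angle between usage vector x and product vector y."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def minkowski_distance(x, y, p):
    """Eq. 6; p=1 gives the Manhattan distance (Eq. 4), p=2 the Euclidean (Eq. 2)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def min_max(values):
    """Min-max normalize a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_products(usage, products, measure="euclidean", p=3):
    """Return (ranking, scores) of products for one user under one measure."""
    names = list(products)
    if measure == "cosine":
        scores = min_max([cosine_similarity(usage, products[n]) for n in names])
    else:
        order = {"manhattan": 1, "euclidean": 2, "minkowski": p}[measure]
        distances = [minkowski_distance(usage, products[n], order) for n in names]
        # Eqs. 3 and 5: similarity = 1 - min-max normalized distance.
        scores = [1.0 - d for d in min_max(distances)]
    scored = dict(zip(names, scores))
    return sorted(names, key=scored.get, reverse=True), scored


# Hypothetical, already normalized usage and product-profile vectors.
usage = [0.2, 0.8, 0.5]
products = {
    "pack_a": [0.2, 0.7, 0.5],
    "pack_b": [0.9, 0.1, 0.4],
    "pack_c": [0.5, 0.5, 0.5],
}
ranking, scores = rank_products(usage, products, measure="euclidean")
print(ranking)
```

Because the distances are normalized across the candidate packs, the closest pack always scores 1 and the farthest scores 0 for a given user; the scores are relative to the candidate set, not absolute.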
3.3 Ensemble Algorithm-Based Recommender Model
Manhattan similarity, Euclidean similarity, Minkowski similarity, cosine similarity, and the recharge history index are all used in the similarity calculation module of the ensemble algorithm-based recommender model. The arithmetic average of these measures is used to produce more accurate recommendations than any one of them alone. Adding Minkowski similarity to the Manhattan, Euclidean, and cosine similarity algorithms is what makes this ensemble model unique; together they give better recommendations than an ensemble built from any combination of Manhattan, Euclidean, and cosine similarity alone. Initially, the data were separated by recharge pack. Using a product-profile matrix, we calculated the typical usage of each key performance indicator for each recharge product. A min–max scaler is used to normalize the customer usage profile and product-profile vectors. We then ranked the recharge products using the approach shown in Fig. 2: the algorithms of the individual models were applied, and their measurements were averaged because this is an ensemble model.

Fig. 2 Block diagram of the ensemble algorithm for ranking recharge products. The system's three main components are the user's usage profile, the number of days since the user's previous recharge, and the product purchase/profile matrix. The approach evaluates these factors and then recommends the best recharge packs for the individual user
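The averaging step of the ensemble model can be sketched as follows. This is an illustrative sketch, not the authors' code: `ensemble_scores` is a hypothetical helper, the per-measure scores are made-up values assumed to lie in [0, 1] already, and the recharge history index is omitted because the text does not say how it is converted to a score.

```python
def ensemble_scores(per_measure):
    """Arithmetic mean of the per-measure similarity scores for each product.

    per_measure maps a measure name to a {product: score} dictionary;
    every dictionary is assumed to cover the same products.
    """
    products = next(iter(per_measure.values())).keys()
    return {
        product: sum(scores[product] for scores in per_measure.values()) / len(per_measure)
        for product in products
    }


# Hypothetical normalized similarity scores for one user.
per_measure = {
    "cosine":    {"pack_a": 1.0, "pack_b": 0.0, "pack_c": 0.7},
    "euclidean": {"pack_a": 1.0, "pack_b": 0.0, "pack_c": 0.6},
    "manhattan": {"pack_a": 0.9, "pack_b": 0.0, "pack_c": 1.0},
    "minkowski": {"pack_a": 1.0, "pack_b": 0.0, "pack_c": 0.5},
}
combined = ensemble_scores(per_measure)
ensemble_ranking = sorted(combined, key=combined.get, reverse=True)
print(ensemble_ranking)
```

In this toy case the Manhattan measure alone would rank pack_c first, while the averaged score follows the other three measures; averaging damps the effect of any single measure's quirks.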
4 Evaluation Parameters
The experiment was designed to establish whether the user profile and recommendation methods presented can significantly boost recommender system efficiency. To measure how well the proposed methods produce the best possible Top-N recommendations for the users in the test set, we chose two kinds of metrics: classification accuracy metrics and model performance metrics.
4.1 Classification Accuracy
Classification accuracy evaluates how well a recommender system classifies an item as interesting or not. Tolerance differs from person to person, so a fixed tolerance threshold is not meaningful. Put simply, we cannot assume that a rating above two stars means the same for everyone [16–18]. A more reasonable threshold is the average of the user's rating vector: a rating greater than or equal to the vector average is treated as positive. The tolerance threshold thus produces a binary scale on which an item is positive if the person likes it and negative otherwise. Similarly, each recommendation can be considered right (true) or wrong (false). For every prediction pr and every actual rating ra, the following outcomes are counted:
True positive (TP): pr = positive, ra = positive
True negative (TN): pr = negative, ra = negative
False positive (FP): pr = positive, ra = negative
False negative (FN): pr = negative, ra = positive
These counts can be consolidated into more informative metrics, precision and recall. Precision is the fraction of all positive predictions that are correctly classified; it evaluates how good the system's positive recommendations are. For example, if the user likes three of every five recommendations, precision is 60%. Recall is the fraction of all positive items that are recommended; it evaluates how good the system is at finding positive recommendations [19–21]. For example, if the system recommends seven out of every eight of your favorite recharges, recall is 87.5%. Thus precision is TP/(TP + FP) and recall is TP/(TP + FN). Precision and recall are two sides of the same coin and are related to each other: most often, higher precision leads to lower recall, and higher recall leads to lower precision. The F-measure combines precision and recall into a single metric, their harmonic mean.

Fig. 3 Execution of feature selection using random forest, followed by the individual algorithm-based model with Minkowski similarity, which ranks the recharge products for every consumer in the form of recommendations
Fig. 4 Final output of the individual algorithm-based model with Minkowski similarity, showing the generated recommendations
Fig. 5 Execution of feature selection using random forest, followed by the ensemble algorithm-based model with cosine, Euclidean, Manhattan, and Minkowski similarities, which ranks the recharge products for every consumer in the form of recommendations
Fig. 6 Final output of the ensemble algorithm-based model with cosine, Euclidean, Manhattan, and Minkowski similarities, showing the generated recommendations

Precision: Precision@N is the number of relevant items among those selected at rank N divided by the total number of items selected at rank N. In Eq. 7, Brs is the set of relevant items selected at rank N, Bs is the set of selected items, and Br is the set of all relevant items. A precision of 1.0 means that every recommendation on the list is relevant. Precision@N can be read as the probability that an item selected at rank N is relevant.

$$\text{Precision@N} = \frac{\lvert B_{rs} \rvert}{\lvert B_s \rvert} = \frac{\lvert B_r \cap B_s \rvert}{\lvert B_s \rvert} \tag{7}$$
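The threshold-based counting of TP, TN, FP, and FN described earlier in this subsection can be sketched as below. The sketch rests on an assumption the text leaves open: the vector-average threshold is applied separately to the predicted ratings and to the actual ratings. The rating values and helper names are hypothetical.

```python
def binarize(ratings):
    """Mark a rating positive when it is at least the vector average."""
    threshold = sum(ratings) / len(ratings)
    return [r >= threshold for r in ratings]


def confusion_counts(predicted, actual):
    """Count TP, TN, FP, and FN over paired binary predictions and ratings."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for pr, ra in zip(predicted, actual):
        if pr and ra:
            counts["TP"] += 1
        elif not pr and not ra:
            counts["TN"] += 1
        elif pr and not ra:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


# Hypothetical predicted and actual ratings of five items for one user.
predicted_ratings = [4.5, 2.0, 3.5, 1.0, 4.0]
actual_ratings = [5.0, 1.0, 2.0, 2.0, 4.0]

counts = confusion_counts(binarize(predicted_ratings), binarize(actual_ratings))
precision = counts["TP"] / (counts["TP"] + counts["FP"])
recall = counts["TP"] / (counts["TP"] + counts["FN"])
print(counts, precision, recall)
```

A production version would also guard against empty denominators, for example when no item is predicted positive.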
Recall: Recall@N is the fraction of all relevant items that are selected at rank N. In Eq. 8, Brs is the set of relevant items selected at rank N, Br is the set of all relevant items, and Bs is the set of selected items. Recall@N can be read as the probability that a relevant item is selected at the given rank. A recall of 1 means that all relevant items are included in the recommendations.

$$\text{Recall@N} = \frac{\lvert B_{rs} \rvert}{\lvert B_r \rvert} = \frac{\lvert B_r \cap B_s \rvert}{\lvert B_r \rvert} \tag{8}$$
As mentioned above, our goal is to assess how well the proposed recommender algorithms generate a list of suggestions for a given test set of users. The preferred items of each test user were therefore hidden and used as the test (answer) items. The resulting recommendation list, Recui, was evaluated against the test item set Tua. In the experiments, the number of selected items, |Bs| in Eq. 7, was the number of items recommended in the Top-N set; therefore, |Bs| = N. Let ua be the active user whose test items are hidden, and let X be the test set. Recui denotes the Top-N items selected and suggested for the active user ua. The relevant items selected at rank N are those test items that appear in the Top-N list [22]:

$$\lvert B_{rs} \rvert = \lvert T_{ua} \cap \mathrm{Rec}_{ui} \rvert \tag{9}$$

In Eq. 8, the number of relevant items |Br| equals the number of test items of the active user ua: |Br| = |Tua|.
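Precision and recall at N for the suggestions are calculated as follows:

$$\text{Precision@N}(ua) = \frac{\lvert B_{rs} \rvert}{\lvert B_s \rvert} = \frac{\lvert T_{ua} \cap \mathrm{Rec}_{ui} \rvert}{N} = \frac{\lvert T_{ua} \cap \mathrm{Rec}_{ui} \rvert}{\lvert \mathrm{Rec}_{ui} \rvert} \tag{10}$$

$$\text{Recall@N}(ua) = \frac{\lvert B_{rs} \rvert}{\lvert B_r \rvert} = \frac{\lvert T_{ua} \cap \mathrm{Rec}_{ui} \rvert}{\lvert T_{ua} \rvert} \tag{11}$$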
Suppose the set of relevant items Br (or Tua) is {b1, b2, b3, b4, b5, b7, b10}, and the recommender system suggests Bs (or Recui) = {b2, b1, b4, b7, b11} for the user ua. The two sets share four items, {b1, b2, b4, b7}. The precision of this suggestion is therefore
|{b1, b2, b3, b4, b5, b7, b10} ∩ {b2, b1, b4, b7, b11}| / |{b2, b1, b4, b7, b11}| = 4/5
and the recall at N is
|{b1, b2, b3, b4, b5, b7, b10} ∩ {b2, b1, b4, b7, b11}| / |{b1, b2, b3, b4, b5, b7, b10}| = 4/7
that is, the number of common elements divided by the total number of relevant elements.
The F1-measure or F1-score at N (F1@N): This is the harmonic mean of recall and precision at rank N and summarizes the results as a whole.

$$\text{F1@N} = \frac{2 \times \text{Precision@N} \times \text{Recall@N}}{\text{Precision@N} + \text{Recall@N}} \tag{12}$$
Average precision at N (AP): the average of the Precision@N scores across all users U in a test set X.
Mean average precision (MAP): the average of the users' Precision@N scores across all test sets. MAP is used to determine the typical accuracy of the predictions over n test cases.
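Eqs. 10 to 12 can be sketched directly with Python sets, using the worked example from this subsection as input. The function name `precision_recall_f1` is a hypothetical choice for this illustration.

```python
def precision_recall_f1(relevant, recommended):
    """Precision@N, Recall@N, and F1@N for one user (Eqs. 10-12).

    relevant is the hidden test item set (T_ua); recommended is the
    Top-N list (Rec_ui), so N = len(recommended).
    """
    relevant_set = set(relevant)
    hits = len(relevant_set & set(recommended))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# The worked example from the text.
relevant_items = ["b1", "b2", "b3", "b4", "b5", "b7", "b10"]
recommended_items = ["b2", "b1", "b4", "b7", "b11"]

precision_at_n, recall_at_n, f1_at_n = precision_recall_f1(relevant_items, recommended_items)
print(precision_at_n, recall_at_n, f1_at_n)
```

Averaging `precision_at_n` over every user in a test set gives the AP value as defined above.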
4.2 Model Performance Metrics
By comparing TG and CG on recharge conversion and revenue, we can assess the efficacy of both the individual and the ensemble models. The model performance metrics are computed as follows:
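$$\%\ \text{increase in conversion} = \frac{(\text{TG}_{\text{count}} - \text{CG}_{\text{count}}) \times 100}{\text{Subscriber}_{\text{count}}} \tag{13}$$

$$\%\ \text{increase in revenue} = \frac{(\text{TG}_{\text{revenue}} - \text{CG}_{\text{revenue}}) \times 100}{\text{Total revenue}} \tag{14}$$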
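As a small arithmetic illustration of Eqs. 13 and 14, the sketch below uses one helper for both metrics. All figures are invented for the example and are not results from the study; the sketch also assumes that the revenue metric subtracts CG revenue from TG revenue.

```python
def percent_increase(tg_value, cg_value, base):
    """(TG - CG) x 100 / base, the common form of Eqs. 13 and 14."""
    return (tg_value - cg_value) * 100 / base


# Hypothetical monthly figures, for illustration only.
conversion_uplift = percent_increase(tg_value=1200, cg_value=1000, base=10000)
revenue_uplift = percent_increase(tg_value=55000.0, cg_value=50000.0, base=200000.0)
print(conversion_uplift, revenue_uplift)
```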
5 Results and Discussion
Cosine similarity, Euclidean similarity, Manhattan similarity, and Minkowski similarity were all considered to strengthen the framework of the individual model. A user's recharge history index (the number of days since the previous recharge) was also considered before a final suggestion was made. Combining random forest and principal component analysis (PCA) eliminated insignificant KPIs from the existing data, which in turn enhanced the quality of both the individual algorithm model and the ensemble algorithm model. Typically, the feature selection algorithm opted for the most widely used KPIs, such as overall data revenue, overall 4G data usage, and overall data usage. The KPI set also included the overall number of recharges, overall decrement, overall offnet voice MoUs, overall number of SMSes (local), overall number of SMSes (Main + ETU), overall revenue from local voice MoUs, overall revenue from offnet voice MoUs, and MPESA payments.
5.1 Execution of Individual Algorithm Model- Minkowski (See Fig. 3).
5.2 Final Output (User Interface View) Individual Model (See Fig. 4).
5.3 Execution of Ensemble Algorithm Model (See Fig. 5)
5.4 Final Output (User Interface View) Ensemble Model (See Fig. 6).
For obtaining user preferences and building recommendation strategies from taxonomy data, these models provide effective techniques. Using an ensemble of distance-based algorithms in place of individual distance-based algorithms also helps to improve user profiling.
6 Conclusion The telecommunications sector may benefit from the adoption of recommender systems based on either the ensemble algorithm model or the individual algorithm model. The distributed nature of the framework for the ensemble algorithm model makes the execution of customized suggestions more feasible and efficient. The ensemble algorithm model approach will help advance the state of the art in recommender systems in two key ways: (a) it will provide a promising solution to the sparsity problem, a limitation of current methods, and (b) it will provide a promising basis for hybrid systems. Despite the meteoric rise in e-commerce users and products in recent years, the quality of recommendations made by traditional forms of collaborative filtering has steadily declined. Telecommunications providers may use the proposed recommendation engines to better advise customers on which services would best meet their needs, and thereby gain an advantage over the competition. A few questions need to be explored further in the future: how to incorporate the recommendation engine into preexisting infrastructure without adding unnecessary strain; which additional input parameters could serve as criteria on which to base a recommendation; and how to provide location-based, time-sensitive suggestions to customers in real time.
References 1. Singh, I., Singh, K.V., Singh, S.: Big data analytics based recommender system for value added services (VAS). Adv. Intell. Syst. Comput. 2. Wilson J, Chaudhury, S., Kapadia, P., Lall, B.: Improving collaborative filtering based recommenders using topic modelling. In: Proceedings Cornell University (2014) 3. Soft, M., Zulu, D.M., Mazhandu, R.: Recommender system for telecommunication industries: a case of Zambia telecoms. Am. J. Econ., 271–273 (2017) 4. Zhang, Z., Lin, H., Liu, K., Wu, D., Zhang, G., Lu, J.: A hybrid fuzzy-based personalized recommender system for telecom products/services. Inf. Sci. 235, 117–129 (2013) 5. Thomas, S., Wilson, J., Chaudhury, S.: Best-fit mobile recharge pack recommendation. In: 2013 National Conference on Communications (NCC) (2013) 6. Morozov, S., Zhong, X.: The evaluation of similarity metrics in collaborative filtering recommenders. In: Hawaii University International Conferences, Hawaii (2013)
7. Shirkhorshidi, S., Aghabozorgi, S., Wah, T.Y.: A comparison study on similarity and dissimilarity measures in clustering continuous data. PLoS ONE 10(12), e0144059 (2015) 8. Jain, K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Comput. Surv. (CSUR) 31(3), 264–323 (1999) 9. Perlibakas, V.: Distance measures for PCA-based face recognition. Pattern Recogn. Lett. 25(6), 711–724 (2004) 10. Jiang, D., Tang, C., Zhang, A.: Cluster analysis for gene expression data: a survey. IEEE Trans. Knowl. Data Eng. 11, 1370–1386 (2004) 11. Bhatt, B., Patel, P.J., Gaudani, H.: A review paper on machine learning based recommendation system. 2014 IJEDR 2(4). ISSN: 2321-9939 12. Cleger-Tamayo, S, Fernandez-Luna, J.M., Huete, J.F.: A new ´ criteria for selecting neighborhood in memory-based recommender systems. In: Proceedings of the 14th international conference on Advances in artificial intelligence: Spanish association for artificial intelligence, ser. CAEPIA’11, pp. 423–432. Springer, Berlin (2011) 13. Kridel, D.J., Dolk, D.R., Castillo, D.: Recommender systems as a mobile marketing service. J. Serv. Sci. Manage. 6, 32–48 (2008). Published Online December 2013 (http://www.scirp.org/ journal/jssm) 14. Candillier, L., Meyer, F., Boulle, M.: Comparing state-of-the-art collaborative filtering systems. In: Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition, ser. MLDM ’07, pp. 548–562. Springer, Berlin (2007) 15. Zauvik, Laksito, A.D.: The comparison of distance measurement for optimizing KNN collaborative filtering recommender system. In: 3rd International Conference of Information and Communications Technology (2020) 16. Bobadilla, J., Serradilla, F., Bernal, J.: A new collaborativefiltering metric that improves the behavior of recommender systems. Knowl. Based Syst. 23(6), 520–528 (2010) 17. 
Brun, Castagnos, S., Boyer, A.: A positively directed mutual information measure for collaborative filtering.In: 2nd International Conference on Information Systems and Economic Intelligence—SIIE 2009, pp. 943–958 (2009) 18. Gualdi, S., Medo, M., Zhang, Y.-C.: Crowd avoidance and diversity in socio-economic systems and recommendation (2013) 19. Boe, “Collaborative filtering for recommending movies,” Master’s thesis, Norwegian University of Science and Technology (2007) 20. Wit, J.: “Evaluating recommender systems,” Master’s thesis, University of Twente, May (2008) 21. Yongli Ren, W.Z., Li, G.: Automatic generation of recommendations from data: a multifaceted survey. Deakin University, Tech. Rep. TR C11/4 (2011) 22. Hu, B., Long, Z.: Collaborative filtering recommendation algorithm based on user explicit preference. In: 2021 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA) (2021). https://doi.org/10.1109/ICAICA52286.2021.9498149
Uncertainty Management in Brain Data for Olfactory Perceptual-Ability Assessment of Human Subjects by General Type-2 Fuzzy Reasoning
Mousumi Laha and Amit Konar
Abstract The paper introduces an approach to assessing olfactory perceptual-ability and its gradual degradation over months for both healthy persons and people suffering from early olfactory ailments. Functional near-infrared spectroscopic (f-NIRs) data acquired from the experimental subjects' brains are first preprocessed and then fed to a novel general type-2 fuzzy regression unit to predict the subjective olfactory perceptual-ability. The model parameters are corrected using subjective feedback about the olfactory stimuli concentration. During the test phase, the model is used to predict perceptual degradation in olfaction for patients suffering from olfactory ailments. The prediction error computed with respect to the subject's self-assessment of stimulus concentration is used as a metric of performance of the proposed prediction algorithm. The original contribution of the work lies in the formulation to handle uncertainty in multi-trial and multi-session experimental brain data using general type-2 fuzzy logic. The proposed technique outperforms traditional type-2 fuzzy techniques both with respect to percent success rate and run-time complexity.
Keywords Human perceptual-ability related to olfaction · Spectroscopy in near infrared wavelength · Type-2 models of fuzzy sets and logic · Vertical slice representation of type-2 fuzzy sets
M. Laha (B) · A. Konar, Department of Electronics and Telecommunication Engineering, Jadavpur University, Jadavpur, India, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_9
1 Introduction Perception refers to the process of acquisition of external stimuli by our sense organs and their subsequent interpretation through interaction with one or more brain modules. The ability to perceive external stimuli, hereafter called perceptual-ability,
of a person depends largely on his/her capability in acquiring, processing, and classifying the stimuli [1]. This chapter deals with the assessment of the perceiving capability of people from brain images acquired using a functional near-infrared spectroscopic (f-NIRs) device [2]. The signals acquired by brain-imaging devices are usually non-stationary, as the signal variance fluctuates widely over experimental trials for apparently similar environments and stimuli. The non-stationary characteristics of brain signals add uncertainty to their feature space, thus placing an additional burden on the classification of cognitive tasks such as attention, perception, learning, and motor control and coordination. One approach to alleviating this problem is to add a module to the brain-computer interface [3, 4] system that reduces the scope of uncertainty in the feature space for similar cognitive tasks with similar (preferably the same) stimuli over the experimental trials. The logic of fuzzy sets has the potential to handle uncertainty in the measurement space with the help of membership functions [5]. Classical fuzzy sets, also called type-1 fuzzy sets [6], are unfortunately unable to capture the uncertainty in the variation of (brain) data over experimental trials with the same input stimuli. Type-2 fuzzy sets have an inherent advantage in modeling such uncertainty in the measurement space [7]. This is due to the representational merit of type-2 fuzzy sets, which allow fuzzy rather than crisp membership assignment for all possible values of the linguistic variable. It is important to mention here that a classical fuzzy set provides a crisp membership assignment at each value of the linguistic variable, and so it is capable of encoding data into fuzzy memberships only when the data are captured from a stationary signal source.
The paper demonstrates how data variation over trials in an experimental session can be described by a Gaussian-type fuzzy membership function (MF), and how multi-session data are represented by a set of Gaussian MFs, whose union and intersection together bound an (interval) type-2 fuzzy set of the kind stated above. Figure 1 provides a summary of interval type-2 MF construction for multi-session data, where each session includes a finite number of trials. The interval type-2 fuzzy set (IT2FS) outlined above appears to be one of the simplest techniques to handle non-stationary variations of the brain data. In the IT2FS representation, the entire span of membership for each discrete value of the linguistic variable $f = f'$ is equally important. A question then arises naturally: which membership value should we consider from the interval of memberships for $f = f'$? It is apparent that the interval $[\underline{u}, \overline{u}]$ of the (primary) membership $u$ for $f = f'$ has the most certainty (i.e., minimum uncertainty) at the center of the span, $u_{mid} = (\underline{u} + \overline{u})/2$, and the least certainty (i.e., maximum uncertainty) at the extremities of the interval, $u = \underline{u}$ and $u = \overline{u}$ (Fig. 2). A second question follows: should we regard only the membership $u = (\underline{u} + \overline{u})/2$ at $f = f'$ and ignore all others in the span $u \in [\underline{u}, \overline{u}]$, or use the entire interval with different degrees of importance? This calls for a secondary membership function over the interval $[\underline{u}, \overline{u}]$, and the choice of the function depends on the users' preference based on certain user-defined metrics and/or application domains. The resulting system now appears as 3-dimensional and is often referred to as a general type-2 fuzzy set (GT2FS) [8].
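The interval type-2 construction described above can be illustrated with a short sketch: one Gaussian MF per session, with the pointwise minimum (intersection) and maximum (union) over sessions taken as the lower and upper memberships. The feature values are hypothetical, and using the population standard deviation with a small floor is an assumption made only for this example.

```python
import math
from statistics import mean, pstdev


def gaussian_mf(f, center, sigma):
    """Type-1 Gaussian membership grade of feature value f."""
    return math.exp(-((f - center) ** 2) / (2.0 * sigma ** 2))


def interval_membership(f, sessions, min_sigma=1e-6):
    """Return (lower, upper) membership of f over all sessions.

    Each session is a list of trial values; its Gaussian MF is centred at
    the trial mean with spread equal to the trial standard deviation.
    """
    grades = []
    for trials in sessions:
        center = mean(trials)
        sigma = max(pstdev(trials), min_sigma)  # floor avoids division by zero
        grades.append(gaussian_mf(f, center, sigma))
    return min(grades), max(grades)  # intersection, union


# Hypothetical feature values: 3 sessions of 3 trials each.
sessions = [[0.40, 0.50, 0.60], [0.45, 0.55, 0.65], [0.50, 0.60, 0.70]]
lower, upper = interval_membership(0.5, sessions)
u_mid = (lower + upper) / 2.0  # centre of the primary membership interval
print(lower, upper, u_mid)
```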
Fig. 1 Summary of interval type-2 MF construction for multi-session data comprising multiple trials. (a) Gaussian MF of single-session multi-trial data. (b) Gaussian MFs for multi-session data comprising multiple trials. (c) Union of the MFs in (b). (d) Intersection of the MFs in (b). (e) Interval type-2 fuzzy representation obtained by including (c) and (d) together
Fig. 2 Construction of the vertical slice GT2FS
The novelty of the present research is automated reasoning under type-2 fuzzy uncertainty using GT2FS. Two types of reasoning in GT2FS are prevalent: the z-slice-based approach [9] and the vertical slice-based approach [10]. Both approaches have certain merits and demerits, but vertical slice-based reasoning outperforms its counterpart on a number of metrics [11–13]. Here, emphasis is given to 3 different new proposals for vertical slice-based reasoning and their relative merits and demerits. As in many classical computer science problems, we are concerned with two conflicting but useful metrics: performance and computational overhead. If the computational overhead is restricted, the reasoning performance only marginally outperforms IT2FS-based reasoning. However, with no restriction on computational cost (as happens in offline/non-real-time systems), the reasoning performance improves considerably. The work presented here is organized into 5 sections. Section 2 outlines the research hypothesis on the assessment of possible degradation in olfaction. Next, a proposal on general type-2 fuzzy logic-based reasoning for perceptual-ability assessment of people is given in Sect. 3. Experimental details and an analysis of the performance of the proposed model in comparison with existing ones are covered in Sect. 4. Conclusions and the future scope of the research are listed in Sect. 5.
2 Research Hypothesis Animals often use olfaction and touch as the primitive modalities of their perception. With the progress of civilization, humans have come to use these 2 modalities as secondary to the visual and auditory modalities of perception. However, olfaction plays an important role in the detection of (early) Alzheimer's disease, COVID-19, and many other conditions. So, in this paper, an attempt is made to assess the olfactory perceptual-ability and the olfactory perceptual degradation of people based on the hemodynamic response of the brain [14]. But as discussed earlier, brain data are not free from uncertainty. Here lies the challenge of the present research: to assess perceptual-ability and degradation from noisy, non-stationary, and imprecise (uncertain) brain data. The entire work includes 2 main phases. The first phase undertakes a new fuzzy regression technique to model the olfactory perceptual process of experimental human subjects. The model parameters are tuned using subjective feedback about the olfactory stimuli concentration. The second phase, also called the test phase, employs the pre-trained model to predict the strength (concentration grade) of the odor stimulus. The difference between the responses produced by the model and the human subject (the oral response of the subject) is regarded as the perceptual error of the subject. The merit of the modeling is thus that it predicts perceptual degradation in olfaction. Although there is a vast literature on handling uncertainty in brain data, it is noted that vertical slice-based general type-2 fuzzy reasoning (VS-GT2FR)-based regression has advantages over the existing approaches [15–17] in handling uncertainty in multi-session and multi-trial data. The first model proposed is concerned with triangular vertical slice-based general type-2 fuzzy reasoning (TVS-GT2FR), which is also termed semi-general type-2 fuzzy reasoning [12]. The second approach deals with Gaussian vertical slice-based general type-2 fuzzy reasoning (GVS-GT2FR) [14]. While the former has less computational overhead, the latter has better performance in terms of prediction accuracy. Varying the (vertical slice-based) reasoning formalisms and defuzzification techniques yields the assessment of perceptual-ability that most closely matches the user's self-assessment of his/her perceptual-ability. The research challenge lies in the choice of secondary membership functions, reasoning technique, and type-2 defuzzification to attain the best results.
3 Original Contributions of the Present Research One important part of the original contribution of the present research lies in measuring the degradation of the olfactory perceptual capability of people by utilizing an f-NIRs device. Here, the novelty lies in the design of a new proposal for regression using fuzzy logic that is expected to work under the imprecision of measurements arising from the non-stationary characteristics of the brain signals. The architecture of the proposed reasoning models is summarized below.
3.1 The Proposed TVS-GT2FR Model (Model 1) Let us consider an IT2FS $\tilde{A}_i$, for $i$ varying in $[1, n]$, with an interval-valued type-2 membership function $[\underline{\mu}_{\tilde{A}_i}(f_i), \overline{\mu}_{\tilde{A}_i}(f_i)]$ for the linguistic variable $f_i$, where the quantities in square brackets, respectively, denote the lower and the upper membership functions (LMF and UMF) of $\tilde{A}_i$. The construction of $\tilde{A}_i$ involves both intra-session (3 trials/session) and inter-session (7 independent sessions) variations in $f_i$. A type-1 Gaussian membership function (MF) is constructed for each session, with center and spread, respectively, equal to the mean ($M$) and the standard deviation ($\sigma$) of the intra-session data points. The union and the intersection of such Gaussians for all the sessions together are computed to obtain an interval type-2 fuzzy set [12]. Now, for the general type-2 fuzzy set, a secondary membership function $\mu_{\tilde{A}_i(f_i')}(u_i)$ is constructed by Eq. (1).
\[ \mu_{\tilde{A}_i(f_i')}(u_i) = \max\left[\mu_{\tilde{A}_i(f_i')}(u_{mid}) \cdot \exp(-|u_{mid} - u_i|),\ \max\left(\mu_{\tilde{A}_i(f_i')}(\underline{u}_i),\ \mu_{\tilde{A}_i(f_i')}(\overline{u}_i)\right)\right] \tag{1} \]
where $u_{mid}$, the center value of $[\underline{u}_i, \overline{u}_i]$, carries the peak because the uncertainty is least at the center, and $u_i = \underline{u}_i$ and $u_i = \overline{u}_i$ are the lower and upper bounds of $u_i$ over the $f = f'$ line lying within the bounds of the possible band of uncertainty [10]. Now, for the current proposal on fuzzy reasoning, a rule $R_j$ is given by: If $f_1$ is $\tilde{A}_1$ and $f_2$ is $\tilde{A}_2$ and … and $f_n$ is $\tilde{A}_n$, then $y$ is $\tilde{B}_j$. Here, $f_i$ is $\tilde{A}_i$, for feature $i = 1$ to $n$, denotes the GT2 fuzzy antecedent propositions, and $y$ is $\tilde{B}_j$ is an IT2FS consequent proposition. To keep the reasoning complexity within limits, the footprint of uncertainty (FOU) of the antecedent fuzzy sets is reduced by a non-linear mapping, carried out in 2 steps. In the first step, the secondary memberships of all discretized primary membership values $u$ in $[\underline{u}, \overline{u}]$ are thresholded. For those crossing the user-defined threshold (empirically set to 0.01), the geometric mean of the thresholded secondary membership and the corresponding primary membership value is computed. In step 2, the largest and the smallest of the geometric means over all sample points $u$ lying in the FOU [10] for a given linguistic variable value $f = f'$ are used to, respectively, define the upper and lower bounds of the modified $u$ for the same $f = f'$. It is found from Fig. 3a that such an operation results in a reduced FOU (i.e., with reduced uncertainty). The process of generating inferences for a selected point of measurement is next accomplished by the same process as undertaken in conventional IT2FS-based logical reasoning. Finally, the centroid type reduction [18] and decoding of the resulting inference are performed using the enhanced Karnik–Mendel (EKM) algorithm [19]. Figure 3b represents the architecture of the proposed TVS-GT2FR model.
Fig. 3 a Construction of modified triangular vertical slice-based GT2FS. b Architecture of the proposed triangular vertical slice-based GT2F reasoning model
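The two-step FOU reduction can be sketched as below. This is an illustrative reading, not the authors' code: Eq. (1) is taken as the larger of a decaying peak term and a boundary grade, and the peak grade (`peak`), the boundary grade (`edge`), and the number of discretization points (`samples`) are hypothetical parameters introduced only for the example.

```python
import math


def secondary_membership(u, u_lower, u_upper, peak=1.0, edge=0.0):
    """Secondary grade at primary membership u, following the reading of Eq. (1) above."""
    u_mid = (u_lower + u_upper) / 2.0
    return max(peak * math.exp(-abs(u_mid - u)), edge)


def reduced_interval(u_lower, u_upper, samples=11, threshold=0.01):
    """Shrink [u_lower, u_upper] using geometric means of (primary, secondary) pairs."""
    if samples < 2:
        raise ValueError("need at least two sample points")
    step = (u_upper - u_lower) / (samples - 1)
    geo_means = []
    for k in range(samples):
        u = u_lower + k * step
        s = secondary_membership(u, u_lower, u_upper)
        if s > threshold:  # step 1: keep only grades above the threshold
            geo_means.append(math.sqrt(u * s))
    if not geo_means:  # nothing survived thresholding: keep the original interval
        return u_lower, u_upper
    return min(geo_means), max(geo_means)  # step 2: new lower and upper bounds


new_lower, new_upper = reduced_interval(0.2, 0.8)
print(new_lower, new_upper)
```

For the interval [0.2, 0.8] used here, the returned interval is narrower than the original one, which is the kind of FOU reduction the text describes; other intervals or parameter choices may behave differently.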
3.2 The Proposed GVS-GT2FR Model (Model 2) The computations of the type-1 MF and the IT2FS here are largely the same as in the first model; the only differences lie in the secondary membership construction and the modified/refined IT2FS construction. Here, a Gaussian secondary membership $\mu_{\tilde{A}_i(f_i')}(u_i)$ is considered for the GT2FS $\tilde{A}_i$, which is defined as
\[ \mu_{\tilde{A}_i(f_i')}(u_i) = \exp\left[-\frac{(u_i - u_{mid})^2}{2\sigma_i^2}\right] \tag{2} \]
with parameters $u_{mid}$ and $\sigma_i$ as obtained for the previous model. The current reasoning model is improved over the last model by 2 levels of uncertainty management policy. First, the revised primary membership values $p_{i,k}$ are computed by the following mapping function
\[ p_{i,k} \leftarrow u_{i,k}^{\,1 - \mu_{\tilde{A}_i(f_i')}(u_{i,k})} \tag{3} \]
where $\mu_{\tilde{A}_i(f_i')}(u_{i,k})$ is the secondary membership for the corresponding $k$-th primary membership $u_{i,k}$ along the $f_i = f_i'$ axis. It is important to mention here that the proposed mapping function is trustworthy in the sense that the pre-constructed LMF is enhanced, and the pre-constructed UMF is reduced, along the $f_i = f_i'$ axis, which produces the refined LMF and UMF. Consequently, the spread of the FOU is lowered in comparison with its original width (Fig. 4a). Secondly, the greatest of the lower bounds of the revised LMFs and the least of the upper bounds of the revised UMFs for the selected value of the linguistic variables are determined to obtain the lower and upper firing strengths (hereafter called LFS and UFS) in rule-firing. The largest of the lower bounds and the smallest of the upper bounds together assert an additional decline in the range of uncertainty of the fuzzy logical inferences (see Fig. 4b). Finally, the centroidal type reduction and EKM defuzzification process are employed to obtain the computed odor concentration of the stimulus presented to the subject. Now, in the training phase, the prediction error is evaluated by taking the difference between the desired concentration grade of the odor stimulus, collected from the verbal narration of the subject, and the computed odor concentration grade of the given stimulus.
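As a rough illustration of the Gaussian secondary membership and the primary-membership mapping, the sketch below reads the mapping of Eq. (3) as raising each primary membership to the power one minus its secondary grade. That reading, the value of `sigma`, and the sample interval are assumptions made for this example only.

```python
import math


def gaussian_secondary(u, u_mid, sigma):
    """Gaussian secondary grade, peaking at u_mid (the form of Eq. (2))."""
    return math.exp(-((u - u_mid) ** 2) / (2.0 * sigma ** 2))


def revised_primary(u, u_mid, sigma):
    """Revised primary membership, assuming Eq. (3) reads p = u ** (1 - secondary grade)."""
    return u ** (1.0 - gaussian_secondary(u, u_mid, sigma))


# Hypothetical primary membership interval [0.2, 0.8] with centre 0.5.
u_lower, u_upper, sigma = 0.2, 0.8, 0.1
u_mid = (u_lower + u_upper) / 2.0
revised_lower = revised_primary(u_lower, u_mid, sigma)
print(revised_lower)
```

Under this reading, any primary membership in (0, 1] is mapped to a value no smaller than itself, so the lower bound is raised, in line with the statement that the LMF is enhanced.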
Fig. 4 a Construction of modified Gaussian vertical slice-based GT2FS. b Architecture of the proposed Gaussian vertical slice-based GT2F reasoning model
3.3 Subjective Perceptual-Ability Assessment in the Test Phase In the test phase, the perceptual-ability (here, olfactory perceptual degradation (OPD)) of human subjects is evaluated by the following strategy. First, the OPD $\xi_c^r$ of a subject for an olfactory stimulus $r$ having a concentration $c$ is calculated as $\xi_c^r = |C_c^r - D_c^r|$, where $C_c^r$ is the computed response obtained from the pre-trained GT2FS algorithm and $D_c^r$ is the desired response obtained from the oral response of the subject. Next, the root mean square error (RMSE) over all concentration levels of the stimulus $r$ is given by $\mathrm{RMSE}_r = \left(\sum_{\forall c} (\xi_c^r)^2 / c_{max}\right)^{1/2}$. The average value of the RMSE across all the stimuli is obtained by $\mathrm{RMSE} = \sum_{\forall r} \mathrm{RMSE}_r / r_{max}$. Finally, the OPD of a subject is assessed by
\[ \mathrm{OPD} = \frac{\mathrm{RMSE}}{\mathrm{Max\,RMSE}_{th} - \mathrm{Min\,RMSE}_{th}} \times 100, \tag{4} \]
where $\mathrm{Max\,RMSE}_{th}$ and $\mathrm{Min\,RMSE}_{th}$ are the maximum and minimum theoretical values of RMSE.
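The OPD computation can be sketched end to end as follows. The sketch assumes the per-concentration error is the absolute difference between the computed and the desired (oral) response and that Eq. (4) normalizes the averaged RMSE by the theoretical RMSE range; the response grades and the theoretical bounds are hypothetical.

```python
import math


def rmse_for_stimulus(computed, desired):
    """RMSE over all concentration levels of one stimulus."""
    errors = [abs(c - d) for c, d in zip(computed, desired)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def olfactory_perceptual_degradation(computed_by_stimulus, desired_by_stimulus,
                                     max_rmse_th, min_rmse_th):
    """OPD in percent: mean RMSE over stimuli, scaled by the theoretical RMSE range."""
    per_stimulus = [rmse_for_stimulus(c, d)
                    for c, d in zip(computed_by_stimulus, desired_by_stimulus)]
    overall_rmse = sum(per_stimulus) / len(per_stimulus)
    return overall_rmse / (max_rmse_th - min_rmse_th) * 100.0


# Hypothetical concentration grades: 2 stimuli, 3 concentration levels each.
computed = [[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]]
desired = [[1.0, 2.0, 3.0], [3.0, 3.0, 3.0]]
opd = olfactory_perceptual_degradation(computed, desired, max_rmse_th=5.0, min_rmse_th=0.0)
print(opd)
```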
4 Experimental Results and Performance Analysis The overall scheme of the experimental process is summarized in Fig. 5. Here, an olfactory stimulus is presented to the subject, and his/her brain response to the stimulus is acquired by an f-NIRs device. The information acquired over several trials in a session is distinguished and pre-processed prior to the extraction of certain features [20–22]. The type-2 MFs are then constructed following the principles narrated in Sect. 3. The training phase then begins, with the aim of building the GT2FS regression model. The training phase is not indicated in the figure. Next, in the prediction phase, for an olfactory stimulus, we collect the subject's feedback as an oral response about the concentration grade of the stimulus and predict the concentration by the proposed regression model. We take the difference between the responses of the model and the subject (by his/her verbal response) and thus obtain the perceptual error. Thirty subjects (22 healthy and 8 diseased) participated in the experiment, which was conducted in compliance with the ethical requirements of the Declaration of Helsinki, first adopted in 1964 and revised in 2004 [23].
4.1 Experiment 1: Comparative Analysis of the Proposed Regression Models The objective of the present experiment is to measure the performance of the reasoning-based regression technique by comparing the model response with the response of the human subject. Figure 6 provides the graphical representation of the desired and the computed responses of the two proposed models for a diseased subject (diseased subject 4) evaluated over 12 months. It is observed from the plot that the computed response of the second (GVS-GT2FR) model is closer to the desired (oral) response during the possible degradation period of the subject over months. Consequently, the second model yields better performance in assessing the perceiving capability of people from the readings of the f-NIRs system in the presence of uncertainty.
Fig. 5 Block diagram of the proposed experimental process (blocks: odor stimulus, human subject, f-NIRs data acquisition, complex signal processing, feature extraction and selection, proposed GT2FS-based regression model; the computed response (odor concentration grade) is compared with the desired (oral) response to give the perceptual error)
Fig. 6 Comparative analysis of the desired and the computed response of two proposed models for a diseased subject (subject 4) evaluated over 12 months (x-axis: months; y-axis: OPD; curves: GVS-GT2FS, TVS-GT2FS, oral response of subject)
4.2 Experiment 2: Relative Performance Analysis and Validation To measure the comparative performance of the proposed algorithm with respect to the existing algorithms, the following steps are undertaken.
1. First, the successful instances of each subject are identified for the proposed algorithm by the user-defined inequality criterion $[D_i - 5\%\ \text{of}\ D_i] \le C_i \le [D_i + 5\%\ \text{of}\ D_i]$, i.e., $0.95\,D_i \le C_i \le 1.05\,D_i$.
2. After step 1 is computed for all instances, the percentage success rate (PSR) [14] of the proposed algorithm is determined by
\[ \mathrm{PSR} = \frac{\text{No. of successful instances}}{250} \times 100. \tag{5} \]
3. Steps 1 and 2 are repeated for all subjects to calculate the PSR of the proposed algorithm.
4. The above steps are repeated to compute the PSR of 8 selected state-of-the-art algorithms for all subjects.
Table 1 lists the PSR values and the run-times required by the proposed and the off-the-shelf algorithms tested across 30 datasets. It is apparent from Table 1 that the GVS-based GT2FR model gives a higher PSR than its competitors. The well-known Friedman statistical test [31] is undertaken to check the efficacy of the proposed models over their competitors.
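The success criterion and the PSR of steps 1 and 2 can be sketched as follows. The desired and computed grades are hypothetical, and the denominator is taken as the number of instances supplied (250 in the experiment) so that the helper works for any sample size; non-negative desired values are assumed.

```python
def is_success(desired, computed, tolerance=0.05):
    """True if the computed grade lies within +/- tolerance of the desired grade."""
    return (1.0 - tolerance) * desired <= computed <= (1.0 + tolerance) * desired


def percentage_success_rate(desired, computed, tolerance=0.05):
    """PSR in percent over paired desired/computed instances."""
    if len(desired) != len(computed) or not desired:
        raise ValueError("need two non-empty lists of equal length")
    successes = sum(is_success(d, c, tolerance) for d, c in zip(desired, computed))
    return successes * 100.0 / len(desired)


# Hypothetical desired (oral) and computed concentration grades.
desired_grades = [10.0, 20.0, 30.0, 40.0]
computed_grades = [10.4, 18.0, 30.0, 42.5]
psr = percentage_success_rate(desired_grades, computed_grades)
print(psr)
```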
Table 1 Relative performance analysis in average PSR of the proposed 2 regression models with traditional models across datasets of 30 subjects

| Comparative regression models with optimal selection of parameters | PSR (in %) | Run-time complexity (using IBM PC dual-core machine) (ms) |
|---|---|---|
| The LSVM-based regression [24] | 78.9 | 56.28 |
| The SVM with Gaussian kernel-based regression [25] | 80.4 | 57.76 |
| Regression based on BPNN algorithm [26] | 75.3 | 61.73 |
| Polynomial regression model with order 10 [27] | 73.1 | 103.22 |
| Type-I fuzzy regression model based on genetic algorithm [28] | 65.3 | 47.82 |
| IT2FS regression model based on differential evolution (DE) [29] | 84.8 | 45.72 |
| Traditional GT2FS-based regression [30] | 88.2 | 95.48 |
| Proposed TVS-GT2FS regression model [12] | 90.2 | 95.22 |
| Proposed GVS-GT2FS regression model [14] | 92.3 | 94.78 |

The last two rows report the proposed models, whose PSR values are superior to those of the other models.
5 Conclusions and Future Scope The paper proposed solutions to the perceptual degradation assessment of human subjects suffering from olfactory disorders associated with conditions such as early Alzheimer's disease and COVID-19. A prediction model is designed to predict a subject's performance from his/her past training data samples. The model performance is checked by measuring the prediction error with reference to the subjects' graded responses on the concentration of the olfactory stimuli. After training is completed, the model is used to predict the subject's olfactory degradation by comparing the model response with the actual response of the subject after every 3 months of his/her deterioration in the olfactory system. Such a system is not currently available and thus appears useful to a large section of people suffering from perceptual disorders. The principles narrated for olfactory degradation can be extended easily to other perceptual modalities. The type-2 fuzzy vertical slice models can be replaced by rough sets and other tools and techniques of uncertainty modeling.
110
M. Laha and A. Konar
References 1. Saha, A., Konar, A., Chatterjee, A., Ralescu, A.L., Nagar, A.K.: EEG analysis for olfactory perceptual-ability measurement using recurrent neural classifier. IEEE Trans. Hum. Mach. Syst. 44(6), 717–730 (2014) 2. Ferrari, M., Quaresima, V.: A brief review on the history of human functional near-infrared spectroscopy (fNIRS) development and fields of application. Neuroimage 63(2), 921–935 (2012) 3. Makeig, S., Kothe, C., Mullen, T., Bigdely-Shamlo, N., Zhang, Z., Kreutz-Delgado, K.: Evolving signal processing for brain–computer interfaces. In: Proceedings of the IEEE, 100(Special Centennial Issue), pp. 1567–1584 (2012) 4. Li, Y., Yu, Z.L., Bi, N., Xu, Y., Gu, Z., Amari, S.I.: Sparse representation for brain signal processing: a tutorial on methods and applications. IEEE Signal Process. Mag. 3(3), 96–106 (2014) 5. Klir, G.J., Yuan, B.: Fuzzy Sets and Fuzzy Logic: Theory and Applications. Pretice-Hall (1997) 6. Mendel, J.M.: Type-2 fuzzy sets. In: Uncertain rule-Based Fuzzy Systems, pp. 259–306. Springer, Cham (2017) 7. Mendel, J.M., Robert, I.J., Feilong, L.: Interval type-2 fuzzy logic systems made simple. IEEE Trans. Fuzzy Systems 14(6), 808–821 (2006) 8. Mendel, J.M.: General type-2 fuzzy logic systems made simple: a tutorial. IEEE Trans. Fuzzy Syst. 22(5), 1162–1182 (2014) 9. Wagner, C., Hagras, H.: Toward general type-2 fuzzy logic systems based on zSlices. IEEE Trans. Fuzzy Systems 18(4), 637–660 (2010) 10. Mendel, J.M., Hagras, H., Tan, W.W., Melek, W.W., Ying, H.: Introduction to Type-2 Fuzzy Logic Control: Theory and Applications. John Wiley & Sons (2014) 11. Ghosh, L., Konar, A., Rakshit, P., Nagar, A.K.: Hemodynamic analysis for cognitive load assessment and classification in motor learning tasks using type-2 fuzzy sets. IEEE Trans. Emerg. Top. Comput. Intell. 3(3), 245–260 (2018) 12. Laha, M., Konar, A., Rakshit, P., Nagar, A.K.: Exploration of subjective color perceptual-ability by EEG-induced type-2 fuzzy classifiers. IEEE Trans. Cogn. Dev. 
Syst. 12(3), 618–635 (2019) 13. Saha, A., Konar, A., Nagar, A.K.: EEG analysis for cognitive failure detection in driving using type-2 fuzzy classifiers. IEEE Trans. Emerg. Top. Comput. Intell. 1(6), 437–453 (2017) 14. Laha, M., Konar, A., Rakshit, P., Nagar, A.K.: Hemodynamic analysis for olfactory perceptual degradation assessment using generalized type-2 fuzzy regression. IEEE Trans. Cogn. Dev. Syst. 14(3), 1217–1231 (2022) 15. Wu, D., Mendel, J.M.: Similarity measures for closed general type-2 fuzzy sets: overview, comparisons, and a geometric approach. IEEE Trans. Fuzzy Syst. 27(3), 515–526 (2018) 16. Ontiveros-Robles, E., Castillo, O., Melin, P.: Towards asymmetric uncertainty modeling in designing general type-2 fuzzy classifiers for medical diagnosis. Expert Syst. Appl. 183, 115370 (2021) 17. Andreu-Perez, J., Cao, F., Hagras, H., Yang, G.Z.: A self-adaptive online brain–machine interface of a humanoid robot through a general type-2 fuzzy inference system. IEEE Trans. Fuzzy Syst. 26(1), 101–116 (2016) 18. Liu, F.: An efficient centroid type-reduction strategy for general type-2 fuzzy logic system. Inf. Sci. 178(9), 2224–2236 (2008) 19. Wu, D., Mendel, J.M.: Enhanced Karnik-Mendel algorithms. IEEE Trans. Fuzzy Systems 17(4), 923–934 (2009) 20. De, A., Laha, M., Konar, A., Nagar, A.K.: Classification of relative object size from parietooccipital hemodynamics using type-2 fuzzy sets. In: FUZZ-IEEE, pp. 1–8 (2020) 21. Laha, M., Konar, A., Rakshit, P., Ghosh, L., Chaki, S., Ralescu, A.L., Nagar, A.K.: Hemodynamic response analysis for mind-driven type-writing using a type 2 fuzzy classifier. In: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). pp. 1–8. IEEE (2018)
22. Chowdhury, E., Qadir, Z., Laha, M., Konar, A., Nagar, A.K.: Finger-induced motor imagery classification from hemodynamic response using type-2 fuzzy sets. In: Soft Computing for Problem Solving 2019, pp. 185–197. Springer, Singapore (2020) 23. World Medical Association: World medical association declaration of Helsinki. Ethical principles for medical research involving human subjects. Bull. World Health Organ. 79(4), 373–374 (2001) 24. Sánchez, A.S., et al.: Application of an SVM-based regression model to the air quality study at local scale in the Avilés urban area (Spain). Math. Comput. Model. 54(5–6), 1453–1466 (2011) 25. Wang, W., Xu, Z., Lu, W., Zhang, X.: Determination of the spread parameter in the Gaussian kernel for classification and regression. Neurocomputing 55(3,4), 643–663 (2003) 26. Sun, J., Kalenchuk, D.K., Xue, D., Gu, P.: Design candidate identification using neural networkbased fuzzy reasoning. Robot. Comput. Integr. Manuf. 16(5), 383–396 (2000) 27. Goodale, C.L., Aber, J.D., Ollinger, S.V.: Mapping monthly precipitation, temperature, and solar radiation for Ireland with polynomial regression and a digital elevation model. Climate Res. 1, 35–49 (1998) 28. Aghaeipoor, F., Javidi, M.M.: On the influence of using fuzzy extensions in linguistic fuzzy rule-based regression systems. Appl. Soft Comput. 79, 283–299 (2019) 29. Bhattacharya, D., Konar, A., Das, P.: Secondary factor induced stock index time-series prediction using self-adaptive interval type-2 fuzzy sets. Neurocomputing 171, 551–568 (2016) 30. Halder, A., Konar, A., Mandal, R., Chakraborty, A., Bhowmik, P., Pal, N.R., Nagar, A.K.: General and interval type-2 fuzzy face-space approach to emotion recognition. IEEE Trans. Syst. Man Cybern. Syst. 43(3), 587–605 (2013) 31. Demsar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
Checkpoint-Based Round-Robin Scheduling Technique Toward Fault-Tolerant Resource Management in Cloud Computing
Jayanta Datta, Subhamita Mukherjee, and Indrajit Pan
Abstract Cloud computing is growing in importance. Public clouds are more challenging than private clouds: a public cloud framework faces inherent challenges of security, service reliability, and time-constrained, on-demand service delivery. Resource management is an important aspect of a public cloud, and efficient allocation of resources across different service peers or servers is central to error and fault management. This work proposes a modified round-robin mechanism for resource scheduling with multiple checkpoints, which takes care of error handling and fault management. The proposed method has been tested on different benchmarks, and the results establish its robustness.
Keywords Cloud computing · Error handling · Failure management · Fault tolerance · Round-robin scheduling
J. Datta · I. Pan (B) RCC Institute of Information Technology, Kolkata, West Bengal 700015, India e-mail: [email protected]
J. Datta e-mail: [email protected]
S. Mukherjee Techno Main Salt Lake, Kolkata, West Bengal 700091, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_10

1 Introduction
Cloud computing provides access to costly resources on a subscription basis. These resources are hardware or software accessed from remote locations. As demand grows, meeting every hardware and software need with privately owned infrastructure is almost infeasible. However, access to these resources has become feasible with the advent and progress of cloud computing [1]. Cloud computing is a fast-growing technology that many enterprises and business firms have adopted in their day-to-day operations. Large-scale use of cloud computing services, driven by growing popularity and demand, has created very high load-management requirements and a demand for reliability on cloud servers. Scheduling resource requests on cloud servers is not, by itself, sufficient to sustain reliable services. It also requires fault management and fault-aware routing to overcome failure of service requests and to ensure timely completion of assignments [11]. This article addresses fault-aware scheduling of service requests on cloud servers. A checkpoint-based round-robin scheduling algorithm is proposed, which assigns a threshold-based load to the cloud servers so that load is evenly distributed across them. This helps servers perform at optimum efficiency. Moreover, a module identifies faulty servers and immediately reallocates their resource requests to other servers while respecting the threshold capacity of each server. This allows all ongoing service requests to finish successfully within a stipulated time. Section 2 discusses related work from the literature, and Sect. 3 covers cloud computing basics. Section 4 defines the problem, which is followed by the proposed methodology in Sect. 5. Experimental results of the proposed method on benchmark datasets and a comparative analysis are presented in Sect. 6. Finally, the conclusion is given in Sect. 7.
2 Related Work
Resource scheduling in cloud computing has attracted ample research interest in recent years. Fault-aware scheduling is an especially important research topic, because faults and failures are an unavoidable part of system operation. This section discusses some recent relevant research to derive the motivation for the work reported in this article. Zhao et al. [1] discussed cloud workflow scheduling by bringing cloud and workflow together in a cloud workflow management system. Improved security in workflow execution is introduced in [2], which proposes a scientific functional model based on game design; the game operates on an attack and defense strategy. Rodriguez et al. [3] introduced a resource provisioning method that minimizes workflow execution cost under a predefined deadline constraint. The work in [4] introduced another cloud scheduling strategy that is economically balanced in terms of cost and also accounts for energy. A deep Q-learning scheduling technique is discussed in [5], which addresses scheduling overhead through a directed acyclic graph model. Another article [6] focuses on the priority of resource requests and on a relative-distance measure to reduce their completion time; it also manages request deadlines. The work in [7] proposed Pareto optimization of multi-objective parameters under a workflow modeling scheme. Another evolutionary scheme focused on multi-objective parameters was proposed in [8]. Kalra and Singh proposed a hybrid approach for energy-aware scheduling of deadline-constrained workflows (HAED), which uses the intelligent water drops algorithm along with a genetic algorithm-based implementation of multi-objective workflow scheduling [9]. These articles discuss deadline-constrained scheduling of multiple resource requests. However, none of them considers any fault-tolerance measure during resource scheduling.
Pandey et al. [10] proposed a task replication method and an architecture-based solution that relies on task categorization and authorized access to the categories of tasks. The key objective of that work is to use different levels of trust to improve the robustness of mobile device clouds. Marahatta et al. [11] discussed an energy-aware fault-tolerant dynamic scheduling scheme, called EFDTS, which coordinates optimization of resource utilization and energy consumption with a fault-tolerant mechanism in a cloud data center. EFDTS improves the overall scheduling performance and achieves a higher degree of fault tolerance through parallel task scheduling. Ding et al. [12] discussed scientific workflows in the cloud using a fault-tolerant elastic scheduling algorithm. This model concentrates on performance improvement through a fault-tolerant method that attempts to maximize utilization of cloud resources. Fan et al. [13] discussed a fault-tolerant algorithm that dynamically manages scheduling under different deadlines. The authors applied a Petri net-based technique to analyze and validate the feasibility of the approach for improving reliability. Olteanu et al. [14] proposed a method to manage fault scenarios through a rescheduling technique based on resubmission and exception handling for large-scale distributed networks. This fault-tolerant approach improved resource management by considering task-level failures, but it did not consider resource-level failures. Another article [15] proposed a robust heuristic approach based on resubmission to address multiple workflow scheduling problems. It helps, during concurrent occupancy of resources, to overcome service delays caused by heavy resource contention.
The articles discussed in the second part of this section mostly focus on fault-tolerant methods, but all of them identify faults in the system while the process is running. In all these cases, resource requests are failure-prone if there is a fault in the framework. Another approach to fault management is fault-aware routing, which avoids failure and yields maximum success rates. This article aims to develop fault-aware scheduling so that faults in the framework can be managed efficiently and the success rate maximized.
3 Cloud Computing Basics
Cloud computing flexibly offers computing services and resources, mainly through the internet. It reduces economic overhead and offers an extensive variety of services. Computing resources include servers, storage, databases, networking, software, analytics, and intelligence. Generally, on-premise IT infrastructure is maintained by the organization itself. However, the growing need for wide-scale infrastructure makes this economically infeasible. To reduce costs, current technologies are shifting toward cloud deployments. A cloud deployment creates a pool of resources that can be procured on demand and paid for per use. Cloud infrastructure can be public, private, or hybrid. Prominent defining characteristics of cloud computing are:
(a) On-demand self-service: Consumers can avail services from a cloud service provider as per their demand on a payment basis.
(b) Broad network access: Cloud computing offers a distributed physical hardware network through which services are accessed. Cloud services mostly depend on internet infrastructure and are ubiquitously available wherever an internet connection exists.
(c) Resource pooling: Computing resources in a cloud computing environment are dynamically divided and allocated on demand. For cost reasons, resource pooling is required to support a large number of physical and virtual servers.
(d) Rapid elasticity: Elasticity allows the computational resources of consumers to scale automatically as demand increases or decreases.
(e) Measured service: Both service providers and consumers get an easy-to-measure payment scheme. This is similar to services availed from an electric power supply corporation or a direct-to-home (D2H) television service. Because usage is metered, consumers pay for the resources they use on a utilization basis.
Benefits of cloud computing are:
(a) Cost reduction: Resource optimization is the main driver of cost reduction. Total expenses are much lower than in traditional computing because the billing model is pay-as-you-go.
(b) Increased scalability: Businesses can quickly scale their operation and storage needs up or down to suit the situation, which gives flexibility as needs change. The elasticity of cloud computing scales resources on demand through auto-scaling.
(c) Better performance: Cloud data centers use multi-core CPUs for heavy parallel processing tasks and are consistently updated with the latest high-performance technology.
3.1 Cloud Deployment Model
A cloud deployment model defines how cloud services are deployed, based on the location and management of the cloud infrastructure. There are four deployment models:
(a) Public cloud: This model is open to all, and minimal infrastructure and hardware setup is required to provide cloud services. Consumers access these services over the internet from a cloud service provider.
(b) Private cloud: A private cloud is used when the cloud infrastructure or data center is operated solely for one organization to serve its customers. It focuses on data privacy and data security, though it incurs maintenance costs.
(c) Community cloud: A community cloud gives a specific set of users access with minimal investment. This model is managed and hosted internally or by a third-party vendor.
(d) Hybrid cloud: A hybrid cloud combines multiple clouds (private, public, or community) that work together as a unit. An enterprise can keep its critical data on a private cloud, while less sensitive data is stored on a public cloud.
3.2 Cloud Services
A cloud computing framework provides the following services:
(a) Infrastructure as a service (IaaS): IaaS means hardware as a service that can be hired from a cloud service provider as per consumer demand. These services are highly scalable, dynamic, and flexible. IaaS provides access to physical machines, virtual machines, and virtual storage.
(b) Platform as a service (PaaS): PaaS provides a software execution environment to develop, test, run, and manage applications. It builds on virtualization technology and integrates with web services and database technology. It also provides an auto-scale feature.
(c) Software as a service (SaaS): SaaS is software hosted on a remote server and managed from a central location over the internet. In a cloud computing environment, SaaS applications are developed and deployed on the PaaS or IaaS layer based on consumer demand.
3.3 Faults, Failure, and Error Under Cloud Deployment
A fault is the inability of a system to perform at the desired efficiency because of an unstable state or errors in one or more parts of the framework. Faults are categorized by time and by the resources involved. Faults in cloud computing can be of different types:
(a) Network faults: Network partition, packet loss, congestion, packet corruption, and link failure are the main causes of network faults.
(b) Physical faults: These are mainly hardware faults. Hardware components such as memory, processor, and power unit can cause them.
(c) Media faults: Read/write head failure causes this type of fault.
(d) Process faults: A process may not perform properly if the required resources are not readily available. Bugs in the program are another cause of this fault.
(e) Service expiry faults: These occur when service time exceeds the deadline of an application.
A failure is a related event that may appear in the system as a result of a fault. A failure occurs when the system deviates from its expected normal execution and fails to satisfy customer expectations. Cloud environments witness multiple types of failures, which are classified in Fig. 1. The transition of a system unit to an error state because of a fault is known as an error. Such errors lead to partial or complete failure of the system. A distributed system may exhibit many patterns of errors (see Fig. 2).
Fig. 1 Types of failure (labels: Omissions, Timing, Response, Software, Hardware, Network)
Fig. 2 Various types of errors (categories: Software, Network, Miscellaneous; labels: Memory Leak, Numerical Exception, Packet Corruption, Packet Loss, Network Congestion, Permanent Errors, Intermittent Errors, Transient Errors)
3.4 Fault Management Basics
There are three major faults in the cloud deployment system. These faults create serious problems for the server and can make the cloud deployment dysfunctional. In general, fault-tolerant approaches help the system continue its task even after a fault occurs. Fault tolerance is needed in cloud computing for better performance. Some major fault-tolerant approaches are:
(a) Reactive fault tolerance: An approach used in a closed system to minimize failures.
(b) Proactive fault tolerance: An approach that anticipates the fault and replaces the susceptible component with a running component.
Parameters used for fault tolerance in cloud computing
In cloud computing, a fault-tolerant approach is evaluated on various parameters to judge the effectiveness of the cloud system. These parameters are listed in Table 1.
Table 1 Fault-tolerant parameters
Parameter | Details
Adaptive | The process adapts automatically to the prevailing conditions
Performance | Describes system efficiency
Response time | The actual time consumed by a specific algorithm
Throughput | The total count of tasks completed successfully
Reliability | The ability to provide correct results within a certain period
Availability | The probability that the system is ready and functioning properly
Usability | A measure of how accessibly users can fulfill their target with efficiency, effectiveness, and satisfaction

4 Problem Definition
Faults in different cloud servers are considered in this work during scheduling. The problem has two requirements:
(a) Initially, a baseline method is needed to evenly distribute incoming resource requests across the active servers so that requests can be served within a reasonable time frame.
(b) Subsequently, each server must be checked frequently for faults. If a fault is detected at any server, all loads assigned to that server have to be redistributed to other servers so that the overall stability of the framework is sustained and a high task completion rate is maintained.
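The two requirements of the problem definition can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`assign_cyclic`, `redistribute`), the use of request count as the capacity measure, and the least-loaded redistribution rule are assumptions made purely for illustration.

```python
def assign_cyclic(requests, servers, capacity):
    """Assign requests to servers in cyclic order, skipping full servers.

    Assumption: capacity is the maximum number of requests per server.
    """
    load = {s: [] for s in servers}
    i = 0
    for req in requests:
        for _ in range(len(servers)):
            s = servers[i % len(servers)]
            i += 1
            if len(load[s]) < capacity:
                load[s].append(req)
                break
        else:
            raise RuntimeError(f"no capacity left for {req}")
    return load


def redistribute(load, faulty, capacity):
    """Move the requests of a faulty server to the least-loaded healthy ones.

    Returns the requests that could not be placed (failures in this sketch).
    """
    orphans = load.pop(faulty)
    if not load:                      # no healthy server remains
        return orphans
    unplaced = []
    for req in orphans:
        target = min(load, key=lambda s: len(load[s]))
        if len(load[target]) < capacity:
            load[target].append(req)
        else:
            unplaced.append(req)
    return unplaced


# Hypothetical example: 7 requests, 3 servers, threshold of 3 requests each.
load = assign_cyclic([f"r{k}" for k in range(7)], ["s0", "s1", "s2"], capacity=3)
unplaced = redistribute(load, faulty="s0", capacity=3)
```

In this made-up example, server `s0` holds three requests when it fails, the two healthy servers can absorb only one more request each, and the third request is reported as unplaced.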
5 Proposed Method
5.1 Round-Robin Scheduling Algorithm
Round robin is a preemptive algorithm designed mainly for time-sharing systems. A small unit of time, called the time quantum or time slice, is defined. The ready queue is treated as a circular FIFO queue in which arriving requests wait for allocation to a server. The scheduler goes around the ready queue, allocating a server to each resource request for a time interval of one quantum. The remaining ready requests wait in the queue for their turn. Each request is dispatched to the server until it completes its required burst time, and it executes according to the following two conditions:
(a) If the remaining burst time of the request is less than or equal to one time quantum, the request releases the CPU voluntarily, and the scheduler proceeds to the next request in the ready queue.
(b) If the remaining burst time of the request is greater than one time quantum, the request is preempted (context switched) and placed at the tail of the ready queue. The scheduler then selects the next request in the ready queue.
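The behavior described in conditions (a) and (b) can be sketched in a few lines of Python. The sketch assumes that all requests arrive at time 0 and that context switches cost nothing; the function name `round_robin` and the sample burst times are hypothetical.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Simulate round robin on one server; return completion time per request."""
    remaining = dict(burst_times)
    queue = deque(burst_times)        # FIFO ready queue in arrival order
    clock = 0
    completion = {}
    while queue:
        req = queue.popleft()
        run = min(quantum, remaining[req])
        clock += run
        remaining[req] -= run
        if remaining[req] == 0:
            completion[req] = clock   # condition (a): finished within its slice
        else:
            queue.append(req)         # condition (b): preempted, back to the tail
    return completion


result = round_robin({"r1": 5, "r2": 3, "r3": 1}, quantum=2)
print(result)  # {'r3': 5, 'r2': 8, 'r1': 9}
```

The shortest request finishes first even though it arrived last, which is the fairness property that makes round robin suitable as a baseline for even load sharing.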
5.2 Objectives of the Proposed Method
A variety of factors have to be considered when developing a scheduling routine, such as what types of servers are available and what the users need. The goals of the proposed method are:
(a) Fault reduction: Reduce the impact of failures on the cloud system after a fault has actually occurred.
(b) Maximizing throughput: Network traffic and incoming service requests are distributed across multiple servers on a selective basis, which ensures that no single server bears too much load. By spreading the work evenly across all available servers, the proposed model enhances server responsiveness and throughput.
(c) Enforcement of priorities: The scheduling mechanism favors higher-priority processes.
5.3 Round-Robin Scheduling Algorithm with Parallel Checkpoints
Preliminary Assumptions
(a) There are multiple resource requests; each is associated with a burst time and a deadline.
(b) The default priority of each job is regular, denoted by 0.
(c) There are multiple servers, each with a maximum threshold capacity for resource management.
(d) Any server may have downtime, during which it does not process any resource requests.
Proposed Algorithm
Step 1. Allocate all ready resource requests to the servers in cyclic order.
Step 2. Once a resource request appears, the capacity of a server is checked, and the request is assigned to that server sequentially. If the capacity would be exceeded, the request is assigned to the next server.
Step 3. At each time cycle, the resource requests are executed in round-robin fashion on their respective servers.
Step 4. The projected completion time (PCT) of each resource request being processed on every server is calculated and stored.
Step 5. A parallel checkpoint mechanism checks the availability of each server at every time cycle. If the server is available, the resource request is assigned.
Step 6. A checkpoint-based progress monitoring mechanism runs at each point of time and checks whether the PCT of a resource request lies within its deadline.
Step 7. If the instantaneous PCT of a resource request fails to meet its deadline, the request is preempted from its server.
Step 8. When a resource request is preempted from its server, the PCT of the other resource requests on that server changes and is therefore recalculated.
Step 9. The other servers are checked to see whether any of them can accept the preempted resource request.
Step 10. Allocation to a new server is based on the following conditions:
a. The capacity of the server is within range.
b. The server is available at that moment.
c. Inserting the preempted process does not push the PCT of any resource request already on that server beyond its deadline.
d. If the allocation of the preempted process is successful, its priority is scaled up to high priority (1).
Step 11. The PCT of each resource request on that server is computed, including that of the newly arrived resource request.
Step 12. A job with priority 1 gets continuous server access in consecutive time instances until its PCT drops below its deadline.
Step 13. Once the PCT of a high-priority resource request falls below its deadline, the request is scaled down to normal priority (0) and follows the regular round-robin allocation method.
Step 14. Usually, a server can hold only one high-priority process at a time. If all servers fail to accommodate the preempted job, it is considered a failure.
Step 15. At any point of execution, if a server is down, the jobs on that server are delayed, and the PCT is recalculated for each of its processes to evaluate the new completion time.
Projected Completion Time
Before a resource request is allocated to a server, the projected completion time of each resource request on that server, including the newly arrived one, is computed from the existing allocated requests, their remaining burst times, their priority values, and the round-robin allocation policy. The PCT of a request is the sum of the total time spent up to the initiation of its execution and its remaining burst time. If the projected completion time of the newly arrived resource request remains below its deadline, the new request can be assigned to that server.
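The PCT computation and the admission conditions of Step 10 can be sketched as follows. This is one possible reading, not the authors' code: the sketch estimates PCT by simulating round robin forward from the current cycle, takes capacity to be a request count, uses a quantum of one cycle, and simplifies Step 12 by letting a high-priority request run to completion. The names `Request`, `projected_completion`, and `can_admit` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    remaining: int      # remaining burst time, in cycles
    deadline: int       # absolute deadline, in cycles
    priority: int = 0   # 0 = regular, 1 = high


def projected_completion(requests, now=0, quantum=1):
    """Project completion times by simulating round robin forward from `now`.

    Simplification: high-priority requests run first and to completion.
    """
    pct = {}
    clock = now
    for r in requests:
        if r.priority == 1:
            clock += r.remaining
            pct[r.name] = clock
    left = {r.name: r.remaining for r in requests if r.priority == 0}
    queue = deque(left)
    while queue:
        name = queue.popleft()
        run = min(quantum, left[name])
        clock += run
        left[name] -= run
        if left[name] == 0:
            pct[name] = clock
        else:
            queue.append(name)
    return pct


def can_admit(server_requests, new_request, capacity, available=True, now=0):
    """Step 10 conditions a-c: capacity, availability, no missed deadline."""
    if not available or len(server_requests) >= capacity:
        return False
    candidate = server_requests + [new_request]
    pct = projected_completion(candidate, now)
    return all(pct[r.name] <= r.deadline for r in candidate)


# Hypothetical server state and a preempted request raised to priority 1.
server = [Request("a", 3, 10), Request("b", 2, 6)]
preempted = Request("c", 2, 5, priority=1)
print(projected_completion(server))             # {'b': 4, 'a': 5}
print(can_admit(server, preempted, capacity=3)) # True
```

In this made-up case, admitting `c` delays `b` to cycle 6, which still meets its deadline of 6, so the request is accepted; a tighter deadline on `b` or a capacity of 2 would cause rejection and a check of the next server.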
6 Experimental Results
The proposed method has been tested on a standard dataset acquired from [11], which is given in Table 2. The dataset contains five scenarios. Each tuple gives the total number of resource requests (#Resource_Requests) to be managed by the servers available in the framework (#Servers). The cumulative burst requirement to be served by the servers is given under #Cumulative_burst. #Faulty_sites denotes the number of servers that encounter faults midway. The proposed method has been executed on the benchmark dataset of Table 2 in the CloudSim framework [2].

Table 2 Details of benchmark dataset
Set | #Resource_Requests | #Servers | #Cumulative_burst | #Faulty_sites
A | 89 | 7 | 803 | 2
B | 77 | 7 | 467 | 1
C | 105 | 9 | 987 | 2
D | 65 | 7 | 711 | 1
E | 212 | 9 | 1921 | 2

Experimentation has mainly focused on three aspects:
(a) Average completion time: The average completion time of all successful resource requests.
(b) Overall completion time: The completion time of the last request to finish among all successful resource requests.
(c) Success percentage: The percentage of resource requests that complete successfully among all resource requests (#Resource_Requests).
Here, time means the number of processing cycles used by the resource requests. The results are shown in Table 3.

Table 3 Performance of proposed method on benchmark dataset
Set | Average completion time | Overall completion time | Success (%)
A | 13 | 17 | 100
B | 11 | 13 | 100
C | 14 | 21 | 100
D | 15 | 19 | 100
E | 13 | 23 | 98

The results of Table 3 have been further compared with the models reported in [11, 12]. The comparison shows that the proposed method achieves a higher success percentage than both [11, 12] on every set. The comparison is limited to success percentage, because the other two parameters are beyond the scope of the models proposed in [11, 12]. Comparative performance is given in Table 4, and a graphical representation of Table 4 is given in Fig. 3.

Table 4 Comparative performance of proposed method with [11, 12] (success %)
Set | Proposed method | [11] | [12]
A | 100 | 88 | 86
B | 100 | 82 | 79
C | 100 | 89 | 84
D | 100 | 85 | 73
E | 98 | 72 | 65
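The three evaluation measures are simple to compute once the completion cycle of each successful request is known. The following Python sketch shows one way to do it; the function name `summarize` and the sample numbers are hypothetical and are not taken from the benchmark.

```python
def summarize(completion_times, total_requests):
    """Return (average completion time, overall completion time, success %).

    completion_times: cycles at which each successful request finished.
    total_requests:   number of requests submitted, successful or not.
    """
    done = list(completion_times)
    average = sum(done) / len(done) if done else 0.0
    overall = max(done, default=0)
    success = 100.0 * len(done) / total_requests
    return average, overall, success


# Made-up run: four requests submitted, three finished at cycles 4, 6, and 8.
metrics = summarize([4, 6, 8], total_requests=4)
print(metrics)  # (6.0, 8, 75.0)
```

Failed requests affect only the success percentage, since both completion-time measures are defined over successful requests.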
Fig. 3 Comparative performance as per Table 4 (success percentage for sets A to E: proposed method, [11], and [12])

7 Conclusion
Cloud computing provides various benefits to users through different cloud services. Different faults can occur when a cloud service provider (CSP) lends services, and multiple faults in a cloud computing environment can cause a major setback to the system framework. A fault-tolerant system handles faults in a way that minimizes failure. Existing fault-tolerant techniques mainly focus on preventing system failures caused by hardware or software breakdown. The proposed checkpoint-based round-robin algorithm provides a fault-tolerant system by avoiding failure of the server framework. The proposed method identifies faults and generates a schedule that guides modification of the allocation across servers, so that execution of resource requests does not suffer from failure of the framework. The proposed method has been tested on five benchmarks. Its success percentage was found to be higher than that of two other recently published methods [11, 12].
References
1. Zhao, Y., Fei, X.: Opportunities and challenges in running scientific workflows on the cloud. In: 2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, pp. 455–462 (2011)
2. Wang, Y.: CLOSURE: a cloud scientific workflow scheduling algorithm based on attack-defense game model. Future Gener. Comput. Syst. 111, 460–474 (2020)
3. Rodriguez, M.: Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds. IEEE Trans. Cloud Comput. 2(2), 222–235 (2014)
4. Li, Z.: Cost and energy aware scheduling algorithm for scientific workflows with deadline constraint in clouds. IEEE Trans. Serv. Comput. 11(4), 713–726 (2018)
5. Tong, Z.: A scheduling scheme in the cloud computing environment using deep Q-learning. Inf. Sci. 512, 1170–1191 (2020)
6. Zhang, L.: Efficient scientific workflow scheduling for deadline-constrained parallel tasks in cloud computing environments. Inf. Sci. 531, 31–46 (2020)
7. Durillo, J., Prodan, R.: MOHEFT: a multi-objective list-based method for workflow scheduling. In: International Conference on Cloud Computing Technology and Science Proceedings, pp. 185–192 (2012)
8. Zhu, Z.: Evolutionary multi-objective workflow scheduling in cloud. IEEE Trans. Parallel Distrib. Syst. 27(5), 1344–1357 (2016)
9. Kalra, M.: Multi-objective energy aware scheduling of deadline constrained workflows in clouds using hybrid approach. Wirel. Pers. Commun. 116(3), 1743–1764 (2021)
10. Pandey, P.: Robust orchestration of concurrent application workflows in mobile device clouds. J. Parallel Distrib. Comput. 120, 101–114 (2018)
11. Marahatta, A.: Energy-aware fault-tolerant dynamic task scheduling scheme for virtualized cloud data centers. Mobile Netw. Appl. 24(3), 1063–1077 (2019)
12. Ding, Y.: Fault-tolerant elastic scheduling algorithm for workflow in cloud systems. Inf. Sci. 393, 47–65 (2017)
13. Fan, G.: Modeling and analyzing dynamic fault-tolerant strategy for deadline constrained task scheduling in cloud computing. IEEE Trans. Syst. Man Cybern. Syst. 50(4), 1260–1274 (2020)
14. Olteanu, A.: A dynamic rescheduling algorithm for resource management in large scale dependable distributed systems. Comput. Math. Appl. 63(9), 1409–1423 (2012)
15. Chen, W.: Adaptive multiple-workflow scheduling with task rearrangement. J. Supercomput. 71(4), 1297–1317 (2015)
Cost Optimized Community-Based Influence Maximization
Mithun Roy, Subhamita Mukherjee, and Indrajit Pan
Abstract Effective identification of a small set of nodes that can potentially cover a large number of nodes in the rest of the network is known as the influence spread process. Spreading influence among the maximum number of nodes is called the influence maximization process. Influence maximization is computationally hard; it involves selecting a promising seed set and estimating the maximum influence spread throughout the network. The key contribution of this article is a community detection algorithm that finds an effective seed set for influence maximization within an acceptable execution time. The proposed community-based identification method involves three stages. The first stage detects communities in the given network, the second stage analyzes community structure to select candidate nodes within the communities, and the third stage identifies promising influential members from the candidate set to form a target set. Finally, average influence spread is measured through Monte Carlo simulation. The proposed algorithm has been rigorously tested on two real-world social network datasets to establish its usefulness and efficacy.
Keywords Clique proximity · Influence spread · Modularity · Overlapped community · Social network
M. Roy, Siliguri Institute of Technology, Siliguri, West Bengal 734009, India
S. Mukherjee, Techno Main Saltlake, Kolkata 700091, West Bengal, India
I. Pan (B), RCC Institute of Technology, Kolkata 700015, West Bengal, India, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_11

1 Introduction
Nowadays, social media-based connectivity amongst people around the world is common. These connections can be depicted as a network diagram, that is, a graph data structure. Researchers routinely use data from popular networks like Twitter, Facebook, and many others for experimentation. Domingos and Richardson [1, 2] used a digital platform to illustrate the impact of influence on viral marketing. They explained the process flow of influence and proposed an optimization model. Two diffusion propagation models, the independent cascade (IC) model and the linear threshold (LT) model, were introduced in [3]. These two models can calculate the influence spread across a network. Monte Carlo simulation was applied to estimate the overall spread of influence: starting from a seed set S, random diffusion under the selected diffusion model is simulated R times, each iteration counts the number of active nodes, and finally an average spread is calculated. Most influence maximization algorithms use greedy methods to discover the influential seed set. Lazy evaluation techniques can be cost-effective when the outcome of the greedy algorithm does not change rapidly.
The model proposed in this work identifies the most promising (significant) clusters within a social network through a community detection algorithm. Significant groups provide useful information for later stages of data analysis, and these data often help to achieve specific objectives. Consider an e-commerce site that analyzes customer feedback and accordingly offers membership advantages to its frequent purchasers. On a social media platform, a person can belong to multiple communities, and the association with each community brings different benefits. At times, one person can emerge as an influential member of a group. If a single entity is a representative of more than one community, then these communities are said to overlap. A new influence maximization technique based on community-based measures is proposed here. This method is cost-effective in that it produces good results without requiring a lot of space.
The remainder of the article is organized as follows. The next section reviews published research, Sections 3 and 4 present the proposed method and the experimental records, and Section 5 discusses the conclusion and the future scope of research.
2 Literature Survey
Many research articles in the literature address the influence maximization problem. These articles can be grouped into four categories:
(i) Sub-modularity-based techniques
(ii) Centrality-based heuristic methods
(iii) Influence path-based approaches
(iv) Community-based techniques.
Here, the review focuses on the fourth category of algorithms. Community-based algorithms follow a three-phase method:
(a) The first phase classifies communities across the social network.
(b) The second phase identifies candidate members on the basis of the structure of the communities.
(c) The final phase detects target nodes from the previously selected candidate nodes.
2.1 Clique Percolation Method (CPM)
The CPM [4] algorithm builds communities from the connectivity of cliques. A k-clique percolation algorithm is used to discover sub-graphs autonomously. Radicchi et al. [5] had earlier used a similar technique. The method first identifies the largest clique in a graph (network) and then percolates k-cliques over the network.
2.2 Network Transformation Algorithms
The Girvan–Newman (GN) algorithm identifies communities using edge betweenness centrality [6]: edges with the highest betweenness centrality are deleted from the network to reveal disjoint communities. The cluster-overlap Newman–Girvan algorithm (CONGA) proposed in [7] additionally selects the vertices with maximum betweenness centrality and splits them, so that a vertex can belong to more than one community. To split a node, the grouping of its neighbours must be considered. In CONGO [8], Gregory speeds up this technique by using a separate betweenness centrality measure that is cheaper to compute.
2.3 Link Partition Optimizing
The Girvan–Newman (GN) technique [6] follows a divisive hierarchical clustering approach. It finds the edges with the maximum number of shortest paths passing through them and removes them until no edge is left. This is computationally very expensive because it computes all-pairs shortest paths in the graph. An average goodness parameter has been introduced to detect the best communities on the basis of the maximum cluster value. The Louvain method [9] identifies disjoint communities using modularity maximization. In this method, the network is divided into clusters so as to maximize the average modularity value. The algorithm first detects small clusters in which the modularity value is maximized locally. These small clusters are then combined into larger clusters. The Louvain method uses greedy heuristic optimization to cope with the NP-hardness of the modularity maximization problem. Clauset et al. [10] devised another method for modularity optimization in disjoint community detection.
2.4 Clique Merged Technique
The clique merged approach combines the ideas of modularity maximization and cliques in two phases. In the first phase, a set of cliques is created from the given graph. In the second phase, these cliques are merged through a greedy hill climbing mechanism that chooses merges on the basis of maximum cluster modularity [11].
2.5 Community-Oriented Framework
The community-oriented approach to influence maximization [12] treats influence propagation as a two-phase process consisting of seed expansion and intra-community expansion. In the preliminary stage, seed nodes are expanded across various communities; at the end, the influence propagated within the communities is found to be independent from one community to another.
A second work addresses attributed networks [13] and consists of three phases: community detection, candidate community generation, and seed node selection. It generates candidate communities based on information about community structure and node attributes. A prediction of the influencing capabilities between nodes is also taken into consideration, and similarity in topological structure helps in this prediction. Ultimately, a seed set is formed to achieve influence gain.
For location-based social networks (LBSN) [14], the community-based influence maximization model considers both the structure of the community and the spatiotemporal behaviour of users. The influence spread within an LBSN is maximized by developing two community-based algorithms. The first detects communities on the basis of the mobility of users, and the second selects the most influential individuals according to those communities. By analyzing the historical check-in data of users, the authors calculate the similarity between users and design a distance-based weighted algorithm to detect communities. The second phase selects candidates on the basis of local network structure and calculates the precise spread of influence by communities for each candidate.
2.6 Community-Based Approach Towards Identification of Most Influential Nodes
The work in [15] improves efficiency in terms of execution time while maintaining the accuracy of the final influence spread. The gain is obtained by reducing the number of candidate nodes considered when identifying the most potential information spreaders. The authors also address non-submodular information diffusion models. The method follows three steps:
• Classify communities on the social network
• Identify candidate nodes from the community structure
• Derive target nodes from the candidate nodes.
The above discussion of existing works reveals that community detection algorithms have not been widely used for the influence maximization problem. This article proposes one such mechanism.
3 Proposed Method
Our proposed method works in three phases. The first phase detects the overlapping communities and creates a candidate set ($\bar{V}_S$). The second phase sorts all the nodes of the candidate set by their influence on the network and selects the top $k$ nodes for the target set ($S$). The third phase uses Monte Carlo simulation to calculate the total influence spread ($\sigma$) over the original network using the target set.
3.1 Disjoint and Overlapping Community
Let $G := (V, E)$ represent a social network that can be segregated into $k$ partitions, where each partition has several nodes. A cover $\Gamma$ contains all the groups, so $\Gamma = \{\alpha_1, \alpha_2, \ldots, \alpha_k\}$, where $\alpha_1 = \{v_{11}, v_{12}, \ldots, v_{1k}\}$, $\alpha_2 = \{v_{21}, v_{22}, \ldots, v_{2k}\}$, $\ldots$, $\alpha_k = \{v_{k1}, v_{k2}, \ldots, v_{kk}\}$.
• Disjoint community is represented by
$$D = \bigcup_{\forall i} (\alpha_i) = V \qquad (1)$$
• Overlapping community is represented by
$$O = \bigcap_{\forall i} (\alpha_i) \neq \emptyset \qquad (2)$$
3.2 Clique Proximity
In an overlapping community detection algorithm, a node may appear in multiple groups. Clique proximity [16] is a metric that is used to transform the original network. The total count of vertices and their connections in the transformed network increases when some nodes satisfy the proper proximity condition. In this method, a subgraph $\bar{G}$ is created from the neighbour nodes of $u$ in the graph $G$. A 3-clique percolation is used to decide which nodes participate in multiple groups. Two types of clique proximity may be formed:
• The clique proximity of $u$ is termed Null Proximation when an empty division set ($D = \phi$) is found.
• The clique proximity of $u$ is termed Proper Proximation when there is more than one division:
$$\bigcap_{\forall i} (D_i) = \phi \qquad (3)$$
3.3 Disjoint Community Detection Algorithm
The Louvain technique has been used to identify disjoint communities in the transformed network $\bar{G}$. Because the technique is unsupervised, it does not require the user to supply the number of clusters. There are two phases involved in the process:
• Modularity Optimization: All nodes of the network are randomly sequenced for modularity maximization [17]. The modularity $Q$ is represented as
$$Q = \frac{1}{4m} \sum_{xy} \left[ A_{xy} - \frac{k_x k_y}{2m} \right] (c_x c_y + 1) \qquad (4)$$
Here $A$ denotes the weighted adjacency matrix of the graph, $k_x$ and $k_y$ are the degrees of nodes $v_x$ and $v_y$, and $c_x$ and $c_y$ represent the clusters of $v_x$ and $v_y$, respectively. The total number of edges in the network is $m = \frac{1}{2} \sum_{x} k_x$.
• Community Aggregation: Each small community found in the previous phase is merged into a single node. The links within a community are summed before the community is collapsed into a single node.
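To make the modularity expression concrete, the following is a minimal sketch in plain Python. It assumes the two-group form of Newman's modularity [17], in which the cluster indicators $c_x$ take the values +1 or −1; the function name `modularity_two_groups` and the dictionary-based graph representation are illustrative choices, not part of the authors' implementation.

```python
def modularity_two_groups(adj, labels):
    """Modularity of a split into two groups.

    adj    : dict mapping node -> dict of neighbour -> edge weight (undirected)
    labels : dict mapping node -> +1 or -1 (assumed encoding of c_x)
    """
    nodes = list(adj)
    # k_x: (weighted) degree of each node
    degree = {x: sum(adj[x].values()) for x in nodes}
    # m: total edge weight, i.e. half the sum of all degrees
    m = sum(degree.values()) / 2.0
    total = 0.0
    for x in nodes:
        for y in nodes:
            a_xy = adj[x].get(y, 0.0)
            expected = degree[x] * degree[y] / (2.0 * m)
            # (c_x c_y + 1) is 2 for nodes in the same group and 0 otherwise
            total += (a_xy - expected) * (labels[x] * labels[y] + 1)
    return total / (4.0 * m)


# Hypothetical example: two triangles joined by a single edge (2-3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
graph = {n: {} for n in range(6)}
for u, v in edges:
    graph[u][v] = 1.0
    graph[v][u] = 1.0

split = {0: 1, 1: 1, 2: 1, 3: -1, 4: -1, 5: -1}
q = modularity_two_groups(graph, split)
```

Placing every node in the same group gives a modularity of zero, which is a quick sanity check on any implementation of this formula.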
3.4 Influential Seed Set Identification
The influential seed set is identified from the candidate set ($\bar{V}_S$), which contains all the overlapping nodes. All nodes in the candidate set are ranked on the basis of their influence in the network ($\delta$), and the top $k$ nodes are selected for the influential seed set ($S$).
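The selection step can be sketched as follows. This is only an illustration under stated assumptions: node degree stands in for the influence score $\delta$ (the text does not fix the measure at this point), communities are given as a list of node sets, and the names `overlapping_nodes` and `top_k_seeds` are hypothetical.

```python
from collections import Counter


def overlapping_nodes(communities):
    """Return the nodes that appear in more than one community."""
    counts = Counter(v for community in communities for v in set(community))
    return {v for v, n in counts.items() if n > 1}


def top_k_seeds(adj, communities, k):
    """Rank overlapping nodes by degree (assumed influence score) and keep k.

    adj : dict mapping node -> set of neighbours in the original network
    """
    candidates = overlapping_nodes(communities)
    # Sort by descending degree; the node id breaks ties deterministically.
    ranked = sorted(candidates, key=lambda v: (-len(adj[v]), v))
    return ranked[:k]


# Hypothetical toy network with three overlapping communities.
adj = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 4, 5},
    4: {2, 3, 5},
    5: {3, 4},
}
communities = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
seeds = top_k_seeds(adj, communities, k=2)
```

Any other centrality could be substituted for degree by changing the sort key.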
3.5 Monte Carlo Simulation Method
The Monte Carlo simulation technique is used to estimate the overall influence spread. Diffusion from a seed set $S$ is simulated $R$ times, and the count of active nodes in each iteration is averaged to obtain the average spread.
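A minimal sketch of this estimate under the linear threshold model is given below. It is not the authors' code: edge weights are passed in rather than drawn at random, thresholds are redrawn uniformly in each run, and the function names and graph representation are assumptions made for illustration.

```python
import random


def simulate_lt_once(in_weights, seeds, rng):
    """Run one linear threshold diffusion and return the number of active nodes.

    in_weights : dict mapping node v -> dict of in-neighbour u -> weight of (u, v)
    seeds      : iterable of initially active nodes
    """
    # Thresholds are drawn from (0, 1], so zero incoming weight never activates a node.
    thresholds = {v: 1.0 - rng.random() for v in in_weights}
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, sources in in_weights.items():
            if v in active:
                continue
            incoming = sum(w for u, w in sources.items() if u in active)
            if incoming >= thresholds[v]:
                active.add(v)
                changed = True
    return len(active)


def average_spread(in_weights, seeds, runs=1000, seed=0):
    """Average the number of active nodes over `runs` independent simulations."""
    rng = random.Random(seed)
    total = sum(simulate_lt_once(in_weights, seeds, rng) for _ in range(runs))
    return total / runs


# Hypothetical chain a -> b -> c with full-weight edges.
chain = {"a": {}, "b": {"a": 1.0}, "c": {"b": 1.0}}
spread = average_spread(chain, {"a"}, runs=50)
```

Fixing the random seed makes repeated estimates reproducible, which helps when comparing seed sets of different sizes.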
3.6 Proposed Algorithm
The cost-effective community-based influence maximization (Cbim) algorithm is based on the linear threshold (LT) diffusion model. The proposed algorithm is given as Algorithm 1. Initially, the original network $G$ is transformed using the clique proximity technique, and the transformed graph is denoted by $\bar{G}$. $V_S$ contains only the split nodes after the transformation of the graph. In line 4, the Louvain method is applied to the transformed graph $\bar{G}$ to detect the disjoint communities. After the classification, all the communities are stored in $\bar{C}$. In line 5, the post-process method identifies which nodes are overlapped, and these nodes are kept in a candidate set ($\bar{V}_S$). The candidate nodes are then ranked by their influence ability in the original network ($G$), and the top $k$ influential nodes are selected for the seed set. In lines 9–24, the propagation probability ($\xi_{(u,v)}$) is randomly assigned to all the edges, and the threshold value ($v_{\beta_i}$) is also assigned randomly to all the vertices in the original network $G$. The Monte Carlo simulation is iterated $R$ times to calculate the average influence spread ($\sigma$).
Algorithm 1 Community-Based Influence Maximization Algorithm
1: procedure Cbim($G$)
2:   $\bar{G}$ = Transform($G$)
3:   $V_S = \bar{G}.V_{Split}$
4:   $\bar{C}$ = LouvainMethod($\bar{G}$)
5:   $C$ = PostProcess($\bar{C}$, $V_S$)
6:   $\bar{V}_S = C.V_S$        ▷ Consider overlapping nodes
7:   $T = \bar{V}_S$        ▷ Select top $k$ influential nodes
8:   $I_{Max} = 1$
9:   while $I_{Max} \le R$ do
10:     Set the propagation probability $\xi_{(u,v)}$ to all the edges.
11:     Set the threshold value $v_{\beta_i}$ to all the nodes randomly.
12:     $w(u) = 0 \mid u \in V$        ▷ $V$ is the set of vertices of $G$
13:     while $T \neq \emptyset$ do
14:       $u = \alpha_i \mid \forall \alpha_i \in T$
15:       for each $v \in$ adjacent($u$) do
16:         $w[v] = w[v] + \xi_{(u,v)}$
17:         if $w[v] \ge v_{\beta_i}$ then
18:           $T = T \cup \{v\}$
19:         end if
20:       end for
21:     end while
22:     $I_{Max} = I_{Max} + 1$
23:   end while
24:   $\sigma(S) = |T| / R$
25:   return $\sigma(S)$
26: end procedure

3.7 Time Complexity
In the proposed algorithm, lines 2–6 compute the overlapping nodes. This code segment takes $O(|V||E|)$ time. It uses three major modules with different time complexities: the Transform() module transforms the graph in $O(|V||E|)$ time, the LouvainMethod() module detects the communities in $O(|V| \log |V|)$ time, and the PostProcess() module renames them in $O(|V|)$ time. The top $k$ nodes in line 7 are selected in $O(v \log v)$ time. Using these $k$ nodes, lines 9–24 calculate the average spread throughout the network. This calculation takes $O(R|v||e|)$ time, where $R$ denotes the maximum number of iterations, $v \subseteq V$ and $e \subseteq E$. Therefore, in the worst case, the overall time complexity is $O(|V||E|) + O(|V| \log |V|) + O(|V|) + O(|E|) + O(|V|) + O(R|v||e|) \le O(|V||E|)$. The proposed algorithm takes $O(|V|^3)$ time for dense networks and $O(|V||E|)$ time for sparse networks.
4 Experimental Result
The proposed algorithm has been tested on two real-world benchmark social network datasets (Table 1) [13, 14]. The tests were performed on a machine with 64 GB of RAM and an Intel Core i7 processor, running CentOS Enterprise version 6.0. The implementation used the Python programming language (version 3.8), with packages such as NetworkX 2.0 and NumPy 1.8 for graph and numerical calculations. A propagation probability between 0 and 1 was assigned randomly to every edge. Calculating the total spread for all the influential nodes in the network is a huge overhead. Hence, a cost-effective approach has been used that improves the solution gradually. Only nodes that actively participate in more than one group are considered, since they have some importance. This set might still be large, so the top $k$ important nodes are selected according to a centrality measure, within the given budget. Using this budget, our algorithm has reached the most promising part of the network. In most cases, our algorithm performed better than the other state-of-the-art algorithms (Figs. 1 and 2).

Table 1 Benchmark dataset
Particulars of network                           Vertices   Edges
NETHEPT (high energy physics theory)             15,233     58,891
GRQC (general relativity and quantum cosmology)  5242       14,496

Fig. 1 Influence spread on NETHEPT dataset
Fig. 2 Influence spread on GRQC dataset
Table 2 Experimental record on NETHEPT
Algorithms      Seed set size (|S|)
                5      10     15     20     25     30     35     40     45     50
CBIM            19.0   41.0   51.0   69.0   82.0   91.0   102.0  115.0  117.0  125.0
NA [3]          15.1   29.7   41.4   53.2   65.1   73.5   95.7   94.6   105.4  114.7
GNA [21]        14.2   31.4   45.2   57.8   72.3   85.1   86.2   103.9  112.0  120.6
PSO [18, 19]    18.7   39.0   52.0   65.0   76.0   87.0   96.0   105.0  111.0  117.0
LAPSO [19, 20]  19.2   38.0   52.0   66.0   77.7   93.0   103.4  113.0  120.2  128.0
SA [22]         10.6   17.1   24.0   29.8   37.1   43.1   49.7   56.1   60.9   68.5
Table 3 Influences for GRQC dataset
Algorithms      Seed set size (|S|)
                5      10     15     20     25     30     35     40     45     50
CBIM            24.0   31.0   45.0   64.0   73.0   84.0   94.0   106.0  117.0  144.0
NA [3]          15.2   27.6   38.2   49.3   60.3   68.2   78.4   85.0   94.4   105.6
GNA [21]        20.2   4.6    48.3   60.2   73.6   83.4   93.1   106.0  113.6  127.8
PSO [18, 19]    13.0   18.0   26.0   34.0   42.0   50.0   53.0   57.0   61.0   65.0
LAPSO [19, 20]  16.0   19.0   26.0   34.0   42.0   50.0   54.0   59.0   63.0   37.0
SA [22]         11.5   19.0   24.8   29.7   37.0   41.2   48.4   55.3   59.4   68.7
Two benchmark datasets were used to test the proposed Cbim algorithm. The experimental results show that the proposed method outperforms the other approaches for most seed set sizes and reaches the most promising part of the network compared to [3, 18–22]. Detailed findings are given in Tables 2 and 3.
5 Conclusion
Influence maximization is widely used in a variety of campaigns, disease outbreak detection, sensor networks, and digital marketing. The proposed work uses a community-based influence maximization method to gain more influence spread in comparison with other existing work in this domain. The key achievement of the proposed concept is that a smaller active seed set achieves maximum influence spread. The proposed community-based influence maximization (Cbim) algorithm has achieved maximum network coverage for influence spread with a small seed size. We have also checked the computation time of the proposed method and found that it performs well on large networks with acceptable computation time. Community detection is another promising research area, and there can be many innovative approaches to identifying overlapping communities. Researchers can work on different community detection techniques for identifying overlapping communities, which may prove more effective in finding a seed set that maximizes reachability within any given network.
References
1. Domingos, P., Richardson, M.: Mining the network value of customers. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 57–66. ACM (2001)
2. Richardson, M., Domingos, P.: Mining knowledge-sharing sites for viral marketing. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 61–70. ACM (2002)
3. Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social network. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 137–146. ACM (2003)
4. Palla, G., et al.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814 (2005)
5. Radicchi, F., et al.: Defining and identifying communities in networks. Proc. Nat. Acad. Sci. 101(9), 2658–2663 (2004)
6. Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. Nat. Acad. Sci. 99(12), 7821–7826 (2002)
7. Gregory, S.: An algorithm to find overlapping community structure in networks. In: European Conference on Principles of Data Mining and Knowledge Discovery, pp. 91–102. Springer (2007)
8. Gregory, S.: A fast algorithm to find overlapping communities in networks. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 408–423. Springer (2008)
9. Blondel, V.D., et al.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 10, P10008 (2008)
10. Clauset, A., Newman, M.E., Moore, C.: Finding community structure in very large networks. Phys. Rev. E 70(6), 066111 (2004)
11. Yan, B., Gregory, S.: Detecting communities in networks by merging cliques. In: IEEE International Conference on Intelligent Computing and Intelligent Systems (ICIS) 2009, vol. 1, pp. 832–836. IEEE (2009)
12. Shang, J., et al.: CoFIM: a community-based framework for influence maximization on large-scale networks. Knowl. Based Syst. 117, 88–100 (2017)
13. Huang, H., Shen, H., Meng, Z.: Community-based influence maximization in attributed networks. Appl. Intell. 50(2), 354–364 (2020)
14. Chen, X., et al.: Community-based influence maximization in location-based social network. World Wide Web 24(6), 1903–1928 (2021)
15. Hosseini-Pozveh, M., Zamanifar, K., Naghsh-Nilchi, A.R.: A community-based approach to identify the most influential nodes in social networks. J. Inf. Sci. 43(2), 204–220 (2017)
16. Roy, M., Pan, I.: Overlapping community detection using clique proximity and modularity maximization. In: 2018 4th International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN). IEEE (2018)
17. Newman, M.E.: Modularity and community structure in networks. Proc. Nat. Acad. Sci. 103(23), 8577–8582 (2006)
18. Gong, M., et al.: Influence maximization in social networks based on discrete particle swarm optimization. Inf. Sci. 367, 600–614 (2016)
19. Singh, S.S., et al.: ACO-IM: maximizing influence in social networks using ant colony optimization. Soft Comput. 1–23 (2019)
20. Singh, S.S., et al.: LAPSO-IM: a learning-based influence maximization approach for social networks. Appl. Soft Comput. 82, 105554 (2019)
21. Tsai, C.W., Yang, Y.C., Chiang, M.C.: A genetic newgreedy algorithm for influence maximization in social network. In: 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2549–2554. IEEE (2015)
22. Jiang, Q., et al.: Simulated annealing based influence maximization in social networks. AAAI 11, 127–132 (2011)
A Novel Model for Automated Identification of Terrestrial Species
Pradei Sangkhro, Phidawanhun Pyngrope, Bangermayang, and Gypsy Nandi
Abstract Artificial intelligence (AI) and machine learning (ML) have slowly but steadily become integral to human lives. While much remains to be learned, they have already contributed to research that was beyond the comprehension of the human mind. This paper focuses on standard machine learning techniques used for terrestrial species identification from 3D images. A novel methodology is proposed for extracting the characteristics and structural properties of these species from images using convolutional neural networks (CNNs). This study mainly contributes to disseminating awareness and knowledge to research groups in similar fields of study. It has been observed that, despite the advancement of science and technology, people still cannot differentiate a buffalo from a bison. In addition, recent developments in machine learning are still new to the field of species studies. The emergence of AI and ML is slowly changing educational entities and industrial services, and rightly so. The proposed methodology can also act as a supplementary tool for restructuring traditional higher education, which is still prevalent in India.
Keywords Artificial intelligence · Machine learning · Species identification · Convolution neural networks · Transfer learning
P. Sangkhro (B) · P. Pyngrope · Bangermayang · G. Nandi, Department of Computer Applications, Assam Don Bosco University, Assam, India, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_12

1 Introduction
In a world surrounded by technology, one field that stands out is artificial intelligence (AI). According to TechTarget [1], artificial intelligence refers to the simulation of human intelligence processes by machines, such as computer systems. The possibilities for AI are vast, and the amount of data AI can consume and learn from is mammoth [2]. Letting AI assist humans in learning, research, and more is the way to the future. For example, top grandmasters use chess AI to assist in training and preparation [3]. The same could be done to improve the education system, for instance by providing personalized learning and identifying the course areas that require improvement based on machine learning algorithms [4].
Machine learning (ML) is a part of AI that learns from past data. It is used to make predictions, detect cancer, and build self-driving cars. This study uses ML to classify animals and display their features. Today, many people from cities might know what a deer is but not what it looks like. Interactions or encounters with wild animals can also be dangerous, especially if the animal is not recognized. For example, a slow loris is the size of a teddy bear and looks adorable, but it is quite dangerous, and its bite delivers fast-acting venom.
Rapid advancements in AI and ML should be taken advantage of to the fullest. What would have seemed a miracle a few years back can now be done quickly, and image classification is one example. Today, anyone with basic knowledge can develop fully functional image classification applications with the help of existing algorithms, such as support vector machines (SVM), convolutional neural networks (ConvNet/CNNs), and latent Dirichlet allocation (LDA). As few tools or applications on the market are focused on personal use, this study can help other researchers as well as individuals who want to learn more about animals and about AI. In this study, CNN is implemented together with a technique called transfer learning, without which the model would have to be trained from scratch every time the features or datasets change [5], to achieve a high-performance, high-accuracy image classification model.
CNN is a basic but powerful deep learning algorithm that takes an image as input and assigns importance to image features through learned weights, which are then used to differentiate or classify images [6]. Because the CNN architecture and its underlying concept are simple and easy to understand, CNN is a good choice for beginners to implement and use. The main emphasis is on the features the proposed methodology provides, so the ML model can be replaced with a better one, which makes the approach scalable. For better and more efficient results, more recent ML models are recommended. Transfer learning is a technique in which a model that has been pre-trained on a different dataset is reused. The learned features (the base of the model) are extracted, and only the classification part is trained to classify the inputs. This technique is preferred when little data is available.
2 Related Works
Chandrakar et al. [7] proposed detecting and identifying animals in wildlife sanctuaries using convolutional neural networks (CNNs). Their work demonstrates automatic extraction of data using CNNs in real time. A deep CNN is trained on a set of images in a wildlife dataset, with the purpose of keeping track of wildlife to protect it better. A total of 301,400 images were used as the dataset, of which 283,000 were used for training and the rest for testing. It achieved an accuracy level above 95%. The results of SVM, LBPH, LDA, and PCA were compared with CNN, and CNN had the best results, with an overall accuracy of 98% for 70% precision.
Shetty et al. [8] developed animal detection using deep learning algorithms to detect animals in the wild. The algorithm classifies animals based on their images, which in turn assists in monitoring them more efficiently. Animal detection and classification can help prevent animal collisions with vehicles, help track animals, and protect them by preventing smugglers from capturing them. The proposed system analyzes camera trap image sequences using the Iterative Embedded Graph Cut (IEGC) technique to create a small set of wild animal detections using deep convolutional neural networks. The study had a dataset of 20 species with about 100 images each. The proposed system produced an accuracy of around 91% and an F1-score of around 0.95; however, these figures can be misleading, and the system may not perform as well in practice, because it was tested on a very small dataset and edge cases might have been missed.
Chaudhari et al. [9] proposed animal identification using machine learning techniques to help researchers and wildlife protectors study animal behaviors and protect animals. The paper compares a proposed CNN with well-known algorithms for image recognition, feature extraction, and image classification (PCA, LDA, SVM, and LBPH). It concluded that CNN performed the best of all the classification algorithms.
Karlsson et al. [10] used a combination of radio frequency identification (RFID) and a wireless camera sensor network to classify and track animals in a zoo environment. The cameras configure themselves and are autonomous.
Animals are identified at close proximity by RFID deployed in strategic locations. The network of cameras continuously tracks animals that are in its view. A large dataset of about 5000 h of video spanning about a year has been captured, which can be used in different computer vision projects.
Saxena et al. [11] proposed an animal detection and collision avoidance system using deep learning to help prevent injuries and deaths of animals and humans in road accidents. The proposed method considers SSD and R-CNN to detect animals. A dataset containing 25 classes of animals with over 31,000 images was developed for training and testing. The model achieved an accuracy of 80.5% at 100 fps and 82.11% at 10 fps, evaluated using mean average precision (mAP) as the criterion.
3 Proposed Objectives and Methodology
The main objective of this research is to use AI and ML to help people understand more about animals, especially people from urban areas who live far from rural areas or wildlife habitats. The proposed methodology focuses on scalability in terms of both functionality and performance, with the current methodology as the base.
3.1 Approach Overview
As shown in Fig. 1, the datasets are first retrieved, prepared, and processed. Next, the neural network is built by configuring the layers and compiling the model. After the model is compiled, its accuracy is evaluated. Finally, the model is saved in HDF5 format and loaded from then on.

Fig. 1 Block diagram of the proposed methodology

For this study, a dataset was created manually, covering 19 animals (Cat, Dog, Whale, Human, Bear, Chicken, Cow, Crocodile, Elephant, Goose, Hippopotamus, Horse, Lion, Lioness, Pig, Rhino, Sheep, Tiger, and Rabbit). Each class has 60 images, of which 45 are used for training and 15 for testing. Because little data is available, this research uses a pre-trained model called VGG16, downloaded from the Keras library, which has been trained on the ImageNet dataset. VGG16 is a deep convolutional network model shown to achieve high accuracy in image-based pattern recognition tasks. The architecture of VGG16 is given in Fig. 2; it comprises 13 convolutional layers, 3 fully connected dense layers, and 5 max pooling layers.

Fig. 2 VGG16 architecture [13]
Fig. 3 Proposed model for species identification using a saved machine learning model

This technique of reusing a model that has performed well on one task to solve another, related task is known as transfer learning (TL). The main aim of TL is to transfer knowledge from a large dataset, here ImageNet, to a smaller dataset, which is the target domain [12]. Once an average accuracy of 90% is achieved, the model is saved in HDF5 format so that it can be used directly without being trained or tested repeatedly. After the evaluation is completed, the saved model is integrated and ready for use. The final model will look something like the one shown in Fig. 3, where an input image is passed to the saved machine learning model.
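The transfer learning setup described in this section could be sketched with the Keras API roughly as follows. This is an illustrative sketch rather than the authors' code: the classification head (one 256-unit dense layer), the 224 × 224 input size, the loss function, and the function name `build_classifier` are assumptions; only the use of VGG16 with ImageNet weights, the 19 classes, the Adam optimizer, and HDF5 saving come from the text.

```python
import tensorflow as tf


def build_classifier(num_classes=19, weights="imagenet", input_shape=(224, 224, 3)):
    """Frozen VGG16 convolutional base with a small trainable classification head."""
    # Load the convolutional base only; the original ImageNet classifier is dropped.
    base = tf.keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=input_shape
    )
    base.trainable = False  # keep the pre-trained features fixed

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)  # assumed head size
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",  # assumes one-hot encoded labels
        metrics=["accuracy"],
    )
    return model
```

After training, calling `model.save("species_vgg16.h5")` (a hypothetical file name) would store the model in HDF5 format, and `tf.keras.models.load_model` would restore it for later use without retraining.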
4 Results and Discussion Training and Evaluation. A validation accuracy of about 90% is achieved when training on the dataset with the VGG16 model. Figure 4 shows the training and evaluation process, which uses 16 batches of 64 images, 15 epochs, and the Adam optimizer.
P. Sangkhro et al.
Fig. 4 Accuracy and loss after each epoch
Visualization. Figure 5 shows that after just two epochs the training accuracy rose from less than 0.4 to almost 0.9, which illustrates how efficient transfer learning with the VGG16 model can be. Evaluation. From the confusion matrix given in Fig. 6, the mean average precision (mAP) can be obtained to evaluate the proposed method. The mAP score is the mean of the average precision over all classes. When the model is tested with about 15–20 inputs per class, the mAP of the proposed method is 88.5%. This score can be compared with another study [14], which achieved around 80% to 83%, a result that is also considered good.
Fig. 5 Training accuracy and validation accuracy after each epoch
Fig. 6 Confusion matrix
Figure 7 shows the F1-score of each class. The overall F1-score is 82%, which can be calculated from the above confusion matrix after finding the precision and recall. Output. Figure 8 shows the output generated by the VGG16 model for input images of a crocodile, a dog, an elephant, and a lion. The output is the same image with a label added (the class, i.e., the name of the animal).
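The step from a confusion matrix to per-class precision, recall, and F1-score can be sketched as follows. The 3 × 3 matrix `toy` is a made-up example, not the matrix of Fig. 6, and the function names are hypothetical; averaging the per-class F1 values (macro averaging) is one common way of obtaining an overall score and is assumed here, since the text does not say which averaging was used.

```python
def per_class_scores(cm):
    """cm[i][j] = number of samples of true class i predicted as class j.

    Returns a list of (precision, recall, f1) tuples, one per class.
    """
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, truly other
        fn = sum(cm[k]) - tp                       # truly k, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores.append((precision, recall, f1))
    return scores


def macro_f1(cm):
    """Unweighted mean of the per-class F1-scores (assumed averaging)."""
    scores = per_class_scores(cm)
    return sum(f1 for _, _, f1 in scores) / len(scores)


# Hypothetical confusion matrix for three classes with 15 test images each.
toy = [[13, 2, 0],
       [1, 14, 0],
       [0, 3, 12]]

for label, (p, r, f1) in enumerate(per_class_scores(toy)):
    print(f"class {label}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print(f"macro F1: {macro_f1(toy):.3f}")
```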
5 Conclusion and Future Works As cities and towns expand, wildlife is diminishing daily, which means that upcoming generations will be less exposed to wild animals. This study can help spread awareness among the general public and can also help educate younger generations by making it easier to identify animals along with their details. Accurate and detailed information about the whereabouts and behavior of animals in the wild would help to study and conserve ecosystems [15].
Fig. 7 F1-score of each class
The CNN implementation presented here serves mainly to build understanding; although it gives good accuracy, newer and better machine learning models could make the proposed method more accurate. In the future, the authors plan to replace the ML model with a better performing one and to train it on various datasets available on different websites or platforms. With some changes to the code and the datasets, this method can be extended to spot or monitor animals in wildlife sanctuaries and to detect sick animals. Further advancements in this method can help eliminate accidents caused by human error or even incorrect diagnoses of animals by veterinarians. Eventually, unlike humans, AI will keep adapting and improving itself.
Fig. 8 Outputs generated by the saved model
References
1. TechTarget: A guide to artificial intelligence in the enterprise. https://www.techtarget.com/searchenterpriseai/definition/AI-Artificial-Intelligence
2. Alzubaidi, L., Zhang, J., Humaidi, A.J., Al-Dujaili, A., Duan, Y., Al-Shamma, O., Santamaría, J., Fadhel, M.A., Al-Amidie, M., Farhan, L.: Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J. Big Data 8(1), 1–74 (2021)
3. AI in chess: The evolution of artificial intelligence in chess engines. https://towardsdatascience.com/ai-in-chess-the-evolution-of-artificial-intelligence-in-chess-engines-a3a9e230ed50. Accessed 9 Nov 2022
4. 8 ways AI is used in education. In: Analytics Insight. https://www.analyticsinsight.net/8-ways-ai-is-used-in-education/#:~:text=AI%20enhances%20the%20personalization%20of,universal%2024%2F7%20learning%20access. Accessed 9 Nov 2022
5. Tammina, S.: Transfer learning using VGG-16 with deep convolutional neural network for classifying images. Int. J. Scient. Res. Publ. (IJSRP) (2019). https://doi.org/10.29322/ijsrp.9.10.2019.p9420
6. Sumit, S.: A comprehensive guide to convolutional neural networks – the ELI5 way. Towards Data Science 15 (2018)
7. Chaudhari, G., Patil, A., Gangurde, G.: Identification Using Machine Learning Technique, March (2020). Retrieved from https://sreyas.ac.in/wp-content/uploads/2021/07/6.-Dr.-RohitRaja.pdf
8. Shetty, H., Singh, H., Shaikh, F.: Animal Detection using Deep Learning, June (2021). Retrieved from https://ijesc.org/upload/f001d8180864788afc97068687a1f59b.Animal%20Detection%20using%20Deep%20Learning%20(1).pdf
9. Patil, A.: Animal Identification Using Machine Learning Technique, March (2019). Retrieved from https://www.researchgate.net/publication/349532649_Animal_Identification_Using_Machine_Learning_Technique
10. Karlsson, J., Ren, K., Li, H.: Tracking and Identification of Animals for a Digital Zoo, December (2010). Retrieved from https://ieeexplore.ieee.org/abstract/document/5724879
11. Saxena, A., Gupta, D.K., Singh, S.: An Animal Detection and Collision Avoidance System Using Deep Learning, August (2020). https://doi.org/10.1007/978-981-15-5341-7_81
12. Hridayami, P., Putra, I.K., Wibawa, K.S.: Fish species recognition using VGG16 deep convolutional neural network. J. Comput. Sci. Eng. 13, 124–130 (2019). https://doi.org/10.5626/jcse.2019.13.3.124
13. Architecture (June 2022). CNN Architecture – Detailed Explanation. Retrieved from https://www.interviewbit.com/blog/cnn-architecture/
14. O’Shea, K., Nash, R.: An Introduction to Convolutional Neural Networks, November 26 (2015). Retrieved from https://typeset.io/papers/an-introduction-to-convolutional-neural-networks-5342q71fyx
15. Norouzzadeh, M.S., Nguyen, A., Kosmala, M., Swanson, A., Palmer, M.S., Packer, C., Clune, J.: Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning, June 5 (2018). https://doi.org/10.1073/pnas.1719367115
Fish Detection from Underwater Images Using YOLO and Its Challenges Pratima Sarkar, Sourav De, and Sandeep Gurung
Abstract Object detection is one of the most popular research areas in computer vision. Monitoring marine ecosystems is now necessary for conserving nature, but sending human beings to observe the marine environment is dangerous for some tasks. Much work is under way to improve Autonomous Underwater Vehicles (AUVs) for monitoring the underwater environment. In aquaculture fisheries, fish monitoring by AUV has gained great importance. Information about different types of fish must be collected to maintain the marine ecosystem. This work concentrates on object detection using real-time object detectors, namely YOLOv3 and YOLOv4. The Roboflow fish detection dataset, which has 26 classes, is used to validate the work. YOLOv3 and YOLOv4 are rigorously tested, and the final results are reported using precision, recall, IOU, and mAP. The work achieved 50% mean average precision after 16,000 iterations. It also discusses the main challenges of fish detection and the reasons why higher values of the performance parameters were not obtained. Keywords Fish detection · Underwater object detection · Deep learning approach · Challenges of fish detection
P. Sarkar (B) Computer Science and Engineering, Sikkim Manipal University, Techno International New Town, Gangtok, India e-mail: [email protected] S. De Computer Science and Engineering, Cooch Behar Government Engineering College, Harinchawra, India e-mail: [email protected] S. Gurung Computer Science and Engineering, Sikkim Manipal University, Gangtok, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_13
P. Sarkar et al.
1 Introduction Detecting different types of fish is one of the most popular applications of deep learning, and it is of great importance for the marine ecosystem. In the deep sea, fish move very quickly, so recognizing a particular type of fish is difficult. Fish detection refers to locating the region of a fish in an image and classifying it. Tracking and counting a particular fish species in a region can help in the conservation of an environment. According to Statistics Norway [1], the number of fish decreased by 27% from 2017 to 2018, and the reason given is the killing of fish by humans. The Statistics Norway survey covers salmon, sea trout, and migratory fish. No reliable figures are available on how many fish are caught, slaughtered, or released after catching. As underwater monitoring by human beings is very dangerous, an automated system is needed [2]. It is also important to determine whether catching a particular fish species is healthy for the ecosystem. Fish detection plays a major role in addressing such problems. CNNs help in classifying fish. Reithaug et al. [3] trained almost 20 neural networks for salmon detection and tuned their parameters; SSD achieved 84% accuracy. The problem with this kind of network is that it is trained for only one type of fish, so it does not work well on a dataset containing different types of fish. Haar feature extraction and Viola–Jones feature extraction techniques have also been used for fish detection [4]. Larsen et al. [5] used 100 images covering three categories of fish and achieved an accuracy of 76% using texture features. Balk et al. [6] proposed the Sonar5 technique, which includes acquisition, analysis, and interpretation of fish data. The echo of sound is used to form an image; such techniques are used for underwater imaging. Xiang et al. [7] used a CNN-based approach with VGG16 and SSD on 9 different species of fish. The data was collected from the Missouri river, and the classification accuracy is 87%. Ogunlana et al. [8] used two sets of 76 and 74 fish in their experiments, with an SVM classifier for classification. Rathi et al. [9] used a CNN-based technique for fish detection and reported improved accuracy. Fish detection is undoubtedly a difficult task because of the complex environment: fish move, and images suffer from noise, distortion, occlusion, etc. [10]. Even segmenting fish from the background is difficult because the images are unclear. Based on the detection pipeline and backbone, object detection algorithms are classified into two categories: two-stage object detectors and single-stage object detectors [11]. Two-stage detectors scan the image twice, once to find candidate regions and a second time to classify the fish. The most popular two-stage detectors are R-CNN [10], Fast R-CNN [12], Faster R-CNN [13], and Mask R-CNN [14]. These models usually achieve better accuracy but are slower in detection. Single-stage detectors scan the image only once, so detection and region selection are performed together in a single shot. SSD [15], YOLO [16], YOLOv2 [17], YOLOv3 [18], and YOLOv4 [19] are examples of single-stage object detectors. Machine learning algorithms usually lag in detection when an image contains multiple objects. Among the single-stage detectors, YOLOv3 and YOLOv4 give similar or better accuracy than Faster R-CNN [13], so we have selected YOLOv3 and YOLOv4 for implementing our work. The contributions of the work are as follows:
a. The work has been implemented on the Roboflow fish dataset [20]. It is an unbalanced dataset, so we are able to analyze the effect of an unbalanced dataset on deep learning models.
b. The work also identifies the challenges encountered while working with YOLOv3 and YOLOv4.
c. It also examines the models' performance in the multi-object scenario for fish detection.
2 Proposed Solution In this work, we have used YOLOv3 and YOLOv4 [21] for fish detection. Both models are modifications of YOLO9000 [17], and both are single-stage detectors suited to real-time object detection. We have used the fish dataset from Roboflow [20], in which 26 categories of fish are available. YOLOv3 [18] performs bounding box prediction, feature extraction, and class prediction in a single pass over the image. It follows the same method as YOLO9000 for predicting regions. The network predicts four coordinates for each region. Let $r_x$, $r_y$, $r_w$, and $r_h$ be the four coordinates of the region, let $c_x$ and $c_y$ be the offsets from the top left corner, and let $p_w$ and $p_h$ be the prior (known) width and height. The predictions are then:
$$b_x = \sigma(r_x) + c_x \tag{1}$$
$$b_y = \sigma(r_y) + c_y \tag{2}$$
$$b_w = p_w e^{r_w} \tag{3}$$
$$b_h = p_h e^{r_h} \tag{4}$$
The prediction box generates an objectness score for each box. The maximum objectness score is 100%, which is reached when the predicted box matches the ground truth. The Darknet-53 backbone is used for feature extraction, as shown in Fig. 1. It is a modified version of Darknet-19, which is used in YOLOv2 [17], and it uses 3 × 3 and 1 × 1 filters for its convolution operations. As the name suggests, it consists of 53 layers. Classes are predicted for each prediction box. During prediction, YOLOv3 uses logistic regression instead of a softmax classifier, because a softmax assigns exactly one class to a bounding box and is therefore not flexible. Finally, the prediction of an object is decided using IOU, i.e., intersection over union.
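Equations (1)–(4) translate directly into code. The sketch below decodes one raw network output into a box center and size; the function names and the example numbers are hypothetical, and the coordinates are expressed in grid-cell units, as in the equations.

```python
import math


def sigmoid(x):
    """Logistic function used in Eqs. (1) and (2)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(rx, ry, rw, rh, cx, cy, pw, ph):
    """Apply Eqs. (1)-(4) to one raw prediction.

    rx, ry, rw, rh : raw network outputs for the region
    cx, cy         : offset of the grid cell from the top left corner
    pw, ph         : width and height of the prior (anchor) box
    """
    bx = sigmoid(rx) + cx    # Eq. (1): center x stays inside its cell
    by = sigmoid(ry) + cy    # Eq. (2): center y stays inside its cell
    bw = pw * math.exp(rw)   # Eq. (3): width scales the prior width
    bh = ph * math.exp(rh)   # Eq. (4): height scales the prior height
    return bx, by, bw, bh


# Hypothetical example: all raw outputs zero, cell (1, 2), prior 3 x 4.
print(decode_box(0.0, 0.0, 0.0, 0.0, cx=1, cy=2, pw=3.0, ph=4.0))
```

Because the sigmoid is bounded between 0 and 1, the predicted center can never leave the grid cell that produced it, while the exponential keeps the predicted width and height positive.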
Fig. 1 Darknet-53 architecture
YOLOv4 [19] is a modified version of YOLOv3. Most object detection techniques lag in the detection of small objects, and YOLOv4 tries to improve accuracy by adding new features to YOLOv3. These include Weighted Residual Connections (WRC), Cross mini-Batch Normalization (CmBN), Cross-Stage Partial connections (CSP), Mish activation, and Self-adversarial training (SAT). YOLOv4 consists of three main parts, the backbone, the neck, and the head, as shown in Fig. 2. The backbone is used for feature extraction, and CSPDarknet-53 is used instead of Darknet-53. It increases the receptive field and separates the most important features from the unimportant ones, which helps in reducing the feature size. The YOLOv4 neck is an additional module used for feature aggregation; Spatial Pyramid Pooling and PANet are used as the neck of YOLOv4. The Mosaic and SAT data augmentation approaches are used to increase the dataset size. Mosaic uses 4 images
Fig. 2 Architecture of YOLOv4: CSPDarknet-53 (backbone), SPP and PANet (neck), dense prediction as in YOLOv3 (head)
Table 1 Different kinds of fish
| Sl. no | Fish name | Sl. no | Fish name | Sl. no | Fish name |
|---|---|---|---|---|---|
| 1 | Acanthuridae - Surgeonfishes | 10 | Scombridae - Tunas | 19 | Parrot |
| 2 | Balistidae - Triggerfishes | 11 | Serranidae - Groupers | 20 | Shark |
| 3 | Carangidae - Jacks | 12 | Shark - Selachimorpha | 21 | Snapper |
| 4 | Ephippidae - Spadefishes | 13 | Zanclidae (Moorish Idol) | 22 | Spade |
| 5 | Labridae - Wrasse | 14 | Zanclidae - Moorish Idol | 23 | Surgeon |
| 6 | Lutjanidae - Snappers | 15 | Angel | 24 | Trigger |
| 7 | Pomacanthidae - Angelfishes | 16 | Damsel | 25 | Tuna |
| 8 | Pomacentridae - Damselfishes | 17 | Grouper | 26 | Wrasse |
| 9 | Scaridae - Parrotfishes | 18 | Jack | | |
to generate a single image, which is then used for training. Hyper-parameter selection is made using a genetic algorithm. The Mish activation function is used during feature selection because it is continuous and also differentiable at 0, which leads to better prediction. The Mish activation function is given by:
$$f(x) = x \tanh(\mathrm{softplus}(x)) = x \tanh\left(\ln\left(1 + e^{x}\right)\right) \tag{5}$$
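Equation (5) can be evaluated with the Python standard library alone. The sketch below is illustrative; the rearranged form of softplus is a common numerical precaution added here (it avoids overflow of the exponential for large inputs) and is not part of the original formula.

```python
import math


def softplus(x):
    """ln(1 + e^x), rearranged so that exp() never overflows."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation from Eq. (5): x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"mish({value:+.1f}) = {mish(value):+.4f}")
```

Unlike ReLU, the function has no kink at the origin and lets small negative values pass through, which is the smoothness property the text refers to.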
3 Dataset Description The Roboflow object detection dataset [20] is used here for fish detection. The dataset consists of 680 images, each of size 416 × 416, and covers 26 different fish categories. Of these images, 476 are used for training, 136 for validation, and 68 for testing. Table 1 lists the categories of fish available in the dataset.
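The split sizes quoted above can be checked with a short calculation. This is only an arithmetic sketch using the numbers from the text; the dictionary and function names are hypothetical.

```python
# Image counts per subset, taken from the dataset description.
SPLIT = {"train": 476, "validation": 136, "test": 68}
TOTAL_IMAGES = 680


def split_fractions(split):
    """Return the share of the whole dataset held by each subset."""
    total = sum(split.values())
    return {name: count / total for name, count in split.items()}


fractions = split_fractions(SPLIT)
for name, share in fractions.items():
    print(f"{name}: {SPLIT[name]} images ({share:.0%})")
print("held-out images (validation + test):",
      SPLIT["validation"] + SPLIT["test"])
```

The training share is 70% of the images, so if the validation and test images are counted together as held-out data, the split corresponds to a 7:3 ratio.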
4 Object Detection Evaluation Metrics In this work, fish detection is performed using YOLOv3 and YOLOv4. The fundamental evaluation parameters are precision, recall, IOU, and mean average precision. The outcome of the fish detector falls into one of the following cases:
1. The fish detector predicts the presence of a fish of class C1, and the prediction is correct: True Positive (TP).
2. The fish detector predicts the presence of a fish of class C1, and the prediction is wrong: False Positive (FP).
3. The fish detector predicts the absence of a fish of class C1, and the prediction is wrong: False Negative (FN).
4. The fish detector predicts the absence of a fish of class C1, and the prediction is correct: True Negative (TN).
From these quantities, precision, recall, IOU, and mean average precision are calculated using the following formulas:
$$\text{precision} = \frac{TP}{TP + FP} \tag{6}$$
$$\text{recall} = \frac{TP}{TP + FN} \tag{7}$$
$$IOU = \frac{\text{Area of Intersection}}{\text{Area of Union}} \tag{8}$$
Category-wise average precision:
$$AP_{C_i} = \frac{1}{n}\sum_{j=1}^{n} P_j \tag{9}$$
where $P_j$ is the precision for the $j$th image of category $C_i$ and $n$ is the number of images in that category. Mean average precision:
$$mAP = \frac{1}{N}\sum_{i=1}^{N} AP_{C_i} \tag{10}$$
where $N$ is the number of categories.
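The metrics of this section can be sketched in a few short functions. Boxes are assumed to be given as corner coordinates `(x1, y1, x2, y2)`, which is a convention chosen here for illustration, and the example counts and per-category values are made up rather than taken from the experiments.

```python
def precision(tp, fp):
    """Fraction of predicted detections that are correct."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Fraction of ground-truth objects that are detected."""
    return tp / (tp + fn) if tp + fn else 0.0


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_average_precision(ap_per_category):
    """Mean of the category-wise average precision values."""
    return sum(ap_per_category) / len(ap_per_category)


# Hypothetical numbers for illustration only.
print("precision:", precision(tp=8, fp=2))
print("recall:", recall(tp=8, fn=8))
print("IOU:", iou((0, 0, 2, 2), (1, 1, 3, 3)))
print("mAP:", mean_average_precision([0.6, 0.4, 0.5]))
```

A detection is usually counted as a true positive only when its IOU with a ground-truth box exceeds a chosen threshold, which is how the overlap measure feeds into the precision and recall counts.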
5 Results and Discussion In this work, we aim to detect fish in an image. The Roboflow fish detection dataset is used to train the models. It is an unbalanced dataset, and we have experimented with YOLOv3 and YOLOv4. Google Colab is used for the implementation, with the parameters given in Table 2. The models are evaluated on precision, recall, IOU, and mean average precision. Training ran for 16,000 iterations, and the metrics were recorded at intervals of 4000 iterations. From Fig. 3a, it can be observed that the recall of both models increases with the number of iterations, but after 16,000 iterations the recall still lies only between 40% and 50%. Similarly, Fig. 3b shows that the IOU is higher for YOLOv4 than for YOLOv3; it lies between 30 and 65% over 16,000 iterations. One of the main reasons for the low values of these parameters is that the dataset is not balanced.
Table 2 Parameter specification for YOLOv3 and YOLOv4
| Parameters | Specification |
|---|---|
| Train-test split ratio | 7:3 |
| Training batch size | 64 |
| Training subdivisions | 16 |
| Testing batch size | 1 |
| Testing subdivisions | 1 |
| Max batches | 16,000 |
| Steps | 9600, 10,800 |
| Final layer filters | 21 |
| Classes | 26 |
Fig. 3 Recall (a) and IOU (b) results obtained by YOLOv3 and YOLOv4 (horizontal axis: iterations, 4000 to 16,000)
Fig. 4 Precision (a), mAP (b) results obtained by YOLOv3 and YOLOv4 (horizontal axis: iterations, 4000 to 16,000)
Figure 4a presents the precision obtained by YOLOv3 and YOLOv4; the precision lies between 30 and 48% over 16,000 iterations. In Fig. 4b, the reported range is 30–42% over 16,000 iterations. In both cases, YOLOv4 performs better than YOLOv3. Figure 5 presents some of the fish detections produced by YOLOv3 and YOLOv4. Both techniques work well for single-object detection, but in the third column, where two fish are present, the small fish remains undetected. From Fig. 6, it can be observed that the loss per iteration is lower for YOLOv3 than for YOLOv4. Figure 7 presents incorrect detections for both models. Taken together, the results show that even after 16,000 iterations many fish remain undetected in many images.
Fig. 5 First row is fish detection by YOLOv4 and the second row is fish detection by YOLOv3
Fig. 6 Loss per iteration a YOLOv3 b YOLOv4
Fig. 7 a Incorrect object detection by YOLOv3 b YOLOv4
Figure 7a contains multiple fish of the same type, and YOLOv3 fails to detect them. Figure 7b contains only two objects, yet detection still lags. The main reason for the lower performance is that both detectors lag in detection when more than one object is present in a single image. Other reasons are that the dataset is not balanced and that the model is trained for only 16,000 iterations.
6 Challenges Faced During Experiment The following challenges were faced during the implementation of the work.
6.1 Massive Data Requirement Most deep learning models require a massive amount of data for training. If the dataset is not balanced, the model suffers from lower precision. Training on a large amount of data is also very time-consuming.
6.2 Slow Training Process Deep learning models need a large amount of data, so training the entire model takes a long time. If a model can learn from a smaller dataset, the training time can be reduced.
6.3 Unclear Images Underwater images suffer from color cast, so they have a blue tint. The context of an image is sometimes not clear, which also affects the accuracy of the entire model.
6.4 Multi-object YOLOv3 and YOLOv4 lag in detection when an image contains many objects. Detection is also difficult when the image contains occlusion, overlapping objects, etc.
6.5 Large Number of Classes When an image dataset has a large number of categories, it is difficult to train the entire model. Problems include an unbalanced dataset, single images containing multiple categories of object, and long training times.
7 Conclusion Fish detection is one of the challenging areas of computer vision. The main challenges are that most images are unclear because of noise, the lack of a light source undersea, etc. The work used the Roboflow fish dataset, which consists of 26 categories of fish species and is not balanced. In the experiments, a maximum of 50% mAP was achieved with YOLOv4. The accuracy is low because most images contain multiple fish or occlusion, the dataset is unbalanced, etc. Another important challenge is that, when the number of object categories is large, the model takes a long time to train. Here the model was trained for 16,000 iterations, which is not sufficient for 26 categories, and achieved 50% mAP. In the future, it is important to build a model that can be trained in less time and with less data.
References
1. River catch (ssb.no). Accessed on 5th Sept 2022
2. Raza, K., Hong, S.: Fast and accurate fish detection design with improved YOLO-v3 model and transfer learning. (IJACSA) Int. J. Adv. Comput. Sci. Appl. 11(2) (2020)
3. Reithaug, A.: Employing Deep Learning for Fish Recognition, pp. 85 (2018)
4. Lantsova, E.: Automatic Recognition of Fish from Video Sequence, pp. 49 (2015)
5. Larsen, R., Ólafsdóttir, H., Ersbøll, B.K.: Shape and texture based classification of fish species. Image Anal. 745–749 (2009)
6. Balk, H.: Development of hydro acoustic methods for fish detection in shallow water, pp. 28 (2001)
7. Xiang, F.: Application of Deep Learning to Fish Recognition, pp. 53 (2018)
8. Ogunlana, S.O. et al.: Fish Classification Using Support Vector Machine, pp. 75 (2015)
9. Rathi, D., Jain, S., Indu, S.: Underwater fish species classification using convolutional neural network and deep learning. In: Computer Vision and Pattern Recognition (2018)
10. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Computer Vision and Pattern Recognition (cs.CV) (2014)
11. Sarkar, P., De, S., Gurung, S.: A survey on underwater object detection. In: Bhattacharyya, S., Das, G., De, S. (eds) Intelligence Enabled Research. Studies in Computational Intelligence, vol 1029. Springer, Singapore. https://doi.org/10.1007/978-981-19-0489-9_8
12. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
13. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
14. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
15. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: Single shot multibox detector. In: European Conference on Computer Vision, October, pp. 21–37. Springer, Cham (2016)
16. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
17. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6517–6525 (2017). https://doi.org/10.1109/CVPR.2017.690
18. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement (2018). arXiv preprint arXiv:1804.02767
19. Bochkovskiy, A., Wang, C.-Y., Mark Liao, H.-Y.: YOLOv4: optimal speed and accuracy of object detection. In: Computer Vision and Pattern Recognition (2020). https://doi.org/10.48550/arXiv.2004.10934
20. Fish Dataset, 416x416 (roboflow.com). Accessed on May 2022
21. Anand, R., Das, J., Sarkar, P.: Comparative analysis of YOLOv4 and YOLOv4-tiny techniques towards face mask detection. In: 2021 International Conference on Computational Performance Evaluation (ComPE), pp. 803–809 (2021). https://doi.org/10.1109/ComPE53109.2021.9751880
Design and Realization of an IoT-Based System for Real-Time Monitoring of Embryo Culture Conditions in IVF CO2 Incubators Sukanya Bose, Swarnava Ghosh, Subhadeep Dhang, Prasenjit Dey, and Aritra Acharyya
Abstract The authors have designed and implemented an Internet of things (IoT)-based real-time monitoring and alerting system for the embryo culture conditions in CO2 incubators used in the in-vitro fertilization (IVF) process. Industry-grade sensors capable of accurately sensing the concentrations of carbon dioxide (CO2) and total volatile organic compounds (VOC), as well as temperature and humidity, have been placed inside the incubator chamber under culture conditions. The sensor data are sent periodically to the microcontroller unit outside the incubator through a flat flexible cable (FFC) for efficient data acquisition and further processing. A minicomputer unit and a touch screen display are used to show the real-time culture conditions, i.e., CO2, VOC, temperature, and humidity. The complete local desktop of the proposed system can be accessed from any personal computer or smartphone by using a cross-platform screen sharing system. The minicomputer unit will be capable of storing the real-time culture data in an appropriate format in the cloud. It will be registered to a specific cloud service so that its desktop can also be fully accessed from anywhere in the world through the Internet. SMS and e-mail alerts will be sent immediately to one or more pre-registered recipients via an application programming interface (API) bulk SMS service and an e-mail server, respectively, if one or more culture parameters cross the predefined allowable limits. Further test runs have been performed over a prolonged period (eighteen (18) months) in order to determine the interval after which re-calibration of the sensors is required. Keywords Embryo culture · IoT · In-vitro fertilization · Minicomputer
S. Bose · S. Ghosh · S. Dhang · A. Acharyya (B) Department of Electronics and Communication Engineering, Cooch Behar Government Engineering College, Harinchawra, Ghughumari, Cooch Behar, West Bengal 736170, India e-mail: [email protected] P. Dey Department of Computer Science and Engineering, Cooch Behar Government Engineering College, Harinchawra, Ghughumari, Cooch Behar, West Bengal 736170, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_14
S. Bose et al.
1 Introduction First ever human infant was derived by in-vitro fertilization (IVF) process in England on 25th July, 1978 [1, 2]. Since then, five decades have been passed, and currently, IVF is a well-developed field and worldwide acceptable process for deriving human infants. Immense advancements can be observed in this field since last two decades with the help of ground-breaking developments in the fields of sensor technologies, biomedical engineering and instrumentation, advanced temperature, humidity and gas control systems, and other relevant state-of-the-art technologies related to electronics and communication engineering. Initially, IVF was not well accepted by India people due to the high procedural cost, unavailability of state-of-the-art facilities, sociological unacceptability, etc. But as a result of recent advancements of medical electronics, almost all aforementioned barriers have been broken to increase the popularity of IVF in our nation since the beginning of the last decade [3]. Recent advancement in artificial intelligence, wireless communication, Internet technology, robotics, etc., can provide the technological support to the doctors to diagnose, treat, and even perform surgery from remote place. Various technological supports have already been incorporated in IVF process in order to increase the reliability and accuracy in embryo culture. In IVF, various types of incubators are used, such as benchtop incubators, time-lapse incubators, drawer incubators, water jacket incubators, and air jacket incubators [4]. The main purpose of incubators is to provide precisely controlled and stable embryo culture environment. Temperature inside an incubator chamber must be precisely maintained at 37 °C, and the pH of the culture media must be maintained between 7.2 and 7.4. The pH control is achieved by maintaining the CO2 concentration inside the chamber between 5 and 7%. 
Moreover, a humidity of 90–95% is recommended for maintaining a favorable embryo culture environment. The volatile organic compound (VOC) content inside the incubator chamber must be kept as low as possible (VOC < 10 ppb) [5]. On average, each embryo requires three (3) days of culture under these conditions. All the incubator types mentioned earlier are equipped with precise temperature, humidity, and gas control systems, and most also have a local alarm system for adverse culture conditions. However, remote access to the real-time culture conditions of individual incubators within an IVF clinic, and cloud storage of the full culture data, are not currently available: no commercial product on the market monitors and displays the real-time culture conditions of individual incubators in an IVF laboratory in a way that is remotely accessible from anywhere in the world. In this proposal, an Internet of things (IoT)-based system is proposed to serve this purpose. IoT is a recently developed field of computer science and technology that is already widely used in household applications, security systems, mining security, smart cities, industries, aviation, etc. [6]. A few applications of this technology in the biomedical field can also be found in the literature [7]. The proposed system monitors, in real time, the embryo culture conditions inside IVF CO2 incubators. It draws on the knowledge and resources of four (4) fields: sensor technology, IoT, networking, and biomedical engineering.
Design and Realization of an IoT-Based System for Real-Time …
The IVF sector is growing rapidly in India. A stressful lifestyle, unfavorable climate and environmental conditions, and many other factors are affecting the fertility of the young generation in our country. Thus, thousands of young couples now seek the assistance of IVF, and the cost of the procedure is no longer beyond the reach of middle-class Indians. The demand for IVF is expected to increase from thousands to lakhs, and from lakhs to millions, within a few years; however, no reliable source of data is currently available in India because there is no mandatory reporting system [8]. This expected growth in clinical work will drastically increase the demand for embryologists, which will be very difficult to meet immediately. Remote access to clinical work is therefore essential to meet this growth in demand, and developing a system capable of providing remote access to a significant portion of clinical work is the primary motivation behind this proposal. The major outcome of the present work is a new product in the biomedical engineering field that is ready to be commercialized. The product is cost-efficient, since it is developed entirely in India without relying on any other patented technology. It is expected to seed a start-up business in this area, which will eventually generate job opportunities for young engineers and marketing personnel. Continued research and development will produce further IoT-based biomedical products and thus sustain those job opportunities. The outcome of the project directly supports the “Make in India” campaign of the Government of India.
2 Present Scenario
Some wireless embryo culture environment monitoring systems and data loggers are already commercially available. Leading producers of these systems include Ellab Monitoring Solutions Limited (UK), Vaisala Oyj (Finland), CAS DataLoggers (USA), and Vitrolife (Sweden) [9–12]. However, these products are very costly from the perspective of Indian IVF clinics, and they lack full remote access to real-time data, stored data, and global alert systems. No India-based company has developed or commercialized a remotely accessible culture environment monitoring system. Most of the leading IVF companies in India import foreign products and act as distributors in collaboration with foreign companies. Naturally, the high cost of those products is a significant burden for most IVF clinics in India. A real-time display of culture conditions, long-term storage of those data, and emergency local and remote alerts will be immensely helpful to embryologists culturing embryos in IVF laboratories.
S. Bose et al.
Remote access to real-time culture conditions and to the full culture data in various forms will revolutionize the IVF process.
3 Primary Objectives
The primary objectives of the work are as follows:
(a) Design and implement an IoT-based system for monitoring real-time embryo culture conditions in IVF CO2 incubators.
(b) Stream the graphical user interface (GUI)-based local display desktop, which shows real-time sensor readings of CO2 (%) and VOC (ppb) concentrations, temperature (°C), and humidity (%) inside the incubator chambers, to the World Wide Web so that it can be accessed from anywhere in the world.
(c) Carry out initial calibration of the sensors against standard meter outputs, and rigorously test the prototypes under real conditions inside CO2 incubator chambers in an IVF laboratory for a prolonged period.
(d) Carry out at least eighteen (18) months of continuous testing under real embryo culture conditions, and periodically determine the accuracy of the individual culture parameters, in order to evaluate the average period after which re-calibration is required.
4 Design and Development
The work is divided into two major phases: (A) Phase I, the design and implementation of the prototypes, and (B) Phase II, continuous testing of the prototypes in IVF laboratories under real embryo culture conditions for a prolonged period. These two phases are described in detail below.
4.1 Phase I: Design and Implementation of the Prototypes
The logical block diagram of the proposed system is shown in Fig. 1. The design and implementation of the system fall into two parts: (i) hardware design and realization and (ii) software design and implementation. These are described as follows.
Fig. 1 Logical block diagram of the proposed system
Hardware Design and Realization: The hardware of the system has two parts: (a) the universal sensor bank and (b) the controller.
Universal Sensor Bank: This is simply two sensor breakouts connected in parallel. The primary sensor, used for measuring CO2 (%), temperature (°C), and humidity (%), is the STC31 (Qwiic) CO2 sensor breakout made by SparkFun. This breakout uses the Sensirion STC31 thermal conductivity sensor. It has two CO2 concentration measurement ranges: 0–25% and 0–100%. The accuracy of the 0–25% range (0.5 vol% + 3% of the measured value) is better than that of the 0–100% range (1.0 vol% + 3% of the measured value). Since a CO2 concentration of 5–7% must be maintained inside the incubator chamber to keep the pH of the culture media around 7.2–7.4, the more accurate 0–25% range of the STC31 is the natural choice for this application. Moreover, this sensor provides suitable repeatability (0.2 vol%) and adequate stability (0.025 vol%/°C), which suits the application because it allows a longer period of sensor operation without re-calibration. The STC31 also provides a temperature-, humidity-, and atmospheric-pressure-compensated measure of CO2 concentration, which gives very accurate readings under embryo culture conditions, where the temperature and humidity are maintained at 37 °C and 88–95%, respectively. In addition, this sensor provides the measured temperature (°C) and humidity (%) data. Since the CO2 gas is supplied to the incubator chambers from external CO2 cylinders through appropriate air filter banks, monitoring VOC inside the incubator chambers is very important. The Adafruit CCS811 CO2 plus VOC sensor breakout will be used for VOC measurement. This breakout uses a hot-plate MOX sensor capable of providing accurate total VOC readings from 0 to 1187 ppb, which is sufficient for the current application. However, the equivalent CO2 concentration measured by this breakout is not reliable. The parallel combination of the STC31 (for measuring CO2 (%), temperature (°C), and humidity (%)) and CCS811 (for measuring VOC (ppb)) breakouts forms the universal sensor bank.
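To make the range comparison concrete, the short sketch below evaluates the two accuracy figures quoted above at the CO2 set points used in embryo culture. It assumes that "0.5 vol% + 3% measured value" means a fixed offset plus a fraction of the reading; the names `RANGES` and `worst_case_error` are illustrative and are not part of any sensor library.

```python
# Accuracy figures quoted in the text for the two STC31 measurement ranges:
# (fixed offset in vol%, fraction of the measured value).
RANGES = {
    "0-25 vol%": (0.5, 0.03),
    "0-100 vol%": (1.0, 0.03),
}


def worst_case_error(reading_vol_pct, offset, fraction):
    """Worst-case absolute error (vol%) for a given CO2 reading.

    Assumes the datasheet-style figure is additive: offset + fraction * reading.
    """
    return offset + fraction * reading_vol_pct


for reading in (5.0, 6.0, 7.0):  # incubator set points cited in the text
    for name, (offset, fraction) in RANGES.items():
        err = worst_case_error(reading, offset, fraction)
        print(f"{reading:.1f} vol% CO2, range {name}: +/- {err:.2f} vol%")
```

Under this reading of the specification, a 5 vol% measurement carries a worst-case error of about 0.65 vol% on the 0–25% range versus 1.15 vol% on the 0–100% range, which is why the narrower range is preferred here.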
The size of the sensor bank is very important because space inside benchtop CO2 incubator chambers is limited; thus, the porous aluminum enclosure that houses the universal sensor bank must not exceed 30 × 30 × 15 mm³.
Controller: The controller will be mounted outside the incubator, on its side wall. The controller unit comprises an Arduino Uno microcontroller board, a Raspberry Pi 4 Model B (8 GB RAM, 1.5 GHz CPU) minicomputer, an AC-to-DC power adapter, and a touch screen display. The sensor data will be sent to the Arduino Uno via an inter-integrated circuit (I2C) interface. A 20-pin flat flexible circuit (FFC) ribbon cable will be used to implement this interface between the universal sensor bank and the Arduino Uno. The FFC cable can easily pass through the airtight gasket of the incubator door without any gas leaking from inside the incubator. The universal sensor bank, FFC ribbon cable, and Arduino Uno together form the “sensor data acquisition unit” shown in Fig. 1. The universal sensor bank runs on the 3.3 V supply from the Arduino Uno I2C interface. Under stable operating conditions, it draws a very small current (a few hundred µA). The Arduino Uno is connected to the Raspberry Pi 4 via a USB B to USB A cable, which carries data between the two boards and also supplies power from the Raspberry Pi 4 to the Arduino Uno. The Raspberry Pi 4 is powered from the AC mains via a 5.1 V, 3 A, 15.3 W AC-to-DC power adapter. The second unit, the “internal and external network remote access unit” (see Fig. 1), comprises a minicomputer unit, touch screen display, power supply unit, and piezoelectric buzzer (alarm). This unit will display the real-time culture conditions, i.e., CO2 (%), VOC (ppb), temperature (°C), and humidity (%), on a 4.3 inch onboard capacitive touch screen display.
It will sound a loud alarm (with adjustable volume, and enable and disable controls at both the local and remote ends) if one or more culture parameters cross the predefined allowable limit. Complete access to the local desktop of the proposed system can be obtained from any personal computer, laptop, or smartphone by using a cross-platform screen sharing system known as the virtual network computing (VNC) viewer. The minicomputer unit is capable of storing the real-time culture data in an appropriate format in the cloud. It is registered with a cloud service known as the remoteit cloud service so that its desktop can also be fully accessed from anywhere in the world through the Internet. SMS and e-mail alerts will be sent immediately to one or more pre-registered recipients via an application programming interface (API) bulk SMS service and an e-mail server, respectively, if one or more culture parameters cross the predefined allowable limit. The culture conditions of each incubator inside an IVF laboratory will be monitored locally on the display of its associated prototype. All the prototypes are connected to the Internet via a star-connected wireless local area network (WLAN). An optional personal computer or laptop can be kept in the laboratory to obtain full simultaneous local access to the desktops of all prototypes. The star-connected WLAN topology of the proposed system for N incubators (N > 1) inside an IVF laboratory is shown in Fig. 2. Photographs of the hardware of the proposed system are shown in Figs. 3 and 4.
Fig. 2 Star-connected WLAN topology for the proposed system for N number of incubators (N > 1) inside an IVF laboratory
Fig. 3 Photograph of the hardware without enclosure
Fig. 4 Photographs of the prototype inside enclosure
Software Design and Implementation: The GUI, local display, local alarm system, internal and external network access to the desktop with full user control, cloud server for sensor data storage, and SMS and e-mail alert systems will be implemented on the Raspberry Pi 4 Model B minicomputer under the 32-bit Raspbian operating system (based on Linux). Python3 will be used for writing the code, and the Tkinter GUI library in Python3 will be used to design and implement the local screen display and remote viewer. A photograph of a typical GUI local display is shown in Fig. 5; it shows the CO2 (%), VOC (ppb), temperature (°C), and humidity (%) data on the 4.3 inch capacitive touch screen display, with the universal sensor bank kept in open air inside the laboratory. The locally stored sensor data set is periodically uploaded to Google Drive, where the embryologist can use it to analyze the culture conditions. The interface with the remoteit cloud service, which provides full remote access to the desktop of the minicomputer, is implemented on the same platform. The Fast2SMS API bulk SMS service and the Gmail server will be accessed from Python3 and used for the SMS and e-mail alerts.
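The alert condition described above (one or more culture parameters leaving its allowable window) can be sketched in Python3 as follows. This is only an illustration under stated assumptions: the CO2, humidity, and VOC windows follow the values quoted in the Introduction, the ±0.5 °C temperature tolerance is assumed, and the names `LIMITS`, `out_of_range`, and `alert_message` are hypothetical. The actual dispatch through the SMS service and e-mail server is deliberately left out.

```python
# Allowable windows per parameter. CO2, humidity and VOC follow the values
# quoted in the Introduction; the +/- 0.5 degC temperature tolerance is an
# assumption made for this sketch.
LIMITS = {
    "co2_pct": (5.0, 7.0),
    "temperature_c": (36.5, 37.5),
    "humidity_pct": (90.0, 95.0),
    "voc_ppb": (0.0, 10.0),
}


def out_of_range(reading):
    """Return the names of the parameters outside their allowable window."""
    return [name for name, (low, high) in LIMITS.items()
            if not low <= reading[name] <= high]


def alert_message(incubator_id, reading):
    """Build the alert text, or return None when every parameter is in range."""
    violated = out_of_range(reading)
    if not violated:
        return None
    details = ", ".join(f"{name}={reading[name]}" for name in violated)
    return f"Incubator {incubator_id}: out of range -> {details}"


# Hypothetical reading, for illustration only.
sample = {"co2_pct": 4.2, "temperature_c": 37.0,
          "humidity_pct": 93.0, "voc_ppb": 25.0}
print(alert_message("INC-1", sample))
```

Keeping the limits in a single table means the same check can drive the local buzzer, the SMS alert, and the e-mail alert, and the allowable windows can be adjusted per laboratory without touching the logic.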
Fig. 5 Photograph of the local display showing the real-time sensor data
4.2 Phase II: Continuous Testing of the Prototypes in IVF Laboratories Under Real Embryo Culture Conditions
Initially, the sensors are calibrated against standard measuring equipment so that they provide accurate measurements of CO2 concentration, VOC, temperature, and humidity. Standardized CO2 concentration, VOC, temperature, and humidity meters are used for this calibration. After the initial calibration, the prototype runs continuously (24 × 7) under real embryo culture conditions, with the universal sensor bank inside the incubator chamber, for a prolonged period (preferably six months, i.e., 180 days). The accuracy of each measured culture parameter is verified every 15 days by comparing the measured data with the outputs of the standard meters. Repeating this accuracy check at 15-day intervals gives the deviation pattern of each culture parameter over time. When the deviation of one or more parameters exceeds the predefined allowable amount (decided for each culture parameter in consultation with an experienced embryologist), the sensor is re-calibrated. This process is repeated three (3) times in order to evaluate the average period after which re-calibration of the sensors is required. Therefore, a minimum of six months (180 days) × 3, i.e., around eighteen (18) months, of continuous running and rigorous testing is required for multiple prototypes to determine this average period; this laboratory test run is the most important task for ensuring the accurate performance of the system. The entire Phase II process is summarized in the flowchart shown in Fig. 6.
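The procedure above can be sketched as a small calculation: each test run yields one deviation value per 15-day check, the first check that exceeds the allowance marks the re-calibration day, and the days are averaged over the runs. The function names and the deviation values below are hypothetical and serve only to illustrate the bookkeeping; they are not measured data.

```python
def days_until_recalibration(deviations, allowed, check_interval_days=15):
    """Day of the first periodic check whose deviation exceeds the allowance.

    deviations: absolute deviations from the standard meter, one per check,
    in chronological order. Returns None if the allowance is never exceeded.
    """
    for check_number, deviation in enumerate(deviations, start=1):
        if deviation > allowed:
            return check_number * check_interval_days
    return None


def average_recalibration_period(runs, allowed):
    """Average re-calibration day over the runs that needed re-calibration."""
    days = [days_until_recalibration(run, allowed) for run in runs]
    days = [d for d in days if d is not None]
    return sum(days) / len(days) if days else None


# Hypothetical CO2 deviations (vol%) for three test runs, illustration only.
runs = [
    [0.05, 0.08, 0.12, 0.21, 0.35],
    [0.04, 0.09, 0.15, 0.19, 0.24, 0.31],
    [0.06, 0.11, 0.28],
]
print(average_recalibration_period(runs, allowed=0.25))
```

With these illustrative numbers the three runs trip the allowance at days 75, 90, and 45, giving an average of 70 days. In practice the same calculation would be repeated per culture parameter, with the smallest average setting the re-calibration schedule.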
Fig. 6 Flowchart showing the algorithm for determining the average re-calibration period
5 Performance Evaluation
After around 2–3 min of continuous running, the sensors provide stable and accurate output. Typical 24-hour plots of the sensor outputs versus time are shown in Fig. 7a–d. During the period under consideration (starting at 03:12 and ending at 02:28 the next day), the incubator chamber under test was intentionally opened three times (at 03:35, 12:55, and 21:22). The CO2 percentage decreased rapidly at those instants because of the sudden gas leakage (Fig. 7a). A rapid increase in VOC is also observed in Fig. 7b, caused by the exposure of the incubation chamber to the outside environment. However, no significant change in incubator temperature or humidity is observed, since the chamber was open only briefly (less than 30 s) each time. This test was carried out after 100 days of continuous running of the prototype without re-calibration.
Fig. 7 Typical 24 h plots of sensor output data versus time
6 Conclusion
A real-time display of culture conditions, long-term storage of those data, and emergency local and remote alerts will be immensely helpful to embryologists culturing embryos in IVF laboratories. Remote access to real-time culture conditions and to the full culture data in various forms will revolutionize the IVF process. By incorporating additional sensors as required, such as O2, N2, CO, NH3, and CH4 gas sensors or an atmospheric pressure sensor, the same system can be used as an air quality monitoring system, a local weather monitoring system, or a toxic gas detection system for residential houses, hospitals, fabrication laboratories, mines, etc. Therefore, more than one product can be redesigned, re-implemented, and commercialized as by-products or secondary products of the proposed research.
References
1. Steptoe, P.C., Edwards, R.G.: Lancet 2(2), 366–372 (1978)
2. The History of IVF: The Milestones, Research & Education, IVF Worldwide, CME Congress (2008). https://ivf-worldwide.com/ivf-history.html. Accessed on 19th Feb 2022
3. The History of IVF, ISAR 2019 Report. Head Office: Flat 23-A, 2nd Floor, Elco Arcade, Hill Road, Bandra (W), Mumbai 400 050, India
4. Liu, J., Zhou, Y.H., Wang, X.X., Tong, L.X., Li, Y.H., Liu, L., Xu, Z.Y., Wang, H.H.: Effects of different types of incubators on embryo development and clinical outcomes. Res. Square 1–15 (2020). https://doi.org/10.21203/rs.3.rs-122464/v1
5. Khoudja, R.Y., Xu, Y., Li, T., Zhou, C.: J. Assist. Reprod. Genet. 30(1), 69–76 (2013)
6. Upanana: Real World IoT applications in different domains. Edureka, 25th November, 2020. https://www.edureka.co/blog/iot-applications/. Accessed on 19th Feb 2022
7. Balas, V.E., Son, L.H., Jha, S., Khari, M., Kumar, R. (eds.): Internet of Things in Biomedical Engineering. Elsevier (2019). ISBN: 978-0-12-817356-5
8. Yovich, J.: History of IVF from the Indian perspective. Report of the Project Entitled “Historical assisted reproduction”, Australia (2020). https://www.researchgate.net/publication/342170301_History_of_IVF_from_the_Indian_Perspective. Accessed on 19th Feb 2022
9. Hanwell CO2 Monitor for IVF Incubators, Ellab Monitoring Solutions Limited: https://hanwell.com/news/co2-monitor-ivf-incubators/. Accessed on 19th Feb 2022
10. Incubator Monitoring System for In Vitro Fertilization Clinic, CAS DataLoggers: https://www.dataloggerinc.com/resource-article/incubator-monitoring-system/. Accessed on 19th Feb 2022
11. Boston IVF ensures ideal CO2 for embryos in assisted reproduction with Vaisala calibration, Vaisala: https://www.vaisala.com/en/case/boston-ivf-ensures-ideal-co2-embryos-assisted-reproduction-vaisala-calibration. Accessed on 19th Feb 2022
12. Log & Guard System, Vitrolife: https://www.vitrolife.com/products/lab-qc-systems/log-guard/. Accessed on 19th Feb 2022
FPGA Implementations and Performance Analysis of Different Routing Algorithms for the 2D-Mesh Network-On-Chip
Vulligadla Amaresh, Rajiv Ranjan Singh, Rajeev Kamal, and Abhishek Basu
Abstract Multiprocessor system-on-chip (MPSoC) architectures with an increasing number of cores benefit from Network-on-Chip (NoC) communication, which is robust and scalable. Router performance ultimately depends on the router microarchitecture and its routing algorithm, which determine throughput and latency. Because it governs how packets travel through the network, the routing algorithm is a crucial design choice for an NoC router. In look-ahead routing, each router first computes the desired output port, taking its local congestion into account, and then transfers the flits to the desired output port of the neighboring router. This paper proposes a synchronous predictive routing computation for the Distributed Scalable Predictable Interconnect Network (DSPIN), along with the RTL implementation and analysis of some popular routing algorithms for the 2D-mesh network topology. Parameterized Verilog HDL has been used for the RTL implementations, and power, area, and delay (PAD) analysis has been carried out with Xilinx Vivado. Power dissipation of the design has been estimated with the help of the Xpower analyzer and Xpower estimator.
Keywords NOC · DSPIN · MPSoC · Communication-centric · FPGA · Look-ahead routing
V. Amaresh (B) · R. R. Singh
Presidency University Bengaluru, Bengaluru, India
R. Kamal
Samsung Semiconductor India Research, Bengaluru, India
A. Basu
RCC Institute of Information Technology, Kolkata, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_15
1 Introduction
In modern interconnection infrastructure for high-speed, communication-centric platforms, the Network-on-Chip (NoC) plays a vital role in shaping the design strategy of the Multiprocessor System-on-Chip (MPSoC) [1]. Quality of service (QoS), signal integrity, reliability, and robustness have become serious concerns for conventional on-chip bus-based interconnection networks [2]. These issues arise from the scaling down of device feature sizes in line with Moore's law. The NoC is a newer paradigm that replaces the traditional bus-based SoC interconnect and addresses problems such as robustness, data integrity, signal integrity, and QoS in current sophisticated MPSoC architectures. NoC concepts are borrowed from local area networks (LANs) in computer networking and rely entirely on packet switching [3]. As a reliable on-chip communication platform, the NoC could be one of the best solutions for giga-scale MPSoC designs and architectures [4]. Conventional interconnects, specifically shared-bus architectures, offer little extensibility or flexibility when dealing with a large number of cores within an MPSoC, and the NoC is intended to address these problems. Since NoC routers work on the principle of packet switching, they are responsible for routing flits (the small parts of a data packet, e.g., head flits, body flits, and tail flits) from the input buffers of the router to the output ports, with proper arbitration and a flexible routing algorithm [6]. Routers make their decisions based on the routing algorithm and the network topology, and the decision can be decentralized or centralized [7]. In developing the architecture of an NoC router, the major criterion is to select a suitable algorithm for distributing packets over the links [8].
Three of the most popular routing algorithms for NoCs today are balanced dimension-order routing, X–Y routing, and adaptive X–Y routing. Balanced dimension-order routing is very useful for random traffic, i.e., traffic that is distributed uniformly; in that case, it provides the highest performance across all links [4]. The X–Y algorithm works on the principle of minima and maxima: flits are routed first in the X direction and then in the Y direction until they reach the destination coordinates. In adaptive X–Y routing, the module senses the traffic on the link before sending packets and, if there is no congestion, routes the packets normally. For an unbalanced distribution of traffic across the links, dimension-order and X–Y routing perform poorly (real network traffic is dynamic and bursty in nature). Thus, modern NoC-based MPSoCs prefer adaptive routing over the other routing algorithms. NoC routers are responsible for maintaining link connectivity within the on-chip interconnection architecture, and for upholding the properties of the routing algorithm and the flow control mechanism between the source and destination nodes when multiple packets flow through the network [9]. Following the routing algorithm, the router module first calculates the destination address from the flits of the source request. Once the address translation is done, it routes the packet accordingly toward the next hop, which is closer to the destination. The address translation is performed by the routing computation module, which is implemented using turn-prohibiting techniques, a lookup table, or simple 8b/10b encoding techniques [10]. Consider a situation in which the head flit carries both a request for an output port of the current router and the destination address. This is the situation in look-ahead routing computation: the output-port request is computed before the head flit reaches the router at which it is used. SGI Spider was the first network design to accommodate look-ahead routing computation (LRC) [11], and the concept was combined with adaptive routing in [12]. The rest of this paper is organized as follows: Sect. 2 reviews related work on look-ahead routing computation techniques; Sect. 3 describes the baseline route computation placement and the look-ahead routing module designs; Sect. 4 describes three routing algorithms, along with their implementation and analysis based on PAD parameters; and Sect. 5 concludes the paper with a brief discussion of future aspects.
2 Literature Survey
The design and implementation of routing for NoCs have been addressed in several prior works. Jingcao proposed a routing algorithm named DyAD, which combines adaptive routing with deterministic X–Y routing [13]. X–Y routing is selected when traffic on the network is light, and whenever there is heavy traffic or congestion on the link, the routers switch to the second (adaptive) type of routing. Kim et al. proposed a router architecture with a two-clock, fully adaptive predictive routing module; based on congestion-aware flow control, one clock cycle is used to pre-compute the output direction in advance [14]. Nadera Najib proposed a virtual-channel wormhole NoC router that uses a partially adaptive look-ahead routing algorithm for low latency [15]. The router module selects the preferred output port on the basis of congestion and transfers the flits to the neighboring router. Manevich proposed a compact adaptive routing scheme that adaptively selects deterministic X–Y or Y–X routing for each source–destination pair as a function of the monitored global NoC traffic [16]. Binzhang proposed the Abacus-turn-model (AbTM) for wormhole routing, which is efficient in design time and area and has a reconfigurable architecture [17]. To reduce chip area and routing latency, this model uses the concept of dynamic communication patterns. The probabilistic odd–even (POE) technique, developed by Hu et al., addresses network performance issues during heavy traffic and congestion and provides minimum flit delivery time [18]. A fully adaptive routing algorithm, Whole Packet Forwarding (WPF), was developed for virtual-channel routers by Sheng et al. [19].
Fig. 1 Baseline RC placement
3 Route Computation Modules
3.1 Route Computation (RC)
Route computation applies the routing algorithm that is used across the network and governs how packets move through it. The routing algorithm is the core property of any router; its job is to forward arriving packets to the proper output path. In source routing, also known as path addressing in computer networking, the incoming packet carries the exact path to be followed to reach the destination. In distributed routing, the incoming packet carries only the destination address, and the routing computation in each router generates the local output request ID that is used to leave the current router. Route computation for generating the local ID can be implemented with a simple lookup table [20]. In adaptive routing, packets can reach the final destination via multiple paths, and the eligible output ports are calculated by the RC logic (Fig. 1).
3.2 Look-Ahead RC
In a dynamic algorithm, the RC unit obtains the destination address from the packet header and translates it into an output-port request according to the request-ID generation and the rules of the algorithm. In the baseline RC, the routing computation is performed first, and only then are the other router tasks, such as request-ID generation and arbitration, executed. This is a serial implementation. The serialization can be avoided by carrying the output-port request ID, along with the destination address, in the received header flit. To do this, the look-ahead RC computes the output port for the header flit one hop ahead, so that the request is available before the packet reaches the router at which it is used. From the implementation point of view, as shown in Fig. 2, the RC is calculated in parallel with link traversal. At input X of the router, the pre-computed port request is part of the head flit. The same holds for the inputs of the neighboring routers A, B, and C, for which router X holds the RCA, RCB, and RCC modules in place of the links. The output-port request vector of the flit at input X identifies which of the links to the next routers the flit will take, and it is used as the select line of the multiplexer, as illustrated in Fig. 3. Each RC unit computes an output-port request, and the selected request is added to the header arriving at the input of the router. In this way, the port that the packet must take at the next router is already determined inside the current router. For example, if a packet arrives at X and must go to B as its next router, the port request from RCB is picked and added to the header.
Fig. 2 Look-ahead RC (In-parallel-to-Link-traversal)
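The look-ahead idea described above can be illustrated with a small behavioral model: the current router already knows its own output port from the arriving header, so it only has to compute the request for the next router and write it into the header. This is a software sketch under stated assumptions, not the RTL of the paper. The functions `route` and `look_ahead_request` are hypothetical names, XY order is used merely as a placeholder routing function, and a positive Y offset is mapped to SOUTH to match the convention of Algorithm 1 later in the paper.

```python
# Coordinate step implied by each output port (Y grows toward SOUTH).
STEP = {"EAST": (1, 0), "WEST": (-1, 0), "SOUTH": (0, 1), "NORTH": (0, -1)}


def route(node, destination):
    """Placeholder routing function (XY order); any algorithm could be used."""
    dx, dy = destination[0] - node[0], destination[1] - node[1]
    if dx:
        return "EAST" if dx > 0 else "WEST"
    if dy:
        return "SOUTH" if dy > 0 else "NORTH"
    return "LOCAL"


def look_ahead_request(node, port_here, destination):
    """Port request for the NEXT router, computed at the current router.

    port_here is the request carried in the arriving header flit; it selects
    the neighbour to which the flit is forwarded.
    """
    if port_here == "LOCAL":
        return None  # the packet is ejected here; there is no next router
    step_x, step_y = STEP[port_here]
    next_node = (node[0] + step_x, node[1] + step_y)
    return route(next_node, destination)


# The source computes the first request; every later hop reuses the request
# that the previous router wrote into the header.
node, destination = (0, 0), (2, 1)
request = route(node, destination)
while request != "LOCAL":
    next_request = look_ahead_request(node, request, destination)
    print(node, "forwards via", request, "and writes", next_request)
    step_x, step_y = STEP[request]
    node = (node[0] + step_x, node[1] + step_y)
    request = next_request
```

Because each router uses a request that was prepared one hop earlier, the route computation no longer sits on the critical path ahead of arbitration; in hardware it can overlap with link traversal, as the text describes.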
4 Analysis and Implementations of Routing Algorithms
This section presents classic XY, adaptive XY, and balanced dimension-order routing, along with their implementation and analysis based on area, power, and delay.
Fig. 3 Parallel input look-ahead RC
4.1 Classic XY Routing
Routing algorithms and design methodologies based on XY geometry are widely accepted. For a 2D mesh, the distance between the source and destination coordinates (X, Y) is easy to calculate as the sum of the offsets in each dimension. XY routing is a dimension-order algorithm: it reduces the offset in one dimension to zero before considering the offset in the next dimension. In classic XY routing, packets travel along the X dimension until they reach the destination column; they are then sent in the Y direction to reach their destination. Packets are routed in ascending or descending order, reducing the offset in one dimension before routing in the next, as shown in Algorithm 1. XY routing has several advantages, including ease of hardware implementation and the avoidance of out-of-order flit delivery at the destination core. However, because it ignores the network state, it performs poorly in real-world applications. A synchronous look-ahead routing computation module based on the XY scheme has been written in Verilog HDL. Synthesis has been carried out with Xilinx Vivado, and the design has been verified with Modelsim Questa. The synthesis report in Table 1 lists the devices used, the delay in nanoseconds, and the power dissipated in watts.
Table 1 Synchronized look-ahead NoC routing computation module using the XY approach

Device used                        Delay                               Power dissipated
8 × 4-bit single-port RAM: 1       3.142 ns                            1.146 W
3-bit subtractor: 2                (0.142 ns logic, 2.009 ns route)
2-bit register: 2                  (6.6% logic, 93.4% route)
3-bit register: 1
32-bit comparator greater: 4
3-bit 2-to-1 multiplexer: 3
Four graphical representations of the power analysis are presented in Fig. 4. The module's power analysis was completed and plotted using Xilinx tools.

Fig. 4 Power analysis for XY (Xpower analyzer)

Algorithm 1 2D-Mesh Topology-Based XY Routing
Input: Current node co-ordinates (Xc, Yc); Destination node co-ordinates (Xd, Yd)
Output: Next Port Number
Begin
  Direction of Next Port: EAST, WEST, NORTH, SOUTH, and LOCAL;
  Xdiff := Xd - Xc;
  Ydiff := Yd - Yc;
  if (Xdiff = 0) and (Ydiff = 0) then
    Next Port Number = LOCAL;
  else if (Xdiff > 0) then
    Next Port Number = EAST;
  else if (Xdiff < 0) then
    Next Port Number = WEST;
  else if (Ydiff > 0) then
    Next Port Number = SOUTH;
  else if (Ydiff < 0) then
    Next Port Number = NORTH;
  end if
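The per-hop decision of classic XY routing can be sketched in software as follows. This is a minimal illustration in Python, not the authors' Verilog module; the names `Port`, `xy_next_port`, and `xy_route` are hypothetical, and the sketch assumes that Y increases toward SOUTH, as the port choices in Algorithm 1 suggest.

```python
from enum import Enum


class Port(Enum):
    LOCAL = "LOCAL"
    EAST = "EAST"
    WEST = "WEST"
    NORTH = "NORTH"
    SOUTH = "SOUTH"


def xy_next_port(xc, yc, xd, yd):
    """Return the output port chosen by classic XY routing for one hop."""
    x_diff = xd - xc
    y_diff = yd - yc
    if x_diff == 0 and y_diff == 0:
        return Port.LOCAL
    if x_diff > 0:
        return Port.EAST
    if x_diff < 0:
        return Port.WEST
    # The X offset is zero here, so only the Y offset remains.
    # Assumption: Y grows toward SOUTH.
    return Port.SOUTH if y_diff > 0 else Port.NORTH


def xy_route(src, dst):
    """List the ports taken hop by hop from src to dst (hypothetical helper)."""
    step = {Port.EAST: (1, 0), Port.WEST: (-1, 0),
            Port.SOUTH: (0, 1), Port.NORTH: (0, -1)}
    x, y = src
    path = []
    while True:
        port = xy_next_port(x, y, dst[0], dst[1])
        path.append(port)
        if port is Port.LOCAL:
            return path
        dx, dy = step[port]
        x, y = x + dx, y + dy
```

Calling `xy_route((0, 0), (2, 1))` walks the packet along X first and only then along Y, which is the ordering that keeps the scheme deterministic.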
4.2 Adaptive X–Y Routing
Adaptive XY routing extends the traditional XY algorithm. Among the X and Y dimensions that lead toward the destination along a minimal path, the packet is routed to the less congested one. Because routing decisions take changes in traffic into account, the algorithm improves performance and enables congestion control. The method gathers routing information either from adjacent nodes or from all nodes. The adaptive XY algorithm behaves deterministically when the network is uncongested or only slightly congested; when the network becomes congested, it operates in adaptive mode. The adaptive XY algorithm distributes traffic more evenly across the whole network, whereas traditional XY routing places more load on the center of the network than on its edges. The algorithm is shown in ALGO-2. In terms of implementation, adaptive XY requires more hardware than the XY algorithm and therefore consumes more power. As before, parametric Verilog HDL is used to create the synchronous look-ahead routing computation using the adaptive XY method. This parametric Verilog RTL code was synthesized using Xilinx ISE 14.7, and Table 2 shows the synthesis report. Compared with the XY synthesis report in Table 1, adaptive XY demands more area and more devices. While power dissipation is nearly the same in both cases, the adaptive XY algorithm incurs a greater delay than the conventional XY algorithm. The routing module's power is analyzed using the Xpower analyzer, and the results are shown in Fig. 5.
Table 2 Synthesis report of the synchronous look-ahead NoC routing computation using the adaptive XY module
Device used | Delay | Power dissipated
8 × 4-bit single-port RAM: 1 | 3.85 ns (1.019 ns logic, 2.831 ns route; 26.46% logic, 73.532% route) | 1.15 W
3-bit subtractor: 2 | |
2-bit register: 2 | |
3-bit register: 1 | |
32-bit comparator greater: 4 | |
3-bit 2-to-1 multiplexer: 3 | |
4-bit 7-to-1 multiplexer: 1 | |
Fig. 5 Adaptive XY power analysis (Xpower analyzer)
Algorithm 2 2D-Mesh Topology-Based XY Routing
Input: Current node co-ordinates (Xc, Yc); Destination node co-ordinates (Xd, Yd); Congestion Control (West_Vs_South, West_Vs_North, East_Vs_North, East_Vs_South);
Output: Next Port Number
Begin
  Direction of Next Port: EAST, WEST, NORTH, SOUTH, and LOCAL;
  Xdiff := Xd - Xc;
  Ydiff := Yd - Yc;
  if (Xdiff = 0) and (Ydiff = 0) then
    Next Port Number = LOCAL;
  else if (Xdiff > 0) then
    if (Ydiff > 0) then
      if (Congestion Control[East_Vs_South]) then Next Port Number = EAST;
      else Next Port Number = SOUTH;
      end if
    else if (Ydiff < 0) then
      if (Congestion Control[East_Vs_North]) then Next Port Number = EAST;
      else Next Port Number = NORTH;
      end if
    else
      Next Port Number = EAST;
    end if
  else if (Xdiff < 0) then
    if (Ydiff > 0) then
      if (Congestion Control[West_Vs_South]) then Next Port Number = WEST;
      else Next Port Number = SOUTH;
      end if
    else if (Ydiff < 0) then
      if (Congestion Control[West_Vs_North]) then Next Port Number = WEST;
      else Next Port Number = NORTH;
      end if
    else
      Next Port Number = WEST;
    end if
  else if (Ydiff > 0) then
    Next Port Number = SOUTH;
  else if (Ydiff < 0) then
    Next Port Number = NORTH;
  else
    Next Port Number = LOCAL;
  end if
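The adaptive decision can be sketched in the same way. The following Python function is an illustration only, under stated assumptions: the congestion inputs are modelled as a dictionary `prefer_x` (a hypothetical encoding) that maps a pair name such as `"East_Vs_South"` to `True` when the X-direction port should be taken; Y is assumed to increase toward SOUTH; and a packet with a single remaining offset simply continues in that dimension.

```python
def adaptive_xy_next_port(xc, yc, xd, yd, prefer_x):
    """Pick the next port for adaptive XY routing on a 2D mesh.

    prefer_x maps "<X port>_Vs_<Y port>" to True when the X-direction
    port is the less congested choice (hypothetical flag encoding).
    """
    x_diff = xd - xc
    y_diff = yd - yc
    x_port = "EAST" if x_diff > 0 else "WEST" if x_diff < 0 else None
    y_port = "SOUTH" if y_diff > 0 else "NORTH" if y_diff < 0 else None

    if x_port is None and y_port is None:
        return "LOCAL"
    # Only one dimension still has an offset: no choice to make.
    if x_port is None:
        return y_port
    if y_port is None:
        return x_port
    # Both dimensions are productive: let the congestion flag decide.
    flag = f"{x_port.capitalize()}_Vs_{y_port.capitalize()}"
    return x_port if prefer_x[flag] else y_port
```

With every flag set to `True` the function reduces to classic XY behaviour, which mirrors the deterministic mode described for an uncongested network.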
4.3 Router with Balanced Dimension-Order
In balanced dimension-order routing, packets always move along the dimension with the larger remaining offset. It is also regarded as a deadlock-free routing algorithm. Because of its straightforward routing rules, a balanced minimal path is offered for every destination. Many NoC designers adopt balanced dimension-order routing due to the ease of router design. With advantages such as deadlock-free operation, shortest paths, and straightforward realization, it is frequently used as a deterministic routing algorithm and is compatible with the 2D-Mesh topology. ALGO-3 displays the algorithm. An RTL-level module was constructed using Verilog HDL in Xilinx Vivado to obtain the synthesis report of the synchronous look-ahead routing computation. According to the synthesis statistics in Table 3, the synchronous look-ahead routing computation for the balanced dimension-order routing algorithm uses more device macros than the adaptive XY and XY algorithms. The balanced dimension-order algorithm has a longer delay than traditional XY but a shorter delay than adaptive XY. In all three cases, power dissipation is essentially the same. The Xpower analyzer analyzes the power of the module and plots the graphs shown in Fig. 6.

Table 3 Synthesis report of the synchronous look-ahead NoC routing computation using balanced dimension-order routing
Device used | Delay | Power dissipated
8 × 4-bit single-port RAM: 1 | 3.399 ns (0.951 ns logic, 2.448 ns route; 27.97% logic, 72.03% route) | 1.149 W
2-bit subtractor: 2 | |
2-bit register: 2 | |
3-bit register: 1 | |
2-bit comparator greater: 3 | |
3-bit 2-to-1 multiplexer: 4 | |
3-bit 2-to-1 multiplexer: 3 | |

Fig. 6 Xpower analyzer power investigation report of balanced dimension-order

Algorithm 3 2D-Mesh Topology-Based XY Routing
Input: Current node co-ordinates (Xc, Yc); Destination node co-ordinates (Xd, Yd)
Output: Next Port Number
Begin
  Direction of Next Port: EAST, WEST, NORTH, SOUTH, and LOCAL;
  if (Xd > Xc) then
    X_addrlow = Xc; X_addrhigh = Xd;
  else
    X_addrlow = Xd; X_addrhigh = Xc;
  end if
  Xdiff := X_addrhigh - X_addrlow;
  if (Yd > Yc) then
    Y_addrlow = Yc; Y_addrhigh = Yd;
  else
    Y_addrlow = Yd; Y_addrhigh = Yc;
  end if
  Ydiff := Y_addrhigh - Y_addrlow;
  if (Xdiff = 0) and (Ydiff = 0) then
    Next Port Number = LOCAL;
  else if (Xdiff > Ydiff) then
    if (Xd > Xc) then Next Port Number = EAST;
    else Next Port Number = WEST;
    end if
  else if (Xdiff < …

… $\langle v_i h_j \rangle_{\text{model}}$. It can be obtained by alternating Gibbs sampling, starting from any random state of the visible units and continuing for an extended period of time. In one cycle of alternating Gibbs sampling, all hidden units are updated simultaneously using (2), and then all visible units are updated simultaneously using (3). The states of the visible units are first set to a training vector. The binary states of the hidden units are then all computed in parallel using (2). Once the binary states of the hidden units have been determined, a reconstruction is created by setting each $v_i$ to 1 with a probability given by (3). The weight change is then determined by (4):
$$\Delta w_{ij} = \epsilon \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{recon}} \right) \tag{4}$$
where $\langle v_i h_j \rangle_{\text{data}}$ is computed from the original image and $\langle v_i h_j \rangle_{\text{recon}}$ from the reconstructed image. With the aim of decreasing the reconstruction error in each layer, the RBMs train each hidden layer in a stack-wise fashion. Thus, the hidden layers preserve as much of the input representation as possible, which is then mapped to the intended output. Weight updates control how faithfully the RBM outputs can reconstruct the inputs and how far they can stray from the target value. Better generalization is achieved by ensuring that the input topology is preserved in the hidden layers. With the exception of the weights between the penultimate and output layers, all weights between neighboring layers in the proposed technique have been trained using stacked RBMs. The reconstruction error, which is reduced with respect to the network weights during pre-training, serves as the RBM's cost function. In this work, gradient descent optimization is used to minimize the cost function. Let us examine the proposed approach in depth.
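The reconstruction step and the weight change in (4) can be sketched for a single training vector as follows. This is a minimal pure-Python illustration, not the authors' MATLAB implementation. It assumes that (2) and (3) are the usual sigmoid conditionals of a binary RBM, omits the bias terms for brevity, and uses sampled binary states in both phases; the names `sample_hidden`, `sample_visible`, and `cd1_update` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, w, rng):
    """Sample binary hidden states given visible states (assumed form of (2))."""
    return [int(rng.random() < sigmoid(sum(v[i] * w[i][j] for i in range(len(v)))))
            for j in range(len(w[0]))]


def sample_visible(h, w, rng):
    """Sample a binary reconstruction given hidden states (assumed form of (3))."""
    return [int(rng.random() < sigmoid(sum(h[j] * w[i][j] for j in range(len(h)))))
            for i in range(len(w))]


def cd1_update(v_data, w, eps, rng):
    """Return new weights after one data/reconstruction step, following (4)."""
    h_data = sample_hidden(v_data, w, rng)      # hidden states driven by the data
    v_recon = sample_visible(h_data, w, rng)    # reconstruction of the input
    h_recon = sample_hidden(v_recon, w, rng)    # hidden states driven by the reconstruction
    return [[w[i][j] + eps * (v_data[i] * h_data[j] - v_recon[i] * h_recon[j])
             for j in range(len(w[0]))]
            for i in range(len(w))]
```

Here `w[i][j]` connects visible unit `i` to hidden unit `j`. In a batch setting the two products would be averaged over all training vectors before the update is applied.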
3.2 Pre-training
Let the data set $D = \{(v_1, t_1), (v_2, t_2), \ldots, (v_P, t_P)\}$ contain $P$ patterns, where the input and the output are $v_i \in \mathbb{R}^n$ and $t_i \in [0, 1]$, respectively. During pre-training, RBMs are used to create a hidden representation of the input. Each RBM receives the input $v_i$ and has $h$ hidden nodes. By adjusting the weighted connections between successive layers, the RBM reconstructs its input. The output of the hidden layer is given in (2), and the function of the RBM's output layer is explained in (4). In this paper, a bias is included in the preceding layer, which increases the dimension of the input vector $v$ by 1; thus, $v = (v_1, v_2, v_3, \ldots, v_{n+1})$, where $v_1 = 1$. $W_{1h} = b_h$ represents the bias corresponding to the hidden node $h$. The cost function of the RBM is given in (4). The trained weights $W$ between the input and hidden layers produce a hidden representation of the input $v_i$ in the form of $s_i(h)$.
3.3 Training
Following the pre-training of the weights, batch-mode MLP training is performed on the weights $W$. A single batch contains all the training samples in the training data. For the given data set, the output of the hidden layer is computed as in (5):
$$h = f(W v_i) \tag{5}$$
where $f(\cdot)$ denotes the sigmoid activation function. In this paper, an MLP with two hidden layers has been used. Here, the output of the hidden layer is $s_i(h)$. The output of the output layer is given in (6):
$$s_i(y) = f(W^{\text{class}} s_i(h)) \tag{6}$$
Here, the weight between the penultimate and output layers is $W^{\text{class}}$, and the bias value of the output node $o$ is $W^{\text{class}}_{1o}$. The intended output $d_i$ is compared to the actual output $s_i(y)$ to determine the cost function for the entire network, which is as follows:
$$e_i = \frac{1}{2} \sum_{k=1}^{c} \left( d_i^k - s_i(y^k) \right)^2 \tag{7}$$
The network is then trained for its set of parameters, $W^{\text{class}}$ and $W$, using gradient descent-based backpropagation to minimize the loss function.
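The forward computation in (5) and (6) and the cost in (7) can be sketched as follows. This is an illustrative pure-Python fragment, assuming a single hidden layer for brevity and weight matrices stored row-wise (one row per receiving node); the helper names `layer_output` and `squared_error` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(weights, inputs):
    """Apply f(W x): each row of weights holds one node's incoming weights."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def squared_error(desired, actual):
    """Cost of one pattern as in (7): half the summed squared differences."""
    return 0.5 * sum((d - a) ** 2 for d, a in zip(desired, actual))


# Example with all-zero weights: every sigmoid output is 0.5.
v = [1.0, 0.2, 0.7]                   # input with the leading bias component v1 = 1
W = [[0.0, 0.0, 0.0] for _ in range(2)]      # input-to-hidden weights
W_class = [[0.0, 0.0] for _ in range(2)]     # hidden-to-output weights
hidden = layer_output(W, v)
output = layer_output(W_class, hidden)
cost = squared_error([1.0, 0.0], output)
```

In the proposed scheme the matrix `W` would start from the pre-trained RBM weights rather than from zeros or random values, and backpropagation would then adjust both matrices.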
4 Results and Analysis
4.1 Implementation Settings
Two standard image data sets from the Kaggle website were used to present the results [23]. The complete details of the data sets are given in Table 1. The effectiveness of the proposed model has been assessed using MATLAB's neural network toolbox. The initialization range for the weights in the MLP network is [−0.5, +0.5]. Two hidden layers with 10 hidden nodes each have been used in the proposed weight initialization approach as well as in the standard MLP model. For all the data sets, we use eta1 = 0.5 and eta2 = 0.3, where eta1 and eta2 are the learning rates for pre-training and for the subsequent training, respectively. Both the pre-training and the training of the network have been carried out using batch-mode learning with 10,000 epochs. Z-score normalization is first performed on the training data set; the resulting mean and standard deviation are then used to apply z-score normalization to the test data.
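The normalization step described above can be sketched for a single feature column as follows. This is a minimal illustration using only the Python standard library; the function names are hypothetical, and the use of the sample standard deviation (divisor N − 1) is an assumption, since the text does not state which estimator was used.

```python
import statistics


def fit_zscore(train_column):
    """Compute the mean and standard deviation from training data only."""
    mean = statistics.fmean(train_column)
    std = statistics.stdev(train_column)  # assumption: sample standard deviation
    return mean, std


def apply_zscore(column, mean, std):
    """Normalize any column (training or test) with the training statistics."""
    return [(x - mean) / std for x in column]


mean, std = fit_zscore([1.0, 2.0, 3.0])
test_scaled = apply_zscore([2.0, 4.0], mean, std)
```

Reusing the training mean and standard deviation for the test data keeps information about the test set out of the preprocessing. A constant feature would give a zero standard deviation and needs separate handling.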
4.2 Data Sets
CIFAR-10: The CIFAR-10 data set [12] is a collection of images used for training computer vision and image recognition algorithms. It contains 60,000 color images at 32 × 32 resolution in ten classes of 6000 images each: cars, ships, trucks, frogs, airplanes, cats, deer, birds, dogs, and horses. The training data set contains 50,000 images, and the test data set contains 10,000 images. The training batch contains 10,000 images, 1000 from each class.
STL-10: The STL-10 data set is a modified version of the CIFAR-10 data set. The resolution of its images is 96 × 96, which is much higher than that of CIFAR-10. Compared with CIFAR-10, each class in this data set contains fewer labeled training samples: for each class, there are 500 training images and 800 test images. It contains ten classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, and truck. In addition, 100,000 unlabeled images are provided for unsupervised learning, which helps in learning image models prior to supervised training. The training batch contains all the training images.

Table 1 Data sets (#F: number of features, #C: number of classes)
Data set | #F | #C | Number of samples
CIFAR-10 | 6000 | 10 | 60,000
STL-10 | 5000 | 3 | 15,000

Table 2 Classification accuracy (%) of the proposed methodology and the conventional MLP
Data set | MLP | Proposed method
CIFAR-10 | 79.10 ± 2.63 | 81.02 ± 1.06
STL-10 | 64.88 ± 1.56 | 72.08 ± 2.52
4.3 Comparison
On the CIFAR-10 and STL-10 data sets, ten repetitions of ten-fold cross-validation are used for both models: the conventional MLP and the proposed approach. Figures 3 and 4 compare the training accuracy per epoch of the MLP and the proposed method for the CIFAR-10 and STL-10 data sets, respectively. To compare the proposed model with the conventional MLP, the means and standard deviations over these ten runs of ten-fold cross-validation have been calculated. Columns 2 and 3 of Table 2 show the results of the conventional MLP and the proposed model, respectively. The outcomes suggest that the proposed model performs better on these two real-world data sets.
Fig. 3 Training accuracy per epoch for CIFAR-10 data image
Fig. 4 Training accuracy per epoch for STL-10 data image
5 Conclusion
This paper is motivated by the need to develop suitable training algorithms for deep architectures, since these can be much more representationally efficient than shallow ones, i.e., neural nets with one hidden layer. It proposes a weight initialization method that pre-trains the network using stacked RBM layers. The experiments on the two standard data sets, CIFAR-10 and STL-10, show that the proposed model outperforms an MLP with randomly initialized weights. As with previous pre-training methodologies, the goal is to provide an improved initial point for gradient-based learning. The vanishing gradient problem can occur in a DNN due to inappropriate hyper-parameter settings. Furthermore, appropriate hyper-parameter values depend on the given data set. In the future, we will focus on adaptive hyper-parameter tuning for a given data set.
References
1. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H., et al.: Greedy layer-wise training of deep networks. Adv. Neural Inf. Process. Syst. 19, 153 (2007)
2. Ciregan, D., Meier, U., Schmidhuber, J.: Multi-column deep neural networks for image classification. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3642–3649. IEEE (2012)
3. Dahl, G.E., Yu, D., Deng, L., Acero, A.: Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process. 20(1), 30–42 (2012)
4. Divya, S., Adepu, B., Kamakshi, P.: Image enhancement and classification of CIFAR-10 using convolutional neural networks. In: 2022 4th International Conference on Smart Systems and Inventive Technology (ICSSIT), pp. 1–7. IEEE (2022)
5. Erhan, D., Bengio, Y., Courville, A., Manzagol, P.A., Vincent, P., Bengio, S.: Why does unsupervised pre-training help deep learning? J. Mach. Learn. Res. 11, 625–660 (2010)
6. Erhan, D., Manzagol, P.A., Bengio, Y., Bengio, S., Vincent, P.: The difficulty of training deep architectures and the effect of unsupervised pre-training. In: Artificial Intelligence and Statistics, pp. 153–160 (2009)
7. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
8. Güler, O., Yücedağ, İ.: Hand gesture recognition from 2D images by using convolutional capsule neural networks. Arab. J. Sci. Eng. 47(2), 1211–1225 (2022)
9. Hendrycks, D., Gimpel, K.: Generalizing and improving weight initialization. arXiv preprint arXiv:1607.02488 (2016)
10. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
11. Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012)
12. Krizhevsky, A., Nair, V., Hinton, G.: CIFAR-10 (Canadian Institute for Advanced Research). 5(4), 1. http://www.cs.toronto.edu/~kriz/cifar.html (2010)
13. Larochelle, H., Bengio, Y., Louradour, J., Lamblin, P.: Exploring strategies for training deep neural networks. J. Mach. Learn. Res. 10, 1–40 (2009)
14. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)
15. Lee, C.Y., Xie, S., Gallagher, P., Zhang, Z., Tu, Z.: Deeply-supervised nets. In: Artificial Intelligence and Statistics, pp. 562–570 (2015)
16. Ludwig, O., Nunes, U., Araujo, R.: Eigenvalue decay: a new method for neural network regularization. Neurocomputing 124, 33–42 (2014)
17. Ngiam, J., Chen, Z., Chia, D., Koh, P.W., Le, Q.V., Ng, A.Y.: Tiled convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1279–1287 (2010)
18. Pascanu, R., Mikolov, T., Bengio, Y.: On the difficulty of training recurrent neural networks. In: International Conference on Machine Learning, pp. 1310–1318 (2013)
19. Rifai, S., Mesnil, G., Vincent, P., Muller, X., Bengio, Y., Dauphin, Y., Glorot, X.: Higher order contractive auto-encoder. Machine Learning and Knowledge Discovery in Databases, pp. 645–660 (2011)
20. Rifai, S., Vincent, P., Muller, X., Glorot, X., Bengio, Y.: Contractive auto-encoders: explicit invariance during feature extraction. In: Proceedings of the 28th International Conference on Machine Learning (ICML-11), pp. 833–840 (2011)
21. Seyyedsalehi, S.Z., Seyyedsalehi, S.A.: A fast and efficient pre-training method based on layer-by-layer maximum discrimination for deep neural networks. Neurocomputing 168, 669–680 (2015)
22. Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., Fergus, R.: Regularization of neural networks using dropconnect. In: International Conference on Machine Learning, pp. 1058–1066 (2013)
23. Wang, L., Liu, J., Chen, X.: Microsoft malware classification challenge (BIG 2015) first place team: say no to overfitting (2015)
24. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: European Conference on Computer Vision, pp. 818–833. Springer, Berlin (2014)
Assessment of Air Quality as a Positive Impact of COVID-19 Lockdown with Reference to an Industrial City and a Populated City of India Anurag Nayak, Tunnisha Dasgupta, Amit Shiuly, and Suman Koner
Abstract Owing to the spread of the novel coronavirus all over the world from January 2020, India underwent a lockdown for three consecutive months and thereafter partial lockdowns as and when necessary. Despite its several devastating effects, the lockdown certainly had a positive impact on ambient air quality. This study investigated how air quality improved from the pre-lockdown period to the lockdown period and how it degraded again during the post-lockdown period. Several air quality parameters were collected from the Central Pollution Control Board, Government of India, and plotted to analyse their variation over these periods. Two different kinds of cities, an industrial city (Bhiwadi, Rajasthan) and a highly populated city (Delhi), were considered for the present study.
Keywords Positive impact · COVID-19 lockdown · Air quality · Air pollutant
A. Nayak · T. Dasgupta · S. Koner (B)
Jalpaiguri Government Engineering College, Jalpaiguri, West Bengal 735102, India
A. Shiuly
Jadavpur University, Kolkata, West Bengal 700032, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_18

1 Introduction
The novel coronavirus started spreading all over the world, including India, in January 2020, and our country underwent a lockdown from 24 March to 31 May 2020 in four phases during the first and second waves of the pandemic. Thereafter, partial lockdowns continued in places as and when required. A third wave of COVID-19 spread from mid-January to mid-February 2022. The impact of the pandemic on millions of people across the world, including in our country, was truly devastating. The economic and social disruption, including malnutrition among resource-poor people, was huge. Despite the several negative impacts of this period, there was certainly one positive impact: an improvement in air quality and a reduction in the death rate caused by air pollution [1]. The threat of air pollution has been growing continuously, and in India, deaths due to air pollution in 2019 exceeded 1.66 million [2]. A significant rise in the atmospheric concentration of pollutants has been observed every year, particularly between October and February, due to firecrackers used in festivals and the burning of stubble, in addition to emissions from the burning of fossil fuels [3, 4]. However, during the lockdown period (2020), the meteorological conditions remained favourable with respect to the accumulation of pollutants as compared to 2018 or 2019 [5]. In this study, we have collected air pollutant data from two places, namely Bhiwadi, Rajasthan, and Delhi, the capital of India. These two places were chosen because Bhiwadi is one of the most industry-rich areas in India and Delhi is one of the most populated cities in India. The aim of this comparative study is to assess the difference in pollutant concentrations between an industry-rich area and a population-rich area. Thus, the main objective of the study is to analyse several air pollutant parameters before, during, and after the lockdown period and to assess how air quality improved during the lockdown and how it degraded again after the lockdown was lifted.
2 Study Area
Two Indian cities, Bhiwadi in Rajasthan and Delhi, the capital of India, have been selected for the study. The latitude and longitude of Bhiwadi and Delhi are 28.2014° N, 76.8276° E and 28.6139° N, 77.2090° E, respectively. The two cities were chosen on the basis of geographical, climatic, environmental, and other factors that may influence the pollution data, owing either to their dense population or to the density of industries in the area. Delhi is a densely populated city whose geographic conditions and emission-related factors provide an opportunity to study pollution trends and emission sources in depth. Bhiwadi, on the other hand, is densely packed with industries, which provides an opportunity to study the sources of pollutants as well as the difference in the pollution trends of the two cities.
Table 1 Phases of lockdown in India
Phases | Start date | End date
Pre-lockdown period | 01-01-2019 | 24-03-2020
Lockdown period | 25-03-2020 | 31-05-2020
Post-lockdown period | 31-05-2020 | 01-07-2021

Table 2 Different air quality parameters used for the study
Air quality parameters | Symbol
Particulate matter < 2.5 µm | PM2.5
Particulate matter < 10 µm | PM10
Nitric oxide | NO
Nitrogen dioxide | NO2
Carbon monoxide | CO
Ozone | O3
Sulphur dioxide | SO2
Ammonia | NH3
Benzene | C6H6
Toluene | C7H8
3 Methodology
In 2016, the Government of India first launched the National Air Quality Index (NAQI), which is published daily by the Central Pollution Control Board (CPCB), Ministry of Environment, Forest, and Climate Change, Govt. of India, and is made available on the CPCB website [6]. The data were collected from the CPCB website as well as from IQAir [7]. They cover a period of 2 years and 6 months, which includes the pre-lockdown, lockdown, and post-lockdown periods. Table 1 indicates the three phases during which the data were collected. The ten major pollutant parameters considered for the study are presented in Table 2.
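Grouping daily readings by the phases of Table 1 can be sketched as follows. This is an illustrative Python fragment, not the authors' processing pipeline; the names `PHASES`, `phase_of`, and `mean_by_phase` are hypothetical. Because 31-05-2020 appears both as the end of the lockdown and as the start of the post-lockdown period in Table 1, the sketch assumes that this day is counted as lockdown.

```python
from datetime import date

# Phase boundaries taken from Table 1 (day-month-year in the table).
PHASES = [
    ("pre-lockdown", date(2019, 1, 1), date(2020, 3, 24)),
    ("lockdown", date(2020, 3, 25), date(2020, 5, 31)),
    ("post-lockdown", date(2020, 5, 31), date(2021, 7, 1)),
]


def phase_of(day):
    """Return the phase containing the given date, or None if outside the study period."""
    for name, start, end in PHASES:
        if start <= day <= end:
            return name  # the first match wins, so 31-05-2020 counts as lockdown
    return None


def mean_by_phase(readings):
    """Average (date, concentration) pairs within each phase."""
    totals = {}
    for day, value in readings:
        name = phase_of(day)
        if name is None:
            continue
        running_sum, count = totals.get(name, (0.0, 0))
        totals[name] = (running_sum + value, count + 1)
    return {name: s / n for name, (s, n) in totals.items()}
```

The same grouping would be applied separately to each pollutant in Table 2 and to each city before plotting.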
4 Results and Discussion
We have analysed the behaviour of the pollutants over the period mentioned earlier, and the results are explained here. In each plot, the upper curve represents the parameter for Bhiwadi and the lower one represents Delhi. The green colour represents the air quality for the pre-lockdown period, whereas red and orange represent the lockdown and post-lockdown periods, respectively.
4.1 Particulate Matter Less Than 2.5 µm (PM2.5)
PM2.5 is generated by combustion; hence, the burning of fuels, including wood, is its major source. Both PM2.5 and PM10 can be breathed into the lungs, but PM2.5 penetrates deeper into the alveoli and remains in the lungs for much longer than PM10. Figure 1 shows the variation of PM2.5 from January 2019 to July 2021 for both Bhiwadi and Delhi. There is a significant drop in the level of PM2.5 during the lockdown period in both cities. This drop resulted from the shutdown of industries and the restrictions on vehicle movement that were strictly maintained in both areas from March 2020 to May 2020. Even after the lockdown ended, the level of PM2.5 remained in the lower range for five months, which indicates the benefit of the lockdown from an air quality point of view. The atmospheric concentration of PM2.5 remained on the higher side from October to February in both 2019–20 and 2020–21 for both cities. Major contributors to this are stubble burning and the bursting of firecrackers during festivals such as Durga Puja and Diwali [3, 4]. For both cities, the peak occurred in November and December, which may have resulted from the burning of stubble left after harvesting grains such as rice and wheat in the neighbouring areas. The concentration in Delhi remained slightly higher throughout the period of study because of higher emissions from vehicular exhaust. This study indicates that the major source of PM2.5 emissions in Delhi is vehicular exhaust [8], whereas in Bhiwadi it is industrial emission. The maximum concentration of PM2.5 for both cities is near 200 µg/m³, which occurred during the post-lockdown period (November to December 2020). Most of the annual average pollutant concentrations during the pre- and post-lockdown periods were higher than the permissible range recommended by the World Health Organization [9]. According to IQAir reports, the PM2.5 levels in Delhi [10] and Bhiwadi [11] generally remain 32.4 times and 14.8 times higher, respectively, than the limits set by the World Health Organization.

Fig. 1 Concentration of PM2.5 during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
4.2 Particulate Matter Less Than 10 µm (PM10)
In India, vehicles are the major contributors of PM10 in urban areas rather than in industrial areas [12]. The plots in Fig. 2 show that the variation of PM10 at Bhiwadi and Delhi is quite similar. The major sources of PM10 emission are the burning of fossil fuels, smoke stacks, fires, and dust from construction sites, unpaved roads, and fields. These are common to both cities. Delhi is a highly populated area with heavy vehicular traffic; similarly, Bhiwadi, being an industry-rich area, attracts a lot of vehicular traffic in addition to industrial exhaust. Periodic crop burning in the neighbouring areas also affects both regions. The maximum concentration of PM10 occurred in November 2020 for Bhiwadi and in January 2019 for Delhi, and exceeded 300 µg/m³ in both cities. The curves are similar in nature to those of PM2.5 since the sources are the same. During March to May 2021 (post-lockdown period), Delhi managed to keep PM10 in the lower range, which may be a result of partial lockdowns imposed in a few scattered places. It can also be concluded that, owing to the monsoon, the levels of both PM2.5 and PM10 remained low in the atmosphere for both cities.

Fig. 2 Concentration of PM10 during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
Fig. 3 Level of NO during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
4.3 Nitric Oxide (NO)
The major sources of NO and NO2 emissions include automobile exhaust, cement kilns, and power plant exhaust [13]. They are also formed in the atmosphere at high temperature during lightning. Figure 3 shows that the level of NO is considerably lower at Bhiwadi than at Delhi throughout the period of study. During the lockdown period, however, the minimum concentration at Delhi was lower than that at Bhiwadi, whereas the maximum level at Delhi (100 µg/m³) was double that at Bhiwadi (50 µg/m³). The curves also show that from February 2020 to September 2020 (partial lockdown period) the level of NO at Bhiwadi fluctuated rapidly.
4.4 Nitrogen Dioxide (NO2)
An elevated NO2 level can damage the human respiratory system and increase a person's vulnerability to, and the severity of, respiratory tract infections and asthma. For NO2, peaks are noticed in the winter season for both cities. This is due to the burning of weeds after the crops have been harvested. The maximum concentration for both cities remained over 80 µg/m³. Although the level remained low during the lockdown period, the lockdown did not have as much impact on the concentration as it did for the other parameters. From June to August in both 2019 and 2020, the level remained in the lower range (Fig. 4). This may be due to the ongoing monsoon and also to low traffic density.

Fig. 4 Atmospheric concentration of NO2 during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
4.5 Ammonia (NH3)
The curves show that, unlike for the other pollutants, the lockdown does not seem to have affected the level of NH3. The maximum concentration during the lockdown period remained around 50 µg/m³ for Bhiwadi and 60 µg/m³ for Delhi (Fig. 5), not much less than the maximum concentration over the whole period of study. This is because the major sources of NH3, fertilizer application and animal husbandry, continued during the lockdown period as essential activities.
5 Sulphur Dioxide (SO2 ) Sulphur dioxide mainly affects the respiratory system of human body; it can also irritate the eyes. It is interesting to notice that in case of SO2 , the level of concentration of this pollutant maintained higher at Bhiwadi compared to Delhi because of the fact that SO2 is majorly generated from burning of coals. Some of the other sources include petroleum refineries, cement manufacturing, paper and pulp industry, etc.
Fig. 5 Ammonia level during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
Bhiwadi being an industrial city, the SO2 concentration remained elevated there. It is also interesting to observe that, irrespective of the lockdown period, the concentration of this pollutant remained quite low during the monsoon (June, July, and August of 2019 and 2020) because it reacts with rain to form sulphuric acid, which falls as acid rain [14]. The mean concentration throughout the study period at Bhiwadi and Delhi was around 40 µg/m3 and 20 µg/m3, respectively (Fig. 6).
Fig. 6 Sulphur concentration during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
Fig. 7 Atmospheric CO level during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
6 Carbon Monoxide (CO) At higher concentrations, CO can severely affect aerobic metabolism, owing to its high affinity for haemoglobin (Hb), the component of the blood responsible for the transport of oxygen. CO reacts with Hb to form carboxyhaemoglobin (CO-Hb), thus reducing the capacity of the blood to carry oxygen [15]. As the major source of this pollutant is the internal combustion engine, its level remained elevated in Delhi and fell drastically during lockdown. Since Bhiwadi has much lower vehicular emissions than Delhi, it witnessed a low level of this pollutant throughout the study period. The maximum CO level at Bhiwadi was 1.2 µg/m3, whereas that at Delhi was 4 µg/m3 (Fig. 7).
7 Ozone (O3) Ozone occurs both in the earth’s upper atmosphere and at ground level. Ozone can be good or bad depending on where it is found. Stratospheric ozone, called good ozone, occurs naturally in the upper atmosphere, where it forms a protective layer that shields us from the Sun’s harmful ultraviolet rays. Tropospheric ozone, or bad ozone, is formed by the chemical reaction of volatile organic compounds (VOC) with oxides of nitrogen (NOx). Attri and coworkers reported that fireworks displays could produce ozone (O3), a strong and harmful oxidizing agent, at ground level without the participation of NOx [16]. The graph of ozone concentration (Fig. 8) shows that the lockdown period did not have any positive impact for either city; interestingly, it had the reverse effect on the tropospheric ozone level, which rose compared to the pre-
Fig. 8 Tropospheric ozone level during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
and post-lockdown periods. The reason is that the reaction occurs when the ratio of VOC to NOx is less than 6 and does not take place when the ratio rises above 12. During lockdown, the sources of VOC were greatly reduced by the industrial shutdown, bringing the aforesaid ratio below 6, and thus ozone was formed. The minimum concentration during the lockdown period was around 40 and 25 µg/m3 for Bhiwadi and Delhi, respectively, whereas that for the pre-lockdown period was 15 and 10 µg/m3.
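The threshold rule on the VOC to NOx ratio described in this section can be written as a short Python sketch. The function name `ozone_formation_regime`, the label returned for ratios between 6 and 12, and the sample concentrations are assumptions made only for illustration; the text specifies the behaviour only below 6 and above 12.

```python
def ozone_formation_regime(voc: float, nox: float) -> str:
    """Classify a VOC/NOx ratio using the two thresholds quoted in the text."""
    if nox <= 0:
        raise ValueError("NOx concentration must be positive")
    ratio = voc / nox
    if ratio < 6:
        return "ozone formation occurs"
    if ratio > 12:
        return "ozone formation does not occur"
    # The text does not state what happens for ratios between 6 and 12.
    return "not specified in the text"


# Hypothetical concentrations in identical units (illustrative values only).
print(ozone_formation_regime(voc=20.0, nox=5.0))  # ratio 4
print(ozone_formation_regime(voc=65.0, nox=5.0))  # ratio 13
```

The sketch only encodes the rule as stated; it is not a model of atmospheric chemistry.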
8 Benzene (C6H6) and Toluene (C7H8) Both benzene and toluene are emitted into the atmosphere by common anthropogenic activities. From Figs. 9 and 10, it can be observed that the emission patterns of the two pollutants are similar and that Delhi witnessed higher levels of both throughout the whole period. This may be due to the higher volume of traffic in Delhi compared to Bhiwadi. As expected under the restrictions, the concentrations of both pollutants were lowest during the lockdown period. The maximum level of benzene during the pre-lockdown period was 7 and 12 µg/m3 for Bhiwadi and Delhi, respectively, whereas during lockdown it stayed at 1 and 2 µg/m3. A similar pattern was observed for toluene.
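As a worked illustration of the benzene figures quoted above, the relative drop between the pre-lockdown and lockdown maxima can be computed as follows. This is a minimal sketch: the helper name `percent_reduction` is an assumption, and only the four concentration values come from the text.

```python
def percent_reduction(before: float, during: float) -> float:
    """Percentage drop from the pre-lockdown value to the lockdown value."""
    if before <= 0:
        raise ValueError("pre-lockdown value must be positive")
    return (before - during) / before * 100.0


# Maximum benzene levels (µg/m3) quoted in the text: (pre-lockdown, lockdown).
benzene_max = {"Bhiwadi": (7.0, 1.0), "Delhi": (12.0, 2.0)}

for city, (before, during) in benzene_max.items():
    print(f"{city}: {percent_reduction(before, during):.1f}% reduction")
```

With the quoted values this gives a reduction of roughly 86% for Bhiwadi and 83% for Delhi.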
Fig. 9 Atmospheric benzene level during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
Fig. 10 Ambient toluene concentration during the period from pre-lockdown to post-lockdown at Bhiwadi and Delhi
9 Conclusion In the above study, we have investigated how air quality parameters improved during the lockdown period in two cities of India: Bhiwadi in Rajasthan and Delhi, the capital of India. We observed that all the pollutants except ozone and ammonia showed a sharp decrease during the lockdown period in both cities, owing to the complete shutdown of all industrial activities and the restrictions on traffic movement. It is also seen that, even after the lockdown period was over, the ambient air
quality remained at a good level for a further three to four months. Interestingly, the concentration of ammonia remained unaffected during the lockdown period, since agricultural activities were not stopped, being essential. Again, in the case of sulphur dioxide, the concentration always remained higher at Bhiwadi, since it is mainly generated by industrial activity. On the other hand, it is very important to note the reverse pattern in the case of ozone. During the lockdown period, the ozone level remained higher than at other times owing to the reduction in VOC emissions. From the above discussion, it can be concluded that, in spite of the several devastating impacts of the spread of the novel coronavirus all over the world, the world witnessed one positive impact: a sudden improvement in atmospheric air quality that remained at a favourable level even after the lockdown period was over. Therefore, in order to maintain good air quality, we certainly need to reduce the combustion of fossil fuels as far as possible, which can be achieved if we replace conventional sources of energy with renewable alternatives, e.g. hydrogen fuel, solar-driven vehicles, and battery-operated vehicles. If this can be achieved, we will be able to slow down the process of climate change, reduce the impact of pollution on nature, and reduce the diseases and deaths caused by air pollution. More importantly, this study provides scope to rethink current regulation policies and to plan further strategies for air pollution reduction.
References
1. Nigam, R., Pandya, K., Luis, A.J., Sengupta, R., Kotha, M.: Positive effects of COVID-19 lockdown on air quality of industrial cities (Ankeleswar and Vapi) of Western India. Sci. Rep. Nat. Portfolio 11, 4285 (2021)
2. Tiseo, I: tatista.com/statistics/935666/india-average-annual-deaths-from-air-pollution (2022)
3. https://timesofindia.indiatimes.com/city/delhi/day-after-diwali-delhis-air-turns-hazardous/articleshow/66539912.cms
4. https://timesofindia.indiatimes.com/city/delhi/delhi-breathed-easier-from-january-to-april/articleshow/59011204.cms
5. Saharan, U.S., Kumar, R., Tripathy, P., Sateesh, M., Garg, J., Sharma, S.K., Mandal, T.K.: Drivers of air pollution variability during second wave of COVID-19 in Delhi, India. Urban Clim. 41, 101059 (2022). https://doi.org/10.1016/j.uclim.2021.101059
6. Central pollution Control Board (CPCB): https://www.cpcb.nic.in, Last updated 15 Nov 2022
7. https://www.iqair.com/in-en/world-air-quality-ranking
8. https://cerca.iitd.ac.in/uploads/Reports/1576211826iitk.pdf
9. Ghosh, N., Roy, A., Bose, Das, A., Debsarkar, A., Roy, J: Covid-19 Lockdown: Lesson learnt using multiple air quality monitoring station data from Kolkata city in India, 1–28. https://doi.org/10.21203/rs.3.rs-43739/v1
10. https://www.iqair.com/in-en/india/delhi
11. https://www.iqair.com/in-en/india/rajasthan/bhiwadi
12. Ghosh, N., Roy, A., Mandal, R., Dutta, A.: Performance assessment of roadside PM10 forecasting models. Res. J. Chem. Environ. 24(7), 1–11
13. Dheeraj Alshetty, V., Kuppili, S.K., Nagendra, S.M.S., Ramadurai, G., Sethi, V., Kumar, R., Sharma, Namdeo, A., Bell, M., Goodman, P., Chatterton, T., Barnes, J., De Vito, L., Longhurst, J.: Characteristics of tail pipe (Nitric oxide) and resuspended dust emissions from urban roads—A case study in Delhi city. J. Transp. Health 17, 100653 (2020). https://doi.org/10.1016/j.jth.2019.100653
14. Environmental Protection Agency: https://www.epa.gov/acidrain. Last updated 24 June 2022
15. Peavy, H.S., Rowe, D.R., Tchobonaglous, G.: Environmental Engineering, Latest McGraw-Hill Book Company, New York (1985)
16. Attri, A.K., Kumar, U., Jain, V.K.: Formation of ozone by fireworks. Nature, 411, 1015 (2001)
Applications of Big Data in Various Fields: A Survey Sukhendu S. Mondal, Somen Mondal, and Sudip Kumar Adhikari
Abstract A large volume of data is produced by digital transformation and the extensive use of the Internet and global communication systems. Big data denotes this massive body of data, which cannot be managed by traditional data handling methods and techniques. The data is generated every few milliseconds in structured, semi-structured, and unstructured forms. Big data analytics is used extensively in enterprises and plays an important role in various fields of application. This paper presents applications of big data in various fields such as healthcare systems, social media data, e-commerce applications, agriculture applications, smart city applications, and intelligent transport systems. The paper also focuses on the characteristics and storage technologies of big data used in these applications. This survey provides a clear view of the state-of-the-art research on big data technologies and their applications in the recent past. Keywords Big data · Big data analytics · Applications of big data
S. S. Mondal · S. Mondal · S. K. Adhikari (B)
Cooch Behar Government Engineering College, Cooch Behar, West Bengal, India
e-mail: [email protected]
S. S. Mondal e-mail: [email protected]
S. Mondal e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_19
1 Introduction A large volume of information is generated every day in every sector of the modern world owing to the rapid progress of technologies such as smart mobile devices, the Internet, the Internet of things (IoT), cloud, and social media. Moreover, as business processes become more computer-dependent and data-intensive, larger amounts of data are recorded more rapidly and with higher accuracy. The complexity of analytical tasks
has significantly increased as a result of this trend. The need to deal with multiple data formats, such as key-value, relational, and graph, that various services consume or produce leads to additional challenges. Also, processing in workflows, which are becoming more complex day by day, can take many different forms, from straightforward data operations to the application of sophisticated algorithms such as graph processing and data mining algorithms, or even custom procedures related to specific businesses. Finally, various restrictions and limitations may apply to workflow execution, related to deadlines, performance, and the optimization of several dimensions, such as effectiveness, cost, and fault tolerance. A large amount of data is also generated as a result of the decreased cost and expanded manufacturing of sensors. Effective ways of gathering, storing, and organizing data are required in order to use it efficiently. “Big data” is a term used to describe a vast and complicated data set that is challenging for typical database systems or application software to process within a time period acceptable to users [1–4]. As big data merges streams of data from numerous stakeholders, data from across the enterprise and from different data models is typically evaluated by the business model. Data models are the main components of a big data platform. To generate the necessary metrics, depending on the current data models, users typically carry out complicated processing, run queries, and construct huge table joins. Superior data modeling techniques are required for an organization’s big data platform in order to achieve the ideal balance of cost, performance, and quality. Researchers are currently dealing with difficulties related to big data, including data duplication, categorizing data from one data set to another, and identifying multidisciplinary data.
Big data challenges can no longer be met by traditional methods of processing and analyzing structured, semi-structured, and unstructured data using relational database management systems and data warehousing. The data that is accessible for research must be processed using a variety of data analytics methodologies, which are referred to as “big data analytics.” These methods are beneficial for handling large amounts of structured, semi-structured, or unstructured data that are constantly changing and impossible to process using traditional database methods. In the big data era, research in data science is focused predominantly on scalable, efficient, and accurate (SEA) analytics. New technologies have been developed that support scalability through huge storage management and parallel/distributed computing. Additionally, the need for increased efficiency in addition to scalability has become apparent, and new systems for quick analytics have evolved to meet such objectives. Data can be structured, semi-structured, or unstructured because it is gathered from heterogeneous sources. Storing and analyzing this information consumes a significant amount of money and time. Consequently, the data is extremely complicated and challenging to examine using conventional methods such as an RDBMS. Therefore, as time goes on, it is evident that big data technologies and approaches will be considered necessary in every sector.
2 Motivation and Organization Very few survey papers are available on big data technologies and their applications, and the existing reviews mostly address one particular application. We want to give a clear, concise view of big data and its applications in a single paper. This motivates us to write this survey, which helps researchers get a clear view of the state-of-the-art research on big data applications. They can also get a clear view of the big data characteristics, storage requirements, and big data technologies that are extensively used in those applications. The paper is organized to help beginners get a clear understanding of big data characteristics, storage, and applications. The remaining part of the paper is arranged as follows: big data characteristics and storage are presented in Sect. 3. Section 4 surveys how various researchers have used different big data technologies in different applications, covering various fields: health care, social media data, e-commerce, smart cities, and intelligent transport systems. Section 5 concludes the paper.
3 Big Data Characteristics and Storage Big data was first defined by Francis Diebold as the unprecedented growth of relevant data available today, both in quantity and quality, largely as a result of recent and novel advances in data capture and storage technologies. In this reality, sample sizes are measured in units such as megabytes rather than in the number of observations, and there was nothing extraordinary about data accumulating at a rate of several gigabytes each day [5]. Volume, variety, and velocity, known as the “3 V’s,” together with value are four common big data characteristics [6]. Volume denotes the amount of data (in terabytes, for reference), velocity the rate at which data is received, variety the varied sorts of data (such as relational data, photos, text, and videos), and value the conclusions drawn from big data analytics (BDA). These 4 Vs are the four pillars for validating the data characteristics of any big data use case. Demchenko et al. [7] classified big data operations in the following ways: (a) big data properties, (b) new data models, (c) new analytics, (d) infrastructure and tools, and (e) source and target. Big data offers a huge advantage over traditional computing and storage systems: it provides scale-out rather than scale-up facilities when the data volume is huge. Big data operations require high-end processing and a fault-tolerant environment. Traditional relational databases cannot handle the storage and management of such huge volumes of data. The standard efficient and effective form of big data storage is the NoSQL data store [8, 9]. Big data analytics adopts NoSQL technologies, which can handle most data storage, data management, data querying, and data processing tasks. NoSQL stores include CouchBase, MongoDB,
Neo4J, Cassandra, Amazon DynamoDB, Redis, and Microsoft Azure Cosmos DB, in addition to Apache Hadoop and its ecosystem. Some of these stores are also described as NewSQL stores since they operate in a distributed environment, similar to MongoDB and Redis, while satisfying the ACID requirements of an RDBMS, allowing users to view updated data [10, 11]. The complex issues of big data analytics can be addressed by Hadoop, which provides a state-of-the-art solution. It works on a distributed architecture that adds cluster nodes to meet rising storage and computational demands. One of the main features of Hadoop is that data processing works separately from data storage. Hadoop has four components: the Hadoop common libraries, HDFS, Hadoop Yet Another Resource Negotiator (YARN), and Hadoop MapReduce. HBase is Hadoop’s persistent database and runs on top of HDFS, whereas Hive, Hadoop’s data warehouse, executes SQL queries over the data in HDFS. Similar to RDBMS storage, it stores the data in tabular format [12].
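The MapReduce component listed among the Hadoop components follows a map, shuffle, and reduce pattern. The sketch below imitates that pattern for a word count in plain, single-process Python. It does not use any Hadoop API; the function names and the input records are assumptions chosen only to illustrate the programming model.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(record: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence over all records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return reduce_phase(shuffle_phase(pairs))


# Hypothetical input records; on a real cluster these would be read from HDFS.
records = ["big data analytics", "big data storage", "data processing"]
print(word_count(records))
```

In a real deployment the map and reduce functions run in parallel on different nodes, which is what lets the approach scale out as nodes are added.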
4 Application in Various Fields Numerous studies have been conducted on the use of big data in different areas. This section provides an overview of some recent studies in some of the most prominent fields.
4.1 Application in Healthcare Owing to the adoption of modern equipment, sensors and handheld devices generate large volumes of heterogeneous medical data, which become the source of healthcare information for a patient. This data can be heterogeneous and unstructured, and it should be accessible to all concerned. There is therefore a need to store, access, and update such information in a cost-effective and fault-tolerant manner. The Healthcare Information Management System (HIMS) utilizes cutting-edge big data tools, techniques, and technologies to manage the transformation of diverse healthcare data into useful and pertinent information. Big data analytics (BDA) is crucial for identifying risks and lowering healthcare expenses. Columnar stores, key-value stores, graph stores, and document stores are the four NoSQL types that, together with hybrid stores, form the foundation of these applications. Imran et al. [2] gave a thorough analysis of big data applications in the healthcare field. Ercan et al. [13] analyzed NoSQL applications in the field of healthcare, featuring the following crucial NoSQL characteristics: scaling out, automatic scaling, reliability, data model alternatives, NewSQL compliance, optimized query execution, and cost effectiveness. The authors in [14] proposed a healthcare system based on the Hadoop ecosystem that uses text mining to forecast and effectively manage patients’ ailment data. Yang et al. [15] presented a more substantial study on the implementation of
HBase in healthcare applications. They motivated the use of HBase to integrate several Excel documents or related applications and demonstrated an experimental setup of their idea on a two-node cluster. The authors of [16] then implemented a system to import 30 TB of data from a Canadian healthcare project into an HBase cluster with the help of MapReduce, where all base processing was performed. Park et al. [16] presented a graph database for the healthcare system and demonstrated its usefulness through investigation with doctors and other healthcare agencies. Stufi et al. [17] described a Vertica NoSQL hybrid model to manage healthcare big data for a Czech healthcare center. They undertook a four-step BDA process, namely data storage, data management, data analytics, and data visualization. Gopinath et al. [18] implemented a proof of concept (POC) to benchmark a hybrid architecture using HBase, Cassandra, and MongoDB on e-health clouds for an industrial project centered in India. The main components of this architecture are a query interface, a query administrator, and a translator for NoSQL organization. Paper [19] highlights the advantages of using BDA applications in health care, and various other research papers identify similar benefits. Using BDA applications, best practices can be identified and effective treatment can be provided. Moreover, BDA can enrich patient care, healthcare planning, and decision making [20]. Because a big data platform can store a patient’s social and psychological problems, nursing care can also be improved [21, 22]. Overall, the BDA benefits are as follows: better health care, better patient care, better medical care, better healthcare value, better care delivery, etc. [23].
However, some limitations can be identified regarding BDA applications in the healthcare domain. Few research papers describe practical BDA implementations using NoSQL data stores. A formalized healthcare architecture for BDA is not yet fully operational. Table 1 summarizes the survey of big data applications in health care.
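The four NoSQL data models named in this section (key-value, document, columnar, and graph) can be contrasted by storing the same kind of patient information in each shape. The sketch below uses only plain Python containers, not any real NoSQL client library; every key, field name, and value is hypothetical and chosen only for illustration.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"patient:42": '{"name": "A. Patient", "ward": "cardiology"}'}

# Document store: a nested, self-describing record.
document = {
    "_id": 42,
    "name": "A. Patient",
    "visits": [{"date": "2022-01-10", "diagnosis": "hypertension"}],
}

# Columnar (wide-column) store: row key -> column family -> column -> value.
columnar = {
    "42": {
        "info": {"name": "A. Patient"},
        "vitals": {"bp_systolic": 150, "bp_diastolic": 95},
    }
}

# Graph store: nodes plus labelled edges between them.
graph_nodes = {"p42": {"type": "patient"}, "d7": {"type": "doctor"}}
graph_edges = [("p42", "TREATED_BY", "d7")]


def doctors_of(patient_id: str) -> list:
    """Follow TREATED_BY edges that start at the given patient node."""
    return [dst for src, label, dst in graph_edges
            if src == patient_id and label == "TREATED_BY"]


print(columnar["42"]["vitals"]["bp_systolic"])
print(doctors_of("p42"))
```

The choice among the models depends on the access pattern: lookups by key, nested records, column-wise reads, or relationship traversal.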
4.2 Application in Social Media Data Analysis of social media data is another important aspect of research on big data analysis. Recent research in the fields of social media, data science, and machine learning provides a broad overview of social media big data analytics. Social media serves as a massive information source, with a wide variety and large volume of data arriving at high velocity, where users share comments, events, and feelings. A variety of social media platforms are available: LinkedIn and Facebook connect people, Wikipedia allows people to edit collaboratively, WordPress allows blogging, Twitter allows micro-blogging, Stack Overflow and Quora allow question answering, etc. Every day a massive amount of data, including text, photographs, audio, and video, is generated by the large user base across all social media. This unstructured, semi-structured, and organized data is the main source of “big social data” [24–27].
Table 1 Summary of survey for big data applications in health care
References | Year | Survey objective and details
[13] | 2014 | NoSQL applications to healthcare system, important properties of NoSQL
[14] | 2014 | Proposed a Hadoop-based healthcare system using text mining to forecast and effectively manage the patient’s disease
[15] | 2013 | Integrate several excel documents or related applications with HBase
[16] | 2014 | Canadian healthcare project to load 30 TBs of data of healthcare application in HBase; Graph database in the healthcare system
[17] | 2020 | Vertica NoSQL hybrid model to manage healthcare system; Emphasis on data storage, data management, data analytics and data visualization
[18] | 2017 | Proof of concept (POC) to benchmark a hybrid architecture using MongoDB, HBase, and Cassandra on e-health clouds; Query interface, query administrator, and translator for NoSQL organization on the POC architecture
[19] | 2013 | Benefits of BDA applications to healthcare system in different aspects
[20] | 2011 | Enrich patient care, healthcare planning, and decision making achieved using BDA
[22] | 2015 | Big data as a storage platform for social and psychological problem of the patient
[23] | 2015 | Benefits of BDA
Big data obtained from and connected with social media is thus commonly known as big social data. Big data analytics reveals patterns, trends, and other perspectives from massive social data. Different big data analytics approaches are used in social media: social graph theory, text mining, opinion mining, social influence analysis, sentiment analysis, statistical analysis, cyber risk analysis, etc. [28]. Big social data analytics helps leading companies such as Microsoft, Apple, Google, Amazon, Samsung, Twitter, NVIDIA, and Honda to enhance their customer relations practices and corporate strategy. Rahman et al. [24] explained the significant elements of social media data for an enhanced data-driven decision-making process. Hou et al. [25] also presented a fine review of the practical aspects of big data analytics in social media, applying sentiment analysis, time series analysis, and network analysis in areas such as disaster management, health care, and business. Table 2 presents the summary of the survey of big data applications in social media data. There are several challenges and limitations in working with big data analytics in social media.
• The maintenance of big social data is expensive.
• Public access is growing more difficult as social media data is scattered over numerous physical sites.
Table 2 Summary of survey for big data applications in social media data
References | Year | Survey objective and details
[24] | 2022 | Data mining and analysis on social media data to enhance customer relations practices and corporate strategy
[25] | 2020 | Apply sentiment analysis, time series analysis, and network analysis in areas like disaster management, health care, and business
[27] | 2019 | Various social media data in a variety of forms, such as unstructured, semi-structured, and organized data, serve as a main source of big social data
[28] | 2017 | Big data analytics to reveal patterns, trends, and other insights from massive social data
• Constant data processing, data cleaning, and data filtering are needed to retrieve the necessary information, which is costly and time consuming.
• Social data is severely affected by cyber-attacks, so faulty conclusions can be drawn using big data analytics.
4.3 Application in E-Commerce E-commerce, or electronic commerce, engages both customers and vendors from different regions of a country. It also engages different supply chain management systems to deliver products in a timely manner. In that scenario, e-commerce can be categorized as business to business, business to customer, and customer to customer. The participation of large numbers of vendors and customers from different regions, using different handheld devices or laptops, generates a huge volume of data, and by applying BDA tools both customers and vendors benefit. One of the main difficulties in e-commerce is the enormous amount of data that needs to be processed and analysed. The decision-making process is improved greatly by applying BDA. Several research papers describe big data applications in e-commerce [29–32]. Alrumiah et al. [29] presented a detailed description of the advantages of BDA to both vendors and consumers in the domain of e-commerce. The searching and shopping experience of the customer is personalized using a BDA recommendation system. However, they also mentioned negative effects such as shopping addiction, data accuracy and security issues, and the need for expensive BDA tools and professionals. Akter et al. [30] wrote a detailed review of big data applications in e-commerce. They explored the different types of big data in the e-commerce space and illustrated its commercial usefulness. Another major contribution of their paper is a set of guidelines for tackling the limitations of big data applications within e-commerce. Moorthi et al. [31] presented a detailed study of the different methodologies followed in e-commerce. They also proposed some methodologies that adopt business intelligence through big data analytics to enhance the business process. Feng et al. [32] considered the Internet of things for logistic distribution and quality control, and also considered cloud computing under big data analytics to develop
Table 3 Summary of survey for big data applications in e-commerce
References | Year | Survey objective and details
[29] | 2021 | Advantage of BDA in e-commerce to both vendors and consumers; BDA as recommendation system
[30] | 2016 | Illustrated the business value of big data in e-commerce; How to tackle limitations of big data applications within e-commerce
[31] | 2017 | Various methodologies followed in e-commerce; Proposed some methodologies by adopting business intelligence through big data analytics
[32] | 2019 | Internet of things for logistic distribution; Cloud computing under big data analytics to develop strategies of e-commerce
strategies of e-commerce. Table 3 presents the summary of the survey of big data applications in e-commerce.
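Section 4.3 notes that BDA recommendation systems personalize the customer's searching and shopping experience. One very simple way such a recommender can work is to suggest items bought by customers with overlapping purchase histories. The sketch below is an illustrative assumption, not the method of any cited paper; the function name `recommend` and the purchase data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def recommend(baskets: Dict[str, Set[str]], customer: str, top_n: int = 2) -> List[str]:
    """Suggest items bought by customers whose purchases overlap with this customer's."""
    owned = baskets[customer]
    scores: Counter = Counter()
    for other, items in baskets.items():
        if other == customer:
            continue
        overlap = len(owned & items)  # shared purchases act as a similarity weight
        if overlap == 0:
            continue
        for item in items - owned:    # only score items the customer does not have
            scores[item] += overlap
    # Sort by score (descending), then by name so that ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase histories.
baskets = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "mouse", "keyboard"},
    "carol": {"laptop", "monitor"},
    "dave": {"phone", "charger"},
}
print(recommend(baskets, "alice"))
```

Production recommenders work on far larger data and use more elaborate models, but the idea of scoring unseen items by the behaviour of similar customers is the same.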
4.4 Application in Agriculture Precision agriculture makes cultivation more productive. As the world’s population increases, food consumption is also increasing day by day. However, agricultural production is not growing at a similar rate. To make agricultural production sustainable, recent technologies such as machine learning, artificial intelligence, the Internet of things, and blockchain are being adopted. A huge amount of data therefore has to be generated from the farm and used to derive agricultural decisions that can help address the global scarcity of food. This data may include soil moisture, temperature and humidity, environmental characteristics, water, etc. Given this huge amount of data in the agricultural sector, BDA plays an important role in enabling an efficient and smart decision-making process about crop demand, price, etc. A good number of papers [33–36] are available on the application of big data in agriculture. Bhat et al. [33] presented a detailed review of big data and the AI revolution in precision agriculture. They also described different machine learning techniques that are applied in BDA decision making. Big data-based decision support systems also face challenges, as collecting data from different IoT devices and satellite images is quite complex. Jedlika et al. [36] explained how real-time analysis is required to improve agricultural productivity. Different IoT tools are now available for soil parameter measurement, weather forecasting, and parasite identification, on the basis of which smart decisions can be taken, such as which seeds to plant and which pesticide to apply, and in what quantity, in case of a parasite attack.
Table 4 Summary of survey for big data applications in agriculture
References | Year | Survey objective and details
[33] | 2021 | Big data and AI revolution in precision agriculture; Various machine learning techniques which are applied in BDA decision making in agriculture sector; Challenges of collection from different IoT devices, satellite images
[36] | 2018 | Real-time analysis to improve agricultural productivity where source of data is different IoT-enabled devices
[37] | 2021 | Demand-based productivity in agricultural sector by analyzing demand forecasting using BDA
[38] | 2017 | Supply chain management on agricultural product using big data
Spandana Vaishnavi et al. [37] presented the relation between demand and supply of different farm products in different regions, because it is difficult for farmers to determine what to produce or plant according to demand. Farmers are badly affected if big data analytic tools are not applied to find this relation and advise them accordingly. Kumar et al. [38] discussed the necessity of proper supply chain management for agricultural products, so that farmers do not lose profit when they produce in large volume. Table 4 shows the summary of survey for big data applications in agriculture.
4.5 Application in Smart Cities
A smart city generally means an advanced city that is technologically equipped and capable of understanding its surroundings through data analysis and of making immediate changes to address problems and enhance quality of life. Being technologically equipped, a smart city adopts ICT-based devices and IoT devices, which become the sources of data for sensing the environment. Big data technology takes this data as input and applies analytics to it to reach smart decisions. Many research papers address the application of big data to smart cities. Talebkhah et al. [39] presented a detailed review of the improvement in smart city applications due to the adoption of IoT and big data, in spite of various challenges and critical issues. Alshawish et al. [40] wrote a detailed review of the application of big data in smart cities and discussed the necessity of acquiring data from different sources and analyzing it in a big data environment to improve residents' quality of life, with real-life examples of smart energy, smart public safety, and smart traffic systems. Ismail et al. [41] discussed the diverse nature of data generated in a city, the analysis of such data on a big data platform, and the resulting economic and environmental outcomes that improve quality of life.
Table 5 Summary of survey for big data applications in smart cities
References | Year | Survey objective and details
[39] | 2021 | Improvement of smart city applications on adopting IoT-enabled devices and big data analysis; various challenges and critical issues in smart city applications
[40] | 2016 | Data acquisition from different sources and analysis in a big data environment to improve residents' quality of life
[41] | 2016 | Diverse nature of data generated in a city; analysis of such data to improve quality of life from an economic and environmental point of view
[42] | 2016 | Detailed big data architecture for smart cities that can accommodate data generated from IoT devices or permanently connected devices
[43] | 2018 | Real-time data analysis in a big data environment where ICT-enabled or IoT devices act as data sources
[44] | 2016 | Proposed an architecture and implementation model based on the Hadoop ecosystem in a real environment
Costa et al. [42] discussed a detailed big data architecture for smart cities to analyze the huge volume and variety of data generated by IoT devices or permanently connected devices. Manjunatha et al. [43] gave an overview of real-time data analysis in a big data environment for implementing smart cities that improve quality of life, where the data sources are information and communication technology (ICT)-enabled devices and IoT devices. Rathore et al. [44] proposed an architecture and implementation model based on the Hadoop ecosystem in a real environment. It consists of several steps: data generation, data aggregation, data filtration, classification, processing, computing, and finally decision making. Table 5 shows the summary of survey for big data applications in smart cities.
4.6 Application in Intelligent Transportation System (ITS)
Transport systems have been evolving into intelligent ones since 1970. The intelligent transportation system is developing day by day and consists of advanced technologies such as sensors, data transmission, and control technologies. Drivers and riders get a better service by using it. A modern ITS generates a huge amount of data that a traditional data processing system cannot store or handle efficiently. Big data analytics plays an important role in the operation of modern ITS because it resolves the storage, analysis, and management of this data. The efficiency of ITS operation can be improved by big data analytics. Moreover, big data analytics can improve the safety level of ITS, because the occurrence of traffic accidents can be predicted.
Table 6 Summary of survey for big data applications in intelligent transport system
References | Year | Survey objective and details
[45] | 2019 | Intelligent transport system for prediction in road traffic, accident analysis, planning, and control of public transportation
[46] | 2017 | Proposed a framework for data collection, integration, fusion, and management of traffic data from various sources
Zhu et al. [45] presented a detailed review of intelligent transport systems using big data analytics, including flow prediction in road traffic, accident analysis, and planning and control of public transportation. Guido et al. [46] presented the use of big data in public transportation systems and proposed a framework for data collection, integration, fusion, and management of traffic data from various sources. Table 6 shows the summary of survey for big data applications in intelligent transport system.
5 Conclusion
Big data analytics plays an important role in the modern world. In this paper, we have briefly explained big data characteristics and storage technology by exploring relevant research papers. We have also presented a detailed review of the application of big data in various fields, covering six important applications: health care, social media, e-commerce, agriculture, smart cities, and intelligent transport systems.
References
1. Singh, N., Lai, K.H., Vejvar, M., Cheng, T.C.E.: Big data technology: challenges, prospects, and realities. IEEE Eng. Manage. Rev. 47(1), 58–66 (2019)
2. Imran, S., Mahmood, T., Morshed, A., Sellis, T.: Big data analytics in healthcare-a systematic literature review and roadmap for practical implementation. IEEE/CAA J. Automatica Sinica 8(1) (2021)
3. Tsai, C.W., Lai, C.F., Chao, H.C., Vasilakos, A.V.: Big data analytics: a survey. J. Big Data 21 (2015)
4. Rabhi, L., Falih, N., Afraites, A., Bouikhalene, B.: Big data approach and its applications in various fields: review. Procedia Comput. Sci. 155, 599–605 (2019)
5. Diebold, F.X.: ‘Big data’ dynamic factor models for macroeconomic measurement and forecasting. In: Advances in Economics and Econometrics, Eighth World Congress of the Econometric Society, pp. 115–122 (2000)
6. Laney, D.: 3D data management: Controlling data volume, velocity, and variety. META Group, Tech. Rep., Feb. (2001)
7. Demchenko, Y., Ngo, C., Membrey, P.: Architecture framework and components for the big data ecosystem, Draft Version 0.2. System and Network Engineering, SNE technical report SNE-UVA-2013–02, Sept (2013)
8. Harrison, G.: Next Generation Databases: NoSQL, NewSQL, and Big Data. Apress (2015)
9. Wu, X., Kadambi, S., Kandhare, D., Ploetz, A.: Seven NoSQL Databases in a Week: Get Up and Running with the Fundamentals and Functionalities of Seven of the Most Popular NoSQL Databases Kindle. Packt Publishing, USA (2018)
10. Marz, N., Warren, J.: Big Data: Principles and Best Practices of Scalable Realtime Data Systems, USA. Manning Publications, Greenwich (2015)
11. Tudorica, B.G., Bucur, C.: A comparison between several NoSQL databases with comments and notes. In: Proceeding RoEduNet International Conference 10th Edition: Networking in Education and Research. Iasi, Romania (2011)
12. Thusoo, A., Sarma, J.S., Jain, N., Shao, Z., Chakka, P., Zhang, N., Antony, S., Liu, H., Murthy, R.: Hive - A petabyte scale data warehouse using Hadoop. In: Proceeding of the IEEE 26th International Conference Data Engineering, pp. 996–1005. Long Beach, USA (2010)
13. Ercan, M., Lane, M.: An evaluation of the suitability of NoSQL databases for distributed EHR systems. In: Proceeding 25th Australasian Conferences Information Systems. Auckland, New Zealand (2014)
14. Lee, B., Jeong, E.: A design of a patient-customized healthcare system based on the Hadoop with text mining (PHSHT) for an efficient disease management and prediction. Int. J. Softw. Eng. Appl. 8(8), 131–150 (2014)
15. Yang, C.T., Liu, J.C., Hsu, W.H., Lu, H.W., Chu, W.C.C.: Implementation of data transform method into NoSQL database for healthcare data. In: Proceeding International Conference Parallel and Distributed Computing, pp. 198–205. Applications and Technologies, Taipei, China (2013)
16. Park, Y., Shankar, M., Park, B.H., Ghosh, J.: Graph databases for large-scale healthcare systems: a framework for efficient data management and data services. In: Proceeding of the IEEE 30th International Conference Data Engineering Workshops. Chicago, USA (2014)
17. Štufi, M., Bacic, B., Stoimenov, L.: Big data analytics and processing platform in Czech republic healthcare. Appl. Sci. 10(5), 1705 (2020)
18. Gopinath, M.P., Tamilzharasi, G.S., Aarthy, S.L., Mohanasundram, R.: An analysis and performance evaluation of NoSQL databases for efficient data management in e-health clouds. Int. J. Pure Appl. Math. 117(21), 177–197 (2017)
19. Chen, K.L., Lee, H.: The impact of big data on the healthcare information systems, in transactions of the. In: International Conference Health Information Technology Advancement (2013)
20. Thorlby, R., Jorgensen, S., Siegel, B., Ayanian, J.Z.: How health care organizations are using data on patients’ race and ethnicity to improve quality of care. Milbank Quart. 89(2), 226–255 (2011)
21. Zillner, S., Lasierra, N., Faix, W., Neururer, S.: User needs and requirements analysis for big data healthcare applications. Stud. Health Technol. Inform. 205, 657–661 (2014)
22. Boinepelli, H.: Applications of big data. In: Big Data: A Primer, pp. 161–179. Springer, New Delhi, India (2015)
23. Hood, L., Lovejoy, J.C., Price, N.D.: Integrating big data and actionable health coaching to optimize wellness. BMC Med. 13(1), 4 (2015)
24. Rahman, M.S., Reza, H.: A systematic review towards big data analytics in social media. Big Data Min. Anal. 5(3), 228–244 (2022)
25. Hou, Q., Han, M., Cai, Z.: Survey on data analysis in social media: a practical application aspect. Big Data Min. Anal. 3(4), 259–279 (2020)
26. Dhawan, V., Zanini, N.: Big data and social media analytics. Res. Matt. Cambridge Assess. Publ. 18, 36–41 (2014)
27. Ghani, N.A., Hamid, S., Targio Hashem, I.A., Ahmed, E.: Social media big data analytics: a survey. Comput. Hum. Behav. 101, 417–428 (2019)
28. Ayele, W.Y., Juell-Skielse, G.: Social media analytics and internet of things: Survey. In: Proceeding 1st International Conference on Internet of Things and Machine Learning, pp. 1–11. Liverpool, UK (2017)
29. Alrumiah, S.S., Hadwan, M.: Implementing big data analytics in E-commerce: vendor and customer view. IEEE Access 9, 37281–37286 (2021)
30. Akter, S., Wamba, S.F.: Big data analytics in E-commerce: a systematic review and agenda for future research. Electron. Market. 26(2), 173–194 (2016)
31. Moorthi, K., Srihari, K., Karthik, S.: A survey on impact of big data in E-commerce. Int. J. Pure Appl. Math. 116(21), 183–188 (2017)
32. Feng, P.: Big data analysis of E-commerce based on the internet of things. In: 2019 International Conference on Intelligent Transportation, Big Data and Smart City (ICITBS), pp. 345–347 (2019)
33. Bhat, S.A., Huang, N.F.: Big data and AI revolution in precision agriculture: survey and challenges. IEEE Access 9, 110209–110222 (2021)
34. Bermeo-Almeida, O., Cardenas-Rodriguez, M., Samaniego-Cobo, T., Ferruzola-Gómez, E., Cabezas-Cabezas, R., Bazán-Vera, W.: Blockchain in agriculture: a systematic literature review. In: Proceeding International Conference Technology Innovations, pp. 44–56. Springer, Cham, Switzerland (2018)
35. Lokhande, S.A.: Effective use of big data in precision agriculture. In: 2021 International Conference on Emerging Smart Computing and Informatics (ESCI), pp. 312–316 (2021)
36. Jedlička, K., Charvát, K.: Visualisation of big data in agriculture and rural development. In: 2018 IST-Africa Week Conference (IST-Africa), pp. 1–8 (2018)
37. Spandana Vaishnavi, A., Ashish, A., Sai-Pranavi, N., Amulya, S.: Big data analytics based smart agriculture. In: 2021 6th International Conference on Communication and Electronics Systems (ICCES), pp. 534–537 (2021)
38. Kumar, M., Nagar, M.: Big data analytics in agriculture and distribution channel. In: 2017 International Conference on Computing Methodologies and Communication (ICCMC), pp. 384–387 (2017)
39. Talebkhah, M., Sali, A., Marjani, M., Gordan, M., Hashim, S.J., Rokhani, F.Z.: IoT and big data applications in smart cities: recent advances, challenges, and critical issues. IEEE Access 9, 55465–55484 (2021)
40. Alshawish, R.A., Alfagih, S.A.M., Musbah, M.S.: Big data applications in smart cities. In: 2016 International Conference on Engineering & MIS (ICEMIS), pp. 1–7 (2016)
41. Ismail, A.: Utilizing big data analytics as a solution for smart cities. In: 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC), pp. 1–5 (2016)
42. Costa, C., Santos, M.Y.: BASIS: A big data architecture for smart cities. In: 2016 SAI Comput. Conf. (SAI), pp. 1247–1256 (2016)
43. Manjunatha, Annappa, B.: Real time big data analytics in smart city applications. In: 2018 International Conference on Communication, Computing and Internet of Things (IC3IoT), pp. 279–284 (2018)
44. Rathore, M.M., Ahmad, A., Paul, A.: IoT-based smart city development using big data analytical approach. In: 2016 IEEE International Conference on Automatica (ICA-ACCA), pp. 1–8 (2016)
45. Zhu, L., Yu, F.R., Wang, Y., Ning, B., Tang, T.: Big data analytics in intelligent transportation systems: a survey. IEEE Transactions on Intelligent Transportation Systems 20(1), 383–398 (2019)
46. Guido, G., Rogano, D., Vitale, A., Astarita, V., Festa, D.: Big data for public transportation: A DSS framework. In: 2017 5th IEEE International Conference on Models and Technologies for Intelligent Transportation Systems (MT-ITS) (2017)
Performance Analysis of Healthcare Information in Big Data NoSql Platform
Sukhendu S. Mondal, Somen Mondal, and Sudip Kumar Adhikari
Abstract Healthcare information management systems produce a huge volume of healthcare data. The information recorded for each patient varies according to his/her health situation. This health information for all patients needs to be stored in a database for future use. Patient information is huge in volume, varied, and unstructured in nature, so it is very difficult to normalize and store in a traditional RDBMS. There is therefore a need to use NoSQL to store such data in a big data environment. NoSQL can handle unstructured medical data: each patient is identified by a row id, and varieties of medical information can be stored in a particular column family. In this paper, we present big data characteristics in healthcare systems, focusing particularly on NoSQL databases. We propose an architectural model that uses HBase as the NoSQL store on top of the Hadoop platform and show how query execution performance differs according to the data volume stored in HBase.
Keywords Health care · Big data · Hadoop · HBase · NoSQL
S. S. Mondal · S. Mondal · S. K. Adhikari (✉)
Cooch Behar Government Engineering College, Cooch Behar, West Bengal, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_20
1 Introduction
In a modern healthcare information management system (HIMS), a huge volume of patient information is generated [1–3]. The relationship of hospitals or clinics with their patients is influenced by the high velocity of this information. With the adoption of modern equipment, sensors, and handheld devices, a large volume of heterogeneous medical data is generated, and this becomes the source of healthcare information for a patient. This data can be heterogeneous and unstructured, and it should be accessible to all concerned. So there is a need to store, access, and update such information in a cost-effective and fault-tolerant manner. Hence, big data technology, a state-of-the-art set of tools and techniques, is used in HIMS to extract valuable and useful information from heterogeneous healthcare data [4]. Big data analytics (BDA) plays a vital role in reducing healthcare costs and identifying risks. Diagnostic treatments without human interference are improving day by day through BDA [5, 6]. Francis Diebold first presented big data as the recent and unique advancement in data recording and storage technology that made possible a surge in the number and quality of pertinent information; sample sizes are determined not by the number of observations but by data volume, say, in megabytes. Data generation at speeds of multiple gigabytes or more per day was not even considered unusual [7]. Big data can be characterized by the "3 V's", namely volume, velocity, and variety, as presented in [8]. Value is also an important characteristic along with the 3 V's. Volume indicates the size of data, measured in terabytes or petabytes; velocity is the speed of arriving data, or data bandwidth; variety refers to multiple types of data, such as structured data (relational), unstructured data (images, text, videos), or semi-structured data (web data, log files); and value refers to the enhancement of data collaboration by reducing data complexity, increasing data availability, and unifying data systems. These are essentially the ideas coming out of BDA.
The 4 V's are the backbone for measurement, and validation measures the accuracy and correctness of data features in any big data use case. Big data offers huge advantages over traditional computing and storage systems. It supports scaling out rather than scaling up when the data is huge in volume. Big data operations require high-end processing, and big data platforms provide a fault-tolerant environment for it. Apache Hadoop [9] is an open-source software framework for big data. It has two basic components: the Hadoop Distributed File System (HDFS), which acts as the storage system of Hadoop, and MapReduce, a programming model that simplifies parallel programming. Hadoop provides a storage platform that can only perform batch processing, so a query to a Hadoop file system takes a long time. HBase, on the other hand, is a framework that works on top of the Hadoop file system and allows random access to stored data. Traditional database systems are row oriented, so it is very difficult to store a particular patient's information in a single row when that information varies from patient to patient. There is therefore a need for a column-oriented (NoSQL) database [10, 11] to store patient information. The remaining part of the paper is structured as follows: Literature survey is presented in Sect. 2. Section 3 presents the need for big data and HBase for storing healthcare information. Possible platforms for storing healthcare information, with a proposed architectural model and a detailed comparison between them, are explored in Sect. 4. Section 5 presents results and discussion. Section 6 concludes the paper.
2 Literature Survey
A lot of research work has been done on big data applications in healthcare systems. These applications can be categorized by NoSQL type: columnar stores, key-value stores, graph stores, document stores, and hybrid stores. Ercan et al. [12] analyzed the following important properties of NoSQL applications in health care: scaling out instead of scaling up, automated scaling, high reliability, data model options, NewSQL compliance, optimized query execution, and cost-effectiveness. Lee et al. [13] proposed a patient-specific healthcare system using text mining on Hadoop for early prediction of deadly diseases and effective management of patient profiles. Yang et al. [14] presented more substantial work using HBase in a healthcare application. They used HBase to integrate different Excel files or related applications and demonstrated an experimental setup of their idea on a two-node cluster. The authors in [15] then implemented a setup using an HBase cluster for a healthcare system in Canada that can support storage of about 30 TB of healthcare data, where all internal processing is done by MapReduce. Park et al. [15] presented a graph database for the healthcare system and showed its usefulness through investigation with doctors and other healthcare agencies. Štufi et al. [16] described a Vertica NoSQL hybrid model to manage healthcare big data for a Czech healthcare center. They undertook a four-step BDA process, namely data storage, data management, data analytics, and data visualization. Gopinath et al. [17] implemented a proof of concept (POC) to benchmark a hybrid architecture for an industrial project based in India using MongoDB, HBase, and Cassandra on e-health clouds. The main components of this architecture are a query interface, a query administrator, and a translator for NoSQL organization.
In [18], the benefits of BDA applications to health care are identified and presented, such as better health care, better patient care, better medical care, better healthcare value, and better care delivery. Madyatmadja et al. [19] presented an analysis of healthcare data in a big data environment using a decision tree algorithm. They also performed an analysis of medical data on cardiovascular disease. Philip et al. [20] presented data analytics and predictive, visual analysis of type 2 diabetes. They also explored advanced data analytics techniques to support clinicians in decision making. Bi et al. [21] presented a model for an IoT-enabled healthcare system using deep learning-based data analytics.
3 Need for Big Data and HBASE for Storing Healthcare Information
Healthcare information is crucial for patients, doctors, and healthcare organizations. Due to the adoption of IoT-enabled devices, sophisticated sensor-enabled measuring equipment, computers, peripherals, and other handheld devices in the medical sector, data is generated in huge volume.
If we want to keep all records about a patient, we observe that the structure of patient information may vary from patient to patient. For example, if a patient has a kidney problem, we need to store the results of different kidney profile tests. For gastric problems we need to store a gastric profile, and similarly a brain profile for brain-related problems, a lungs profile for lung-related problems, etc. In our observation, patient information is huge in volume, possibly beyond exabytes, so more data storage is required. The structure of patient information is unpredictable and difficult to normalize for storage in traditional or distributed databases. On the other hand, not all problems can be solved in a distributed computing environment. If processing time is not a constraint, a complex task or job can be executed via a specialized remote service. In medical information processing or analysis, time is not a major constraint as it is in a transaction processing system (OLTP); it is basically an analytical system (OLAP). For complex data analysis or processing, the data moves to a remote or external service where plenty of spare or alternate resources are available that collaboratively store and process the data. Because data volume grows exponentially every year, it also becomes essential to replace the old storage and processing system with a new one. It is very difficult for companies in the health sector to replace the system every time because of financial constraints and the difficulty of data migration. So scaling up is not a very useful solution in a sector where processing time is not a major constraint, and scaling out is an alternative model that the healthcare industry can adopt.
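As a rough illustration of the scale-out model, where capacity grows by adding nodes rather than by replacing a server, the sketch below estimates how many data nodes a given raw data volume would need once replicas are counted. The function names, the 4 TB of usable storage per node, and the example volumes are assumptions made for illustration; the replication factor of 3 is the usual HDFS default and is configurable.

```python
import math

def nodes_required(raw_tb, replication_factor=3, usable_tb_per_node=4.0):
    """Data nodes needed to hold raw_tb of data including replicas.

    usable_tb_per_node is an illustrative assumption, not a measured value.
    """
    stored_tb = raw_tb * replication_factor
    return math.ceil(stored_tb / usable_tb_per_node)

def nodes_to_add(current_nodes, raw_tb, **kwargs):
    """Scale out: how many nodes to add to an existing cluster (never negative)."""
    return max(0, nodes_required(raw_tb, **kwargs) - current_nodes)

# Hypothetical example: an 8-node cluster whose data grows from 10 TB to 30 TB.
before = nodes_to_add(8, 10)
after = nodes_to_add(8, 30)
```

Under these assumptions the existing nodes stay in service and only the shortfall is added, which is the cost argument made for scaling out.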
In scaling up, we replace an existing server with a new, more sophisticated server with better processing and storage, which definitely increases cost. In scaling out, on the other hand, if the existing system cannot store or process the current volume of data, we add one or more extra nodes to it. Big data technology can store a large volume of data in a distributed, fault-tolerant environment and also supports scaling out. It can also process and analyze complex data. Big data technology is composed of multiple frameworks such as Hadoop, HBase, Hive, Pig, Flink, Spark, Storm, Graph, and many more. Hadoop is an open-source software framework that is part of big data technology, and it has two major parts. The first is the Hadoop Distributed File System (HDFS), a file system that generally acts as the storage platform for various kinds of data and can store a huge volume of data in a distributed way on a cluster of commodity hardware. HDFS splits the data and distributes it across many nodes in a cluster, which provides scale-out of hardware resources and fault tolerance. The second is MapReduce, a programming model that simplifies programming using two functions, map and reduce. In MapReduce, the map function processes each key/value pair and generates a set of intermediate key/value pairs, whereas the reduce function merges or summarizes all intermediate values associated with the same intermediate key produced by the map function (Fig. 1).
Fig. 1 Data flow in MapReduce
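The map and reduce roles described above can be sketched in a few lines of plain Python. This is a single-process illustration of the programming model only, not Hadoop code: the record fields, the function names, and the in-memory shuffle step are assumptions made for the example.

```python
from collections import defaultdict

def map_record(record):
    """Map: emit one intermediate (key, value) pair per patient record."""
    yield record["diagnosis"], 1

def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(key, values):
    """Reduce: merge all intermediate values that share the same key."""
    return key, sum(values)

def run_job(records):
    intermediate = (pair for record in records for pair in map_record(record))
    return dict(reduce_counts(k, v) for k, v in shuffle(intermediate).items())

# Hypothetical records, for illustration only.
records = [
    {"patient_id": "p1", "diagnosis": "liver"},
    {"patient_id": "p2", "diagnosis": "thyroid"},
    {"patient_id": "p3", "diagnosis": "liver"},
]
counts = run_job(records)
```

In a real cluster the map calls run in parallel on the nodes that hold the data blocks, and the grouping step moves intermediate pairs across the network; the sketch keeps only the data flow.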
So Hadoop is a software framework and part of big data technology that can meet the needs of a healthcare system in terms of storage, fault tolerance, availability, and consistency of data. One drawback of HDFS is that it gives only sequential access when searching for particular data in the system, which is very slow compared to the requirements of an analytical platform. HBase is a NoSQL system that works on top of a Hadoop cluster. HBase stores data in a columnar way, so the database is horizontally extensible. HBase is also a distributed database working with HDFS, which acts as its persistent storage platform. It is modeled on Google BigTable and, as it is layered on Hadoop, supports large tables with a large number of columns and rows. HBase is randomly accessible, unlike the sequential access of HDFS, which reduces searching time. Real-time read/write access is also possible with HBase. It is also highly configurable: the replication factor, number of region servers, backup servers, and other settings can be changed several times as needed. It also provides a great deal of flexibility to address huge amounts of data efficiently. The rest of this section explores how HBase addresses the challenges that exist in healthcare information.
HBase emulates features of an RDBMS; additionally, it is a horizontal-table or columnar database. In HBase, all data is stored in tables that consist of rows and columns, as in an RDBMS, but whereas an RDBMS is scalable only in the vertical direction, HBase is scalable in both horizontal and vertical directions. An HBase cell is identified as the intersection of a row and a column. Each cell value has a "version" attribute, which is basically a timestamp that uniquely identifies that value of the cell. Versioning is used to track changes in the cell, to identify the most recently updated data, or to retrieve any required version of the contents. In an HBase cell, data is stored in decreasing order of timestamp, so the latest values are always returned first when read.
An HBase table stores records using rows and columns like an RDBMS. Each row is identified by a unique "row key". A row key can be of a primitive type, a string type, or any other object type, so it is also flexible. The row key controls access to the data stored in cells. A column is part of a column family in an HBase table. The column family together with a column name is used to retrieve the data stored in a column for a particular row key. Storage in HBase is organized column-wise. It is therefore important to know how we will retrieve the data and how large a column is going to be. The table name, row key, column family name, and column name jointly define the schema of an HBase table. The schema must be defined and created before any record is inserted. Tables can be altered, and new column families and new columns under existing column families can be added while the database is operational. So, together, Hadoop and HBase are essential for storing, updating, and retrieving patient information.
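The data model described above (row key, column family, column name, and timestamped versions returned newest first) can be sketched with an in-memory stand-in. The class `SketchTable`, its methods, and the sample patient values are hypothetical and are not the HBase client API; the sketch only mirrors the addressing scheme.

```python
import itertools

class SketchTable:
    """In-memory stand-in for an HBase-style table (not the HBase API)."""

    def __init__(self, column_families):
        # Column families are fixed when the table is created.
        self.column_families = set(column_families)
        # row key -> (family, column) -> list of (timestamp, value), newest first
        self.rows = {}
        self._clock = itertools.count(1)  # logical timestamps for the sketch

    def put(self, row_key, family, column, value, timestamp=None):
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        ts = next(self._clock) if timestamp is None else timestamp
        cell = self.rows.setdefault(row_key, {}).setdefault((family, column), [])
        cell.append((ts, value))
        cell.sort(key=lambda tv: tv[0], reverse=True)  # keep newest version first

    def get(self, row_key, family, column, versions=1):
        """Return up to `versions` (timestamp, value) pairs, newest first."""
        cell = self.rows.get(row_key, {}).get((family, column), [])
        return cell[:versions]

# Different patients populate different column families and columns.
table = SketchTable(["kidney", "gastric"])
table.put("patient-001", "kidney", "creatinine", "1.4")
table.put("patient-001", "kidney", "creatinine", "1.1")  # newer version, same cell
table.put("patient-002", "gastric", "endoscopy", "normal")

latest = table.get("patient-001", "kidney", "creatinine")
history = table.get("patient-001", "kidney", "creatinine", versions=2)
```

Rows that have no kidney data simply have no cells in that family, which is how a column-family layout accommodates patient records whose structure differs from one patient to the next.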
4 Platform for Storing Healthcare Information with Comparison
Hadoop and HBase are the storage or warehouse solution for healthcare information. Hadoop stores data on clusters of commodity hardware, and when extra storage or processing is needed, it can handle the situation by adding extra nodes. HBase, which works on top of Hadoop, stores data in a columnar way and allows random access to it. Data stored in HBase can be accessed via the HBase shell or a Java client API. To make HBase accessible from outside the cluster, we prefer the Java client API. Besides HBase, other NoSQL platforms are available, such as Cassandra and MongoDB [22]. Cassandra uses a masterless, ring-like architecture in which all nodes in a cluster are treated equally, which offers many advantages over a master–slave architecture. Cassandra stores data in columns and rows like a traditional RDBMS. Its query language, the Cassandra Query Language (CQL), is very similar to the SQL used in RDBMSs, which makes it easier for SQL users to understand. Data stored in Cassandra is highly available and reliable because of its advanced repair processes. It also provides better fault tolerance and takes less than 40 seconds to recover from a node failure. Cassandra provides eventual, or weak, consistency, which is its major drawback compared with HBase (Fig. 2).
Fig. 2 Proposed model
MongoDB is a document-based NoSQL system that stores data as JSON-like documents. It is also a schema-less database, which provides flexibility and agility in the type of stored records, as the fields can vary from one document to another. MongoDB is horizontally scalable and provides data availability through redundancy or replication. It supports faster query execution through indexing, which also reduces the I/O overhead generally associated with database systems. The major drawback of MongoDB is that, unless we opt for one of the database-as-a-service (DBaaS) flavors, management operations such as patching are manual and time-consuming. MongoDB is document oriented, however, and for a column-oriented implementation we prefer HBase.
From the perspective of storing healthcare information, we prefer HBase because it works on HDFS, which is highly fault tolerant and highly available. Here, we propose an architecture based on the requirements of healthcare information. It consists of a cluster of computers that serves as a Hadoop cluster, with four or more data nodes that store the raw data together with replicas for fault tolerance and availability. The Hadoop cluster has a name node, which stores all meta information about the data nodes and coordinates among them. In addition, we propose a secondary name node, which keeps checkpoints of the name node metadata to aid recovery if the name node fails. HBase works on top of this Hadoop cluster, and ZooKeeper acts as the coordination and synchronization service among multiple region servers to provide strong consistency. The Java client API is preferred over the HBase shell for interacting with HBase because, with this API, we can implement a server through which a doctor, healthcare organization, or patient can interact from a remote location using any GUI mechanism. In that scenario, the system can behave like a big-data-as-a-service cloud implementation.
5 Results and Discussions
In our experiment, we used two configurations of HBase to store healthcare information: first, HBase on top of a Hadoop cluster in pseudo-distributed mode, and second, HBase on top of a Hadoop cluster in fully distributed mode with eight nodes. We used medical records drawn randomly from multiple UCI data sets [23]: the ILPD (Indian Liver Patient Dataset) data set, the thyroid disease data set, and the heart disease data set. We measured the scan (load) time of the data stored in HBase in both configurations. Figure 3 compares the time to scan (load) all records in pseudo-distributed and fully distributed modes. Here, the X-axis indicates the number of records in HBase, and the Y-axis shows the scanning time in nanoseconds (ns). From Fig. 3, we observe that the time taken to load all records into an application does not depend on the data volume or the number of records. In fully distributed mode, the average scanning (loading) time is slightly higher than in pseudo-distributed mode, as shown in Fig. 4. We used pseudo-distributed mode for testing purposes; if the data volume increases rapidly, the cluster will be used. Here, we use HBase on top of a Hadoop cluster of 8 nodes connected via LAN. Figure 5 shows the Hadoop cluster of 8 nodes, and Fig. 6 shows HBase working on this Hadoop cluster. When we search for a patient's record in HBase, the searching time does not depend on the data volume or the number of records present in HBase. This behavior is captured in Fig. 7, where the X-axis indicates the number of records in HBase and the Y-axis indicates the searching time. HBase provides random access when searching for any record, so in our observation the searching time does not depend on the number of records or the data volume. The average searching time is about 0.0774 s.
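The kind of measurement described above (scan time and single-record search time against the number of records) can be sketched in a few lines. This is only a toy model, not the authors' experiment and not the HBase API: `ToyColumnStore`, `build_store`, `mean_time_ns`, the row-key format, and the column names `info:age` and `lab:total_bilirubin` are all hypothetical, and a Python dictionary stands in for the table.

```python
import time


class ToyColumnStore:
    """In-memory stand-in for a column-family table (NOT the HBase API)."""

    def __init__(self):
        self._rows = {}  # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        """Random access by row key (a dictionary lookup in this toy model)."""
        return self._rows.get(row_key)

    def scan(self):
        """Full scan in row-key order, returned as (key, columns) pairs."""
        return sorted(self._rows.items())


def build_store(n_records):
    """Fill a store with synthetic patient rows (hypothetical columns)."""
    store = ToyColumnStore()
    for i in range(n_records):
        key = f"patient{i:06d}"
        store.put(key, "info:age", 20 + i % 60)
        store.put(key, "lab:total_bilirubin", 0.5 + (i % 10) / 10)
    return store


def mean_time_ns(fn, repeats=200):
    """Average wall-clock time of fn() in nanoseconds over `repeats` calls."""
    start = time.perf_counter_ns()
    for _ in range(repeats):
        fn()
    return (time.perf_counter_ns() - start) / repeats


if __name__ == "__main__":
    for n in (100, 500, 2500):
        store = build_store(n)
        key = f"patient{n // 2:06d}"
        search = mean_time_ns(lambda: store.get(key))
        scan = mean_time_ns(store.scan, repeats=20)
        print(f"records={n} search={search:.0f} ns scan={scan:.0f} ns")
```

Averaging over repeated calls is used here because a single nanosecond-scale reading is dominated by timer noise; the numbers such a toy produces say nothing about a real cluster.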
Among NoSQL implementations for storing records, several alternatives are available, such as HBase, Cassandra, MongoDB, and CouchDB, but among all of these, only
Fig. 3 Showing loading time against no. of record (X-axis: no. of records; Y-axis: scan time in ns; series: fully distributed (8 node) cluster and pseudo cluster)
Fig. 4 Showing average loading time (in ns) for the fully distributed (8 node) cluster and the pseudo cluster
HBase and Cassandra store records in a columnar way, whereas MongoDB and CouchDB store each record as a document. So clearly, HBase and Cassandra suit the distributed storage of healthcare records, where a huge volume of records must be stored and processed. In Cassandra, the data is stored on nodes connected in a ring, in a distributed manner and with some replication factor, to provide fault tolerance and high availability. In terms of the CAP theorem [24], Cassandra implements two of the three factors, availability (A) and partition tolerance (P), but provides only eventual (weak) consistency (C). On the other hand, HBase favors strong consistency (C) and partition tolerance (P) over availability (A), but as
Fig. 5 Hadoop cluster
Fig. 6 HBase on top of Hadoop cluster
Fig. 7 Showing searching time against no. of records (Y-axis: searching time in s)
Fig. 8 Recovery time (in ns) for the fully distributed (8 node) cluster and the pseudo cluster
HBase works on top of Hadoop, it also benefits from the high availability that Hadoop provides. Recovery time in case of node failure is another important aspect of HBase. Figure 8 shows the average recovery time in pseudo-distributed and fully distributed modes; we observed that the recovery time in fully distributed mode is slightly higher than in pseudo-distributed mode. The recovery time depends on the replication factor, which we set to 3 in our experiment. If the replication factor is increased, the recovery time also increases.
6 Conclusion
In this paper, we presented the characteristics of big data in healthcare systems, focusing particularly on NoSQL databases. We proposed an architectural model that uses HBase as a NoSQL store on top of the Hadoop platform and showed how query-execution performance varies with the data volume stored in HBase. In fully
distributed mode, although the performance does not differ much from that of standalone or pseudo-distributed mode, more storage capacity is available.
References
1. Imran, S., Mahmood, T., Morshed, A., Sellis, T.: Big data analytics in healthcare - a systematic literature review and roadmap for practical implementation. IEEE/CAA J. Automatica Sinica 8(1) (2021)
2. Chawla, N.V., Davis, D.A.: Bringing big data to personalized healthcare: a patient-centered framework. J. Gen. Intern. Med. 28(S3), 660–665 (2013)
3. Reddy, A.R., Kumar, P.S.: Predictive big data analytics in healthcare. In: Proceedings of 2nd International Conference on Computational Intelligence & Communication Technology, Ghaziabad, India (2016)
4. Chen, H., Chiang, R.H.L., Storey, V.C.: Business intelligence and analytics: from big data to big impact. MIS Quart. 36(4), 1165–1188 (2012)
5. Jee, K., Kim, G.H.: Potentiality of big data in the medical sector: focus on how to reshape the healthcare system. Healthc. Inform. Res. 19(2), 79–85 (2013)
6. King, J., Patel, V., Furukawa, M.F.: Physician adoption of electronic health record technology to meet meaningful use objectives: 2009–2012. The Office of the National Coordinator for Health Information Technology, Tech. Rep. (December 2012)
7. Diebold, F.X.: Big data dynamic factor models for macroeconomic measurement and forecasting. In: Advances in Economics and Econometrics, pp. 115–122. Eighth World Congress of the Econometric Society. Cambridge, Cambridge, UK (2000)
8. Laney, D.: 3D data management: controlling data volume, velocity, and variety. META Group, Tech. Rep. (2001)
9. Yao, Q., Tian, Y., Li, P.F., Tian, L.L., Qian, Y.M., Li, J.S.: Design and development of a medical big data processing system based on Hadoop. J. Med. Syst. 39(3), 23 (2015)
10. Harrison, G.: Next Generation Databases: NoSQL, NewSQL, and Big Data. Apress (2015)
11. Wu, X., Kadambi, S., Kandhare, D., Ploetz, A.: Seven NoSQL Databases in a Week: Get Up and Running with the Fundamentals and Functionalities of Seven of the Most Popular NoSQL Databases Kindle. Packt Publishing, USA (2018)
12. Ercan, M., Lane, M.: An evaluation of the suitability of NoSQL databases for distributed EHR systems. In: Proceedings of 25th Australasian Conference on Information Systems, Auckland, New Zealand (2014)
13. Lee, B., Jeong, E.: A design of a patient-customized healthcare system based on the Hadoop with text mining (PHSHT) for an efficient disease management and prediction. Int. J. Softw. Eng. Appl. 8(8), 131–150 (2014)
14. Yang, C.T., Liu, J.C., Hsu, W.H., Lu, H.W., Chu, W.C.C.: Implementation of data transform method into NoSQL database for healthcare data. In: Proceedings of International Conference on Parallel and Distributed Computing, Applications and Technologies, Taipei, China, pp. 198–205 (2013)
15. Park, Y., Shankar, M., Park, B.H., Ghosh, J.: Graph databases for large-scale healthcare systems: a framework for efficient data management and data services. In: Proceedings of IEEE 30th International Conference on Data Engineering Workshops, Chicago, USA (2014)
16. Štufi, M., Bacic, B., Stoimenov, L.: Big data analytics and processing platform in Czech republic healthcare. Appl. Sci. 10(5), 1705 (2020)
17. Gopinath, M.P., Tamilzharasi, G.S., Aarthy, S.L., Mohanasundram, R.: An analysis and performance evaluation of NoSQL databases for efficient data management in e-health clouds. Int. J. Pure Appl. Math. 117(21), 177–197 (2017)
18. Chen, K.L., Lee, H.: The impact of big data on the healthcare information systems. In: Transactions of the International Conference on Health Information Technology Advancement (2013)
19. Madyatmadja, E.D., Rianto, A., Andry, J.F., Tannady, H., Chakir, A.: Analysis of big data in healthcare using decision tree algorithm. In: 2021 1st International Conference on Computer Science and Artificial Intelligence (ICCSAI) (2021)
20. Philip, N.Y., Razaak, M., Chang, J., O’Kane, S.M.M., Pierscionek, B.K.: A data analytics suite for exploratory predictive, and visual analysis of type 2 diabetes. IEEE Access 10, 13460–13471 (2022)
21. Bi, H., Liu, J., Kato, N.: Deep learning-based privacy preservation and data analytics for IoT enabled healthcare. IEEE Trans. Industr. Inf. 18(7), 4798–4807 (2022)
22. Tudorica, B.G., Bucur, C.: A comparison between several NoSQL databases with comments and notes. In: Proceedings of RoEduNet International Conference on 10th Edition: Networking in Education and Research, Iasi, Romania (2011)
23. UCI Machine Learning Repository Homepage. https://archive.ics.uci.edu/ml/datasets.php. Last accessed 24 Dec 2022
24. Brewer, E.A.: Towards robust distributed systems. In: Proceedings of PODC, p. 7 (2000)
IoT-Based Smart Internet Protocol Camera for Examination Monitoring
Sukanya Bose, Swarnava Ghosh, Subhadeep Dhang, Rohan Karmakar, Prasenjit Dey, and Aritra Acharyya
Abstract An artificial intelligence (AI)-based smart Internet protocol (IP) camera system for examination monitoring has been designed, realized, and rigorously tested in a realistic environment. The purpose of the proposed IP cameras is to carry out real-time remote surveillance of the examination hall and to store the corresponding video recordings in cloud storage. Internet-of-things (IoT) technology is used for remote access to the IP cameras and for real-time video streaming to the web browser of any computer or smartphone connected to either the internal or an external network; a dedicated cloud service (named “remoteit”) is used for this purpose. In practice, video captured at 640 × 480 pixels and 30–35 frames per second (fps) is used for online streaming, whereas compressed video of 320 × 240 pixels at 30–35 fps is used for recording and cloud storage. In addition, a custom TensorFlow Lite model has been built to detect undesirable chaos in the examination hall, and a bulk short messaging service (SMS) application programming interface (API) is used to send real-time alerts to the competent authority so that necessary action can be initiated. In future, the proposed system may be customized for various other sectors for security and alerting purposes.
Keywords Artificial intelligence · Examination monitoring · IoT · IP camera · TensorFlow Lite · Cloud storage
S. Bose · S. Ghosh · S. Dhang · R. Karmakar · A. Acharyya (B) Department of Electronics and Communication Engineering, Cooch Behar Government Engineering College, Harinchawra, Ghughumari, Cooch Behar, West Bengal 736170, India e-mail: [email protected] P. Dey Department of Computer Science and Engineering, Cooch Behar Government Engineering College, Harinchawra, Ghughumari, Cooch Behar, West Bengal 736170, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_21
1 Introduction
Generally, Internet protocol (IP) cameras have no provision for a local recording device. They receive control data from server terminals and send image data via the IP network. IP cameras most commonly use the real-time streaming protocol (RTSP) for this purpose [1]. The primary advantages of IP cameras over conventional closed-circuit television (CCTV) cameras are that they can stream video data at significantly higher resolution and that they reduce the cabling cost [2]. The main disadvantage of an IP camera is that it fails to store video footage in cloud storage if the Internet is unavailable for a long interval. Therefore, Internet availability must be ensured to obtain uninterrupted remote streaming as well as uninterrupted storage of video footage in the cloud. Nowadays, remote monitoring of examination halls during offline examinations has become an essential need for schools, colleges, and universities [3–7]. In this paper, the authors propose an artificial intelligence (AI)-based smart IP camera system for examination monitoring. The proposed system has been designed, realized, and rigorously tested in a realistic environment (during real university examinations). The purpose of the proposed IP cameras is to carry out real-time remote surveillance of the examination hall and to store the corresponding video recordings in local as well as cloud storage. Internet-of-things (IoT) technology is used for remote access to the IP cameras and for real-time video streaming to the web browser of any computer or smartphone connected to either the internal or an external network; a dedicated cloud service (named “remoteit”) is used for this purpose. In practice, video captured at 640 × 480 pixels and 30–35 frames per second (fps) is used for online streaming, whereas compressed video of 320 × 240 pixels at 30–35 fps is used for recording and cloud storage.
In addition, a custom TensorFlow Lite model has been built to detect undesirable chaos in the examination hall, and a bulk short messaging service (SMS) application programming interface (API) is used to send real-time alerts to the competent authority (the officer-in-charge of the examination and the external observer) so that necessary action can be initiated on an urgent basis. In future, the proposed system may be customized for various other sectors for security and alerting purposes.
2 System Architecture and Hardware Realization
As mentioned in the previous section, RTSP is used to obtain the live video stream of each IP camera in the system from any computer or smartphone connected to the internal (home) or an external (other) network. A schematic of the overall architecture of the proposed system is shown in Fig. 1. A varying number of IP cameras is connected to a single router through a star-connected wireless local area network (WLAN). The cameras connected to a single router are regarded as a cluster (e.g., Cluster 1, Cluster 2, …, Cluster N for N clusters). Within an
organization, all the routers are connected to the Internet via one or more switches. The transmission control protocol (TCP) is used to register the camera port (conventionally port number 5000) of each IP camera in a premium remoteit cloud service account [8]. Each IP camera not only streams video data over the Internet through RTSP but also records compressed video and uploads it to a dedicated premium Google Drive (G-drive) account through a G-drive synchronization directory on the local storage. Therefore, unlike conventional IP cameras, the proposed IP cameras have local storage and can store captured video without interruption even when the Internet is unavailable for a long period of time. The multi-threaded program developed for the IP cameras to capture, stream, and record simultaneously is regarded as the client application. A dedicated program regarded as the server application (not yet prepared; it will be reported in due course) can be used from a remote terminal to view the live stream of any IP camera within any cluster through RTSP. At this moment, the live stream can be viewed at any terminal from a web browser by manually generating a proxy IP from the remoteit cloud service. The terminal can access the live stream of an IP camera from knowledge of the cluster number (cl) and the camera number (cm) under that cluster (i.e., [(cl)(cm)]). The IP camera uses a Raspberry Pi Zero 2 W (wireless), which has a quad-core 64-bit ARM Cortex-A53 processor clocked at 1 GHz and 512 MB of SDRAM, connected to a 5.0 megapixel (MP) Raspberry Pi Zero camera through a 15-pin ribbon-type flat-flexible cable (FFC). The camera serial interface (CSI) is used for interfacing the camera with the Raspberry Pi. A 32 or 64 GB microSD card, inserted into the SD card slot of the Raspberry Pi, carries the latest 32-bit Raspbian operating system (OS).
An AC (230 V, 50 Hz) to DC (5 V, 2 A) power adapter with a male micro-USB connector is used to power the Raspberry Pi. The entire system can be used as a minicomputer and can be accessed from any computer by using a virtual network
Fig. 1 Schematic of the proposed system architecture
Fig. 2 Schematic of the IP camera hardware
computing (VNC) application in headless mode (without connecting a monitor, keyboard, or mouse). The schematic of the proposed IP camera hardware is shown in Fig. 2. Photographs of different developmental stages of an IP camera prototype are shown in Fig. 3. Figure 4 shows photographs of an IP camera prototype installed (wall mounted) in a classroom (classroom number: AC 112) for testing.
Fig. 3 Photographs of different developmental stages
Fig. 4 Photographs of the smart IP camera prototype installed at the classroom for testing
3 Software Design and Implementation
This section briefly describes the client program that simultaneously captures, streams, and records frames. Three threads (Thread A: capturing thread, Thread B: streaming thread, and Thread C: recording thread) run simultaneously in a Python program to meet these three needs. A flowchart of the overall multi-threaded program is shown in Fig. 5. The capturing thread continuously captures frames from the camera at 640 × 480 pixels and passes each frame to a custom TensorFlow Lite (TFL) method [9] to detect the amount of chaos in the examination hall. If the detected amount of chaos exceeds a pre-assigned threshold for a sustained period of time, SMSs are sent to multiple recipients (the mobile numbers of the competent authorities) via a bulk SMS API. Throughout this capturing and detection process, 30–35 frames per second (fps) is achieved. The schematic of the capturing thread is shown in Fig. 6. Thread B, the streaming thread, uses the Flask streaming application to transmit the captured video frames over the Internet [10]. Finally, Thread C, the recording thread, records the captured and detected frames and saves them to a local directory. Recording lasts for around
Fig. 5 Flowchart of the multi-threaded algorithm for the smart IP camera
30 min, and each 30 min video clip (9–12 MB in size) is transferred to the G-drive synchronized directory before recording of the next clip starts. After the clip has been transferred from the local directory to the G-drive synchronized directory, the video file is deleted from the local directory. The size of the G-drive synchronized directory is kept within a predefined allowable storage size by deleting the oldest file from that directory whenever the limit is exceeded. The flowchart of the recording thread is shown in Fig. 7.
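The housekeeping step of the recording thread (move the finished clip into the synchronized directory, then delete the oldest clips while the directory exceeds its quota) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `move_and_prune`, the use of file modification time to decide which clip is oldest, and the rule of always keeping the newest clip are choices made here for the example.

```python
import os
import shutil


def list_clips(directory):
    """Return (mtime, size, path) for each regular file, oldest first."""
    clips = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                clips.append((info.st_mtime, info.st_size, entry.path))
    return sorted(clips)


def move_and_prune(clip_path, sync_dir, max_bytes):
    """Move a finished clip into sync_dir, then enforce the size quota.

    Returns the names of the clips deleted to stay within max_bytes.
    """
    # Moving (rather than copying) also removes the clip from the local directory.
    shutil.move(clip_path, os.path.join(sync_dir, os.path.basename(clip_path)))

    clips = list_clips(sync_dir)
    total = sum(size for _, size, _ in clips)
    deleted = []
    # Assumption: keep at least the newest clip even if it alone exceeds the quota.
    while total > max_bytes and len(clips) > 1:
        _, size, path = clips.pop(0)  # oldest remaining clip
        os.remove(path)
        total -= size
        deleted.append(os.path.basename(path))
    return deleted
```

If clip file names embed the recording start time, sorting by name would be an equally simple way to find the oldest clip and would not depend on file-system timestamps.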
4 Performance Evaluation
The performance of multiple IP camera prototypes installed in different examination halls was tested during a university examination at the host institution. Figure 8 presents, in graphical form, typical performance parameters of one IP camera, extracted from 500 frames captured in the middle of an examination. These performance parameters are (i) number of classified objects (NCO:
Fig. 6 Flowchart of the capturing thread (Thread—A) algorithm
number of head turnings, standings, etc.), (ii) number of successful detections (NSD), (iii) number of unsuccessful detections (NUSD), and (iv) number of false detections (NFD) within a frame. On average, less than 4% NUSD and less than 1% NFD are achieved across different illumination conditions (both daylight and artificial light). Typical detected frames obtained from the trial runs at different times of an examination day are shown in Figs. 9, 10, and 11. The square contours drawn around the students’ heads by the custom TFL method mark the chaos detected within a frame.
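The alerting rule used by the capturing thread (send an SMS only when the detected chaos stays above a pre-assigned threshold for a sustained period) can be sketched as a small state machine. This is a hedged illustration, not the authors' implementation: the class name `SustainedAlert`, the idea of measuring the sustained period in consecutive frames, and the example threshold are assumptions, and the per-frame score is a hypothetical number such as the count of detections in a frame.

```python
class SustainedAlert:
    """Signal once when a score stays above a threshold for N consecutive frames."""

    def __init__(self, threshold, min_frames):
        self.threshold = threshold
        self.min_frames = min_frames
        self._run = 0        # consecutive frames above the threshold so far
        self._fired = False  # True once the current run has triggered an alert

    def update(self, score):
        """Feed one per-frame score; return True only when an alert is due."""
        if score > self.threshold:
            self._run += 1
            if self._run >= self.min_frames and not self._fired:
                self._fired = True
                return True
        else:
            # The run is broken, so a later run may alert again.
            self._run = 0
            self._fired = False
        return False


if __name__ == "__main__":
    detector = SustainedAlert(threshold=0.6, min_frames=3)
    for frame_no, score in enumerate([0.2, 0.7, 0.8, 0.9, 0.95, 0.1]):
        if detector.update(score):
            # A real system would call the bulk SMS API here.
            print("alert at frame", frame_no)
```

Firing once per run, rather than on every frame above the threshold, keeps a 30–35 fps detector from sending dozens of messages per second for the same incident.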
Fig. 7 Flowchart of the recording thread (Thread—C) algorithm
5 Conclusion
An AI-based smart IP camera system for examination monitoring has been designed, realized, and rigorously tested in a realistic environment. The purpose of the proposed IP cameras is to carry out real-time remote surveillance of the examination hall and to store the corresponding video recordings in cloud storage. IoT technology is used for remote access to the IP cameras and for real-time video streaming to the web browser of any computer or smartphone connected to either the internal or an external network; a dedicated cloud service is used for this purpose. In practice, video captured at 640 × 480 pixels and 30–35 fps is used for online streaming, whereas compressed video of 320 × 240 pixels at 30–35 fps is used for recording and cloud storage. In addition, a custom TensorFlow Lite model has been built to detect undesirable chaos in the examination hall, and a bulk SMS API is
Fig. 8 Frame-wise numbers of classified objects (NCO), successful detections (NSD), unsuccessful detections (NUSD), and false detections (NFD) versus frame number plots
Fig. 9 Typical detected frames obtained from the trial run during morning session (under daylight) of the online examination during 11:20 am–12:00 pm
Fig. 10 Typical detected frames obtained from the trial run during afternoon session (under daylight) of the online examination during 12:40 pm–1:20 pm
Fig. 11 Typical detected frames obtained from the trial run during evening session (under artificial light) of the online examination during 4:40 pm–5:20 pm
used to send real-time alerts to the competent authority so that necessary action can be initiated. In future, the proposed system may be customized for various other sectors for security and alerting purposes.
References
1. RTSP: The Real-Time Streaming Protocol (2021). https://antmedia.io/rtsp-explained-what-is-rtsp-how-it-works/. Accessed on 11 Nov 2022
2. Video Streaming in Web Browsers with OpenCV and Flask (2020). https://antmedia.io/rtsp-explained-what-is-rtsp-how-it-works/. Accessed on 11 Nov 2022
3. Patil, N., Ambatkar, S., Kakde, S.: IoT based smart surveillance security system using raspberry Pi. In: 2017 International Conference on Communication and Signal Processing (ICCSP), pp. 0344–0348 (2017). https://doi.org/10.1109/ICCSP.2017.8286374
4. Lulla, G., Kumar, A., Pole, G., Deshmukh, G.: IoT based smart security and surveillance system. In: International Conference on Emerging Smart Computing and Informatics (ESCI), pp. 385–390 (2021). https://doi.org/10.1109/ESCI50559.2021.9396843
5. Kelly, S.D.T., Suryadevara, N.K., Mukhopadhyay, S.C.: IEEE Sens. J. 13(10), 3846–3853 (2013)
6. Sarhan, Q.I.: Systematic survey on smart home safety and security systems using the Arduino platform. IEEE Access 8, 128362–128384 (2020)
7. Kumar, J., Kumar, S., Kumar, A., Behera, B.: Real-time monitoring security system integrated with Raspberry Pi and e-mail communication link. In: 9th International Conference on Cloud Computing Data Science Engineering (Confluence), pp. 79–84 (2019)
8. Remote.it: Unified & Automated Network Management (2022). https://www.remote.it/. Accessed on 11 Nov 2022
9. Tensorflow Lite Model Maker (2022). https://www.tensorflow.org/lite/models/modify/model_maker. Accessed on 12 Sept 2022
10. Flask: Streaming Contents (2.2x) (2022). https://flask.palletsprojects.com/en/2.2.x/patterns/streaming/. Accessed on 4 Nov 2022
Efficient Low-Complexity Message Passing Algorithm for Massive MIMO Detection
Sourav Chakraborty, Salah Berra, Nirmalendu Bikas Sinha, and Monojit Mitra
Abstract Low-complexity massive multiple-input multiple-output (MIMO) receiver design with optimal performance is a challenging task. Several low-complexity approaches have been reported in the literature for massive MIMO detection. However, when the ratio of receive to transmit antennas is lower than four, conventional linear detectors do not perform well. The recently developed large-MIMO approximate message passing algorithm (LAMA) shows near-optimal detection performance. However, its complexity is still high. In this work, we propose an efficient approach to updating the mean and variance in LAMA. A termination condition is added to reduce unnecessary computations. Simulation results show that the error performance of the proposed algorithm is almost identical to that of the conventional method, while a significant complexity reduction is achieved.
Keywords Massive MIMO · LAMA · Low-complexity · Detection · Message passing
S. Chakraborty (B) Department of Electronics and Communication Engineering, Cooch Behar Government Engineering College, Cooch Behar, India e-mail: [email protected] S. Berra COPELABS, Universidade Lusófona de Humanidades e Tecnologias, Campo Grande 376, Lisboa 1749-024, Portugal N. B. Sinha Maharaja Nandakumar Mahavidyalaya, Sitalpur, India M. Mitra Indian Institute of Engineering Science and Technology, Shibpur, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2_22
1 Introduction
The massive MIMO system has proved to be one of the most promising solutions for achieving a higher data rate and higher spectral efficiency [1–4]. However, the advantages of massive MIMO systems come at the cost of higher computational complexity. Because of the high problem dimension, the maximum likelihood (ML) solution [5] is not a realistic option for massive MIMO detection. Minimum mean-squared error (MMSE)/zero-forcing (ZF) detection can achieve an optimal solution when the number of receive antennas is several times larger than the number of transmit antennas [6, 7]. Exact MMSE/ZF detection requires a matrix inversion with a complexity of order $O(N_t^3)$ [8]. Thus, the complexity of exact MMSE is still high for massive MIMO detection. Expectation propagation (EP)-based detectors can achieve an optimal solution in large/massive MIMO detection. However, EP-based detectors require an exact matrix inversion in each iteration, which increases the complexity significantly [9, 10]. Several low-complexity approximate MMSE methods have been proposed in the literature, such as Newton iteration [11], the Gauss-Seidel (GS) method [12], Richardson iteration (RI) [13], and hybrid iterative methods [7, 8]. However, when the numbers of transmit and receive antennas are close, the approximate methods need many iterations to converge to the exact MMSE solution. Also, MMSE fails to achieve a quasi-ML solution when the number of receive antennas is equal to or just higher than the number of transmit antennas [14]. The recently developed large-MIMO approximate message passing algorithm (LAMA) shows optimal performance in large-MIMO and massive MIMO detection [15, 16]. It was also shown that LAMA decouples the noisy MIMO received signal vector into a set of parallel independent AWGN channels [15]. However, the complexity of LAMA is still high because of redundant computation. A significant portion of the computation comes from computing the mean and variance over the constellation symbol set.
This motivates us to develop an efficient way to compute the mean and variance. Our contributions are summarized as follows: a low-complexity LAMA is proposed in which the computation is divided into two phases. The first phase is equivalent to conventional LAMA, and the second phase uses a low-complexity mean-variance (MV) update. A fixed neighbour-based mean and variance generation is proposed to reduce the computational overhead. Early termination is proposed to reduce unnecessary computation. Notation: boldface capital and small letters indicate matrices and vectors, respectively. $\Re(\cdot)$ and $\Im(\cdot)$ return the real and imaginary parts of a complex number.
2 System Model
A massive MIMO system is considered in this work, where $N_r$ is the number of base station antennas and $N_t$ is the number of single-antenna users. The base station serves all the users simultaneously in the same time-frequency resource. The received
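signal can be expressed as

$$\mathbf{y}_c = \mathbf{H}_c \mathbf{x}_c + \mathbf{w}_c, \tag{1}$$

where $\mathbf{x}_c \in \mathbb{C}^{N_t \times 1}$ is the transmitted vector whose elements are selected from the $M$-QAM constellation set $\tilde{\mathcal{A}}$, and $\mathbf{y}_c \in \mathbb{C}^{N_r \times 1}$ is the received signal vector. The additive white Gaussian noise (AWGN) in the channel is represented by $\mathbf{w}_c \in \mathbb{C}^{N_r \times 1}$, whose elements are i.i.d. and modelled as $w_i \sim \mathcal{CN}(0, \sigma^2)$. The channel matrix is $\mathbf{H}_c \in \mathbb{C}^{N_r \times N_t}$. The complex signal model can be converted into a real equivalent model as

$$\begin{bmatrix} \Re(\mathbf{y}_c) \\ \Im(\mathbf{y}_c) \end{bmatrix} = \begin{bmatrix} \Re(\mathbf{H}_c) & -\Im(\mathbf{H}_c) \\ \Im(\mathbf{H}_c) & \Re(\mathbf{H}_c) \end{bmatrix} \begin{bmatrix} \Re(\mathbf{x}_c) \\ \Im(\mathbf{x}_c) \end{bmatrix} + \begin{bmatrix} \Re(\mathbf{w}_c) \\ \Im(\mathbf{w}_c) \end{bmatrix}, \tag{2}$$

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}, \tag{3}$$

where $\mathbf{y} \in \mathbb{R}^{2N_r \times 1}$, $\mathbf{H} \in \mathbb{R}^{2N_r \times 2N_t}$, $\mathbf{x} \in \mathbb{R}^{2N_t \times 1}$, and $\mathbf{w} \in \mathbb{R}^{2N_r \times 1}$. In the real equivalent model, the elements of $\mathbf{x}$ are selected from the real equivalent constellation set $\mathcal{A} = \{-\sqrt{M}+1, \ldots, -1, 1, \ldots, \sqrt{M}-1\}$.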
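The conversion from the complex model to the real equivalent model can be checked numerically with a short sketch. It uses only plain Python lists and built-in complex numbers; the helper names `matvec` and `real_equivalent` and the small example values are hypothetical and chosen only for illustration.

```python
def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def real_equivalent(Hc, xc, wc):
    """Build (H, x, w) of the real-valued model from complex Hc, xc, wc.

    H stacks [Re(Hc), -Im(Hc)] over [Im(Hc), Re(Hc)]; x and w stack the
    real parts over the imaginary parts.
    """
    top = [[h.real for h in row] + [-h.imag for h in row] for row in Hc]
    bottom = [[h.imag for h in row] + [h.real for h in row] for row in Hc]
    x = [v.real for v in xc] + [v.imag for v in xc]
    w = [v.real for v in wc] + [v.imag for v in wc]
    return top + bottom, x, w


if __name__ == "__main__":
    # Example with Nr = 3 receive antennas and Nt = 2 users (16-QAM symbols).
    Hc = [[1 + 2j, 0.5 - 1j], [-1j, 2 + 0j], [0.3 + 0.3j, -1 + 1j]]
    xc = [1 - 1j, -3 + 1j]
    wc = [0.1 - 0.2j, 0j, -0.05 + 0.02j]

    yc = [a + b for a, b in zip(matvec(Hc, xc), wc)]
    H, x, w = real_equivalent(Hc, xc, wc)
    y = [a + b for a, b in zip(matvec(H, x), w)]

    stacked = [v.real for v in yc] + [v.imag for v in yc]
    print(max(abs(a - b) for a, b in zip(y, stacked)))  # difference near zero
```

The sketch also makes the dimensions explicit: with $N_r = 3$ and $N_t = 2$, the real-valued matrix has 6 rows and 4 columns, and the real-valued symbol vector has $2N_t = 4$ entries.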
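3 Proposed Low-Complexity LAMA
The likelihood of any symbol $\hat{x}_i$ with mean $\mu_i$ and variance $\sigma_i^2$ is expressed as

$$p(\hat{x}_i) \propto \exp\left(-\frac{(\hat{x}_i - \mu_i)^2}{2\sigma_i^2}\right). \tag{4}$$

The mean and variance can be updated by considering the discrete constellation set $\mathcal{A}$ as

$$\mu_{i+1} = \sum_{s \in \mathcal{A}} s\, p(x_i = s) = \frac{\sum_{s \in \mathcal{A}} s \exp\left(-\frac{(s - \mu_i)^2}{2\sigma_i^2}\right)}{\sum_{s \in \mathcal{A}} \exp\left(-\frac{(s - \mu_i)^2}{2\sigma_i^2}\right)}, \tag{5}$$

$$\sigma_{i+1}^2 = \frac{\sum_{s \in \mathcal{A}} s^2 \exp\left(-\frac{(s - \mu_i)^2}{2\sigma_i^2}\right)}{\sum_{s \in \mathcal{A}} \exp\left(-\frac{(s - \mu_i)^2}{2\sigma_i^2}\right)} - \mu_{i+1}^2. \tag{6}$$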
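The mean and variance update of Eqs. (5) and (6) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function names `pam_levels` and `mean_variance_update` are hypothetical, and subtracting the largest exponent before calling `exp` is a numerical-stability step added here that does not change the ratio in Eqs. (5) and (6).

```python
import math


def pam_levels(M):
    """Real-equivalent alphabet {-sqrt(M)+1, ..., -1, 1, ..., sqrt(M)-1} of M-QAM."""
    side = math.isqrt(M)
    if side * side != M:
        raise ValueError("M must be a perfect square such as 4, 16, or 64")
    return [2 * k - side + 1 for k in range(side)]


def mean_variance_update(mu, var, symbols):
    """One update of Eqs. (5)-(6) over the candidate set `symbols`.

    Returns (new_mean, new_variance).
    """
    exponents = [-((s - mu) ** 2) / (2.0 * var) for s in symbols]
    shift = max(exponents)  # common factor; cancels in numerator and denominator
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    new_mu = sum(s * w for s, w in zip(symbols, weights)) / total
    second_moment = sum(s * s * w for s, w in zip(symbols, weights)) / total
    # Clamp tiny negative values caused by floating-point rounding.
    return new_mu, max(second_moment - new_mu ** 2, 0.0)


if __name__ == "__main__":
    levels = pam_levels(16)  # [-3, -1, 1, 3]
    print(mean_variance_update(0.4, 2.0, levels))   # early iteration: large variance
    print(mean_variance_update(2.9, 0.01, levels))  # late iteration: mean close to 3
```

Each call evaluates one exponential per element of `symbols`, so the cost of an update grows with the size of the candidate set passed in.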
Thus, from Eqs. (5) and (6), we can notice that $\sqrt{M}$ exponential terms are required for each mean and variance update. Additionally, $3\sqrt{M}$ additions and $3\sqrt{M}$ multiplications are required. As the number of constellation points increases, the computation increases significantly. It is important to understand that the overall complexity depends on the number of constellation points. That is why we choose a real equivalent system over a complex model. Before proposing a low-complexity solution, we need to understand the behaviour of the likelihood values of different constellation points.

Fig. 1 Likelihood value computation of constellation points in conventional and proposed method (panels (a)–(f), labelled Conventional and Proposed; each panel marks $\mu_i$ and $\sigma_i^2$ over the constellation points −7 to +7)

Figure 1 shows a typical likelihood value plot for different constellation points. In Fig. 1a and b, we notice that the variance is high at the initial iterations. Equivalently, the optimal solution is still uncertain and the signal-to-interference-noise ratio (SINR) is low. After a certain number of LAMA iterations, the variance decreases, as shown in Fig. 1c. When a sufficient number of iterations are completed, the variance becomes very small, and the mean value almost converges to a certain constellation point, as shown in Fig. 1d. Therefore, after a certain number of iterations, only a few constellation points near the mean value are significant for the next mean-variance update, and the rest of the constellation points can be discarded from the computation. Thus, the low-complexity solution of Eqs. (5) and (6) can be expressed as

$$\mu_{i+1} \approx \sum_{s \in NG(Q(\mu_i))} s\, p(x_i = s) = \frac{\sum_{s \in NG(Q(\mu_i))} s \exp\left(-\frac{(s - \mu_i)^2}{2\sigma_i^2}\right)}{\sum_{s \in NG(Q(\mu_i))} \exp\left(-\frac{(s - \mu_i)^2}{2\sigma_i^2}\right)} \tag{7}$$

$$\sigma_{i+1}^2 \approx \frac{\sum_{s \in NG(Q(\mu_i))} s^2 \exp\left(-\frac{(s - \mu_i)^2}{2\sigma_i^2}\right)}{\sum_{s \in NG(Q(\mu_i))} \exp\left(-\frac{(s - \mu_i)^2}{2\sigma_i^2}\right)} - \mu_{i+1}^2. \tag{8}$$
Here, $Q(\mu_i)$ returns the constellation point nearest to $\mu_i$, and $NG(Q(\mu_i))$ is the set of constellation points nearest to $\mu_i$. For example, if we consider 16-QAM modulation and a three-neighbour set, we get $NG(-3) = \{-3, -1, 1\}$, $NG(-1) = \{-3, -1, 1\}$, $NG(1) = \{-1, 1, 3\}$ and $NG(3) = \{-1, 1, 3\}$. Figure 1e and f show the likelihood value plot of three neighbours around the mean value. Equations (7) and (8) require an updated $NG(Q(\mu_i))$ for each iteration, which may require extra computational effort. This work therefore also considers a fixed $NG(Q(\mu_i))$ to compute the mean and variance for the rest of the $L_p$ iterations. The efficient computation of mean and variance over a set of neighbour symbols can reduce the computational burden significantly. However, when the mean matches its nearest constellation point, the variance becomes almost zero, and no further improvement in detection performance is possible. In this case, the algorithm should be terminated to avoid any unnecessary computation. Therefore, we terminate the algorithm whenever the cumulative variance $\sigma^2_{t+1}$ reaches a certain threshold $\sigma^2_{th}$.

Algorithm 1 Proposed low-complexity LAMA detection
1: inputs: $\mathbf{H}$, $\mathbf{y}$; outputs: $\mathbf{z}_{t+1}$
2: preprocessing: $\mathbf{G} = \mathbf{H}^{T}\mathbf{H}$, $\tilde{\mathbf{G}} = \mathbf{I}_{2N_t} - \mathrm{diag}(\mathbf{G})^{-1}\mathbf{G}$,
3:     $\tilde{\mathbf{y}}^{MF} = \mathrm{diag}(\mathbf{G})^{-1}\mathbf{H}^{T}\mathbf{y}$, $g_u = G_{uu}/2N_t$, $u = 1, 2, \ldots, 2N_t$
4: initialize: terminate = 0, $\mathbf{z}_1 = \boldsymbol{\mu}_1 = \mathbf{0}_{2N_t \times 1}$, $\rho_1 = 0$
5: while terminate ≠ 1 do
6:     if $t \le L_s$ then   ▷ same as conventional method
7:         $\boldsymbol{\mu}_{t+1} = F(\mathbf{z}_t, \rho_t \mathbf{g})$, $\boldsymbol{\sigma}^2_{t+1} = V(\mathbf{z}_t, \rho_t \mathbf{g})$   ▷ mean and variance update
8:         $\hat{\sigma}_{t+1} = \frac{1}{N_r}\mathbf{g}^{T}\boldsymbol{\sigma}^2_{t+1}$, $\boldsymbol{\alpha}_t = \mathbf{z}_t - \boldsymbol{\mu}_t$   ▷ Onsager term calculation
9:         $b_t = \rho_t \hat{\sigma}_{t+1}$, $\mathbf{z}_{t+1} = \tilde{\mathbf{y}}^{MF} + \tilde{\mathbf{G}}\boldsymbol{\mu}_{t+1} + b_t \boldsymbol{\alpha}_t$, $\rho_{t+1} = (\frac{1}{N_r}\sigma^2 + \hat{\sigma}_{t+1})^{-1}$   ▷ interference cancellation & SINR update
10:    else   ▷ use proposed low-complexity mean-variance update
11:        $\boldsymbol{\mu}_{t+1} = \tilde{F}(\mathbf{z}_t, \rho_t \mathbf{g})$, $\boldsymbol{\sigma}^2_{t+1} = \tilde{V}(\mathbf{z}_t, \rho_t \mathbf{g})$   ▷ mean and variance update
12:        $\hat{\sigma}_{t+1} = \frac{1}{N_r}\mathbf{g}^{T}\boldsymbol{\sigma}^2_{t+1}$, $\boldsymbol{\alpha}_t = \mathbf{z}_t - \boldsymbol{\mu}_t$   ▷ Onsager term calculation
13:        $b_t = \rho_t \hat{\sigma}_{t+1}$, $\mathbf{z}_{t+1} = \tilde{\mathbf{y}}^{MF} + \tilde{\mathbf{G}}\boldsymbol{\mu}_{t+1} + b_t \boldsymbol{\alpha}_t$, $\rho_{t+1} = (\frac{1}{N_r}\sigma^2 + \hat{\sigma}_{t+1})^{-1}$   ▷ interference cancellation & SINR update
14:    end if
15:    if $(t < L)$ && $\sigma^2_{t+1} \ge \sigma^2_{th}$ then   ▷ algorithm termination condition, $L = L_s + L_p$
16:        terminate = 0,
17:        $t = t + 1$
18:    else
19:        terminate = 1
20:    end if
21: end while
Algorithm 1 shows the detailed steps of the proposed detection method. The first $L_s$ iterations (lines 6–9) are identical to conventional LAMA [15]. The remaining $L_p$ iterations, given in lines 11–13, are computed with the proposed mean-variance update method. $F(\mu_i, \sigma_i^2)$ and $V(\mu_i, \sigma_i^2)$ compute the element-wise posterior mean and variance using Eqs. (5) and (6). Their low-complexity equivalents are $\tilde{F}(\mu_i, \sigma_i^2)$ and $\tilde{V}(\mu_i, \sigma_i^2)$, respectively, which are computed using Eqs. (7) and (8). The termination of the algorithm is given in lines 15–19.
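The neighbour-restricted update of Eqs. (7) and (8) can be sketched as follows. This is a hedged illustration, not the authors' implementation: the names `neighbour_set` and `mv_update_reduced` are hypothetical, the window is assumed to be centred on $Q(\mu_i)$ and clamped at the constellation edges (which reproduces the 16-QAM example given for $NG$), the constellation is assumed to be sorted in ascending order, and the exponent shift is an added numerical safeguard.

```python
import math


def neighbour_set(mu, constellation, n_g):
    """NG(Q(mu)): n_g consecutive points around the point nearest to mu."""
    # Q(mu): index of the nearest constellation point.
    q = min(range(len(constellation)), key=lambda k: abs(constellation[k] - mu))
    # Assumption: centre the window on Q(mu) and clamp it at the edges.
    start = min(max(q - n_g // 2, 0), len(constellation) - n_g)
    return constellation[start:start + n_g]


def mv_update_reduced(mu, var, constellation, n_g):
    """Approximate posterior mean and variance over NG(Q(mu)) only, Eqs. (7)-(8)."""
    points = neighbour_set(mu, constellation, n_g)
    exponents = [-(s - mu) ** 2 / (2.0 * var) for s in points]
    shift = max(exponents)  # cancels in the ratios; avoids underflow
    weights = [math.exp(e - shift) for e in exponents]
    total = sum(weights)
    new_mu = sum(s * w for s, w in zip(points, weights)) / total
    second_moment = sum(s * s * w for s, w in zip(points, weights)) / total
    return new_mu, max(second_moment - new_mu ** 2, 0.0)


A4 = [-3, -1, 1, 3]                  # real-equivalent set of 16-QAM
A8 = [-7, -5, -3, -1, 1, 3, 5, 7]    # real-equivalent set of 64-QAM
sets_16qam = [neighbour_set(s, A4, 3) for s in A4]
approx = mv_update_reduced(2.6, 0.3, A8, 3)   # only the points 1, 3 and 5 are used
```

With three neighbours, the 64-QAM update evaluates three exponentials instead of eight. A fixed neighbour set, as considered in the text, would amount to calling `neighbour_set` once and reusing its result in later iterations.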
4 Complexity Analysis
This section shows the number of arithmetic operations required for the proposed algorithm and compares it with the conventional method. As we can see, the first $L_s$ iterations are identical to the conventional method. The low-complexity mean and variance update (MV update) is applied for the remaining $L_p = L - L_s$ iterations. Each mean and variance update in the first $L_s$ iterations requires $3\sqrt{M}$ additions, $3\sqrt{M}$ multiplications, $2\sqrt{M} + 2$ divisions and $\sqrt{M}$ exponential operations, as indicated in Eqs. (5) and (6), whereas in the last $L_p$ iterations, $N_g$ neighbour symbols are considered instead of $\sqrt{M}$ symbols. The number of arithmetic operations required for the Onsager term and SINR update is the same for both the conventional and the proposed method. For each iteration, the number of required additions and multiplications for the Onsager term and SINR update is $4N_t^2 + 6N_t$ and $4N_t^2 + 4N_t$, respectively. Table 1 gives the detailed comparison of the arithmetic operations required in the conventional and the proposed method.

Table 1 Comparison of different real arithmetic operations required for conventional and proposed LAMA

| Operation | Conventional [16]: MV update | Conventional [16]: other computations | Proposed: MV update | Proposed: other computations |
|---|---|---|---|---|
| Addition | $6LN_t\sqrt{M}$ | $2N_tL(2N_t + 3)$ | $6N_t(L_s\sqrt{M} + L_pN_g)$ | $2N_tL(2N_t + 3)$ |
| Multiplication | $6LN_t\sqrt{M}$ | $4N_tL(N_t + 1)$ | $6N_t(L_s\sqrt{M} + L_pN_g)$ | $4N_tL(N_t + 1)$ |
| Division | $4N_tL(\sqrt{M} + 1)$ | $L$ | $4N_t(L_s\sqrt{M} + L + L_pN_g)$ | $L$ |
| Exponential | $2LN_t\sqrt{M}$ | 0 | $2N_t(L_s\sqrt{M} + L_pN_g)$ | 0 |

However, the operations listed in Table 1 for the proposed method correspond to the worst-case scenario. The termination condition given in line 15 of Algorithm 1 helps to avoid redundant iterations once the threshold is reached. Thus, the actual number of operations in the proposed method might be less than the table values. Also, it is worth mentioning that for higher-order modulation like 256-QAM, $N_g$ …

… spread_best then
20: spread_best = s
21: Node = v
22: end if
23: end for
24: S = S ∪ Node
25: spread = spread + (spread_best − spread)   ▷ Add marginal spread
26: i = i + 1
27: end while
28: σ(S) = spread / |S|   ▷ Calculate the average spread
29: return σ(S)
30: end procedure
An Effective Community-Based Greedy Approach Over Selected …
3.1 Network Partitioning
Our network partitioning algorithm works in three steps: (a) using clique proximity measures, a transmuted network is created (line 2); (b) the Louvain algorithm (line 4) detects several communities within the transmuted network; and (c) the post-processing method (line 5) determines which nodes overlap in each community, so that the overlapping communities are finally identified.
3.1.1 Clique Proximity
The clique proximity method is used to determine which nodes in a network are the most influential. In this technique, overlapping communities are detected using a disjoint community detection algorithm, which means that some nodes belong to more than one community. Nodes that belong to more than one community are considered influential nodes in our proposed algorithm. Clique proximity [12] is used to produce a transmuted network from the original network ($G$). The number of nodes and edges in the transmuted network increases when some nodes follow the proper proximity. In this method, each node in the graph ($G$) creates a sub-network $\bar{G}$ with its neighbouring nodes, and 3-clique percolation is used to determine which node will be replicated in this sub-network. Two types of clique proximity exist:
– The clique proximity of vertex $u$ is known as null proximation when the division set of $u$ is empty (equal to $\phi$).
– The clique proximity of vertex $u$ is known as proper proximation when the division set of $u$ is not empty (not equal to $\phi$).
For example, Fig. 1a and b show the original and transmuted networks. The neighbours of node ‘2’ and node ‘3’ form the sub-network, where they follow the proper proximity. Thus, these nodes participate in the network more than once.

Fig. 1 a Original network. b Transmuted network

Disjoint Community Detection Algorithm: The Louvain technique is used to identify disjoint communities, with the network $\bar{G}$ given as input. In the Louvain method, the cluster count is never provided as an input, since it is an unsupervised algorithm. This method has two phases:
– Modularity Optimization: This algorithm optimizes the modularity of the network by randomly ordering all the nodes. Nodes are removed and inserted into each community until the modularity of the partition does not increase significantly. In addition, the quality of the partition is maximized by maximizing the modularity ($Q$). This can be depicted as

$$Q = \frac{1}{4} \sum_{xy} \left[ A_{xy} - \frac{k_x k_y}{2m} \right] (c_x c_y + 1) \tag{1}$$

where $A$ is the weighted adjacency matrix; $k_x$ and $k_y$ are the degrees of nodes $v_x$ and $v_y$; $c_x$ and $c_y$ are the clusters of nodes $v_x$ and $v_y$; $m = \frac{1}{2} \sum_x k_x$ is the total edge count in the network; and $c_x = 1$ if vertex $v_x$ belongs to group $x$ and $-1$ otherwise.
– Community Aggregation: In community aggregation, all minor communities are merged into one node. The number of links within a community is counted before the community is collapsed into a single node. Figure 2 shows that three communities have been formed.

Fig. 2 Network with three communities
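To make the modularity expression concrete, the sketch below evaluates the two-group form $\frac{1}{4}\sum_{xy}[A_{xy} - k_x k_y / 2m](c_x c_y + 1)$ directly on a small graph. The function name `two_group_modularity` and the toy graph are illustrative assumptions. Note that dividing this value by $m$ gives the conventionally normalised modularity, so the two differ only by a constant factor for a fixed graph.

```python
def two_group_modularity(adj, c):
    """Evaluate (1/4) * sum_xy [A_xy - k_x k_y / (2m)] * (c_x c_y + 1).

    adj: symmetric adjacency matrix as a list of rows.
    c:   list of +1 / -1 group labels, one per node.
    """
    n = len(adj)
    k = [sum(row) for row in adj]    # node degrees k_x
    m = sum(k) / 2.0                 # total edge count, m = (1/2) * sum_x k_x
    total = 0.0
    for x in range(n):
        for y in range(n):
            total += (adj[x][y] - k[x] * k[y] / (2.0 * m)) * (c[x] * c[y] + 1)
    return total / 4.0


# Hypothetical toy graph: triangles 0-1-2 and 3-4-5 joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

split = two_group_modularity(adj, [1, 1, 1, -1, -1, -1])   # one triangle per group
merged = two_group_modularity(adj, [1, 1, 1, 1, 1, 1])     # all nodes in one group
```

Separating the two triangles scores higher than keeping all nodes in one group, which is the kind of comparison the modularity optimization phase relies on when it moves nodes between communities.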
Fig. 3 Network has three overlapping communities
3.1.2 Post-processing Method
In the post-processing method, nodes that participate in multiple communities are renamed back to the original node. After renaming, the overlapping nodes and communities are identified. All overlapping nodes within significant communities form a candidate set. Figure 3 shows that node ‘2’ and node ‘3’ are overlapping nodes, but only node ‘3’ is a member of the significant community, so node ‘3’ is the only candidate node.
3.2 Candidate Node Selection
Community information is obtained in the third step. The candidate selection step (lines 6–12) derives a set of candidate nodes. Realistic social networks are usually very large, and the state space for seed selection is huge. Consequently, it is very important to reduce the number of candidate nodes effectively. By analysing the structure of the communities, we find that not all communities are large enough to accommodate seed nodes, and such communities are not considered significant. Although the network is separated into three communities (Fig. 2), community 3 is insignificant compared to communities 1 and 2 owing to its smaller size. Such smaller communities are discarded depending on the community ratio ($\rho$). The community size is the number of nodes present in a community ($|C|$). The $i$th community ratio ($\rho_i$) is the difference between the size of the $i$th community ($|C_i|$) and the minimum community size ($|C|_{Min}$), divided by the difference between the maximum ($|C|_{Max}$) and the minimum ($|C|_{Min}$) community sizes:

$$\rho_i = \frac{|C_i| - |C|_{Min}}{|C|_{Max} - |C|_{Min}} \tag{2}$$
Here, communities 1 and 2 have been selected as the significant communities because their community ratios are greater than or equal to the threshold value (0.5). All the overlapping nodes are taken from the significant communities, and a candidate set is made from these nodes. Combined with the greedy approach over the Linear Threshold diffusion model, this idea works well once the number of candidate nodes has been reduced effectively.
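The community ratio of Eq. (2) and the resulting candidate set can be sketched as follows. The function names, the example communities and the handling of equal-sized communities are assumptions made for illustration. The sketch also assumes that a candidate is an overlapping node belonging to at least one significant community, which is one reading of the description above.

```python
def community_ratios(communities):
    """rho_i = (|C_i| - |C|_min) / (|C|_max - |C|_min) for each community."""
    sizes = [len(c) for c in communities]
    lo, hi = min(sizes), max(sizes)
    if hi == lo:   # assumption: equal-sized communities all count as significant
        return [1.0] * len(sizes)
    return [(s - lo) / (hi - lo) for s in sizes]


def candidate_nodes(communities, threshold=0.5):
    """Overlapping nodes that belong to at least one significant community."""
    ratios = community_ratios(communities)
    significant = [c for c, r in zip(communities, ratios) if r >= threshold]
    membership = {}
    for c in communities:
        for v in c:
            membership[v] = membership.get(v, 0) + 1
    overlapping = {v for v, count in membership.items() if count > 1}
    in_significant = set().union(*significant) if significant else set()
    return overlapping & in_significant


# Hypothetical communities: node 3 is shared by the two larger ones.
communities = [{1, 2, 3, 4, 5, 6}, {3, 7, 8, 9, 10}, {11, 12}]
ratios = community_ratios(communities)
candidates = candidate_nodes(communities, threshold=0.5)
```

With these sizes (6, 5 and 2), the first two communities pass the 0.5 threshold and the smallest one is discarded, mirroring the three-community example in the text.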
3.3 Seed Set Generation
A greedy algorithm is used to identify the top k influential nodes (lines 13–27) from the candidate set. This method calculates the expected spread using Monte Carlo simulations over the Linear Threshold (LT) diffusion model and performs the same operation for k iterations. Specifically, in each iteration the algorithm computes the marginal spread of each remaining candidate node and identifies the node with the largest marginal spread. The marginal spread is derived through the LT() function. The method returns the seed set S and the average spread (σ(S)).
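A compact sketch of greedy seed selection with Monte Carlo estimation under the LT model is given below. It is an illustration under stated assumptions rather than the authors' code: the graph is an undirected adjacency dictionary, every neighbour of a node v is assumed to carry the same weight 1/deg(v), node thresholds are drawn uniformly at random in each run, and the names `lt_spread`, `expected_spread` and `greedy_seeds` are hypothetical.

```python
import random


def lt_spread(adj, seeds, rng):
    """Run one Linear Threshold cascade and return the number of active nodes."""
    thresholds = {v: rng.random() for v in adj}
    active = set(seeds)
    frontier = set(seeds)
    while frontier:
        newly_active = set()
        for u in frontier:
            for v in adj[u]:
                if v in active or v in newly_active:
                    continue
                # Assumed weights: each neighbour of v contributes 1 / deg(v).
                influence = sum(1 for w in adj[v] if w in active) / len(adj[v])
                if influence >= thresholds[v]:
                    newly_active.add(v)
        active |= newly_active
        frontier = newly_active
    return len(active)


def expected_spread(adj, seeds, runs, rng):
    """Monte Carlo estimate of the expected spread of a seed set."""
    return sum(lt_spread(adj, seeds, rng) for _ in range(runs)) / runs


def greedy_seeds(adj, candidates, k, runs=200, seed=0):
    """Pick k seeds from the candidates, one per iteration, by largest spread."""
    rng = random.Random(seed)
    chosen, spread = [], 0.0
    remaining = set(candidates)
    for _ in range(min(k, len(remaining))):
        best_node, best_spread = None, -1.0
        for v in sorted(remaining):
            s = expected_spread(adj, chosen + [v], runs, rng)
            if s > best_spread:          # keep the node with the best spread so far
                best_node, best_spread = v, s
        chosen.append(best_node)
        remaining.remove(best_node)
        spread = best_spread             # equals previous spread plus marginal gain
    return chosen, spread


# Hypothetical star graph: hub 0 connected to leaves 1..5.
adj = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
seeds, spread = greedy_seeds(adj, candidates=[0, 1, 2], k=2)
average_spread = spread / len(seeds)     # spread divided by |S|
```

Restricting `candidates` to the overlapping nodes of significant communities is what keeps the inner loop short; the number of Monte Carlo runs is kept small here only to make the example quick.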
4 Result and Discussion
The proposed algorithm (CBGA-IM) has been compared with two well-known algorithms, the New Greedy Algorithm [NG] and the Community-Based Greedy Algorithm [CGA], both of which follow the greedy method and the Linear Threshold (LT) diffusion model. CBGA-IM has been tested on two benchmark real-world social network data sets (Table 1). NetPHY is constructed from academic collaboration networks: each node represents an author, and an edge is created whenever two authors write a paper together. EmailEnr, an email network containing about half a million emails, was derived from the Enron email network. In the graph, nodes represent email addresses, and an undirected edge between ‘i’ and ‘j’ is formed if ‘i’ sent at least one email to ‘j’. The tests were performed on a machine with an Intel Core i7 processor and 64 GB of RAM, running CentOS Enterprise version 6.0. The Python programming language (version 3.8) was used for implementation, together with packages such as NetworkX 2.0 and NumPy 1.8 for graph and numerical calculations.

Table 1 Data set information

| Data network | Nodes | Edges |
|---|---|---|
| NetPHY | 37,154 | 231,584 |
| EmailEnr | 36,692 | 183,831 |

Fig. 4 NetPHY, threshold = 0.60, k = 20, p = 0.02

Fig. 5 EmailEnr, threshold = 0.65, k = 20, p = 0.02

The proposed algorithm was tested on the two large data sets in Table 1 and the results were recorded. In most cases, CBGA-IM produced better results in terms of running time (t) and influence spread (σ). On the NetPHY and EmailEnr data sets, CBGA-IM took 75% and 25% less time than the New Greedy [NG] [1] algorithm, and 17% and 21% less time than the Community-Based Greedy [CGA] [13] algorithm, respectively. Three user-defined parameters are used in CBGA-IM during the experiments. The first is a threshold whose value lies in [0.5, 0.7]; significant communities are selected according to this value, and it plays a big role in determining which communities are significant as the number of communities increases. For high-density graphs, a threshold ≥ 0.6 was used. The second is the active seed set size ‘k’, and the third is the propagation probability ‘p’, set to p = 0.02. Detailed results on the NetPHY and EmailEnr data sets are presented in Figs. 4 and 5, where the cardinality of the seed sets was k = 20 and the activation probability was p = 0.02. In the greedy methods, 1000 Monte Carlo simulations were performed.
5 Conclusion
The influence maximization method is used in a variety of campaigns, disease outbreak detection, sensor networks, and digital marketing strategies. A community-based influence maximization method was applied over selected communities to achieve effective results with respect to execution time and average influence spread. The Community-Based Greedy Approach over Selected Communities (CBGA-IM) has met the desired objectives: it reached the most promising part of the network and identified high-quality active nodes. Computation time was acceptable for large data networks. It is still possible to create more efficient and scalable algorithms that improve on this one.
References
1. Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social network. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 137–146 (2003)
2. Leskovec, J., Krause, A., Guestrin, C., Faloutsos, C., Van Briesen, J., Glance, N.: Cost-effective outbreak detection in networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 420–429 (2007)
3. Chen, W., Wang, Y., Yang, S.: Efficient influence maximization in social networks. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 199–208 (2009)
4. Wang, Y., Cong, G., Song, G., Xie, K.: Community-based greedy algorithm for mining top-k influential nodes in mobile social networks. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1039–1048 (2010)
5. Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814–818 (2005)
6. Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. National Acad. Sci. 99(12), 7821–7826 (2002)
7. Gregory, S.: A fast algorithm to find overlapping communities in networks. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 408–423. Springer (2008)
8. Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E.: Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 10, P10008 (2008)
9. Shang, J., Zhou, S., Li, X., Liu, L., Hongchun, W.: CoFIM: a community-based framework for influence maximization on large-scale networks. Knowl. Based Syst. 117, 88–100 (2017)
10. Huang, H., Shen, H., Meng, Z.: Community based influence maximization in attributed networks. Appl. Intell. 50(2), 354–364 (2020)
11. Chen, X., Deng, L., Zhao, Y., Zhou, X., Zheng, K.: Community-based influence maximization in location-based social network. World Wide Web 24(6), 1903–1928 (2021)
12. Roy, M., Pan, I.: Overlapping community detection using clique proximity and modularity maximization. In: 2018 Fourth International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 226–230. IEEE (2018)
13. Rácz, G., Pusztai, Z., Kósa, B., Kiss, A.: An improved community-based Greedy algorithm for solving the influence maximization problem in social networks. Ann. Math. Inf. 44, 141–150 (2015)
Author Index
A Abhijit Bora, 75 Abhishek Basu, 173 Abtsega Tesfaye Chufare, 35 Alexander, Arkhipov, 49 Amit Konar, 99 Amit Shiuly, 207 Anup Kumar Roy, 21 Anurag Nayak, 207 Aritra Acharyya, 161, 249 Arnab Gain, 11 Averkin, Alexey, 1
B Bangermayang, 139 Bijon Guha, 61
D Debajyoty Banik, 35 Devika Rani Roy, 85
G Gavel D. Kharmalki, 75 Gideon D. Kharsynteng, 75 Gypsy Nandi, 75, 139
I Indrajit Kar, 61 Indrajit Pan, 113, 127, 271
J Jayanta Datta, 113
K Kosterev, Vladimir, 1
M Mahendra Kumar Gourisaria, 35 Maxim, Bobyr, 49 Mithun Roy, 127, 271 Monojit Mitra, 261 Mousumi Laha, 99 Mrittika Chakraborty, 187
N Narisha Skhemlon, 75 Natalia, Milostnaya, 49 Nirmalendu Bikas Sinha, 261
P Phidawanhun Pyngrope, 139 Pradei Sangkhro, 139 Prasenjit Dey, 11, 161, 197, 249 Pratima Sarkar, 149
R Rajeev Kamal, 173 Rajiv Ranjan Singh, 173 Rohan Karmakar, 249
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 S. Bhattacharyya et al. (eds.), Recent Trends in Intelligence Enabled Research, Advances in Intelligent Systems and Computing 1446, https://doi.org/10.1007/978-981-99-1472-2
S Saikat Basu, 21 Salah Berra, 261 Sandeep Gurung, 149 Sitesh Kumar Sinha, 85 Somen Mondal, 221, 235 Sourabrata Mukherjee, 197 Sourav Chakraborty, 261 Sourav De, 149 Subhadeep Dhang, 161, 249 Subhamita Mukherjee, 113, 127, 271 Sudip Kumar Adhikari, 11, 221, 235 Sudipta Mukhopadhyay, 61 Sukanya Bose, 161, 249 Sukanya Sardar, 187 Sukhendu S. Mondal, 221, 235 Suman Koner, 11, 207 Sunny Saini, 21 Swarnava Ghosh, 161, 249
T Tunnisha Dasgupta, 207
U Ujjwal Maulik, 187
V Veenadhari, S., 85 Vulligadla Amaresh, 173
Y Yarushev, Sergey, 1