Lecture Notes on Data Engineering and Communications Technologies 111
Mukesh Saraswat · Harish Sharma · K. Balachandran · Joong Hoon Kim · Jagdish Chand Bansal Editors
Congress on Intelligent Systems Proceedings of CIS 2021, Volume 2
Lecture Notes on Data Engineering and Communications Technologies Volume 111
Series Editor Fatos Xhafa, Technical University of Catalonia, Barcelona, Spain
The aim of the book series is to present cutting edge engineering approaches to data technologies and communications. It will publish latest advances on the engineering task of building and deploying distributed, scalable and reliable data infrastructures and communication systems. The series will have a prominent applied focus on data technologies and communications with aim to promote the bridging from fundamental research on data science and networking to data engineering and communications that lead to industry products, business knowledge and standardisation. Indexed by SCOPUS, INSPEC, EI Compendex. All books published in the series are submitted for consideration in Web of Science.
More information about this series at https://link.springer.com/bookseries/15362
Editors Mukesh Saraswat Department of Computer Science & Engineering and Information Technology Jaypee Institute of Information Technology Noida, India K. Balachandran Department of Computer Science and Engineering CHRIST (Deemed to be University) Bengaluru, India
Harish Sharma Department of Computer Science and Engineering Rajasthan Technical University Kota, India Joong Hoon Kim Korea University Seoul, Korea (Republic of)
Jagdish Chand Bansal South Asian University New Delhi, India
ISSN 2367-4512 ISSN 2367-4520 (electronic) Lecture Notes on Data Engineering and Communications Technologies ISBN 978-981-16-9112-6 ISBN 978-981-16-9113-3 (eBook) https://doi.org/10.1007/978-981-16-9113-3 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
These proceedings contain the papers presented at the 2nd Congress on Intelligent Systems (CIS 2021), organized by CHRIST (Deemed to be University), Bangalore, and the Soft Computing Research Society during September 4–5, 2021. The Congress on Intelligent Systems (CIS 2021) invited ideas, developments, applications, experiences, and evaluations in intelligent systems from academicians, research scholars, and scientists. The conference deliberations included topics specified within its scope. The conference offered a platform for bringing forward extensive research and literature across the arena of intelligent systems and provided an overview of upcoming technologies. CIS 2021 gave leading experts a platform to share their perceptions, provide supervision, and address participants' questions and concerns. CIS 2021 received 370 research submissions from 35 different countries, viz. Algeria, Bangladesh, Burkina Faso, China, Egypt, Ethiopia, Finland, India, Indonesia, Iran, Iraq, Kenya, Korea (the Democratic People's Republic of), Madagascar, Malaysia, Mauritius, Mexico, Morocco, Nigeria, Peru, Romania, Russia, Serbia, Slovakia, South Africa, Spain, Switzerland, Ukraine, United Arab Emirates, UK, USA, Uzbekistan, and Vietnam. The papers covered advanced areas in technology, artificial intelligence, machine learning, and data science. After a rigorous peer review with the help of program committee members and more than 100 external reviewers, 135 papers were approved. CIS 2021 is a flagship event of the Soft Computing Research Society, India. The conference was inaugurated by Fr. Dr. Abraham VM, Honorable Vice-Chancellor, CHRIST (Deemed to be University), Bangalore, and Chief Patron, CIS 2021. Other eminent dignitaries included Prof. Joong Hoon Kim, General Chair, CIS 2021; Fr. Joseph Varghese, Patron, CIS 2021; and Prof. K. Balachandran, General Chair, CIS 2021. The conference witnessed keynote addresses from eminent speakers, namely Prof. Xin-She Yang, Middlesex University, The Burroughs, Hendon, London, Prof. P. Nagabhushan, Indian Institute of Information Technology Allahabad, Prof. J. C. Bansal, South Asian University, New Delhi, India, Prof. Lipo Wang,
Nanyang Technological University, Singapore, and Prof. Nishchal K. Verma, Indian Institute of Technology Kanpur, India. Noida, India Kota, India Bengaluru, India Seoul, Korea (Republic of) New Delhi, India
Mukesh Saraswat Harish Sharma K. Balachandran Joong Hoon Kim Jagdish Chand Bansal
Contents
A Comprehensive Survey on Machine Reading Comprehension: Models, Benchmarked Datasets, Evaluation Metrics, and Trends (Nisha Varghese and M. Punithavalli) . . . 1
A Novel Feature Descriptor: Color Texture Description with Diagonal Local Binary Patterns Using New Distance Metric for Image Retrieval (Vijaylakshmi Sajwan and Rakesh Ranjan) . . . 17
OntoINT: A Framework for Ontology Integration Based on Entity Linking from Heterogeneous Knowledge Sources (N. Manoj, Gerard Deepak, and A. Santhanavijayan) . . . 27
KnowCommerce: A Semantic Web Compliant Knowledge-driven Paradigm for Product Recommendation in E-commerce (N. Krishnan and Gerard Deepak) . . . 37
Removal of Occlusion in Face Images Using PIX2PIX Technique for Face Recognition (Sincy John and Ajit Danti) . . . 47
Age, Gender and Emotion Estimation Using Deep Learning (Mala Saraswat, Praveen Gupta, Ravi Prakash Yadav, Rahul Yadav, and Sahil Sonkar) . . . 59
Qualitative Classification of Wheat Grains Using Supervised Learning (P. Sarveswara Rao, K. Lohith, K. Satwik, and N. Neelima) . . . 71
Pandemic Simulation and Contact Tracing: Identifying Super Spreaders (Aishwarya Sampath, Bhargavi Kumaran, Vidyacharan Prabhakaran, and Cinu C. Kiliroor) . . . 81
Assessment of Attribution in Cyber Deterrence: A Fuzzy Entropy Approach (T. N. Nisha and Prasenjit Sen) . . . 97
Predictive Maintenance of Bearing Machinery Using MATLAB (Karan Gulati, Shubham Tiwari, Keshav Basandrai, and Pooja Kamat) . . . 107
Fuzzy Keyword Search Over Encrypted Data in Cloud Computing: An Extensive Analysis (Manya Smriti, Sameeksha Daruka, Khyati Gupta, and S. Siva Rama Krishnan) . . . 137
Application of Data Mining and Temporal Data Mining Techniques: A Case Study of Medicine Classification (Shashi Bhushan) . . . 151
A Deep Learning Approach for Plagiarism Detection System Using BERT (Anjali Bohra and N. C. Barwar) . . . 163
A Multi-attribute Decision Approach in Triangular Fuzzy Environment Under TOPSIS Method for All-rounder Cricket Player Selection (H. D. Arora, Riju Chaudhary, and Anjali Naithani) . . . 175
Machine Learning Techniques on Disease Detection and Prediction Using the Hepatic and Lipid Profile Panel Data (Ifra Altaf, Muheet Ahmed Butt, and Majid Zaman) . . . 189
Performance Analysis of Machine Learning Algorithms for Website Anti-phishing (N. Mohan Krishna Varma, Y. C. A. Padmanabha Reddy, and C. Rajesh Kumar Reddy) . . . 205
Analytical Analysis of Two-Warehouse Inventory Model Using Particle Swarm Optimization (Sunil Kumar and Rajendra Prasad Mahapatra) . . . 215
Towards an Enhanced Framework to Facilitate Data Security in Cloud Computing (Sarah Mahlaule, John Andrew van der Poll, and Elisha Oketch Ochola) . . . 227
Kinematics and Control of a 3-DOF Industrial Manipulator Robot (Claudia Reyes Rivas, María Brox Jiménez, Andrés Gersnoviez Milla, Héctor René Vega Carrillo, Víctor Martín Hernández Dávila, Francisco Eneldo López Monteagudo, and Manuel Agustín Ortiz López) . . . 243
Enhanced Energy Efficiency in Wireless Sensor Networks (Neetu Mehta and Arvind Kumar) . . . 255
Social Structure to Artificial Implementation: Honeybees (Amit Singh) . . . 271
Lifetime Aware Secure Data Aggregation through Integrated Incentive-based Mechanism in IoT-based WSN Environment (S. Nandini and M. Kempanna) . . . 287
Multi-temporal Analysis of LST-NDBI Relationship with Respect to Land Use-Land Cover Change for Jaipur City, India (Arpana Chaudhary, Chetna Soni, Uma Sharma, Nisheeth Joshi, and Chilka Sharma) . . . 299
Analysis and Performance of JADE on Interoperability Issues Between Two Platform Languages (Jaspreet Chawla and Anil Kr. Ahlawat) . . . 315
A TAM-based Study on the ICT Usage by the Academicians in Higher Educational Institutions of Delhi NCR (Palak Gupta and Shilpi Yadav) . . . 329
An Investigation on Impact of Gender in Image-Based Kinship Verification (Vijay Prakash Sharma and Sunil Kumar) . . . 355
Classification of COVID-19 Chest CT Images Using Optimized Deep Convolutional Generative Adversarial Network and Deep CNN (K. Thangavel and K. Sasirekha) . . . 363
Intelligent Fractional Control System of a Gas-Diesel Engine (Alexandr Avsievich, Vladimir Avsievich, and Anton Ivaschenko) . . . 379
Analysis on Advanced Encryption Standard with Different Image Steganography Algorithms: An Experimental Study (Alicia Biju, Lavina Kunder, and J. Angel Arul Jothi) . . . 391
Diabetes Prediction Using Logistic Regression and K-Nearest Neighbor (Ami Oza and Anuja Bokhare) . . . 407
Linear Regression for Car Sales Prediction in Indian Automobile Industry (Rohan Kulkarni and Anuja Bokhare) . . . 419
Agent-Driven Traffic Light Sequencing System Using Deep Q-Learning (Palwai Thirumal Reddy and R. Shanmughasundaram) . . . 431
System Partitioning with Virtualization for Federated and Distributed Machine Learning on Critical IoT Edge Systems (Vysakh P. Pillai and Rajesh Kannan Megalingam) . . . 443
A Review on Preprocessing Techniques for Noise Reduction in PET-CT Images for Lung Cancer (Kaushik Pratim Das and J. Chandra) . . . 455
Optimal DG Planning and Operation for Enhancing Cost Effectiveness of Reactive Power Purchase (Nirmala John, Varaprasad Janamala, and Joseph Rodrigues) . . . 477
Prediction of User's Behavior on the Social Media Using XGBRegressor (Saba Tahseen and Ajit Danti) . . . 491
Model Order Reduction of Continuous Time Multi-input Multi-output System Using Sine Cosine Algorithm (Aditya Prasad Padhy, Varsha Singh, and Vinay Pratap Singh) . . . 503
Smart E-waste Management in China: A Review (Yafeng Han, Tetiana Shevchenko, Dongxu Qu, and Guohou Li) . . . 515
Artificial Intelligence Framework for Content-Based Image Retrieval: Performance Analysis (Padmashree Desai and Jagadeesh Pujari) . . . 535
Priority-Based Replication Management for Hadoop Distributed File System (Dilip Rajput, Ankur Goyal, and Abhishek Tripathi) . . . 549
Comparing the Pathfinding Algorithms A*, Dijkstra's, Bellman-Ford, Floyd-Warshall, and Best First Search for the Paparazzi Problem (Robert Johner, Antonino Lanaia, Rolf Dornberger, and Thomas Hanne) . . . 561
Optimizing an Inventory Routing Problem using a Modified Tabu Search (Marc Fink, Lawrence Morillo, Thomas Hanne, and Rolf Dornberger) . . . 577
Classification of Histopathological Breast Cancer Images using Pretrained Models and Transfer Learning (Mirya Robin, Aswathy Ravikumar, and Jisha John) . . . 587
Handwritten Digit Recognition Using Very Deep Convolutional Neural Network (M. Dhilsath Fathima, R. Hariharan, and M. Seeni Syed Raviyathu Ammal) . . . 599
The Necessity to Adopt Big Data Technologies for Efficient Performance Evaluation in the Modern Era (Sangeeta Gupta and Rupesh Mishra) . . . 613
Forecasting Stock Market Indexes Through Machine Learning Using Technical Analysis Indicators and DWT (Siddharth Patel, B. D. Vijai Surya, Chinthakunta Manjunath, Balamurugan Marimuthu, and Bikramaditya Ghosh) . . . 625
Slotted Coplanar Waveguide-Fed Monopole Antenna for Biomedical Imaging Applications (Regidi Suneetha and P. V. Sridevi) . . . 639
A Framework for Enhancing Classification in Brain–Computer Interface (Sanoj Chakkithara Subramanian and D. Daniel) . . . 651
Measuring the Accuracy of Machine Learning Algorithms When Implemented on Astronomical Data (Shruthi Srinivasaprasad) . . . 667
Artificial Intelligence in E-commerce: A Literature Review (Richard Fedorko, Štefan Kráľ, and Radovan Bačík) . . . 677
Modified Non-local Means Model for Speckle Noise Reduction in Ultrasound Images (V. B. Shereena and G. Raju) . . . 691
Improved Color Normalization Method for Histopathological Images (Surbhi Vijh, Mukesh Saraswat, and Sumit Kumar) . . . 709
Parametric Analysis on Disease Risk Prediction System Using Ensemble Classifier (Huma Parveen, Syed Wajahat Abbas Rizvi, and Praveen Shukla) . . . 719
Sentimental Analysis of Code-Mixed Hindi Language (Ratnavel Rajalakshmi, Preethi Reddy, Shreya Khare, and Vaishali Ganganwar) . . . 739
Enhanced Security Layer for Hardening Image Steganography (Pregada Akshita and P. P. Amritha) . . . 753
Matrix Games with Linguistic Distribution Assessment Payoffs (Parul Chauhan and Anjana Gupta) . . . 767
Interval-valued Fermatean Fuzzy TOPSIS Method and Its Application to Sustainable Development Program (Utpal Mandal and Mijanur Rahaman Seikh) . . . 783
An Empirical Study of Signal Transformation Techniques on Epileptic Seizures Using EEG Data (M. Umme Salma and Najmusseher) . . . 797
Rainfall Estimation and Prediction Using Artificial Intelligence: a Survey (Vikas Bajpai, Anukriti Bansal, Ramit Agarwal, Shashwat Kumar, Namit Bhandari, and Shivam Kejriwal) . . . 807
Image Classification Using CNN to Diagnose Diabetic Retinopathy (S. Arul Jothi, T. Mani Sankar, Rohith Chandra Kandambeth, Tadepalli Siva Koushik, P. Arun Prasad, and P. Arunmozhi) . . . 821
Real-Time Segregation of Encrypted Data Using Entropy (P. Gowtham Akshaya Kumaran and P. P. Amritha) . . . 835
Performance Analysis of Different Deep Neural Architectures for Automated Metastases Detection of Lymph Node Sections in Hematoxylin and Eosin-Stained Whole-Slide Images (Naman Dangi and Khushali Deulkar) . . . 845
Linguistic Classification Using Instance-Based Learning (Rhythm Girdhar, Priya S. Nayak, and Shreekanth M. Prabhu) . . . 863
CoFFiTT-COVID-19 Fake News Detection Using Fine-Tuned Transfer Learning Approaches (B. Fazlourrahman, B. K. Aparna, and H. L. Shashirekha) . . . 879
Improved Telugu Scene Text Recognition with Thin Plate Spline Transform (Nandam Srinivasa Rao and Atul Negi) . . . 891
Analysing Voice Patterns to Determine Emotion (Amit Kumar Bairwa, Vijander Singh, Sandeep Joshi, and Vineeta Soni) . . . 901
Limaçon Inspired Particle Swarm Optimization for Large-Scale Optimization Problem (Shruti Gupta, Rajani Kumari, and Sandeep Kumar) . . . 917
Author Index . . . 931
About the Editors
Dr. Mukesh Saraswat is an Associate Professor at Jaypee Institute of Information Technology, Noida, India. Dr. Saraswat obtained his Ph.D. in Computer Science and Engineering from ABV-IIITM Gwalior, India. He has more than 18 years of teaching and research experience. He has guided two Ph.D. students and more than 50 M.Tech. and B.Tech. dissertations, and is presently guiding four Ph.D. students. He has published more than 50 journal and conference papers in the areas of image processing, pattern recognition, data mining, and soft computing. He was part of a successfully completed DRDE-funded project on image analysis and a SERB-DST (New Delhi) funded project on histopathological image analysis. He is currently running a project funded under the Collaborative Research Scheme (CRS), TEQIP III (RTU-ATU), on Smile. He has been an active member of many organizing committees of various conferences and workshops. He is also a guest editor of Array, the Journal of Swarm Intelligence, and the Journal of Intelligent Engineering Informatics. He is an active member of the IEEE, ACM, and CSI professional bodies. His research areas include image processing, pattern recognition, data mining, and soft computing.
Dr. Harish Sharma is an Associate Professor in the Department of Computer Science and Engineering at Rajasthan Technical University, Kota. He has worked at Vardhaman Mahaveer Open University, Kota, and Government Engineering College, Jhalawar. He received his B.Tech. and M.Tech. degrees in Computer Engineering from Government Engineering College, Kota, and Rajasthan Technical University, Kota, in 2003 and 2009, respectively. He obtained his Ph.D. from ABV-Indian Institute of Information Technology and Management, Gwalior, India. He is the secretary and one of the founder members of the Soft Computing Research Society of India. He is a lifetime member of the Cryptology Research Society of India, ISI, Kolkata. He is an Associate Editor of the International Journal of Swarm Intelligence (IJSI) published by Inderscience. He has also edited special issues of many reputed journals, such as Memetic Computing, Journal of Experimental and Theoretical Artificial Intelligence, and Evolutionary Intelligence. His primary area of interest is nature-inspired optimization techniques. He has contributed more than 105 papers published in various international journals and conferences.
Dr. K. Balachandran is currently a Professor and Head of CSE at CHRIST (Deemed to be University), Bengaluru, India. He has a total of 38 years' experience in research, academia, and industry. He served as a Senior Scientific Officer in the Research and Development unit of the Department of Atomic Energy for 20 years. His research interests include data mining, artificial neural networks, soft computing, and artificial intelligence. He has published more than fifty articles in well-known SCI/SCOPUS-indexed international journals and conferences and has attended several national and international conferences and workshops. He has authored/edited four books in the area of computer science.
Prof. Joong Hoon Kim, a faculty member of the School of Civil, Environmental and Architectural Engineering at Korea University, obtained his Ph.D. degree from the University of Texas at Austin in 1992 with the thesis titled "Optimal replacement/rehabilitation model for water distribution systems". Professor Kim's major areas of interest include optimal design and management of water distribution systems, application of optimization techniques to various engineering problems, and development and application of evolutionary algorithms. His publications include "A New Heuristic Optimization Algorithm: Harmony Search", Simulation, February 2001, Vol. 76, pp. 60–68, which has been cited over 2300 times by journals of diverse research areas. His keynote speeches include "Optimization Algorithms as Tools for Hydrological Science" at the Annual Meeting of the Asia Oceania Geosciences Society held in Brisbane, Australia, in June 2013, "Recent Advances in Harmony Search Algorithm" at the 4th Global Congress on Intelligent Systems (GCIS 2013) held in Hong Kong, China, in December 2013, and "Improving the convergence of Harmony Search Algorithm and its variants" at the 4th International Conference on Soft Computing for Problem Solving (SOCPROS 2014) held in Silchar, India, in December 2014. He hosted the 6th Conference of The Asia Pacific Association of Hydrology and Water Resources (APHW 2013) and the 2nd International Conference on Harmony Search Algorithm (ICHSA 2015), and he is hosting another, the 12th International Conference on Hydroinformatics (HIC 2016), in August 2016.
Dr. Jagdish Chand Bansal is an Associate Professor at South Asian University, New Delhi, and Visiting Faculty in Maths and Computer Science at Liverpool Hope University, UK. Dr. Bansal obtained his Ph.D. in Mathematics from IIT Roorkee. Before joining SAU New Delhi, he worked as an Assistant Professor at ABV-Indian Institute of Information Technology and Management Gwalior and at BITS Pilani. His primary area of interest is swarm intelligence and nature-inspired optimization techniques. Recently, he proposed a fission-fusion social structure-based optimization algorithm, Spider Monkey Optimization (SMO), which is being applied to various problems from the engineering domain. He has published more than 70 research papers in various international journals/conferences. He has worked with Liverpool Hope University UK on various research projects. He is the editor in chief of the journal MethodsX published by Elsevier. He is the series editor of the book series Algorithms for Intelligent Systems (AIS) and Studies in Autonomic,
Data-driven and Industrial Computing (SADIC) published by Springer. He is the editor in chief of International Journal of Swarm Intelligence (IJSI) published by Inderscience. He is also the Associate Editor of IEEE Access published by IEEE and Array published by Elsevier. He is the general secretary of the Soft Computing Research Society (SCRS). He has also received Gold Medal at UG and PG level.
A Comprehensive Survey on Machine Reading Comprehension: Models, Benchmarked Datasets, Evaluation Metrics, and Trends Nisha Varghese and M. Punithavalli
Abstract Machine reading comprehension (MRC) is a core process in question answering systems. Question answering systems are capable to extract the answer from relevant resources automatically for the questions posed by humans, and machine reading comprehension brings attention to a textual understanding with answering questions. Recent advancements in deep learning and natural language processing pave the way to improve the accuracy in question answering systems with the major developments in neural machine reading comprehension, transfer learning, deep learning-based information retrieval, and knowledge-based information extraction. Herein, this research analysis included the comprehensive analysis of MRC tasks, benchmarked datasets, classic models, performance evaluation metrics, and modern trends and techniques on MRC. Keywords Baseline models · Benchmarked datasets · Evaluation metrics · MRC tasks · MRC types
1 Introduction Machine reading comprehension (MRC) is an indispensable task in the question answering system and the machines learn and answering from the text by using the series of modules in the MRC system. Read, learn, and understand the information and answer the most relevant answer for the question is one of the tedious tasks for machines. Each module in MRC plays a vital role in the question answering system to deliver the most accurate answer. In the MRC system, machines can understand the input documents and prepare the answers based on the given context, and it is also one of the promising tasks for direct information retrieval from natural language N. Varghese (B) · M. Punithavalli Department of Computer Applications, Bharathiar University, Coimbatore 641046, India e-mail: [email protected] M. Punithavalli e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_1
1
2
N. Varghese and M. Punithavalli
documents in a more efficient manner rather than a sequence of associated Web pages. The search engine retrieves an enormous amount of Web pages, here comes the role of MRC. The user can input the query and the context into the MRC system and the output extracted as the most relevant answer for the query. To recapitulate the MRC for a given passage P and question Q, MRC tasks can retrieve the accurate answer A to question Q by the function Ƒ, A = Ƒ(P, Q). Since 2015, neural machine reading comprehension is a topic that caught fire in researchers and NLP Community. Before that, MRC systems were presented with substandard performance due to the limited and insufficient datasets and manually designed syntactic-rule-based techniques and machine learning techniques. Typical mathematical and statistical techniques and methods like pattern matching, similarity checking between query and context with automated natural language techniques like stemming, lemmatization, named entity recognition are inadequate to extract the exact answer in question answering systems. Present MRC systems can gain success easily with the emerging trends in NLP and DNN techniques such as transfer learning, word vectorization methods, transformers, RNN, LSTM, GRU, attention mechanisms, and pointer networks in sequence to sequence modeling. To answer a question, the major challenges associated with question answering systems are seeking out and explore the relevant resources and then recognize and integrate the chunks of information from various repositories. MRC requires a comprehensive understanding of the given context to answer the question is one of the hurdles. This paper investigated through the most relevant and highly profiled papers from 2015 to 2020, and the purpose of this survey is to analyze and summarize the most promising models in MRC, various tasks or types of MRC systems and its representative datasets, modules and approaches, emerging trends on MRC and corresponding challenges.
2 Materials and Methods This section contains machine reading comprehension tasks with examples, MRC datasets examples, various types of MRC questions, a comprehensive description of the latest MRC trends, MRC architecture with the elaborated explanation of each module and the models, techniques, and methods using in corresponding modules.
2.1 Machine Reading Comprehension Tasks Cloze Tests. These tests are gap-filling tests or fill in the blanks like question answering; this type of question contains one or more blank space or gap in question. To answer the questions, MRC systems should be emphasized on the comprehensive understanding of context and the usage of vocabulary to fill in the blank with an accurate answer. Cloze test, for a given passage P from which a word or element
A (A ∈ P) is absent, the model fills the gap with the most appropriate answer A through the function Ƒ, A = Ƒ(P − {A}). Multiple-Choice Questions (MCQ). MCQs are factoid-like questions that come with a list of candidate answers containing one or more correct options. A proper understanding of the context is required to choose the right answer from the options. Formally, given a passage P, a question Q, and a candidate answer list A = {A1, A2, ..., An}, the MCQ task is to choose the correct result Ai from A (Ai ∈ A) by Ai = Ƒ(P, Q, A). Span Extraction. Span-extraction reading comprehension is more natural than the cloze-style and MCQ tasks: single words are often inadequate to answer a question, and in some cases there is no proper candidate answer at all. Formally, in the span extraction task [1], given a passage P with n tokens, P = {t1, t2, ..., tn}, and a question Q, the task extracts the continuous subsequence A = {ti, ti+1, ..., ti+k} (1 ≤ i ≤ i + k ≤ n) from P as the exact answer to Q through the function Ƒ, A = Ƒ(P, Q); the answer A must be a span inferred from passage P. Free Form Answering. Free-form answering, or open-domain question answering (ODQA), delivers the correct information as the answer from various large relevant data repositories, which produces more accurate and flexible answers and removes the limitations of the previous tasks caused by an inadequate passage or inconsistent choices. Free-form answering admits several possibilities: the exact answer is contained in the context; the answer entities come from multiple contexts; all answer entities are contained in the context and question; part of the answer entities are unexplored in the context or question; or none of the answer entities appear in the context or question. Formally, given a passage P and a question Q, the exact answer A may not be part of the input text, i.e., A may or may not be a subsequence of P; the free-form task needs to deliver the correct answer A by A = Ƒ(P, Q).
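To make the four task formats concrete, the sketch below represents them as simple data containers; the class and field names are illustrative only and are not taken from any specific MRC library or dataset schema.

```python
# Illustrative containers for the four MRC task formats; names are hypothetical.
from dataclasses import dataclass
from typing import List

@dataclass
class ClozeExample:
    passage: str        # passage P with a placeholder such as "___"
    answer: str         # word or entity A drawn from P that fills the gap

@dataclass
class MultipleChoiceExample:
    passage: str
    question: str
    options: List[str]  # candidate answers A = {A1, ..., An}
    correct_index: int  # index of the correct option Ai

@dataclass
class SpanExtractionExample:
    passage_tokens: List[str]  # P = {t1, ..., tn}
    question: str
    start: int                 # answer is the contiguous span tokens[start:end + 1]
    end: int

@dataclass
class FreeFormExample:
    passages: List[str]        # one or more supporting contexts
    question: str
    answer: str                # may not be a subsequence of any passage

ex = SpanExtractionExample(
    passage_tokens="The Eiffel Tower is located in Paris".split(),
    question="Where is the Eiffel Tower located?",
    start=6, end=6,
)
print(" ".join(ex.passage_tokens[ex.start:ex.end + 1]))  # -> "Paris"
```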
2.2 Types of Machine Reading Comprehension Questions Cloze Style. It is typically with a placeholder in a statement and maybe a declarative sentence, imperative sentence, or any other pattern. Answer to question the system needs to seek out an entity or phrase that is appropriate to fill in the blanks. Natural Form. It is a question statement and that completely follows the grammatical rules of the language. For example, “What is your name?” Synthetic Style. These types of questions are the arrangements of words and that need not conform to any grammar. The question in Fig. 1 is an example of synthetic style.
Fig. 1 Examples for synthetic style questions from QAngaroo dataset [2]
2.3 Latest Trends on Machine Reading Comprehension In recent times, various machine reading comprehension datasets have been generated and that leads to the exponential increase in the research of the reading comprehension tasks and new trends. The type, size, model, characteristics, and application or approach to each model are varying from dataset to dataset. The corpus type is mainly divided into multi-model and textual. Multi-modal machine reading comprehension is an emerging trend in the QA system, which needs a comprehensive understanding of the textual data and the visual data. The textual corpus contains only plain texts, whereas, in the multi-modal MRC dataset, emphasis more on the given context of the text, architectures, images, audio, diagrams, and sometimes videos are also included to answer the multi-modal questions. Figure 2 is the example for multi-model MRC (M3 RC), taken from TQA [3], and the image comprises the rich diagram parsing category. Answer is extracted based on the relevant information from the question and image with textual contents. The latest challenges and trends are including MRC with unanswerable questions, reading comprehensions such as multi-hop, multi-modal, paraphrased paragraph, commonsense or world knowledge required MRC, complex logical reasoning, largescale MRC dataset, the domain-specific MRC, open-domain (ODQA) dataset, and conversational MRC. MRC with Unanswerable Questions. The complexity of the questions increases with a lack of evidence to answer the question. Then the system will answer the questions without any relevant content. It raises the problem of incorrect answer prediction. Consequently, these types of questions are challenging and difficult to retrieve answers from the context. Multi-Hop Reading Comprehension (MHRC). The MHRC is effortful with the multi-passage and complex corpora. The model requires more emphasis on these kinds of documents with multi-hop searching and reasoning. The datasets for MHRC are SQuAD 2.0 [4], MS MARCO [5], natural questions [6] and NewsQA [7], and all these datasets are the example for MRC with unanswerable questions.
Fig. 2 Example for M3 RC from textbook question answering; the answer extraction requires the combined analysis of question and image with textual content [3]
Complex Reasoning MRC. Logic and complex reasoning are one of the most important challenges in the MRC. Facebook CBT bAbI [8], DROP [9], RACE [10], and CLOTH [11], LogiQA [12] are the datasets generated for logical reasoning. Multi-modal Reading Comprehension (MMRC). Some complex or clear-cut concepts, architectures, logic diagrams, circuit diagrams can represent efficiently using images than text or a combination of both. The MMRC task includes heterogeneous data, which is a combination of images and language, which increases reading comprehension as an arduous task that may contain architectures, circuit diagrams, graphs, and much more with various colors and patterns. The representative dataset for the M3 RC is TQA [3], RecipeQA [13], COMICS [14], and MovieQA [15]. MRC with Commonsense or World Knowledge. Textual data is highly ambiguous and problematic due to word sense ambiguities like polysemous words and paraphrase sentences, semantic similarity conflicts, named entity recognition, and aspect term identification, and human language is more complex than a textual document. So, machine reading comprehension requires commonsense or world knowledge to answering these kinds of questions. Representative datasets are CommonSenseQA [16], ReCoRD [17], and OpenBookQA [18]. Conversational Machine Reading Comprehension (CMRC). The conversation is the natural way of communication; it includes actions or gestures, and the anaphoric, cataphoric, and exophoric references in compound statements are the other complex things to detect in the conversational MRC. To alleviate the problem, in the process of MRC on conversation needs, a proper identification of the previous conversations
and historical data on the conversation. Some of CMRC datasets are CoQA [19], QuAC [20], DREAM [21], and ShARC [22]. Domain-Specific Datasets. The context of a domain-specific dataset are taken from a specific domain, such as academics, cooking, health, or clinical reports. CliCR [23] is a medical domain cloze style MRC dataset, SciQ [24] is an MCQ–MRC dataset questions in science exam generated by crowdsourcing, and COVID-QA consists of 2019, COVID-19 questions which have been annotated into a broad category of transmission, prevention, and more information concerning COVID-19. Some other representative datasets are ReviewQA [25], SciTail [26], and PaperQA [27]. MRC with Paraphrased Paragraph. It alludes to the rephrasing passages using synonyms and altered words without changing the meaning of the context, which means the source context and paraphrased context are semantically similar. To extract the right answer to these questions in the task of paraphrased MRC, the system requires a comprehensive understanding of the true meaning of different versions of context. Some of the representative datasets for a paraphrased paragraph are DuoRC [28] and Who-did-What [29]. Open-Domain Question Answering (ODQA). The ODQA is a kind of task of identifying answers to natural questions from a large corpus of documents. Initially, the open-domain QA system starts with information retrieval to select a subset of documents from the corpus and collections of unstructured documents. Some of the representative datasets for ODQA are MCTest [30], CuratedTREC [31], Quasar [32], and SearchQA [33].
3 MRC Architecture, Modules, and Baseline Models Neural machine reading comprehension (NMRC) is the core process in the question answering system. In NMRC process, context and question are the inputs and produce the answer as results. Figure 3 depicts the MRC system which includes four phases: word embedding, feature extraction, context–question interaction, and answer prediction.
Fig. 3 Machine reading comprehension architecture
3.1 Word Embedding
Word embedding represents text tokens by fixed-size vectors so that words with related meanings are placed close together in the vector space. The conversion of text tokens into vectors is known as vectorization. These word vectors capture information about the meaning and structure of the input context and the relationships between words in it, since the meaning of a word depends on the context of the words surrounding it.
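The following is a minimal sketch of this embedding lookup, assuming a toy vocabulary and randomly initialized vectors; in a real MRC system the matrix would come from trained models such as word2vec or GloVe, or from a contextual encoder.

```python
# Toy embedding lookup: each token id indexes a row of a fixed-size vector table.
import numpy as np

embedding_dim = 4
vocab = {"<unk>": 0, "who": 1, "wrote": 2, "authored": 3, "hamlet": 4}
# Placeholder random vectors; trained embeddings would place "wrote"/"authored" close together.
embeddings = np.random.default_rng(0).normal(size=(len(vocab), embedding_dim))

def embed(tokens):
    ids = [vocab.get(t.lower(), vocab["<unk>"]) for t in tokens]
    return embeddings[ids]            # shape: (len(tokens), embedding_dim)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

question_vectors = embed("Who wrote Hamlet ?".split())
print(question_vectors.shape)         # (4, 4)
print(cosine(embeddings[vocab["wrote"]], embeddings[vocab["authored"]]))
```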
3.2 Feature Extraction This phase takes the question and context embedding as input and extracts features individually. It also checks out the semantic and structural information of the sentences from the output of the word embedding. Some typical deep neural networks applied to mine the contextual features are recurrent neural networks (RNN), convolutional neural networks (CNN), and transformers.
3.3 Context–Question Interaction Context–question interaction module extracts the semantic association between the context and the question; herein the MRC provides more focus to the context and question to transfer the most accurate information to the answer prediction module. Attention mechanisms and one-hop or multi-hop interaction are used to capture the relevant content from the input. The attention mechanism has concentrated more in an input sequence and decides the position of the context in sequence with higher precedence. One-hop interaction has calculated the context–query interaction only once, but in multi-hop interaction, the context–query interaction calculation may be more than once. Multi-hop interaction includes some methods to implement, first computes the semantics between the context and query, then introduces memory spaces to carry the prior memories process of the interaction, and finally, the hidden states of RNN store the previous information of context–question interaction.
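As a rough illustration of one-hop interaction, the sketch below computes scaled dot-product attention weights of context tokens with respect to a question vector; the shapes and values are placeholders and the function names are not from any particular library.

```python
# One-hop context-question interaction via scaled dot-product attention.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(question_vec, context_mat):
    """question_vec: (d,), context_mat: (n_tokens, d) -> (weights, question-aware summary)."""
    d = question_vec.shape[0]
    scores = context_mat @ question_vec / np.sqrt(d)   # similarity of each context token to the question
    weights = softmax(scores)                          # higher weight = more relevant position
    summary = weights @ context_mat                    # weighted combination of context vectors
    return weights, summary

rng = np.random.default_rng(1)
weights, summary = attend(rng.normal(size=8), rng.normal(size=(5, 8)))
print(weights.round(3), summary.shape)
```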
3.4 Answer Prediction Answer prediction is the final module, which delivers the answers to questions based on the actual context and also according to the specific tasks. To recapitulate, there are four types of answer prediction: word predictor, option selector, span extractor, and answer generator. Word predictor predicts the answer for close style tests and fills
Fig. 4 Machine reading comprehension modules and methods of machine reading comprehension system
the corresponding words or entities to the place holder. Option selector selects the correct answer from candidate options for the multiple-choice tasks. Span extractor retrieves the answer for a question from a specific input context. In answer generator, the answers do not need to be exactly the subsequence of the input context, and answers may be generated from multiple pieces of evidence in other contexts. Figure 4 illustrates all modules and associated methods of MRC.
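A span extractor can be sketched as follows, assuming the reader model has already produced per-token start and end scores; only the search for the best valid span is shown, and the length limit is an illustrative assumption.

```python
# Pick the highest-scoring valid (start, end) span from per-token scores.
import numpy as np

def best_span(start_scores, end_scores, max_answer_len=15):
    best, best_score = (0, 0), -np.inf
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_answer_len, len(end_scores))):
            if s + end_scores[j] > best_score:
                best_score, best = s + end_scores[j], (i, j)
    return best

tokens = "the mrc system was proposed in 2015 by researchers".split()
rng = np.random.default_rng(2)
start, end = best_span(rng.normal(size=len(tokens)), rng.normal(size=len(tokens)))
print(" ".join(tokens[start:end + 1]))
```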
4 Performance Evaluation Metrics Performance evaluation metrics measure the correctness of the MRC tasks using metrics. Evaluation metrics have been chosen for a specific dataset based on its characteristics. Some datasets may require more than one metric. Accuracy is the most common performance evaluation metrics for MRC models, and exact match, precision, recall, F1, ROUGE, BLEU, HEQ, and METEOR are the other metrics
Fig. 5 Statistics of performance evaluation metrics
followed by accuracy. Multiple-choice and cloze-style tasks are usually evaluated using accuracy, while exact match, precision, recall, and F1 measure are usually used for span extraction and prediction tasks. ROUGE and BLEU are popular measures for machine translation that have recently been adopted for MRC tasks. HEQ is the latest evaluation metric in MRC and is most suitable for conversational MRC tasks. METEOR adds features such as synonym matching, and its later versions show higher accuracy in paraphrase matching, text normalization, and the proper treatment of content words versus function words. The statistical analysis of the performance evaluation metrics is included in Fig. 5; the pie chart shows that accuracy is the most common evaluation metric at 42.5%.
4.1 Accuracy
Accuracy is the percentage of questions that an MRC system answers correctly, i.e., the ratio between the number of correctly answered questions (M) and the total number of questions (N). It can be calculated as in Eq. (1).

Accuracy = M / N    (1)
4.2 Exact Match
Exact match (EM) is the percentage of predictions that match the gold answer exactly; the answer can be an entity, a phrase, or a sentence, but the predicted and gold answers must be identical. For span extraction, accuracy and EM coincide. For factoid questions, however, EM can be too strict, because a prediction may contain only part of (or more than) the actual answer. EM is calculated using Eq. (2).

Exact Match = M / N    (2)
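A minimal implementation of Eqs. (1) and (2) could look like the following; the normalization step is a simplified stand-in for the lowercasing, whitespace, and punctuation handling used by benchmark evaluation scripts.

```python
# Fraction of questions whose normalized prediction equals the normalized gold answer.
def normalize(text):
    return " ".join(text.lower().strip().split())

def exact_match(predictions, gold_answers):
    assert len(predictions) == len(gold_answers)
    matches = sum(normalize(p) == normalize(g) for p, g in zip(predictions, gold_answers))
    return matches / len(gold_answers)          # M / N

print(exact_match(["Paris", "1912", "blue whale"], ["Paris", "1912", "the blue whale"]))
# -> 0.666..., since the third prediction misses the article "the"
```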
4.3 Precision and Recall
To define precision and recall, it is necessary to understand true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). False positives and false negatives occur when the predicted value disagrees with the actual value, and they reduce accuracy. Precision and recall come in two forms, token-level and question-level. Precision, recall, and accuracy can be calculated using Eqs. (3), (4), and (5).

Precision = TP / (TP + FP)    (3)
Recall = TP / (TP + FN)    (4)
Accuracy = (TP + TN) / (TP + FP + TN + FN)    (5)
4.4 F1 Score (F Score or F Measure)
The F1 score balances precision and recall and can be calculated using Eq. (6). As with precision and recall, the F1 score comes in token-level and question-level forms. The token-level F1 for a single question is calculated as in Eq. (7), and the question-level F1 as in Eq. (8).

F1 = 2 × Precision × Recall / (Precision + Recall)    (6)
F1_TS = 2 × Precision_TS × Recall_TS / (Precision_TS + Recall_TS)    (7)
F1_Q = 2 × Precision_Q × Recall_Q / (Precision_Q + Recall_Q)    (8)
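The token-level computation of Eqs. (3), (4), and (7) for a single question can be sketched as below, counting tokens shared between the prediction and the gold answer as true positives.

```python
# Token-level precision, recall, and F1 for one question.
from collections import Counter

def token_f1(prediction, gold):
    pred_tokens, gold_tokens = prediction.lower().split(), gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    tp = sum(common.values())
    if tp == 0:
        return 0.0
    precision = tp / len(pred_tokens)   # TP / (TP + FP)
    recall = tp / len(gold_tokens)      # TP / (TP + FN)
    return 2 * precision * recall / (precision + recall)

print(round(token_f1("the blue whale", "blue whale"), 3))   # 0.8
```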
4.5 Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
ROUGE [34] is a performance evaluation metric for text summarization systems, machine translation, MRC systems, and other NLP tasks. It includes five variants: ROUGE-N (ROUGE-1 for unigrams, ROUGE-2 for bigrams, ROUGE-3 for trigrams, and so on), ROUGE-L (longest common subsequence (LCS)-based statistics), ROUGE-W (weighted LCS-based statistics), ROUGE-S (skip-bigram-based co-occurrence statistics), and ROUGE-SU (skip-bigram plus unigram-based co-occurrence statistics). ROUGE-N is an n-gram recall between a candidate summary and a set of reference summaries RS and can be calculated as in Eq. (9).

ROUGE-N = [Σ_{S ∈ RS} Σ_{gram_n ∈ S} Count_match(gram_n)] / [Σ_{S ∈ RS} Σ_{gram_n ∈ S} Count(gram_n)]    (9)
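A compact sketch of ROUGE-N in the spirit of Eq. (9) is shown below; it reports the fraction of reference n-grams that also occur in the candidate.

```python
# ROUGE-N: n-gram recall of a candidate against one or more references.
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, references, n=2):
    cand = ngrams(candidate.lower().split(), n)
    overlap = total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        overlap += sum((cand & ref_counts).values())   # Count_match(gram_n)
        total += sum(ref_counts.values())              # Count(gram_n)
    return overlap / total if total else 0.0

print(rouge_n("the cat sat on the mat", ["the cat was on the mat"], n=2))  # 0.6
```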
4.6 Bilingual Evaluation Understudy (BLEU)
Automatic machine translation quality is commonly reported with the BLEU [35] score, which measures the similarity between candidate texts and reference texts; values closer to one indicate higher similarity. BLEU can be calculated using Eq. (10). iBLEU is an interactive version of BLEU, which compares two different systems visually and interactively and also helps to visually estimate the BLEU scores obtained by the candidate translations. The interpretations of the BLEU score are given in Table 1.

BLEU = min(1, exp(1 − reference-length / output-length)) × (Π_{i=1}^{4} Precision_i)^{1/4}    (10)
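The simplified sentence-level BLEU of Eq. (10) can be sketched as follows; production toolkits add smoothing and corpus-level aggregation, which are omitted in this illustration.

```python
# Geometric mean of 1- to 4-gram precisions times a brevity penalty.
import math
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        c, r = ngram_counts(cand, n), ngram_counts(ref, n)
        precisions.append(sum((c & r).values()) / max(1, sum(c.values())))
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    brevity = min(1.0, math.exp(1 - len(ref) / len(cand)))
    return brevity * geo_mean

print(round(bleu("the quick brown fox jumps over the lazy dog",
                 "the fast brown fox jumps over the lazy dog"), 3))   # about 0.75
```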
Table 1 BLEU score and interpretation
BLEU score | Interpretation
< 10 | Almost useless
10–19 | Hard to get the gist
20–29 | The gist is clear but has significant grammatical errors
30–40 | Understandable to good translations
40–50 | High-quality translations
50–60 | Very high quality, adequate, and fluent translations
60–69 | Quality often better than human
4.7 Metric for Evaluation of Translation with Explicit Ordering (METEOR)
The METEOR [36] indicator uses recall and precision together to evaluate a model and is based on the harmonic mean of unigram precision and recall, with recall weighted higher than precision. Unlike BLEU, which correlates at the corpus level, METEOR seeks better correlation with human judgment at the sentence or segment level. METEOR can be calculated using Eq. (11), where m is the total number of matched words, ch is the number of chunks, and α, β, γ are parameters tuned to maximize correlation with human judgments. Fmean and the penalty are estimated through Eqs. (12) and (13), respectively.

Meteor = Fmean × (1 − Penalty)    (11)
Fmean = Precision × Recall / (α × Precision + (1 − α) × Recall)    (12)
Penalty = γ × (ch / m)^β    (13)
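A rough single-sentence version of Eqs. (11) to (13) is sketched below with commonly cited default parameters; the full METEOR metric additionally performs stemming, synonym matching, and a proper chunk alignment, which are only approximated here.

```python
# Approximate METEOR-style score: recall-weighted harmonic mean with a fragmentation penalty.
def meteor_sketch(candidate, reference, alpha=0.9, beta=3.0, gamma=0.5):
    cand, ref = candidate.lower().split(), reference.lower().split()
    m = sum(1 for w in cand if w in ref)           # matched unigrams (crude matching)
    if m == 0:
        return 0.0
    precision, recall = m / len(cand), m / len(ref)
    f_mean = precision * recall / (alpha * precision + (1 - alpha) * recall)
    chunks, in_chunk = 0, False                    # maximal runs of matched candidate words
    for w in cand:
        if w in ref and not in_chunk:
            chunks, in_chunk = chunks + 1, True
        elif w not in ref:
            in_chunk = False
    penalty = gamma * (chunks / m) ** beta
    return f_mean * (1 - penalty)

print(round(meteor_sketch("the cat sat on the mat", "the cat was on the mat"), 3))
```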
4.8 Human Equivalence Score (HEQ)
HEQ [20] is a newer MRC evaluation metric which estimates whether the model performs at least as well as an ordinary person. It is used for conversational MRC datasets, where questions may have multiple valid answers and the plain F1 score can be misleading. The HEQ score is computed by Eq. (14), where M is the number of questions on which the system's output reaches or exceeds human performance and N is the total number of questions.

HEQ = M / N    (14)
13
5 Summary and Conclusion This research article has presented a comprehensive survey on neural machine reading comprehension. First, it explained the need for an MRC system. MRC system contains four modules, word embedding, feature extraction, question–context interaction, and answer prediction. First, the system needs input as question and context; after that, both question and context will convert as word embedding. Any of the vectorization methods can be chosen for the embedding process, the output of the embedding transfer to the feature extraction module. In the second step, the most essential features for the answer prediction will be captured. For this process, most of the recent models use the transformer-based methodology, which emphasized transfer learning and fine-tuning. Then the extracted features transfer to the context– question interaction module, and finally, the system will predict the answer for the question. Based on the trends and pattern, the selection of a model to extract the accurate answer is an arduous task. For the statistical analysis, 55 benchmarked datasets have been considered from the period of 2015–2020, and these datasets included textual and multi-model patterns. The tremendous increase in the large-scale datasets in MRC has contributed to the highly accurate answers in the question answering system, instead of producing the passages with chunks of spreading answers. Various types of MRC and the sources of generation of the dataset are also included in the statistical analysis. In addition to various performance evaluation metrics, the statistical analysis of metrics and the need and role of each metrics in different tasks are also incorporated. With the development of new methods and models, especially the evolution of the transformer model in 2017, it has shown exceptional and dramatically accelerated performance not only in MRC but also for all natural language processing and deep learning tasks. With the evolution of the transformer, Google developed a pertained model BERT; in addition to that, numerous amounts of BERT versions have been generated, which makes higher accuracy in the leader board of every benchmarked dataset.
References 1. Liu S, Zhang X, Zhang S, Wang H, Zhang W (2019) Neural machine reading comprehension: methods and trends. arXiv:1907.01118v5. [cs.CL], 19(5), 1–45. https://doi.org/10.3390/app 9183698 2. Welbl J, Stenetorp P, Riedel S (2018) Constructing datasets for multi-hop reading comprehension across documents. arXiv preprint, 18(6):287–302. https://arxiv.org/abs/1710.06481 3. Kembhavi A, Seo M, Schwenk D, Choi J, Farhadi A, Hajishirzi H (2017) Are you smarter than a sixth grader? Textbook question answering for multimodal machine comprehension. Proc IEEE Conf Comput Vis Pattern Recognit 17(3):4999–5007
14
N. Varghese and M. Punithavalli
4. Rajpurkar P, Jia R, Liang P (2018) Know what you don’t know: unanswerable questions for SQuAD. In: Proceedings of the 56th annual meeting of the association for computational linguistics, Association for Computational Linguistics, 8(1), pp 784–789. https://doi.org/10. 18653/v1/P18-2124 5. Nguyen T, Rosenberg M, Song X, Gao J, Tiwary S, Majumder R, Deng L (2016) MS MARCO: a human generated machine reading comprehension dataset. CoRR, [1611.09268], 18(3), 1–11. http://arxiv.org/abs/1611.09268 6. Kwiatkowski T, Palomaki J, Redfield O, Petrov S (2019) Natural questions: a benchmark for question answering research. Trans Assoc Computat Linguist 19(7):453–466. https://doi.org/ 10.1162/tacl_a_00276 7. Trischler A, Wang T, Yuan X, Harris J, Sordoni A, Bachman P, Suleman K (2017) NewsQA: a machine comprehension dataset. In: Proceedings of the 2nd workshop on representation learning for NLP; Association for computational linguistics, 17(1), 191–200. https://doi.org/ 10.18653/v1/W17-2623 8. Hill F, Bordes A, Chopra S, Weston J (2016) The Goldilocks principle: reading children’s books with explicit memory representations. In: 4th International conference on learning representations, ICLR, 4(2), 1–13. https://arxiv.org/abs/1511.02301 9. Dua D, Wang Y, Dasigi P, Stanovsky G, Singh S, Gardner M (2019) DROP: a reading comprehension benchmark requiring discrete reasoning over paragraphs. Proc NAACL 19(1):2368–2378. https://doi.org/10.18653/v1/N19-1246 10. Lai G, Xie Q, Liu H, Yang Y, Hovy E (2017) RACE: Large-scale reading comprehension dataset from examinations. In: Proceedings of the 2017 conference on empirical methods in natural language processing; association for computational linguistics, vol 17(1), pp 785–794: Copenhagen, Denmark. https://doi.org/10.18653/v1/D17-1082 11. Xie Q, Lai G, Dai Z, Hovy E (2018) Large-scale Cloze test dataset created by teachers. In: Proceedings of the 2018 conference on empirical methods in natural language processing; association for computational linguistics, vol 18(1), pp 2344–2356. https://doi.org/10.18653/ v1/D18-1257 12. Liu J, Cui L, Liu H, Huang D, Wang Y, Zhang Y (2020) LogiQA: a challenge dataset for machine reading comprehension with logical reasoning. [cs.CL], 7(1), 1–7. https://arxiv.org/ abs/2007.08124 13. Yagcioglu S, Erdem A, Erdem E, Ikizler-Cinbis N (2018) RecipeQA: a challenge dataset for multimodal comprehension of cooking recipes. In: Proceedings of the 2018 conference on empirical methods in natural language processing, vol 18(1), pp 1358–1368. https://doi.org/ 10.18653/v1/D18-1166 14. Iyyer M, Manjunatha V, Guha A, Vyas Y, Boyd-Graber J, Daume H, Davis LS (2017) The amazing mysteries of the gutter: drawing inferences between panels in comic book narratives. In: Proceedings of the IEEE conference on computer vision and pattern recognition, vol 18(2), pp 7186–7195. https://arxiv.org/abs/1611.05118 15. Tapaswi M, Zhu Y, Stiefelhagen R, Torralba A, Urtasun R, Fidler S (2016) Movieqa: understanding stories in movies through question-answering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, vol 6(2), pp 4631–4640. https://arxiv.org/abs/1512. 02902 16. Talmor A, Herzig J, Lourie N, Berant J (2019) CommonsenseQA: a question answering challenge targeting commonsense knowledge. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, Volume 1 (Long and Short Papers), vol 19(1), pp 4149–4158. 
https://doi.org/10.18653/ v1/N19-1421 17. Zhang S, Liu X, Liu J, Gao J, Duh K, Durme VB (2018) Record: bridging the gap between human and machine commonsense reading comprehension. arXiv preprint, 18(1):1–14. https:// arxiv.org/abs/1810.12885 18. Mihaylov T, Clark P, Khot T, Sabharwal A (2018) Can a suit of armor conduct electricity? A new dataset for open book question answering. EMNLP 18(9):1–14. https://www.aclweb.org/ anthology/D18-1260.pdf
A Comprehensive Survey on Machine Reading Comprehension …
15
19. Reddy S, Chen D, Manning CD (2019) Coqa: a conversational question answering challenge. Trans Assoc Comput Linguist 10(9):249–266. https://doi.org/10.1162/tacl_a_00266 20. Choi E, He H, Iyyer M, Yatskar M, Yih WT, Choi Y, Liang P, Zettlemoyer L (2018) QuAC: question answering in context. In: Proceedings of the 2018 conference on empirical methods in natural language processing; Association for computational linguistics, vol 18(1), pp 2174– 2184. Brussels, Belgium. https://doi.org/10.18653/v1/D18-1241 21. Sun K, Yu D, Chen J, Yu D, Choi Y, Cardie C (2019) Dream: a challenge data set and models for dialogue-based reading comprehension. Trans Assoc Comput Linguist 9(7):217–231. https:// doi.org/10.1162/tacl_a_00264 22. Saeidi M, Bartolo M, Lewis P, Singh S, Rocktäschel T, Sheldon M, Bouchard G, Riedel S (2018) Interpretation of natural language rules in conversational machine reading. In: Proceedings of the 2018 conference on empirical methods in natural language processing; Association for computational linguistics, vol 18(1), pp 2087–2097. https://doi.org/10.18653/v1/D18-1233 23. Suster S, Daelemans W (2018) CliCR: a dataset of clinical case reports for machine reading comprehension. arxiv preprint, 18(3):1–13. https://arxiv.org/abs/1803.09720 24. Welbl J, Liu NF, Gardner M (2017) Crowdsourcing multiple choice science questions. In: Proceedings of the 3rd workshop on noisy user-generated text; Association for computational linguistics, vol 17(1), pp 94–106. https://doi.org/10.18653/v1/W17-4413 25. Grail Q, Perez J (2018) ReviewQA: a relational aspect-based opinion reading dataset. arXiv preprint, 18(1) 1–10. https://arxiv.org/abs/1810.12196 26. Khot T, Sabharwal A, Clark P (2018) Scitail: a textual entailment dataset from science question answering. In: Thirty-Second AAAI conference on artificial intelligence, 32(3), pp 1–9. http:// ai2-website.s3.amazonaws.com/publications/scitail-aaai-2018_cameraready.pdf 27. Park D, Choi Y, Kim D, Yu M, Kim S, Kang J (2019) Can machines learn to comprehend scientific literature? IEEE Access 19(7):16246–16256. https://ieeexplore.ieee.org/stamp/stamp.jsp? arnumber=8606080 28. Saha A, Aralikatte R, Khapra MM, Sankaranarayanan K (2018) DuoRC: towards complex language understanding with paraphrased reading comprehension. In: Proceedings of the 56th annual meeting of the association for computational linguistics (Volume 1: Long Papers), Association for computational linguistics, vol 18(1), pp 1683–1693. https://doi.org/10.18653/ v1/P18-1156 29. Onishi T, Wang H, Bansal M, Gimpel K, McAllester D (2016) Who did what: a large-scale person-centered cloze dataset. In: Proceedings of the conference on empirical methods in natural language processing, vol 16(1). Association for Computational Linguistics, Austin, Texas, pp 2230–2235. https://doi.org/10.18653/v1/D16-1241 30. Soricut R, Ding N (2016) Building large machine reading-comprehension datasets using paragraph vectors. arXiv preprint, 6(2), 1–10. https://arxiv.org/abs/1612.04342 31. Baudiš P, Šedivy J (2015) Modeling of the question answering task in the yodaqa system. In: International conference of the cross-language evaluation forum for European languages, vol 20(5). Springer, pp 222–228. https://doi.org/10.1007/978-3-319-24027-5_20 32. Dhingra B, Mazaitis K, Cohen W (2017) Quasar: datasets for question answering by search and reading. arXiv 2017, abs/1707.03904, 17(4) 1–11. https://arxiv.org/abs/1707.03904 33. 
Dunn M, Sagun L, Higgins M, Guney VU, Cirik V, Cho K (2017) Searchqa: a new q&a dataset augmented with context from a search engine. arXiv preprint, 17(2), 1–5. https://arxiv.org/abs/ 1704.05179 34. Lin CY (2004) ROUGE: a package for automatic evaluation of summaries. Text summarization branches out. Association for computational linguistics, 4(13), 74–81. https://www.aclweb.org/ anthology/W04-1013.pdf 35. Papineni K, Roukos S, Ward T, Zhu WJ (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the association for computational linguistics, vol 15(2), pp 311–318. https://doi.org/10.3115/1073083.1073135 36. Banerjee S, Lavie A (2005) METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization, vol 5(1), pp 65–72. https://www.aclweb.org/anthology/W05-0909
A Novel Feature Descriptor: Color Texture Description with Diagonal Local Binary Patterns Using New Distance Metric for Image Retrieval

Vijaylakshmi Sajwan and Rakesh Ranjan
Abstract The volume of digital data grows exponentially with each passing day. A storage media database usually contains large amounts of images and associated information, which must be located and retrieved with relative ease. In this work, a diagonal local binary pattern for color images (DLBPC) and a novel distance metric are introduced for high-accuracy image retrieval. The descriptor operates in the device-independent L*a*b* color space. The system's effectiveness has been tested on the Wang-1K dataset, and the findings show that the recommended method is as effective as the other systems studied.

Keywords Diagonal local binary pattern color image (DLBPC) · Local feature · Global feature · LBP · New distance metric
1 Introduction

Content-based image retrieval (CBIR) is a tool designed to locate an image in a large repository based on the user's specifications; it is also widely known as query by image content (QBIC). CBIR is a feasible alternative to standard text-based image retrieval (TBIR), which has many disadvantages [1–5]. In general, we extract low-level features (color, texture, and shape) or high-level features (obtained with machine learning techniques) from images [6]. Despite 30 years of development, CBIR remains one of the most challenging research subjects in AI and pattern recognition [7]. The ongoing search for better and simpler QBIC systems has compelled numerous academics to develop strategies capable of achieving strong retrieval rates in a short time. Feature extraction is a crucial part of CBIR frameworks, since retrieval is performed by matching the extracted features. It is essential to identify and preserve a picture's unique features throughout the feature extraction process. An image may
be accurately described by a relevant, highly suitable characteristic that varies little within a class and discriminates well between classes. When representing a whole image, global and local features are distinguished. A global attribute technique obtains attributes from the whole picture while neglecting local characteristics and structural relationships. Global features are fast to compute and relatively robust to image noise; however, global approaches have not given enough attention to difficulties such as occlusion, viewpoint changes, and illumination variations. Such difficulties are handled by local feature extraction, which isolates properties from block regions of an image. Well-known global texture-based content extraction techniques include Tamura texture features [8], Gabor filters [9], gray-level co-occurrence matrices [10], and the Markov random field model [11]. Feature extraction techniques such as SIFT [12], LBP [13], SURF [14], Daisy [15], PCA-SIFT [16], HOG [17], BRIEF [18], WLD [19], and Rank-SIFT [20] have proven successful for finding local features. Color and texture are important details to consider for speed and retrieval when selecting efficient features. Conventional texture characteristics were applied to grayscale pictures. SIFT, among the most effective and accurate local texture descriptors for grayscale pictures, has been shown to be a part of state-of-the-art grayscale image identification and classification systems [7]. LBP is a local texture descriptor that is largely unaffected by lighting. Many creative uses have been found for the LBP operator, including categorizing textures [21, 22], recognizing faces [23–25], identifying facial emotions [26, 27], and locating pictures [28–31]. The majority of study on the traditional LBP classifier and its variations has been conducted on grayscale pictures. The LBP operator's increasing popularity in various realistic applications has motivated academics to develop classifiers that can represent color texture patterns as accurately as the LBP classifier does for grayscale images. Because a color image is simply a set of channels, the grayscale LBP operator can naturally be applied to each channel of a color image [1]. This technique, dubbed multispectral LBP (MSLBP), was utilized to characterize color texture [32]. Pandey et al. [33] studied CBIR with color features. Sharma et al. [34] considered the firefly [35] algorithm for image segmentation [36]. Maenpaa et al. [37] explored color texture and discovered that three pairings of opponent color components, rather than six, are sufficient to convey the cross-correlation features between the three color channels. We propose DLBPC for color images. After converting the RGB image to the L*a*b* color space, a 3 × 3 window is applied, and the middle pixel of each 3 × 3 window is modified iteratively based on its diagonal elements. Finally, an attribute vector for a color image is created using a histogram of 256 diagonal local binary bins. This work also presents a new distance measure that ranks repository pictures with respect to a query image. The proposed approach has been analyzed and compared with existing image retrieval approaches.
2 Proposed Method

2.1 Diagonal Local Binary Pattern for Color Image

Step 1: Preprocessing: In this step, the system converts pictures to the L*a*b* color space. This space aims for perceptual uniformity, and its L component closely mimics the sensation of lightness experienced by a human being.

Step 2: Extraction of Features: In this step, the DLBPC operator extracts the picture features. The envisaged DLBPC method is built around the idea of quantizing color pixels by splitting them with an m-dimensional hyperplane; in this specific situation m = 3, since a color pixel in the L*a*b* color space can be represented by the vector I = (L(p, q), a(p, q), b(p, q)), i.e., three unique color components. A 3 × 3 window is then slid over the image, and the center pixel value is adjusted based on its two adjacent diagonal elements. The procedure is repeated until the entire image has been processed and the DLBPC feature has been obtained. Finally, an attribute vector with 256 bins is created using the histogram. The window middle pixel value is changed as shown in Fig. 1.
Fig. 1 The 3 × 3 window and the equations used to encode the center pixel [6]: Vi = B(i+1) + B(i−1) − 2Bi for the diagonal neighbors, i = 1, …, 4 (e.g., V1 = B2 + B8 − 2B1 and V7 = B8 + B6 − 2B7)

Fig. 2 Example of the DLBPC calculation for each pixel
Figure 2 is used to compute Vi, with Ip = Vi. The Vi values are recorded and thresholded: if Vi < 0, write 0; otherwise, write 1. Finally, the Vi values are converted to a decimal code using V1 as the LSB and V4 as the MSB, i = 1, …, 4. Figure 2 shows how DLBPC is calculated for each pixel. The resulting pattern values are represented using a histogram, and the same method is applied to the database images. The DLBPC is calculated through the formula below:

LBP(xc, yc) = Σ_{p=0}^{P−1} S(Ip) × 2^p    (1)
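As a rough illustration of Steps 1–2 and Eq. (1), a minimal Python sketch of the DLBPC encoding is given below. It follows the description above literally (L*a*b* conversion, thresholding of the diagonal responses Vi, packing V1…V4 into a code with V1 as the LSB, 256-bin histogram). The exact neighbour indexing of Fig. 1 and the choice to encode a single channel are assumptions, not the authors' reference implementation (which was written in MATLAB); also note that four bits populate only 16 of the 256 bins, so this is strictly a sketch of the per-pixel encoding.

import numpy as np
from skimage import color   # assumed helper for the RGB -> L*a*b* conversion

def dlbpc_histogram(rgb, channel=0):
    # Step 1: convert to the device-independent L*a*b* space; encode one channel here
    x = color.rgb2lab(rgb)[..., channel].astype(np.float64)
    h, w = x.shape
    # Neighbours B1..B8 of every interior pixel, clockwise from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    B = [x[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] for dy, dx in offsets]
    # Step 2: diagonal responses at the corner neighbours B1, B3, B5, B7 (Fig. 1),
    # thresholded at zero ("write 0 if Vi < 0, else 1") and packed with V1 as the LSB
    bits = [(B[(i + 1) % 8] + B[(i - 1) % 8] - 2.0 * B[i]) >= 0 for i in (0, 2, 4, 6)]
    code = sum(b.astype(np.int64) << k for k, b in enumerate(bits))    # Eq. (1)
    hist, _ = np.histogram(code, bins=256, range=(0, 256))             # attribute vector
    return hist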
Step 3: Retrieval of Similar Items Based on Comparisons: After extracting the features from the query image and the database photographs, similarity matching between the query image and the corresponding repository images is performed. The pictures are arranged in ascending order of the distance metric's value. Several distance metrics are available for determining similarity; in these tests, we employed the City Block, Canberra, and Euclidean distance measures, as well as the new distance metric proposed below. The outcome of the study shows that our suggested new distance metric outperforms the other distance metrics.
3 Similarity Measures

Searching for items similar to a particular query element using a distance metric is a widespread task in several application fields, including pattern recognition, data processing, multimedia retrieval, and computational biology. In CBIR, because even a tiny perturbation of an image can change the numerical values of its pixel intensities and the related picture characteristics, the search is not performed by comparing the actual images pixel by pixel. Instead, the search process relies on proximity calculations over the picture representations: the photographs closest to the query representation are given higher rankings. A search may look for items that meet all of the requirements of the query exactly, or for the items with the greatest resemblance to the query; in the latter scenario, we seek the items closest to the query. A similarity function and a distance function are equivalent if they produce the same ranked list of responses to a query. Several distance metrics exist, including Euclidean, City Block, Canberra, Manhattan, Bray–Curtis, and square chord, among others [22], each with a different purpose. This study investigates Canberra, Euclidean, City Block, and our suggested new distance metric to see which one yields the most favorable outcomes.
3.1 Proposed New Distance Metric

We offer a new distance measure (NDM). If the query and repository pictures have feature vectors represented by A and B, respectively, then each attribute ai and bi represents one feature of the corresponding picture. For the attributes ai and bi, the new distance metric (NDM) is defined as:

NDM(A, B) = Σ_{i=1}^{n} | ai^{1/p} − bi^{1/p} |    (2)

where p is a positive integer, p ∈ N+. If p = 1, the new distance metric is identical to the City Block distance; therefore, we restrict our technique to integer parameters p > 1. We have chosen p = 3 in our method, which outperforms the other distance measures considered (Euclidean, City Block, Canberra). The new distance measure was devised as part of our research on the diagonal local binary pattern for color pictures.
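A direct implementation of Eq. (2) is straightforward. The sketch below (plain NumPy, with p = 3 as chosen above) is only illustrative and assumes non-negative histogram features so that the 1/p-th root is real; the ranking comment mirrors Step 3 of the proposed method.

import numpy as np

def ndm(a, b, p=3):
    # Eq. (2): sum_i | a_i^(1/p) - b_i^(1/p) |; with p = 1 this is the City Block (L1) distance
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return np.abs(a ** (1.0 / p) - b ** (1.0 / p)).sum()

# Retrieval sketch: repository images sorted in ascending order of NDM to the query
# ranked = sorted(repository.items(), key=lambda kv: ndm(query_vector, kv[1]))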
4 Analysis and Result

4.1 Performance and Measures

Precision and recall are two of the most important and often used metrics. Precision indicates how accurate an algorithm is at returning relevant items, whereas recall (or sensitivity) assesses how many of the relevant items the algorithm manages to retrieve [7]. In our testing, every picture in the repository serves in turn as a query image against the other images in the repository. According to Refs. [38, 39], the precision P(N) and recall R(N) for fetching the top N items are used to evaluate the effectiveness of an image retrieval system:

P(N) = IN / N,    R(N) = IN / M    (3)
In Eq. (3), IN denotes the number of relevant (similar) images retrieved among the top N positions, and M denotes the total number of relevant images stored in the repository. The average precision of a query is calculated as the average of all precision values p(n), where n = 1, 2, …, N:
p(q) = (1/N) Σ_{n=1}^{N} p(n)    (4)
The mean average precision (mAP) is the average of p(q) over all Q queries:

mAP = (1/Q) Σ_{q=1}^{Q} p(q)    (5)
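The evaluation protocol of Eqs. (3)–(5) can be sketched as follows. The helper names are illustrative, the ranked label lists are assumed to come from any retrieval function (e.g., DLBPC features compared with NDM), and M = 100 relevant images per class holds for the Wang-1K dataset used below. Note that Eq. (4) averages the precision at every cutoff n, which is the paper's definition of per-query average precision.

import numpy as np

def precision_recall_at_n(ranked_labels, query_label, n, m=100):
    # Eq. (3): P(N) = I_N / N and R(N) = I_N / M for one query
    i_n = sum(1 for lbl in ranked_labels[:n] if lbl == query_label)
    return i_n / n, i_n / m

def mean_average_precision(all_ranked_labels, query_labels, n=100, m=100):
    # Eq. (4): p(q) is the mean of p(1)..p(N); Eq. (5): mAP is the mean over all Q queries
    ap = []
    for ranked, q in zip(all_ranked_labels, query_labels):
        precisions = [precision_recall_at_n(ranked, q, k, m)[0] for k in range(1, n + 1)]
        ap.append(np.mean(precisions))
    return 100.0 * float(np.mean(ap))   # mAP in percent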
4.2 Datasets

Wang database [42]: It contains 1000 pictures organized into ten classes of 100 images each. The pictures have resolutions of 256 × 384 or 384 × 256 pixels. The ten categories of the Wang-1K database are African people, beaches, buildings, buses, dinosaurs, elephants, flowers, horses, mountains, and food.
4.3 Obtained Results

Several experiments are reported in this section to show the practicality of the proposed methodology and to compare its results with those of the existing methods LBP, LBPC, MSLBP, MDLBP, and LCVBP. A Microsoft Windows 7 computer with an i3 processor and 4 GB of RAM was used to test our suggested approaches in MATLAB R14. Because the LBP operator is applied to each channel separately, the LBP characteristic vector for a color image is 768 bins long, whereas the DLBPC, LCVBP, MSLBP, and MDLBP feature vectors have 256, 236, 2304, and 2048 dimensions, respectively. As a result, retrieval is slowed by the longer feature vectors of MSLBP (2304) and MDLBP (2048). The mAP values for the top hundred pictures (N = 100) on the Wang-1K dataset are shown in Table 1; the most effective retrieval results are highlighted in bold. Each picture of the repository is utilized as a query image. As shown in Table 1, the proposed DLBPC with the proposed new distance metric achieves the best mAP value of 61.72. Our suggested technique's feature vector is small, which enhances retrieval speed. DLBPC also attains the next-best mAP value of 60.89 when the Canberra distance measure is applied. MDLBP obtains the third-highest mAP value of 60.82 with the Canberra distance metric, and MSLBP comes fourth with a mAP of 60.62, also using the Canberra distance metric.
Table 1 Mean average precision (mAP) for N = 100 obtained by several techniques on the Wang-1K dataset using various distance metrics and the proposed new distance measure

Technique          Total attributes   Euclidean   City Block   Canberra   New distance metric (proposed)
LBP [13]           3 × 256 = 768      55.28       55.23        56.93      –
LBPC [7]           256                55.49       55.05        58.05      –
MSLBP [32]         2304               59.86       59.83        60.62      –
MDLBP [40]         2048               59.58       59.58        60.82      –
LCVBP [41]         236                53.42       53.44        56.83      –
DLBPC (proposed)   256                57.34       57.75        60.89      61.72
Fig. 3 Precision–recall (P–R) curves for n = 1 to 100 on the Wang dataset for the proposed DLBPC (using the new distance metric) and five other existing methods: LBP, LBPC, MSLBP, MDLBP, and LCVBP
Figure 3 illustrates the precision versus recall values for the six techniques, five existing and one proposed. The graph shows that the proposed method consistently has the highest precision across all recall values, followed by MDLBP, MSLBP, LCVBP, LBP, and LBPC. Table 2 compares the proposed DLBPC with the existing MDLBP class by class. For the flower class, our proposed strategy produces a mAP of 94.00%, which is considerably greater than MDLBP's 80.08%. The elephant, mountain, building, bus, and horse classes exhibit differences of 0.47, 26.20, 2.19, 2.21, and 1.39, respectively. With these differences, our suggested technique offers better retrieval outcomes than MDLBP. Horses, mountains, buildings, buses, and African people are among the other classes for which our suggested approach yields substantially better outcomes. These classes, like the elephant class, contain objects in a range of forms and sizes, whereas the other classes do not have as much image variety. A more comprehensive examination of the number of relevant results was thus achieved.
Table 2 Class-wise mean average precision (mAP) in percent for N = 100 on the Wang-1K dataset: proposed DLBPC using the new distance metric versus MDLBP [40] using the Canberra distance metric

Class            DLBPC using the new distance metric (proposed)   MDLBP [40] using the Canberra distance metric   Difference
African people   57.00   55.75   1.25
Beach            50.00   45.21   4.79
Building         59.00   56.81   2.19
Bus              83.00   80.79   2.21
Dinosaur         96.00   97.10   −1.10
Elephant         42.00   41.53   0.47
Flower           94.00   80.08   13.92
Horse            63.00   61.61   1.39
Mountain         60.00   33.80   26.20
Food             45.00   55.54   −10.54
Average          64.90   60.82   4.08
5 Conclusion

DLBPC and a novel distance measure have been presented for high-precision image retrieval. DLBPC with the recommended distance metric makes use of the L*a*b* color space and thresholds the diagonal color pixels in the circularly symmetric neighborhood of P equally spaced members with radius R. The suggested DLBPC method, when combined with the proposed new distance measure, results in a higher mAP value than the frequently utilized multichannel decoded local binary pattern (MDLBP). Additional study demonstrates that the proposed descriptor remains valid when there is significant intra-class variation among the pictures. The suggested approach was evaluated on the challenging Wang-1K dataset to determine the efficacy of our image retrieval system and was shown to be competitive in accuracy with other methods.
References 1. Rui Y, Huang T, Chang S (1999) Image retrieval: current techniques, promising directions, and open issues. J Vis Commun Image Represent 10(1):39–62 2. Smeulders A, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380 3. Kokare M, Chatterji B, Biswas P (2002) A survey on current content based image retrieval methods. IETE J Res 48(3–4):261–271 4. Liu Y, Zhang D, Lu G, Ma W (2007) A survey of content based image retrieval with high-level semantics. Pattern Recogn 40(1):262–282 5. Datta R, Joshi D, Li J, Wang J (2008) Image retrieval. ACM Comput Surv 40(2):1–60
6. Alrahhal M, Supreethi KP (2019) Content-based image retrieval using local patterns and supervised machine learning techniques. In: Proceedings of IEEE International conference on artificial intelligence (AICAI), pp 118–124 7. Singh C, Walia E, Kaur KP (2018) Color texture description with novel local binary patterns for effective image retrieval. Pattern Recogn 76(2):50–68 8. Tamura H, Mori S, Yamawaki T (1978) Texture features corresponding to visual perception. IEEE Trans Syst Man Cybern 8(6):460–473 9. Manjunath BS, Ma WY (1996) Texture features for browsing and retrieval of image data. IEEE Trans Pattern Anal Mach Intell 18(8):837–842 10. Haralick RM, Shangmugam K (1973) Textural feature for image classification. IEEE Trans Syst Man Cybern 3(6):6610–6621 11. Cross G, Jain A (1983) Markov random field texture models. IEEE Trans Pattern Anal Mach Intell 5(1):25–39 12. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60(2):91–110 13. Ojala T, Pietikäinen M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987 14. Bay H, Ess A, Tuytelaars T, Gool LJV (2008) SURF: speeded-up robust features. Comput Vis Image Understand 110(3):346–359 15. Tola E, Lepetit V, Fua P (2010) DAISY: an efficient dense descriptor applied to wide baseline stereo. IEEE Trans Pattern Anal Mach Intell 32(5):815–830 16. Ke Y, Sukthankar R (2004) PCA-SIFT: a more distinctive representation for local image descriptors. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 506–513 17. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 886–893 18. Calonder M, Lepetit V, Zuysal MO, Trzcinski T, Strecha C, Fua P (2012) Brief: computing a local binary descriptor very fast. IEEE Trans Pattern Anal Mach Intell 34(7):1281–1298 19. Chen J, Shan S, He C, Zhao G, Pietikäinen M, Chen X, Gao W (2010) WLD: a robust local image descriptor. IEEE Trans Pattern Anal Mach Intell 32(9):1705–1720 20. Li B, Xiao R, Li Z, Cai Z, Lu B-L, Zhang L (2011) Rank-SIFT: learning to rank repeatable local interest points. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 1737–1744 21. Mäenpää T, Pietikäinen M, Ojala T (2000) Texture classification by multi-predicate local binary pattern operators. In: Proceedings of International conference on pattern recognition, pp 3951–3954 22. Pietikäinen M, Mäenpää T, Jaakko V (2002) Color texture classification with color histograms and local binary patterns. In: Workshop on texture analysis in machine vision, pp 109–112 23. Ahonen T, Hadid A, Pietikäinen M (2004) Face recognition with local binary patterns. In: Proceedings of European conference on computer vision, pp 469–481 24. Ahonen T, Hadid A, Pietikäinen M (2006) Face description with local binary patterns: application to face recognition. IEEE Trans Pattern Anal Mach Intell 28(12):2037–2041 25. Shan C, Gong S, McOwan PW (2005) Robust facial expression recognition using local binary patterns. In: Proceedings of IEEE International conference of image processing, pp 370–373 26. Shan C, Gong S, McOwan PW (2009) Facial expression recognition based on local binary patterns: a comprehensive study. Image Vis Comput 27:803–816 27. 
Zhang S, Zhao X, Lei B (2012) Facial expression recognition based on local binary patterns and local fisher discriminant analysis. WSEAS Trans Signal Process 8(1):21–31 28. Guoying Z, Pietikainen M (2007) Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Trans Pattern Anal 29:915–928 29. Sajwan V, Goyal P (2014) Sub-block features based image retrieval. Comput Intell Data Mining 31:637–646
30. Takala V, Ahonen T, Pietikäinen M (2005) Block-based methods for image retrieval using local binary patterns. In: Proceedings of Scandinavian conference on image analysis, pp 882–891 31. Penatti OAB, Valle E, Torres RS (2012) Comparative study of global color and texture descriptors for web image retrieval. J Vis Commun Image Represent 23(2):359–380 32. Mäenpää T, Pietikäinen M, Viertola J (2002) Separating color and pattern information for color texture discrimination. In: Proceedings of 16th International conference on pattern recognition, vol 1, pp 668–671 33. Pandey S, Saini ML, Kumar S (2020) A comparative study on CBIR using color features and different distance method. In: Advances in computing and intelligent systems. Springer, Singapore, pp 605–617 34. Sharma A, Chaturvedi R, Kumar S, Dwivedi UK (2020) Multi-level image thresholding based on Kapur and Tsallis entropy using firefly algorithm. J Interdisc Math 23(2):563–571 35. Sharma A, Chaturvedi R, Dwivedi UK, Kumar S, Reddy S (2018) Firefly algorithm based effective gray scale image segmentation using multilevel thresholding and entropy function. Int J Pure Appl Math 118(5):437–443 36. Sharma A, Chaturvedi R, Dwivedi U, Kumar S (2021) Multi-level image segmentation of color images using opposition based improved firefly algorithm. Recent Adv Comput Sci Commun (Formerly: Recent Patents Comput Sci) 14(2):521–539 37. Mäenpää T, Pietikäinen M (2004) Classification with color and texture: jointly or separately? In: Proceedings of pattern recognition, pp 1629–1640 38. Abrishami Moghaddam H, Taghizadeh Khajoie T, Rouhi AH, Saadatmand Tarzjan M (2005) Wavelet correlogram: a new approach for image indexing and retrieval. Pattern Recogn 38:2506–2518 39. Salehian H, Jamzad F, Jamzad M (2011) Fast content based color image retrieval system based on texture analysis of edge map. Adv Mater Res 341:168–172 40. Dubey SR, Singh SK, Singh RK (2016) Multichannel decoded local binary patterns for content based image retrieval. IEEE Trans Image Process 25(9):4018–4032 41. Lee SH, Choi JY, Ro YM, Plataniotis KN (2012) Local color vector binary patterns from multichannel face images for face recognition. IEEE Trans Image Process 21(4):2347–2353. https://doi.org/10.1109/TIP.2011.2181526 42. Wang Database [online]. Available: http://www.wang.ist.psu.edu/docs/relate/. Last accessed on 05 Oct 2021
OntoINT: A Framework for Ontology Integration Based on Entity Linking from Heterogeneous Knowledge Sources

N. Manoj, Gerard Deepak, and A. Santhanavijayan
Abstract In artificial intelligence, knowledge representation is a crucial field of work, particularly in the development of query answering systems. An ontology is used to describe a particular domain of shared knowledge for a query answering system, and ontology integration is necessary to handle such blended information. In the proposed OntoINT framework, the ontologies are subjected to spectral clustering, and the ANOVA–Jaccard similarity index under sunflower optimization is used as the similarity measure for ontology matching. The performance of the proposed OntoINT is analysed and compared with other iterations of OntoINT and with baseline models, and our approach is found to be superior in terms of performance. For dataset 1, the OntoINT network achieves a precision of 93.97%, a recall of 96.02%, an accuracy of 94.89%, an F-measure of 94.98% and a percentage of new and yet relevant concepts discovered of 88.48%; for dataset 2, it achieves a precision of 91.89%, a recall of 93.02%, an accuracy of 92.89%, an F-measure of 92.45% and a percentage of new and yet relevant concepts discovered of 84.78%.

Keywords ANOVA–Jaccard similarity · Entity linking · Ontology · Spectral clustering

1 Introduction

Ontologies capture information about a settled domain and offer a conventional meaning of that domain that can be reused and distributed over frameworks and services. To accomplish ontology integration, it is important to find relations between two ontologies. Ontology integration is essential for assimilating verified knowledge in order to collate information from heterogeneous sources. Understanding the very nature of
1 Introduction Ontologies capture information about settled space and offer a conventional meaning of a space that can be reused and distributed over framework and services. To accomplish ontology integration, it is important to find a relation between two ontologies. Ontology integration is essential for assimilating verified knowledge in order to collate information from heterogenous sources. Understanding the very nature of N. Manoj · A. Santhanavijayan Department of Computer Science and Engineering, SRM Institute of Science and Technology, Ramapuram, Chennai, India G. Deepak (B) Department of Computer Science and Engineering, National Institute of Technology, Tiruchirappalli, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_3
the abstraction operations is the main element in analysing each knowledge model. Knowledge modelling can be defined in terms of two types of abstraction operations: aggregation and generalization. The proposed approach includes the development, or abstraction, of the deciding components of a given subject of thought, also focusing on recurring versions of those attributes. In the proposed OntoINT framework, the ontologies are grouped using spectral clustering with the ANOVA–Jaccard similarity index as the similarity measure for ontology matching. Ontology bins are created from each cluster, and the ontologies from each bin are eventually merged into a single ontology, which helps to reduce the search effort when querying data during query processing. The aim of this research paper is to provide a much-improved approach for ontology merging.

Motivation: Ontology integration is crucial for achieving a large integrated ontology or a knowledge graph. Thousands of web services and other services use ontology modelling, and these services are merged using ontology integration, which plays a crucial role in distribution over a framework. OntoINT is a method of ontology aggregation that uses query terms and is the first of its kind to perform ontology matching that integrates sunflower optimization.

Contribution: A system is proposed for integrating frameworks from different services that use ontology modelling, using the ANOVA–Jaccard semantic similarity index and spectral clustering to integrate ontologies. Experimental data showed that high accuracy can be achieved using OntoINT when compared to some traditional approaches. A precision of 93.97%, recall of 96.02%, accuracy of 94.89%, F-measure of 94.98% and percentage of new and yet relevant concepts discovered of 88.48% are observed for the OntoINT network on dataset 1, and a precision of 91.89%, recall of 93.02%, accuracy of 92.89%, F-measure of 92.45% and percentage of new and yet relevant concepts discovered of 84.78% on dataset 2.

Organization: The remainder of the paper is organized as follows. Section 2 explains the relevant research that has previously been done on this topic. Section 3 elaborately explains the proposed architecture. Section 4 covers the implementation and also discusses the performance and comparison. Section 5 concludes the paper.
2 Related Works

Osman et al. [1] have put forward a systematic study of all types of ontological integration; all similar notions were explored, and new methods and approaches in the literature were scrutinized. Babalou et al. [2] have categorized the criteria by which ontologies can be integrated; generic merge requirements (GMRs) are used to determine a compatible set of requirements using user input. Do et al. [3] present an ontological model consisting of concepts, relationships between concepts and
inference rules, which is used as the knowledge kernel. Additionally, to form an interconnected information-based structure, this kernel can be combined with other knowledge, such as knowledge of operators and functions. Korableva et al. [4] have put forward an ontological system of automatic replenishment and a system of semantic searching according to certain parameters for data subsets. He et al. [5] have proposed a system for the incorporation, exchange and review of information and data based on a community-based ontology for coronavirus disease. Nadal et al. [6] have put forward a query rewriting algorithm that transforms queries presented over the ontology into queries over the sources using the annotated ontology; a model that semi-automatically adapts the ontology to new releases to cope with syntactic evolution in the sources has also been put forth. Ma et al. [7] have surveyed recent advances in ontology learning and illustrated how they can be adopted to play a critical role in the convergence of knowledge systems. Peng et al. [8] have proposed an approach that applies semantic web technology to combine heterogeneously built services and devices with health data and home environment data. Buron et al. [9] have put forward a system to integrate data sources from several data models under an architecture based on RDF graphs and ontologies. Sobral et al. [10] have proposed an architecture to bolster intelligent transportation systems data integration and visualization. Sun et al. [11] have put forward a unified geospatial data ontology system, denoted GeoDataOnt, to provide a semantic base for aggregation and sharing of geospatial data; they define a hierarchy of geospatial data characteristics and examine the semantic issues for each characteristic. Boyles et al. [12] have proposed a system where semantic approaches push new theoretical perspectives and bring diverse data into a scientifically relevant context. Qodim et al. [13] have put forward a system to integrate information with the use of ontology. El-Sappagh et al. [14] have presented a detailed mobile health structure with an incorporated CDSS capability. Makwana et al. [15] have proposed a method to measure the similarity between ontologies using clustering. In Refs. [16–25], several ontology-driven schemes are discussed in support of this paper.
3 Proposed System Architecture

The architecture of the proposed OntoINT is depicted in Fig. 1. The user need is first obtained as a query, and the query is then pre-processed: it is tokenized into terms and subjected to lower-case conversion, lemmatization, stop word removal, punctuation removal, word stemming and number-to-text conversion. The query words are given as input to ontology matching using ANOVA–Jaccard with sunflower optimization. The similarity function utilized in this case is the ANOVA–Jaccard similarity. The Jaccard similarity function uses the idea of intersection over union and is used to compare the textual similarity of two texts.
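The query pre-processing described above can be sketched with NLTK, which is the library named later in the implementation section. The sketch below is illustrative only: it covers tokenization, lower-casing, punctuation and stop word removal, and lemmatization, while the number-to-text conversion and stemming steps are omitted, and the NLTK corpora ('punkt', 'stopwords', 'wordnet') are assumed to have been downloaded.

from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

def preprocess_query(query):
    # Tokenise, lower-case, drop punctuation and stop words, then lemmatise each term
    lemmatizer = WordNetLemmatizer()
    tokens = word_tokenize(query.lower())
    return [lemmatizer.lemmatize(t) for t in tokens
            if t.isalnum() and t not in stopwords.words("english")]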
Fig. 1 Proposed architecture diagram for OntoINT
Jaccard SIM = |X ∩ Y| / |X ∪ Y|    (1)
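A minimal sketch of computing Eq. (1) on token sets, together with the 0.75 cut-off described next, is given below; the function names and the whitespace tokenization are illustrative assumptions rather than the authors' implementation.

def jaccard_sim(text_a, text_b):
    # Eq. (1): |X ∩ Y| / |X ∪ Y| on token sets, scaled to a percentage
    x, y = set(text_a.lower().split()), set(text_b.lower().split())
    if not (x | y):
        return 0.0
    return 100.0 * len(x & y) / len(x | y)

def matches(query_terms, ontology_label, cutoff=0.75):
    # Values above the 0.75 cut-off are treated as true matches
    return (jaccard_sim(" ".join(query_terms), ontology_label) / 100.0) > cutoff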
Jaccard similarity is represented by Eq. (1), where X and Y are two sets containing text. It gives a value between 0 and 1, which is multiplied by 100 to yield the percentage of similarity. In this case, we choose 0.75 as the cut-off: a value greater than 0.75 is considered true, while any value less than that is considered false. Another input required for ontology matching is data from the ontology corpus, which is converted from the Web Ontology Language (OWL) to the Resource Description Framework (RDF) using OntoCollab. OWL is a semantic web language designed to represent rich and complex knowledge about things and the relationships between them, while RDF has been created and agreed upon by the W3C for information interchange. The query-matched ontologies and entity-linked knowledge bases such as CYC, DBpedia and Wikidata are used in spectral clustering to find the required clusters. The DBpedia ontology is a shallow, cross-domain ontology based on the most frequently used infoboxes manually created inside Wikipedia; at present, the ontology comprises 685 classes that form a subsumption hierarchy and are described by 2795 separate properties. Spectral clustering is an exploratory data analysis approach for breaking down large multidimensional datasets into clusters of comparable data in fewer dimensions. It is well known that spectral clustering can be interpreted as the partitioning of a mass–spring system, where each mass is linked to a data point and each spring stiffness corresponds to an edge weight indicating the closeness of the two data points it connects.
The eigenvalue problem describing the transversal vibration modes of such a mass–spring system is the same as the eigenvalue problem for the graph Laplacian, defined as

L = D − A    (2)

Dii = Σ_j Aij    (3)

Lnorm = I − D^{−1/2} A D^{−1/2}    (4)

P = D^{−1} A    (5)
In Eqs. (2), (3) and (4), D is the diagonal degree matrix. Within the mass–spring framework, the masses that are strongly joined by springs move together away from the equilibrium state in the low-frequency vibration modes, so the components of the eigenvectors corresponding to the smallest eigenvalues of the graph Laplacian can be used for meaningful clustering of the masses. The normalized-cuts algorithm (the Shi–Malik algorithm, given by Jianbo Shi and Jitendra Malik), widely used for image segmentation, is a well-known related spectral clustering technique: it partitions the points into two sets (B1 and B2) based on the eigenvector v corresponding to the second-smallest eigenvalue of the symmetric normalized Laplacian of Eq. (4). A related formulation uses the eigenvector corresponding to the largest eigenvalue of the normalized adjacency (random-walk) matrix of Eq. (5). Cyc is an artificial intelligence project targeted at developing a comprehensive ontology and knowledge base that incorporates the basic principles and rules of how the world works; Cyc aims to capture the common-sense knowledge that other AI platforms may take for granted. The partitioning can be achieved in a variety of ways using these eigenvectors, for example by computing the median m of the components of the second-smallest eigenvector v and placing all points whose component in v is greater than m in B1 and the rest in B2. The approach can be used for hierarchical clustering by repeatedly partitioning subsets in this way. The semantic similarity and entropy are computed from the dataset, and the top 10% of each cluster is returned. The results are axiomatized, rules are induced and reasoned over, and the results are then given back to the user.
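A compact sketch of the spectral step described above, using the normalized Laplacian of Eq. (4) and k-means on the leading eigenvectors, is given below. It assumes scikit-learn is available and that a precomputed affinity matrix A is supplied; it is a generic illustration of the technique, not the authors' implementation. In practice, sklearn.cluster.SpectralClustering(affinity='precomputed') packages the same steps.

import numpy as np
from sklearn.cluster import KMeans

def spectral_clusters(A, k):
    # Cluster items from an affinity matrix A using L_norm = I - D^(-1/2) A D^(-1/2)
    d = A.sum(axis=1)                                         # Eq. (3): D_ii = sum_j A_ij
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    l_norm = np.eye(len(A)) - d_inv_sqrt @ A @ d_inv_sqrt     # Eq. (4)
    vals, vecs = np.linalg.eigh(l_norm)                       # eigenvalues in ascending order
    embedding = vecs[:, :k]                                   # eigenvectors of the k smallest eigenvalues
    return KMeans(n_clusters=k, n_init=10).fit_predict(embedding)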
4 Implementation and Performance Evaluation

The entire framework is implemented in the Python programming language with the Jupyter Notebook as the integrated development environment, following the steps shown in Table 1, and using OntoCollab. The ontologies are modelled based on the ontological tasks of the same or comparable concepts. The ontologies created are within the OWL framework, i.e. the structure of the Web Ontology Language, which offers insights into the proposed OntoINT. Python's NLTK is utilized to pre-process
Table 1 Ontology integration system using ANOVA and Jaccard similarity

Input: a one-word or multi-word query Q
Output: integrated ontology
Step 1: The input query is pre-processed (tokenization and lemmatization) to obtain a list of query words qw
Step 2: While (qw != NULL), obtain the semantic equivalents of qw, look up the OWL ontologies and add them to ontology matching
Step 3: The entity-linked data and qw are clustered using spectral clustering
Step 4: Semantic similarity and entropy are calculated from the dataset and the clustered qw
Step 5: The clustered data are axiomatized, rules are induced and reasoned over
Step 6: The ontology is chosen for the user
Step 7: The ontology is returned to the user
End
the query words. For tokenization, a traditional tokenizer and a punctuation tokenizer are used, and the Stanford CoreNLP lemmatizer is used for stemming purposes. The proposed OntoINT performs much better than existing approaches owing to the use of ANOVA–Jaccard and to the incorporation of spectral clustering that links entities with the matched ontologies across three different knowledge sources, namely CYC, DBpedia and Wikidata. OntoINT is an ontology integration system that uses query words, and its ontology matching incorporating sunflower optimization is the first of its kind. The query-matched, ontology-driven spectral clustering has been incorporated so that the three heterogeneous knowledge sources CYC, DBpedia and Wikidata are used together; their amalgamation provides a large amount of background and auxiliary knowledge. This increase in the information density of the system, together with the amalgamation of ANOVA–Jaccard and sunflower optimization, furnishes desirable results by provisioning lateral knowledge. Ultimately, the use of entropy with semantic similarity has yielded the desired results. The suggested architecture's performance is assessed utilising precision, recall, accuracy, F-measure and FDR as metrics. Precision is calculated as the proportion of retrieved ontologies that are relevant out of the total number of retrieved ontologies. Recall is the proportion of relevant ontologies that are retrieved out of the total number of relevant ontologies. Accuracy is defined as the average of the precision and recall values. Standard formulations for the precision, recall, accuracy, F-measure and percentage of new and yet relevant concepts discovered of the system have been used. The different models taken for reference are OntoIGSM [15], SVM + Bayesian network, K-means clustering + cosine and Jaccard similarity, SIHD and OntoINT. Figure 2 shows the performance assessment for the baseline models OntoIGSM and SIHD. For OntoIGSM, a precision of 82.41%, recall of 86.38%, accuracy of 85.17%, F-measure of 84.35% and percentage of new and yet relevant concepts discovered of 74.18% can be observed, and for SIHD the observations are a precision of 87.42%, recall of 86.01%, F-measure of 88.35% and percentage of new and yet relevant concepts discovered of 64.21%.
Fig. 2 Performance of the proposed OntoINT model for dataset 1
Fig. 3 Performance of the proposed OntoINT model for dataset 2
For the SVM + Bayesian network, a precision of 86.17%, recall of 90.38%, accuracy of 88.02%, F-measure of 88.22% and percentage of new and yet relevant concepts discovered of 83.68% can be observed. For K-means clustering + cosine and Jaccard similarity, a precision of 85.81%, recall of 88.31%, accuracy of 86.31%, F-measure of 87.06% and percentage of new and yet relevant concepts discovered of 70.18% can be observed. For the OntoINT network, a precision of 93.97%, recall of 96.02%, accuracy of 94.89%, F-measure of 94.98% and percentage of new and yet relevant concepts discovered of 88.48% can be observed. From Fig. 2, it can be concluded that OntoINT clearly improves on the baseline models.
In Fig. 3, the performance of all the models on dataset 2 is shown. For OntoIGSM, a precision of 82.41%, recall of 84.38%, accuracy of 83.17%, F-measure of 82.87% and percentage of new and yet relevant concepts discovered of 68.18% can be observed, and for SIHD the corresponding values are 85.42%, 87.31%, 83.01%, 86.35% and 60.21%, respectively. For the SVM + Bayesian network, a precision of 84.17%, recall of 88.38%, accuracy of 86.02%, F-measure of 86.22% and percentage of new and yet relevant concepts discovered of 79.68% can be observed, and for K-means clustering + cosine and Jaccard similarity, a precision of 83.81%, recall of 86.31%, accuracy of 83.31%, F-measure of 85.06% and percentage of new and yet relevant concepts discovered of 71.18% can be observed. For the OntoINT network, a precision of 91.89%, recall of 93.02%, accuracy of 92.89%, F-measure of 92.45% and percentage of new and yet relevant concepts discovered of 84.78% can be observed. From Fig. 3, it can be concluded that OntoINT clearly improves on the baseline models.
5 Conclusions

There is always a need for an approach to ontology integration that is semantically driven and user-oriented. The proposed approach makes predictions that are accurate and precise when compared to baseline models and other variants of OntoINT, owing to the knowledge aggregated by the ANOVA–Jaccard semantic similarity index. As an outcome, it can be stated that such approaches paired with ontology-driven frameworks would give better results in future for the same goal fulfilment. In future work, we can further extend the sub-ontology domain to increase the accuracy of the results. An overall F-measure of 94.98% and 92.45% for datasets 1 and 2, respectively, has been achieved.
References 1. Osman I, Yahia SB, Diallo G (2021) Ontology integration: approaches and challenging issues. Inf Fusion 2. Babalou S, König-Ries B (2019) GMRs: Reconciliation of generic merge requirements in ontology integration. In: SEMANTICS posters & demos 3. Do NV, Nguyen HD, Mai TT (2019) A method of ontology integration for designing intelligent problem solvers. Appl Sci 9(18):3793 4. Korableva ON, Kalimullina OV, Mityakova VN (2019) Designing a system for integration of macroeconomic and statistical data based on ontology. In: Intelligent computing-proceedings of the computing conference. Springer, Cham, pp 157–165 5. He Y, Yu H, Ong E, Wang Y, Liu Y, Huffman A, ... Smith B (2020) CIDO, a community-based ontology for coronavirus disease knowledge and data integration, sharing, and analysis. Sci Data 7(1):1–5 6. Nadal S, Romero O, Abelló A, Vassiliadis P, Vansummeren S (2019) An integration-oriented ontology to govern evolution in big data ecosystems. Inf Syst 79:3–19
7. Ma C, Molnár B (2020) Use of ontology learning in information system integration: a literature survey. In: Asian conference on intelligent information and database systems. Springer, Singapore, pp 342–353 8. Peng C, Goswami P (2019) Meaningful integration of data from heterogeneous health services and home environment based on ontology. Sensors 19(8):1747 9. Buron M, Goasdoué F, Manolescu I, Mugnier ML (2020) Obi-Wan: ontology-based RDF integration of heterogeneous data. Proc VLDB Endowment 13(12):2933–2936 10. Sobral T, Galvão T, Borges J (2020) An ontology-based approach to knowledge-assisted integration and visualization of urban mobility data. Expert Syst Appl 150:113260 11. Sun K, Zhu Y, Pan P, Hou Z, Wang D, Li W, Song J (2019) Geospatial data ontology: the semantic foundation of geospatial data integration and sharing. Big Earth Data 3(3):269–296 12. Boyles RR, Thessen AE, Waldrop A, Haendel MA (2019) Ontology-based data integration for advancing toxicological knowledge. Curr Opin Toxicol 16:67–74 13. Qodim H (2019) Educating the information integration using contextual knowledge and ontology merging in advanced levels. Int J Higher Educ 8(8):24–29 14. El-Sappagh S, Ali F, Hendawi A, Jang JH, Kwak KS (2019) A mobile health monitoring-andtreatment system based on integration of the SSN sensor ontology and the HL7 FHIR standard. BMC Med Inform Decis Mak 19(1):97 15. Makwana A, Ganatra A (2018) A better approach to ontology integration using clustering through global similarity measure. J Comput Sci 14(6):854–867 16. Adithya V, Deepak G (2021) OntoReq: an ontology focused collective knowledge approach for requirement traceability modelling. In: European, Asian, Middle Eastern, North African conference on management & information systems. Springer, Cham, pp 358–370 17. Adithya V, Deepak G, Santhanavijayan A (2021) HCODF: hybrid cognitive ontology driven framework for socially relevant news validation. In: International conference on digital technologies and applications. Springer, Cham, pp 731–739 18. Giri GL, Deepak G, Manjula SH, Venugopal KR (2017) OntoYield: a semantic approach for context-based ontology recommendation based on structure preservation. In: Proceedings of International conference on computational intelligence and data engineering: ICCIDE 2017, vol 9. Springer, p 265 19. Deepak G, Kumar N, Santhanavijayan A (2020) A semantic approach for entity linking by diverse knowledge integration incorporating role-based chunking. Procedia Comput Sci 167:737–746 20. Deepak G, Rooban S, Santhanavijayan A (2021) A knowledge centric hybrid-ized approach for crime classification incorporating deep bi-LSTM neural network. Multimedia Tools Appl 1–25 21. Krishnan N, Deepak G (2021) Towards a novel framework for trust driven web URL recommendation incorporating semantic alignment and recurrent neural network. In: 2021 7th International conference on web research (ICWR). IEEE, pp 232–237 22. Deepak G, Teja V, Santhanavijayan A (2020) A novel firefly driven scheme for resume parsing and matching based on entity linking paradigm. J Discrete Math Sci Crypt 23(1):157–165 23. Deepak G, Kumar N, Bharadwaj GVSY, Santhanavijayan A (2019) OntoQuest: an ontological strategy for automatic question generation for e-assessment using static and dynamic knowledge. In: 2019 Fifteenth international conference on information processing (ICINPRO). IEEE, pp 1–6 24. Deepak G, Santhanavijayan A (2020) OntoBestFit: a best-fit occurrence estimation strategy for RDF driven faceted semantic search. 
Comput Commun 160:284–298 25. Krishnan N, Deepak G (2021) KnowSum: knowledge inclusive approach for text summarization using semantic alignment. In: 2021 7th International conference on web research (ICWR). IEEE, pp 227–231
KnowCommerce: A Semantic Web Compliant Knowledge-driven Paradigm for Product Recommendation in E-commerce

N. Krishnan and Gerard Deepak
Abstract Product recommendation is changing the way e-commerce Web sites function and the way products are advertised: profit is maximized by showing the targeted product to the target audience, making use of user queries and user activity on the Web site. This paper proposes a semantically driven technique for product recommendation that combines knowledge engineering with deep learning and optimization algorithms. The dataset used for training the recommendation system consists of the users' click data and user queries, which are combined into an item configuration set; this set is later used to create an e-commerce ontology, whose semantic similarity is compared with the neural network's output, and products are recommended to the users based on this similarity score. The efficiency of the architecture is analyzed in comparison with the baseline approaches, and it is shown that the suggested method outperforms them, with an F-measure and FDR of 93.08% and 93.72%, respectively.

Keywords Recommendation system · Genetic algorithm · Radial basis neural network
1 Introduction

E-commerce is responsible for a wide range of opportunities, as it increases the scalability and availability of products, and it has a considerable impact on the market and retail. It provides a path for companies to collaborate with each other inside the supply chain in a more optimized and effective way. E-commerce also gives customers an edge, as it offers a wide range of products for them to research and compare prices among various retailers. It offers an in-depth explanation of the
product. Product reviews given by previous customers are available, which makes it easy for the client to select a quality product. It is cost-efficient, as it cuts out the intermediaries and establishes sales directly between manufacturers and consumers. Recommendation systems are dynamic models that help analyze user data and feed relevant products back to the user as recommendations. As every user has distinctive use cases, the recommendation system helps bridge this gap; the key constraint is being able to provide more personalized services to the user, where the services involve products, data, or both. Ratings and reviews also play a major role in users buying a product: the higher the rating, the more likely the user is to buy the product. An effective recommendation system typically uses user data to analyze the user, along with ratings, reviews and social relationships.

Motivation: As there is a wide range of products available, the recommendation system helps users find relevant products to purchase. The recommendation system plays a key role in enhancing the user experience and user engagement. It works by collecting user data such as navigation, queries, and clicks, and recommends the most relevant product from the available product set. A more optimized recommendation system can boost sales by high margins, as it helps in cross-selling; for example, if a user buys a mobile phone, during checkout the system can recommend products such as a phone case or a display guard, for which the probability of purchase is high. A good recommendation system can turn casual browsers into potential buyers.

Contribution: Here, we collect user data such as navigation, queries, and clicks, and the collected data is then preprocessed. An RBFNN is composed of input, hidden, and output layers; it has only one hidden layer and is strictly constrained, and this hidden layer is entitled a feature vector. It increases the dimension of the feature vector. The radial basis function network is used for predicting labels along with crowd feedback, social relationships, and reviews. A solution set is created in each iteration, and the best solution is found using a genetic algorithm. The genetic algorithm follows the idea of natural selection, in which the fittest nodes are chosen for data transfer from the nodes of the previous generation to the next. Every solution set obtained is recorded by the algorithm.

Organization: The remaining article is arranged as follows: Sect. 2 comprises previously conducted studies in the relevant areas. Section 3 describes the suggested architecture. Section 4 provides information about the implementation. Section 5 is made up of the performance assessment and observable outcomes. Section 6 contributes the conclusion.
2 Related Works

Wang et al. [1] have created a deep learning method that uses a clustering algorithm as well as an attention mechanism for recommendation; it has an accuracy of 83.17%. Gerard et al. [2] incorporate conceptual frameworks and recommend items
that include customization based on the customer request, the collected user navigation, and user profile analysis. Selvi et al. [3] suggested a new similarity metric focused on collaborative filtering that preprocesses the dataset by deleting data with different characteristics; the paper also proposed a new solution in which the processed dataset is clustered with MFCM for reduced-error clusters and the optimum user is obtained from each cluster using modified cuckoo search techniques. Lee et al. [4] proposed a novel hybrid CF model based on the doc2vec algorithm using search keywords and the online purchasing history of customers for product recommendation. Li et al. [5] used a hybrid collaborative filtering approach, where CF was determined by both user and product. Zan et al. [6] proposed six user-based algorithms and performed an analysis of them. Jesus et al. [7] used a custom rating algorithm that considers user expectations and attitudes for the recommendation. Tsagkias et al. [8] stated challenges and opportunities in e-commerce. Zameer et al. [9] implemented a hybrid methodology along with an ontology to retrieve useful data and make proper recommendations. Bidisha et al. [10] created a system that provides recommendations to both registered and unregistered users. In Refs. [11–18], many ontological theories have been presented in support of the suggested methodology.
3 Proposed System Architecture

The KnowCommerce approach of the suggested architecture involves four phases and is depicted in Fig. 2. Phase 1 comprises the creation of a user profile using user activity such as user clicks, user queries, and user navigation. This data is then sent for preprocessing, where the useful data is labeled and separated. Then the item configuration set is created; this is done to sort and configure the relevant data as a single set. In Phase 2, the crowd feedback information, social relationships, and reviews, along with the user profile created in Phase 1, are sent as input parameters to the radial basis function neural network. Crowd feedback information is the data gathered from previous users; its other components are social relationships, from which a social structure is derived, and reviews given by users, which can be used to understand relevance. The RBF network is a feedforward neural network with three layers: the input layer, the hidden layer, and the output layer. The RBFN boosts the dimension of the feature vector, and a nonlinear transfer function is applied to the feature vector prior to classification. Figure 1 demonstrates the working flow of the radial basis neural network technique. The output of the nth activation function H(i) in the hidden layer of the network can be determined using Eq. (1) on the basis of the distance between the input pattern x and the center c, with width r:

H(i) = exp( −(x − c)^2 / r^2 )    (1)
Fig. 1 Illustration of radial basis function neural networks
The output of the output layer of the network can be calculated using Eq. (2):

K(i) = Σ_{j=1}^{n} Wj Hj(i)    (2)
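Equations (1) and (2) amount to the following forward pass. The sketch below uses plain NumPy with illustrative centre, width and weight parameters; the paper's actual model was built with Keras, so this is only a hedged illustration of the computation.

import numpy as np

def rbf_forward(x, centers, r, W):
    # x: input feature vector (d,); centers: hidden-unit centres c_j (n, d);
    # r: RBF width (scalar); W: output weights (n,) or (n, outputs)
    sq_dist = ((x - centers) ** 2).sum(axis=1)   # (x - c)^2 per hidden unit
    H = np.exp(-sq_dist / (r ** 2))              # Eq. (1): hidden activations
    K = H @ W                                    # Eq. (2): K = sum_j W_j * H_j
    return H, K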
In Phase 3, the initial solution set is created when label enrichment is done using semantic similarity. The e-commerce ontology is a collection of similar objects/products in a well-defined hierarchy, where the main set has subsets, and so on, according to increasing similarity. Predicted labels are created from the output of the RBFNN; they contain the label tags that are most relevant to the user. Label enrichment is done by comparing the user tags with the e-commerce ontology, and the initial solution set is generated as a recommendation set containing products that are relevant to the user. In the final phase, the overall solution dataset is fed to the genetic algorithm. Here, all the solution sets obtained during each iteration form the population, where the parameters are the preprocessed user queries, clicks, and navigation. The fitness function determines the degree of relevance between a product to be recommended and the user. The selection process takes the most relevant products, i.e., the fittest elements according to the fitness function. To avoid premature convergence and to preserve diversity, mutation is performed. The algorithm comes to an end when the population has converged, that is, when the correct product to be recommended is found, and it returns the final optimal solution.
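The final phase (selection, mutation, convergence on the fittest recommendation set) can be sketched as a plain genetic loop. Everything below, including the fitness function, the mutation rate and the survivor-refill strategy, is a generic illustration under assumptions and not the exact optimization used in the paper.

import random

def genetic_optimize(population, fitness, n_generations=50, mutation_rate=0.1, mutate=None):
    # Generic GA loop: keep the fitter half, refill by (optionally mutated) copies of survivors
    for _ in range(n_generations):
        ranked = sorted(population, key=fitness, reverse=True)   # selection by relevance
        survivors = ranked[: max(2, len(ranked) // 2)]
        children = []
        while len(survivors) + len(children) < len(population):
            parent = random.choice(survivors)
            child = mutate(parent) if (mutate and random.random() < mutation_rate) else parent
            children.append(child)                               # mutation preserves diversity
        population = survivors + children
    return max(population, key=fitness)                          # final optimal solution set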
Fig. 2 Architecture diagram of proposed KnowCommerce system
4 Implementation The KnowCommerce proposed system was created and built using the Windows 10 operating system. The implementation was carried out on an Intel Core i5 9th Gen CPU and 8 GB RAM. The Python NLP tool libraries used for preprocessing data and generating a user profile include NLTK, Re, and sklearn. Keras, a deep learning API
framework written in Python that runs on top of TensorFlow, was used to construct and develop the RBF neural network model. The experiment was done using the dataset e-commerce item data (Kaggle) merged with Myntra dataset. The proposed KnowCommerce algorithm is depicted as Algorithm 1. Algorithm 1: KnowCommerce Product Recommendation Algorithm Proposal
Input: User Data, Crowd Feedback, Social Relationships, and Reviews
Output: Recommended Products
Begin
1: Produce initial population of solution set with n solutions.
2: While (n < solutions generated) {
3:   Generate user profile
4:   Extract queries, clicks and navigation from the user.
5:   While (queries, clicks and navigation != NULL) {
6:     Generate item configuration set
7:     Generate dynamic e-commerce ontology }
8:   Collect crowd feedback information
9:   Derive social relationships and reviews
10:  Using data from steps 3, 8 and 9, train radial basis neural network
11:  Set predict labels
12:  If (semantic similarity = 2):
       Set text = ref_det[0]
       Set refID = ref_det[1]
       Set ref_title = find_ref_title(refID, joinReferences)
       Set ref_list = references.find_line_startwith(refID)
       if len(ref_list):
           elements.append(refID)
           elements.append(text)
           elements.append(findList[0])
           if isfile('path_to_ref_doc_data/ref_{refID}'):
               ref_text = process_ref_doc('path_to_ref_doc_data/ref_{refID}')
               data = set_dataset(ref_text)
               sentence_sim_obj = sentence_similarity(data, bert_model)
               most_similar, dist = sentence_sim_obj.get_most_similar_sent_cosdist(text)
               ref_pds = sum(dist) / len(dist)
               doc_pds = doc_pds + ref_pds
               elements.append(ref_pds)
               elements.append(join(most_similar))
               elements.append(join(dist))
               row_list.append(elements)
8.  to_csv_file(row_list)
9.  to_json_file(row_list)
10. doc_pds_count = (doc_pds / len(references)) * 100
Fig. 3 PDS score calculation of the document
6.5 Experimental Results and Discussion The experiment shows that explicit semantic detection methods have improved the working pattern of plagiarism detection systems. As the method uses word embeddings for the calculation of textual similarity, the embeddings obtained from pre-trained models such as BERT are more efficient at extracting contextual information. Figure 3 shows a screenshot of the experimentation while calculating the PDS score of the document. The proposed algorithm is available in an open GitHub repository: https://github.com/rsanjalibohra/BERTpds. Figure 4 shows the comparative analysis of an available and the proposed plagiarism detection system for two documents named paper1 and paper2. The obtained results are compared with the UGC-provided 'Ouriginal' plagiarism detection system available for access at Jai Narain Vyas University. The main observations are as follows (an illustrative sketch of the core similarity step is given after this list):
(i) The unique features of the transformer architecture help in obtaining the precise semantics of the sentences.
(ii) The BERT model gives the best results when used with large training and development datasets.
(iii) The system uses the N-gram word similarity concept.
(iv) The comparative analysis of Ouriginal and the proposed plagiarism detection system shows that the obtained PDS score of the document increases by 15 to 20 percent when processed with the proposed system.
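The following sketch illustrates the core similarity step under stated assumptions: it is not the repository code, the model name is an assumed pre-trained encoder, and the percentage aggregation only mimics the PDS-style averaging described above.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed pre-trained BERT-style encoder

def pds_like_score(suspicious_sentences, reference_sentences):
    """Average, over suspicious sentences, of the best cosine similarity to any reference."""
    ref_emb = model.encode(reference_sentences, convert_to_tensor=True)
    best = []
    for sentence in suspicious_sentences:
        sims = util.cos_sim(model.encode(sentence, convert_to_tensor=True), ref_emb)
        best.append(float(sims.max()))
    return 100.0 * sum(best) / len(best)

print(pds_like_score(
    ["Word embeddings map words to dense vectors."],
    ["Dense vector representations of words are known as word embeddings.",
     "Cricket is a popular sport."]))
```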
7 Conclusion and Future Scope Word embeddings are the building blocks of natural language processing and provide remarkable results in various NLP tasks like part-of-speech tagging, classification, etc. The paper has explained the role of plagiarism in natural language processing
Fig. 4 Comparison of PDS score from two plagiarism detection systems
with an explanation of various types of plagiarism and detection approaches. Extrinsic and intrinsic are the two approaches to plagiarism detection, and various detection methods have also been explained. The proposed detection algorithm has improved the performance of the system. The work shows that:
(i) The pre-trained models provide contextualized word embeddings which play a key role in extracting the semantic similarity of the content.
(ii) The proposed system is 15 to 20% more efficient than existing systems using other detection algorithms.
(iii) We are continuing the research to explore other linguistic information, such as morphological features of the language.
Further work will target obtaining a precise template for authorship attribution using word embeddings. A template based on semantic features will reveal precise results in the intrinsic analysis of plagiarism detection. Acknowledgements I acknowledge the TEQIP III project for providing me an assistantship and resources for carrying out this research work. I am also thankful to Jai Narain Vyas University for providing the Ouriginal plagiarism detection system for the comparative analysis of the work.
A Multi-attribute Decision Approach in Triangular Fuzzy Environment Under TOPSIS Method for All-rounder Cricket Player Selection H. D. Arora , Riju Chaudhary , and Anjali Naithani
Abstract Of all the sports played around the globe, cricket is one of the most popular and entertaining. The 20-over game, named T-20 cricket, has recently been gaining popularity. The Indian Premier League (IPL) has been instrumental in promoting Twenty-20 cricket in India. The goal of this research is to analyse player performances and select the best all-rounder cricket player using a triangular fuzzy set approach through the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method. To cope with imprecise and ambiguous data, the suggested work uses five alternatives and four criteria in a fuzzy multi-criteria environment. The results suggest that the proposed model provides a more realistic way to select the best all-rounder cricket player among the candidates. Keywords TOPSIS · Multi-attribute decision-making · Triangular fuzzy sets
1 Introduction Fuzzy set theory is a useful tool for describing circumstances in which information is vague or uncertain. A fuzzy set handles such situations by defining the degree to which a particular object belongs to a given set. An individual may assume that x belongs to the set A to some degree, yet be uncertain about it. Fuzzy set theory, a dynamic approach proposed by Zadeh [1], can be applied in situations involving ambiguity,
H. D. Arora · R. Chaudhary (B) · A. Naithani Department of Mathematics, Amity Institute of Applied Sciences, Amity University Uttar Pradesh, Noida, India e-mail: [email protected] H. D. Arora e-mail: [email protected] A. Naithani e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_14
subjectivity and vague decisions. Fuzziness characterization and quantification are critical concerns that influence the management of uncertainty in many information systems and concepts. The triangular fuzzy number (TFN) can be used in situations where some attribute value needs to be expressed objectively in order to convey the decision-making information. This not only maintains the variable’s value in an interval but also displays the likelihood of different values in the considered interval. Multi-attribute decision-making (MADM) is commonly used in numerous domains, including environmental studies, energy, sustainability management, mathematical modelling. It is an extremely tedious job to obtain a precise mathematical value for the examined objects during strategic planning as it involves complexity and ambiguity of human reasoning. MADM is a complex decision-making method that incorporates both quantitative and qualitative elements. In order to choose the most likely optimal options, several MADM strategies and approaches have been suggested. As an augmentation to the fuzzy MADM approach is suggested in this work, where the rankings of options versus attributes, and the weights of all criteria, are assessed in linguistic values represented by fuzzy numbers. MCDM models under fuzzy environment have been proposed by several researchers in linguistic modelling [2, 3] and fuzzy preference modelling [4]. Herrera et al. [5] proposed a linguistic approach in group decision-making using fuzzy sets approach. Kacprzyk [6] provides fuzzy logic with linguistic quantifiers which can be used in group decision-making. Liu et al. [7] provide a strategy to solve fuzzy multi-attribute decision-making problems denoted by triangular fuzzy numbers based on connection number. Wang and Gong [8] suggested a set-pair analysis-based decision-making technique for solving MADM problems with known criteria weights. Zhao and Zhang [9] proposed triangular fuzzy number multi-attribute decision-making method based on set-pair analysis, to solve the problems with multi-attribute decision-making considering attribute value and attribute weight being TFN. Huang and Luo [10] formulated an index weight measure under triangular fuzzy number to evaluate uncertain MADM problem. An ideal solution comprises of the optimal values of all criteria, whereas a negative ideal solution comprises of worst values of all criteria, and the selection criteria for alternatives are based on Euclidean distance. The TOPSIS method is easy in implementation and has been applicable in the problems of selection and ranking of alternatives. MCDM methods are popular among the researchers in dealing with decision-making problems to get the most reliable alternative. Many researchers analysed MADM approach using Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method [11]. TOPSIS method has been applied in various decision-making problems based on polar fuzzy linguistic [12, 13], interval-valued hesitant fuzzy [14–18], natural resource management [19], investment development [20], spillway selection [21], supplier selection [22], selection of land [23, 24], entropy weight of IFSs [25], generalized fuzzy and IFS modelling [7, 26–35], linear programming methodology [36], equipment selection [37] and many other real-life situations flavoured with fuzzy sets and generalized fuzzy sets.
The shorter format of cricket, T-20, is India’s most popular entertainment sport. The Board for Control of Cricket in India (BCCI) launched the IPL in 2008 with eight franchises [38, 39]. The IPL provides a fantastic opportunity for all cricketers to realize their aspirations. Staden [40] displayed the comparison of performances of batters, bowlers and all-rounders, graphically. Parker et al. [41] calculated the value of players in the IPL. Lemmer proposed a number of methods for evaluating bowlers and batsmen’s performance. The rest of this article is organized as follows: Section 2 delves into the many terminologies which will be helpful in evaluating players’ performance. In the next section, TOPSIS algorithm and the procedure to find the ranking of players have been discussed. Section 4 presents the results and discussion. Finally, Sect. 5 concluded the article.
2 Basic Definitions This section covers fundamental concepts of fuzzy sets proposed by Zadeh [1] and Zimmermann [42, 43]. These fuzzy set concepts are expressed as follows:

Definition 2.1 [1]. A fuzzy set M in a universe of discourse U is characterized by a membership function:

M = \{ (u, \delta_M(u)) \mid u \in U \}   (1)

where \delta_M(u) : U \to [0, 1] is a measure of the degree of belongingness or degree of participation of an element u \in U in M.

Definition 2.2 [44]. Let M = [a, b, c, d] be a real fuzzy number, the membership function of which can be expressed as

\delta_M(x) = \begin{cases} \delta_M^{L}(x), & a \le x \le b \\ 1, & b \le x \le c \\ \delta_M^{U}(x), & c \le x \le d \\ 0, & \text{otherwise} \end{cases}   (2)

where \delta_M^{L}(x) and \delta_M^{U}(x) are the lower and upper membership functions of the fuzzy number M, respectively, and a = -\infty, or a = b, or b = c, or c = d, or d = +\infty.
Definition 2.3. [42, 43]. A triangular fuzzy number M is a fuzzy number with piecewise linear membership function δM defined by
\delta_M(x) = \begin{cases} \frac{x - a_1}{a_2 - a_1}, & a_1 \le x \le a_2 \\ \frac{a_3 - x}{a_3 - a_2}, & a_2 \le x \le a_3 \\ 0, & \text{otherwise} \end{cases}   (3)
Fig. 1 Triangular fuzzy number
which can be denoted as (a_1, a_2, a_3). The graphical representation is depicted in Fig. 1. Definition 2.4 [17]. Let P = (x_1, x_2, x_3) and Q = (y_1, y_2, y_3) be two TFNs. Then, the distance measure D(P, Q) can be defined as

D(P, Q) = \sqrt{ \frac{1}{3}\left[ (y_1 - x_1)^2 + (y_2 - x_2)^2 + (y_3 - x_3)^2 \right] }   (4)
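A direct transcription of Eq. (4) in Python, which the method later uses (Step 6) to measure how far each alternative lies from the ideal solutions:

```python
from math import sqrt

def tfn_distance(p, q):
    """Distance between triangular fuzzy numbers p = (x1, x2, x3) and q = (y1, y2, y3), Eq. (4)."""
    return sqrt(sum((y - x) ** 2 for x, y in zip(p, q)) / 3)

print(tfn_distance((0.1, 0.3, 0.5), (0.3, 0.5, 0.7)))  # -> 0.2
```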
3 Proposed Fuzzy TOPSIS Algorithm When standard quantitative concepts are difficult or inadequately defined, the idea of linguistic variables is employed to provide approximate characterization. This section presents the MADM problem under a fuzzy environment, and a viable decision-making methodology is proposed to manage such MADM problems. Each decision matrix in MADM techniques has four main components: (a) criteria, (b) alternatives, (c) weight or relative importance of each attribute and (d) assessment value of alternatives with respect to the criteria. An algorithm of the proposed technique is also introduced and is applied to the selection procedure of an all-rounder cricket player in this section. The procedure of the TOPSIS method is depicted in the flowchart of Fig. 2.
3.1 A Case Study In cricket, we have players who are either batsmen or bowlers or all-rounders. An all-rounder, in cricket, is a player who performs well in both bowling as well as batting. This paper is aimed at ranking of all-rounder cricket players and selection of the best all-rounder from the pool of considered (five) all-rounders. The franchise
Fig. 2 TOPSIS algorithm
owners of the IPL Twenty-20 cricket tournament usually choose their players through auction and on the basis of the individual past performance of players. For proper bidding, each player's price is determined by their individual performance, and franchise owners get the complete picture and choose the right team with appropriate players, resulting in a significantly higher profit from the IPL T-20 event. In T-20 cricket, the all-rounder plays a considerably larger role than players of other categories, be they bowlers or batsmen. Data comprising batting average (C1), batting strike rate (C2), bowling average (C3) and bowling strike rate (C4) has been collected from https://www.cricbuzz.com/profiles for the current study, as shown in Table 1. Based on this data, the players need to be ranked and the best all-rounder needs to be determined. In MADM, the first step is to classify the considered problem in terms of benefit and cost criteria. Benefit criteria are those for which higher values are desired, and cost criteria are those for which lower values are desired. In the considered case study, C1 and C2 are benefit criteria, whereas C3 and C4 are cost criteria.
Table 1  Data set in the form of a decision matrix (X)

Alternatives/criteria    C1       C2       C3       C4
Shane Watson             31.08    139.53   29.15    22.05
Andre Russell            29.47    179.29   26.86    17.75
Ravindra Jadeja          26.62    128.14   30.25    23.8
Hardik Pandya            27.47    157.23   31.26    20.69
Ben Stokes               25.55    134.5    34.78    24.39
Table 2  Linguistic variables and their corresponding triangular fuzzy weights

Importance            Fuzzy weight
Extremely low (EL)    (0, 0, 0.1)
Very low (VL)         (0, 0.1, 0.3)
Low (L)               (0.1, 0.3, 0.5)
Medium (M)            (0.3, 0.5, 0.7)
High (H)              (0.5, 0.7, 0.9)
Very high (VH)        (0.7, 0.9, 1.0)
Extremely high (EH)   (0.9, 1.0, 1.0)
To begin with, a seven-point scale of triangular fuzzy numbers, as shown in Table 2, needs to be chosen.
Step 1: Let there be four decision-makers, namely owner (DM1), captain (DM2), coach (DM3) and sponsor (DM4), who will decide the best all-rounder cricket player. The decision-makers' choices in terms of linguistic variables are depicted in Table 3.
Step 2: Fuzzy weights, based on the decision-makers' subjective evaluation, are calculated and shown in Table 4.
Step 3: The aggregated fuzzy weights, obtained by taking into consideration the upper, medium and lower values of the four ratings from Table 4, have been calculated as follows (Tables 5 and 6).
Step 4: The fuzzy weighted normalized decision matrix is then computed by multiplying the normalized decision matrix by the corresponding fuzzy weights, using the formula V = X × W detailed after the tables.
Table 3  Rating by decision-makers in linguistic scale

Criteria/decision-maker   DM1 (Owner)   DM2 (Captain)   DM3 (Coach)   DM4 (Sponsor)
Batting average           EL            VL              M             EL
Batting strike rate       L             EH              VH            L
Bowling average           L             M               VH            M
Bowling strike rate       L             M               VL            VH
Table 4  Conversion of linguistic rating of decision-makers into fuzzy rating

Criteria/decision-maker   DM1 (Owner)       DM2 (Captain)     DM3 (Coach)       DM4 (Sponsor)
C1                        (0, 0, 0.1)       (0, 0.1, 0.3)     (0.3, 0.5, 0.7)   (0, 0, 0.1)
C2                        (0.1, 0.3, 0.5)   (0.9, 1, 1)       (0.7, 0.9, 1)     (0.1, 0.3, 0.5)
C3                        (0.1, 0.3, 0.5)   (0.3, 0.5, 0.7)   (0.7, 0.9, 1)     (0.3, 0.5, 0.7)
C4                        (0.1, 0.3, 0.5)   (0.3, 0.5, 0.7)   (0, 0.1, 0.3)     (0.7, 0.9, 1)
Table 5  Aggregated fuzzy rating (W)

Criteria/fuzzy weights   L fuzzy weight   M fuzzy weight   U fuzzy weight
C1                       0.075            0.15             0.3
C2                       0.45             0.625            0.75
C3                       0.35             0.55             0.725
C4                       0.275            0.45             0.625
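The aggregation in Step 3 appears to be the element-wise mean of the four decision-makers' triangular ratings; the short check below reproduces Table 5 from the entries of Table 4.

```python
import numpy as np

ratings = {  # (L, M, U) triples per decision-maker, taken from Table 4
    "C1": [(0, 0, 0.1), (0, 0.1, 0.3), (0.3, 0.5, 0.7), (0, 0, 0.1)],
    "C2": [(0.1, 0.3, 0.5), (0.9, 1, 1), (0.7, 0.9, 1), (0.1, 0.3, 0.5)],
    "C3": [(0.1, 0.3, 0.5), (0.3, 0.5, 0.7), (0.7, 0.9, 1), (0.3, 0.5, 0.7)],
    "C4": [(0.1, 0.3, 0.5), (0.3, 0.5, 0.7), (0, 0.1, 0.3), (0.7, 0.9, 1)],
}
for criterion, triples in ratings.items():
    print(criterion, np.mean(np.array(triples), axis=0))  # C1 -> [0.075 0.15 0.3], etc.
```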
Table 6  Fuzzy weighted normalized decision matrix (V)

                  Batting average (L, M, U)    Batting strike rate (L, M, U)
Shane Watson      0.0371, 0.0742, 0.1483       0.1886, 0.2619, 0.3143
Andre Russell     0.0352, 0.0703, 0.1407       0.2423, 0.3366, 0.4039
Ravindra Jadeja   0.0318, 0.0635, 0.1271       0.1732, 0.2406, 0.2887
Hardik Pandya     0.0328, 0.0656, 0.1311       0.2125, 0.2952, 0.3542
Ben Stokes        0.0305, 0.0610, 0.1220       0.1818, 0.2525, 0.3030

                  Bowling average (L, M, U)    Bowling strike rate (L, M, U)
Shane Watson      0.1492, 0.2345, 0.3091       0.1240, 0.2029, 0.2818
Andre Russell     0.1375, 0.2161, 0.2848       0.0998, 0.1633, 0.2268
Ravindra Jadeja   0.1548, 0.2433, 0.3208       0.1338, 0.2190, 0.3042
Hardik Pandya     0.1600, 0.2515, 0.3315       0.1163, 0.1904, 0.2644
Ben Stokes        0.1780, 0.2798, 0.3688       0.1371, 0.2244, 0.3117
In this formula, V = X × W, where V = (v_ij) (i = 1, 2, …, 5; j = 1, 2, …, 12) is the fuzzy weighted normalized matrix, X = (x_ij) (i = 1, 2, …, 5; j = 1, 2, …, 4) is the decision matrix, and W = (w_ij) (i = 1, 2, …, 4; j = 1, 2, 3) is the matrix of aggregated fuzzy weights.
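The entries of Table 6 are consistent with vector (Euclidean) normalization of the crisp decision matrix followed by multiplication with the aggregated fuzzy weights; the normalization rule itself is inferred from those values rather than stated explicitly, so the sketch below should be read as a reconstruction.

```python
import numpy as np

X = np.array([[31.08, 139.53, 29.15, 22.05],   # Shane Watson
              [29.47, 179.29, 26.86, 17.75],   # Andre Russell
              [26.62, 128.14, 30.25, 23.80],   # Ravindra Jadeja
              [27.47, 157.23, 31.26, 20.69],   # Hardik Pandya
              [25.55, 134.50, 34.78, 24.39]])  # Ben Stokes  (Table 1)
W = np.array([[0.075, 0.150, 0.300],           # aggregated fuzzy weights per criterion (Table 5)
              [0.450, 0.625, 0.750],
              [0.350, 0.550, 0.725],
              [0.275, 0.450, 0.625]])

R = X / np.sqrt((X ** 2).sum(axis=0))          # vector-normalized decision matrix
V = R[:, :, None] * W[None, :, :]              # fuzzy weighted normalized matrix, shape (5, 4, 3)
print(np.round(V[0, 0], 4))                    # Shane Watson, C1 -> [0.0371 0.0742 0.1483]
```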
Step 5: The fuzzy positive ideal solution (FPIS) A^{k+} and the fuzzy negative ideal solution (FNIS) A^{k-} have then been computed using the following formulas:

A^{k+} = \left( r_1^{k+}, r_2^{k+}, \ldots, r_n^{k+} \right) = \left( \left( \max_i r_{ij}^{k} \mid j \in I \right), \left( \min_i r_{ij}^{k} \mid j \in J \right) \right)   (5)

and

A^{k-} = \left( r_1^{k-}, r_2^{k-}, \ldots, r_n^{k-} \right) = \left( \left( \min_i r_{ij}^{k} \mid j \in I \right), \left( \max_i r_{ij}^{k} \mid j \in J \right) \right)   (6)
where I and J represent benefit and cost criteria, respectively. The resulting values are presented in Table 7.

Step 6: The separation measures S_i^{+} and S_i^{-}, based on the Euclidean distance [17] of each alternative from the FPIS and the FNIS, have been calculated using Formulae (7) and (8) and are presented in Tables 8 and 9:

S_i^{+} = \sum_{i=1}^{n} D(A_i, A^{+}), \quad \text{where } D(A_i, A^{+}) = \sqrt{ \frac{1}{3}\left[ (a_1 - b_1^{+})^2 + (a_2 - b_2^{+})^2 + (a_3 - b_3^{+})^2 \right] }, \ \forall i = 1, 2, \ldots, 5   (7)

and

S_i^{-} = \sum_{i=1}^{n} D(A_i, A^{-}), \quad \text{where } D(A_i, A^{-}) = \sqrt{ \frac{1}{3}\left[ (a_1 - b_1^{-})^2 + (a_2 - b_2^{-})^2 + (a_3 - b_3^{-})^2 \right] }, \ \forall i = 1, 2, \ldots, 5   (8)
Table 7  Positive and negative ideal solutions for each criteria

       C1 (L, M, U)               C2 (L, M, U)
A+     0.0370, 0.0741, 0.1483     0.2423, 0.3365, 0.4038
A−     0.0304, 0.0609, 0.1219     0.1731, 0.2405, 0.2886

       C3 (L, M, U)               C4 (L, M, U)
A+     0.1375, 0.2161, 0.2848     0.0998, 0.1633, 0.2268
A−     0.1780, 0.2798, 0.3688     0.1371, 0.2244, 0.3117
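A small check of Step 5 on two of the criteria: for a benefit criterion the FPIS takes the element-wise maximum fuzzy value over the alternatives (and the FNIS the minimum), while for a cost criterion the roles are swapped. Up to rounding, this reproduces the corresponding entries of Table 7.

```python
import numpy as np

# Fuzzy weighted normalized values from Table 6, rows in the order
# Watson, Russell, Jadeja, Pandya, Stokes.
C2 = np.array([[0.1886, 0.2619, 0.3143], [0.2423, 0.3366, 0.4039], [0.1732, 0.2406, 0.2887],
               [0.2125, 0.2952, 0.3542], [0.1818, 0.2525, 0.3030]])   # benefit criterion
C3 = np.array([[0.1492, 0.2345, 0.3091], [0.1375, 0.2161, 0.2848], [0.1548, 0.2433, 0.3208],
               [0.1600, 0.2515, 0.3315], [0.1780, 0.2798, 0.3688]])   # cost criterion

print("A+ for C2:", C2.max(axis=0), " A- for C2:", C2.min(axis=0))
print("A+ for C3:", C3.min(axis=0), " A- for C3:", C3.max(axis=0))   # smaller is better for cost
```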
Table 8  Separation measures for fuzzy positive ideal solutions for each criteria

For FPIS          C1       C2       C3       C4       Si+
Shane Watson      0        0.0741   0.0188   0.0415   0.1344
Andre Russell     0.0050   0        0        0        0.0050
Ravindra Jadeja   0.0140   0.0953   0.0279   0.0584   0.1957
Hardik Pandya     0.0113   0.0411   0.0362   0.0283   0.1171
Ben Stokes        0.0174   0.0834   0.0652   0.0641   0.2302
Table 9  Separation measures for fuzzy negative ideal solutions for each criteria

For FNIS          C1       C2       C3       C4       Si−
Shane Watson      0.0174   0.0212   0.0463   0.0225   0.1076
Andre Russell     0.0123   0.0953   0.0652   0.0641   0.2370
Ravindra Jadeja   0.0033   0        0.0373   0.0056   0.0463
Hardik Pandya     0.0060   0.0542   0.0289   0.0357   0.1250
Ben Stokes        0        0.0118   0        0        0.0118
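As a spot check of Eq. (7), the C2 separation of Shane Watson from the FPIS can be recomputed from the reconstructed Tables 6 and 7; the result agrees with the corresponding entry of Table 8.

```python
from math import sqrt

def d(a, b):  # Eq. (7)/(8): distance between two triangular fuzzy numbers
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3)

watson_c2 = (0.1886, 0.2619, 0.3143)      # Table 6
fpis_c2 = (0.2423, 0.3365, 0.4038)        # Table 7
print(round(d(watson_c2, fpis_c2), 4))    # -> 0.0741, as in Table 8
```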
Step 7: The closeness coefficient R_i for each considered alternative has been calculated using Eq. (9):

R_i = \frac{D(A_i, A^{-})}{D(A_i, A^{+}) + D(A_i, A^{-})} = \frac{S_i^{-}}{S_i^{+} + S_i^{-}}, \quad \text{where } 0 \le R_i \le 1, \ i = 1, 2, \ldots, 5   (9)

Ranking has been done in decreasing order of magnitude, as shown in Table 10.

Table 10  Ranking result obtained from TOPSIS approach

                  Si+      Si−      Ri       Rank
Shane Watson      0.1344   0.1076   0.4445   3
Andre Russell     0.0050   0.2370   0.9790   1
Ravindra Jadeja   0.1957   0.0463   0.1915   4
Hardik Pandya     0.1171   0.1250   0.5162   2
Ben Stokes        0.2302   0.0118   0.0489   5
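Recomputing the closeness coefficients of Table 10 from the separation measures of Tables 8 and 9 (small rounding differences aside) confirms the ranking:

```python
s_plus = {"Shane Watson": 0.1344, "Andre Russell": 0.0050, "Ravindra Jadeja": 0.1957,
          "Hardik Pandya": 0.1171, "Ben Stokes": 0.2302}
s_minus = {"Shane Watson": 0.1076, "Andre Russell": 0.2370, "Ravindra Jadeja": 0.0463,
           "Hardik Pandya": 0.1250, "Ben Stokes": 0.0118}

closeness = {p: s_minus[p] / (s_plus[p] + s_minus[p]) for p in s_plus}   # Eq. (9)
for rank, player in enumerate(sorted(closeness, key=closeness.get, reverse=True), start=1):
    print(rank, player, round(closeness[player], 4))
```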
3.2 Sensitivity Analysis Initially, the decision-makers have been given equal priority for ranking the alternatives. However, there may be cases where the opinions of the decision-makers are given different priorities. Such cases have been considered in this section. Different priorities \rho_k have been assigned to each of the four decision-makers, where \rho_k > 0, k = 1, 2, 3, 4, and \sum_{k=1}^{4} \rho_k = 1. The distance measures \vartheta_i^{+}, \vartheta_i^{-} and the aggregated closeness coefficient \Phi_i have been computed using Eqs. (10), (11) and (12) and presented in Table 11.
Table 11  Aggregated closeness coefficient and ranking for each candidate

(a) Case 1: ρ1 = 0.4, ρ2 = 0.3, ρ3 = 0.2, ρ4 = 0.1
Players            ϑi+      ϑi−      Φi       Rank
Shane Watson       0.0301   0.0248   0.452    3
Andre Russell      0.002    0.053    0.963    1
Ravindra Jadeja    0.0456   0.0093   0.1704   4
Hardik Pandya      0.0269   0.028    0.5098   2
Ben Stokes         0.0514   0.0035   0.0646   5
Best all-rounder: Andre Russell

(b) Case 2: ρ1 = 0.35, ρ2 = 0.25, ρ3 = 0.23, ρ4 = 0.17
Players            ϑi+      ϑi−      Φi       Rank
Shane Watson       0.0299   0.0259   0.4641   3
Andre Russell      0.0017   0.054    0.9681   1
Ravindra Jadeja    0.0451   0.0107   0.1921   4
Hardik Pandya      0.0274   0.0284   0.5088   2
Ben Stokes         0.0528   0.0029   0.053    5
Best all-rounder: Andre Russell

(c) Case 3: ρ1 = 0.3, ρ2 = 0.28, ρ3 = 0.27, ρ4 = 0.17
Players            ϑi+      ϑi−      Φi       Rank
Shane Watson       0.032    0.02708  0.4578   3
Andre Russell      0.0015   0.0576   0.9742   1
Ravindra Jadeja    0.0472   0.0119   0.2018   4
Hardik Pandya      0.0289   0.0301   0.5102   2
Ben Stokes         0.0558   0.0033   0.0561   5
Best all-rounder: Andre Russell

(d) Case 4: ρ1 = 0.15, ρ2 = 0.2, ρ3 = 0.4, ρ4 = 0.25
Players            ϑi+      ϑi−      Φi       Rank
Shane Watson       0.0327   0.031    0.4867   3
Andre Russell      0.0007   0.063    0.988    1
Ravindra Jadeja    0.0469   0.0168   0.2641   4
Hardik Pandya      0.0315   0.0322   0.5059   2
Ben Stokes         0.0614   0.0023   0.0371   5
Best all-rounder: Andre Russell
\vartheta_i^{+} = \sum_{k=1}^{s} \rho_k C_{ij}^{+}   (10)

and

\vartheta_i^{-} = \sum_{k=1}^{s} \rho_k C_{ij}^{-}   (11)

Also,

\Phi_i = \frac{\vartheta_i^{-}}{\vartheta_i^{+} + \vartheta_i^{-}}, \quad \text{where } 0 \le \Phi_i \le 1, \ i = 1, 2, \ldots, 5   (12)
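A hedged sketch of the prioritized aggregation of Eqs. (10)-(12); the per-decision-maker separation components used below are placeholders, since only the aggregated values of Table 11 are reported.

```python
def aggregated_closeness(priorities, c_plus, c_minus):
    theta_plus = sum(p * c for p, c in zip(priorities, c_plus))    # Eq. (10)
    theta_minus = sum(p * c for p, c in zip(priorities, c_minus))  # Eq. (11)
    return theta_minus / (theta_plus + theta_minus)                # Eq. (12)

rho = [0.4, 0.3, 0.2, 0.1]                                         # Case 1 priorities
print(aggregated_closeness(rho,
                           c_plus=[0.03, 0.02, 0.04, 0.03],        # placeholder components
                           c_minus=[0.02, 0.03, 0.02, 0.02]))
```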
Sensitivity analysis concluded that by assigning different priorities to the opinions of decision-makers, the result of the proposed method remains the same, as Andre Russell came out to be the best all-rounder in all the cases, thereby substantiating the validity and reliability of the proposed method.
4 Conclusion The concept of TOPSIS can be applied to solve real-life problems in fuzzy environments, which have uncertainty problems associated with them. On the basis of past performance record of cricket players [10], the choice of best all-rounder is made by the team of experts. To aid their selection process, an innovative approach, using mathematical modelling, has been proposed in this article. With the help of fuzzy TOPSIS approach, players have been ranked, and eventually, the best all-rounder has been found, from the considered list of players. Because of its capacity to accommodate decision-makers’ hazy opinions and perceptions, the proposed fuzzy TOPSIS method seems to be one of the best ways for tackling MADM challenges. This novel model assists decision-makers in making error-free decisions in a methodical manner, regardless of the multi-criteria field. The results of this research can be further studied in intuitionistic and Pythagorean fuzzy environment, as well as other uncertain, fuzzy situations, in future. Various distance/similarity measures can be considered along with different multi-criteria decision-making techniques, and their comparative study can be done to validate the efficacy/suitability.
References 1. Zadeh LA (1965) Fuzzy sets. Inform Control 8:338–356 2. Bordogna G, Fedrizzi M, Pasi G (1997) A linguistic modeling of consensus in group decision making based on OWA operators. IEEE Trans Syst Man Cybern 27(1):126–132 3. Chen SJ, Hwang CL (1992) Fuzzy multiple attribute decision making. Springer, New York 4. Fodor JC, Roubens M (1994) Fuzzy preference modelling and multicriteria decision support. Kluwer Academic Publisher, Dordrecht 5. Herrera F, Herrera-Viedma E, Verdegay JL (1996) A linguistic decision process in group decision making. Group Decis Negot 5:165–176 6. Kacprzyk J, Fedrizzi M, Nurmi H (1992) Group decision making and consensus under fuzzy preferences and fuzzy majority. Fuzzy Sets Syst 49:21–31 7. Liu XM, Zhao KQ, Wang CB (2009) New multiple attribute decision-making model with triangular fuzzy numbers based on connection numbers. Syst Eng Electron 31:2399–2403 8. Wang JQ, Gong L (2009) Interval probability stochastic multi-criteria decision-making approach based on set pair analysis. Control Decis 24:1877–1880 9. Zhao Y, Zhang L (2014) Application of the set-pair analysis connection number in decisionmaking of black-start vague set. CAAI Trans Intell Syst 9:632–640 10. Huang ZL, Luo J (2015) Possibility degree relation method for triangular fuzzy number-based uncertain multi-attribute decision making. Control Decis 30:1365–1371 11. Hwang CL, Yoon K (1981) Multiple objective decision making—methods and applications: a state-of-the-art survey. Lecture Notes in Economics and Mathematical Systems. Springer, New York 12. Adeel A, Akram M, Koam ANA (2019) Group decision making based on mm-polar fuzzy linguistic TOPSIS method. Symmetry 11(6):735 13. Akram M, Smarandache F (2018) Decision making with bipolar neutrosophic TOPSIS and bipolar neutrosophic. ELECTRE-I. Axioms 7(2):33 14. Akram M, Adeel A (2019) TOPSIS approach for MAGDM based on interval-valued hesitant fuzzy NN-soft environment. Int J Fuzzy Syst 21(3):993–1009 15. Bai Z (2013) An interval-valued intuitionistic fuzzy TOPSIS method based on an Improved score function. Sci World J 1–6 16. Chen CT (2000) Extensions of the TOPSIS for group decision making under fuzzy environment. Fuzzy Sets Syst 1(114):1–9. ISBN 978-3-642-46768-4 17. Chen TY, Tsao CY (2008) The interval-valued fuzzy TOPSIS method and experimental analysis. Fuzzy Sets Syst 159(11):1410–1428 18. Gupta P, Mehlawat MK, Grover N, Pedrycz W (2018) Multi-attribute group decision making based on extended TOPSIS method under interval-valued intuitionistic fuzzy environment. Appl Soft Comput 69:554–567 19. Ananda J, Herath G (2006) Analysis of forest policy using multi-attribute value theory. In: Herath G, Prato T (eds) Using multi-criteria decision analysis in natural resource management. Ashgate Publishing Ltd., Hampshire, pp 11–40 20. Askarifar K, Motaffef Z, Aazaami S (2018) An investment development framework in Iran’s seashores using TOPSIS and best-worst multi-criteria decision-making methods. Decis Sci Lett 7(1):55–64 21. Balioti V, Tzimopoulos C, Evangelides C (2018) Multi-criteria decision making using TOPSIS method under fuzzy environment. Application in spillway selection. Proceeding 2:637. https:// doi.org/10.3390/proceedings211063 22. Boran FE, Genc S, Kurt M, Akay D (2009) A multi-criteria intuitionistic fuzzy group decision making for supplier selection with TOPSIS method. Expert Syst Appl 36:11363–11368 23. Chu TC (2002) Selecting plant location via a fuzzy TOPSIS approach. Int J Adv Manuf Technol 20(11):859–864 24. 
Chu TC (2002) Facility location selection using fuzzy TOPSIS under group decisions. Int J Uncertain Fuzziness Knowledge-Based Syst 10(6):687–701
25. Hung CC, Chen LH (2009) A multiple criteria group decision making model with entropy weight in an intuitionistic fuzzy environment. In: Huang X, Ao SI, Castillo O (eds) Intelligent automation and computer engineering. Lecture Notes in Electrical Engineering, Springer, Dordrecht 26. Kakushadze Z, Raghubanshi R, Yu W (2017) Estimating cost savings from early cancer diagnosis. Data 2:1–16 27. Kore NB, Ravi K, Patil SB (2017) A simplified description of fuzzy TOPSIS method for multi criteria decision making. Int Res J Eng Technol (IRJET) 4(5):1–4 28. Krohling RA, Campanharo VC (2011) Fuzzy TOPSIS for group decision making: a case study for accidents with oil spill in the sea. Expert Syst Appl 38(4):4190–4197 29. Kumar K, Garg H (2018) TOPSIS method based on the connection number of set pair analysis under interval-valued intuitionistic fuzzy set environment. Comput Appl Math 37(2):1319– 1329 30. Lai YJ, Liu TY, Hwang CL (1994) TOPSIS for MODM. Eur J Oper Res 76(3):486–500 31. Li DF, Nan JX (2011) Extension of the TOPSIS for multi-attribute group decision making under Atanassov IFS environments. Int J Fuzzy Syst Appl 1(4):47–61 32. Mahdavi I, Heidarzade A, Sadeghpour-Gildeh B, Mahdavi-Amiri N (2009) A general fuzzy TOPSIS model in multiple criteria decision making. Int J Adv Manuf Technol 45:406–420 33. Nadaban S, Dzitac S, Dzitac I (2016) Fuzzy TOPSIS: a general view. Procedia Comput Sci 91:823–831 34. Vahdani B, Mousavi SM, Tavakkoli MR (2011) Group decision making based on novel fuzzy modified TOPSIS method. Appl Math Model 35:4257–4269 35. Wang YJ, Lee SH (2007) Generalizing TOPSIS for fuzzy multiple-criteria group decisionmaking. Comput Math Appl 53:1762–1772 36. Wang CY, Chen S-M (2017) A new multiple attribute decision making method based on interval valued intuitionistic fuzzy sets, linear programming methodology, and the TOPSIS method. In: Advanced computational intelligence (ICACI), 2017 ninth international conference on, IEEE, pp 260–263 37. Yavuz M (2016) Equipment selection by using fuzzy TOPSIS method, world multidisciplinary earth sciences symposium (WMESS 2016). In: IOP Conf. series: earth and environmental science, vol 44, p 042040. https://doi.org/10.1088/1755-1315/44/4/042040, 1–5 38. Cricketbuzz Profile page: https://www.cricbuzz.com/profiles, last accessed 2021/07/11 39. IPL T-20 homepage. http://premierleaguecricket.in, last accessed on 2021/07/08 40. Van Staden PJ (2009) Comparison of bowlers, batsmen and all-rounders in cricket using graphical displays. Technical Report of University of Pretoria, 08 01. ISBN: 978 1 86854 733 3 41. Parker D, Burns P, Natarajan H (2008) Player valuations in the Indian Premier League. Frontier Economics October 42. Zimmermann HJ (1987) Fuzzy set, decision making and expert system. Kluwer, Boston 43. Zimmermann HJ (1991) Fuzzy set theory—and its application, 2nd edn. Kluwer, Boston 44. Chu TC, Lin YC (2009) An interval arithmetic based fuzzy TOPSIS model. Expert Syst Appl 36:10870–10876
Machine Learning Techniques on Disease Detection and Prediction Using the Hepatic and Lipid Profile Panel Data Ifra Altaf , Muheet Ahmed Butt , and Majid Zaman
Abstract Owing to the high availability of data, machine learning becomes an important technique for exhibiting the human process in the medical field. Liver function test data and lipid profile panel data comprise of many parameters with various values that specify certain evidence for the existence of the disease. The objective of this research paper is to provide a chronological, summarized review of literature with comparative results for the various machine learning algorithms that have been used to detect and predict the diseases from at least one of the attributes from liver function test data or lipid profile panel data. This review is intended to highlight the significance of liver function and lipid level monitoring in patients with diabetes mellitus. The association between LFT data and LPP data with diabetes is presented based on the review of past findings. Data is definitely a challenge, and region-specific medical data can be helpful in terms of the aspects that they can reveal. This review paper can help to choose the attributes required to collect the data and form an appropriate dataset. Keywords NAFLD · Diabetes mellitus · Machine learning · Lipid profile · Deep learning
1 Introduction Data analytics is essential for any kind of industry and plays a much-needed role for healthcare than it may possibly play in other markets. It has the great potential to drive the future of healthcare. In the health care industry, the mining algorithm applied to health data takes a considerable part in prediction as well as diagnosis of disease [1]. Data analytics in health care industry can be used to logically and scientifically analyze the huge amounts of rich complex health data coming from I. Altaf (B) · M. A. Butt Department of Computer Sciences, University of Kashmir, Srinagar, J&K, India M. Zaman IT&SS, University of Kashmir, Srinagar, J&K, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_15
various private and government heterogeneous health sources [2] to arrive at the accurate decision-making and actual prediction of different diseases. The data can be converted into valuable information that can assist in the selection of treatment or presume the different diseases. Based on the historical information, data analytics can help the physicians with accurate and customized diagnosis, and the service can become more patient-centered. The available data can be analyzed to ascertain which practices can help in improving the overall health of the population provided by the health care organizations. Diabetes is associated with a large number of liver disorders including hepatic enzymes dysfunction. Liver plays a vital part in the maintenance of normal glucose levels. The elevated liver enzymes caused from insulin resistance disorder can lead to diabetes [3]. Also, the diabetic patients are at bigger risk to develop lipid abnormalities [4]. The liver and pancreas play a significant role in the lipid metabolism. The raised levels of triglyceride levels arise due to health conditions such as hypothyroidism, diabetes, kidney or liver disease [5]. The abnormal cholesterol levels are caused by the liver function complications. High triglyceride levels can indicate the fatty liver disease [6]. According to the American Heart Association, diabetes often lowers the good cholesterol levels and raises the bad cholesterol levels as well as triglycerides levels. Liver disease causes 2 million deaths approximately per year worldwide. The 11th most common cause of death globally is ascribed to cirrhosis, and the 16th leading cause of death is liver cancer [7]. According to the WHO, diabetes was the seventh leading cause of death in 2016. Also, approximately 2.2 million and 1.6 million deaths were directly caused by diabetes in the years 2012 and 2016, respectively [8]. Since patients endure many laboratory tests regularly, so a lot of data gets generated which can be of valuable use. Many usable patterns can be derived from the data that can help the medical experts with precise and improved diagnosis. The most common laboratory tests include the diabetes, liver, lipid, thyroid and blood profiles. Sometimes while treating a particular disease, the diseases that are contingent to it can go untreated. There can moreover be a circumstance when the diseases are interrelated and one disease can be useful in predicting the onset of other disease. Therefore, the motivation behind writing this review paper with a novel idea is the availability of the data and the connection between the markers of certain diseases.
1.1 Liver-related Diseases and Conditions The liver is an essential organ that has many functions in the body. Besides storing energy, liver helps in digestion of food and detoxification of waste out of the bloodstream. The whole body is affected when the liver does not work well. Liver disease has many causes. It can be due to infection, immune system abnormality, genetics, cancer and other growths. The other causes include the alcohol abuse, drug overdoses and non-alcoholic fatty liver disease. Liver disease symptoms only turn up after possible liver damage has taken place as it is difficult to diagnose in the early stages
due to the subtle symptoms. Some of the liver-related problems are viral hepatitis, neonatal hepatitis, liver cancer, primary biliary cirrhosis, liver cirrhosis, liver fibrosis, alcoholic and non-alcoholic liver damage, cholelithiasis, primary hepatoma, primary sclerosing cholangitis, Wilson disease, hemochromatosis and tyrosinemia [9]. Diagnosis of Liver Disease: It is usually done by a health care professional by means of clinical and diagnostic examination which includes the medical history, drinking and eating habits, over-the-counter medications, physical examination and certain imaging and biochemical tests performed by the medical personnel. The clinical investigation includes the physical examination often involving inspection (visual examination), palpation (determine with fingers the things like swelling), auscultation (listening to body parts using a medical instrument) and percussion (tapping the body parts with fingers). The non-clinical or diagnostic investigation involves certain imaging and blood tests that measure the enzyme levels related to liver in the blood [10] which are useful in evaluating liver function and the extent of damage. Liver Function Test: It is a simple blood test, considered as the initial step to determine the liver functions or liver injury. Liver function tests are also known as the hepatic panel, liver panel or liver function panel. A liver function test is most often recommended to check the liver damage caused by the viral hepatitis, diagnose the viral infectious diseases causing the liver damage and screen for the effect of alcohol on liver, monitor treatment progress and side effects of medications affecting the liver. A blood sample is taken out from the vein of the arm, and one has to fast for 10–12 h before the test. Fundamentally, LFTs are a set of blood tests usually done to monitor the liver’s production progress and its storage capacities. LFT includes liver enzymes such as alanine aminotransferase (ALT), alkaline phosphatase (ALP), aspartate aminotransferase (AST), gamma-glutamyl transpeptidase (GGT) and proteins like albumin, total protein and bilirubin [11].
1.2 Lipid Metabolism Disorders and Conditions Lipids, also known as lipoproteins, are the structural constituents of the cell membrane. They play a very important role in our body. They aid in proper digestion and absorption of food thereby providing energy to the body. The alternative word for lipids is “fat” [12] that produces hormones in our body. Lipids include lowdensity lipoproteins (LDL) also known as “bad” cholesterol, high-density lipoproteins (HDL) also known as “good” cholesterol and triglycerides. Various lipid disorders include dyslipidemia, hyperlipidemia, hypercholesterolemia, hypertriglyceridemia and hyperlipoproteinemia [13]. The imbalance in lipid levels can cause atherosclerosis, obesity, cardiovascular disease, heart attack or stroke, diabetes. Diagnosis of Lipid Disorders: The abnormality in lipid levels may take place with aging, or with inherited disorders, drug exposure, influence of other diseases, or by being physically inactive or being obese [14]. Usually, the lipid disorders do not reflect any symptoms. Though certain types of lipid disorders or when the cholesterol
levels are abnormally high, yellowish nodules may develop around tendons and joints. The lipid disorder may also cause elevated yellowish patches above the eyelid or white arcs around the cornea of the eye [15]. The diagnosis of lipid disorders can be done by the measurement of fasting serum cholesterol, triglyceride, HDL and LDL levels. There are several diagnostic tests that can be used to accurately diagnose lipid disorders. These include cholesterol test also known as the lipid profile panel (LPP), advanced lipid profile testing, lipoprotein(a) test, coronary artery calcium (CAC) scan/scoring also known as the heart scan, exercise stress testing, vascular laboratory [16]. Lipid Profile Panel (LPP): It is also known as a complete cholesterol test. Lipid profile measures the levels of lipids in the blood. It aids in evaluating the risk of developing heart attacks, stroke and other heart and blood vessel diseases by assisting with the detection of the risk of fatty deposits in the arteries, a condition where arteries narrow down (atherosclerosis) and reduce the blood flow. It serves as a preliminary screening step to determine the abnormalities in lipids particularly the cholesterol and triglycerides. A lipid profile generally includes the tests such as total cholesterol (TC), high-density lipoprotein (HDL), low-density lipoprotein cholesterol (LDL), triglycerides (TG), very low-density lipoprotein (VLDL), total cholesterol and highdensity lipoprotein ratio (TC/HDL) [17].
1.3 Derangement of Enzyme Levels in Diabetes Diabetes refers to a chronic condition or a metabolic disease that results in too much sugar in the blood. Diabetes is a group of diseases linked to abnormal high levels of blood glucose in the blood. The insulin is produced by the pancreas that helps to lower the blood glucose. Diabetes is caused by the deficiency in the production of insulin or by the inability of the body to properly use insulin. Serious problems in diabetic patients can be caused by the improper blood glucose control such as heart and kidney disease, strokes and blindness [18]. Derangement of Liver Enzymes in Diabetes Mellitus: Liver is recognized as the insulin-sensitive organ. Diabetes increases the risk of a condition known as nonalcoholic fatty liver disease (NAFLD) where excess fat builds up in the liver. The liver is responsible for regulating blood sugar, and the presence of excess fat can make it less receptive to insulin. This can result in accumulation of much glucose in the blood that can cause the diabetes mellitus [19]. NAFLD can occur in at least half of those with type 2 diabetes. The liver enzymes are typically unbalanced in diabetic people and act as the pointers of hepatocellular damage [20]. Derangement of Lipid Levels in Diabetes Mellitus: Diabetes has a tendency to lower “good” cholesterol levels and elevate triglyceride and “bad” cholesterol levels. This can elevate the risk for heart diseases and stroke. The lipid abnormalities in patients with diabetes are usually categorized by high total cholesterol, high triglycerides and low-/ high-density lipoprotein cholesterol. However, the low-density lipoprotein cholesterol levels may be normal or reasonably increased [21].
2 Related Works Clinical diagnostic and predictive models are becoming increasingly prevalent. Various data mining and machine learning models have been proposed, and most of them have obtained high predictive accuracies. Supervised algorithms play the major role in diagnosing the presence of disease and can potentially replace the involvement of physicians by refining disease prediction accuracy.
2.1 Analysis of State-of-the-art Techniques in Liver Disease Detection and Prediction Using LFT Data Several research techniques have been proposed on liver disease diagnosis and prognosis using the LFT dataset. Many of the machines learning techniques demonstrate a good diagnostic and prediction accuracy. Hashem et al. [22] used artificial neural network (ANN) to simulate the nonlinear relation between fibrosis grades and biomarkers. Stoean et al. [23] used the evolutionary-driven support vector machines (EDSVM). To impute the missing values, mean substitution method is used and then the data is normalized. The study tried to hybridize the learning component within support vector machines and the evolutionary approach. Adeli et al. [24] proposed new fusion algorithm of genetic algorithm (GA) for the feature selection and the adaptive network-based fuzzy inference system (ANFIS) for the classification purpose in MATLAB. Seera et al. [25] developed a hybrid intelligent system by combining the fuzzy min–max neural network (FMM), classification and regression tree (CART) and random forest model (RF) to diagnose and predict the liver disorders. Vijayarani et al. [26] implemented the naïve Bayes and support vector machine (SVM) on liver disease dataset in MATLAB. Tapas et al. [27] attempted to use the various statistical decision trees and the multilayer perceptron algorithms where the latter gave the best classification accuracy. Ghosh et al. [28] considered the Bayesian classifier, lazy classifier, function classifier, meta-classifier and tree classifier implemented in Weka to classify the disease. Nahar et al. [29] experimented with the seven decision tree Weka for investigating the early prediction of liver disease. The experimental results showed that the decision stump provided the highest accuracy. Liaqat et al. [30] proposed an ensemble model that used dimensionality reduction and genetically optimized support vector machine to predict the hepatocellular carcinoma. The model consisted of three algorithms, namely linear discriminant analysis (LDA) which was used for dimensionality reduction, support vector machine (SVM) which was used for classification and genetic algorithm (GA) which was used for SVM optimization giving rise to an intelligent prediction system. The missing values were imputed using the SimpleImputer class of the scikit-learn library of Python. Pei, et al. [31] used the feature importance exploration to demonstrate the new features for predicting the possibility of fatty liver disease.
2.2 Existing Machine Learning Techniques in Lipid Metabolism Disorders Detection and Prediction Using LPP Data as a Risk Factor The constituent tests of LPP have been used as risk factors or predictor variables in many lipid-related disease prediction and detection machine learning techniques. Parthiban et al. [32] predicted the probabilities of getting heart disease among diabetic patients by using the naïve Bayes method in Weka. Medhekar et al. [33] while using naïve Bayes on the Cleveland heart disease dataset tried to predict the cardiovascular disease. The accuracy improved by increase in the size of the training data. Ziasabounchi et al. [34] proposed adaptive neuro-fuzzy interface system (ANFIS) to classify the heart disease data. Perveen et al. [35] put forward an ensemble techniques of bagging and AdaBoost using J48 (Ensemble J48) as a base learner and implemented it in Weka to predict diabetes. Fulvia et al. [36] tried to predict the chronic damage in systemic lupus erythematosus using the recurrent neural networks (RNN). Arianna et al. [37] predicted to diabetes complications by using the random forest (RF) to deal with missing data and also handled the class imbalance problem with suitable approaches. Naushad et al. [38] developed an ensemble machine learning algorithms (EMLA) risk prediction model in order to forecast the risk of coronary artery disease. Dinh et al. [39] developed a weighted ensemble model by using multiple supervised machine learning models to predict cardiovascular diseases. The key variables were identified using the information gain of tree-based models. Sivakumar et al. [40] used C4.5 algorithm for the prediction of chronic liver disease in humans. Leon et al. [41] attempted to detect the type 2 diabetes at an early stage by using the five machine learning prediction models out of which linear regression model (LR) was the most successful. Kavitha et al. [42] applied the random forest (RF) and decision tree (DT) algorithms separately to predict the heart disease and then compared the result with their hybrid (RF-DT), and the latter achieved the higher accuracy comparatively.
2.3 Literature Review for Occurrence of Abnormal Liver Function Tests in Diabetes Mellitus The previous literature work considers that the increased activities of liver enzymes are indicators of hepatocellular injury as determined by the review. In 2011, Smith et al. [43] showed the co-existence of non-alcoholic fatty liver disease (NAFLD) and diabetes mellitus. The authors showed that type 2 diabetes mellitus increases the risk of cirrhosis and is a risk factor for non-alcoholic steatohepatitis. In 2012, Abbasi et al. [44] showed an improvement in the prediction for medium-term risk of incident diabetes with the addition of LFTs. In 2013, Xuan Guo et al. [45] observed that the type 2 diabetes mellitus increases the risk of HCV infection. In 2014, HyeRan et al. [46] analyzed the association between liver enzymes and risk of type 2
diabetes using logistic regression models. In 2015, Sun-Hye et al. [3] showed that the higher levels of GGT and ALT and lower levels of AST/ALT are the risk factors of impaired fasting glucose. In 2016, Ye-Li Wang et al. [47] showed that the increased risk of diabetes is associated with the higher levels of GGT and ALT, whereas no relationship was witnessed for AST, ALP and LDH for determining the risk of type 2 diabetes. In 2017, Nirajan et al. [48] alleged that the type 2 diabetes mellitus involves hepatic enzyme derangement. In 2018, Amrendra et al. [49] used the unpaired t-test, chi-square/Fisher’s exact test and the Pearson/Spearman correlation test to show the association and correlation between the liver enzymes of diabetic dataset. Statistical Package for Social Sciences (SPSS) was used to examine the data. In 2019, Aditi et al. [50] found that in type-2 diabetic dataset, the elevated AST, ALT, ALP and bilirubin had the frequencies of 59.3%, 52.6%, 42.1% and 31.5%, respectively. In 2020, Shiful et al. [51] used the multinomial logistic regression analysis that showed the mean concentrations of ALT, AST, ALP and GGT serum were considerably higher in the diabetic group as compared to the non-diabetes group. In 2021, Blomdahl et al. [52] found that the liver disease patients having diabetes and moderate alcohol intake are at a higher risk of getting fibrosis.
2.4 Literature Review for Incidence of Abnormal Lipid Levels in Diabetes Mellitus Some of the recent studies show that the lipid panel irregularities are common in people with type 2 diabetes and pre-diabetes. In 2011, Khursheed et al. [53] showed that the elevated and deranged TG, LDL, HDL and TC are the markers for lipid profile abnormalities in diabetic people of Pakistan. In 2012, Singh et al. [54] showed that 59% of the diabetic records were having high TC levels, and 98% were having increased LDL levels, while 89% of the records were having lower HDL level. In 2013, Daniel et al. [55] found that 50.4% of the Ghana’s dyslipidemia patients with abnormal lipid parameters and diabetes had abnormal HDL, while 17.3% had abnormal TC, HDL and LDL. In 2014, Ozder [56] analyzed the lipid profile of diabetic population. Unpaired student’s t-test and Pearson’s correlation were utilized to get the association between the glucose levels and lipid profile. The results showed a positive correlation of fasting blood glucose with TC and TG. In 2015, Sultania et al. [57] used the mean, standard deviation, independent t-test and correlation (Pearson’s) test to show a highly remarkable difference in mean HDL as well as mean TG levels in diabetic and non-diabetic patients. In 2016, Habiba et al. [58] used the logistic regression model and chi-square tests to evaluate the relationship of blood lipid profile and demographic variables. In 2018, Bhowmik et al. [59] performed the analysis of covariance (ANCOVA) and regression analysis on the diabetic dataset. The diabetic population was considerably associated with high TC, high TG and low HDL. In 2019, Majid et al. [4] performed a multidisciplinary study and analyzed the data using the SPSS version 22. The study showed the abnormalities that included high
serum TG, high serum TC, low HDL and high LDL in 58.1%, 61.9%, 44.8% and 53.3% of the dataset, respectively. In 2021, Tham et al. [60] used logistic regression models and found that plasma lipids can assist in detecting and predicting atrial fibrillation in diabetic people, in addition to the traditional risk factors.
3 Comparative Review of Various Machine Learning Algorithms Using LFT and LPP for Diagnosing and Predicting Diseases Based on the accuracies and other performance parameters, the comparative exploration of various machine learning techniques for the detection and prediction of chronic liver disease and lipid-related diseases, using the predictive variables from the liver function test and lipid profile panel, has been outlined chronologically in Tables 1 and 2, respectively. Figures 1 and 2 give a visual illustration of the accuracies of the machine learning algorithms used for the prediction of diseases from hepatic profile panel and lipid profile panel data, respectively.
3.1 Discussion The liver disorder dataset and the lipid-related disease dataset have been used in various experiments that comprised either the complete or the constituent test values of the liver function panel and lipid profile panel, together with other clinical, demographic and diagnostic biochemical attributes, to diagnose and predict the associated diseases. Shahwan et al. [61] performed a study that confirmed a substantial link between abnormal hepatic enzymes and lipid levels in diabetic patients. In this survey, the technique with the highest accuracy for predicting liver-related diseases using LFT data is KStar (98.50%), while the technique with the lowest accuracy is EDSVM (64.07%). The survey further reveals that the technique with the maximum accuracy for predicting lipid-related diseases using at least one of the risk factors from LPP data is C4.5 (94.36%). This review puts emphasis on the association of diabetes mellitus with liver function abnormalities as well as abnormal lipid levels. The study rationalizes a concrete pathological connection between the liver and diabetes and shows that diabetes can elevate the liver enzymes, mostly the ALT, AST and GGT levels in blood. The study also shows that there can be high cholesterol, high triglycerides, low HDL levels and an abnormal TC/HDL ratio in the blood due to diabetes.
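To make the comparison summarised in Tables 1 and 2 concrete, the sketch below shows how classifiers of the kind surveyed here could be compared on a liver-panel dataset. It is only an illustration: the file name and column names are hypothetical, and KStar and EDSVM have no direct scikit-learn equivalents, so a decision tree and an SVM stand in for them.

```python
# Hedged sketch: cross-validated comparison of classifiers on a hypothetical LFT dataset.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("lft_dataset.csv")                 # hypothetical liver-panel file
X = df[["AST", "ALT", "ALP", "GGT", "total_bilirubin", "albumin"]]
y = df["liver_disease"]                             # 0 = healthy, 1 = disease

models = {
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```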
Table 1 Comparison table on existing machine learning techniques for classifying LFT data
Author, year | Risk factors from LFT | Disease dealt | Techniques used | Accuracy (%)
Hashem et al. 2010 | ALP, ALT/SGPT, AST/SGOT, albumin and total bilirubin | Hepatic fibrosis | ANN | 93.7
Stoean et al. 2011 | AST, ALT, ALP, GGTP and total bilirubin | Liver fibrosis in chronic hepatitis C | EDSVM | 64.07
Adeli et al. 2013 | ALP, ALT, AST and GGTP | Hepatitis | GA-ANFIS | 97.44
Seera et al. 2014 | ALP, ALT, AST and GGTP | Liver disorder | FMM-CART-RF | 95.01
Vijayarani et al. 2015 | AST, ALT, ALP, GGTP, total bilirubin, direct bilirubin, total proteins, albumin and A:G ratio | Liver disorder | SVM | 79.66
Tapas et al. 2016 | ALP, ALT, AST and GGTP | Cirrhosis, acute hepatitis, bile duct disease, liver cancer | MLP | 71.59
Ghosh et al. 2017 | AST, ALT, ALP, GGTP, total bilirubin, direct bilirubin, albumin and A:G ratio | Liver disorder | KStar | 98.5
Nahar et al. 2018 | AST, ALT, ALP, GGTP, total bilirubin, direct bilirubin, albumin and A:G ratio | Liver disorder | Decision stump | 70.67
Liaqat et al. 2020 | AST, ALT, ALP, GGTP, total bilirubin, direct bilirubin, albumin and total proteins | Hepatocellular carcinoma | LDA-GA-SVM | 90.30
Pei et al. 2021 | AST, ALT, ALP, GGTP and bilirubin | Fatty liver | XGBoost | 94.15
Table 2 Comparison table on existing machine learning techniques using LPP as risk factor
Author, year | Risk factors from LPP | Disease dealt | Techniques used | Performance
Parthiban et al. 2011 | Cholesterol | Heart disease for diabetic patients | Naïve Bayes | Accuracy = 74%
Medhekar et al. 2013 | Cholesterol | Cardiovascular diseases | Naïve Bayes | Accuracy = 88.96%
Ziasabounchi et al. 2014 | Cholesterol | Heart diseases | ANFIS | Accuracy = 92.30%
Perveen et al. 2016 | HDL and triglycerides | Diabetes | Ensemble J48 | –
Fulvia et al. 2017 | Total cholesterol and triglycerides | Chronic damage in systemic lupus erythematosus | RNN | AUC = 0.77
Arianna et al. 2018 | Cholesterol and triglycerides | Diabetes complications | RF-LR | Accuracy = 83.8%
Naushad et al. 2018 | LDL cholesterol and triglycerides | Coronary artery diseases | EMLA | Accuracy = 89.3%
Dinh et al. 2019 | LDL and HDL | Cardiovascular diseases | Weighted ensemble model | Accuracy = 83.9%
Sivakumar et al. 2019 | Bad cholesterol level | Chronic liver disease | C4.5 | Accuracy = 94.36%
Leon et al. 2020 | LDL, HDL, total cholesterol and triglycerides | Type 2 diabetes | LR | RMSE = 0.838
Kavitha et al. 2021 | Cholesterol | Heart diseases | RF-DT | Accuracy = 88.7%
Fig. 1 Performance of various machine learning techniques based on their accuracy using LFT data
Fig. 2 Performance of various machine learning techniques based on their accuracy using LPP data
4 Research Gaps The research study conducted helped us refine a research question and inspired a new research idea, besides creating a greater understanding of the topic. The research question raised is: “Can we detect and predict one disease from the markers of another disease?” In our case this question can be reframed as: can we detect or predict diabetes from the hepatic panel and lipid panel biomarkers? Although the existing literature reports an association between abnormal serum hepatic enzymes and lipid levels in patients with diabetes mellitus, no machine learning technique has been implemented to assess the predictive effectiveness of the liver enzymes and lipid levels for diabetes detection or prediction. Furthermore, the research studies have mostly used data from online repositories, whereas results from the same dataset cannot be assumed appropriate for all geographical regions: geography affects health, and medical data is region dependent.
5 Pragmatic Implications and Recommendations The practical inferences and recommendations of this research study suggest that, since there is a relationship between liver enzymes, lipid levels and diabetes, a dataset that contains the attributes from the liver function test and lipid profile panel as biomarkers should be collected and then analyzed to obtain interesting and useful patterns from the data in order to predict diabetes. There is also a need to collect real-world datasets from other regions, which can give better critical insights based on the geographical area.
6 Contribution The main contribution of this paper is a novel perspective of predicting one disease from the markers of another disease or diseases. The study ascertains the association of liver function test data and lipid profile test data with diabetes on the basis of a review of past findings, so as to explore new laboratory test indicators of the disease in addition to the traditional markers. This study can help with the collection and formation of a disease dataset by choosing the appropriate markers from the liver and lipid profiles that have an impact on diabetes. Our approach can help to build a good surveillance system that can be of foremost public health prominence: diseases or markers linked to other diseases can be observed, and information about the occurrence of other diseases can be provided ahead of time.
7 Conclusion and Future Work The accuracy of the models depends upon the type of dataset and the preprocessing techniques applied to it. In order to generate good prediction results, the quality of the data can be enhanced by choosing impactful and minimal predictive attributes; this can be done by using proper feature selection methods. This comparative study concentrates on the association between the derangement of liver enzymes and diabetes as well as between imbalance in lipid levels and diabetes. There is a relationship between liver enzymes and lipids too. Researchers consider that liver problems and lipid metabolism disorders lead to diabetes and vice versa; therefore, deranged LFTs and LPPs can be used to predict diabetes. Since the accuracy of machine learning algorithms is highly influenced by the quality of the dataset, future work will focus on collecting a high-quality LFT and LPP dataset, verified by expert physicians. The predictive effectiveness of the liver enzymes and lipid levels for diabetes will be validated using data mining and machine learning techniques. The dataset can also be tested using deep learning algorithms, and a comparative study between the conventional and deep learning approaches can be carried out.
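As a concrete illustration of the feature-selection step mentioned in the conclusion, the following sketch ranks LFT and LPP markers by mutual information with a diabetes label. The file name and column layout are hypothetical placeholders, not an existing dataset.

```python
# Hedged sketch: ranking LFT + LPP markers for a (hypothetical) diabetes-prediction dataset.
import pandas as pd
from sklearn.feature_selection import SelectKBest, mutual_info_classif

df = pd.read_csv("lft_lpp_diabetes.csv")            # hypothetical combined dataset
X = df.drop(columns=["diabetes"])                   # LFT and LPP markers
y = df["diabetes"]                                  # 0 = non-diabetic, 1 = diabetic

selector = SelectKBest(mutual_info_classif, k=5).fit(X, y)
ranked = sorted(zip(X.columns, selector.scores_), key=lambda p: p[1], reverse=True)
print("Most informative markers:", [name for name, _ in ranked[:5]])
```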
References 1. Ranitha S, Vydehi S (2017) Data mining in healthcare datasets. Int J Eng Dev Res (IJEDR) 5(4):84–86. ISSN:2321-9939. Available at :http://www.ijedr.org/papers/IJEDR1704014.pdf 2. Cerquitelli T, Baralis E, Morra L, Chiusano S (2016) Data mining for better healthcare: a path towards automated data analysis? In: 2016 IEEE 32nd international conference on data engineering workshops (ICDEW), pp 60–63 3. Ko S-H et al (2015) Increased liver markers are associated with higher risk of type 2 diabetes. World J Gastroenterol: WJG 21(24):7478
4. Majid MA et al (2019) A study on evaluating lipid profile of patients with diabetes mellitus. Int J Community Med Public Health 6(5):1869 5. Endocrinology, Diabetes, and Metabolism. https://www.hopkinsmedicine.org/endocrino logy_diabetes_metabolism/patient_care/conditions/lipid_disorders.html 6. What High Triglycerides Can Do to You, WebMD. https://www.webmd.com/cholesterol-man agement/result 7. Asrani SK et al (2019) Burden of liver diseases in the world. J Hepatol 70(1):151–171 8. Diabetes, World Health Organization. https://www.who.int/news-room/fact-sheets/detail/dia betes 9. Singh A, Pandey B (2014) Intelligent techniques and applications in liver disorders: a survey. Int J Biomed Eng Technol 16(1):27–70 10. Koch A (2007) Schiff’s diseases of the liver—10th Edition. J Am Coll Surg 205(5):e7. ISSN 1072-7515 11. Liver Panel, Lab Tests Online. https://labtestsonline.org/tests/liver-panel 12. Medical Definition of Lipid, MedicineNet. https://www.medicinenet.com/lipid/definition.htm 13. Lipid disorders. https://www.amboss.com/us/knowledge/Lipid_disorders 14. Overview of Cholesterol and Lipid Disorders. https://www.msdmanuals.com/home/hormonaland-metabolic-disorders/cholesterol-disorders/overview-of-cholesterol-and-lipid-disorders 15. Symptoms of Lipid Disorders. https://www.winchesterhospital.org/health-library/article?id= 19746 16. Diagnosing Lipid Disorders. https://www.froedtert.com/preventive-cardiology-lipid-therapy/ diagnostics 17. Lipid Profile, MedLife. https://labs.medlife.com/amp/lipid-profile-test-in-Bid 18. Ali RE et al (2019) Prediction of potential-diabetic obese-patients using machine learning techniques 19. The Hidden Risk of Liver Disease From Diabetes, WebMD. https://www.webmd.com/diabetes/ diabetes-liver-disease-hidden-risk 20. Gowda KMD (2014) Evaluation of relationship between markers of liver function and the onset of type 2 diabetes. J Health Sci 4 21. Santos-Gallego CG, Rosenson RS (2014) Role of HDL in those with diabetes. Curr Cardiol Rep 16(9):1–14 22. Hashem AM et al (2010) Prediction of the degree of liver fibrosis using different pattern recognition techniques, In: 2010 5th Cairo international biomedical engineering conference. IEEE 23. Stoean R et al (2011) Evolutionary-driven support vector machines for determining the degree of liver fibrosis in chronic hepatitis C. Artif Intell Med 51(1):53–65 24. Adeli M, Bigdeli N, Afshar K (2013) New hybrid hepatitis diagnosis system based on genetic algorithm and adaptive network fuzzy inference system. In: 2013 21st Iranian conference on electrical engineering (ICEE). IEEE 25. Seera, Manjeevan, and Chee Peng Lim. “A hybrid intelligent system for medical data classification.“ Expert Systems with Applications 41.5 (2014): 2239–2249 26. Vijayarani S, Dhayanand S (2015) Liver disease prediction using SVM and Naïve Bayes algorithms. Int J Sci Eng Technol Res (IJSETR) 4(4):816–820 27. Baitharu TR, Pani SK (2016) Analysis of data mining techniques for healthcare decision support system using liver disorder dataset. Procedia Comput Sci 85:862–870 28. Ghosh SR, Waheed S (2017) Analysis of classification algorithms for liver disease diagnosis. J Sci Technol Environ Inform 5(1):360–370 29. Nahar N, Ara F (2018) Liver disease prediction by using different decision tree techniques. Int J Data Mining Knowl Manage Process 8(2):01–09 30. Ali L et al (2020) LDA–GA–SVM: improved hepatocellular carcinoma prediction through dimensionality reduction and genetically optimized support vector machine. Neural Comput Appl:1–10 31. 
Pei X et al (2021) Machine learning algorithms for predicting fatty liver disease. Ann Nutr Metab 77(1):38–45
32. Parthiban G, Rajesh A, Srivatsa SK (2011) Diagnosis of heart disease for diabetic patients using naive bayes method. Int J Comput Appl 24(3):7–11 33. Medhekar DS, Bote MP, Deshmukh SD (2013) Heart disease prediction system using naive Bayes. Int J Enhanced Res Sci Technol Eng 2(3) 34. Ziasabounchi N, Askerzade I (2014) ANFIS based classification model for heart disease prediction. Int J Electr Comput Sci IJECS-IJENS 14(02):7–12 35. Perveen S et al (2016) Performance analysis of data mining classification techniques to predict diabetes. Procedia Comput Sci 82:115–121 36. Ceccarelli F et al (2017) Prediction of chronic damage in systemic lupus erythematosus by using machine-learning models. PloS one 12(3):e0174200 37. Dagliati A et al (2018) Machine learning methods to predict diabetes complications. J Diabetes Sci Technol 12(2):295–302 38. Naushad SM et al (2018) Machine learning algorithm-based risk prediction model of coronary artery disease. Mol Biol Reports 45(5):901–910 39. Dinh A et al (2019) A data-driven approach to predicting diabetes and cardiovascular disease with machine learning. BMC Med Inform Decis Making 19(1):1–15 40. Sivakumar D et al (2019) Chronic liver disease prediction analysis based on the impact of life quality attributes. Int J Recent Technol Eng 41. Kopitar L et al (2020) Early detection of type 2 diabetes mellitus using machine learning-based prediction models. Sci Reports 10(1):1–12 42. Kavitha M et al (2021) Heart disease prediction using hybrid machine learning model. In: 2021 6th international conference on inventive computation technologies (ICICT). IEEE 43. Smith BW, Adams LA (2011) Nonalcoholic fatty liver disease and diabetes mellitus: pathogenesis and treatment. Nat Rev Endocrinol 7(8):456–465 44. Abbasi A et al (2012) Liver function tests and risk prediction of incident type 2 diabetes: evaluation in two independent cohorts. PloS one 7(12):e51496 45. Guo X et al (2013) Type 2 diabetes mellitus and the risk of hepatitis C virus infection: a systematic review. Sci Reports 3(1):1–8 46. Ahn H-R et al (2014) The association between liver enzymes and risk of type 2 diabetes: the Namwon study. Diabetol Metab Syndr 6(1):1–8 47. Wang Y-L et al (2016) Association between liver enzymes and incident type 2 diabetes in Singapore Chinese men and women. BMJ Open Diabetes Res Care 4(1) 48. Shrestha N et al (2017) Hepatic involvement with elevated liver enzymes in Nepalese subjects with type 2 diabetes mellitus. Int J Biochem Res Rev 16:1–8 49. Mandal A et al (2018) Elevated liver enzymes in patients with type 2 diabetes mellitus and non-alcoholic fatty liver disease. Cureus 10(11) 50. Singh A et al (2019) Deranged liver function tests in type 2 diabetes: a retrospective study 51. Islam S et al (2020) Prevalence of elevated liver enzymes and its association with type 2 diabetes: a cross-sectional study in Bangladeshi adults. Endocrinol Diabetes Metab 3(2):e00116 52. Blomdahl J et al (2021) Moderate alcohol consumption is associated with advanced fibrosis in non-alcoholic fatty liver disease and shows a synergistic effect with type 2 diabetes mellitus. Metabolism 115:154439 53. Uttra KM et al (2011) Lipid profile of patients with diabetes mellitus (a multidisciplinary study). World Appl Sci J 12(9):1382–1384 54. Singh G, Kumar AK (2012) A study of lipid profile in type 2 diabetic Punjabi population. J Exerc Sci Physiotherapy 8(1):7 55. Tagoe DNA, Amo-Kodieh P (2013) Type 2 diabetes mellitus influences lipid profile of diabetic patients. Ann Biol Res 4(6):88–92 56. 
Ozder A (2014) Lipid profile abnormalities seen in T2DM patients in primary healthcare in Turkey: a cross-sectional study. Lipids Health Dis 13(1):1–6 57. Sultania S, Thakur D, Kulshreshtha M (2017) Study of lipid profile in type 2 diabetes mellitus patients and its correlation with HbA1c. Int J Contemp Med Res 4(2):2454–7379 58. Habiba NM et al (2016) Correlation of lipid profile and risk of developing type 2 diabetes mellitus in 10–14 year old children. Cell Physiol Biochem 39(5):1695–1704
59. Bhowmik B et al (2018) Serum lipid profile and its association with diabetes and prediabetes in a rural Bangladeshi population. Int J Environ Res Public Health 15(9):1944 60. Tham YK et al (2021) Novel lipid species for detecting and predicting atrial fibrillation in patients with type 2 diabetes. Diabetes 70(1):255–261 61. Shahwan MJ et al (2019) Association between abnormal serum hepatic enzymes, lipid levels and glycemic control in patients with type 2 diabetes mellitus. Obesity Med 16:100137 62. Miller RA, Pople HE Jr, Myers JD (1982) INTERNIST-1, an experimental computer-based diagnostic consultant for general internal medicine. New Engl J Med 307:468–476 63. Baxt WG (1990) Use of an artificial neural network for data analysis in clinical decisionmaking: the diagnosis of acute coronary occlusion. Neural Comput 2(4):480–489
Performance Analysis of Machine Learning Algorithms for Website Anti-phishing N. Mohan Krishna Varma, Y. C. A. Padmanabha Reddy, and C. Rajesh Kumar Reddy
Abstract Phishing has become the main hazard to most web users, and website phishing causes people to lose millions of dollars every year. In today’s world, most files are placed on the web, and the security of these files is not guaranteed; phishing makes it easier to steal such data. One simple approach is not sufficient to solve this problem. This paper provides an overview of different anti-phishing techniques that use machine learning to address website phishing. Machine learning is a technique of learning from experience and has different paradigms, such as supervised, unsupervised, semi-supervised, and reinforcement learning. This paper follows the supervised learning approach, which is used in classification and regression, to provide a solution to the website phishing problem. The comparison of the accuracy levels of these anti-phishing techniques is discussed in this paper. Keywords Website phishing · Anti-phishing · Machine learning
1 Introduction Phishing is a cyberattack that uses anonymous email as a weapon. Phishers use a wide variety of technologies. Their main goal is to trick and loot the people. Various techniques can be used for finding phishing attacks. One of them is detecting the spoof emails. By this way, we can communicate to users whether spoofing is true or not. Another one is blacklist approach, which identifies phishing URL using heuristics [1]. Whitelist approach detects phishing based on the users online. Heuristic approach uses confidence weighted learning algorithm. Spam emails contain phishing, fake lottery emails, fake job opportunities, and fake advertising email. Comparisons approach checks whether the website is legitimate or not through the result of the N. M. K. Varma · C. Rajesh Kumar Reddy Department of CSE, Madanapalle Institute of Technology & Science, Madanapalle, AP, India Y. C. A. Padmanabha Reddy (B) Department of CSE, B V Raju Institute of Technology, Narsapur, Telangana,, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_16
search engine, including many supervised machine learning approaches like logistic regression [2], random forest, decision trees, and support vector machine algorithms. This paper has following sections. Section 2 explains supervised machine learning methods for anti-phishing, Sect. 3 explains decision tree for anti-phishing, Sect. 4 explains random forest algorithm for anti-phishing, Sect. 5 explains support vector machine for anti-phishing, Sect. 6 explains logistic regression for anti-phishing, results are discussed in Sect. 7, and conclusions are given in Sect. 8.
2 Supervised Machine Learning Methods for Anti-phishing Making a machine learn by itself is called machine learning. Different kinds of machine learning approaches are available, such as supervised learning, reinforcement learning, and unsupervised learning. In supervised learning [3], labelled datasets are fed to the algorithm; the output is discrete in a classification algorithm and continuous in a regression algorithm, whereas in unsupervised learning, the machine learns by itself by forming clusters and using frequent pattern mining or association rule mining. Clustering aims to find cohesive groups. Reinforcement learning is used to control a system’s behavior. Here, we used four supervised learning approaches, namely logistic regression [2], random forest, decision trees, and support vector machines (SVM), for anti-phishing. The decision tree algorithm [4, 5] looks like an inverted tree: it has a root node (variable) and leaf nodes (outputs), and the link between the root node and the leaf nodes is taken as the decision (yes or no). A common way of constructing a decision tree is the ID3 algorithm, which uses information gain and entropy for a given dataset. The ID3 algorithm finds the best variable, which classifies the dataset effectively, assigns the selected variable [4] as the root node, constructs the leaf nodes for each value of the variable, assigns the classifications (yes/no) for each leaf node, stops if the data is perfectly classified, and otherwise iterates on the tree. Entropy measures the purity of the split of an attribute. Information gain [6] is calculated after the split of the attribute, based on the decrease in entropy. Random forest [7] solves both regression and classification problems. It creates various decision trees at random, takes the prediction results from all the decision trees and finds the best solution. The main improvement in random forest is the reduced training time, and it is very accurate. It is also used effectively in intrusion detection systems [8], which play an important role in security. The support vector machine (SVM) algorithm was proposed by Vapnik [9, 10]. SVM solves regression and classification problems, mostly classification problems [10]. The process of SVM is to preprocess the dataset, train the model, and test it. The main aim of the SVM algorithm is to separate the given dataset into different classes. SVM chooses the best attributes to separate the dataset into different classes, and these vectors are known as “support vectors”. SVM works better for binary-class
Fig. 1 Decision tree for anti-phishing
than for multiclass problems. Transductive SVM is also available to handle unknown data. Logistic regression is used to solve classification problems; its main aim is to predict the probability of a dependent variable whose output is either 0 or 1, which means that there are only two possible classes. It is a simple learning approach that is used for various classification problems such as spam detection [11], diabetes prediction [12] and cancer detection [13]. References [14] and [15] are recent approaches that explain phishing detection techniques using machine learning; [14] uses the University of California Irvine (UCI) dataset and [15] uses a kaggle.com dataset for performance evaluation.
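The following small example illustrates the entropy and information-gain computation that ID3 uses to pick a splitting attribute; the two URL features and the toy labels are invented purely for illustration.

```python
# Hedged sketch: entropy and information gain on a tiny, invented URL-feature table.
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attr_index, labels):
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# toy data: (has_ip_address, uses_https) -> phishing or legitimate
rows = [(1, 0), (1, 0), (0, 1), (0, 1), (1, 1), (0, 0)]
labels = ["phish", "phish", "legit", "legit", "phish", "legit"]
print("gain(has_ip_address) =", round(information_gain(rows, 0, labels), 3))
print("gain(uses_https)     =", round(information_gain(rows, 1, labels), 3))
```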
3 Decision Tree Algorithm for Anti-phishing The decision tree algorithm is used in classification problems. It is simple to understand, interpret, and visualize; overfitting is its main disadvantage. It can be applied to anti-phishing. For applying a decision tree to anti-phishing, we need to select the features used to choose splitting conditions, know when to stop, and apply pruning. Decision trees are also used in random forest and boosting algorithms. The decision tree model for anti-phishing is shown in Fig. 1. When we tested the dataset provided by the University of New Brunswick (UNB) [16] with the decision tree classifier, we obtained an accuracy of 0.815. The UNB dataset has five kinds of URLs, shown in Table 1.
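A minimal sketch of such a decision-tree experiment is given below. The feature file is a hypothetical stand-in, since the exact UNB feature extraction is not reproduced here, and the max_depth setting is an arbitrary choice.

```python
# Hedged sketch: train/test evaluation of a decision tree on hypothetical URL features.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

data = pd.read_csv("unb_url_features.csv")          # hypothetical extracted-feature file
X, y = data.drop(columns=["label"]), data["label"]  # label: benign/spam/phishing/malware/defacement
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

tree = DecisionTreeClassifier(max_depth=10, random_state=42).fit(X_train, y_train)
print("train accuracy:", accuracy_score(y_train, tree.predict(X_train)))
print("test accuracy:", accuracy_score(y_test, tree.predict(X_test)))
```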
4 Random Forest for Anti-phishing The random forest algorithm can be applied to both regression and classification problems. It can handle missing values and maintain accuracy for missing data.
Table 1 Types of URLs in University of New Brunswick dataset
URL type | Number of URLs
Benign | 35,300
Spam | 12,000
Phishing | 10,000
Malware | 11,500
Defacement | 45,450
Fig. 2 Random forest for anti-phishing
It can also handle large datasets. The random forest model for anti-phishing is shown in Fig. 2. When we tested the dataset provided by the UNB with the random forest classifier, we obtained an accuracy of 0.819.
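A sketch of the random forest run, again on the hypothetical feature file used above, is shown below; the built-in feature importances indicate which of the JavaScript, domain and address features drive the prediction.

```python
# Hedged sketch: random forest with feature importances on hypothetical URL features.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv("unb_url_features.csv")          # hypothetical extracted-feature file
X, y = data.drop(columns=["label"]), data["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

forest = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
for name, score in sorted(zip(X.columns, forest.feature_importances_), key=lambda p: -p[1])[:5]:
    print(f"{name}: importance = {score:.3f}")
```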
5 Support Vector Machine Algorithm for Anti-phishing The support vector machine (SVM) algorithm is used to separate two classes, so it can be used for anti-phishing. SVM can be linear or nonlinear. SVM finds the decision boundary and segregates the two classes based on the support vectors. If there is no optimal decision boundary, it is difficult to classify the two classes, which may lead to misclassification. The support vectors decide the data points used for classification, so not all training examples are equally important in SVM. Applications of SVM include medicine, finance, and page ranking. SVM for anti-phishing is shown in Fig. 3. When we tested the dataset provided by the UNB with SVM, we obtained an accuracy of 0.8085.
Fig. 3 SVM for anti-phishing
Fig. 4 Logistic regression for anti-phishing
6 Logistic Regression for Anti-phishing Logistic regression can be used to handle or solve text classification problems. It uses a binary output representation and computes the probabilities of the two outcomes from the training dataset. These probabilities are used to train the model, and the logistic transformation equations are used for the classification of emails. The logistic regression model for anti-phishing is shown in Fig. 4. When we tested the dataset provided by the UNB with the logistic regression model, we obtained an accuracy of 0.798.
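The four classifiers compared in the next section could be evaluated together as sketched below. Wrapping SVM and logistic regression in a scaling pipeline is our own addition, since both are usually sensitive to unscaled features; the data file remains a hypothetical stand-in.

```python
# Hedged sketch: comparing the four classifiers on the same hypothetical feature file.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

data = pd.read_csv("unb_url_features.csv")          # hypothetical extracted-feature file
X, y = data.drop(columns=["label"]), data["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "Decision tree": DecisionTreeClassifier(random_state=42),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: train = {model.score(X_train, y_train):.3f}, test = {model.score(X_test, y_test):.3f}")
```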
7 Results and Discussion To find the best classifier, we used two kinds of datasets: one dataset is taken from the University of New Brunswick (UNB) [16] and another phishing website dataset from kaggle.com [17]. From the UNB dataset, we took JavaScript features,
Table 2 Feature selection of University of New Brunswick dataset
Feature type | Features
JavaScript | Website redirection, etc.
Domain | Website hits, hosted date, etc.
Address | IP address, uniform resource locator, etc.
domain features, and address features. Table 2 shows the feature selection of the UNB dataset. Based on these features, we used Google Colaboratory [18] to test the accuracy of the decision tree, random forest, SVM, and logistic regression algorithms. The comparison of phishing website detection algorithms for the UNB dataset is shown in Table 3. With the UNB dataset, the decision tree gives accuracies of 81.3 and 81.5 on the training and testing datasets, random forest gives 81.33 and 81.9, SVM gives 80.05 and 80.85, and logistic regression gives 80.1 and 79.8 for detecting phishing websites. Figure 5 shows the accuracy of the different algorithms with the UNB dataset. The dataset from kaggle.com has 31 attributes with 2456 instances. The comparison of phishing website detection algorithms for kaggle.com is shown in Table 4. Results are taken from different combinations of the dataset: first, results are taken from the full dataset; then the dataset is divided into two parts, and finally into four parts. For the total dataset, the decision tree gives an accuracy of 90.72 for detecting phishing websites, random forest gives 96.52, SVM gives 44.94, and logistic regression gives 84.86.
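A sketch of the halves-and-quarters protocol described above is given below, assuming a local copy of the kaggle.com dataset; the file name, the 'Result' label column and the use of cross-validation within each slice are assumptions, since the paper does not spell these details out.

```python
# Hedged sketch: evaluating one classifier on the full dataset, two halves and four quarters.
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("phishing_website_dataset.csv")    # hypothetical local copy of the kaggle data
parts = {"total": df,
         "first half": df.iloc[: len(df) // 2],
         "second half": df.iloc[len(df) // 2 :]}
for i, quarter in enumerate(np.array_split(df, 4), start=1):
    parts[f"quarter {i}"] = quarter

for name, part in parts.items():
    X, y = part.drop(columns=["Result"]), part["Result"]   # label column name is assumed
    acc = cross_val_score(RandomForestClassifier(n_estimators=100, random_state=42), X, y, cv=5).mean()
    print(f"{name}: accuracy = {acc:.3f}")
```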
Fig. 5 Comparison of anti-phishing algorithms with UNB dataset
Table 3 Accuracy of different algorithms with University of New Brunswick dataset
Algorithm | Training set | Testing set
Decision tree | 81.3 | 81.5
Random forest | 81.33 | 81.9
SVM | 80.05 | 80.85
Logistic regression | 80.1 | 79.8
Table 4 Accuracy of different algorithms with kaggle.com dataset
Algorithm | Total dataset | First part of dataset | Second part of dataset | First quarter of dataset | Second quarter of dataset | Third quarter of dataset | Fourth quarter of dataset
Decision tree | 90.72 | 98.53 | 90.45 | 86.73 | 54.42 | 52.98 | 59.59
Random forest | 96.52 | 97.51 | 96.68 | 83.26 | 53.50 | 54.58 | 58.57
SVM | 44.94 | 47.03 | 50.64 | 16.08 | 51.83 | 47.77 | 49.46
Logistic regression | 84.86 | 92.65 | 84.87 | 80.56 | 52.46 | 53.46 | 58.71
Results for the other combinations of the dataset are shown in Table 4. Figure 6 shows the accuracy of the different algorithms with the kaggle.com dataset. The comparison of phishing website detection algorithms for the UNB and kaggle.com datasets is shown in Table 5. With the UNB and kaggle.com datasets, the best accuracy for detecting phishing websites is given by the random forest algorithm. Figure 7 shows the accuracy of the different algorithms with the UNB and kaggle.com datasets. Based on these results, we can say that the best classifier for detecting phishing websites is random forest, whereas SVM gives the least performance.
Fig. 6 Comparison of anti-phishing algorithms with kaggle.com dataset
Table 5 Accuracy of different algorithms with UNB and kaggle.com datasets
Algorithm | UNB | Kaggle.com
Decision tree | 81.5 | 90.72
Random forest | 81.9 | 96.52
SVM | 80.85 | 44.94
Logistic regression | 80.1 | 84.86
Fig. 7 Comparison of anti-phishing algorithms with UNB and kaggle.com datasets
8 Conclusion This paper compares the accuracy of logistic regression, random forest, decision tree, and support vector machine algorithms for detecting phishing websites. The phishing website datasets are taken from the University of New Brunswick and kaggle.com. We divided the kaggle.com dataset into two halves and four quarters to analyze the anti-phishing algorithms’ accuracies. Based on the results, random forest gives the best accuracy and SVM gives the least accuracy for detecting phishing websites.
References 1. Mohammad RM, Fadi T, McCluskey L (2015) Phishing websites features. School of Computing and Engineering, University of Huddersfield 2. Teki S, Banothu B, Varma M (2019) An un-realized algorithm for effective privacy preservation using classification and regression trees. Revue d’Intelligence Artificielle 33(4):313–319 3. Padmanabha Reddy YCA, Varma M (2020) Review on supervised learning techniques. In: Emerging research in data engineering systems and computer communications. Springer, Singapore, pp 577–587 4. Tripathi, Diwakar, I. Manoj, G. Raja Prasanth, K. Neeraja, Mohan Krishna Varma, and Ramachandra Reddy, B.: Survey on Classification and Feature Selection Approaches for Disease Diagnosis. In: Emerging Research in Data Engineering Systems and Computer Communications, pp. 567–576. Springer, Singapore (2020). 5. Priyanka, and Dharmender, K.: Decision tree classifier: a detailed survey. International Journal of Information and Decision Sciences 12(3), 246–269 (2020).
6. Liao, Huchang, Xiaomei Mi, and Zeshui, X.: A survey of decision-making methods with probabilistic linguistic information: bibliometrics, preliminaries, methodologies, applications and future directions. Fuzzy Optimization and Decision Making 19(1), 81–134 (2020). 7. Shaik, Anjaneyulu Babu, and Sujatha, S.: A brief survey on random forest ensembles in classification model. In: International Conference on Innovative Computing and Communications, pp. 253–260. Springer, Singapore (2019). 8. Resende, Paulo Angelo Alves, and André Costa, D.: A survey of random forest based methods for intrusion detection systems. ACM Computing Surveys (CSUR) 51(3), 1–36 (2018). 9. Vapnik VN (1995) The nature of statistical learning theory. Springer Verlag, New York 10. Cervantes, Jair, Farid Garcia-Lamont, Lisbeth Rodríguez-Mazahua, and Asdrubal, L.: A comprehensive survey on support vector machine classification: Applications, challenges and trends, Neurocomputing (2020). 11. Dedeturk, Bilge Kagan, and Bahriye, A.: Spam filtering using a logistic regression model trained by an artificial bee colony algorithm. In: Applied Soft Computing (2020) 12. Liu, Sen, Wei Wang, Yan Tan, Miao He, Lanhua Wang, Yuting Li, and Wenyong, H.: Relationship Between Renal Function and Choroidal Thickness in Type 2 Diabetic Patients Detected by Swept-Source Optical Coherence Tomography. In: Translational Vision Science & Technology 9(5), 17–17 (2020) 13. Pati, Dakshya P., and Sucheta, P.: A Comprehensive Review on Cancer Detection and Prediction Using Computational Methods. In: Computational Intelligence in Data Mining, pp. 629–640. Springer, Singapore (2020). 14. M. N. Alam, D. Sarma, F. F. Lima, I. Saha, R. -E. -. Ulfath and S. Hossain, "Phishing Attacks Detection using Machine Learning Approach," 2020 Third International Conference on Smart Systems and Inventive Technology (ICSSIT), 2020, pp. 1173–1179. 15. Rashid J, Mahmood T, Nisar MW, Nazir T (2020) Phishing detection using machine learning technique. In: 2020 first international conference of smart systems and emerging technologies (SMARTTECH), pp 43–46 16. URL dataset. https://www.unb.ca/cic/datasets/url-2016.html, last accessed 2021/07/09 17. Phishing website dataset. https://www.kaggle.com/akashkr/phishing-website-dataset, last accessed 2020/09/11 18. https://colab.research.google.com/. last accessed 2021/07/09
Analytical Analysis of Two-Warehouse Inventory Model Using Particle Swarm Optimization Sunil Kumar and Rajendra Prasad Mahapatra
Abstract An inventory model for deteriorating items with a two-level storage system, time-dependent demand and partially backlogged shortages is developed in this work. Inventory is transferred from the rented warehouse (RW) to the owned warehouse (OW) in a bulk-release pattern, and the cost of transportation is considered negligible. The deterioration rates in the two warehouses are constant but different because of the different preservation arrangements. Up to a particular time, the holding cost is considered constant, and after that time, it increases. A particle swarm optimization with a varying population size is used to solve the model. In the given PSO, a fraction of better offspring is included along with the parent population for the next generation, and the parent set and the offspring subset are of the same size. A numerical example is presented to validate the proposed model, and a sensitivity analysis is performed separately for every parameter. Keywords Time-dependent demand · Variable holding cost · Shortages · Particle swarm optimization
1 Introduction The classical inventory models were fundamentally developed for a single-warehouse system. In the past, researchers have produced a great deal of work in the area of inventory management and inventory control systems. Inventory management and control essentially deal with demand and supply chain issues; for this, production units (manufacturers of finished goods), vendors, suppliers and retailers have to stock raw materials and finished goods for future demand and supply in the market and to the customers. In the traditional models, it is assumed that the demand and holding cost are constant, and that goods are supplied instantly under an infinite replenishment policy when ordered. However, as time passed, many researchers considered that demand may vary with time, because of price and based on S. Kumar (B) · R. P. Mahapatra SRM IST, NCR Campus Modi Nagar, Ghaziabad, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_17
different factors, and that the holding cost may likewise vary with time and depend on various factors. Many models have been developed considering different time-dependent demand patterns, with and without shortages. All of those models consider demand variation due to the stock level, while the holding cost is kept constant for the entire inventory cycle. In studies of inventory models, an unlimited warehouse capacity is often assumed. However, in busy marketplaces, such as supermarkets and company markets, the storage area for items may be limited. Another case of insufficient storage area can occur when the purchase of a large lot of items is decided. That could be because an attractive price discount for bulk purchase is available, or because the cost of procuring goods is higher than the other inventory-related costs, or because demand for the items is high, or because the item in question is a seasonal product such as the yield of a harvest, or when there are some problems in continuous procurement. In this situation, these items cannot be accommodated in the existing storage facility (the own warehouse, abbreviated as OW). Consequently, to store the excess items, an additional warehouse (the rented warehouse, abbreviated as RW), which may be situated at a short distance from the OW or somewhat away from it owing to the non-availability of a warehouse nearby, is hired on a rental basis. So, in this proposed model, an optimal solution is presented to maximize the retailer’s profit.
2 Literature Reviews Kumar and Mahapatra [1] discussed “Multi-Objective Genetic Algorithm based Optimization method for Two-Warehouse Inventory Model with Time-Dependent Demand and Deteriorating Items. Karimi et al. [2] presented a model for deteriorating items with fluctuating demands when deficiencies permitted a partial backlog in production inventory and lot size with time-varying operating costs and female decline”. They proposed a model for integrated supply chain model for material spoilage with linear demand based on inventory in an inaccurate and inflationary environment. Yadav et al. [3], they developed an article in which different cost parameters are taken into account: flexible volume two-stage model with fluctuating demand and inflationary holding costs are taken. Barun et al. [4], they discussed a model which produces both imperfect and perfect items, and in this situation, model will maximize the profit. Singh et al. [5] in this article, they considered an efficient inventory model for supply chain management. Pandey et al. [6], she presented a setup for the inventory optimization of the marble industry based on genetic algorithms and particle swarm optimization. Palanivel et al. [7] proposed a model with considering stock-dependent order and imperfect items. Yadav et al. [8] provided drug industry supply chain management for blockchain applications using artificial neural networks. Manna et al. [9] suggested the EPQ setup for advertisement-based demand. Yadav et al. [10]: a supply chain management for the rosé wine industry for storage using a genetic algorithm. Yadav et al. [11]: a method for calculating the reliability of the LIFO stock model with bearings in the chemical industry. Sana [12]:
price competition between green and non-green products in the context of a socially responsible retail and consumer services business magazine. Sana [13]: an EOQ model for stochastic demand for limited storage capacity. Moghdani et al. [14]: fuzzy model for economic production quantity with multiple items and multiple deliveries. Haseli et al. [15]: basic criterions for the multi-criteria decision-making method and its applications. Ameri et al. [16]: self-assessment of parallel network systems with intuitionistic fuzzy data: a case study. Birjandi et al. [17]: assessment and selection of the contractor when submitting a tender with incomplete information according to the MCGDM method. Bhambu et al. [18] developed a new variant of PSO for complex optimization. Gholami et al. [19]: ABC analysis of clients using axiomatic design and incomplete estimated meaning. Jamali et al. [20]: Hybrid Improved Cuckoo Search Algorithm and Genetic Algorithm to Solve Marko Modulated Demand.
3 Assumptions and Notations The mathematical model of the two-warehouse inventory system for deteriorating items is based on the following assumptions and the notations listed in Table 1.
3.1 Assumptions
1. Demand varies with time as a linear function of time and can be written as
D(t) = X1 if t = 0, and D(t) = X1 + X2 t if t > 0, where X1 > 0 and X2 > 0   (1)
2. The holding cost in RW is assumed to be constant up to a distinct time point and to increase thereafter as a fraction of the ordering cycle length; that is, the holding cost is y1 and remains constant before the time point k, i.e. for t ≤ k.
3.
4 Mathematical Formulation of Model and Analysis At time t = 0 (the start of the cycle), a lot of Qmax units of stock enters the system, from which the previously backlogged (Qmax − IB) items are cleared, and the remaining M units are stored in the two warehouses as W1 units in OW and W2 units in RW. Now, based on the assumptions and the parameters defined in Table 1, the equations formed below are used to obtain the different costs involved in the warehouses.
Table 1 Parameters for mathematical model
Notation | Description
O_A | Ordering cost per order
e1 | OW capacity
e2 | RW capacity
Tn | The length of the replenishment cycle
(Max)Q | Maximum inventory level per cycle to be ordered
T1 | Time at which the inventory in RW is depleted
T2 | Time at which the inventory level in OW reaches zero and shortages begin
k | Time point up to which the holding cost remains constant
y2 | Holding cost per unit per unit time in OW
y1 | Holding cost per unit per unit time in RW
y3 | Shortage cost per unit per unit time
y4 | Opportunity cost per unit per unit time
B_R(t) | Inventory level in RW at time t
B_iw(t), i = 1, 2 | Inventory level in OW at time t
B_s(t) | Inventory level at time t during the shortage period
s0 | Deterioration rate in RW
s1 | Deterioration rate in OW
y5 | Purchase cost per unit of items
B_A | Maximum amount of inventory backlogged
L_A | Amount of inventory lost
P(Cost) | Purchase cost
S(Cost) | Present worth shortage cost
L(Cost) | Present worth lost sale cost
H(Cost) | Present worth holding cost
TC[T1, Tn] | Total relevant inventory cost per unit time of the inventory system
During the time interval [0, t1], the inventory in RW declines because of demand and deterioration and is governed by the following differential equation:
dB_R(t)/dt = −[(X1 + 1) + X2 t] − s0 B_R(t),  0 ≤ t ≤ t1   (2)
During the interval [0, t1], the inventory level in OW decreases because of deterioration only and is governed by the differential equation
dB_1w(t)/dt = −s1 B_1w(t),  0 ≤ t ≤ t1   (3)
During the interval [t1, t2], the inventory level in OW decreases because of both demand and deterioration and is governed by the following differential equation:
dB_2w(t)/dt = −[(X1 + 1) + X2 t] − s1 B_2w(t),  t1 ≤ t ≤ t2   (4)
At t = t2 the inventory becomes zero and shortages occur during the interval [t2, T]; a fraction f of the total shortages is backlogged, and the backlogged quantity is supplied to the customers at the beginning of the next replenishment cycle. The shortages are governed by the differential equation
dB_s(t)/dt = −[(X1 + 1) + X2 t],  t2 ≤ t ≤ T   (5)
The replenishment cycle restarts at time t = T. The aim of the presented method is to minimise the total inventory cost. The inventory level in the various time intervals is obtained by solving the above differential equations (2)–(5) under the boundary conditions B_R(T1) = 0, B_1w(0) = Z1, B_2w(T2) = 0 and B_s(T2) = 0. Now, differential Eq. (2) gives
B_R(t) = [(X1 + 1)/s0 + (X2/s0^2)(s0 T1 − 1)] e^{s0 (T1 − t)} − [(X1 + 1)/s0 + (X2/s0^2)(s0 t − 1)]   (6)
B_1w(t) = e1 e^{−s1 t}   (7)
B_2w(t) = [(X1 + 1)/s1 + (X2/s1^2)(s1 T2 − 1)] e^{s1 (T2 − t)} − [(X1 + 1)/s1 + (X2/s1^2)(s1 t − 1)]   (8)
B_s(t) = f [(X1 + 1)(T2 − t) + (X2/2)(T2^2 − t^2)]   (9)
Now, at t = 0, B_R(0) = e2; therefore, Eq. (6) yields
e2 = [(X1 + 1)/s0 − X2/s0^2] + [(X1 + 1)/s0 + (X2/s0^2)(s0 T1 − 1)] e^{−s0 T1}   (10)
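As a quick sanity check of the closed form in Eq. (6), the sketch below integrates Eq. (2) numerically backwards from B_R(T1) = 0 and compares the result with Eq. (6). The parameter values are taken from the numerical example in Sect. 6, except T1, which is an assumed illustrative value.

```python
# Hedged sketch: numerical check of the RW inventory level against Eq. (6).
import numpy as np
from scipy.integrate import solve_ivp

X1, X2, s0, T1 = 500.0, 0.50, 0.013, 1.0   # T1 = 1.0 is an assumed illustrative value

def closed_form(t):
    return ((X1 + 1) / s0 + X2 / s0**2 * (s0 * T1 - 1)) * np.exp(s0 * (T1 - t)) \
           - ((X1 + 1) / s0 + X2 / s0**2 * (s0 * t - 1))

def rhs(t, B):
    # Eq. (2): dB_R/dt = -[(X1 + 1) + X2 t] - s0 B_R
    return -((X1 + 1) + X2 * t) - s0 * B

sol = solve_ivp(rhs, (T1, 0.0), [0.0], dense_output=True)   # integrate backwards from t = T1
for t in (0.0, 0.25, 0.5, 0.75):
    print(f"t = {t}: numeric = {sol.sol(t)[0]:.2f}, closed-form = {closed_form(t):.2f}")
```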
The maximum amount of inventory backlogged during the shortage period (at t = T) is given by
G_A = −B_s(Tn) = f [(X1 + 1)(Tn − T2) + (X2/2)(Tn^2 − T2^2)]   (11)
During the shortage period, the amount of inventory lost is L_A = [1 − G_A], i.e.
L_A = 1 − f [(X1 + 1)(Tn − T2) + (X2/2)(Tn^2 − T2^2)]   (12)
The maximum inventory amount ordered is given as (Max)Q = [e1 + B_R(0) + G_A], i.e.
(Max)Q = e1 + (X1 + 1)/s0 − X2/s0^2 + [(X1 + 1)/s0 + (X2/s0^2)(s0 T1 − 1)] e^{−s0 T1} + f [(X1 + 1)(Tn − T2) + (X2/2)(Tn^2 − T2^2)]   (13)
Now, continuity at t = t1 requires B_1w(t1) = B_2w(t1); therefore, from Eqs. (7) and (8), we have
X2 s1^2 T2^2 − (X1 + 1) s1^2 T2 − s1^2 (e1 + e) + X2 − (X1 + 1) s1^2 = 0   (14)
where
e = [(X1 + 1)/s1 + (X2/s1^2)(s1 T1 − 1)] e^{−s1 T1}
which is quadratic in T2, and hence it can be solved for T2 in terms of T1, i.e.
T2 = ϒ(T1)   (15)
where
ϒ(T1) = [−(X1 + 1)^2 s1^4 ± √D] / (2 X2 s1^2)
and
D = (X1 + 1)^2 s1^4 + 4 X2 s1^2 [X2 − (X1 + 1)s1 + s1^2 (e1 + (X1 + 1)/s1 + (X2/s1^2)(s1 T1 − 1) e^{−s1 T1})]
Next, the total relevant inventory cost per cycle includes the following components:
1. Ordering cost per cycle = O_A
2. Purchase cost per cycle = P × (Max)Q
3. Present worth holding cost = H(Cost)
For the case k < T and 0 ≤ k < t1 in RW,
H(Cost) = ∫_0^k y1 B_R(t) dt + ∫_k^{T1} y1 t B_R(t) dt + ∫_0^{T1} y2 B_1w(t) dt + ∫_{T1}^{T2} y2 B_2w(t) dt   (16)
Evaluating these integrals gives the closed-form holding cost in terms of y1, y2, k, T1 and T2.
The present worth shortage cost is
S(Cost) = y3 f [(X1 + 1)Tn^2/2 − (X1 + 1)T2^2/2 + X2 Tn^3/6 − X2 T2^3/6 − (X1 + 1)T2 Tn + (X1 + 1)T2^2 − X2 T2^2 Tn/2 + X2 T2^3/2]   (17)
The present worth opportunity cost (lost sale cost) is
L(Cost) = y4 [1 − {(X1 + 1)Tn^2/2 − (X1 + 1)T2^2/2 + X2 Tn^3/6 − X2 T2^3/6 − (X1 + 1)T2 Tn + (X1 + 1)T2^2 − X2 T2^2 Tn/2 + X2 T2^3/2}]   (18)
The present worth purchase cost is
P(Cost) = y4 [e1 + (X1 + 1)/s0 − X2/s0^2 + ((X1 + 1)/s0 + (X2/s0^2)((X1 + 1)T1 − 1)) e^{−s0 T1} + f ((X1 + 1)(Tn − T2) + (X2/2)(Tn^2 − T2^2))]   (19)
The total relevant inventory cost per unit time in Case 1 is
TC(t1, T) = (1/T)[C_A + C_P + C_S + C_L + H_C]   (20)
5 Particle Swarm Optimization PSO is a stochastic optimization technique introduced by Eberhart and Kennedy in 1995, inspired by the social behaviour of bird flocking and fish schooling. In PSO, the potential solutions, called particles, start from a random population and try to find the optima by updating generations, flying through the problem space by following the current best particles. Each particle keeps track of the coordinates in the problem space that are associated with the best solution it has achieved so far; this value is called pBest. Another “best” value tracked by the particle swarm optimizer is the best value obtained so far by any particle in the neighbourhood; this is called lBest. When a particle takes the whole population as its topological neighbours, the best value is a global best and is called gBest. The particle swarm optimization concept consists, at each time step, of changing the velocity of (accelerating) every particle towards its pBest and lBest locations (in the local version of PSO). Acceleration is weighted by a random term, with separate random numbers being generated for acceleration towards the pBest and lBest locations.
Procedure of PSO for TSP
/* Define initial probabilities for particles’ moves: */
Step 1. pr1 ← a1 /* to follow its own way */
Step 2. pr2 ← a2 /* to go towards pbest */
Step 3. pr3 ← a3 /* to go towards gbest */
/* a1 + a2 + a3 = 1 */
Initialize the population of particles
Step 4. do
Step 5.   for each particle p
Step 6.     valuep ← Evaluate(xp)
Step 7.     if (value(xp) < value(pbestp)) then
Step 8.       pbestp ← xp
Step 9.     if (value(xp) < value(gbest)) then
Step 10.      gbest ← xp
Step 11.   end for
Step 12.   for each particle p
Step 13.     velocityp ← define_velocity(pr1, pr2, pr3)
Step 14.     xp ← update(xp, velocityp)
Step 15.   end for
/* Update probabilities */
Step 16. while (a stop criterion is not satisfied)
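The listing below is a minimal, runnable Python sketch of a binary PSO applied to the lot-sizing setting discussed in the next section. It is not the authors' PSO-2: the acceleration coefficients, the sigmoid bit-flipping rule and the shortage-free cost function are illustrative assumptions.

```python
# Hedged sketch: binary PSO for dynamic lot sizing (order / no order in each period).
import math
import random

def lot_sizing_cost(order, demand, A=30.0, h=0.23):
    """Ordering + holding cost when orders are placed in the periods flagged in `order`."""
    if not order[0]:
        return float("inf")                 # period-1 demand must be covered by an order
    cost, next_order = 0.0, len(demand)
    for t in range(len(demand) - 1, -1, -1):
        if order[t]:
            cost += A                       # ordering cost
            for j in range(t + 1, next_order):
                cost += h * demand[j] * (j - t)   # carry period-j demand from period t
            next_order = t
    return cost

def binary_pso(demand, n_particles=20, iters=300, c1=1.5, c2=1.5):
    n = len(demand)
    swarm = [[random.random() < 0.5 for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in swarm]
    gbest = min(pbest, key=lambda p: lot_sizing_cost(p, demand))[:]
    for _ in range(iters):
        for i, x in enumerate(swarm):
            for d in range(n):
                vel[i][d] += c1 * random.random() * (pbest[i][d] - x[d]) \
                           + c2 * random.random() * (gbest[d] - x[d])
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))        # velocity clamping
                x[d] = random.random() < 1.0 / (1.0 + math.exp(-vel[i][d]))
            if lot_sizing_cost(x, demand) < lot_sizing_cost(pbest[i], demand):
                pbest[i] = x[:]
        gbest = min(pbest, key=lambda p: lot_sizing_cost(p, demand))[:]
    return gbest, lot_sizing_cost(gbest, demand)

if __name__ == "__main__":
    demand = [random.randint(23, 123) for _ in range(23)]   # a UNIF(23, 123)-style instance
    plan, cost = binary_pso(demand)
    print("order in periods:", [t + 1 for t, o in enumerate(plan) if o], "cost:", round(cost, 2))
```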
6 Numerical Analysis The following randomly chosen data, in appropriate units, have been used to find the optimal solution and to validate the presented model for three stakeholders: the manufacturer, the wholesaler and the retailer. The numerical data are as follows: X1 = 500, C = 1500, Z1 = 2000, X2 = 0.50, y2 = 60, y1 = 75, y5 = 1500, s0 = 0.013, s1 = 0.014, y3 = 250, k = 1.61, f = 0.06 and y4 = 100. The decision variables are calculated for the presented model for two different situations separately. The binary PSO-2 algorithm is coded in the C language for the lot-size problem and is compared with PSO-1 and the optimal Wagner-Whitin solution; the code for the traditional PSO-1 is also written in C. We have taken random test problems to test the proposed model. PSO-1 uses uniform crossover, simple inversion mutation and tournament selection of size 2. Different parameters are used for the binary PSO-2 and PSO-1. For PSO-2, the swarm population size is taken as twice the number of periods, and the social and cognitive parameters are taken as 11.3, consistent with the literature. In PSO-1, the population size equals that of PSO-2, the crossover rate is taken as 33, and the mutation rate is 0.3. Both algorithms run for 300 generations/iterations. In the first test suite, three problem instances with total requirements for 23 periods are generated from a uniform distribution, UNIF(23, 123); the second uses UNIF(30, 123) and the third UNIF(30, 123). Ten problem
instances are executed for both PSO-1 and the binary PSO-2, with holding cost H(c) = € 0.23, ordering cost O(c) = A = € 30 and shortage cost SC = € 0.30. The results are compared with the optimal value from the Wagner-Whitin algorithm. For every instance of each problem, we conducted ten replications and recorded the CPU time along with the minimum, maximum, average and standard deviation. It can be observed from Tables 2 and 3 that the binary PSO-2 generated outcomes comparable with PSO-1 and even reached optimal outcomes. PSO-1 was able to find seven of the ten optimal solutions, while the binary PSO-2 was able to find nine of the ten optimal solutions, although the average standard deviation of PSO-1 over ten replications was slightly better than that of the binary PSO-2, i.e. PSO-1 (σ = 3.24) versus PSO-2 (σ = 3.47).
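The Wagner-Whitin benchmark mentioned above can be computed with the classical dynamic program sketched below; the cost parameters mirror those quoted in the text, but shortages are ignored, so this is only an illustrative baseline rather than the authors' exact setup.

```python
# Hedged sketch: Wagner-Whitin dynamic program used as the optimal lot-sizing benchmark.
import random

def wagner_whitin(demand, A=30.0, h=0.23):
    n = len(demand)
    best = [0.0] * (n + 1)            # best[t] = minimum cost of covering periods 0..t-1
    for t in range(1, n + 1):
        best[t] = float("inf")
        for j in range(t):            # last order placed in period j covers periods j..t-1
            holding = sum(h * demand[k] * (k - j) for k in range(j, t))
            best[t] = min(best[t], best[j] + A + holding)
    return best[n]

if __name__ == "__main__":
    random.seed(1)
    instance = [random.randint(23, 123) for _ in range(23)]   # a UNIF(23, 123) test instance
    print("Wagner-Whitin optimal cost:", round(wagner_whitin(instance), 2))
```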
Table 2 PSO-1 results
P | WW OPT | PSO-1 Best | Max | Avg | Std
1 | 2103.30 | 2103.30 | 2111.73 | 2108.00 | 3.23
2 | 2177.00 | 2177.00 | 2183.73 | 2178.73 | 2.69
3 | 2031.00 | 2031.00 | 2031.00 | 2031.00 | 1.00
4 | 2039.00 | 2039.00 | 2039.00 | 2039.00 | 2.00
5 | 2011.23 | 2011.23 | 2022.30 | 2022.70 | 1.38
6 | 2124.30 | 2124.30 | 2134.60 | 2124.70 | 2.11
7 | 2034.30 | 2034.30 | 2134.70 | 2068.30 | 3.83
8 | 2130.30 | 2130.30 | 2130.00 | 2140.00 | 0.48
9 | 2216.30 | 2216.30 | 2206.30 | 2216.30 | 0.33
10 | 2012.00 | 2012.00 | 2013.00 | 2014.00 | 3.30
Table 3 PSO-2 results
P | WW OPT | PSO-2 Best | Max | Avg | Std
1 | 2203.30 | 2203.30 | 2211.73 | 2208.00 | 4.23
2 | 2277.00 | 2277.00 | 2283.73 | 2378.73 | 3.69
3 | 2131.00 | 2131.00 | 2131.00 | 2231.00 | 2.30
4 | 2139.00 | 2139.00 | 2139.00 | 2139.00 | 4.03
5 | 2111.23 | 2111.23 | 2122.30 | 2222.70 | 2.78
6 | 2224.30 | 2224.30 | 2234.60 | 2324.70 | 4.11
7 | 2134.30 | 2134.30 | 2234.70 | 2168.30 | 4.83
8 | 2230.30 | 2230.30 | 2230.00 | 2440.00 | 1.48
9 | 2316.30 | 2316.30 | 2306.30 | 2316.30 | 2.33
10 | 2112.00 | 2112.00 | 2113.00 | 2114.00 | 6.30
In terms of computing time, PSO-1 required roughly 3 s per instance, while PSO-2 required 8 s, so PSO-2 is more expensive than PSO-1. However, PSO-2’s better performance in finding optimal solutions compared with PSO-1 compensates for its computational weakness in this binary PSO numerical analysis of the lot-sizing problem.
7 Conclusions We investigated a deterministic two-warehouse inventory model for deteriorating items with linear time-dependent demand and a holding cost that varies with the ordering cycle length, with the objective of minimizing the total inventory cost. Shortages are allowed and partially backlogged. Two different situations are discussed, the first with a variable holding cost over the time horizon and the second with a constant holding cost over the complete cycle duration, and it is observed that with the variable holding cost, the total inventory cost is substantially greater than in the other case. This model is useful for items that deteriorate strongly, since as the deterioration rate increases in both warehouses, the total inventory cost decreases. The presented method can be further extended by incorporating other deterioration rates, probabilistic demand patterns and other realistic combinations.
References 1. Kumar S, Mahaptra RP (2020) Multi-objective genetic algorithm optimization based twowarehouse inventory model with time-dependent demand and deteriorating items. Solid State Technol 63(6):8290–8302 2. Karimi M, Jafar Sadjadi S, Ghasemi Bijaghini A (2019) An economic order quantity for deteriorating items with allowable rework of deteriorated products. J Ind Manage Optim 15(4):1857–1879 3. Yadav AS, Swami A (2018) Integrated supply chain model for deteriorating items with linear stock dependent demand under imprecise and inflationary environment. Int J Procurement Manage 11(6):684–704 4. Khara B, Dey JK, Mondal SK (2017) An inventory model under development cost-dependent imperfect production and reliability-dependent demand. J Manage Analytics 4(3):258–275 5. Singh and kumar (2011) Inventory optimization in efficient supply chain management. Int J Comput Appl Eng Sci I(IV):428–433 6. Pandey et al (2019) An analysis marble industry inventory optimization based on genetic algorithms and particle swarm optimization. Int J Recent Technol Eng 7(6S4):369–373 7. Palanivel M, Uthayakumar R (2016) An inventory model with imperfect items, stock dependent demand and permissible delay in payments under inflation. RAIRO—Oper Res 50(3):473–489 8. Yadav AS, Selva NS, Tandon A (2020) Medicine manufacturing industries supply chain management for Blockchain application using artificial neural networks. Int J Adv Sci Technol 29(8s):1294–1301 9. Manna AK, Dey JK, Mondal SK (2017) Imperfect production inventory model with production rate dependent defective rate and advertisement dependent demand. Comput Ind Eng 104:9–22
10. Yadav AS, Pandey T, Ahlawat N, Agarwal S, Swami A (2020) Rose wine industry of supply chain management for storage using genetic algorithm. Test Engraining Manage 83:11223– 11230 11. Yadav N, Sharma A, Swami (2020)A method for calculating the reliability of the LIFO stock model with bearings in the chemical industry. Int J Adv Trends Comput Sci Eng 9(1):403–408 12. Sana SS (2020) Price competition between green and non-green products under corporate social responsible firm. J Retail Consum Serv 55:102118 13. Sana SS (2015) An EOQ model for stochastic demand for limited capacity of own warehouse. Ann Oper Res 233(1):383–399 14. Moghdani R, Sana SS, Shahbandarzadeh H (2020) Multi-item fuzzy economic production quantity model with multiple deliveries. Soft Comput 24(14):10363–10387 15. Haseli G, Sheikh R, Sana SS (2020) Base-criterion on multi-criteria decision-making method and its applications. Int J Manage Sci Eng Manage 15(2):79–88 16. Ameri Z, Sana SS, Sheikh R (2019) Self-assessment of parallel network systems with intuitionist fuzzy data: a case study. Soft Comput 23(23):12821–12832 17. Birjandi AK, Akhyani F, Sheikh R, Sana SS (2019) Evaluation and selecting the contractor in bidding with incomplete information using MCGDM method Soft Computing 23(20):10569– 10585 18. Bhambu P, Kumar S, Sharma K (2018) Self-balanced particle swarm optimization. Int J Syst Assur Eng Manage 9(4):774–783 19. Gholami A, Sheikh R, Mizani N, Sana SS (2018) ABC analysis of the customers using axiomatic design and incomplete rough set. RAIRO-Oper Res 52(4–5):1219–1232 20. Jamali G, Sana SS, Moghdani R (2018) Hybrid improved cuckoo search algorithm and genetic algorithm for solving Markov modulated demand. RAIRO-Oper Res 52(2):473–497
Towards an Enhanced Framework to Facilitate Data Security in Cloud Computing Sarah Mahlaule, John Andrew van der Poll, and Elisha Oketch Ochola
Abstract Cloud- and associated edge computing are vital technologies for online sharing of computing resources with respect to processing and storage. The SaaS provisioning of services and applications on a pay-per-use basis removes the responsibility of managing resources from organisations which in turn translates to cost savings by reducing capital expenditure for such organisation. Naturally, any online and distributed environment incurs security challenges, and while ordinary users might not be concerned by the unknown whereabouts of their data in the cloud, the opposite may hold for organisations or corporates. There are numerous interventions that attempt to address the challenge of security on the cloud through various frameworks, yet cloud security remains a challenge since the emergence of cloud technology. This research highlights and critically analyses the structure of and mechanisms associated with three prominent cloud security frameworks in the literature to evaluate how each of them addresses the challenges of cloud security. On the strength of a set of qualitative propositions defined from the analyses, we develop a comprehensive cloud security framework that encompasses some components of the studied frameworks, aimed at improving on data and information security in the cloud. Keywords Cloud computing (CC) · Cloud security · Cloud service provider (CSP) · Cloud security framework · Data security · Cloud service integrator (CSI) · Proposition
S. Mahlaule · E. O. Ochola School of Computing, College of Science, Engineering and Technology, University of South Africa (Unisa), Florida, South Africa e-mail: [email protected]; [email protected] E. O. Ochola e-mail: [email protected] J. A. van der Poll (B) Digital Transformation and Innovation, Graduate School of Business Leadership (SBL), University of South Africa (Unisa), Midrand, South Africa e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_18
1 Introduction Cloud computing (CC) plays a vital role in enabling access to information on the go, in support of today’s technological needs. Consequently, CC has grown from a mere business concept into one of the fastest-growing technologies in the information technology industry [1]. However, the growth of CC continues to face a growing threat of insecurities of the stored information. Similarly, Ref. [2] highlighted that concerns of data security in the cloud contribute to the slow acceptance of CC technology, especially by business. Reference [3] confirms that surveys have identified security as the main concern that discourages organisations from moving to the cloud. CC is a convenient solution to individuals and organisations despite security challenges; hence, there have been attempts to address the concerns of cloud data security [4]. Many security frameworks and models have been proposed, yet the literature indicates that, insecurities involving CC, particularly with respect to data security, remains apparent. More work to improve on the security of data in the cloud is necessary to facilitate the growth and further adoption of cloud technology. The layout of the paper is as follows: Following this introductory section, a brief introduction to CC, the service models and deployment models are presented in Sect. 2, followed by a brief synopsis on data security in Sect. 3. Prominent cloud security frameworks are analysed in conjunction with propositions formulated from these in Sect. 4. Our framework that incorporates advantages of the preceding frameworks with additional improvements is defined in Sect. 5. A conclusion and directions for future work are presented in Sect. 6.
2 Cloud Computing Overview The literature presents numerous definitions of CC; among others, [3, 5] described CC as a technology that affords users access to resources such as networks, services, applications, servers, and storage over the Internet. Reference [6] added that the implementation of CC allows for a scenario wherein resources are provisioned and reused as and when required. Arguably, the standard definition of CC has been coined by the US National Institute of Standards and Technology (NIST) [7] as shown in Fig. 1. NIST outlines three CC service models—IaaS, PaaS, and SaaS—and four deployment models—public cloud, private cloud, hybrid cloud, and community cloud [8]. The pooling of resources, scalability (elasticity), rapid availability (on demand), and broad network access are further characteristics of a hired CC service [9].
Fig. 1 NIST visual model for cloud computing [7]
2.1 Cloud Service Models As indicated, cloud technology embodies three service models: infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS), depicted in Fig. 1. These services follow a basic computing architecture, i.e. hardware, system, and application layer, respectively, [10] with essential differences shown in Fig. 2 [11]. Infrastructure as a service is a hardware-level service [12]. CC storage, memory, networks, and processing including the OS are provided by this service for users to run applications over an Internet connection [12]. Platform as a service is a systemlevel service [12]. It caters mostly for developers. Infrastructure in a form of web servers is provided by the vendor, enabling developers to develop, test, deploy, and manage web applications and other software. It also makes provision for developer tools supported by a PaaS provider [13]. Software as a service is an application-level service. It provides consumers of cloud services with the ability to use software applications on demand, reducing the cost of hardware and licensing procurement, software development, overall maintenance, and operations [14]. From Fig. 2, it is evident that SaaS is most suitable for companies or individuals who desire maximum functionality from their cloud service provider, leading to our first proposition: • Proposition #1: An IaaS service model may be preferred by corporates wishing to exercise control over their sensitive data, while the SaaS model may be preferred by individuals who desire less responsibility with respect to data security.
Fig. 2 Cloud computing service models [11]
2.2 Cloud Deployment Models Four deployment models, namely private cloud, public cloud, hybrid cloud, and community cloud, are considered industry standards according to NIST. Naturally, each of these deployment models presents their own unique security challenges and implications which could be compromised [15]. A public cloud is shared by more than one customer and is a popular type of deployment. Like with the SaaS service model, customers have little or no control over how data are managed or where it is stored. The cloud service provider provides security and infrastructure [16, 17], e.g. Netflix. Private cloud is a deployment established for use by a single entity, where such entity provides and manages its own infrastructure [18]. Hybrid cloud is a combination of both private and public clouds. Some organisations deploy both types with the aim of ensuring security for critical data and running less critical operations on the public cloud [12]. Community cloud is a deployment that serves many customers like a public cloud, but users are grouped together according to the type of data they need to access. The physical infrastructure is owned and managed by a group of service providers [12]. The discussions in this section lead to our second proposition: • Proposition #2: Corporates with sensitive information may prefer a private cloud deployment model, while individuals with fewer financial resources would opt for a public- or community cloud deployment.
3 Data Security Overview Before identifying and analysing the selected cloud security frameworks, we note the standards that guide CC and its security aspects. Also, the CC security model is discussed, and information security characteristics according to The CIA Triad of Information Security are reflected on. Lastly, Cloud Security Alliance’s research around security in the cloud is presented.
3.1 Cloud Computing Security Model Figure 3 depicts a high-level, generic CC security model. It consists of four security units and caters for the three CC service models—IaaS, PaaS, and SaaS, as well as the deployment models—public, hybrid, and private cloud. Four software units are provided for verification and validation (V&V), privilege control, data protection, and attacks detection. Verification and validation (V&V) unit. This unit authenticates users and determines the validity of user data and associated services in the cloud. CSPs employ digital signature algorithms to verify data validity. One-time password (OTP) and two-factor authentication (2FA) techniques can be employed in this component. One-time password (OTP): A password is valid for one login session only [9]. OTP offers much improved security over static passwords, but can be time intensive,
Fig. 3 Cloud computing security model [19]
hence inconvenient, albeit having enhanced security [19]. Different types of OTPs are time-based, counter-synchronised, seed chain, and challenge-based OTPs [19]. Two-factor authentication (2FA): This type of security requires a user to supply two pieces of information, namely a static password and an OTP to authenticate. In most cases, a user uses a static password known to the authenticating server, and the server would issue an OTP to a user, then the user uses such OTP to authenticate. In essence, therefore, it is a two-phase authentication [19]. The verification aspect involved in the V&V unit presents ideal opportunities for the use of formal methods (FMs) alluded to in the future work part of this paper. Having observed the presence of a V&V unit in a security framework leads to a preliminary version of our next proposition: • Proposition #3a: A modern cloud security framework ought to embed a verification and validation (V&V) authentication unit. Privilege control unit. The privilege control unit manages how the cloud is used by applying policies and rules to facilitate data integrity and confidentiality [19]. Users are given different types of permissions based on their account type. Only authorised users are allowed to access encrypted data through encryption/decryption algorithms such as the Advanced Encryption Standard (AES), and Ron’s Code 4 (RC4). Data protection unit. This unit is employed in the generic model to ensure data stored in cloud servers are secure and protected from being compromised. Common techniques employed in this security component are truncation, reduction, encryption, hash functions, and Application of a Message Authentication Code (MAC) [19]. Attacks detection/prevention unit. This unit protects cloud resources with respect to all data and physical and virtual resources from malicious attacks. It deploys technologies to facilitate the availability of data and mitigates denial-of-service (DoS) attacks. The units are employed within a cloud security model to address different vulnerabilities that include confidentiality and privacy challenges in relation to cloud technology, and, therefore, we view all four units to have equal significance. Yet, the data protection unit and verification and validation unit are more relevant for the purpose of this paper. These security units function effectively together towards ensuring data and service integrity. While the verification and validation unit aims to ensure the correctness of data and services once a user has been successfully authenticated, the data protection unit ensures security and protection of the same data and services provided to the user by the CSP. Consequently, we arrive at an enhanced version of proposition #3a: • Proposition #3: A cloud security framework ought to include a security control function with units V&V, privilege control, data protection, and attack detection management.
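As an illustration of the OTP/2FA techniques mentioned for the V&V unit, the sketch below generates a time-based one-time password with Python's standard library, following the generic HOTP/TOTP construction (RFC 4226/6238); the shared secret and digit count are illustrative choices, not values prescribed by the cited model.

import hmac, hashlib, struct, time

def totp(secret: bytes, digits: int = 6, period: int = 30) -> str:
    # Counter = number of elapsed time steps since the Unix epoch.
    counter = int(time.time()) // period
    msg = struct.pack(">Q", counter)
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    # Dynamic truncation as defined in RFC 4226.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % (10 ** digits)).zfill(digits)

shared_secret = b"example-shared-secret"   # illustrative; provisioned out of band
print(totp(shared_secret))                 # the server computes the same value to verify

The server and the user's device share the secret out of band; because both derive the code from the current time step, the password is only valid for one short login window.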
3.2 The CIA Triad of Information Security The three (3) fundamental attributes of information security, namely confidentiality, integrity, and availability (known as “The CIA triad of Information Security”), are important in defining the security position of an organisation [20] and used as a benchmark for evaluating the effectiveness of an information system [21]. Confidentiality refers to data privacy which is defined as prevention of unauthorised disclosure of information intentionally or unintentionally [20]. Data integrity refers to confidence that cloud data in a state of rest or in transit are not accessed and compromised by unauthorised parties [22]. Availability of data is defined as “the ability to make information and related resources accessible as needed” [23]. Other security concerns that apply to CC are user authentication and authorisation, multi-tenancy secure sharing of resources among devices, and system vulnerabilities [24]. The above discussion leads to: • Proposition #4: A cloud security framework ought to uphold the principles of CIA—confidentiality, integrity, and availability.
3.3 Cloud Security Alliance The Cloud Security Alliance (CSA) is a leading organisation dedicated to defining and raising awareness of best practices to facilitate a secure cloud computing environment [25]. The CSA cautions that to prevent data loss and like threats, it is critical for organisations to establish the correct cloud security mechanisms and policies, yet it may not be possible for organisations or a CSP to eliminate all security threats [26]. Some best practices to secure data in the cloud include encrypting data at rest, in use, and in motion; deploying two-factor authentication (2FA) or multifactor authentication to verify user identity; isolate cloud data backup; and log and monitor all incidents of data access [26], leading to: • Proposition #5: Among others, data encryption is an important mechanism to facilitate cloud security.
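To make the encryption-at-rest practice behind Proposition #5 concrete, the following minimal sketch uses the third-party Python package cryptography (Fernet symmetric encryption). It is an illustration only; in practice the key would be issued and held by a key-management service rather than generated locally.

from cryptography.fernet import Fernet

# Key generation would normally be handled by a key-management service;
# here it is generated locally for illustration only.
key = Fernet.generate_key()
f = Fernet(key)

plaintext = b"any-sensitive-payload-destined-for-cloud-storage"
token = f.encrypt(plaintext)     # ciphertext stored in the cloud (data at rest)
restored = f.decrypt(token)      # only holders of the key can recover the data
assert restored == plaintext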
4 Cloud Security Frameworks This research highlights three prominent cloud security frameworks to evaluate how each addresses the challenges of cloud security and privacy. The first one presented in Fig. 4 is based on trust as it facilitates collaboration between cloud service providers and cloud service integrators. Figure 5 presents a cloud security model centred around personal cloud. It provides for a secure connection for accessing cloud services from
personal devices. The third security framework presented in Fig. 6 is based on cloud security for a government system, in this case, Saudi Arabia's healthcare system. It focuses on refining access controls by introducing a multifactor authentication system. As presented in Table 1, Sect. 4.1 elaborates on the criteria for selecting these frameworks.
Fig. 4 Service-collaboration orientated security framework [3]
Fig. 5 Personal cloud security framework [27]
Fig. 6 G-cloud-based framework [28]
Table 1 Security challenges versus deployed solutions [3]
Security challenge                          Deployed solution/module
Authentication and identity management      IDM module
Access control                              AC Policies
Policy integration                          Policy engineering mechanisms
Service management                          Collaboration of SLAs
Trust management                            Developed trust management approach
Heterogeneity management                    Ontology-based approach
4.1 Service-Collaboration Orientated Security Framework Like the CC security model in Fig. 1, the service-collaboration orientated security framework in Fig. 4 accommodates all three cloud service models, namely SaaS,
PaaS, and IaaS. This framework addresses six key security challenges through the module/solution indicated in Table 1. The service-collaboration orientated security framework is a specialised cloud security framework based on trust between a service integrator and a service provider. The service integrator carries out the process of collaboration between different service providers aimed at discovering new services as requested. Each service integrator consists of three management components: service, trust, and security. These are responsible for the establishment and maintenance of trust (involving negotiations) between the local provider domain and between the providers and the users. The goal is, therefore, to provide services and create global policies based on mutual trust. The service integrator first discovers services, then carries out negotiations, followed by integrating those services to form groups of collaborating services, and finally provides these to users. In a service provider, the security management components facilitate privacy, and the data encryption module ensures encryption of outsourced data. Service and trust management components in this model allow for systematic service provision and address trust needs between these services. The security management component is employed into the service integrator and service provider unit of this framework. Such component under service integrator employs access control and privacy and data encryption security modules on top of the identification and IDM units. The framework is based on trust between the service integrator and service providers, and trust is achieved through trust modules and policies as indicated. A disadvantage of the trust framework is that data security is overlooked in the service integrator processes. It appears, therefore, that the model “tries so hard to establish trust” that strict security aspects are compromised, creating a loophole for unauthorised user access. Presumably, a privacy and data encryption security module can be incorporated to ensure that data remain encrypted and protected when it is provisioned to the user. These discussions lead to: • Proposition #6: A cloud security framework ought to provide for service and security management.
4.2 Personal Cloud Computing Framework The personal cloud security framework in Fig. 5 addresses cloud security for the personal cloud, a hybrid (combination of private and public cloud) deployment model. It provides a secure connection for accessing cloud services from personal devices, for example, smart phones, notebooks, web browsers, and host applications. A secure connection is achieved by deploying a single sign-on access token (SSAT) and third-party certification access between the client and end-user service portal. Access control uses eXtensible Access Control Markup Language (XACML) and Kosovo Independent Media Program (KIMP) protocols to share user profile information among components in the portal. The security component provides
protection for access control, security policies, and key management components against security threats [27]. A personal cloud server is characterised by privacy aspects among some of its attributes; hence, the framework presents the security control component as protection against security threats. The access control, security policies, and key management sub-components within the security control module are deployed, respectively, together relevant protocols to facilitate privacy. As a disadvantage, the framework does not necessarily focus on protecting user data, but merely provides for a secure connection between the user and a cloud service. The framework may, therefore, be viewed as a path protector and not necessarily a data- or information protector. We note: • Proposition #7: A cloud security control unit ought to embed functionality for a virtual private network (VPN) and service configuration, based on the use of sign-on access tokens (SATs).
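The single sign-on access token (SSAT) idea referenced in Proposition #7 can be illustrated with a minimal HMAC-signed token; the field names, signing key, and lifetime below are assumptions for illustration and are not the protocol specified in [27].

import base64, hashlib, hmac, json, time

SIGNING_KEY = b"portal-signing-key"   # illustrative; held by the end-user service portal

def issue_token(user_id, lifetime=300):
    claims = {"sub": user_id, "exp": int(time.time()) + lifetime}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + sig

def verify_token(token):
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None                    # signature tampered with
    claims = json.loads(base64.urlsafe_b64decode(body.encode()))
    return claims if claims["exp"] > time.time() else None

token = issue_token("alice")
print(verify_token(token))

A token of this kind protects the path between the client and the portal (proving who issued the session), which mirrors the observation above that the framework acts as a path protector rather than a data protector.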
4.3 Secure G-Cloud-Based Framework for Government Healthcare Services The G-cloud-based framework in Fig. 6 addresses security challenges in an egovernment CC platform by deploying multi-authority ciphertext policy to enforce access control policies. A trust key authority is utilised to allow for multifactor authentication as a measure to facilitate a secure connection between trusted authorities. The framework consists of four important and interacting entities to perform their individual tasks. These entities described as the patient (main entity of the framework), healthcare provider (e.g. nurse, doctor, pharmacists, etc.), the trusted authority (e.g. government), and the e-government cloud-based electronic health records (EHRs) are the backbone of the framework consisting of cloud services described below. First service responsible for storing encrypted EHRs is made up of a data repository and computing resources. Second service of which the main responsibility is to generate access policies and provide for keys management. Third service is responsible for hosting a secure online website, accessible only by authorised stakeholders. The framework satisfies the following security requirements: data privacy, access control, efficiency, and scalability. In Fig. 6, framework is a specialised framework; it, therefore, does not cater for all security concerns, yet it elicits: • Proposition #8: A trusted key authority via trust management is important for a cloud security framework to adhere to. • Proposition #9: Access policies should underlie the design of a cloud security framework.
Table 2 Comparing existing frameworks against security threats
Cloud security aspect                                        Framework [3]   Framework [27]   Framework [28]
Negotiation of service delivery                              ✓               ✓                X
Monitoring service delivery                                  ✓               ✓                X
Trust management                                             ✓               X                ✓
Data security and privacy                                    ✓               ✓                ✓
Verification and validation (access control techniques)      ✓               ✓                ✓
Privilege control (integrity and confidentiality)            X               X                ✓
Data protection (encryption techniques, etc.)                ✓               ✓                ✓
Attacks detection (attack mitigation techniques)             X               X                X
Table 2 depicts the comparison of the three frameworks discussed above. The last line in Table 2 emphasises the importance of: • Proposition #10: A threat detection manager should pre-empt possible attacks in a cloud security framework (refer also Proposition #3).
5 Proposed Cloud Security Framework On the strength of the analyses of the frameworks in Sect. 4, coupled with the ten propositions, we propose a new framework aimed at combining aspects of the generic cloud computing security model and the three specialised frameworks, mitigating the challenges with the frameworks as indicated. Components and modules from each framework that were identified to address data security challenges in the cloud are incorporated in the definition of the new framework in Fig. 7. Our ten propositions, Proposition #i for i ∈ {1, 2, … 10}, are indicated by Pi. Our framework addresses privacy and security challenges from the moment a user submits a request for a service to a cloud service provider until a request is rendered. The user authenticates from a client, e.g. computer, smart phone, etc., by using a third-party certificate authority (CA) to issue a service token. At this point, the user gains access to a single CSP via a portal. Key generation and validation are based on NIST recommendations [28], as follows. The trusted third party (TTP), such as a certificate authority (CA), generates a key pair. It ensures integrity protection for the key pair's source authentication and confidentiality of the private key during transfer to the owner. Lastly, the owner receives validation of key pairs and assurance of having received a correct private key. Validation using various key pair validation methods for data consistency and security can be renewed at any time, depending on security requirements.
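A minimal sketch of the key-pair generation and validation step described above, using the third-party Python cryptography package; the pairwise sign/verify consistency check is one possible key-pair validation method of the kind alluded to, and the key size is an illustrative choice rather than a value mandated by [28].

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The trusted third party (e.g. a CA) generates the key pair.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# Pairwise-consistency validation: a signature made with the private key
# must verify under the corresponding public key.
sample = b"key-pair validation probe"
signature = private_key.sign(sample, padding.PKCS1v15(), hashes.SHA256())
public_key.verify(signature, sample, padding.PKCS1v15(), hashes.SHA256())  # raises if the pair is inconsistent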
Fig. 7 Proposed cloud security framework (synthesised by researchers)
Our cloud security framework also caters for the three cloud service models—SaaS, PaaS, and IaaS at both cloud service provider and cloud service integrator levels. These form a collaborated set of services as the need arises from the VPN manager component. Access control positioned in the security control module provides secure access control within the portal, forming a virtual private network (VPN) between a user and the cloud service provider. The security management component provides for security protocol needs between the user and the cloud communication, while trust management provides for a trust key management role between cloud services as they are delivered to a cloud user through the trust manager. Both the cloud service provider and cloud
service integrator are composed of service management, deployed directly into the portal to facilitate effective cloud interoperability and safe cloud orchestration, in collaboration with service configuration in the security control module. The service management component is accessible via the service gateway using a service broker component deployed to negotiate the collaboration of cloud services. Our framework adheres to the guidelines in the propositions; for example, a new facility, the threat detection manager, serves as an additional security control measure to mitigate attacks and threats, supported by security policies via the security and privacy requirements. Table 3 shows that the new framework addresses all the cloud security aspects identified before.
Table 3 Proposed security framework versus security threats
Cloud security aspect                                        Proposed cloud security framework
Negotiation of service delivery                              ✓
Monitoring service delivery                                  ✓
Trust management                                             ✓
Data security and privacy                                    ✓
Verification and validation (access control techniques)      ✓
Privilege control (integrity and confidentiality)            ✓
Data protection (encryption techniques, etc.)                ✓
Attacks detection (attack mitigation techniques)             ✓
6 Conclusions and Future Work This paper introduced cloud security with reference to the CIA triad of information security. While cloud security has been around since the early 2000s, challenges with respect to data security remain. Four cloud security frameworks—a generic framework, a service-collaboration framework (based on trust), a personal framework, and a healthcare framework—were analysed, and several propositions were synthesised from these. Possible shortcomings in these frameworks were identified. On the strength of these observations, a comprehensive cloud security framework was defined, aimed at eliminating possible shortcomings in the existing frameworks and incorporating necessary aspects by providing some degree of protection at all levels. A discussion of the new framework confirmed that it incorporates the strengths of the existing frameworks and incorporates the guidelines proposed in the ten propositions. Future work in this area may be pursued by exercising our framework using formal methods (FMs)—discrete mathematics and formal logic—to verify the functionality
of individual components. FMs are generally considered to play a critical role in the development of any system where security and reliability are crucial [29] and may assist in determining whether the system designs are accurate and function as intended [30]. Acknowledgments This work is based on the research supported in part by the National Research Foundation of South Africa (Grant Number 119210).
References 1. Subashini S, Kavitha V (2011) A survey on security issues in service delivery models of cloud computing. J Netw Comput Appl 34(1):1–11 2. Weinhardt C, Anandasivam A, Blau B, Stöer J (2009) Business models in the service world. IT Prof 11:28–33 3. Takabi H, Joshi JBD, Ahn GJ (2010) SecureCloud: towards a comprehensive security framework for cloud computing environments. In: Proceedings—international computer software and applications conference, pp 393–398 4. Tawalbeh L, Al-Qassas RS, Darwazeh NS, Jararweh Y, Aldosari F (2015) Secure and efficient cloud computing framework. In: Proceedings—2015 international conference on cloud and autonomic computing, ICCAC 2015 5. Alshammari A, Alhaidari S, Alharbi A, Zohdy M (2017) Security threats and challenges in cloud computing. In: Proc. - 4th IEEE Int. Conf. Cyber Secur. Cloud Comput. CSCloud 2017 3rd IEEE Int. Conf. Scalable Smart Cloud, SSC 2017, pp 46–51 6. Sharma SK, Al-Badi AH, Govindaluri SM, Al-Kharusi MH (2016) Predicting motivators of cloud computing adoption: a developing country perspective. Comput Human Behav 62:61–69 7. Mell P, Grance T (2011) The NIST definition of cloud computing recommendations of the National Institute of Standards and Technology. Natl Inst. Stand Technol Inf Technol Lab 145:7 8. Snaith B, Hardy M, Walker A (2011) Emergency ultrasound in the prehospital setting: the impact of environment on examination outcomes. Emerg Med J 28:1063–1065 9. Jain A, Mahajan N (2017) Introduction to cloud computing. The Cloud DBA-Oracle 3–10 10. Zhou M, Zhang R, Zeng D, Qian W (2010) Services in the cloud computing era: a survey. In: 2010 4th international universal communication symposium, IUCS 2010—proceedings. IEEE, pp 40–46 11. Chou D. Cloud Service Models (IaaS, PaaS, SaaS) Diagram. https://dachou.github.io/2018/09/ 28/cloud-service-models.html, last accessed 2021/07/06 12. Lele A (2019) Cloud computing. In: Disruptive technologies for the militaries and security. Smart innovation, systems and technologies, vol 132. Springer, Singapore. https://doi.org/10. 1007/978-981-13-3384-2_10 13. Dasgupta D, Naseem D (2014) A framework for compliance and security coverage estimation for cloud services. Cloud Technol 543–565. https://doi.org/10.4018/978-1-4666-5788-5.ch005 14. Jaya Chandrareddy B, Uma Mahesh G, Bandi S (2012) Cloud zones: security and privacy issues in cloud computing. Asian J Inf Technol 11:83–93 15. Chang V, Ramachandran M, Yao Y, Kuo Y-H, Li C-S (2016) A resiliency framework for an enterprise cloud. Int J Inf Manage 36:155–166 16. Kim H, Lee H, Kim W, Kim Y (2009) A trust evaluation model for cloud computing. In: Communications in computer and information science. Springer, Berlin Heidelberg, pp 184– 192 17. Bai Y, Policarpio S (2011) On cloud computing security. Commun Comput Inf Sci 162 CCIS 388–396
18. Halabi T, Bellaiche M (2017) Towards quantification and evaluation of security of Cloud Service Providers. J Inf Secur Appl 33:55–65 19. Date A, Datar D (2014) A multi-level security framework for cloud computing. Int J Comput Sci Mobile Comput 3(4):528–534 20. Ficco M, Palmieri F, Castiglione A (2015) Modeling security requirements for cloud-based system development. Concurr Comput 2107–2124 21. Wang R (2017) Research on data security technology based on cloud storage. In: Procedia engineering, vol 174, pp 1340–1355 22. Kumar PR, Raj PH, Jelciana P (2018) Exploring data security issues and solutions in cloud computing. Procedia Comput Sci 125:691–697 23. Khazanchi D (2009) Information availability. In: Handbook of research on information security and assurance. IGI Global, pp 230–239 24. Bhayana M, Kriplani K, Jha P, Sharma S (2020) An overview on cloud computing. Int J Eng Appl Sci Technol 04:290–292 25. Cloud Security Alliance: Homepage. https://cloudsecurityalliance.org/, last accessed 2021/04/29 26. What is Cloud Security? https://searchcloudsecurity.techtarget.com/definition/cloud-security, last accessed 2021/04/27 27. Na S-H, Park J-Y, Huh E-N (2010) Personal cloud computing security framework. In: 2010 IEEE Asia-pacific serv. comput. conf., pp 671–675. https://doi.org/10.1109/APSCC.2010.117 28. Baker E, Chen L, Roginsky A, Vassilev A, Davis R, Simon S (2019) Recommendation for pairwise key establishment using integer factorisation cryptography. NIST Special Publication (SP) 800–56B Rev. 2 29. Sharaf S, Shilbayeh NF (2019) A secure G-cloud-based framework for government healthcare services. IEEE Access 7:37876–37882. https://doi.org/10.1109/ACCESS.2019.2906131 30. Peng Y, Jones IW, Greenstreet MR (2016) Finding glitches using formal methods. In: Proceedings—International symposium on asynchronous circuits and systems. https://doi.org/10.1109/ ASYNC.2016.12
Kinematics and Control of a 3-DOF Industrial Manipulator Robot Claudia Reyes Rivas, María Brox Jiménez, Andrés Gersnoviez Milla, Héctor René Vega Carrillo, Víctor Martín Hernández Dávila, Francisco Eneldo López Monteagudo, and Manuel Agustín Ortiz López
Abstract This article presents the analysis of the kinematics and dynamics of a manipulator robot with three rotational degrees of freedom. The main objective is to obtain the direct and inverse kinematic models of the robot, as well as the equations that describe the motion of two pairs: τ1 and τ2, through the dynamic model and the development of the Lagrange equations. For this reason, this document shows the mathematical analysis of both models. Once the equations representing the robot have been described, the PD + controller calculations are described, as well as the results obtained by simulating the manipulator equations, using the VisSim 6.0 software, with which the kinematic models were programmed. To observe the importance of this analysis, a predefined linear trajectory was designed. Keywords Robot · Kinematics · Articular dynamic modeling · PD + control · Simulation
1 Introduction Mechanical arms, also known as manipulator robots, are used in various areas such as the automotive industry, metallurgy, medicine, among others, where they perform tasks of cutting, welding, painting, assembly, palletizing, coating, surgeries, etc. [1]. The movement of these industrial robots is based on the solution of their kinematic model that describes the movement in coordinate space, as well as the dynamic model that describes the forces that produce their movement [1, 2].
C. R. Rivas (B) Universidad de Córdoba, España y Universidad Autónoma de Zacatecas, Zacatecas, Mexico e-mail: [email protected]; [email protected] M. B. Jiménez · A. G. Milla · M. A. O. López Universidad de Córdoba, Córdoba, Spain H. R. V. Carrillo · V. M. H. Dávila · F. E. L. Monteagudo Universidad Autónoma de Zacatecas, Zacatecas, Mexico © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_19
Among the main problems that arise in the operation of manipulative robots, it is determining the trajectories that each of their joints must follow to perform a specific task [3]. Usually, such trajectories are described by continuous mathematical expressions concerning time, which must be previously calculated and programmed in the corresponding control system [4]. This procedure is complicated and tedious, especially in applications where the robot must perform very elaborate movements. The main objective of this document is to carry out the complete analysis of a 3gdl robot, which consists of obtaining the direct kinematic model and inverse kinematics, as well as the articular dynamic model of the manipulator; it is worth mentioning that only the two main joints were considered for this last analysis, in addition to designing a PD + controller that will allow us to control the movements of the manipulator robot so that it executes a desired linear trajectory. The VisSim 6.0 simulation software was used to observe the behavior of the robot. To obtain the mathematical modeling of the robot, it must be considered that robotic systems are nonlinear, since the force required to maintain the link in a position is not the same, nor it is linearly proportional to the force required to maintain the link in another position [5]. For the dynamic model of the robot, its kinematics and the forces acting on it are taken into account. These forces can be defined by Newton’s laws or by Lagrange’s equations of motion [6]. For the design of the controller, it is necessary to take into consideration the dynamics of the robot [6]. However, the dynamic model of the robot usually consists of nonlinear terms that are mainly due to gravity, friction, and centrifugal forces. Therefore, these nonlinearities must be compensated for in the control design [7]. Programming of robot manipulators is usually done in terms of Cartesian coordinates of the workspace, falling on the controller the task of translating said specification into joint or motor variables, which are those that govern the movements of the robot. Some advanced programming languages even contemplate the possibility of specifying certain movements of the robot in terms of sensory variables. Therefore, robot control depends critically on the availability of functions that allow it to pass from physical space or sensory space to the space of joint or motor variables [7, 8].
1.1 Method Description Analysis of the mathematical model of the kinematics of the manipulator robot The direct and inverse kinematics of the industrial manipulator robot with three degrees of freedom were analyzed, which is connected through three rotational joints as shown in Fig. 1. The links have constant lengths L 1 , L 2 , L 3 , respectively. All the links have rotational movements called q1 , q2 , q3 , and the rotation of the joints can be up to 360°. The robot parameters are shown in Tables 1, 2, and 3.
Fig. 1 Robot manipulator with three degrees of freedom with rotational links

Table 1 Parameters of the Robot joint 1
Parameters of Joint 1          Symbol   Value
Mass 1                         m1       20.5 [Kg]
Joint length 1                 L1       0.35 [m]
Length to center of mass 1     Lc1      0.09 [m]
Joint inertia 1                I1       1.436 [Kg m2]
Viscous friction 1             Fv1      2.198 [Nm s/°]
Coulomb friction 1             Fc1      6.21 [Nm]

Table 2 Parameters of the Robot joint 2
Parameters of Joint 2          Symbol   Value
Mass 2                         m2       4.56 [Kg]
Joint length 2                 L2       0.32 [m]
Length to center of mass 2     Lc2      0.065 [m]
Joint inertia 2                I2       0.095 [Kg m2]
Viscous friction 2             Fv2      0.197 [Nm s/°]
Coulomb friction 2             Fc2      1.823 [Nm]

Table 3 Parameters of the Robot joint 3
Parameters of Joint 3          Symbol   Value
Mass 3                         m3       3.88 [Kg]
Joint length 3                 L3       0.37 [m]
Length to center of mass 3     Lc3      0.048 [m]
Joint inertia 3                I3       0.053 [Kg m2]
Viscous friction 3             Fv3      0.151 [Nm s/°]
Coulomb friction 3             Fc3      1.734 [Nm]
Direct kinematics modeling The direct kinematic model describes the relationship between the joint position q and the position and orientation in the Cartesian coordinate reference frame (x, y, z):
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} [L_2\sin(q_2)+L_3\sin(q_2+q_3)]\cos(q_1) \\ [L_2\sin(q_2)+L_3\sin(q_2+q_3)]\sin(q_1) \\ L_1-L_2\cos(q_2)-L_3\cos(q_2+q_3) \end{bmatrix} \quad (1)$$
where q_1, q_2, q_3 are the angular positions in degrees of Joints 1, 2, and 3, respectively. Differentiating Eq. (1) gives the Cartesian velocity vector:
$$\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{z} \end{bmatrix} = \begin{bmatrix} -[L_2\sin(q_2)+L_3\sin(q_2+q_3)]\sin(q_1)\,\dot{q}_1 + \cos(q_1)[L_2\cos(q_2)\dot{q}_2 + L_3\cos(q_2+q_3)(\dot{q}_2+\dot{q}_3)] \\ [L_2\sin(q_2)+L_3\sin(q_2+q_3)]\cos(q_1)\,\dot{q}_1 + \sin(q_1)[L_2\cos(q_2)\dot{q}_2 + L_3\cos(q_2+q_3)(\dot{q}_2+\dot{q}_3)] \\ L_2\sin(q_2)\dot{q}_2 + L_3\sin(q_2+q_3)(\dot{q}_2+\dot{q}_3) \end{bmatrix} \quad (2)$$
From the velocity vector, the Jacobian relation is:
$$\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{z} \end{bmatrix} =
\begin{bmatrix}
-[L_2\sin(q_2)+L_3\sin(q_2+q_3)]\sin(q_1) & [L_2\cos(q_2)+L_3\cos(q_2+q_3)]\cos(q_1) & L_3\cos(q_1)\cos(q_2+q_3) \\
[L_2\sin(q_2)+L_3\sin(q_2+q_3)]\cos(q_1) & [L_2\cos(q_2)+L_3\cos(q_2+q_3)]\sin(q_1) & L_3\sin(q_1)\cos(q_2+q_3) \\
0 & L_2\sin(q_2)+L_3\sin(q_2+q_3) & L_3\sin(q_2+q_3)
\end{bmatrix}
\begin{bmatrix} \dot{q}_1 \\ \dot{q}_2 \\ \dot{q}_3 \end{bmatrix} \quad (3)$$
Inverse kinematics The inverse kinematic model is more widely used in industrial applications since it allows obtaining the joint coordinates from the working coordinates. To complement the analysis, the inverse kinematic model was obtained, which is precisely the inverse relationship of the direct kinematic model:
$$q_1 = \tan^{-1}\left(\frac{y}{x}\right) \quad (4)$$
$$q_2 = \tan^{-1}\left(\frac{\sqrt{x^2+y^2}}{L_1-z}\right) - \tan^{-1}\left(\frac{L_3\sin(q_3)}{L_2+L_3\cos(q_3)}\right) \quad (5)$$
$$q_3 = \cos^{-1}\left(\frac{(L_1-z)^2+x^2+y^2-L_2^2-L_3^2}{2L_2L_3}\right) \quad (6)$$
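A small numerical sketch of Eqs. (1) and (4)-(6) is given below; it is useful for checking that the inverse model recovers the joint angles produced by the direct model. The link lengths come from Tables 1-3, and angles are handled in radians here, whereas the paper states joint positions in degrees.

from math import sin, cos, atan2, acos, sqrt

L1, L2, L3 = 0.35, 0.32, 0.37   # link lengths from Tables 1-3 [m]

def forward(q1, q2, q3):
    r = L2 * sin(q2) + L3 * sin(q2 + q3)      # radial reach in the x-y plane
    x = r * cos(q1)
    y = r * sin(q1)
    z = L1 - L2 * cos(q2) - L3 * cos(q2 + q3)
    return x, y, z

def inverse(x, y, z):
    q1 = atan2(y, x)                                                        # Eq. (4)
    d = ((L1 - z) ** 2 + x ** 2 + y ** 2 - L2 ** 2 - L3 ** 2) / (2 * L2 * L3)
    q3 = acos(max(-1.0, min(1.0, d)))                                       # Eq. (6), clamped for numerical safety
    q2 = atan2(sqrt(x ** 2 + y ** 2), L1 - z) - atan2(L3 * sin(q3), L2 + L3 * cos(q3))   # Eq. (5)
    return q1, q2, q3

q = (0.3, 0.8, 0.5)
print(inverse(*forward(*q)))   # reproduces q (up to the elbow configuration)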
Joint dynamic model The joint dynamic model is formed from the kinetic energy, the potential energy, the Lagrangian, and the application of the Lagrange equations of motion. It is worth mentioning that for the analysis of the dynamic model only the first two links were considered; the third link remains in its initial position, due to the requirements of the trajectory that the manipulator robot will execute.
$$Lgn(q(t),\dot{q}(t)) = K_1(q(t),\dot{q}(t)) + K_2(q(t),\dot{q}(t)) - U_1(q(t)) - U_2(q(t)) \quad (7)$$
where Lgn(q(t), \dot{q}(t)) is the Lagrangian, K(q(t), \dot{q}(t)) is the kinetic energy, and U(q(t)) is the potential energy.
$$Lgn(q(t),\dot{q}(t)) = \frac{1}{2}m_1Lc_1^2\dot{q}_1^2 + \frac{1}{2}I_1\dot{q}_1^2 + \frac{1}{2}m_2\left[L_1^2\dot{q}_1^2 + Lc_2^2(\dot{q}_1+\dot{q}_2)^2 + 2L_1Lc_2(\dot{q}_1+\dot{q}_2)\dot{q}_1\cos(q_2)\right] \quad (8)$$
From the development of the Lagrange equations, the equations of motion for the torques τ1 and τ2 of Actuators 1 and 2 are obtained (only these two joints were considered):
$$\tau_1 = \left[m_1Lc_1^2 + m_2L_1^2 + m_2Lc_2^2 + 2m_2L_1Lc_2\cos(q_2) + I_1 + I_2\right]\ddot{q}_1 + \left[I_2 + m_2Lc_2^2 + m_2L_1Lc_2\cos(q_2)\right]\ddot{q}_2 - m_2L_1Lc_2\sin(q_2)\dot{q}_1\dot{q}_2 - m_2L_1Lc_2\sin(q_2)(\dot{q}_1+\dot{q}_2)\dot{q}_2 + m_1gLc_1\sin(q_1) + m_2gL_1\sin(q_1) + m_2gLc_2\sin(q_1+q_2) + f_{c1}\tanh(\dot{q}_1) + f_{v1}\dot{q}_1 \quad (9)$$
$$\tau_2 = \left[m_2Lc_2^2 + m_2L_1Lc_2\cos(q_2) + I_2\right]\ddot{q}_1 + \left[m_2Lc_2^2 + I_2\right]\ddot{q}_2 + m_2L_1Lc_2\sin(q_2)\dot{q}_1^2 + m_2gLc_2\sin(q_1+q_2) + f_{c2}\tanh(\dot{q}_2) + f_{v2}\dot{q}_2 \quad (10)$$
PD controller + compensation Due to the nonlinearities that occur in the dynamic model, caused by friction, gravity, and centrifugal forces, it is necessary to apply a PD+ controller combined with compensators that counteract the nonlinearities appearing in the equations of motion for τ1 and τ2. The design of the controller requires the complete analysis of the manipulator model, as well as the design of the desired trajectory. The general equation of the control law is
$$\tau = M(q)\ddot{q} + C(q,\dot{q})\dot{q} + g(q) + f(\dot{q}) + K_p(q_d - q) + K_v(\dot{q}_d - \dot{q}) \quad (11)$$
where M(q)q¨ is the inertia matrix,C(q, q) ˙ q˙ is the matrix of centrifugal and Coriolis forces, g(q) is the vector of gravities, f (q) ˙ is the friction vector, K p is the proportional gain of the controller, K v is the derivative gain of the controller, q is the angular position, q˙ is the angular velocity, qd is the desired angular position, and q˙d is the desired angular velocity. Nonlinearities of τ 1 were grouped, and the control law was applied, so q˙d1 = 0:
$$\left(m_1Lc_1^2 + m_2L_1^2 + m_2Lc_2^2 + I_1 + I_2\right)\ddot{q}_1 + f_{v1}\dot{q}_1 + K_{p1}q_1 + K_{v1}\dot{q}_1 = K_{p1}q_{d1} \quad (12)$$
The Laplace transform was applied to the previous equation to obtain the following transfer function:
$$\frac{Q(s)}{Q_d(s)} = \frac{K_{p1}}{\left(m_1Lc_1^2 + m_2L_1^2 + m_2Lc_2^2 + I_1 + I_2\right)s^2 + (f_{v1}+K_{v1})s + K_{p1}}$$
(13)
A PD control was designed with the following characteristics: overshoot Mp = 2%, damping coefficient ζ = 0.779, and peak time tp = 1 s (a numerical check of the resulting gains is sketched below).
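As a rough numerical check of the gain expressions derived in the following equations, the sketch below computes ζ, ωn, and the gains for both joints from the design specifications and the parameter values of Tables 1 and 2; differences from the gains reported in the paper may arise from rounding or from the exact inertia terms included in a and b.

from math import pi, log, sqrt

# Joint parameters from Tables 1 and 2
m1, Lc1, I1, fv1 = 20.5, 0.09, 1.436, 2.198
m2, L1, Lc2, I2, fv2 = 4.56, 0.35, 0.065, 0.095, 0.197

Mp, tp = 0.02, 1.0                               # 2 % overshoot, 1 s peak time
zeta = -log(Mp) / sqrt(pi**2 + log(Mp)**2)       # damping ratio (about 0.779)
wn = pi / (tp * sqrt(1 - zeta**2))               # natural frequency

a = m1*Lc1**2 + m2*L1**2 + m2*Lc2**2 + I1 + I2   # effective inertia, joint 1
b = m2*Lc2**2 + I2                               # effective inertia, joint 2

Kp1, Kv1 = wn**2 * a, 2*a*zeta*wn - fv1
Kp2, Kv2 = wn**2 * b, 2*b*zeta*wn - fv2
print(Kp1, Kv1, Kp2, Kv2)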
$$\frac{Q(s)}{Q_d(s)} = \frac{K_{p1}/a}{s^2 + \dfrac{f_{v1}+K_{v1}}{a}\,s + \dfrac{K_{p1}}{a}}$$
(14)
where $a = m_1Lc_1^2 + m_2L_1^2 + m_2Lc_2^2 + I_1 + I_2$. The general equation used to represent second-order systems is:
(15)
where K e is the static gain of the system, ωn is the natural frequency, and ζ is the damping coefficient. Eq. (14) is equated with the general equation for second-order systems (15), to obtain the values of the corresponding controllers’ gains of K p1 , K v1 . K v1 = (2aζ ωn ) − f v1 = 16.06
(16)
K p1 = ωn2 (a) = 59.02
(17)
For the second actuator, the same procedure was carried out to obtain the values of K p2 and K v2 . The control law was applied and made getting q2d = 0. m 2 Lc22 + I2 q¨2 + f v2 q˙2 + K p2 q2 + K v2 q˙2 = K p2 qd2
(18)
Fig. 2 Path executed by the robot manipulator with the PD regulator + compensation in the y–z plane
The Laplace transform was applied to Eq. (18) to obtain the following transfer function:
$$\frac{Q(s)}{Q_d(s)} = \frac{K_{p2}}{\left(m_2Lc_2^2 + I_2\right)s^2 + (f_{v2}+K_{v2})s + K_{p2}}$$
(19)
A PD control was designed with the same specifications as for the first link, obtaining the following equation:
$$\frac{Q(s)}{Q_d(s)} = \frac{K_{p2}/b}{s^2 + \dfrac{f_{v2}+K_{v2}}{b}\,s + \dfrac{K_{p2}}{b}}$$
(20)
where b = m 2 Lc22 + I2 . Equation (20) is equated with the general equation for second-order systems (15), to obtain K p2 and K v2 . K v2 = (2bζ ωn ) − f v2 = 0.6203
(21)
K p2 = ωn2 (b) = 2.558
(22)
Solid line path generation To carry out the corresponding tests and verify that the controller complies with the design conditions, for this purpose, a continuous line trajectory in the shape of a dolphin was programmed in the (y, z) plane, to check the operation of the PD +
controller, using VisSim 6.0 software. In Fig. 2, it is observed that how the control makes the tool pass through the said path. It should be mentioned that initially the robot was located in the home position (i.e., the links of the robot are located on the negative z-axis, that is, q1 = 0, q2 = 0, and q3 = 0); after finishing the tracking of the trajectory, the robot must return to the home position. Results After the mathematical analysis was performed and the robot manipulator models were obtained, the PD control + compensation was designed and applied and a continuous line trajectory was programmed to verify the operation of the controller. In Fig. 2, the result of the simulation in real time is shown, in which it is observed that the manipulator executes the trajectory in the form of a dolphin efficiently, and the expectations programmed with the PD + compensation control are fulfilled; that is, the manipulator robot executes the trajectory without any problem and with great accuracy, and once the routine is finished, the robot returns to the home position. Figure 3 is also attached, which belongs to the graph of desired positions and which executes the manipulator robot according to the programmed trajectory, to make the dolphin figure. That is, the figure shows the displacement in degrees of Actuators 1 and 2, which in turn allow the movement of the two links together to draw the trajectory, in turn it is observed that the conditions of the controller design are met, previously described in the document. To check that the controller meets the design conditions: Mp = 2% and a peak time tp = 1 s, the position graph was zoomed in, the first position movement was selected, and the final value was measured in a steady state. In Fig. 4, the final measured steady-state value of 127.24° is shown.
Fig. 3 Plot of robot positions concerning time
Fig. 4 Plot final value in steady state of the position of the robot
Fig. 5 Plot of the maximum overshoot of the robot position
The steady-state value is multiplied by 2% (which is the value of the Mp overshoot), resulting in the following: 127.24 ∗ 0.02 = 2.54
(23)
After, the result of Eq. (23) is added the value in steady state to know the value of the maximum overshoot: 127.24 + 2.54 = 129.78◦
(24)
In Fig. 5, the value of the maximum overshoot measured in the graph is observed, which is 129.76°, having an error of 0.02° and a measured controller efficiency of 99.98%. This can be improved by increasing the simulation sampling time. It can also be verified that it complies with the peak time of one second. Conclusions At present, manufacturers of manipulative robots are betting on the design of simple, inexpensive, and simple control laws in their implementation, which are regularly controlled by the error signals obtained from the position or speed sensors and which guarantee accuracy and a good execution in the movements of the robot. Economic reasons are of great importance when deciding whether to calculate and apply complex control laws, since many man–machine hours are required and also generate excessive expenses in the acquisition of specialized computer equipment, necessary for the implementation of the control algorithms. For the above, the PD + compensation controller is an excellent option for the motion control of manipulative robots, since it is a simple algorithm, easy to implement, and that ensures a global control system. It has the main characteristic that it compensates for the nonlinearities that occur in the dynamic model of the manipulator and that is a great advantage in relation to other control methodologies. It is worth mentioning that kinematic and dynamic modeling are of great importance to understand the behavior of the robot, in addition to being necessary to design a PD control system + motion compensation for tracking continuous line and point-to-point trajectories. The results showed that the dynamic model that was obtained correctly represents the behavior of the robot’s dynamics, and by programming the direct and inverse kinematics and the PD + compensation control in the software, it is possible for the robot to execute the programmed continuous trajectory with great efficiency. It was demonstrated that the design of the PD + compensation controller is very simple, as well as that it is easy to implement in the VisSim program, using matrix blocks that contain the information of the robot model and its controller. It was also evidenced that the design characteristics are met both in overshoot and in peak time, so it is considered that the PD + control is an excellent option for manipulative robots. It is worth mentioning that the controller gains must be calculated so that they do not exceed the maximum values allowed in the torque of the joint motors. This is verified by reviewing the parameters of the motors through the manufacturer’s datasheet, and with the help of simulation, also as a recommendation, it should be verified that the programmed path does not exceed the robot’s working space. The simulation of the behavior of robots using the VisSim 6.0 program represents a great advantage since the programming, the implementation of the system model, the control, and the adjustment of gains are carried out very easily. The advantage of real-time simulation allows the designer to concentrate on aspects of mathematical modeling and design, rather than thinking about programming details. Any modification can also be made before it is built, which is reflected in the fact that its proper functioning is guaranteed, as well as in a better optimization of material and economic resources.
References 1. Boudy GS, León MJ, Estrada RY (2011) System for the intelligent control of a robot manipulator. Revista Ciencias Técnicas Agropecuarias 20(2) 2. Craig Jonh J (2006) Robotics3rd edn. Pearson Prentice-Hall 3. González VS, Moreno VJ (2013) Timescale separation of a class of robust PD-type tracking controllers for robot manipulators. ISA Trans 52:418–428 4. Durango S, Delgado Martínez M, Álvarez Vargas C, Flórez Hurtado R, Flórez Ruiz M (2020) Diseño cinemático de un robot paralelo 2-PRR, ST. Revista Scientia et Technica 25(3):372–379 5. Gámez García J, Robertsson A, Gómez Ortega J, Johansson R (2007) Estimación de la fuerza de contacto para el control de robots manipuladores con movimientos restringidos. Revista Iberoamericana de Automática e Informática Industrial RIAI 4(1):70–82 6. Slotine JJ, Li W (1987) On the adaptive control of robot manipulator. Int J Robot Res 6(3):49–59 7. Solis A, Hurtado J (2020) Reutilización de software en la robótica industrial: un mapeo sistemático. Revista Iberoamericana de Automática e Informática Industrial RIAI 17(4):354–367 8. González-Reyes D, Heebum K, Rubio-Martínez D, Cervantes Culebro H, Elías Espinosa M (2021) Metodología de diseño para robots paralelos de cinco eslabones y dos grados de libertad. Revista Científica editada por el Instituto Politécnico Nacional (IPN) de México, (ESIME) 25(1)
Enhanced Energy Efficiency in Wireless Sensor Networks Neetu Mehta and Arvind Kumar
Abstract A wireless sensor network incorporates a range of sensor motes or nodes that normally run-on battery power with limited energy capacity and also the battery replacement is a difficult job because of the size of these networks. Energy efficiency is thus one of the main problem and the design of energy-efficient protocols is essential for life extension. In this paper, we discuss communication systems that may have a major effect on the total dissipation of energy of the WSN networks. Based on the reviews, that traditional mechanisms for route discovery, static clustering, multi-hop routing as well as minimum transmission are not ideal for heterogeneous sensor network operations, we formulated customized low-energy network clustering hierarchy (CLENCH) which uses the random rotational mode of local cluster sink stations (cluster heads) for dynamic distribution of energy between the sensor nodes within the network. Simulation showed that CLENCH may reduce power consumption by as much as eight factors compared to traditional routing methods. CLENCH may also uniformly distribute energy among the sensor nodes which almost doubles the usable network lifetime for the model designed. Keywords Wireless sensor networks · Energy efficiency · Communication protocol · Sensor nodes · Network lifetime · Duty cycle
1 Introduction The technology of the wireless sensor networks (WSNs) has evolved swiftly over the past decades. In a large region, ecological events may be tracked utilizing monitoring devices known as motes or sensor nodes [1]. Battery-powered WSNs consist of multiple sensors, CPUs, and RF modules. The sensor motes (nodes) may be wirelessly interconnected through a communication connection (following various topologies) to a cluster head (CH) node or coordinator node through the use of a source gateway for sharing data/information [1, 2]. N. Mehta (B) · A. Kumar Department of Computer Science & Engineering, SRM University, Sonepat, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_20
This information sharing between sensor nodes relies on a variety of sensors, from basic (e.g., humidity, temperature, and pressure) to advanced (e.g., localization, monitoring, micro-radar, and remote sensing imagery), enabling WSNs to track a wide variety of environments and obtain accurate field information [2]. The monitoring, storage, processing, and transmission capacities of sensor devices have thus grown consistently [3].
1.1 Wireless Sensor Network Due to considerable developments in low-cost, small-scale, smart sensors, low-energy consumption, and highly integrated digital electronics combined with wireless communications technologies, the wireless sensor network (WSN) has evolved as a key technology for many application scenarios. WSN communication may be either single-hop or multi-hop [4]. Owing to the constrained capacity and availability of the sensors, radios are used to transmit sensed information over wireless channels to a gateway node, generally situated at a distant location, for further computation. Sensor nodes in wireless networks are responsible for many tasks, such as sensing events, aggregating data, processing data, and sending and receiving data; some of these data are highly sensitive. Furthermore, many WSN sensors have particular characteristics, including autonomy, limited energy, constrained processing capability, and a contested radio environment, that make sensing and communication difficult [5]. These sensors are expected to run autonomously on battery power for a long period, so ensuring the security, availability, and confidentiality of WSN data has become critical.
1.2 Energy Consumption in WSNs Wireless sensor nodes are battery-powered and are frequently deployed in harsh external surroundings; consequently, battery replacement is a difficult job, because a network may contain hundreds of nodes. Such vast, geographically dispersed networks exacerbate the difficulty of battery replacement and make recharging practically impossible during operation [5]. High energy consumption and energy efficiency are therefore among the most challenging issues for the wireless communication methods used in various WSN deployments [4]. The lifespan of a WSN node depends on the energy sources available and the node's total energy consumption. Consequently, owing to the complexity of developing an energy-efficient infrastructure, numerous routing methods have been suggested. Moreover, the design of sustainable WSNs becomes even more challenging in resource-constrained environments; a node must use its resources effectively and increase its lifetime by closely monitoring its energy consumption
and security. On the other hand, WSNs are designed for specific applications ranging from small healthcare systems to large-scale tactical military systems, and they have to satisfy a set of requirements that varies from one application to another [6]. In light of these networking constraints, energy efficiency and security have attracted considerable attention from researchers in the past few years [7–10]. However, more research is still required to develop energy-efficient and secure schemes beyond the existing WSN algorithms.
2 Related Works The design and implementation of a WSN depend on the field of application for which it is deployed. Among all aspects of an application, energy efficiency is one of the most critical concerns. Most recent WSN studies have focused on maximizing system lifetime without sacrificing other factors, such as latency and throughput. Many scholars have contributed distinct studies on energy-saving methods at the MAC, routing, transport, and application layers. A variety of hierarchical energy-efficient routing protocols has been identified through literature surveys, including LEACH [8], HPAR [9], APTEEN and TEEN [9, 10], MECN and SMECN [5, 11], PEGASIS [12], sensor aggregate [13], SOP [14], VGA [14], TTDD [15], position-based energy-efficient protocols [16], energy-efficient self-healing [8], and CELRP [17]. The literature indicates that the primary benefit of hierarchical protocols is to limit data duplication, which makes them especially suitable for data aggregation in WSN applications. Liu et al. [9] suggested an adaptive tariff-based congestion control scheme (ADCC) to minimize traffic congestion and maintain connectivity among sensor nodes. ADCC is essentially a lightweight, energy-efficient congestion management system with duty-cycle (DC) adjustment for WSNs; it manages resources closely depending on the incoming traffic. In recent years, researchers have also worked on different energy-efficient DC algorithms to increase reliability, scalability, and network lifetime. The researchers in [14] presented a cluster-based hierarchical routing protocol (CBHRP), a two-layer hierarchical routing algorithm. They introduced a new concept known as a headset, which includes the primary cluster head and several additional associated heads inside a cluster. Results indicate that the protocol substantially lowers energy usage and increases sensor network lifetime compared with the LEACH protocol. To extend network lifetime and enhance network security, Rawat, Chauhan, and Priyadarshi [15] developed an enhanced EDEEC for three types of nodes; as a result, the network's heterogeneity and energy level increase. It was also reported that, with higher stability and efficient messaging, EDEEC performs considerably better than the LEACH protocol.
Subramanian and Paramasivam proposed RTMAC [16], a real-time MAC protocol based on TDMA, to solve the problem of increased latency in the low-energy-consumption S-MAC protocol.
3 Methodology 3.1 Improving Energy Efficiency of WSNs In WSN applications, energy consumption may be reduced in almost every layer of the communication protocol stack. In the physical layer, node energy may be conserved by reducing the data size and data rate and by applying an efficient energy strategy. The design of energy-efficient MAC duty-cycle (DC) methods, including scheduled packet transmission in the MAC layer, has been shown to reduce energy consumption [12]. Energy-efficient routing protocols may be developed to minimize energy usage in the network layer. Efficient congestion management, congestion prevention, and dynamic load sharing techniques, for instance, help improve network lifetime within the transport layer.
3.1.1 Physical Layer
Communication among sensors requires a physical-layer radio link, and energy is used whenever the radio receives or transmits data. The physical layer covers the modulation and encoding of the information in the radio transmitter and the corresponding demodulation and decoding at the receiving end [18]. The radio communication system has three modes: active, idle, and sleeping. The key to saving energy is thus to turn the radio off whenever the transceiver is not transmitting or is idle; it is also crucial to limit the energy and time spent transitioning between the various modes and between the transmit and receive states. A low-power listening method may also be applied at the physical layer, in which the fundamental principle is to enable the receiver only periodically to check for incoming packets.
3.1.2 MAC Layer-Based
The primary objective of the network layer is to identify a path from the sensor nodes to the sink so that data are transferred effectively and dependably while maximizing network lifetime. The MAC layer, in turn, is responsible for dependability, energy efficiency, high performance, and minimal access delay, to make optimal use of the energy-constrained node capacities. The maximum amount of energy is lost during
MAC-layer protocol activities such as collisions, idle listening, packet control overhead, and overhearing. Reducing a WSN's energy consumption therefore requires efficient methods such as duty cycling, efficient routing, range and packet scheduling, and adaptive transmission timing [12–14]. As part of MAC-layer duty cycling, this involves sleep/wake-up behavior of the sensor node for energy conservation. Sleep/wake-up protocols place the radio interface in a (low-power) sleep mode when no transmission is requested. Ideally, the radio transceiver should be turned off whenever there is no more information to send or receive and restarted when the next data packet transmission is due.
Sift Protocol The Sift [11] strategy leverages the event-driven characteristics of wireless sensor networks in its MAC protocol design. It is a contention-window-based MAC protocol.
3.1.3 Cluster Formation/Clustering
Clustering is a grouping technique that partitions a network into several clusters, each of which has a cluster head [13, 19]. In WSNs, the selection of cluster heads using energy-efficient clustering algorithms is very crucial, as it affects the lifetime and performance of the network. Typically, a cluster head (CH) is responsible for efficient communication between its cluster members and other clusters (Fig. 1).
Fig. 1 Cluster formation in WSN [19]
Cluster Head Selection A sensor node promotes itself to cluster head when it has the largest battery capacity or the longest expected lifetime among its neighboring sensor nodes, with ties broken by node ID. Each node can make this selection autonomously based on the discovery messages it receives. Each node then sets its network ID to its cluster head's node ID.
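To make this self-election rule concrete, the following is a minimal sketch (the node fields and data layout are illustrative assumptions, not part of the protocol description): a node becomes cluster head if no neighbor has strictly more residual energy, with ties broken by the smaller node ID, and members then adopt their head's ID as their network ID.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Node:
    node_id: int
    residual_energy: float   # joules remaining in the battery
    neighbors: List[int]     # IDs learned from discovery messages
    cluster_id: int = -1     # network ID adopted from the elected head

def elect_cluster_heads(nodes: Dict[int, Node]) -> None:
    """Each node decides locally: it becomes a cluster head if no neighbor
    has more residual energy (ties broken by the lower node ID)."""
    for node in nodes.values():
        is_head = True
        for nid in node.neighbors:
            other = nodes[nid]
            if (other.residual_energy, -other.node_id) > (node.residual_energy, -node.node_id):
                is_head = False
                break
        if is_head:
            node.cluster_id = node.node_id   # the head advertises its own ID

    # Member nodes adopt the strongest advertised head among their neighbors.
    for node in nodes.values():
        if node.cluster_id == node.node_id:
            continue
        heads = [nodes[n] for n in node.neighbors if nodes[n].cluster_id == nodes[n].node_id]
        if heads:
            best = max(heads, key=lambda h: (h.residual_energy, -h.node_id))
            node.cluster_id = best.node_id
```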
Leach A popular self-organizing technique, low-energy adaptive clustering hierarchy (LEACH), equalizes the energy load across sensor nodes by rotating the cluster heads (CHs) periodically. Regarding data transmission topology, LEACH receives the most attention of all hierarchical clustering techniques [17]. In the LEACH protocol, cluster head nodes are selected randomly in a round-robin fashion, so that the energy cost of the whole network is divided uniformly among the sensor motes [18]. Each node elects itself cluster head at random according to a predefined probability; the final decision is therefore made by the sensor nodes themselves. A node's probability of being selected as cluster head is a function of the current round, which limits the epoch within which that node can become a cluster head again [20]. Thus, energy usage may be balanced by rotating the CHs on a regular basis. However, LEACH assumes that all nodes start with an equal energy level, so its effectiveness in heterogeneous WSNs is not remarkable. As a result, when building new strategies, the energy heterogeneity of the sensor nodes must also be addressed.
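For reference, a minimal sketch of the round-based random election commonly used to describe LEACH; the threshold below is the textbook LEACH threshold rather than anything specific to this chapter, and the probability p and RNG seed are illustrative.

```python
import random

def leach_threshold(p: float, r: int) -> float:
    """Textbook LEACH threshold T(n) for a node that has not yet served as
    cluster head in the current epoch of 1/p rounds."""
    return p / (1.0 - p * (r % int(1.0 / p)))

def elect_round(node_ids, was_head_this_epoch, p=0.05, r=0, rng=random.Random(1)):
    """Return the node IDs that elect themselves cluster head in round r."""
    heads = set()
    for nid in node_ids:
        if was_head_this_epoch.get(nid, False):
            continue                           # already served in this epoch
        if rng.random() < leach_threshold(p, r):
            heads.add(nid)
            was_head_this_epoch[nid] = True
    return heads
```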
3.1.4 Routing
WSN routing protocol design is a particularly difficult job, since it involves the energy consumption of all other motes on a given path for a source–destination pair. Routing is the mechanism that identifies the route between source and destination during data transmission. Routing protocols operate over clustered, mesh, tree, and similar architectures to deliver data packets to the destination [21]. Different approaches use different means to prolong sensor lifespan. Routing is considerably more critical in WSNs than in most other networking models.
3.1.5 Network Congestion
Because of its strong effect on the energy consumption and performance of the network, congestion is among the major problems in WSNs. When sending packets, each node has an incentive to select the route with the lowest possible energy cost, to avoid the overhead of retransmissions after packet loss due to collisions, which places an extra burden on battery life.
3.1.6 Duty Cycling
Duty cycling (DC) may be defined as the process that makes sensor nodes switch between operating and sleeping intervals based on network activity. This DC method lowers the sensor network's idle-listening overhead.
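As a rough illustration of the idea (the states, timings, and power figures below are invented for the example and are not taken from the paper), a duty-cycled node is awake for only a fraction of each cycle and pays the idle-listening cost only during that fraction:

```python
def duty_cycle_energy(duty: float, hours: float,
                      p_active_mw: float = 20.0, p_sleep_mw: float = 0.02) -> float:
    """Energy in joules spent by a node that is awake for `duty` fraction of
    the time and asleep for the rest (power figures are illustrative)."""
    total_s = hours * 3600.0
    awake_s = total_s * duty
    sleep_s = total_s - awake_s
    return (p_active_mw * awake_s + p_sleep_mw * sleep_s) / 1000.0

# Example: a 2% duty cycle over 24 h versus always-on listening.
low = duty_cycle_energy(duty=0.02, hours=24)
always_on = duty_cycle_energy(duty=1.0, hours=24)
print(f"2% duty cycle: {low:.1f} J, always on: {always_on:.1f} J")
```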
3.2 Experimental Setup In this work, we considered a WSN composed of N sensor nodes s_i (i = 1, 2, 3, …, N) that are uniformly dispersed over a targeted region of size W × W and transmit data continuously. Large or hazardous areas require probabilistic deployment rather than installing sensor nodes individually at suitable locations. It is therefore reasonable to create a generic wireless sensor network architecture for simulation under the premise of random node positioning. Cluster Formation: If a member node learns that a cluster head (CH) is nearby, it uses the MAC protocol to coordinate the transmission of data packets to that CH. Further, the protocol determines the closest CH by the Euclidean distance between the specific sensor node and each CH. Regarding activity levels, the sensors may be programmed to sample the surroundings at certain intervals; otherwise they can sleep to conserve battery life. The Euclidean distance d between two points a and b can be calculated by:

d_{a,b} = \sqrt{(a_x - b_x)^2 + (a_y - b_y)^2}    (1)
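A small helper following Eq. (1) is sketched below; the coordinates and the cluster-head list are placeholders used only for illustration. It assigns each member node to its nearest cluster head by Euclidean distance.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]

def euclidean(a: Point, b: Point) -> float:
    """Eq. (1): straight-line distance between points a and b."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def assign_to_nearest_ch(members: Dict[int, Point], heads: Dict[int, Point]) -> Dict[int, int]:
    """Map each member node ID to the ID of its closest cluster head."""
    return {mid: min(heads, key=lambda hid: euclidean(pos, heads[hid]))
            for mid, pos in members.items()}

# Illustrative 100 m x 100 m field with two cluster heads.
members = {1: (12.0, 40.0), 2: (80.0, 75.0), 3: (55.0, 10.0)}
heads = {10: (20.0, 30.0), 11: (70.0, 80.0)}
print(assign_to_nearest_ch(members, heads))   # {1: 10, 2: 11, 3: 10}
```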
Energy Calculation: Sensor nodes are assumed to be aware of their total energy and of the energy consumed while receiving a data packet from the network's base station (BS). Here, the total energy of a heterogeneous wireless sensor network may be determined by:

E_{Total} = \sum_{i=1}^{N} E_{init,i} = E_0 N + E_0 \sum_{i=1}^{N} a_i = E_0 (N + A)    (2)
where each sensor node initially has a set amount of energy E_{init}. Here, E_0 is the lower bound of the battery's stored energy, while \sum_{i=1}^{N} a_i is the heterogeneity factor that defines how much additional energy may be stored. We assume a continuous period between t_1 and t_2 for measuring the power consumption. The residual energy E_{resid} at time t is obtained by subtracting the energy consumed during the interval \Delta t from the battery energy at t - \Delta t, so the per-interval energy bookkeeping is:

E_{resid,i}(t_2) = E_{resid,i}(t_1) - E_{cons,i}(\Delta t), \qquad \Delta t = t_2 - t_1    (3)

3.2.1 Radio Model for Energy Calculation
A model for the energy dissipation of the radio hardware was developed in which the transmitter dissipates energy in the power amplifier and the radio electronics, and the receiver dissipates energy in the radio electronics [22], as illustrated in Fig. 2. This first-order radio model provides an assessment of the energy used when a sensor node transmits or receives data in each round (Table 1). The energy consumed to transmit k bits to a sensor node at distance d from the sending node can be formulated as:
Fig. 2 Radio model for reception and transmission of packets
Table 1 Characteristics of the radio model

Radio mode                                                              Energy consumption
Receiver electronics (E_elecRx)                                         40 nJ/bit
Transmitter electronics (E_elecTx) (E_elec = E_elecTx = E_elecRx)       40 nJ/bit
Idle (E_idle)                                                           38 nJ/bit
Transmission amplifier (eps_amp)                                        100 pJ/bit/m^2
Sleep                                                                   0
E_{Tx}(d) = \varepsilon_{amp}\, d^{2}\, k    (4)

and the energy consumed to receive k bits at a node is proportional to the receiver electronics energy per bit, E_{elec}:

E_{Rx}(k) = E_{elec}\, k    (5)

Thus, the overall energy spent by any transmitter radio to send a k-bit packet over a distance d is determined by:

E_{Tx}(k, d) = k\,E_{elec} + k\,E_{fs}\, d^{2}  if d < d_0;  E_{Tx}(k, d) = k\,E_{elec} + k\,E_{amp}\, d^{4}  if d \ge d_0    (6)

d_0 = \sqrt{E_{fs} / E_{amp}}    (7)

Thus, the relationship between the consumed and the remaining energy follows from Eq. (3):

E_{cons,i}(\Delta t) = E_{init,i}(t - \Delta t) - E_{resid,i}(t)    (8)
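A compact sketch of the first-order radio model of Eqs. (4)–(7) is given below. The electronics and free-space amplifier values follow Table 1, while the multipath amplifier coefficient and packet size are illustrative assumptions not stated in the chapter.

```python
import math

E_ELEC = 40e-9        # electronics energy per bit (J/bit), from Table 1
E_FS   = 100e-12      # free-space amplifier energy (J/bit/m^2), taken as eps_amp
E_AMP  = 0.0013e-12   # multipath amplifier energy (J/bit/m^4), illustrative value
D0     = math.sqrt(E_FS / E_AMP)   # Eq. (7): crossover distance

def tx_energy(k_bits: int, d: float) -> float:
    """Eq. (6): energy to transmit k bits over distance d."""
    if d < D0:
        return k_bits * E_ELEC + k_bits * E_FS * d ** 2
    return k_bits * E_ELEC + k_bits * E_AMP * d ** 4

def rx_energy(k_bits: int) -> float:
    """Eq. (5): energy to receive k bits."""
    return k_bits * E_ELEC

# Example: a 4000-bit packet sent over 50 m and 120 m.
for d in (50.0, 120.0):
    print(f"d={d:>5.1f} m  tx={tx_energy(4000, d):.3e} J  rx={rx_energy(4000):.3e} J")
```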
3.2.2 Simulation Environment
The proposed study focuses on the performance metrics of a WSN under the proposed CLENCH topology, implemented in the ns-2 network simulator and its animation tools (Fig. 3). The simulation scenario includes sensor data rates and packet delays captured for numerical temperature and environmental-condition data, along with humidity, pH, etc. Figure 4 shows the flowchart of the proposed WSN routing protocol.
4 Simulation and Analysis of Results This paper analyzed CLENCH as a homogeneous clustering technique and examined the effect of heterogeneity in node energy on extending the network lifetime of the wireless sensor network. The homogeneous and heterogeneous methods were then evaluated in a comparative assessment of LEACH and CLENCH. The performance attributes and network lifetime of LEACH and CLENCH are discussed below.
Performance and Network Lifetime.
Fig. 3 Simulation output showing cluster formation and energy level detection
The LEACH and CLENCH cluster compositions in each round were evaluated. In LEACH the cluster heads are closely packed together and the cluster memberships are inconsistent, since LEACH relies solely on a probabilistic model to pick cluster heads. On the other hand, because CLENCH considers the residual energy of a node together with the local topological configuration, the CHs in CLENCH are much better placed than in LEACH. To further test the scalability of CLENCH, we used two network sizes, N = 100 and N = 200. The energy used in communication (transmission and reception) is considerably larger than in the sensor node's idle and sleep states, which demonstrates that energy savings are required in WSN data transmission. Figure 5 compares the energy consumption of the LEACH and CLENCH protocols and shows that CLENCH is more energy-efficient than LEACH. Figure 6 shows the comparison between the LEACH, PEGASIS, and CLENCH protocols. Figure 7 presents the comparison of each protocol in terms of first node dead (FND), TND, and all nodes dead (AND). The proposed protocol was found to perform better than the other two protocols.
5 Discussion and Conclusion CLENCH's network lifetime is longer than that of the other protocols, and its adaptability is greater than that of protocols such as LEACH. The first node death occurs in CLENCH after around 1474 rounds and 8118 rounds for 100 and 200 nodes, respectively. For the first-node-death criterion, CLENCH extends network lifetime by about 13% compared with LEACH and by 12% compared with PEGASIS. CLENCH
Fig. 4 Flow diagram of the proposed framework (initialize network and input number of nodes → randomly distribute and display nodes → neighbor discovery → cluster formation → selection of cluster head → create routing table → compute energy consumption of each node → communicate data to base station → repeat until threshold reached → simulation and results)
Fig. 5 Comparison of energy consumption by LEACH and the proposed protocol (y-axis: energy consumption; x-axis: number of nodes, 20–200)
Fig. 6 Comparison of energy consumption of LEACH, PEGASIS, and CLENCH protocols (y-axis: energy consumption; x-axis: number of nodes)
analyzes the remaining energy, the number of neighboring sensor motes, and the node location when choosing cluster heads, all of which affect energy utilization. In addition, cluster heads build a stable multi-hop route that does not break and that also lowers the control overhead. In CLENCH, energy is used equally by all sensor nodes, and thus the network's energy consumption is balanced. Because LEACH selects the cluster heads randomly and their distribution is uneven, there may be too many cluster heads.
Fig. 7 Comparison of number of dead nodes in FND, TND as well as AND for CLENCH, PEGASIS, and LEACH (x-axis: number of rounds, 0–10,000)
These cluster heads may thus use a great deal of energy and die very quickly. In conclusion, energy efficiency is the major problem in the design of WSN protocols, owing to the small energy capacity of the sensor nodes. The main purpose of every routing scheme is therefore to keep the network operational for as long as possible while being as energy-efficient as feasible. The lifespan of the network is the most significant performance measure for WSNs. In [23], network lifespan is defined as the amount of time until the first node fails or runs out of energy. This definition allows us to describe the network lifetime as the number of rounds until the first node is out of power.
5.1 Conclusion In this paper, we have introduced the CLENCH protocol for a heterogeneous environment: an energy-efficient WSN clustering technique designed to reduce energy consumption and increase network lifetime. We varied the sensor nodes and the network size to study the scalability, energy efficiency, and performance of the WSN. It was found that the CLENCH method reduces multi-routed transmission of data packets, which improves the network's energy efficiency and enhances network lifetime. If nodes still have data to transmit after a successful packet transmission they remain awake; otherwise they switch to sleep mode. Moreover, this adaptive method reduces the need to flood or stream data packets unnecessarily, since we use a unicast transfer technique.
References 1. Muhammad A, Hala M, Muhammad ZK (2015) An energy efficient management scheme for wireless sensor networks. Int J Crit Comput Based Syst (IJCCBS) 6(2) 2. Rai R, Rai P (2019) Survey on energy-efficient routing protocols in wireless sensor networks using game theory. In: Advances in communication, cloud and big data, pp 1–9. Springer, Singapore 3. Gherbi C, Aliouat Z, Benmohammed M (2016) An adaptive clustering approach to dynamic load balancing and energy efficiency in wireless sensor networks. Energy 114:647–662 4. Zaman N, Tang JL, Yasin MM (2016) Enhancing energy efficiency of wireless sensor network through the design of energy efficient routing protocol. J Sensors. https://doi.org/10.1155/2016/ 9278701 5. Thirukrishna JT, Karthik S, Arunachalam VP (2018) Revamp energy efficiency in homogeneous wireless sensor networks using optimized radio energy algorithm (OREA) and power-aware distance source routing protocol. Futur Gener Comput Syst 81:331–339 6. Siddiqui S, Ghani S, Khan AA (2018) PD-MAC: design and implementation of polling distribution-MAC for improving energy efficiency of wireless sensor networks. Int J Wireless Inf Networks 25(2):200–208 7. Agudo JE, Valenzuela-Valdés JF, Luna F, Luque-Baena RM, Padilla P (2016) Analysis of beamforming for improving the energy efficiency in wireless sensor networks with metaheuristics. Prog Artif Intell 5(3):199–206 8. Wang J, Cao J, Ji S, Park JH (2017) Energy-efficient cluster-based dynamic routes adjustment approach for wireless sensor networks with mobile sinks. J Supercomput 73(7):3277–3290 9. Liu Y, Liu A, Zhang N, Liu X, Ma M, Hu Y (2019) DDC: dynamic duty cycle for improving delay and energy efficiency in wireless sensor networks. J Network Comput Appl 131:16–27 10. Ahmad A, Ahmed S, Imran M, Alam M, Niaz IA, Javaid N (2017) Energy efficiency in underwater wireless sensor networks with cooperative routing. Ann Telecommun 72(3–4):173– 88 11. Ding W, Tang L, Ji S (2016) Optimizing routing based on congestion control for wireless sensor networks. Wireless Netw 22(3):915–925 12. Alrashidi M, Nasri N, Khediri S, Kachouri A (2020) Energy-efficiency clustering and data collection for wireless sensor networks in: industry 4.0. J Ambient Intell Humanized Comput 1–8 13. Rezaei E, Baradaran AA, Heydariyan A (2016) Multi-hop routing algorithm using steiner points for reducing energy consumption in wireless sensor networks. Wireless Pers Commun 86(3):1557–1570 14. Zhou Z, Niu Y (2020) An energy efficient clustering algorithm based on annulus division applied in wireless sensor networks. Wireless Pers Commun 115(3):2229–2241 15. Rawat P, Chauhan S, Priyadarshi R (2020) Energy-efficient clusterhead selection scheme in heterogeneous wireless sensor network. J Circuits, Syst Comput 29(13):2050204 16. Subramanian AK, Paramasivam I (2017) A priority-based energy efficient MAC protocol for wireless sensor networks varying the sample inter-arrival time. Wireless Pers Commun 92(3):863–881 17. Senthil M, Rajamani V, Kanagachid G (2014) Energy-efficient cluster head selection for life time enhancement of wireless sensor networks. Inf Technol J 13(4):676–682 18. Kaur H, Seehra A (2014) Performance evaluation of energy efficient clustering protocol for cluster head selection in wireless sensor network. Int J Peer-to-Peer Netw 5(3):1–13 19. Gui T, Ma C, Wang F, Li J, Wilkins DE (2016) A novel cluster-based routing protocol for wireless sensor networks using spider monkey optimization. 
In: IECON 2016–42nd annual conference of the IEEE industrial electronics society. 2016 Oct 23, pp 5657–5662. IEEE 20. Ezhilarasi M, Krishnaveni V (2019) An evolutionary multipath energy-efficient routing protocol (EMEER) for network lifetime enhancement in wireless sensor networks. Soft Comput 23(18):8367–8377
21. Selvi M, Velvizhy P, Ganapathy S, Nehemiah HK, Kannan A (2019) A rule-based delay constrained energy efficient routing technique for wireless sensor networks. Clust Comput 22(5):10839–10848 22. Bhola J, Soni S, Cheema GK (2020) Genetic algorithm based optimized leach protocol for energy efficient wireless sensor networks. J Ambient Intell Humaniz Comput 11(3):1281–1288 23. Dener M (2018) A new energy efficient hierarchical routing protocol for wireless sensor networks. Wireless Pers Commun 101(1):269–286
Social Structure to Artificial Implementation: Honeybees Depth and Breadth of Artificial Bee Colony Optimization Amit Singh Abstract Swarms are collections of individuals, known as agents of a colony system, that collectively solve computationally complex real-world problems in a very efficient way. The collaborative effort of these agents achieves a common goal in a distributed and self-organized manner. In nature, bees in beehives, ants in colony systems, and birds in flocking systems are some examples from a long list of swarms. Such an inspirational and efficient course of action on complex real-world problems of a similar kind has attracted researchers to study these optimization approaches. Bonabeau et al. transported this natural swarm intelligence into artificial systems. This paper presents an extensive review of state-of-the-art artificial bee colony optimization, inspired by the natural beehive system, in various application domains. In addition to its performance on complex real-world engineering problems, the paper also highlights the computational feasibility of its candidacy in the related domain areas. Overall, the application domains are categorized into specialized areas of computer science and robotics. Finally, the paper concludes with possible future research trends of bee colony optimization. Keywords Swarm intelligence · Artificial bee colony · Natural to artificial · ABC optimization
1 Introduction Swarm intelligence (SI) is a solution approach that belongs to the bottom-up category of algorithmic design. SI [1, 2] optimizes complex and distributed problems by controlling the constraint parameters of the solution space. It is inspired by the collective behavior of natural organisms such as bees, ants, birds, fishes, termites, and immune cells, as shown in Fig. 1. These species maintain a complex social structure through self-organization and underpin it with cooperation, i.e., indirect communication. There
Fig. 1 Examples of social insects [3] (left to right: honeybees, ants, fishes, and birds)
exist a huge number of real-life problems across all domains that show similar behavior, such as beehives, ant colony systems, animal herding, fish schooling, bacterial growth, bird flocking, and hawk hunting. In such colonies, a large population of individuals called agents performs routine tasks and thereby manages the activities of the colony system efficiently. Bonabeau et al. [4, 5] studied the behavioral optimization of these agents in the colony system and transported this natural intelligence into artificial systems to reformulate the solution of complex real-life problems. Bonabeau showed that the simple organizational rules of these insects control overall system performance and generate efficient outcomes in the colony. At the same time, the group behavior of these agents is unpredictable, so a small rule change propagates a measurable change in group behavior. Among the various swarm optimization techniques, artificial bee colony (ABC) [6] is one of the popular approaches, presented by Karaboga in 2005. Afterward, many variations with improved efficiency were proposed for almost all research domains of engineering.
1.1 Honeybees to Artificial Bees Artificial bees are inspired by the social foraging behavior of natural honeybees. As in the distributed beehive system [7, 8], artificial bees are divided into three categories to perform the task: employee bees are currently employed at some food source and try to extract the maximum output gain from it; onlooker bees keep watching the employee bees to obtain information about good food sources; and scout bees are responsible for searching for new food sources near the hive. Based on these three categories of bees, the artificial bees coordinate the overall distribution of work to obtain the best outcome. Two fundamental concepts are required and inherited from natural honeybees to carry out the optimal exploration of the solution search space: Self-organization. A larger number of interactions gives a higher probability of reaching the best result at the termination point. Self-organization reflects the dynamic behavior of honeybees and is characterized by a few activities of the bees [4]. Positive
feedback is the sharing of information; negative feedback helps to avoid a form of the local-minima problem; fluctuation is required to explore new food sources; and multiple interactions let individuals learn from one another and lead toward the global optimum. Division of labor. As with honeybees, the artificial bees split the job so that skillful individuals can take care of their assigned part more efficiently and in parallel. Parallel processing is better than sequential processing, and when performed by skillful individuals the outcome is enhanced manyfold. Information sharing is another beauty of the beehive colony system: these intelligent agents communicate through different dance performances within the hive, according to the quality of the food sources. Based on the previous discussion, the artificial bee colony algorithm for solving a distributed complex problem is given below:

Algorithm 1.1 Honey Beehive Ecosystem
step 1.  Scout bees roam around the beehive to find the initial food sources
step 2.  while (termination criterion)
step 3.  do
step 4.    Employee bees visit the initial food sources to compute the quality of food
step 5.    Employee bees convey the information to unemployed bees in the hive
step 6.    Based on the probability, the most profitable food sources are selected by the onlooker bees
step 7.    Onlooker bees compute the nectar value of the food sources they engage with
step 8.    Memorize the best food source explored so far
step 9.    if (iteration > number of trials)
step 10.   then
step 11.     Abandon the exhausted food sources
step 12.     Scout bees roam near the beehive to find other food sources
step 13.   endif
step 14. endwhile
As shown in Algorithm 1.1, the three types of bees perform their allocated tasks independently and support the other bees in reaching the global solution. The scout bees are responsible for abandoning exhausted food sources and exploring new ones. The unemployed bees watch the dancing style and thus identify the significance and quality of the information accordingly. The selection of food sources depends on the profitability of the food, which the employed bees communicate through different kinds of 'waggle' dances. Thus, the unemployed bees in the hive have a higher probability of selecting the most profitable food source.
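To make the phases of Algorithm 1.1 concrete, the following is a minimal, self-contained sketch of a standard ABC loop; the sphere objective, colony size, and limit value are illustrative choices, not taken from the chapter.

```python
import random

def abc_optimize(objective, dim=5, lower=-5.0, upper=5.0,
                 colony=20, limit=30, max_iter=200, seed=1):
    rng = random.Random(seed)
    n = colony // 2                                   # number of food sources
    foods = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n)]
    costs = [objective(x) for x in foods]
    trials = [0] * n
    fitness = lambda c: 1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c)

    def neighbor(i):
        """Perturb one dimension of source i toward/away from a random partner k."""
        k = rng.choice([j for j in range(n) if j != i])
        j = rng.randrange(dim)
        v = foods[i][:]
        v[j] += rng.uniform(-1, 1) * (foods[i][j] - foods[k][j])
        v[j] = min(max(v[j], lower), upper)
        return v

    def greedy(i, v):
        """Keep the candidate only if it improves source i (greedy selection)."""
        c = objective(v)
        if c < costs[i]:
            foods[i], costs[i], trials[i] = v, c, 0
        else:
            trials[i] += 1

    best = min(zip(costs, foods))
    for _ in range(max_iter):
        for i in range(n):                            # employee bee phase
            greedy(i, neighbor(i))
        fits = [fitness(c) for c in costs]
        total = sum(fits)
        for _ in range(n):                            # onlooker bee phase (roulette wheel)
            r, acc, i = rng.uniform(0, total), 0.0, 0
            for i, f in enumerate(fits):
                acc += f
                if acc >= r:
                    break
            greedy(i, neighbor(i))
        for i in range(n):                            # scout bee phase
            if trials[i] > limit:
                foods[i] = [rng.uniform(lower, upper) for _ in range(dim)]
                costs[i], trials[i] = objective(foods[i]), 0
        best = min(best, min(zip(costs, foods)))
    return best

print(abc_optimize(lambda x: sum(v * v for v in x)))  # sphere function
```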
1.2 Standard Artificial Bee Colony Optimization Artificial bee colony (ABC) optimization is one of the best examples of social intelligence. This complex problem-solving approach inspired by the honey beehive colony is decentralized and self-organized. ABC optimization has imported a
Fig. 2 Transformation of honeybees into ABC optimization
similar methodology of honeybees to maintain the beehives’ ecosystem to solve computational and combinatorial real-world problems. The algorithmic approach of ABC problem solving is divided into three phases as shown below in the flowchart (Fig. 2). As depicted in the flowchart, Employee, Onlooker, and Scout Bee phases are the reflections of the natural beehive ecosystem to explore the solution space efficiently and converge the solution at the global best optima by interacting with the intermediate local best solutions of individuals.
1.3 Standard ABC and Its Different Variants Owing to its simple implementation and efficient swarm intelligence approach, standard and improved ABC is quite popular in the research community, as shown in Table 1. A wide range of applications across the engineering domains has been optimized with it. Theoretical and mathematical computer science is one such domain, in which ABC is applied to optimize network issues such as routing strategies, clustering problems, and node and sink deployment. Similarly, cloud computing, IoT, neural networks, and robotics are a few of the other application domains of ABC. The subsequent Sect. 2 presents evidence of its candidacy in such domains for optimizing resources. Finally, the conclusion of the review, along with future research directions, is presented in Sect. 3.
Table 1 Mathematics behind standard ABC and its variants

Standard ABC [6]
  Mathematical foundation:
    Fitness function:
      fit_i = 1 / (1 + f_i)        if f_i >= 0
      fit_i = 1 + abs(f_i)         if f_i < 0
    Probability function:
      P_i = fit_i / \sum_{i=1}^{SN} fit_i
    Update function:
      X_{i,j} = X_{min,j} + rand(0,1) (X_{max,j} - X_{min,j})
    Population update (boundary handling):
      V_{i,j} = (X_{min,j} + X_{i,j}) / 2    if V_{i,j} < X_{min,j}
      V_{i,j} = (X_{max,j} + X_{i,j}) / 2    if V_{i,j} > X_{max,j}
  Discussion: The fitness function is used to evaluate and carry forward the offspring to the next level. The probability function is used to select onlooker bees. Offspring are updated through the population update functions. Subscripts i and j represent the solution index and the problem dimension.

Gbest-guided ABC (GABC) [9]
  Mathematical foundation (PSO-inspired ABC):
    V_{i,j} = X_{i,j} + phi_{i,j}(X_{k,j} - X_{i,j}) + psi*_{i,j}(gbest_j - X_{i,j})
  Discussion: psi*_{i,j} is a uniformly distributed random number in [0, 1.5]. The global best solution is used to improve the ABC optimization.

Improved ABC [10] (ABC/best/1 and ABC/rand/1, based on DE/best/1 and DE/rand/1)
  Mathematical foundation: rand(0,1) is replaced with a logistic map:
    ch_{k+1} = mu ch_k (1 - ch_k),  ch_k in (0, 1),  k = 0, 1, 2, ..., K
    ABC/best/1: V_{i,m} = X_{best,m} + phi_{i,j}(X_{i,m} - X_{r1,m})
    ABC/rand/1: V_{i,m} = X_{r1,m} + phi_{i,j}(X_{i,m} - X_{r2,m}),  m = 1, 2, ..., M (problem dimension)
    The above ABC variants are based on DE mutation:
    DE/best/1: V_i = X_{best} + F(X_{r1} - X_{r2})
    DE/rand/1: V_i = X_{r1} + F(X_{r2} - X_{r3})
  Discussion: A logistic map is used to initialize the population rather than pure randomization; k is the iteration, K the chaotic iteration limit, and mu the control parameter. Inspired by differential evolution (DE): r1, r2, r3 are mutually different random indices, i indexes the population members, and population evolution is controlled through F, a positive real number less than 1.

Quick ABC (qABC) [11]
  Mathematical foundation (quick behavior of onlooker bees):
    V_{Nm,i} = X^{best}_{Nm,i} + phi_{m,i}(X^{best}_{Nm,i} - X_{k,i})
  Discussion: X^{best}_{Nm} represents the best solution among the bee itself and its neighborhood; the neighborhood is determined through the mean Euclidean distance.

Self-adaptive Tent-chaotic Artificial Bee Colony (STOC-ABC) [12]
  Mathematical foundation (tent map definition):
    cx_{t+1} = 2 cx_t           if 0 <= cx_t <= 1/2
    cx_{t+1} = 2 (1 - cx_t)     if 1/2 <= cx_t <= 1
    Update function: X_{i,j} = X_{min,j} + cx_{k,j}(X_{max,j} - X_{min,j})
  Discussion: cx_t is the tent chaotic vector, whose value must avoid the fixed-point set {0.2, 0.4, 0.6, 0.8}.

Adaptive ABC [13]
  Mathematical foundation (new solution selection):
    V_{i,j} = X_{i,j} + phi_{i,j}(X_{i,j} - X_{k,j})    if R_j < MR
    V_{i,j} = X_{i,j}                                   otherwise
  Discussion: A uniformly distributed real value R_j is generated for each X_{i,j}; only when it is below the modification rate (MR) is the value of V_{i,j} changed.
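As a small illustration of how the GABC row of Table 1 modifies the standard search equation, the sketch below generates one gbest-guided candidate; the coefficient range 1.5 follows the table, while the food sources and gbest vector are illustrative.

```python
import random

def gabc_candidate(foods, gbest, i, rng, c=1.5):
    """Gbest-guided ABC update for food source i on one random dimension:
    V[i][j] = X[i][j] + phi*(X[k][j] - X[i][j]) + psi*(gbest[j] - X[i][j]),
    with phi ~ U(-1, 1) and psi ~ U(0, c)."""
    dim = len(foods[i])
    k = rng.choice([j for j in range(len(foods)) if j != i])
    j = rng.randrange(dim)
    phi = rng.uniform(-1.0, 1.0)
    psi = rng.uniform(0.0, c)
    v = list(foods[i])
    v[j] = foods[i][j] + phi * (foods[k][j] - foods[i][j]) + psi * (gbest[j] - foods[i][j])
    return v

rng = random.Random(0)
foods = [[1.0, 2.0], [0.5, -1.0], [2.5, 0.0]]
gbest = [0.2, 0.1]
print(gabc_candidate(foods, gbest, i=0, rng=rng))
```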
2 Research: Usage of Basic and Improved ABC Beehive colonization is one of the most explored and used optimization metaheuristics and is widely deployed on complex and combinatorial problems in all domains. In this paper, a few applications in recent domains are considered, as follows:
2.1 Clustering, Routing, and Deployment in Wireless Ad Hoc Networks Various wireless ad hoc networks [14] have evolved to meet the needs of area coverage and communication enhancement. Wireless sensor networks (WSN), wireless mesh networks (WMN), mobile ad hoc networks (MANET), vehicular ad hoc networks (VANET), Wi-Fi, and Wi-Max are some examples of ad hoc networks. Being ad hoc in nature, these networks do not possess a fixed infrastructure and thus face many challenging issues during the deployment and communication stages. Some of these issues, and their solution approaches using ABC, are highlighted in this section. Swarm intelligence-based routing mechanisms are explored exhaustively in the literature. An ABC-based route-finding mechanism within clusters is discussed in [15] to ensure QoS metrics such as end-to-end delay, remaining bandwidth, jitter, and link expiration time; a vehicular ad hoc network is used to validate the simulation results, and the approach demonstrated promising performance compared with conventional routing strategies. Energy optimization in mobile-sink WSNs is a challenging task, and in such cases optimization techniques play a vital role. A seamless clustering multi-hop routing
mechanism using improved ABC (IABC) is presented in [16]. The cluster head (CH) and sub-cluster head (CH-β) are selected by taking into account the average energy of the neighborhood and the residual energy of the current CH; the proposed mechanism is suitable for low-power WSNs. Another variation, which introduces dynamicity in the scout bee phase of ABC, is proposed for cluster head selection to conserve WSN energy [17, 18]. Data collection in WSNs is one of the most challenging tasks, and a consequence of this activity is frequent battery depletion in the sink's neighborhood due to data bottlenecks at the neighboring nodes. This issue is addressed in [19], which uses ABC to find an optimal solution by collecting data throughout the terrain with mobile sink nodes; the proposed algorithm reduces the amount of data transmission and thus battery consumption, improving node and network lifetime. Optimal placement of relay nodes using the ABC approach is employed in WSNs [20], and an improvement in network lifetime due to data relaying was recorded. A hybrid approach called culture-based ABC (CBABC) is proposed in [21]; this hybridization reduces the size of the solution space, and around 17% improvement is obtained in convergence rate and function evaluations. Node localization in WSNs assisted by an unmanned aerial vehicle (UAV) is proposed in [22]: a movable UAV with a single GPS module increases the accuracy of node localization by optimally selecting the flying height and applying least-squares optimization; the least-squares selection is identified optimally through the ABC approach, and a minimal node localization error was observed. Cluster head selection that combines the benefits of ACO and ABC (HACO-ABC-CH) is proposed in [18]: ACO takes care of the low convergence rate of ABC, whereas stagnation in the intensification of ACO is prevented by the efficient exploration of the employee bees.
2.2 Internet of Things (IoT) IoT applications [23, 24] in healthcare systems are increasing day by day. An ABC-based green routing mechanism for wireless body area networks (WBAN) is proposed in [25, 26] for sensor-based healthcare systems. The system operates on low-power sensor nodes deployed in the human body to measure various parameters depending on the disease present. These sensors are connected to WBANs and, being very low power, it is very difficult to replace their batteries frequently, so an efficient routing mechanism is required. The authors derive an energy-efficient path using ABC optimization, and the simulation results show a better percentage of optimal solutions and a better convergence rate. To circumvent the local-minima trap and low convergence rate, an improved ABC optimization-based clustering technique (IABCOCT) using the Grenade Explosion Method (GEM) and the Cauchy operator is proposed [27, 28]; through these inclusions, the authors demonstrate multimedia data transmission in smart sensor environments. A hybrid artificial bee colony algorithm with an efficient schedule
transformation (HABCA-EST) is proposed to sense different objects in the monitored area [29]; the authors claim a reduction in the required number of smart devices. Another IoT-based ABC optimization is presented for smart agriculture to reduce the amount of data generated [30], and thus the mining and decision-making load. Optimization using a bee-inspired routing algorithm (BIRA) for device-to-device communication is presented in [31], which reports an improvement in average end-to-end delay under various traffic loads. A service-level agreement (SLA) ensures a product's future monitoring and guarantees QoS, and multi-round bilateral negotiation is essential in a service-oriented paradigm. To provide automatic SLA negotiation in an IoT environment, ABC-based optimization provides a better balance between success rate and negotiation utility [32].
2.3 Task Scheduling in Cloud Computing Meta-heuristic optimization is widely explored for NP-hard and NP-complete problems. Task scheduling in a cloud environment [33, 34] is one such problem, and it has been addressed with standard ABC optimization in [35]; the experimental results are reported in terms of makespan and validate its proficiency over other standard algorithms. To optimize both workflow makespan and the cost of task scheduling, a combination of GA and ABC optimization is proposed in [36], where the authors schedule tasks on the available resources of an IaaS cloud environment. Virtual machine (VM) migration improves the efficiency of cloud servers in terms of accessibility, internal failures, and energy consumption; a further improvement in energy conservation during VM migration is proposed in [37], which exploits a hybrid of ABC and the bat algorithm (BA) and confirms better efficiency than traditional methods on CloudSim [38]. With the rapid growth of data storage in cloud environments, identifying malicious data at different nodes becomes a challenging task. An ensembled ABC is proposed in [39] for anomaly detection on multi-class datasets: an ABC-based C-means clustering technique is used for optimal clustering, and evaluation metrics such as anomaly detection rate, false alarm rate, and accuracy are considered during result validation. Simulated annealing (SA) [40]-based ABC optimization is explored for load balancing and scheduling according to task size, the proximity of client and cloud server, and request priority. A multi-objective ABC optimization is used for task assignment and scheduling in cloud computing [41]; the proposed approach measurably increases system performance.
2.4 Neural Network and Expert System One of the attractive applications of ABC optimization reported in the literature is the neural network [42, 43]. In one mathematical model, honeybees are treated as spiking neural networks and the bee colony is mapped to a network of spiking neural networks [44]; the model yields an efficient neural network by exploiting interactions among the spiking networks, similar to the honeybees' local interactions. A multi-layer perceptron (MLP) neural network is trained optimally through ABC, using the most desirable values of the connection weights, to build an intrusion detection system (IDS) [45]. Backpropagation (BP) neural networks are quite popular for predicting software aging, but the random assignment of weights and bias thresholds restricts their accuracy; the ABC optimization proposed in [46] yields more stable predictions and faster convergence by tuning the weight and bias parameters. The self-organizing and self-configuring behavior of ABC is exploited in intelligent systems such as self-adaptive software systems (SAS) [47] to achieve QoS goals [48]: each subsystem has an objective and is mapped to honeybees to perform local optimization, so the decentralized system converges to a global solution; this enhances self-healing capability and makes the system more robust. Raising an alarm while driving automatic vehicles is a crucial action; an ABC-based solution is presented in [49] to avoid collisions with moving and stationary objects in front-to-rear accidents, where parameters such as collision speed and stopping distance are considered when validating the comparisons with existing approaches.
2.5 Robotics Employed bees in a hive provide path guidance to unemployed bees toward food sources already explored. These characteristics of hive colonization are exported to the robotics domain to optimize state-of-the-art routes. However, the low convergence rate and one-dimensional search strategy of ABC optimization limit its benefit in robotics. A differential evolution (DE) algorithm and a global-best-leading strategy are combined to exploit the foraging behavior of ABC optimization (CoGABC) for robot path guidance in [50]; the authors claim better performance than conventional path-guidance approaches. The reason for the improvement is a change in the update behavior of the onlooker bees in terms of dimension-dependent variables, together with the employee bees' enhanced exploration of the solution search space compared with standard ABC optimization.
Another combination, of evolutionary programming (EP), which works in phenotype space, with ABC optimization is proposed by Cruz et al. [51], in which the local solutions of ABC optimization are further refined by the evolutionary algorithm to improve the global path; the authors validate the improvement against the probabilistic roadmap method (PRM) with Dijkstra and claim practical deployment feasibility. Similar work on multi-target, multi-robot optimal path planning is carried out in [52] to avoid collisions in a dynamic environment; an unknown environment is used for experimentation, and navigation effectiveness is validated against traditional path-planning approaches. ABC optimization works in a distributed manner, and this feature is imported into path planning for a team of robots [53]: artificial bees act as different robots and adapt their velocity for collision avoidance, speeding up the convergence of the solution. The authors consider parameters such as the number of initial collisions, the number of message exchanges, and the time consumed to generate paths for all robots, validated in a real environment with 1 to 30 robots. Similar work is simulated in [54] for collision-free movement of multi-robot systems, using efficient objective functions for targets, obstacles, and collision-free moves. Another evaluation in [55] adopts an improved ABC optimization with the Arrhenius equation (aABC); the proposed approach balances the exploration and exploitation behavior of the honeybees. Optimizing recent application domains and upcoming research trends remains a major challenge: a huge amount of data is generated by high-resolution audio and video transmission, which requires preprocessing and optimization during computational analysis. Standard ABC is a candidate for such applications but requires further improvement by adopting better operations, such as crossover and swapping, over the population set.
3 Genetic Algorithm-Inspired ABC The standard ABC approach does not contain crossover- or mutation-like operations during the optimization stage, yet these operations can produce better solutions. To solve binary optimization problems, a crossover- and mutation-based ABC is proposed in [56], while another novel binary ABC (NBABC) [57] claims to outperform other binary swarm-based and evolutionary optimizers. Integration of crossover from the genetic algorithm into ABC, named CbABC, has been proposed in [58] to boost exploration of the solution search space. Similarly, a few more genetic crossover-based ABC variations are proposed for specific constraint-oriented numerical optimization problems [59, 60]. The authors in [61] explored eight such crossovers compatible with ABC optimization to further improve standard ABC performance; the best of the eight crossovers was applied to ABC for a multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) cognitive system, and an enhancement over standard GA-, PSO-, and DE-based algorithms is demonstrated. The optimization
in communication problems for radar and satellite is highlighted in [62], where a population-reduction-based ABC is used to improve exploration of the solution search space; in addition to population reduction, a binary constraint-repair method handles boundary-condition violations, and the approach is named population reduction and hybrid repair ABC (PRHRABC). A significant improvement in redundant search and population diversity is achieved by hybrid ABC (HABC) [63], which uses Powell's method for efficient exploration and exploitation of the solution space. The learning of an adaptive network fuzzy inference system (ANFIS), along with benchmark functions, is optimized by an adaptive and hybrid ABC (aABC) [64]; a significant performance enhancement over the standard ABC approach is claimed, with ANFIS performance increasing by 4.51–33.33% according to the authors. A generalized opposition-based learning strategy is employed over standard ABC [65] to extract more information for guided search; for this, a Gaussian bare-bones search equation inspired by BBPSO [66] is implemented in standard ABC. A PSO-based ABC optimization that overcomes the local-minima problem of PSO has also been proposed [67].
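As an illustration of how a GA-style crossover can be grafted onto ABC's employed-bee phase, the sketch below uses a classic one-point crossover with greedy acceptance; it is a generic sketch of the idea behind crossover-based variants such as CbABC, not the exact operator from any of the cited papers, and the crossover rate and objective are illustrative.

```python
import random

def one_point_crossover(parent_a, parent_b, rng):
    """Classic GA one-point crossover producing a single offspring vector."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]

def crossover_employed_phase(foods, costs, objective, rng, cx_rate=0.3):
    """Employed-bee style pass: with probability cx_rate recombine a food source
    with a random partner, keeping the offspring only if it improves (greedy)."""
    n = len(foods)
    for i in range(n):
        if rng.random() >= cx_rate:
            continue
        partner = rng.choice([j for j in range(n) if j != i])
        child = one_point_crossover(foods[i], foods[partner], rng)
        child_cost = objective(child)
        if child_cost < costs[i]:
            foods[i], costs[i] = child, child_cost

rng = random.Random(7)
objective = lambda x: sum(v * v for v in x)
foods = [[rng.uniform(-5, 5) for _ in range(4)] for _ in range(6)]
costs = [objective(x) for x in foods]
crossover_employed_phase(foods, costs, objective, rng)
print(min(costs))
```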
4 Conclusion and Future Scope of ABC Optimization This review paper is written concisely, accommodating a few of the recent application domains, such as wireless ad hoc networks, cloud computing, IoT, neural networks, expert systems, and robotics. The natural foraging behavior of bees in the hive is utilized to explore the solution search space and to converge at the global best optimum through local best interactions. The benefit of GA operators is also investigated to improve the candidacy of the bee-based solution approach. The following conclusive points are drawn in this review:
1. The standard ABC optimization is a powerful technique for complex combinatorial and optimization problems; moreover, its variants such as PSO-ABC, GA-ABC, and SA-ABC show significant improvement in convergence rate and in the bees' exploration.
2. Owing to their higher complexity, meta-heuristic techniques require a large amount of memory and computational power; they are thus useful in applications where such resources are available, whereas in resource-scarce settings the application may get stuck before reaching the optimum solution.
Many research areas have already been addressed with ABC; nevertheless, recent scopes for utilizing such optimization techniques include:
1. One of the basic modifications that can enhance the efficiency of ABC optimization is the selection of the initial population. There are strategies through which the best initial population set can be fed to the different types of bees. Further, one can explore multiple fitness functions, as required by the problem domain, for a better global best solution.
2. Including the mentioned variants of standard ABC, researchers can attempt to adapt ABC with other nature-inspired techniques, such as biogeographical migration, bird flocking, and fish schooling, to improve efficiency further.
3. With the development of transmission technologies such as 4G and 5G, a huge amount of generated data needs to be analyzed for prediction; hence, data analysis can be one of the domains for ABC candidacy.
Acknowledgements I would like to extend my gratitude to the anonymous reviewers for their valuable suggestions, which helped shape the article. I am thankful to the University of Petroleum and Energy Studies, Dehradun, for providing infrastructure and financial support for this article.
References 1. Bonabeau E, Meyer C (2001) Swarm intelligence. Harv Bus Rev 79(5):106–114 2. Bonabeau E, Corne D, Poli R (2010) Swarm intelligence: the state of the art special issue of natural computing. Nat Comput 9(3):655–657 3. Barracuda swimming iStock: https://www.istockphoto.com/photos/, Last accessed: 2021/07/09 4. Bonabeau E, Dorigo M, Theraulaz G (1999) Swarm intelligence: from natural to artificial systems. Oxford University Press, New York, NY 5. Bonabeau E, Sobkowski A, Theraulaz G, Deneubourg JL (1997) Adaptive task allocation inspired by a model of division of labor in social insects. In: BCEC, pp 36–45 6. Karaboga D (2005) An idea based on honey bee swarm for numerical optimization, vol 200, pp 1–10. Technical report-tr06, Erciyes University, Engineering Faculty, Computer Engineering Department 7. Jeanne RL (1986) The evolution of the organization of work in social insects. Monit Zool Ital 20:267–287 8. Oster G, Wilson EO (1978) Castes and ecology in the social insects. Princeton University Press, Princeton, NJ 9. Zhu G, Kwong S (2010) Gbest-guided artificial bee colony algorithm for numerical function optimization. Appl Math Comput 217(7):3166–3173 10. Gao W, Liu S (2011) Improved artificial bee colony algorithm for global optimization. Inf Process Lett 111(17):871–882 11. Karaboga D, Gorkemli B (2014) A quick artificial bee colony (qABC) algorithm and its performance on optimization problems. Appl Soft Comput 23:227–238 12. Kuang F, Jin Z, Xu W, Zhang S (2014) A novel chaotic artificial bee colony algorithm based on tent map. In: 2014 IEEE congress on evolutionary computation (CEC). IEEE, pp 235–241 13. Liao X, Zhou J, Zhang R, Zhang Y (2012) An adaptive artificial bee colony algorithm for long-term economic dispatch in cascaded hydropower systems. Int J Electr Power Energy Syst 43(1):1340–1345 14. Misra S, Woungang I, Misra SC (eds) (2009) Guide to wireless ad hoc networks. Springer Science & Business Media 15. El Amine Fekair M, Lakas A, Korichi A (2016) CBQoS-Vanet: cluster-based artificial bee colony algorithm for QoS routing protocol in VANET. In: 2016 International conference on selected topics in mobile & wireless networking (MoWNeT), pp 1–8. https://doi.org/10.1109/ MoWNet.2016.7496597 16. Zhang T, Chen G, Zeng Q, Song G, Li C, Duan H (2020) Seamless clustering multi-hop routing protocol based on improved artificial bee colony algorithm. EURASIP J Wirel Commun Network 2020(1):1–20
17. Shankar A, Jaisankar N (2018) Dynamicity of the scout bee phase for an artificial bee colony for optimized cluster head and network parameters for energy-efficient sensor routing. SIMULATION 94(9):835–847 18. Gambhir A, Payal A (2019) Analysis of particle swarm and artificial bee colony optimizationbased clustering protocol for WSN. Int J Comput Syst Eng 5(2):77–81 19. Yue Y, Li J, Fan H, Qin Q (2016) Optimization-based artificial bee colony algorithm for data collection in large-scale mobile wireless sensor networks. J Sens 2016 20. Ayinde BO, Hashim HA (2018) Energy-efficient deployment of relay nodes in wireless sensor networks using evolutionary techniques. Int J Wireless Inf Netw 25(2):157–172 21. Saad E, Elhosseini MA, Haikal AY (2019) Culture-based artificial bee colony with heritage mechanism for optimization of wireless sensors network. Appl Soft Comput 79:59–73 22. Annepu V, Rajesh A (2020) Implementation of an efficient artificial bee colony algorithm for node localization in unmanned aerial vehicle assisted wireless sensor networks. Wirel Pers Commun 114:2663–2680 23. Ashton K (2009) That ‘internet of things’ thing. RFID J 22(7):97–114 24. Atzori L, Iera A, Morabito G (2010) The internet of things: a survey. Comput Netw 54(15):2787–2805 25. Yan J, Peng Y, Shen D, Yan X, Deng Q (2018) A novel energy-efficient routing scheme based on artificial bee colony algorithm in wireless body area networks. In: 2018 International conference on computer, information and telecommunication systems (CITS), pp 1–5. https://doi.org/10. 1109/CITS.2018.8440188 26. Yan J, Peng Y, Shen D, Yan X, Deng Q (2018) An artificial bee colony-based green routing mechanism in WBANs for sensor-based E-healthcare systems. Sensors 18(10):3268 27. Famila S, Jawahar A, Sariga A, Shankar K (2019) Improved artificial bee colony optimization based clustering algorithm for SMART sensor environments. Peer-to-Peer Netw Appl, pp 1–9 28. Famila S, Jawahar A (2020) Improved artificial bee colony optimization-based clustering technique for WSNs. Wirel Pers Commun 110(4):2195–2212 29. Muhammad Z, Saxena N, Mansoor Qureshi I, Ahn CW (2017) Hybrid artificial bee colony algorithm for an energy-efficient internet of things based on wireless sensor network. IETE Tech Rev 34(1):39–51 30. Sathish C, Srinivasan K (2021) An artificial bee colony algorithm for efficient optimized data aggregation to agricultural IoT devices application. J Appl Sci Eng 24(6). https://doi.org/10. 6180/jase.202112_24(6).0013 31. Almazmoomi AM, Mostafa Monowar M (2019) On designing bee inspired routing algorithm for device-to-device communication in the Internet of Things. Int J Adv Comput Sci Appl 10(11):99–107 32. Li F, Clarke S (2020) Automated SLA negotiation in a dynamic IoT environment-a metaheuristic approach. In: International conference on service-oriented computing, Springer, Cham, pp 110–120 33. Wang L, Von Laszewski G, Younge A, He X, Kunze M, Tao J, Fu C (2010) Cloud computing: a perspective study. New Gener Comput 28(2):137–146 34. Armbrust M, Fox A, Griffith R, Joseph AD, Katz R, Konwinski A, Lee G et al (2010) A view of cloud computing. Commun ACM 53(4):50–58 35. Navimipour NJ (2015) Task scheduling in the cloud environments based on an artificial bee colony algorithm. In: International conference on image processing, pp 38–44 36. Gao Y, Zhang S, Zhou J (2019) A hybrid algorithm for multi-objective scientific workflow scheduling in IaaS Cloud. IEEE Access 7:125783–125795 37. 
Karthikeyan K, Sunder R, Shankar K, Lakshmanaprabu SK, Vijayakumar V, Elhoseny M, Manogaran G (2020) Energy consumption analysis of virtual machine migration in cloud using hybrid swarm optimization (ABC–BA). The J Supercomput 76(5):3374–3390 38. Calheiros RN, Ranjan R, Beloglazov A, De Rose CAF, Buyya R (2011) CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Softw: Pract Experience 41(1):23–50
39. Garg S, Kaur K, Batra S, Aujla GS, Morgan G, Kumar N, Zomaya AY, Ranjan R (2020) EnABC: an ensemble artificial bee colony based anomaly detection scheme for cloud environment. J Parallel Distrib Comput 135:219–233 40. Van Laarhoven PJM, Aarts EHL (1987) Simulated annealing. In: Simulated annealing: theory and applications. Springer, Dordrecht, pp 7–15 41. Jena RK (2017) Task scheduling in cloud environment: a multi-objective ABC framework. J Inf Optim Sci 38(1):1–19 42. Kröse B, Krose B, van der Smagt P, Smagt P (1993) An introduction to neural networks 43. Yegnanarayana B (2009) Artificial neural networks. PHI Learning Pvt. Ltd. 44. Fernando S, Kumarasinghe N (2018) Modeling honeybee communication using network of spiking neural networks to simulate nectar reporting behavior. Artif Life Robot 23(2):241–248 45. Hajimirzaei B, Navimipour NJ (2019) Intrusion detection for cloud computing using neural networks and artificial bee colony optimization algorithm. ICT Express 5(1):56–59 46. Liu J, Meng L (2019) Integrating artificial bee colony algorithm and BP neural network for software aging prediction in IoT environment. IEEE Access 7:32941–32948 47. Salehie M, Tahvildari L (2009) Self-adaptive software: landscape and research challenges. ACM Trans Auton Adapt Syst (TAAS) 4(2):1–42 48. Rajan B, Chandra V (2017) ABC metaheuristic based optimized adaptation planning logic for decision making intelligent agents in self adaptive software system. In: International conference on data mining and Big Data. Springer, Cham, pp 496–504 49. Yousef Q, Alqudah A, Alboon S (2016) Forward vehicle collision mitigation by braking system based on artificial bee colony algorithm. Neural Comput Appl 27(7):1893–1905 50. Xu F, Li H, Pun C-M, Hu H, Li Y, Song Y, Gao H (2020) A new global best guided artificial bee colony algorithm with application in robot path planning. Appl Soft Comput 88:106037 51. Contreras-Cruz MA, Ayala-Ramirez V, Hernandez-Belmonte UH (2015) Mobile robot path planning using artificial bee colony and evolutionary programming. Appl Soft Comput 30:319– 328 52. Faridi AQ, Sharma S, Shukla A, Tiwari R, Dhar J (2018) Multi-robot multi-target dynamic path planning using artificial bee colony and evolutionary programming in unknown environment. Intell Serv Robot 11(2):171–186 53. Contreras-Cruz MA, Lopez-Perez JJ, Ayala-Ramirez V (2017) Distributed path planning for multi-robot teams based on artificial bee colony. In: 2017 IEEE congress on evolutionary computation (CEC), pp 541–548. https://doi.org/10.1109/CEC.2017.7969358 54. Liang J-H, Lee C-H (2015) Efficient collision-free path planning of multiple mobile robots system using efficient artificial bee colony algorithm. Adv Eng Softw 79:47–56 55. Nayyar A, Nguyen NG, Kumari R, Kumar S (2020) Robot path planning using modified artificial bee colony algorithm. In: Frontiers in intelligent computing: theory and applications. Springer, Singapore, pp 25–36 56. Ozturk C, Hancer E, Karaboga D (2015) A novel binary artificial bee colony algorithm based on genetic operators. Inf Sci 297:154–170 57. Santana CJ Jr, Macedo M, Siqueira H, Gokhale A, Bastos-Filho CJA (2019) A novel binary artificial bee colony algorithm. Fut Gener Comput Syst 98:180–196 58. Kumar S, Sharma VK, Kumari R (2014) A novel hybrid crossover based artificial bee colony algorithm for optimization problem. arXiv preprint arXiv:1407.5574 59. Yan X, Zhu Y, Chen H, Zhang H (2015) A novel hybrid artificial bee colony algorithm with crossover operator for numerical optimization. 
Nat Comput 14(1):1–16 60. Brajevic I (2015) Crossover-based artificial bee colony algorithm for constrained optimization problems. Neural Comput Appl 26(7):1587–1601 61. Zhang X, Zhang X (2017) Using artificial bee colony algorithm with crossover for power allocation in cognitive MIMO-OFDM system. Phys Commun 25:363–368 62. Zhang X, Zhang X (2016) A novel artificial bee colony algorithm for radar polyphase code and antenna array designs. J Wirel Com Network 2016:40. https://doi.org/10.1186/s13638-0160533-4
63. Ma L, Hu K, Zhu Y, Chen H (2015) A hybrid artificial bee colony optimizer by combining with life-cycle, Powell’s search and crossover. Appl Math Comput 252:133–154 64. Karaboga D, Kaya E (2016) An adaptive and hybrid artificial bee colony algorithm (aABC) for ANFIS training. Appl Soft Comput 49:423–436 65. Zhou X, Wu Z, Wang H, Rahnamayan S (2016) Gaussian bare-bones artificial bee colony algorithm. Soft Comput 20(3):907–924 66. Kennedy J (2003) Bare bones particle swarms. In: Proceedings of the 2003 IEEE swarm intelligence symposium. SIS’03 (Cat. No. 03EX706), IEEE, pp 80–87 67. Han Z, Li Y, Liang J (2018) Numerical improvement for the mechanical performance of bikes based on an intelligent PSO-ABC algorithm and WSN technology. IEEE Access 6:32890– 32898. https://doi.org/10.1109/ACCESS.2018.2845366
Lifetime Aware Secure Data Aggregation through Integrated Incentive-based Mechanism in IoT-based WSN Environment S. Nandini and M. Kempanna
Abstract The Internet of things has attracted considerable attention from researchers due to its wide range of applicability in daily human life, in applications such as healthcare and agriculture. A WSN operates in a resource-restricted environment and generates a huge amount of data, which further causes data redundancy. Although data redundancy is efficiently addressed through various data aggregation mechanisms, security remains a primary concern for adoption in real-time environments. The integrated incentive-based mechanism (IIBM) comprises three parts: first, this research work designs optimal and secure data aggregation; the second part formulates the correct identification of deceptive data packets; and the third part discards deceptive nodes through a conditional approach. The integrated incentive mechanism is evaluated considering different security parameters such as the identification of malicious nodes and of misidentified malicious or dishonest nodes; further, a comparison with the existing model is carried out to prove the model's efficiency. Furthermore, other parameters such as energy utilization and the number of functioning nodes are considered for the optimality evaluation of the model. Performance evaluation shows an enhancement of nearly 7%, 14%, and 15% for the three distinctive deceptive-node proportions of 5, 10, and 15%, respectively. Keywords Data aggregation · WSN-IoT · Secure data aggregation
S. Nandini (B) Department of Computer Science & Engineering, Research Centre-Bangalore Institute of Technology, Bangalore 560004, India e-mail: [email protected] M. Kempanna Department of AI&ML, Bangalore Institute of Technology, Bangalore 560004, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_22
1 Introduction The Internet of things (IoT) has aided the advancement of multiple omnipresent sensors, which has led to high qualitative and quantitative growth in the healthcare, management, and entertainment sectors [1]. Engaging wireless sensor networks (WSNs) as the digital skin of the IoT has enabled intelligent services. Some significant computations regarding environments such as rural, non-rural, suburban, and underwater settings can be derived using WSNs [2, 3]. The incorporation of a large number of small, low-energy sensors has enabled WSNs to build context-awareness models using various types of sensors. Advancements under the IoT umbrella, such as smart cities, smart vehicles, intelligent factories, intelligent buildings, smart wearables, and smart healthcare monitoring, push toward designing systems with minimal challenges. To achieve this, the primary criterion is to address the current problems and thereby expand the everyday employment of this interdisciplinary technology [4]. Figure 1 shows the data transmission over the designed WSN environment. Figure 1 comprises sensors, base stations, and cluster heads; data are sensed by the sensors and passed to the cluster head, where the data are aggregated and then sent to the base station. The technique used for reducing the energy intake and removing redundant data is data aggregation. In the process of data aggregation, the sensor nodes are arranged as a tree, the root being the base station. The intermediate sensor nodes aggregate the data arriving from the leaf nodes and then pass the aggregated output to the root, the base station. Yet, this process sometimes causes issues in a few applications, notably remote healthcare monitoring systems. The sensor nodes are frequently positioned in an unfriendly environment that has minimal bandwidth and unreliable communication channels. This might allow for hostile data
Fig. 1 Data transmission
alterations and forgery of the data, resulting in infringement of the user's privacy. For instance, an intruder might counterfeit a duplicate alarming reading and share it with the whole network to degrade the network's performance. The amount of data from sensors is vast, which makes processing and storing those data difficult; the solution to this problem is to remove the redundant data from the sensor data, which is done using the aggregation method [5]. A few instances of aggregation functions are max, count, and min, which are used for processing numerical data. In the past, several researchers have taken security into account; for example, in [6], the issue of privacy-preserving data aggregation is examined in the circumstances of a cyber-physical model. In [7], a radial basis function neural network (RBFNN) predicts the speed of the data aggregation process, which aids in modeling energy-saving MDC routes. In [8], an event distribution based on the data aggregation model is designed with a 3D environment review using a deep Q-network (DQN). In [9], energy-conscious data analysis and data aggregation make use of the reinforcement learning method. In [10], a deep reinforcement learning (DRL)-dependent method is used to unite MDC route design and blockchain. However, the above-mentioned methods reduce either the energy cost or the aggregation ratio. In [11], the blockchain's anonymous behavior is exploited for user privacy-preserving data aggregation. In [12], a decentralized structure based on the blockchain, called CrowdBC, is delivered. The registration of end-users is done without their actual identity, and the vital data are kept in shared storage. In [13], for electrical information accumulation, privacy and protection concerns are addressed by deploying blockchain and edge computing mechanisms in the smart grid. In [14], a blockchain-dependent shared cloud framework is modeled with fog nodes, enabled by software-defined networking. Manju et al. [15] developed a QoS-based approach for target coverage and k-coverage [16] in WSNs. Goyal et al. developed routing algorithms for secure routing [17, 18]. Kumar et al. [19, 20] discussed security issues in WSNs. Nevertheless, these mechanisms do not reach the security expectations of data aggregation because of the conventional block header model and the block generation procedures.
1.1 Motivation and Contribution of Research Work Data aggregation possesses dynamic characteristics and has a wide range of applications, which provides flexibility in designing an energy-efficient model. Thus, motivated by these applications and by low-cost deployment along with the security concern, this research work designs and develops a secure and efficient data aggregation model that provides data security. The contributions of this research work are given as follows:
1. This research focuses on designing and developing an integrated incentive mechanism which comprises various sub-mechanisms; at first, the system model is designed following the IIBM model.
2. The integrated incentive mechanism comprises three parts, where the first part deals with designing and developing a mechanism for optimal and secure data aggregation, and the second part deals with formulating the condition for identifying the deceptive node and discarding it through the conditional approach.
3. Further, the IIBM model is evaluated considering the number of functioning nodes and the energy consumption for efficiency; furthermore, a comparative analysis is carried out considering parameters like misclassified and correctly classified data packet identification.
This research work is divided into various sections following the standard research format: the first section discusses the basics of IoT along with WSN and their applications; it also discusses the need for data aggregation and various security concerns, and ends with the motivation and research contribution. The second section focuses on designing the IIBM methodology along with its mathematical modeling, and the last section presents the performance evaluation of the IIBM methodology along with a comparative analysis with the existing model.
2 Proposed Methodology In this section of the research, this research work designs and develops an integrated incentive mechanism for reliable and verifiable secure data aggregation; Fig. 2 shows the IIBM workflow which comprises five distinctive blocks. At first, the research preliminaries are initialized along with network design such as node placement and cluster head selection. The second block refers to designing the mathematical model
Fig. 2 IIBM methodology
for optimal and secure data aggregation through our designed formulation. The third block computes the correct packet identification, i.e., how many honest data packets are identified as malicious (also referred to as dishonest data packets) and vice versa. In the fourth block, this research work designs the particular condition that distinguishes malicious packets from normal packets, and the energy utilization is also computed. Further, based on the designed condition, the fifth block discards the malicious node.
2.1 Network Model Let us consider a particular network that comprises several users, i.e., Y = {y_1, y_2, ..., y_P}, and a data gathering center (DGC); the DGC behaves like the cluster head that gathers the data, and all P users report their data Z = {z_1, z_2, ..., z_P} to the DGC. Here, each node is rated as a malicious (deceptive) node or a normal (honest) node along with its reputation; further, the master computes the network-wide average of the secure data as

A = P^{-1} Σ_{k=1}^{P} z_k    (1)
Further, in this model, an untrusted DGC is considered, along with two types of nodes, namely normal nodes and malicious nodes, and two distinctive threats: the first threat concerns the aggregated data quality, and the second one compromises privacy. The weighted aggregate at the mth update is computed as

A^m = Σ_{k=1}^{O} y_k^m z_k^m    (2)
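As a concrete illustration of Eqs. (1) and (2), the short sketch below computes the plain average and the weighted aggregate on the DGC side. The variable names mirror the notation above, while the data values and the class name are invented for illustration only.

// Sketch of the DGC-side aggregation in Eqs. (1) and (2).
// z[k] are the reported readings, y[k] the per-node aggregation weights.
public class DgcAggregation {
    static double plainAverage(double[] z) {                 // Eq. (1): A = (1/P) * sum z_k
        double sum = 0;
        for (double v : z) sum += v;
        return sum / z.length;
    }

    static double weightedAggregate(double[] y, double[] z) { // Eq. (2): A^m = sum y_k^m * z_k^m
        double sum = 0;
        for (int k = 0; k < z.length; k++) sum += y[k] * z[k];
        return sum;
    }

    public static void main(String[] args) {
        double[] z = {24.1, 23.8, 24.3, 40.0};                 // last reading could be deceptive
        double[] y = {0.33, 0.33, 0.34, 0.0};                  // a discarded node gets weight 0
        System.out.println("Plain average:      " + plainAverage(z));
        System.out.println("Weighted aggregate: " + weightedAggregate(y, z));
    }
}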
2.2 Deceptive Packet Detection and Node Discard Once the malicious or deceptive nodes are identified, they should be discarded; thus, identification and discarding of such nodes are carried out by updating the sensor node status for the mth time, and the corresponding parameter is denoted by u_k^m, with its initial value denoted by y_k. Moreover, as discussed earlier in this section, the maximum deviation between a packet and its reference is f_{Mx} = max_k { f_k }, and the deviation between a normal packet and the corresponding reference can be denoted as o. The discarding of packets can be designed through the steps below.
Condition 1: If f_k is less than or equal to f_{Mx}, then the update value increases with a decrease in f_k; hence, the reliability of the node increases if the particular node provides consistent sensed data packets. Condition 2: If f_k is greater than f_{Mx}, then its reliability decreases with an increase in the value of f_k − f_{Mx}; this means that if a particular sensor sends deceptive data packets, the value of its parameter approaches null and the node is thus discarded. The update function can therefore be designed through the equation below:

u_k^m ← u_k^{m-1} · exp{−f_k} + [2(f_k − f_{Mx}) + 1]^{-1} · (1 − u_k^{m-1}),    f_k ≤ f_{Mx}
u_k^m ← u_k^{m-1} · (1 − exp{−R(f_k − f_{Mx})}) + [2(f_k − f_{Mx}) − 1]^{-1} · u_k^{m-1},    f_k > f_{Mx}    (3)
In the above equation, R and the scaling constant are real numbers used for scaling u_k^m; both parameters are evaluated by monitoring the updated value of a normal or deceptive node. The above formulation and condition preserve the privacy of nodes and remove the unsecured data packets; further, normal data aggregation is carried out as below. This paper uses the normal data for weight computation; thus, let us consider Z = {z_1, z_2, ..., z_P}, which denotes the normal data verified through the above condition; the average aggregation weights are then given as

y_k^m = Z_k^m ( Σ_{k=1}^{P} Z_k^m )^{-1}    (4)
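To make the detection-and-discard flow of Conditions 1 and 2 and the weighting of Eq. (4) concrete, the following sketch preserves only the qualitative behaviour (consistent nodes gain reliability, deviating nodes lose it and are eventually discarded); the exact scaling constants of Eq. (3) are not reproduced, and all numeric values and class names are invented assumptions.

// Sketch of the reputation update implied by Conditions 1 and 2 plus the
// aggregation-weight computation of Eq. (4). Constants are illustrative only.
public class DeceptiveNodeFilter {
    static double updateTrust(double u, double f, double fMax, double r) {
        if (f <= fMax) {
            return Math.min(1.0, u + (1.0 - u) * Math.exp(-f));   // Condition 1: trust grows
        }
        return u * Math.exp(-r * (f - fMax));                      // Condition 2: trust decays
    }

    // Eq. (4): weights of the verified (normal) readings, y_k = Z_k / sum(Z)
    static double[] aggregationWeights(double[] normalData) {
        double sum = 0;
        for (double v : normalData) sum += v;
        double[] y = new double[normalData.length];
        for (int k = 0; k < y.length; k++) y[k] = normalData[k] / sum;
        return y;
    }

    public static void main(String[] args) {
        double trust = 0.5;
        double[] deviations = {0.1, 0.05, 2.5, 3.0};               // f_k per round (made up)
        for (double f : deviations) {
            trust = updateTrust(trust, f, 1.0, 2.0);
            System.out.printf("deviation=%.2f -> trust=%.3f%n", f, trust);
        }
        if (trust < 0.05) System.out.println("Node discarded as deceptive.");
    }
}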
The above equation provides the guarantee for the normal data, as only normal data are considered for computing the average aggregation.
Performance Evaluation
In this section of the research, the authors evaluate the IIBM model; the IIBM model is evaluated by designing the specific network parameters given in Table 1. Furthermore, the evaluation is carried out on the Windows 10 platform using the Visual Studio 2017 IDE and the Sensoria simulator; the system architecture comprises 8 GB of CUDA-enabled Nvidia RAM and a 1 TB hard disk. The Sensoria simulator [21] is used for the simulation; also, since our approach is consensus-based, we have chosen the consensus-based approach [22] as the existing model. Furthermore, our model considered 100 sensor nodes along with 5, 10, and 15% of deceptive nodes, each node having an initial energy of 0.05 J.

Table 1 Improvisation of IIBM model over the existing model

Deceptive node         Improvisation over the existing model (%)
5% deceptive nodes     6.59
10% deceptive nodes    14.666
15% deceptive nodes    15.38
Fig. 3 Network lifetime
2.3 Network Lifetime Network lifetime is one of the eminent parameters for evaluating the IIBM model; Fig. 3 shows the lifetime of the network in terms of energy consumption while varying the percentage of deceptive nodes, i.e., 5, 10, and 15%. Through Fig. 3, it is observed that with an increased number of deceptive nodes, there is a higher amount of energy consumption.
2.4 Correct Packet Identification over a Deceptive Node The deceptive packet identification parameter plays a major role in data aggregation, and the correctness of this identification is another constraint. More correct identification of deceptive nodes indicates better security of the network. Figure 4 shows the comparison of the existing model and the IIBM mechanism while varying the percentage of deceptive nodes.
2.5 Dead Nodes The lifetime of the network directly depends on the number of sensor nodes alive and network energy consumption. More number of alive nodes indicates less energy consumption and model efficiency. Figure 5 shows the graph analysis of dead nodes through percentage variation in deceptive nodes; the graph is plotted against the
Fig. 4 Comparison of correctly identified packets (IIBM vs. SCDA) against the percentage of deceptive nodes
Fig. 5 Number of dead nodes
number of rounds. Thus, it is observed that at the start of the rounds there are no dead nodes, but as the simulation proceeds further, nodes start failing and the count reaches up to 15 dead nodes.
Fig. 6 Throughput comparison of IIBM and SCDA varying the percentage of deceptive nodes
2.6 Throughput In general, throughput is defined as the rate at which work gets done; a higher throughput shows better efficiency of the model. Figure 6 shows the throughput performance comparison of the existing and IIBM models; in the case of 5% deceptive nodes, the existing model achieves a throughput value of 0.8099 while the IIBM model achieves 0.8633. Similarly, in the cases of 10% and 15% deceptive nodes, the existing model achieves throughput values of 0.5555 and 0.3705, whereas the IIBM model achieves throughput values of 0.6364 and 0.4274, respectively.
2.7 Comparative Analysis and Discussion In this section, the improvisation of the IIBM model over the existing model is discussed; considering the above sections, there are two parameters, i.e., correct packet identification (whether a packet is deceptive or normal) and throughput. Furthermore, Table 1 shows the improvisation of the IIBM model over the existing model, i.e., in the cases of 5, 10, and 15% deceptive nodes, the IIBM model achieves improvisations of 6.59, 14.66, and 15.38%, respectively.
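For readers who wish to trace the entries of Table 1, the 5% case is consistent with reading the improvisation as the relative throughput gain of IIBM over the existing (SCDA) model from Sect. 2.6; note that this interpretation is an assumption, since the text does not state the formula explicitly: (0.8633 − 0.8099) / 0.8099 × 100 ≈ 6.59%. The same calculation for the 10% and 15% cases yields roughly 14.56% and 15.36%, close to (though not exactly matching) the tabulated 14.666% and 15.38%, with the small differences plausibly due to rounding of the reported throughput values.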
3 Conclusion Data aggregation is considered the basic operation of IoT applications, specifically WSN-based applications, to avoid data redundancy. Apart from reducing redundancy, data aggregation offers various advantages such as optimized energy utilization, the number
of functioning nodes, and the optimization of other parameters. However, due to the vulnerability of the WSN environment, security has been one of the primary concerns; hence, this research work focuses on designing a reliable, verifiable, secure, and efficient data aggregation mechanism. This paper designs and develops a method named the integrated incentive mechanism, in which three sub-mechanisms are integrated. The first sub-mechanism deals with designing optimal and secure data aggregation; the second deals with formulating the correct identification of data packets. Further, a condition is designed in the third part that distinguishes malicious from non-malicious nodes, and based on this condition, malicious nodes are discarded. Moreover, the IIBM model is evaluated considering security parameters such as misclassified data packets and correctly identified data packets, and through the comparative analysis, it is observed that the integrated incentive mechanism clearly outperforms the existing model. Similarly, the throughput comparison also indicates the model's efficiency over the other model. Although the integrated incentive mechanism possesses significant advantages, given the research-oriented model and the many WSN variants, there are other areas that can be explored, such as considering a greater number of parameters for model evaluation; there is also scope for further improvement in the identification of malicious nodes, and further work would address discarding deceptive packets and nodes as well as the computation of misclassified data packets.
References 1. Shafique K, Khawaja BA, Sabir F, Qazi S, Mustaqim M (2020) Internet of Things (IoT) for next-generation smart systems: a review of current challenges, future trends and prospects for emerging 5G-IoT scenarios. IEEE Access 8:23022–23040. https://doi.org/10.1109/ACCESS. 2020.2970118 2. Shahraki A, Taherkordi A, Haugen Ø, Eliassen F A survey and future directions on clustering: from WSNs to IoT and modern networking paradigms. IEEE Trans Netw Serv Manag. https:// doi.org/10.1109/TNSM.2020.3035315 3. Lazarescu MT (2013) Design of a WSN platform for long-term environmental monitoring for IoT applications. IEEE J Emerg Sel Top Circuits Syst 3(1):45–54. https://doi.org/10.1109/JET CAS.2013.2243032 4. Gupta V, De S Energy-efficient edge computing framework for decentralized sensing in WSNassisted IoT. IEEE Trans Wirel Commun. https://doi.org/10.1109/TWC.2021.3062568 5. Dang LM, Piran M, Han D, Min K, Moon H (2019) A survey on internet of things and cloud computing for healthcare. Electronics 8(7):768. View at: Publisher Site | Google Scholar 6. Yu J, Wang K, Zeng D, Zhu C, Guo S (2018) Privacy-preserving data aggregation computing in cyber-physical social systems. ACM Trans Cyber Phys Syst 3(1). Article 8 7. Wang J, Zhang H, Ruan Z, Wang T, Wang XD (2020) A machine learning based connectivity restoration strategy for industrial IoTs. IEEE Access 8:71136–71145 8. Toyoshima K, Oda T, Hirota M, Katayama K, Barolli L (2020) A DQN based mobile actor node control in WSAN: simulation results of different distributions of events considering threedimensional environment. In: International conference on emerging internetworking, data and web technologies. Springer, Cham, pp 197–209
9. Xu C, Wang K, Li P, Xia R, Guo S, Guo M (2020) Renewable energy aware big data analytics in geo-distributed data centers with reinforcement learning. IEEE Trans Netw Sci Eng 7(1):205– 215 10. Liu CH, Lin Q, Wen S (2018) Blockchain-enabled data collection and sharing for industrial IoT with deep reinforcement learning. IEEE Trans Industr Inf 15(6):3516–3526 11. Yang M, Zhu T, Liang K, Zhou W, Deng RH (2019) A blockchainbased location privacy preserving crowdsensing system. Futur Gener Comput Syst 94:408–418 12. Li M, Weng J, Yang A, Lu W, Zhang Y, Hou L, Liu J, Xiang Y, Deng RH (2018) CrowdBC: a blockchain-based decentralized framework for crowdsourcing. IEEE Trans Parallel Distrib Syst 30(6):1251–1266 13. Chen S, You Z, Ruan X (2020) Privacy and energy co-aware data aggregation computation offloading for Fog-assisted IoT networks. IEEE Access 8:72424–72434. https://doi.org/10. 1109/ACCESS.2020.2987749 14. Chen Y, Martínez J-F, López L, Yu H, Yang Z, A dynamic membership group-based multipledata aggregation scheme for smart grid. In: IEEE Internet of Things J. https://doi.org/10.1109/ JIOT.2021.3063412 15. Manju, Singh S, Kumar S, Nayyar A, Al-Turjman F, Mostarda L (2020) Proficient QoS-based target coverage problem in wireless sensor networks. IEEE Access 8:74315–74325 16. Manju, Bhambu P, Kumar S (2020). Target K-coverage problem in wireless sensor networks. J Discrete Math Sci Crypt 23(2):651–659 17. Goyal M, Kumar S, Sharma VK, Goyal D (2020) Modified Dragon-Aodv for efficient secure routing. In: Advances in computing and intelligent systems. Springer, Singapore, pp 539–546 18. Goyal A, Sharma VK, Kumar S, Kumar K (2020) Modified local link failure recovery multicast routing protocol for MANET. J Inf Optim Sci 41(2):669–677 19. Kumar S, Goyal M, Goyal D, Poonia RC (2017) Routing protocols and security issues in MANET. In: 2017 International conference on INFOCOM technologies and unmanned systems (trends and future directions) (ICTUS). IEEE, pp 818–824 20. Kumar S, Saini ML, Kumar S (2020) Improved DYMO-based ACO for MANET using distance and density of nodes. In: Microservices in Big Data analytics. Springer, Singapore, pp 29–38 21. Al-Karaki JN, Al-Mashaqbeh GA (2007) SENSORIA: a new simulation platform for wireless sensor networks. In: 2007 International conference on sensor technologies and applications (SENSORCOMM 2007), 2007, pp 424–429. https://doi.org/10.1109/SENSORCOMM.2007. 4394958 22. He J, Cai L, Cheng P, Pan J, Shi L (2019) Consensus-based data-privacy preserving data aggregation. IEEE Trans Autom Control 64(12):52225229. https://doi.org/10.1109/TAC.2019. 2910171
Multi-temporal Analysis of LST-NDBI Relationship with Respect to Land Use-Land Cover Change for Jaipur City, India Arpana Chaudhary, Chetna Soni, Uma Sharma, Nisheeth Joshi, and Chilka Sharma
Abstract There have been multiple studies examining the land surface temperature–normalized difference built-up index (LST-NDBI) relationship, especially in urban areas; however, many of these studies have lower accuracy while comparing LST and NDBI due to the lower temporal availability of higher-resolution images, particularly those used for LST derivation. The main reason behind this is the strong heterogeneity of land use-land cover (LULC) surfaces, due to which LST changes drastically in space as well as in time; hence, it requires measurements with thorough spatial and temporal sampling. In this study, a comparison of the multi-temporal LST-NDBI relationship is made, and a further comparison is shown using LULC. The results are in agreement with previous studies, showing a strong and positive correlation across the years (r = 0.69, r = 0.64, and r = 0.59 for 2001, 2011, and 2020, respectively). In addition, the LST trend shows a reduction in daytime LST over the years in the summer season, which also reaffirms the findings of the very few studies conducted in semiarid regions. These results can help understand the effects of increasing built-up areas and interclass LULC change on LSTs in urban settings. However, it is recommended that multi-seasonal comparisons with higher-resolution LST maps be carried out to provide a better idea of the LST-NDBI relationship. Keywords LST · NDBI · LULC · Built-up · Multi-temporal · Remote sensing
A. Chaudhary (B) · C. Soni · C. Sharma School of Earth Science, Banasthali Vidyapith, Banasthali 304022, India C. Sharma e-mail: [email protected] U. Sharma · N. Joshi Department of Computer Science, Banasthali Vidyapith, Banasthali 304022, India e-mail: [email protected] N. Joshi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_23
1 Introduction Normalized difference built-up index (NDBI) has been used extensively for the study of urban areas and to understand the effects of urban expansion on the land-use change, not only for detecting built-up areas easily but also to analyze land-use change to understand the magnitude and rate of increase in urban areas [1]. The variations in land surface temperatures (LSTs) in urban areas are one of the properties which represents effects of urbanization and the associated change in landscape patterns, and therefore, remote sensing-based indices when used with LSTs can help quantify changes in landscape due to urbanization and further provide suggestions in terms of ways and means for minimizing urban heat absorption [2]. There is a significant increment in built-up areas from the conversion of barren and vegetated areas, and similarly, the built-up areas, dry-barren lands, and underconstruction land show higher temperatures followed by vegetated areas which exhibit lower temperatures, whereas waterbodies have the lowest temperatures in comparison [3]. Besides, the surface urban heat island (SUHI) effect has also been studied extensively by using Landsat-derived LSTs due to the higher temporal availability and spatial coverage in order to understand variations in UHI effects in different parts of urban environments, e.g., central city areas compared to suburbs. Because of that one common trend has been identified in semiarid regions over the years, i.e., relatively lower LSTs in urban residential areas compared to others [4]. Moreover, NDVI-LST relationship exhibits seasonal variations as the LSTs are highly dependent on soil moisture content and vegetation cover in contrast to builtup areas which more or less are consistent, and as a result, NDBI-LST relation is consistent across seasons and could be considered as an accurate indicator of UHI effects and substitute for percent impervious surface area for the study of UHI [5]. Most importantly, there is an overwhelming effect of bare and semi-bare soils on the UHI especially those surrounding cities in semiarid and arid regions but given that the LST-NDBI relationship can be influenced by associated factors such as humidity, dryness intensity, and precipitation, and therefore, the usage of these indices especially NDVI is inappropriate in these arid/semiarid regions [6]. As mentioned before, different LULC types exhibit different LST such that commercial and industrial areas express higher LSTs than residential land in urban areas, whereas forest and water bodies show the lowest temperatures followed by pastures with slightly higher temperatures comparatively [7]. However, other studies note that fallow land has the highest LSTs, i.e., up to 45.17 °C [8] compared to other LULC types [9]. This study focuses on the relationship of LST-NDBI in summer and compares with similar past studies that how different LULC features relate to LST differently by also using LULC maps. Also, the aim is to identify the increase in built-up areas with NDBI but by understanding the difference between built-up and barren land and how it affects LST and NDBI overall using open-source GIS software—QGIS.
2 Related Works NDBI is used to identify and map built-up areas (especially urban areas) automatically and rapidly which is achieved by utilizing the spectral response of built-up and other areas [10, 11]. Another reason that the NDBI and LST relation is strong and significant because it was found in a study where it was observed that when downscaling techniques such as thermal sharpening downscaling algorithm (TsHARP) were employed for obtaining higher temporal and spatial resolution of LST, NDBI has been proven to be the most effective parameter comparatively in downscaling for all seasons [12, 13]. Because even though urban areas have different types of land use-land cover (LULC) classes, built-up areas are the most prominent ones which preside over other classes when it comes to governing LSTs [14].
3 Study Area See Fig. 1. The study area includes Jaipur City, Rajasthan, with the extent from Latitude 75° 42 N–75° 54 N and Longitude 27° 00 E–26° 48 E. The Jaipur City limits which are used in the study have an area of 423.53 km2 . It has a semiarid to the arid climate and significant human population and settlement. The major parts of cities include Jaipur urban area, Jhalana Reserve Forest (JRF), and airport [15].
Fig. 1 Study area boundary map of Jaipur City with background reference satellite imagery from ESRI
4 Proposed Methodology Data has been collected from various sources like US Satellite platform and Reference image of LULC from ISRO-Bhuvan platform to enhance classification understanding (Up to Satellite Level 1). Then geo-referencing and rectification of the data are done according to the coordinate system of the study area. Subsequently, remotely sensed data are layer stacked, mosaicked and cropped. Preprocessing is done by using image processing software. After preprocessing of the data, classification of the data is performed using multiple training sites on the false-color composite (FCC) of the satellite imagery and the attributes are added based on expert knowledge. The imageries are employed according to the user-defined classes, and it follows a specific classification system. The emphasizing intent was to classify the images into study area-specific classes only. After obtaining the LULC map, NDVI, NDBI, and LST are also obtained by doing some band math (Fig. 2).
4.1 Datasets and Processing Landsat 7 ETM+ (Enhanced Thematic Mapper Plus) satellite imagery was obtained from USGS Earth Explorer for the study. For 2001, 2011, and 2020, Landsat 7 ETM+ images in the form of cloud-free Level 1TP scenes (terrain corrected and suitable for pixel-level time series analysis) of April of the respective years were used. Also, due to the failure of the Scan Line Corrector (SLC) instrument in Landsat 7 in 2003, the 2011 and 2020 images had scan line errors. The spectral bands selected for NDBI analysis were Band 4, i.e., near-infrared (0.77–0.90 μm wavelength), and Band 5, i.e., short-wave infrared (1.55–1.75 μm wavelength). Both bands (Bands 4 and 5) have 30 m spatial resolution and were used for the NDBI analysis.
Fig. 2 Flowchart for the overview of methodology carried out in the study
4.2 Derivation of NDBI The NDBI uses the SWIR and NIR bands. Although the range of NDBI is between −1 and +1, these are considered extreme values [16]. The equation to compute NDBI, provided by [10], is as follows:

NDBI = (SWIR1 − NIR) / (SWIR1 + NIR)    (1)
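As a minimal illustration of Eq. (1), the sketch below computes NDBI pixel-wise from two reflectance arrays. Reading the actual Landsat bands from GeoTIFF files is outside its scope, and the tiny input arrays and the class name are invented for illustration.

// Pixel-wise NDBI of Eq. (1): (SWIR1 - NIR) / (SWIR1 + NIR).
// Inputs are plain reflectance arrays; NaN marks pixels where the sum is zero.
public class NdbiCalculator {
    static double[][] ndbi(double[][] swir1, double[][] nir) {
        int rows = swir1.length, cols = swir1[0].length;
        double[][] out = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double sum = swir1[i][j] + nir[i][j];
                out[i][j] = (sum == 0) ? Double.NaN : (swir1[i][j] - nir[i][j]) / sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] swir1 = {{0.30, 0.22}, {0.18, 0.35}};   // illustrative values only
        double[][] nir   = {{0.20, 0.28}, {0.25, 0.21}};
        double[][] result = ndbi(swir1, nir);
        System.out.println("NDBI[0][0] = " + result[0][0]); // positive -> built-up/bare tendency
    }
}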
4.3 Derivation of LSTs The LSTs were derived using what is called the mono-window algorithm, performed using Band 6 (thermal), Band 3 (red), and Band 4 (NIR). The algorithm is performed in several steps, as provided by [17] and listed as follows. The digital number (DN) values were converted to top-of-atmosphere radiance (Lλ) using Eq. 2:

Lλ = ML × Qcal + AL    (2)
where
ML = band-specific multiplicative rescaling factor contained in the metadata,
AL = band-specific additive rescaling factor contained in the metadata,
Qcal = quantized and calibrated DN value (pixel value).
Then, the radiance (Lλ) is converted to at-satellite brightness temperature (TB) using the thermal constants, also converting it from Kelvin to °C (273.15), as shown in Eq. 3:

TB = K2 / ln(K1 / Lλ + 1) − 273.15    (3)
where Lλ = Top-of-Atmosphere Radiance ln = Natural Logarithm K1 and K2 = Band-specific thermal conversion constants contained in the metadata. Then the land surface emissivity denoted as (surface’s ability to emit thermal energy) was measured using Eq. 3: e = 0.004 Pv + 0.986.
where Pv is the proportion of vegetation, which is calculated using NDVI; once NDVI is obtained, its minimum and maximum values are used again, as shown sequentially by Eqs. 4 and 5:

NDVI = (NIR − Red) / (NIR + Red)    (4)

Pv = [(NDVI − NDVI_min) / (NDVI_max − NDVI_min)]^2    (5)
Finally, the LSTs were derived using the LST equation (Eq. 6):

LST = TB / [1 + (W × TB / p) × ln(e)]    (6)
where W is the wavelength of emitted radiance and p = h × c/s (where h is Planck's constant, c is the speed of light, and s is the Boltzmann constant) [17].
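The sequence of Eqs. (2)–(6) can be condensed into a single per-pixel routine, sketched below. The calibration constants (ML, AL, K1, K2, the band wavelength W, and p) are placeholders that would normally come from the scene metadata and the sensor handbook; they and the sample inputs are assumptions made for illustration, not values asserted by this chapter.

// Per-pixel mono-window chain: DN -> radiance (2) -> brightness temperature (3)
// -> NDVI (4) -> proportion of vegetation (5) -> emissivity -> LST (6).
public class MonoWindowLst {
    static double lstCelsius(double dn, double red, double nirRefl,
                             double ndviMin, double ndviMax,
                             double ml, double al, double k1, double k2,
                             double w, double p) {
        double radiance = ml * dn + al;                                      // Eq. (2)
        double tb = k2 / Math.log(k1 / radiance + 1.0) - 273.15;             // Eq. (3), in degC
        double ndvi = (nirRefl - red) / (nirRefl + red);                     // Eq. (4)
        double pv = Math.pow((ndvi - ndviMin) / (ndviMax - ndviMin), 2.0);   // Eq. (5)
        double e = 0.004 * pv + 0.986;                                       // emissivity
        return tb / (1.0 + (w * tb / p) * Math.log(e));                      // Eq. (6)
    }

    public static void main(String[] args) {
        // Placeholder inputs: one thermal DN plus red/NIR reflectances of the same pixel.
        double lst = lstCelsius(140.0, 0.12, 0.30, 0.05, 0.65,
                                0.067, -0.07, 666.09, 1282.71,
                                11.5e-6, 1.438e-2);
        System.out.printf("Estimated LST = %.2f degC%n", lst);
    }
}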
4.4 Generation of Land Use-Land Cover (LULC) Also, for generating LULC for supplementary comparison and analysis, the maximum likelihood classifier (MLC) technique was used. Using several spectral band combinations as RGB Image (Red–Green–Blue), training samples were created. Based on these training samples, the image classification signature file was prepared, and using this signature file, MLC was carried out. Six LULC classes were generated, viz. agriculture, fallow land, forest patches, other vegetation, built-up, and waterbodies for multi-temporal comparison.
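To make the classification step concrete, the following sketch shows a highly simplified maximum likelihood classifier that models each class with per-band Gaussian statistics and a diagonal covariance. Production MLC implementations (as in standard image-processing software) use full covariance matrices estimated from many training pixels, so this is only an illustrative sketch with invented class statistics, not the signature file actually used in the study.

// Simplified maximum likelihood classification: each class is modelled by a
// per-band mean and variance from training pixels (diagonal covariance);
// a pixel is assigned to the class with the highest log-likelihood.
public class SimpleMlc {
    static double logLikelihood(double[] pixel, double[] mean, double[] var) {
        double ll = 0;
        for (int b = 0; b < pixel.length; b++) {
            double d = pixel[b] - mean[b];
            ll += -0.5 * Math.log(2 * Math.PI * var[b]) - (d * d) / (2 * var[b]);
        }
        return ll;
    }

    static int classify(double[] pixel, double[][] means, double[][] vars) {
        int best = 0;
        double bestLl = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < means.length; c++) {
            double ll = logLikelihood(pixel, means[c], vars[c]);
            if (ll > bestLl) { bestLl = ll; best = c; }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] classes = {"built-up", "vegetation", "water"};        // subset for illustration
        double[][] means = {{0.30, 0.20}, {0.10, 0.40}, {0.05, 0.03}}; // per-band means (SWIR, NIR)
        double[][] vars  = {{0.01, 0.01}, {0.01, 0.02}, {0.005, 0.005}};
        double[] pixel = {0.28, 0.22};
        System.out.println("Assigned class: " + classes[classify(pixel, means, vars)]);
    }
}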
5 Result and Discussion Based on LST map (Fig. 3), the temperatures are relatively high in outskirts of the city as in 2001 the major built-up area is just located in the center rather than covering the whole study area. Waterbodies show lower LSTs as seen in the northeast part of the city which also corresponds clearly to NDBI map (Fig. 4) and LULC map (Fig. 9) as those same areas exhibit lower built-up index value (Fig. 4). The higher NDBI in the outskirts signifies the dominant presence of barren land. LSTs were found to be the highest in 2001 compared to 2011 (Fig. 5) and 2020 (Fig. 7). The mean LST for the 2001 study area was 40.46 °C (SD = 2.66 °C). Nevertheless, some of the dense vegetation/forest patches, i.e., JRF on central-east shows lower temperatures, shows the cooling effect due to moisture availability and shadow effect [18].
Fig. 3 LST (in °C) map for 2001
As per Fig. 5, the LSTs are significantly reduced compared to 2001 as the builtup area expands on barren land as the mean LST for the 2011 study area was 36.80 °C (SD = 2.42 °C). Besides, the cooling effect is again seen in the dense vegetation/forest patches, i.e., JRF in central-east and also northeast part due to the presence of water bodies and vegetation. In addition, features such as airport are distinguishable in NDBI map (Fig. 6) as the airstrip and surrounded barren land have relatively higher NDBI values, whereas water bodies and vegetation show lower NDBI values. However, in north borders and the area surrounding the central-east vegetation patches NDBI values are very high which signifies an increase in barren land comparatively (Fig. 10). The mean LST (Fig. 7) for 2020 study area was 35.01 °C (SD = 1.95 °C) which means in 2020 the LSTs have reduced drastically over the years to urban built-up areas as mentioned before that because of the rise of residential built-up areas, it can reduce LST relatively to industrial, commercial, or barren land. This result corresponds to another study which shows that cities, i.e., urban areas have lower LSTs in daytime compared to surrounding areas because urban areas especially residential areas are greener due to urban green spaces and parks, whereas semiarid/arid regions where agriculture is rain-fed rather than irrigated especially in summer season (as per study area satellite imagery dates) when they remain uncultivated and dry [19] (Fig. 8).
Fig. 4 NDBI map for 2001
The accuracy assessment LULC maps prepared for all the study years as shown in Table 1 provides overall accuracy (%) and Kappa coefficient. The overall accuracy was 98.3%, 94.1%, and 97.5% for the years 2001, 2011, and 2020, respectively, which makes comparison substantially accurate. Besides, the Kappa coefficient was 0.98, 0.93, and 0.9 for 2001, 2011, and 2020, respectively. While comparing Figs. 9 and 10 (LULC maps for 2001 and 2011), respectively, the built-up areas seem to increase significantly between 2001 and 2011. Also, Fallow land which is considered as uncultivated resembles barren areas had increased. The most drastic change was the reduction of agriculture area because of increasing builtup. The other vegetation class usually has sparse vegetation, and it resembles barren areas too. Forest patches also reduce slightly in comparison; however, the JRF area remains more or less the same. By comparing LULC maps of 2011 and 2020 (Figs. 10 and 11), it clearly shows the increase in built-up areas up to the city limits, whereas agriculture and fallow land reduced significantly over the years. Also, forest patches seem to decrease, however, slightly with increasing sparse vegetation. Waterbodies did not have a major change as the extent covered by them was nevertheless same. The 2001 trend shows that the correlation is positive and strong (r = 0.694). This is also highest among other years. The NDBI density plot shows that negative values are very less compared to other years. Bare soils, sparse, and dry vegetation can also
Fig. 5 LST (in °C) map for 2011
show high NDBI values as observed in a similar study [20] and also found similarity in the negatively skewed density plots (Fig. 12). The 2011 trend shows again that the correlation is positive and strong (r = 0.645). The density plots show that negative values are very less compared to other years (Fig. 13). Also, LST values are much lower compared to 2001 but the NDBI values are still negatively skewed but slightly moderate through 2020. As per Fig. 13, the correlation is positive and moderate to strong (r = 0.595). However, the NDBI and LST almost follow a normal distribution and this shows moderate values are more in NDBI and LST compared to other years.
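The r values quoted above are Pearson correlation coefficients between the LST and NDBI rasters. The routine below shows how such a coefficient can be computed from two flattened, equal-length pixel arrays; it is a generic illustration (with invented sample values), not the exact tooling used in the study, which relied on open-source GIS software.

// Pearson correlation between two equal-length pixel arrays (e.g., LST vs NDBI).
public class PearsonR {
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n; my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - mx, dy = y[i] - my;
            sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        double[] lst  = {40.1, 42.3, 38.7, 44.0, 39.5};   // illustrative pixel samples
        double[] ndbi = {0.10, 0.21, 0.02, 0.30, 0.05};
        System.out.printf("r = %.3f%n", pearson(lst, ndbi));
    }
}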
6 Conclusion LST, NDBI, and LULC maps were derived from satellite data analysis that represents the spatial distribution of different phenomena in the Jaipur City. This study concludes that the LST-NDBI relation is strong especially when barren areas which vary in LULC classes between built-up and sparse vegetation are more dominant but as residential built-up areas increase the relation is weakened as observed from
Fig. 6 NDBI map for 2011
Fig. 7 LST (in °C) map for 2020
A. Chaudhary et al.
Fig. 8 NDBI map for 2020
Fig. 9 LULC map for 2001
Fig. 10 LULC map for 2011
Table 1 Accuracy assessment outputs including overall accuracy and Kappa coefficient

Year    Overall accuracy (%)    Kappa statistics
2001    98.3                    0.98
2011    94.1                    0.93
2020    97.5                    0.9
2001 to 2020. Also, the results are in line with similar studies which show a reduction in mean LST over the years as built-up areas increase compared to barren and sparsely vegetated areas. Moreover, other studies previously used a comparatively lower spatial resolution of LST maps; however, they were able to do more frequently because of high temporal resolution due to the availability of that type of satellite imagery. As a result, it is recommended that a multi-seasonal study be done to observe variations in LST-NDBI relations.
Fig. 11 LULC map for 2020
Fig. 12 LST-NDBI correlation analysis graphs for 2001
Fig. 13 LST-NDBI correlation analysis graphs for 2011, 2020
References 1. Hadeel AS, Jabbar MT, Chen X (2009) Application of remote sensing and GIS to the study of land use/cover change and urbanization expansion in Basrah province, Southern Iraq. GeoSpatial Inf Sci 12(2):135–141 2. Chen YC, Chiu HW, Su YF, Wu YC, Cheng KS (2017) Does urbanization increase diurnal land surface temperature variation? Evidence and implications. Land Sc Urban Plan 157:247–258 3. Sultana S, Satyanarayana ANV (2018) Urban heat island intensity during winter over metropolitan cities of India using remote-sensing techniques: impact of urbanization. Int J Remote Sens 39(20):6692–6730 4. Madanian M, Soffianian AR, Soltani Koupai S, Pourmanafi S, Momeni M (2018) The study of thermal pattern changes using Landsat-derived land surface temperature in the central part of Isfahan province. Sustain Cities Soc 39(March):650–661 5. Macarof P, Statescu F (2017) Comparison of NDBI and NDVI as indicators of surface urban heat island effect in Landsat 8 Imagery: a case study of Iasi. Present Environ Sustain Dev 11(2):141–150 6. Naserikia M, Shamsabadi EA, Rafieian M, Filho WL (2019) The urban heat island in an urban context: A case study of Mashhad, Iran. Int J Environ Res Public Health 16(3) 7. Weng Q, Lu D, Schubring J (2004) Estimation of land surface temperature-vegetation abundance relationship for urban heat island studies. Remote Sens Environ 89(4):467–483 8. Malik MS, Shukla JP, Mishra S (2019) Relationship of LST, NDBI and NDVI using Landsat-8 data in Kandaihimmat watershed, Hoshangabad, India. Indian J Geo-Marine Sci 48(1):25–31 9. Abutaleb K, Ngie A, Darwish A, Ahmed M, Arafat S, Ahmed F (2015) Assessment of urban heat island using remotely sensed imagery over Greater Cairo, Egypt. Adv Remote Sens 04(01):35– 47 10. Zha Y, Gao J, Ni S (2003) Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. Int J Remote Sens 24(3):583–594 11. He C, Shi P, Xie D, Zhao Y (2010) Improving the normalized difference built-up index to map urban built-up areas using a semiautomatic segmentation approach. Remote Sens Lett 1(4):213–221 12. Liu L, Zhang Y (2011) Urban heat island analysis using the Landsat TM data and ASTER Data: A case study in Hong Kong. Remote Sens 3(7):1535–1552 13. Sun Q, Wu Z, Tan J (2012) The relationship between land surface temperature and land use/land cover in Guangzhou, China. Environ Earth Sci 65(6):1687–1694 14. Govil H, Guha S, Dey A, Gill N (2019) Seasonal evaluation of downscaled land surface temperature: a case study in a humid tropical city. Heliyon 5(6):e01923 15. Kumbhojkar S, Yosef R, Mehta A, Rakholia S (2020) A camera-trap home-range analysis of the Indian leopard (Panthera pardus fusca) in Jaipur, India. Animals 10(9):1–23 16. Guha S, Govil H, Gill N, Dey A (2020) A long-term seasonal analysis on the relationship between LST and NDBI using Landsat data. Quat Int (April)
17. Rongali G, Keshari AK, Gosain AK, Khosa R (2018) A mono-window algorithm for land surface temperature estimation from Landsat 8 thermal infrared sensor data: a case study of the Beas river basin, India. Pertanika J Sci Technol 26(2):829–840 18. Mathew A, Khandelwal S, Kaul N (2018) Spatio-temporal variations of surface temperatures of Ahmedabad city and its relationship with vegetation and urbanization parameters as indicators of surface temperatures. Remote Sens Appl Soc Environ 11:119–139 19. Rasul A, Balzter H, Smith C (2016) Diurnal and seasonal variation of surface urban cool and heat islands in the semi-arid city of Erbil, Iraq. Climate 4(3) 20. Mathew A, Khandelwal S, Kaul N (2017) Investigating spatial and seasonal variations of urban heat island effect over Jaipur city and its relationship with vegetation, urbanization and elevation parameters. Sustain Cities Soc 35:157–177
Analysis and Performance of JADE on Interoperability Issues Between Two Platform Languages Jaspreet Chawla and Anil Kr. Ahlawat
Abstract There are a large number of toolkits and frameworks for multi-agent systems available on the market. These toolkits and frameworks help researchers build architectures that address the interoperability issues of Web services across different software languages. After studying numerous multi-agent tools, we observed that JADE is a suitable multi-agent software tool that acts as a bridge between inter-platform languages and works efficiently on a distributed network. This paper shows the results and analysis of different Web service interoperability issues between the two languages, Java and .NET, and demonstrates the quality and maturity of JADE. The analysis focuses on interoperability issues such as precision issues of data types, arrays with null values, unsigned numbers, complex data structures, and date–time formats between Java and .NET, and on how JADE acts as middleware, builds the agent handler, and resolves the Web service interoperability issues effectively. Keywords JADE · Multi-agent · JAVA · .NET · Web service
1 Introduction A Web service is said to be interoperable when a client invokes a service from a different platform without worrying about its hardware and software configuration. An interoperable Web service can be easily transferred between different software applications, operating systems, data models, and Web servers; examples include weather forecast and currency converter services. A Web service is written in the universal language XML and follows the SOAP protocol. The three main standards of a Web
service are the simple object access protocol (SOAP), universal description discovery and integration (UDDI), and the Web service description language (WSDL). The problem arises when, in some cases, the same Web service gives different results in different languages. The incorrect results sometimes occur because a programming language was optimized for a particular task, and sometimes because issues were suppressed due to a lack of knowledge of interoperability tools. To resolve these issues, the WS-I (Web Services Interoperability) organization provides guidelines on Web service interoperability and has delivered several versions of the basic profile (1.0/2.0) document [1]. However, interoperability issues still exist. In this paper, we focus on two programming languages, Java and .NET, and try to resolve their interoperability issues with the help of a third-party multi-agent tool, JADE. Both Java and .NET have their own standards and different data types and formats; due to the interoperable nature of Web services, a Java client can invoke a .NET service and a .NET client can call a Java Web service easily, but in some cases, due to interoperability issues like the precision of data types, passing of null values, unsigned numbers, complex data types, date–time formats, etc., the results are not accurate and precise. To improve these interoperability issues, we have worked with the JADE software tool. With the help of JADE, the exact results of the Web service are transferred between the two platforms. JADE is a software tool that is written in Java, follows the FIPA guidelines, and helps with message passing between different platforms [2, 3]. With the help of these agents, communication is possible among different programming languages [4]. Agent technology is becoming popular day by day as more and more software challenges arise in various fields.
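Two of the issues listed above can be illustrated with plain Java: a .NET UInt32 does not fit in a signed Java int and is usually carried as a long, and date–time values survive the round trip only when serialized in an unambiguous ISO-8601/xsd:dateTime form. The snippet is a generic illustration of these pitfalls, not code taken from the JADE handler described later.

import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Illustrates two Java/.NET interoperability pitfalls mentioned above:
// unsigned 32-bit integers and date-time formatting.
public class InteropPitfalls {
    public static void main(String[] args) {
        // A .NET UInt32 value such as 4294967295 arrives as the bit pattern 0xFFFFFFFF,
        // which Java's signed int shows as -1; widen it before using it.
        int rawFromWire = 0xFFFFFFFF;
        long unsignedValue = Integer.toUnsignedLong(rawFromWire);
        System.out.println("Signed view: " + rawFromWire + ", unsigned view: " + unsignedValue);

        // Exchange date-times as ISO-8601 text with an explicit offset so both sides
        // parse the same instant (this matches the xsd:dateTime lexical form).
        OffsetDateTime now = OffsetDateTime.now();
        String wireFormat = now.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
        OffsetDateTime parsedBack = OffsetDateTime.parse(wireFormat);
        System.out.println("Serialized: " + wireFormat + ", round-tripped equal: "
                + now.isEqual(parsedBack));
    }
}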
1.1 Multi-Agent System A multi-agent system consists of a number of agents which interact with one another in performing a coherent task. Each agent can work independently and can participate in the decisions of other agents [5]. The main characteristics of agents are their flexibility, reactivity, adaptability, pro-activity, responsibility, robustness, and mobility. Due to their mobile nature, agents can transfer messages from one network to another by using ontological descriptions. Agent Communication Language (ACL) is a language that supports inter-agent communication [6, 7]. JADE supports graphics that help in the development and debugging of agents. JADE is a tool that suits distributed network devices like cell phones, smart phones, laptops, or desktop systems. Agent Identification (AID) removes duplication among agents and makes the multi-agent system simple, as shown in Fig. 1. The main feature of JADE is that it provides the same set of homogeneous APIs that are independent of JAVA versions and underlying networks. JADE has its own agent management system (AMS) to manage all the agents' activities under one container [8].
Fig. 1 Components of JADE
2 Related Work and Comparison with Other Technologies Due to JADE's important features and its comparison with other agent technologies like SPADE, JaCaMo, Akka [2], IBM Aglet, Voyager, Anchor, Zeus [9], etc., it has been shown that JADE works well on different kinds of devices operating on the Internet. JADE is an open-source tool and provides the most balanced toolkits and security features that help the agent system to deal with simple as well as complex structures of multi-agent systems. Aarti Singh et al. [6] explained a variety of multi-agent tools and how these tools help in developing a robust infrastructure for Web services. The authors compared JADE with five multi-agent tools that were previously used and showed that JADE is good at security mechanisms, agent mobility, and communication techniques. Khalid et al. [10] proved that on cloud-based systems, JADE is persistent, scalable, easily deployed, and acquires less memory on the cloud. Radhakrishnan et al. [11] compared JADE with the SPADE software. They focused on data security in the cloud through an efficient agent framework and on how agents act as a host between users and systems, between different systems, and between different platforms. The authors proposed a sentinel multi-agent system and showed how CPU usage and the rate of message transfer can be made fast and efficient using the JADE software. Raquel Trillo et al. [9] gave a comparison chart for mobile agents. They performed an experimental setup of seven mobile agent platforms (Aglets, Voyager, Grasshopper, Tryllian, JADE, Tracy, and SPRINGS) and tried to establish which one is best suited (like SPRINGS) for use in distributed systems, but they also considered JADE a very popular platform for developing multi-agent systems. Bergenti et al. [12] recently published a paper that reviews twenty years of agent-based software development with JADE. According to this study, JADE is well suited for industrial projects, agent-oriented software engineering, object management group standards, and network and service management, and it explains the high-level architecture of WORKFLOW AND AGENTS (WANTS) that is deployed on JADE
and shows how JADE can be used with Java. Siddig [13] carried out a comparison and evaluation of mobile agent toolkits, JADE and Aglet. Following the Shkashuki methodology [13], the author worked on five criteria (availability, environment, development, characteristic properties, and performance) and showed that communication, security, authentication, and mobility between agents in JADE are more efficient than in Aglet. Researchers have different views on JADE, and the comparison of JADE with other toolkits has also shown that the agents built with JADE are mobile, secure, and distributed and work on different platforms [14]. Due to the many advantages of JADE, we have carried out an implementation and analysis of JADE on two different platforms, Java and .NET, and tried to make them compatible on the client side.
3 Comparison with Other Technologies Table 1 shows the comparison of JADE with different tools on the same set of features. From the comparison, we observed that FIPA compliance is present only in JADE and SPADE, but CPU usage in JADE is optimal as compared to SPADE [6].
4 Framework for Interoperability Issues Resolution Web service interoperability is always a matter of concern between Web services on different platforms. Ivano [13] carried out an experimental setup of thousands of Web services with three major servers and showed that interoperability issues always exist while transferring Web services between different platforms. Due to interoperability issues, errors sometimes appear anywhere from service generation to service testing. Theoretical interoperability resolution of Java and .NET Web services has been explained by many researchers, but practical demonstration and analysis have been done by few. We have already published a few papers on interoperability issues and their resolution between Java and .NET [15, 16]. Finally, in this paper, we move toward the performance and analysis of Web services on different platforms. We have focused on six interoperability issues of Web services: precision value, array with null value, unsigned number, date–time format, date with null value, and collection of complex data types. For the complete setup, we have used a third-party tool, the Web Service Interface Gateway (WSIG), with JADE.
Table 1 Comparison of JADE with other technologies
JADE: model = agent behavior; elements = agents, DF, AMS, MTS, container; asynchronous communication = yes; message = yes (FIPA); availability = LGPL; GUI = yes; security mechanism = strong
Aglet: model = events; elements = agents; asynchronous communication = yes; message = yes; availability = IBM license required; GUI = some; security mechanism = basic
Grasshopper: model = procedural; elements = agents, regions, places; asynchronous communication = yes; message = yes; availability = not used now; GUI = yes; security mechanism = basic
Voyager: model = procedural; elements = agents, server; asynchronous communication = yes; message = no; availability = not free; GUI = yes; security mechanism = strong
Anchor: model = agent behavior; elements = agent, ASM, ACL, AJNDI; asynchronous communication = yes; message = weak; availability = BSD license; GUI = yes; security mechanism = moderate
Tryllian: model = tasks; elements = agents, ADK, AFC; asynchronous communication = yes; message = yes (FIPA); availability = LGPL; GUI = yes; security mechanism = moderate
Tracy: model = procedural; elements = agents, plugin; asynchronous communication = yes; message = yes; availability = binaries; GUI = no; security mechanism = moderate
Spring: model = procedural; elements = agents, regions, places; asynchronous communication = yes; message = yes; availability = binaries; GUI = no; security mechanism = moderate
Spade: model = agent behavior; elements = agents, ACC, PC, MD; asynchronous communication = yes; message = yes (FIPA); availability = MIT license; GUI = yes; security mechanism = strong
4.1 JADE-WSIG Agents with Web services have been explored for a long time. Designing a Web system by adding the functionalities of agents makes it more useful, distributed, and platform independent [17]. The main purpose of agents is to solve complex problems that would otherwise be difficult to solve individually [12, 18]. Agents are always capable of deciding what actions to take according to the conditions, and JADE is one of them. JADE with WSIG acts as a middleware and supports the Web services for minimization of interoperability issues, as shown in Fig. 2. The following tasks are done by JADE-WSIG:
1. WSIG with JADE is designed to connect two functionalities together, i.e., the Web service and the agent system, and to ensure minimum human and service interruption.
2. Agents help the Web service with re-direction, aggregation, integration, and administration.
3. JADE comprises an ontology and a set of actions that are used to define the vocabulary and semantics used in communication between agents.
4. JADE-WSIG provides the transparency, automation, and integration of agents with Web services.
5. All the interaction between agents and Web services includes the mapping between WSDL-ACL and SOAP-ACL [19, 20].
6. WSIG also uses ontologies to describe and declare agent operations that expose the Web services.
Fig. 2 Framework of Web services with JADE (components: multi-agent system, Web service clients (JAVA/.NET), Axis web server, ACL-SOAP and ACL-WSDL codecs, JADE gateway agent (WSIG), UDDI (jUDDI), agent container containing six agents, and JADE-DF for register/de-register/modify of Web services)
7. Like other agents, WSIG is registered as an agent in the directory facilitator (DF) and in UDDI as well [21].
8. Access to these directories is provided to agents and Web service clients by exposing a Web-appropriate interface and providing operations like registering, de-registering, and modification of service descriptions [22].
5 Algorithm for Mapping of Web Services with Multi-Agent System
1. Create a SOAP-based Web service using either the Java or the .NET platform that contains all the methods having interoperability issues.
2. Deploy the Web service on the Eclipse platform using the WildFly server.
3. A WSDL file of the Web service is retrieved simply by pointing the browser at the Web service URL.
4. The same WSDL file is passed to the Java and .NET clients (a generic client sketch is shown after this list).
5. The results of the two clients will vary due to interoperability issues between the two platforms.
6. A multi-agent system is used; it helps to create agents with the JADE tool.
7. JADE-WSIG is used as a middleware between the Web services and the agent system.
8. The agents capture the result (WSDL) before it is passed to the clients.
9. The agent controls the results and modifies the parameters by doing a mapping between ontologies.
10. Finally, the same set of results is passed to both the clients.
6 Agent Handler We have already created a number of agents and agent handlers that help to communicate with Web services and minimize interoperability issues. Agent handler functions receive the input parameters coming from Java and .NET files and synchronize the output in the new WSDL created by agents. Agents control the inputs, map the ontologies, and create an agent object. These agent objects perform the appropriate actions as per the Web service issue and try to resolve it [23].
6.1 Precision Handler Agent A precision handler agent was designed, built, and implemented in JADE to resolve precision value issues on both the Java and .NET platforms. Java rounds off Web service data values to the seventh decimal place, while .NET rounds off Web service data to the sixth decimal place. As a result, the JADE agents provide a common solution for both platforms. If the data values of inter-platform Web services have six, seven, or more digits after the decimal, the JADE precision handler agent will control the precision and round off incoming Web service data requests to the fifth decimal place before returning the response to the client/s as an updated WSDL [24]. The analysis part of the three service clients is shown in Fig. 3. The graph shows the comparison of three platforms: Java, .NET, and JADE. As the precision of Java is higher than that of .NET, JADE controls the output of JAVA, rounds it off, and makes it equal to .NET. So the new WSDL created by JADE is sent to both the clients.
Fig. 3 Analysis of precision value issue
6.2 Null Handler Agent A null handler agent was created and deployed in JADE to solve the interoperability issue of an array with a null value. When a hard-coded solution is used to create an array containing a null, both Java and .NET produce the same results and the string length remains the same. However, when Web services are used to call the array methods, the string length count drops by 1, and the result is inaccurate. So, with the help of the null handler agent, JADE provides a common solution that replaces the null value with an empty string and prints the string length while taking the null value into account [24]. The analysis part of the three service clients is shown in Fig. 4. The graph shows the comparison of three platforms: Java, .NET, and JADE. The null value issue arises both on the Java and the .NET side, so the string count of Java and .NET is always one less than the actual count. Hence, JADE controls the output of Java and .NET, increases the string length by one, and creates an updated WSDL. The updated WSDL created by JADE is sent to both the clients, and the issue is resolved.
Fig. 4 Analysis of array with null value issue
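The rule applied by the null handler agent can be sketched as follows. This is a minimal Python illustration of the replacement rule only (the actual agent is a Java/JADE component), with hypothetical sample data.

```python
# Minimal sketch of the null handler rule: null entries are replaced by empty strings
# so that the element count of the array is preserved across platforms.
def normalize_array(values):
    """Replace null (None) elements with empty strings, keeping the original length."""
    return ["" if v is None else v for v in values]

names = ["Asha", None, "Ravi"]                        # hypothetical array with a null
print(len(names), len(normalize_array(names)))        # both report 3 elements
```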
6.3 Unsigned Number Handler Agent Unsigned number data types are not supported by Java, but they are supported by the .NET platform. With the help of the multi-agent system, we created an unsigned number handler agent to address this interoperability issue. The agent implements the unsigned number API (org.apache.axis.types.UnsignedInt), controls the parameters of user-defined unsigned numbers, and resolves the interoperability issues. The analysis part of the three service clients is shown in Fig. 5. The graph shows the comparison of the three platforms Java, .NET, and JADE. The unsigned number is not supported in Java as it is in .NET, so JADE controls the output of Java with the unsigned number handler agent or with the unsigned Apache API, and the updated WSDL of the JADE Web service is forwarded to both the clients.
Fig. 5 Analysis of unsigned number issue
6.4 Date Handler Agent In the JAVA platform, a date–time data type is formatted with three digits of milliseconds, whereas in .NET, it is formatted with seven digits. The .NET platform's standard date–time format (yyyy-mm-dd-hh-mm-ss:fffffffk) shows precision up to seven digits of fractional seconds, but Java Web services carry date–time formats with precision of up to three digits. However, if more than three digits of milliseconds are added, the entire date–time value is altered, rounded off, and the precision is lost. To fix the problem, the date handler agent takes control of the .NET time parameter and rounds it to three digits. The analysis part of the three service clients is shown in Fig. 6. The graph shows the comparison of three platforms: Java, .NET, and JADE. The date–time format of milliseconds is supported differently in Java and .NET, so to control the output of .NET, JADE uses a date handler agent that converts the seven-digit fractional-second format to the three-digit format used in Java. Finally, the updated WSDL of the JADE Web service is forwarded to both the clients.
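The conversion performed by the date handler agent amounts to truncating the fractional-second part. A minimal sketch of this rule is given below in Python (the agent itself runs in JADE/Java); the sample date–time value is hypothetical.

```python
# Minimal sketch of the date handler rule: a seven-digit .NET fractional-second part
# is truncated to the three digits that Java date-time values carry.
def truncate_fractional_seconds(value: str, digits: int = 3) -> str:
    if "." not in value:
        return value
    base, frac = value.split(".", 1)
    return f"{base}.{frac[:digits]}"

print(truncate_fractional_seconds("2021-07-15T10:30:45.1234567"))
# -> 2021-07-15T10:30:45.123
```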
Fig. 6 Analysis of date–time issue
Fig. 7 Analysis of date with null check issue
6.5 Date Null Agent A date–time is a value type object in .NET. When a null is entered into the Web service method, .NET treats the null as a value and generates no error condition. In Java, however, the date–time data type is treated as a reference type; when a null is entered into this Web service method, it will not be stored in the heap, rather it will raise a null pointer exception. The date null agent does the ontology mapping: the data structure is passed from the WSDL of the Web service to the agent-services ontology translator, which converts the data types and methods of the Web service to the agent ontology and stores them in the DF. The analysis part of the three service clients is shown in Fig. 7. The graph shows the comparison of three platforms: Java, .NET, and JADE. The date with null check issue of Java is controlled by the date null agent of JADE. The handler agent converts the null or empty string into a system date format like .NET. Finally, the updated WSDL of the JADE Web service is forwarded to both the clients.
6.6 Array Handler Agent ArrayList is the most common collection type object used by every language. The Web service built on the .NET platform containing "ArrayList" as the collection type works accurately, but the same service does not work on the JAVA client side. JAVA does not support or automatically convert "ArrayList" into an "ArrayOfAnyType" collection object. To resolve this issue, we have created an array handler agent. The array handler agent replaces the "ArrayOfAnyType" with an array of objects, and the object array works well on the JAVA platform. In this way, the agent helps to convert the array list WSDL into a valid array-of-objects WSDL [25]. The analysis part of the three service clients is shown in Fig. 8. The graph shows the comparison of three platforms: Java, .NET, and JADE. The array-of-any-type issue of Java is controlled by the array handler agent of JADE. The array handler agent converts the array-of-any-type data structure into an array of objects that works like collections in .NET. Finally, the updated WSDL of the JADE Web service is forwarded to both the clients.
Fig. 8 Analysis of array of any type issue
7 Conclusions A Web service framework's main goal is to enable enterprise applications to use heterogeneous Web services regardless of the service platform. To provide flawless and seamless services to industries, services must be independent, loosely coupled, interoperable, and adhere to industry standards such as UDDI, WSDL, and SOAP, regardless of location or platform. If the platform, domain, or operating systems are different, then, according to WS-I, there may be an interoperability issue. In this paper, we have examined JADE's performance on interoperability issues and attempted to resolve the issues using JADE-WSIG technology.
References 1. WS-interoprabilty [online] https://www.ibm.com/developerworks/webservices/tutorials/sun derstand-web-services6. Accessed 25 April 2007 2. Cossentino M, Lopes S, Nuzzo A, Renda G, Sabatucci L (2008) A comparison of the basic principles and behavioural aspects of Akka, JaCaMo and JADE development frameworks. In: WOA, pp 133–141 3. Foundation for intelligent Physical agent [online] http://www.fipa.org/. Accessed 8 June 2005 4. JAVA agent Development Framework [online] http://JADE.tilab.com/papers/2005/JADEWo rkshopAAMAS/AAMAS05_JADE-Tutorial_WSIG-Slides.pdf. Accessed June 2002 5. Poggi A, Turci P (2009) Multi-agent systems for semantic web services composition. In: Handbook of research on social dimensions of semantic technologies and web services. IGI Global, pp 324–339 6. Aarti S, Dimple J, Sharma AK (2011) Agent development toolkits. Int J Advancements Technol 1:158–165 7. JAVA agent Development Framework [online] http://www.JADE.tilab.com/. Accessed 8 June 2017 8. Greenwood D, Lyell M, Mallya A, Suguri H (2007) The IEEE FIPA approach to integrating software agents and web services. In: Proceedings of the 6th international joint conference on autonomous agents and multiagent systems, pp 1–7 9. Trillo R, Ilarri S, Mena E (2007) Comparison and performance evaluation of mobile agent platforms. In: Third international conference on autonomic and autonomous systems (ICAS’07). IEEE, pp 41–41 10. Khalid N, Tahir GA, Bloodsworth P (2020) Persistent and scalable JADE: a cloud based in memory multi-agent framework. arXiv preprint arXiv:2009.06425 11. Kristensen T, Dyngeland M (2015) Design and Development of a multi-agent e-learning system. Int J Agent Technol Syst (IJATS) 7(2):19–74 12. Bergenti F, Caire G, Monica S, Poggi A (2020) The first twenty years of agent-based software development with JADE. Auton Agent Multi-Agent Syst 34:1–19 13. Siddig JM (2015) Comparison and evaluation of performance of mobile agent toolkits (JADE and Aglet). Doctoral dissertation, Sudan university of scince and technology 14. Vadivelou G, IIavarasan E, Prasanna S (2011) Algorithm for web service composition using multi-agents. Int J Comput Appl 13(8):40–45 15. Kumar U, Trolamine KP, Khanna V (2013) A Comparison of J2EE and .NET as platforms for developing E-government applications. Int J Eng Res Dev 7(1):116–121 16. Elia IA, Laranjeiro N, Vieira M (2014) Understanding interoperability issues of web service frameworks. In: 44th Annual IEEE/IFIP international conference on dependable systems and networks, IEEE, pp 323–330 17. Radhakrishnan G, Chithambaram V, Shunmuganathan KL (2018) Comparative study of JADE and spade multi agent system. Int J Adv Res 6(11):1035–1042 18. Web service interoperability organization. [online] http://www.ws-i.org/docs/ws-i_faq_01.pdf. Accessed 26 Feb 2019 19. Micsik A, Pallinger P, Klein A (2009) Soap based message transport for the JADE multiagent platform 20. Chawla J, Ahlawat A, Goswami G (2019) Integrated architecture of web services using multiagent system for minimizing interoperability. In: 2019 6th International conference on “computing for sustainable global development”, 13–15 March 2019, Bharati Vidyapeeth’s Institute of Computer Applications and Management (BVICAM), New Delhi (INDIA) 21. Casals A, Seghrouchni AEF, Negroni O, Othmani A (2018) Exposing agents as web services in JADE. In: International workshop on engineering multi-agent systems. Springer, Cham, pp 340–350 22. Chawla J, Ahlawat A, Goswami G (2018) A review on web services interoperability issues. 
In: 5th IEEE-Uttar Pradesh section international conference on electrical, electronics and computer engineering (UPCON), 2–4 Nov 2018, pp 1–5
23. Cao J, Wang J, Hu L, Lai R (2012) A service intermediary agent framework for web service integration. In: 2012 IEEE Asia-Pacific services computing conference. IEEE, pp 14–21 24. Chawla J, Ahlawat AK, Gautam J (2020) Resolving interoperability issues of precision and array with null value of web services using WSIG-JADE framework. Model Simul Eng, Hindawi J 25. Chawla J, Ahlawat AK (2021) Resolving interoperability issues of date with null value and collection of complex data types by using JADE-WSIG framework. Webology 18(1):263–284
A TAM-based Study on the ICT Usage by the Academicians in Higher Educational Institutions of Delhi NCR
Palak Gupta and Shilpi Yadav
Abstract Recent scenario has seen massive up-shift in information and communication technology (ICT) usage where each and every sector has started using ICT for automating its business processes. This has brought a shift from manual processes to semi or fully automated business operations leading to advanced efficiency, productivity, cost-saving, and timely results. Even the education sector has seen transformation from offline to online or hybrid model. The ICT has brought huge disruption in methodology and ways of hosting education. New online tools and the support of cloud platforms, artificial intelligence (AI) and machine learning (ML), virtual interactions, and flipped classrooms have revolutionized higher education. In this paper, a primary survey has been done on ICT adoption and usage by the academicians in their teaching methodology especially in higher educational institutions of Delhi NCR using the research framework on technology acceptance model (TAM) to determine the predictors of ICT adoption by academicians. Empirical analysis through Python, Jamovi, and IBM SPSS statistics has been done to analyze how successful ICT adoption has been for the academicians in fulfilling teaching pedagogy and bringing better awareness and satisfaction among the students toward curriculum and industry practices. Keywords Cloud platforms · Python · AI · ML · ICT · TAM
P. Gupta (B) · S. Yadav Department of Management, Jagannath International Management School, New Delhi, India e-mail: [email protected] S. Yadav e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_25
1 Introduction ICT has entirely revolutionized the conduct and pattern of education, especially in higher educational institutes, as the level of interaction with the students has shifted from offline to online and hybrid models that give more flexibility in terms of discussions, sharing assignments, corrections, doing assessments, sharing tutorials, flipped classrooms, or recorded lectures. It helps the academicians not only in properly organizing their study material and preparing presentations, but it also enables easy sharing of materials with their students and their quick and effective assessment [1]. Access to ICT has opened new paradigms and domains for education, now popularly called online learning or e-learning. E-learning is popularly defined as the intentional use of ICT in teaching and learning [2]. ICT helps academicians to better present, express, and deliver their ideas and share their views through various online platforms. It enables immediate sharing of information; assignments can be checked instantly, and quick and effective assessment can be done on the spot with grades or scores. ICT techniques are helping the academicians to do their work more smartly, intelligently, and effectively in a shorter span of time. Also, the teaching pedagogy has now changed to be more practical oriented, group oriented, and leadership-based, especially in the domain of higher education where more of such traits and skills are required among the students [3]. ICT has improved the quality of education by enabling enhanced acquisition of teaching and learning skills and adaptation to the various digital transformation tools, so that the higher educational institutes are at par with foreign universities, colleges, and academic institutions where classroom teaching is blended with various online tools and techniques of teaching like role plays, flipped classrooms, and cloud-based apps like Google Meet, Microsoft Teams, Zoom, Webex, etc. [4–6], which give a lot of exposure through real-time data, live cases, industry scenarios, etc.
2 Literature Review Educational institutions have shifted to, accepted, and adapted the use of information and communication technologies (ICTs) for providing better service and quality and achieving effective organizational outputs in a competitive global environment. ICT applications in various sectors like Internet banking, e-business, online retail, electronic supply chain management (e-SCM), electronic customer relationship management (e-CRM), etc., have been thoroughly researched, but their impact and usage in higher educational institutions have been studied less; hence, a gap was identified to study and understand the academician's perspective toward using and adopting ICT tools in higher education institutions. The study is based on the theory of TAM, targeting the academicians of higher educational institutions of Delhi NCR. ICT leverages the
academicians in better adaptation to digital education and grooms their evaluation and effectiveness capabilities [7, 8]. ICT has changed the education model, its pedagogy, methods, and reach [9, 10]. The technology acceptance model (TAM) has been used and researched in this paper to explore the acceptance of new e-technology or new e-services [11, 12]. In this research, an extended and advanced technology acceptance model has been used for studying academicians' comfort in the adoption of ICT in higher education institutions. TAM was applied to study the acceptance and success parameters of ICT usage, in which the actual usage, user intention, and behavior of individuals are affected by two key variables: perceived ease of use and perceived usefulness. The results showed that a strong relationship exists between perceived usefulness, attitude, and behavioral intention [13] toward dedicated, sincere, and actual usage of the system and tools [14]. In this study, an attempt has been made to investigate the factors affecting the intention to use ICTs from the academicians' perspective in higher educational institutions of Delhi NCR. Perceived usefulness and perceived ease of use are core aspects of the TAM and TAM2 models, and it is very necessary that any technology being adopted should be optimally used for its best results [15, 16].
3 Research Objectives The following research objectives have been met in this study: • To find out the adoption level and impact of ICT on academicians’ performance while imparting learning in higher educational institutions. • To determine the relationship between factors of perceived usefulness and the perceived ease of use toward ICT. • To find the impact and level of perceived usefulness and perceived ease of use on the academicians’ behavioral intention and attitude toward using ICT. • To study how sincere, dedicated, and successful the academician is in actual usage of ICT tools.
3.1 Research Method This survey aims to understand academician’s perspective toward the usage and adoption of ICT tools in higher education institutions. A survey has been created using the Google forms to reach out the masses of the academicians across higher educational institutes of Delhi NCR to gather the required inputs on the TAM-based informational study on the major five aspects such as perceived usefulness, perceived ease of use, attitude toward using, behavioral intention to use, and actual system usage as indicated in Fig. 1.
Fig. 1 Modified TAM-based framework
The survey consisted of close-ended questions wherein 20 questions, 4 for each construct, were split according to the technology acceptance model structure. The following questions were showcased for further empirical and statistical analysis using Python.
(1) Perceived Ease of Use. Questions on this topic in the TAM-based study were:
a. ICT tools are easy for conducting online classes.
b. ICT tools are easy to install, understand, and implement.
c. ICT tools are easy for conveying and sharing class study material, posting and evaluating assignments, and taking online presentations and quizzes.
d. ICT tools are secure for online exams, proctoring, and effective assessment.
(2) Perceived Usefulness. Questions on this topic in the TAM-based study were:
a. ICT tools are useful in implementing online education pedagogy and in curriculum enrichment.
b. ICT tools are effective in improving my technical skills and having better job prospects.
c. ICT tools are comfortable platforms for virtual classroom interaction and enhancing effectiveness.
d. ICT tools are useful in connecting with people for academic sharing and discussions.
(3) Attitude towards using ICT in higher education. Questions on this topic in the TAM-based study were:
a. Digital transformation is necessary and academicians should adapt to ICT tools.
b. Academicians should adapt to online education, its platforms, and pedagogy.
c. ICT tools usage brings positive transformation in personal and professional growth.
d. ICT tools have changed attitude and style of functioning in a multitasking model.
(4) Behavioral Intention to Use. Questions on this topic in the TAM-based study were:
a. ICT tools do affect the academicians emotionally, behaviorally, and mentally.
b. Will you continue with the usage of online tools for further academic work or in other industry sectors?
c. Will you spread awareness of these online ICT tools to your known ones?
d. Do you extend support and responsiveness to other faculty colleagues and student groups during the conduct of online classes through ICT tools?
(5) Actual System Use. Questions on this topic in the TAM-based study were:
a. Is ICT usage relevant for your academic profile?
b. Are you serious while using the ICT tools during teaching and academic sharing?
c. Are ICT tools justifying the upgradation from offline to online classes and new teaching pedagogy?
d. Are ICT tools impactful and effective in the current and future domain?
3.2 Dataset The survey took academicians from varied domains across universities and colleges of Delhi NCR, and responses were recorded in brief. The Excel dataset was converted into a csv file to perform analysis using Python with different libraries such as pandas, NumPy, seaborn, and matplotlib. Numerous open-source libraries in Python gave an in-depth analysis with greater impact toward the data visualization. Firstly, the dataset was imported into the Jamovi base structure to perform descriptive and exploratory analysis, followed by technical analysis of the final evaluation using Python. IBM SPSS was also used for correlation and regression analyses.
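A minimal sketch of this loading and pre-processing step is given below. It assumes the exported csv has one Likert-scale column per question (Q1–Q20) and that the four items of each construct are averaged into a construct score; the file name, column names, and item-to-construct grouping are illustrative assumptions, not the authors' actual export.

```python
# Sketch of loading the survey export and deriving construct scores (assumed layout).
import pandas as pd

df = pd.read_csv("ict_tam_survey.csv")          # hypothetical export of the Google Form

construct_items = {
    "PEOU": ["Q1", "Q2", "Q3", "Q4"],           # perceived ease of use
    "PU":   ["Q5", "Q6", "Q7", "Q8"],           # perceived usefulness
    "ATT":  ["Q9", "Q10", "Q11", "Q12"],        # attitude toward using
    "BI":   ["Q13", "Q14", "Q15", "Q16"],       # behavioral intention to use
    "USE":  ["Q17", "Q18", "Q19", "Q20"],       # actual system use
}

# Average the four items of each construct into one score per respondent.
for name, items in construct_items.items():
    df[name] = df[items].mean(axis=1)

print(df.describe())                            # descriptive overview of all columns
```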
4 Analysis and Findings 4.1 Descriptive Analysis The basic feature functionality of the dataset resulted in a holistic overview of the demographic structure, which states, as per Table 1, that academicians having experience in higher education between 10 and 20 years and who are mostly associate professors have more inclination toward ICT usage. Also, as per Table 2, academicians in the IT/CS and management departments are savvier and more comfortable with ICT usage, and as per Table 3, academicians who have been using ICT tools for more than 5 years prefer using ICT tools in higher education.
Table 1 Demographic structure of academicians: frequencies of experience in higher education by designation
Experience in higher education   Assistant professor   Associate professor   Faculty/instructor   Lecturer   Professor
10–20 years                      5                     12                    0                    0          0
2–5 years                        8                     0                     0                    4          0
5–10 years                       8                     0                     4                    0          0
Above 20 years                   0                     0                     0                    0          2
Less than 2 years                2                     0                     6                    0          0
Table 2 Demographic structure of academicians: frequencies of department by designation
Department    Assistant professor   Associate professor   Faculty/instructor   Lecturer   Professor
IT/CS         15                    0                     4                    4          0
Law           0                     0                     1                    0          0
Management    5                     12                    3                    0          2
Others        2                     0                     0                    0          0
Science       1                     0                     2                    0          0
Table 3 Demographic structure of academicians: frequencies of duration of using ICT tools in higher education by designation
Duration of using ICT tools   Assistant professor   Associate professor   Faculty/instructor   Lecturer   Professor
1–3 years                     12                    0                     4                    2          2
3–5 years                     0                     0                     2                    0          0
Above 5 years                 6                     12                    2                    0          0
Less than 1 year              5                     0                     0                    2          0
Not used                      0                     0                     2                    0          0
4.2 Reliability Analysis In our study, we have performed reliability tests such as Cronbach (alpha reliability) and McDonald (omega reliability) tests on 20 questions as per the modified TAMbased model. Separate reliability analysis of scaled grouping has been done on the basis of 5 aspects of TAM-based model. As per the reliability analysis shown in Figs. 2, 3, 4, 5, and 6, the overall Cronbach (alpha reliability) and McDonald (omega reliability) on all questions are 0.873 and 0.895, respectively. Fig. 2 Reliability analysis-perceived ease of use
Fig. 3 Reliability analysis-perceived usefulness
Fig. 4 Reliability analysis-attitude toward using ICT tools
Fig. 5 Reliability analysis-behavioral intention to use
Fig. 6 Reliability analysis-for actual system use
The results as per Table 4 showed that Cronbach's alpha of the overall scale reliability statistics is 0.873 and McDonald's omega is 0.895, which gives the strength of the attached features and eventually results in the modified TAM-based model. The mean and standard deviation of the reliability analysis are 4.03 and 0.405. A separate evaluation of the reliability test of each grouping is showcased in Figs. 2, 3, 4, 5, and 6. For group 1, i.e., perceived ease of use, the Cronbach (alpha reliability) and McDonald (omega reliability) test results were 0.728 and 0.834. Attitude toward using ICT among the academicians across Delhi NCR is the highest among the groups, with Cronbach α and McDonald ω test results of 0.875 and 0.885. The reliability analysis results have been quite robust in nature and satisfy the reliability requirements for the different parameters of the TAM-based study of ICT usage by the academicians of various universities/institutes. The relative grouped association with inclusion of all the 5 parameters states a very high level of reliability.
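The scale reliability reported above can be reproduced with the standard Cronbach's alpha formula. The paper ran these statistics in Jamovi; the sketch below only illustrates the computation on a pandas DataFrame of item scores, with the Q1–Q20 column names assumed as before.

```python
# Sketch of a Cronbach's alpha computation for a set of scale items.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a DataFrame whose columns are the scale items."""
    items = items.dropna()
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Example usage (df as in the earlier loading sketch):
# alpha_all  = cronbach_alpha(df[[f"Q{i}" for i in range(1, 21)]])   # full 20-item scale
# alpha_peou = cronbach_alpha(df[["Q1", "Q2", "Q3", "Q4"]])          # one construct group
```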
4.3 Correlation Correlation has been performed using the Python programming language on the TAM-based 5 groups of 20 questions to find the relationship of association among them. According to the results generated, it has been found that Q9 and Q10 showed a highly positive correlation of 0.8, which can be stated, as in Fig. 7, as academicians adapt to
Table 4 Combined reliability analysis of all the 5 parameters

Scale reliability statistics: Mean = 4.03, SD = 0.405, Cronbach's alpha = 0.873, McDonald's omega = 0.895

Item reliability statistics (Cronbach's alpha and McDonald's omega are given for the case that the item is dropped):
Item   Mean   SD      Item-rest correlation   Cronbach's alpha   McDonald's omega
Q1     4.23   0.751   0.548                   0.864              0.888
Q2     4.27   0.765   0.615                   0.862              0.886
Q3     4.1    0.857   0.421                   0.869              0.893
Q4     3.19   0.283   0.432                   0.874              0.893
Q5     4.04   0.683   0.533                   0.865              0.891
Q6     4.33   0.694   0.488                   0.867              0.889
Q7     3.88   0.703   0.643                   0.861              0.886
Q8     4.25   0.636   0.623                   0.863              0.888
Q9     4.46   0.651   0.687                   0.861              0.883
Q10    4.27   0.36    0.59                    0.863              0.886
Q11    4.02   0.601   0.636                   0.863              0.887
Q12    4.15   0.684   0.457                   0.868              0.892
Q13    3.88   0.789   0.223                   0.876              0.897
Q14    3.96   0.459   0.551                   0.867              0.889
Q15    3.83   0.595   0.697                   0.861              0.885
Q16    3.92   0.679   0.12                    0.878              0.899
Q17    4.19   0.704   0.565                   0.864              0.888
Q18    4.1    0.425   0.377                   0.871              0.895
Q19    3.88   0.733   0.609                   0.862              0.888
Q20    3.63   0.084   0.213                   0.882              0.898
online education, its platforms, and pedagogy, resulting in a greater impact of digital transformation on the adoption of the ICT tools. Correlation matrix analysis using a heatmap showed the overall association among the 5 grouping structures of the TAM-based study, keeping all the features attached for the academician's adoption of ICT tools in the higher educational universities/institutes of Delhi NCR. The overall impact of the revised technology adoption model is positive in nature as per Figs. 8 and 9.
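A heatmap of the kind shown in Fig. 7 can be produced with the libraries named in Sect. 3.2 (pandas, seaborn, matplotlib). The sketch below is only an illustration of that step; the file name and the Q1–Q20 column names are assumptions carried over from the earlier loading sketch.

```python
# Sketch of a Pearson correlation heatmap over the 20 questionnaire items.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("ict_tam_survey.csv")               # hypothetical survey export
corr = df[[f"Q{i}" for i in range(1, 21)]].corr()    # Pearson correlation matrix

plt.figure(figsize=(10, 8))
sns.heatmap(corr, cmap="coolwarm", vmin=-1, vmax=1)
plt.title("Correlation of the 20 TAM questionnaire items")
plt.tight_layout()
plt.show()
```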
Fig. 7 Heatmap correlation analysis using Python
Fig. 8 Correlation analysis using Python-lower triangular
Fig. 9 Correlation analysis using Python-upper triangular
4.4 Principal Component Analysis Principal component analysis (PCA), a dimensionality reduction technique, was used in Python for dimension reduction and segmentation analysis with different Python open-source libraries, which eventually enhances the segmentation results. The PCA method of dimensionality reduction offers a robust way of identifying different factors among the standardized features and of further determining the number of clusters, to signify the importance of the modified TAM-based model. It is an unsupervised linear dimensionality reduction which helps to track down a more meaningful basis for standardization to scale the validity of features. Using Sklearn preprocessing, the standard scaler module has been used for standardizing the features. Sklearn is considered one of the most important and widely used libraries in machine learning for performing statistical techniques. Using a Pandas DataFrame, the data can be considered for further segmentation and decomposed into a structured framework. Firstly, we explored the data, and further preprocessing of the data inputs was done using principal component analysis by standardizing through pca.fit (segmentation_std). Once the data are fitted well, a major decision lies in the number of features to keep, based on the cumulative variance plot. The attributes showcase how much variance is explained by each of the 20 individual components through pca.explained_variance_ratio on the standardized values. Further, we chose 4 components and fitted the model to our dataset with the required number of components. Figure 10 represents the amount of variance
Fig. 10 Cumulative explained variances
captured on the y-axis depending on the number of components on the x-axis. As per the thumb rule, 80% of the variance needs to be preserved, so it is acceptable to keep 4 components, as can be seen by visualizing Fig. 10. Further, the model needs to be fitted using the selected number of components with the attached features, and the component scores result from the transform and segmentation functions of Python. In order to determine the number of clusters, the within-cluster sum of squares (WCSS) has been determined and plotted against the number of components, as shown in Fig. 11. As per the above analysis, the elbow method was used. It is a part of cluster analysis wherein the elbow method is a heuristic used in determining the number of clusters in a dataset, by picking the elbow of the curve as the number of clusters to use. We chose 5 clusters and used the transformed data from the principal component analysis (PCA). The graph shows the WCSS against the number of components; the kink in the graph needs to be looked at closely for the decline of slope at the mark of 5 clusters. Therefore, we proceeded to use 5 clusters, chosen by the elbow method. With Table 5, a new data frame has been created so that the values of the individual components can be added to the segmented data. In Fig. 12, 4 segments were separated with the help of the Python analysis to give better visualization. The principal component analysis (PCA) was used to reduce the count of the variables by combining them for better analysis. The clusters formed had similar responses, indicating the academicians' adaptation to ICT for better education pedagogy.
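The standardization, PCA, elbow (WCSS), and k-means steps described above can be sketched with scikit-learn as follows. This is only an illustration under the same assumptions as before (file name and Q1–Q20 column names); it is not the authors' exact script.

```python
# Sketch of the standardization -> PCA -> elbow method -> k-means pipeline.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

df = pd.read_csv("ict_tam_survey.csv")                   # hypothetical survey export
items = df[[f"Q{i}" for i in range(1, 21)]]

scores_std = StandardScaler().fit_transform(items)       # standardize the 20 items

pca = PCA(n_components=4).fit(scores_std)                # keep 4 components (~80% variance)
components = pca.transform(scores_std)
print(pca.explained_variance_ratio_.cumsum())            # cumulative explained variance

wcss = []                                                # elbow method on the PCA scores
for k in range(1, 11):
    km = KMeans(n_clusters=k, random_state=42, n_init=10).fit(components)
    wcss.append(km.inertia_)                             # within-cluster sum of squares

# Final segmentation with the cluster count suggested by the elbow (5 here).
df["segment"] = KMeans(n_clusters=5, random_state=42, n_init=10).fit_predict(components)
```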
Fig. 11 WCSS graph using elbow method
4.5 Regression Analysis on the Modified TAM-Based Model Approach IBM SPSS statistics has been used for multiple regression analysis to examine the relationship between the various parameters involved in actual-usage technology adoption by the academicians. Four hypotheses were formulated based on the modified TAM model. Each hypothesis has been tested for significance based on the regression statistics to check the acceptance of the technology and its actual usage.
H1: Perceived ease of use of ICT has a significant effect on the perceived usefulness of ICT usage by the academicians.
H2: Perceived ease of use and perceived usefulness of ICT usage have a significant effect on the attitude toward using ICT tools by the academicians.
H3: Attitude toward using and perceived usefulness of ICT usage have a significant effect on the behavioral intention to use ICT tools by the academicians.
H4: Behavioral intention to use ICT tools has a significant effect on the actual use of ICT tools by the academicians.
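As an illustration of how one of these hypothesis tests (H1: perceived ease of use predicting perceived usefulness) can be run as an ordinary least squares regression, a minimal Python sketch is given below. The paper ran the analyses in IBM SPSS; the construct scores, file name, and item grouping here are assumptions, as in the earlier sketches.

```python
# Sketch of testing H1 as an OLS regression of PU on PEOU.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("ict_tam_survey.csv")                     # hypothetical survey export
df["PEOU"] = df[["Q1", "Q2", "Q3", "Q4"]].mean(axis=1)     # assumed construct scores
df["PU"] = df[["Q5", "Q6", "Q7", "Q8"]].mean(axis=1)

X = sm.add_constant(df["PEOU"])            # predictor with intercept
model = sm.OLS(df["PU"], X).fit()          # H1: perceived ease of use -> perceived usefulness
print(model.summary())                     # R-squared, F-statistic, coefficients, Durbin-Watson
```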
Table 5 Description of principal component analysis variable-wise (excerpt: for the first five respondents, the individual item scores Q1–Q20 together with the four principal-component scores and the k-means segment assignment)
Fig. 12 Visualization of the K-means clustering using principal component analysis
As per Table 6, regression analysis has been conducted with 2 factors, one independent and one dependent: perceived ease of use is the independent factor and perceived usefulness is the dependent factor. The calculated R square value is 0.420, which indicates that the predictor factor perceived ease of use explained 42% of perceived usefulness. The standardized beta coefficient of 0.648 indicates a significant effect on perceived usefulness, and hence the regression model was significant, as can be seen from the value of the F-statistic, which is 35.428.
ANOVA (dependent variable: perceived usefulness; predictors: (constant), perceived ease of use)
              Sum of squares   df   Mean square   F        Sig.
Regression    4.582            1    4.582         35.428   0.000
Residual      6.337            49   0.129
Total         10.919           50

Table 6 Regression analysis was conducted by testing the above hypothesis H1 of perceived ease of use on the ICT usage shows significant effect on the perceived usefulness on the ICT usage by the academicians
Model summary (dependent variable: perceived usefulness; predictors: (constant), perceived ease of use): R = 0.648, R square = 0.420, Adjusted R square = 0.408, Std. error of the estimate = 0.35962, R square change = 0.420, F change = 35.428, df1 = 1, df2 = 49, Sig. F change = 0.000, Durbin-Watson = 2.258
Coefficients (dependent variable: perceived usefulness)
(Constant): unstandardized B = 2.359, std. error = 0.3, t = 7.871, Sig. = 0.000, 95% confidence interval for B = [1.757, 2.961]
Perceived ease of use: unstandardized B = 0.445, std. error = 0.075, standardized beta = 0.648, t = 5.952, Sig. = 0.000, 95% confidence interval for B = [0.295, 0.595], correlations (zero-order/partial/part) = 0.648, tolerance = 1, VIF = 1
As per Table 7, in the regression analysis, perceived ease of use and perceived usefulness are the independent factors and attitude toward using the ICT tools is the dependent factor for testing this hypothesis. The results show a significant relationship between the multiple factors (p = 0.000). The factors of perceived ease of use and perceived usefulness explained 34.5% of the factor of attitude toward ICT tools usage by the academicians. The F-statistic value of 12.620 indicates that the regression model is significant.
ANOVA (dependent variable: attitude toward using; predictors: (constant), perceived usefulness, perceived ease of use)
              Sum of squares   df   Mean square   F       Sig.
Regression    5.571            2    2.785         12.62   0.000
Residual      10.594           48   0.221
Total         16.164           50
As per Table 8, regression analysis was conducted by testing the above hypothesis H3 and showed a significant influence on the behavioral intention to use ICT tools by the academicians. The regression analysis stated that the ICT usage explains an R square value of 31.7% of the actual effect on the behavioral intention to use ICT tools. The F-statistic value of 11.122 indicates the effect on the behavioral intention when the multiple factors are considered.
ANOVA (dependent variable: behavioral intention to use; predictors: (constant), perceived usefulness, attitude toward using)
              Sum of squares   df   Mean square   F        Sig.
Regression    2.616            2    1.308         11.122   0.000
Residual      5.644            48   0.118
Total         8.26             50
As per Table 9, a significant relationship has been observed, with actual use of the ICT tools as the dependent factor and behavioral intention to use, together with perceived usefulness, as the independent factors. The R square value of 0.361 and the F-statistic value of 27.624 indicate the significance of the model.
ANOVA (dependent variable: actual system use; predictor: (constant), behavioral intention to use)
              Sum of squares   df   Mean square   F        Sig.
Regression    4.642            1    4.642         27.624   0.000
Residual      8.235            49   0.168
Total         12.877           50
Table 7 Regression analysis was conducted by testing the above hypothesis H2 of perceived ease of use and perceived usefulness on the ICT usage which shows significant effect on the attitude toward using ICT tools usage by the academicians
Model summary (dependent variable: attitude toward using; predictors: (constant), perceived usefulness, perceived ease of use): R = 0.587, R square = 0.345, Adjusted R square = 0.317, Std. error of the estimate = 0.46979, R square change = 0.345, F change = 12.62, df1 = 2, df2 = 48, Sig. F change = 0.000, Durbin-Watson = 1.046
Table 8 Regression analysis was conducted by testing the above hypothesis H3 of attitude toward using and perceived usefulness of the ICT usage which shows a significant effect on the behavioral intention to use of the ICT tools usage by the academicians
Model summary (dependent variable: behavioral intention to use; predictors: (constant), perceived usefulness, attitude toward using): R = 0.563, R square = 0.317, Adjusted R square = 0.288, Std. error of the estimate = 0.34291, R square change = 0.317, F change = 11.122, df1 = 2, df2 = 48, Sig. F change = 0.000, Durbin-Watson = 2.786
Table 9 Regression analysis was conducted by testing the above hypothesis H4, which shows that behavioral intention to use the ICT tools has a significant effect on the actual use of the ICT tools by the academicians
Model summary (dependent variable: actual system use; predictor: (constant), behavioral intention to use): R = 0.600, R square = 0.361, Adjusted R square = 0.347, Std. error of the estimate = 0.40995, R square change = 0.361, F change = 27.624, df1 = 1, df2 = 49, Sig. F change = 0.000, Durbin-Watson = 2.493
Fig. 13 Modified TAM-based framework that is tested and proved successful
Summary of the above results: As per the Fig. 13 model, the hypotheses were analyzed, and the findings proved that perceived usefulness and perceived ease of use have a positive significant relationship with the attitude and behavioral intention of the academicians of higher educational institutions, with a coefficient of determination of 0.345, which led to optimal actual usage of ICT tools and to leveraging them for all academic concerns. ICT tools are useful in implementing online education pedagogy and curriculum enrichment, in improving technical skills and having better job prospects, and in enhancing interaction and effectiveness in the virtual classroom [17]. ICT tools usage has brought positive transformation in the personal and professional growth of academicians and has changed their attitude and style of functioning in a multitasking model. ICT tools do affect academicians emotionally, behaviorally, and mentally, and they should extend support and responsiveness to other faculty colleagues and student groups during the conduct of online classes through ICT tools, which can be seen with the value of 0.361.
5 Conclusion Information and communication technology (ICT) has become an important source of disruption, innovation, and improvement of efficiency for many sectors across the Delhi NCR. In education sector, predominantly, the application of ICT has become a critical part of the learning process for academicians of higher education both inside and outside the classroom setup [18]. The findings highlight that perceived usefulness and perceived ease of use have a positive significant relationship on the attitude and the behavioral intention of the academicians of higher educational institutions letting them to actually use with good intention and will to use ICT tools. ICT adoption by the academicians is a must in the current scenario where online tools, virtual education, and LMS are in high demand. Academicians are now more comfortable with ICT tools related to their installation, implementation, and actual and optimal usage. The academicians’ interaction during the online sessions led to the change in their mindset toward the digitalized educational structure. Some academicians faced the technical glitches during the sharing, grading, or submission of the assignments,
projects, etc., but eventually, they have adopted the learning toward the advancement of heterogeneous software environments with the help of the institutional online infrastructural capabilities [19]. Therefore, digital transformation is necessary and the academicians should adapt to ICT tools, as it has impacted academicians emotionally and behaviorally. The majority of them agreed that they have adopted and are now compatible with the usage of ICT tools, as there is now no need for classroom or physical teaching. ICT tools are quite easy for conducting classes and conveying class study material, posting and evaluating assignments, and taking online presentations and quizzes, and they are also secure for online exams, proctoring, and effective assessment. Through the descriptive, exploratory, and advanced statistical analysis of this study based on the modified TAM, the digitalized learning experience of the academicians has shown its behavioral and emotional quotient. There is a positive impact of ICT adoption on the academicians' performance while using and imparting online learning in higher education. Though there are many limitations and hurdles in ICT usage during online education, like network connectivity and speed, computer speed and configuration, good antivirus support, etc., these will all be minimized gradually as ICT will rule all the sectors, including higher educational institutions.
References 1. Naidu S (2003) E-learning: a guidebook of principles, procedures and practices: Commonwealth Educational Media Centre for Asia (CEMCA) 2. Yadav S, Gupta P, Sharma A (2021) An empirical study on adoption of ICT tools by students in higher educational institutions. In: 2021 International conference on innovative practices in technology and management (ICIPTM), Noida, India, pp 266–271. https://doi.org/10.1109/ICIPTM52218.2021.9388341; http://ieeexplore.ieee.org/stamp/stamp. jsp?tp=&arnumber=9388341&snumber=9388317 3. Tarcan E, Varol ES, Toker B (2010) A study on the acceptance of information technologies from the perspectives of the academicians in Turkey. EgeAkademikBakı¸sDergisi 10(3):791–812 4. Biró K, Molnár G, Pap D, Sz˝uts Z (2017) The effects of virtual and augmented learning environments on the learning process in secondary school. In: 2017, 8th IEEE International conference on cognitive info communications (CogInfoCom). IEEE, pp 000371–000376 5. Davis FD (1989) Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Q 13(3):319–340. https://doi.org/10.2307/249008,JSTOR249008 6. Sánchez-Prieto JC, Olmos-Migueláñez S, García-Peñalvo FJ (2017) MLearning and preservice academicians: an assessment of the behavioral intention using an expanded TAM model. Comput Hum Behav 72:644–654 7. Davis FD, Venkatesh V (1996) A critical assessment of potential measurement biases in the technology acceptance model: three experiments. Int J Hum Comput Stud 45(1):19–45 8. Gunasinghe A, Abd Hamid J, Khatibi A, Azam SF (2019) Academicians’ acceptance of online learning environments: a review of information system theories and models. Glob J Comput Sci Technol 9. Ali M, Raza SA, Qazi W, Puah CH (2018) Assessing e-learning system in higher education institutes. Interact Technol Smart Educ 10. Horváth I (2016) Digital Life Gap between students and lecturers. In: 2016 7th IEEE international conference on cognitive info communications CogInfoCom). IEEE, pp 000353–000358 11. Cuellar N (2002) The transition from classroom to online teaching. Nursing Forum 37(3):5
12. Dede C (2011) Emerging technologies, ubiquitous learning, and educational transformation. In: European conference on technology enhanced learning. Springer, Berlin, Heidelberg, pp 1–8 13. Kövecses-G˝osi V (2018) Cooperative learning in VR environment. Acta PolytechnicaHungarica 15(3):205–224 14. Venter P, van Rensburg MJ, Davis A (2012) Drivers of learning management system use in a South African open and distance learning institution. Aust J Educ Technol 28(2) 15. Warschauer M (2007) The paradoxical future of digital learning. Learn Inq 1(1):41–49 16. Alharbi S, Drew S (2014) Using the technology acceptance model in understanding academics’ behavioural intention to use learning management systems. Int J Adv Comput Sci Appl 5(1):143–155 17. Basri WSh, Alandejani JA, Almadani FM (2018) ICT adoption impact on students ‘academic performance: evidence from Saudi Universities. Educ Res Int 2018:9p. Article ID 1240197. https://doi.org/10.1155/2018/1240197 18. Pookulangara S, Parr J, Kinley T, Josiam BM (2021) Online sizing: examining True Fit® technology using adapted TAM model. Int J Fashion Des, Technol Educ 1–10 19. Mlekus L, Bentler D, Paruzel A et al (2020) How to raise technology acceptance: user experience characteristics as technology-inherent determinants. Gr Interakt Org 51:273–283. https:// doi.org/10.1007/s11612-020-005
An Investigation on Impact of Gender in Image-Based Kinship Verification Vijay Prakash Sharma and Sunil Kumar
Abstract The task of kinship verification is to establish a blood relationship between two persons. Kinship verification using facial images provides an affordable solution as compared to biological methods. KV has many applications like image annotation, child adoption, family tree creation, photo album management, etc. However, the facial image verification process is challenging because images do not have fixed parameters like resolution, background, age, gender, etc. Many parameters are affecting the accuracy of the methods. One such parameter is the gender difference in the kin relation. We have investigated the impact of the gender difference in the kin relation on popular methods available in the literature. The investigation suggests that gender difference affects kin detection accuracy. Keywords Kinship verification · Gender impact · Facial recognition · Feature matching
1 Introduction Psychology suggests that facial appearance plays a reliable and vital role in the genetic comparison between parents and their children. Computer vision researchers have already started exploring this theory to develop models for kinship verification. Kinship verification using facial images has become an exciting research area. It has many real-life applications such as finding missing children, social media analysis, forensic investigations, organizing family albums, etc. Four types of kin relation have been used in most of the studies, namely father–daughter (F–D), mother–son (M–S), father–son (F–S), and mother–daughter (M–D). Although there are other relations such as sibling and grandparent–grandchild, very few researchers have studied them.
V. P. Sharma Department of IT, SCIT, Manipal University Jaipur, Jaipur, India S. Kumar (B) Department of CCE, SCIT, Manipal University Jaipur, Jaipur, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_26
The human face can help reveal many personal attributes such as age, ethnicity, emotion, gender, etc. Variations in these attributes cause significant appearance changes. Generally, parent–child pairs have an age difference, and there is a gender difference in mother–son or father–daughter pairs. In this paper, we discuss how gender differences affect kinship verification. We show that for opposite-gender pairs, researchers obtained lower accuracy compared to same-gender pairs. Despite the growing contribution of the research community, kinship verification problems remain challenging in a real-world scenario. Four major issues faced in kinship verification are: (1) unconstrained images, (2) age difference, (3) gender difference, and (4) hierarchy in kin relation [1].
2 Methodology In this paper, we have investigated the impact of gender difference on the accuracy of well-known kinship verification methods. The work done on kinship verification can be classified according to the approaches used, namely feature-based approaches [2, 3], metric-based approaches [4, 5], and convolutional neural network models [6, 7]. Accordingly, the popular methods for kinship verification have been grouped into these three categories, and a comparative analysis has been performed on all three classes. Figure 1 shows facial image pairs of kins from the popular KinFaceW-I dataset. Fig. 1 a Images of the same gender, b images of a different gender from KinFaceW-I dataset
Table 1 Popular kinship image datasets
Database                    Size
Cornell KinFace [2]         150 pairs
UB KinFace [8]              200 groups
KinFaceW [9]: KinFaceW-I    533 pairs
KinFaceW [9]: KinFaceW-II   1000 pairs
Family101 [10]              14,816 images
FIW [11]                    656,954 image pairs
Database [4]                200 images
Database [5]                800 images
Sibling-face [12]           600
Group face(30) [12]         106
To compare the performance of kinship verification algorithms, researchers have prepared some facial image datasets. Table 1 lists a few popular datasets. These datasets mainly contain four types of facial image pairs with kin relations (F–D, M–S, M–D, and F–S). A very few datasets also contain other relations like grandparent–grandchild and sibling pairs. KinFaceW-I and KinFaceW-II are the oldest and most popular datasets for kinship verification. In KinFaceW-I, the images of a pair are collected from different photos, whereas in KinFaceW-II, they are cropped from the same photo. KinFaceW-I has 156 image pairs for F–S, 134 for F–D, 116 for M–S, and 127 for M–D, while KinFaceW-II has the same four relations with 250 image pairs for each.
3 Investigation Results We have compared the accuracy of popular methods on kinship datasets in three categories. The reason for doing so is to ascertain that the observations are independent of the method used to verify kinship relations.
3.1 Feature-Based Methods We have compared the feature-based methods first. In this approach, researchers used feature extraction methods such as LBP, HOG, and SIFT, together with classification algorithms such as SVM and KNN, to check the similarity between features and judge whether a pair has a kin relation or not. Fang et al. [2] used 14 local features of the face, e.g., right eye color, eye-to-nose distance, left eye window, etc., and calculated the difference between the parent's and child's feature vectors. Based on this difference, they classify whether an image pair has a kin relation or not. Zhou et al. [3] proposed
a spatial pyramid learning-based feature descriptor that works directly on raw pixels; a support vector machine is used for binary classification. Bottinok et al. [11] used textural features such as LPQ, WLD, three-patch LBP, and four-patch LBP, and tried different combinations of these feature extraction approaches. The minimum redundancy maximum relevance (mRMR) algorithm and sequential forward floating selection (SFS) approaches were used for feature selection. Goyal et al. [12] did not use the complete face image for matching; instead, they extracted subparts such as the left eye and right eye and matched them with the corresponding parts of other face images, using the Canny edge detection algorithm to extract the parts. Duan and Tan [13] subtracted unrelated features from the extracted features and then applied a Gaussian-based normalized squared Euclidean distance comparison on the resultant features for classification. The accuracy of the above feature-based methods for opposite-gender pairs and same-gender pairs is summarized in Table 2. The relative performance is also shown in Fig. 2. It can be easily observed that the average accuracy for the same-gender kin relations is higher than for the opposite-gender kin relations. Table 2 Accuracy comparison of opposite gender and same gender
Approach                                     Opposite gender          Same gender
                                             F–D        M–S           F–S        M–D
Local feature [2]                            54.55      73.81         72.94      61.29
Spatial pyramid learning-based (SPLE) [3]    61.5       72.5          63.5       73.5
FPLBP-LPQ-WLD [13]                           85.4       84.5          85.6       89.7
Facial parts [14]                            86.43      93.13         88.5       87
LPQ + feature subtraction metric [15]        63.8       69.9          75.4       74.6
Fig. 2 Comparison of feature-based approaches on opposite-gender pairs
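To make the feature-based pipeline concrete, the sketch below illustrates the general recipe described above: extract a handcrafted descriptor from each face, take the difference of the two feature vectors, and classify the pair with an SVM. It is only an illustrative reconstruction, not the exact configuration of any surveyed paper; the LBP settings, histogram size, and the way pairs are supplied are assumptions.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(gray_face, points=8, radius=1):
    """Uniform LBP histogram of a grayscale face image (illustrative settings)."""
    lbp = local_binary_pattern(gray_face, points, radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=points + 2, range=(0, points + 2), density=True)
    return hist

def pair_feature(parent_face, child_face):
    """Absolute difference of the two descriptors, as in difference-based methods."""
    return np.abs(lbp_histogram(parent_face) - lbp_histogram(child_face))

def train_kinship_svm(pairs, labels):
    """pairs: list of (parent_image, child_image); labels: 1 = kin, 0 = non-kin."""
    feats = np.array([pair_feature(p, c) for p, c in pairs])
    clf = SVC(kernel="rbf", C=1.0)
    clf.fit(feats, labels)
    return clf
```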
3.2 Neural Network Based In this section, we discuss some neural network-based approaches. Instead of handcrafted feature extraction, researchers used convolutional neural networks for feature extraction. A neural network model consists of convolution layers and max-pooling/average-pooling layers followed by a flattening layer. The convolution layers extract features from the image, the pooling layers retain the most important features from the feature maps, and the flattening layer converts the data to a one-dimensional vector. Some researchers used pre-trained networks like VGGNet and ResNet to overcome small dataset problems, and some use two parallel models to design a Siamese network. Yan and Wang [7] proposed a multiple-input attention network consisting of three convolution layers and two max-pooling layers followed by a fully connected layer; after each convolution layer, an attention layer was added. Zhang et al. [14] proposed a three-layer CNN architecture in which the first layer has 16 filters, the second layer has 64 filters, and the third layer has 128 filters. Nandy and Mondal [16] used a Siamese CNN architecture based on SqueezeNet, pre-trained on the VGGFace2 dataset. Dahan and Keller [15] used a 20-layer SphereFace CNN and created a Siamese network for learning face embeddings; in addition, sphere loss is used to optimize the CNN model. Liang et al. [6] used a four-layer neural network model for facial feature extraction and an autoencoder to extract relational features from the facial features; based on these relational features, the kin relation was determined. Duan et al. [17] proposed a coarse-to-fine transfer (CFT) model. It has two parts, a coarse CNN (cCNN) and a fine CNN (fCNN): the cCNN is pre-trained with the ImageNet dataset and the fCNN is trained with a kinship dataset. The cCNN is utilized to find facial components, and the fCNN is used to find features specific to the kin relation. The accuracy of the above neural network-based methods for opposite-gender and same-gender pairs is summarized in Table 3. The relative performance is also shown in Fig. 3. It can be easily observed that in the neural network-based approach, the average accuracy for the same-gender kin relations is higher than for the opposite-gender kin relations. Table 3 Accuracy comparison of CNN models
Name of approach                          Opposite gender        Same gender
                                          F–D       M–S          F–S       M–D
Multiple-input attention network [17]     75.9      78.2         81.2      85.2
CNN model + autoencoder [18]              69.3      70.2         71.2      73.3
20-layer SphereFace CNN [19]              68.9      70.9         72.9      73.8
CNN model [20]                            81.9      87.9         89.4      92.4
VGG-face model [23]                       89.87     89.76        91.79     90.94
Coarse-to-fine transfer [21]              71.7      77.2         78.8      81.9
Fig. 3 Accuracy comparison of different neural network-based solution for kinship verification
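As an illustration of the Siamese idea used by several of the surveyed CNN approaches, the minimal Keras sketch below shares one small convolutional encoder between the parent and child images and classifies the absolute difference of the two embeddings. The layer sizes and the 64 × 64 input resolution are assumptions for illustration, not the settings of any specific paper.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_encoder(input_shape=(64, 64, 3)):
    # Small shared CNN that maps a face image to an embedding vector
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)
    return Model(inp, x, name="encoder")

def build_siamese(input_shape=(64, 64, 3)):
    encoder = build_encoder(input_shape)
    parent = layers.Input(shape=input_shape, name="parent")
    child = layers.Input(shape=input_shape, name="child")
    # absolute difference of the two embeddings feeds the kin/non-kin head
    diff = layers.Lambda(lambda t: tf.abs(t[0] - t[1]))([encoder(parent), encoder(child)])
    out = layers.Dense(1, activation="sigmoid", name="kin_probability")(diff)
    model = Model([parent, child], out)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```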
3.3 Metric-Based Methods In this section, we discuss metric learning approaches and some transfer learning approaches. In metric learning, we calculate the distance between objects; mostly Euclidean and Mahalanobis distances are used. Images with a kin relation have a smaller distance compared to non-kin images. Xu and Shang [22] proposed an online similarity learning with average strategy (OSL-A) method. The target of OSL-A is a weight matrix W such that the distance between kin pair images is small and the distance between non-kin pair images is large. OSL-A is based on an online passive-aggressive algorithm. Zhou et al. [19] proposed a scalable similarity learning (SSL) approach. SSL learns a diagonal bilinear similarity metric on the human face based on an online sparse learning strategy, using multiple feature representations of the face. Alirezazadeh et al. [21] used SPLE for feature extraction; they combined the feature vectors of parent and child and, for selecting features from this combined vector, proposed a genetic algorithm named KinshipGA. Yan et al. [5] suggested a metric learning-based approach in which multiple features are extracted using different face descriptors and combined, and the kin relation is then verified using multiple distance metrics. Lu et al. [4] proposed a distance metric technique in which kin-related images are pulled closer while those without a kin relation are pushed away. The accuracy of the above metric learning-based methods for opposite-gender and same-gender pairs is summarized in Table 4, and the relative performance is shown in Fig. 4. It can be easily observed that the average accuracy for the same-gender kin relations in the metric learning methods is higher than for the opposite-gender kin relations.
Table 4 Accuracy comparison of metric learning approaches
Approach                                                         Opposite gender       Same gender
                                                                 F–D       M–S         F–S       M–D
Online similarity learning with average strategy (OSL-A) [24]    75        71.58       82.72     82.36
Scalable similarity learning (SSL) + LBP [10]                    69.9      71.3        80.2      73.8
Feature selection using genetic approach [26]                    81.8      86.8        88.8      87.2
Discriminative multimetric learning (DMML) + SVM [27]            69.5      69.5        74.5      75.5
MNRML [9]                                                        66.5      66.2        72.5      72
Fig. 4 Accuracy comparison of different metric learning-based solution for kinship verification
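The core computation behind the metric-based methods can be sketched as follows: a matrix M parameterizes a Mahalanobis-style distance between the two face descriptors, M is nudged so that kin pairs move closer and non-kin pairs move apart, and a threshold on the learned distance gives the kin/non-kin decision. This is a simplified illustration of the general idea (closest in spirit to the pull/push formulation of [4]), not a faithful re-implementation of any single method; the step size, threshold, and feature dimensionality are assumptions.

```python
import numpy as np

def mahalanobis_sq(x, y, M):
    """Squared learned distance d_M(x, y)^2 = (x - y)^T M (x - y)."""
    d = x - y
    return float(d @ M @ d)

def train_metric(pairs, labels, dim, lr=0.01, epochs=20):
    """pairs: list of (x, y) feature vectors; labels: 1 = kin, 0 = non-kin.
    Gradient steps shrink kin-pair distances and enlarge non-kin distances."""
    M = np.eye(dim)
    for _ in range(epochs):
        for (x, y), kin in zip(pairs, labels):
            d = (x - y).reshape(-1, 1)
            grad = d @ d.T              # gradient of (x - y)^T M (x - y) w.r.t. M
            M -= lr * grad if kin else -lr * grad
        # project back to a symmetric positive semi-definite matrix
        M = (M + M.T) / 2
        w, V = np.linalg.eigh(M)
        M = V @ np.diag(np.clip(w, 0, None)) @ V.T
    return M

def is_kin(x, y, M, threshold=1.0):
    return mahalanobis_sq(x, y, M) < threshold
```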
4 Conclusion The study of kin relationships is significant and inherent to human society. We have conducted an investigation of the impact of gender on kin relation verification. Our investigation reveals that establishing kinship from same-gender facial image pairs is more accurate than from opposite-gender pairs. This may further help in designing more efficient algorithms for establishing kinship relationships. The conclusion holds for the three major classes of methods used in the literature.
References 1. Sharma S, Sharma VP (2021) Insights of kinship verification for facial images—a review. In: Proceedings of second international conference on smart energy and communication, pp 111–122 2. Fang R, Tang KD, Snavely N, Chen T (2010) Towards computational models of kinship verification. In: Proceedings—international conference on image processing, ICIP 3. Zhou X, Hu J, Lu J, Shang Y, Guan Y (2011) Kinship verification from facial images under uncontrolled conditions. In: MM’11—Proceedings of the 2011 ACM multimedia conference and co-located workshops 4. Lu J, Zhou X, Tan YP, Shang Y, Zhou J (2014) Neighborhood repulsed metric learning for kinship verification. IEEE Trans Pattern Anal Mach Intell 36(2):331–345 5. Yan H, Lu J, Deng W, Zhou X (2014) Discriminative multimetric learning for kinship verification. IEEE Trans Inf Forensics Secur 9(7):1169–1178 6. Liang J, Guo J, Lao S, Li J (2017) Using deep relational features to verify kinship. Commun Comput Inf Sci 771:563–573 7. Yan H, Wang S (2019) Learning part-aware attention networks for kinship verification. Pattern Recogn Lett 128:169–175 8. Fang R, Gallagher AC, Chen T, Loui A (2013) Kinship classification by modeling facial feature heredity. In: 2013 IEEE international conference on image processing, ICIP 2013— proceedings, pp 2983–2987 9. Robinson JP, Shao M, Wu Y, Fu Y (2016) Families in the Wild (FIW): large-scale kinship image database and benchmarks. In: MM 2016—Proceedings of the 2016 ACM multimedia conference 10. Guo Y, Dibeklioglu H, van der Maaten L (2014) Graph-based kinship recognition. In: Proceedings—international conference on pattern recognition, pp 4287–4292 11. Bottinok A, Islam IU, Vieira TF (2015) A multi-perspective holistic approach to kinship verification in the wild. In: 11th IEEE international conference and workshops on automatic face and gesture recognition (FG), vol 2. IEEE 12. Goyal A, Meenpal T (2018) Detection of facial parts in kinship verification based on edge information. In: 2018 Conference on information and communication technology, CICT 2018 13. Duan X, Tan ZH (2015) A feature subtraction method for image based kinship verification under uncontrolled environments. In: Proceedings—international conference on image processing, ICIP, 2015, vol 2015-December, pp 1573–1577 14. Zhang K, Huang Y, Song C, Wu H, Wang L (2015) Kinship verification with deep convolutional neural networks, pp 148.1–148.12 15. Dahan E, Keller Y (2020) A unified approach to kinship verification. IEEE Trans Pattern Anal Mach Intell 2020 16. Nandy A, Mondal SS (2019) Kinship verification using deep Siamese convolutional neural network. In: Proceedings—14th IEEE international conference on automatic face and gesture recognition, FG 2019 17. Duan Q, Zhang L, Zuo W (2017) From face recognition to kinship verification: an adaptation approach. In: Proceedings—2017 IEEE international conference on computer vision workshops, ICCVW 2017, 2017, vol 2018-January, pp 1590–1598 18. Shao M, Xia S, Fu Y (2011) Genealogical face recognition based on UB KinFace database. In: IEEE computer society conference on computer vision and pattern recognition workshops 19. Zhou X, Yan H, Shang Y (2016) Kinship verification from facial images by scalable similarity fusion. Neurocomputing 197:136–142 20. Chergui A, Ouchtati S, Mavromatis S, Bekhouche SE, Lashab M, Sequeira J (2020) Kinship verification through facial images using CNN-based features. Traitement du Signal 37(1):1–8 21. 
Alirezazadeh P, Fathi A, Abdali-Mohammadi F (2015) A genetic algorithm-based feature selection for kinship verification. IEEE Signal Process Lett 22(12):2459–2463 22. Xu M, Shang Y (2016) Kinship verification using facial images by robust similarity learning. Math Prob Eng 2016
Classification of COVID-19 Chest CT Images Using Optimized Deep Convolutional Generative Adversarial Network and Deep CNN K. Thangavel
and K. Sasirekha
Abstract The coronavirus disease 2019 (COVID-19) pandemic has become a major threat to the entire world and severely affects the health and economy of many people. It also causes a lot of other diseases and side effects even after treatment for COVID-19. Early detection and diagnosis will reduce community spread as well as save lives. Even though clinical methods are available, some imaging methods are being adopted to diagnose the disease. Recently, several deep learning models have been developed for screening COVID-19 using computed tomography (CT) images of the chest, which play a potential role in diagnosing, detecting complications, and prognosticating coronavirus disease. However, the performance of these models is highly affected by the limited availability of samples for training. Hence, in this work, a deep convolutional generative adversarial network (DCGAN) has been proposed and implemented which automatically discovers and learns the regularities from input data so that the model can be used to generate the requisite samples. Further, the hyperparameters of DCGAN such as the number of neurons, learning rate, momentum, alpha, and dropout probability have been optimized by using a genetic algorithm (GA). Finally, a deep convolutional neural network (CNN) with various optimizers is implemented to predict COVID-19 and non-COVID-19 images, which assists radiologists in increasing diagnostic accuracy. The proposed deep CNN model with GA optimized DCGAN exhibits an accuracy of 94.50%, which is higher than the pre-trained models such as AlexNet, VggNet, and ResNet. Keywords CAD · Chest CT · CNN · COVID-19 · DCGAN · Genetic algorithm
K. Thangavel · K. Sasirekha (B) Department of Computer Science, Periyar University, Salem, Tamil Nadu, India e-mail: [email protected] K. Thangavel e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_27
1 Introduction The novel coronavirus originated in Wuhan, China, and increasingly spread across several countries with high death rates. Finally, it was declared a pandemic by the World Health Organization [1, 2]. It was found that COVID-19 is a type of viral infection caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). More specifically, it is considered a serious pathogen for humans because it severely infects the respiratory system. As per recent statistics, there are millions of confirmed cases in the United States and India, and the number keeps increasing; India has the second largest number of positive cases. The COVID-19 pandemic has put the healthcare system under tremendous pressure across the globe [3]. Generally, reverse transcription polymerase chain reaction (RT-PCR) is a proven diagnostic tool for diagnosing COVID-19 [4]. Unfortunately, the testing procedure is time consuming and the sensitivity is also imperfect, which results in a high risk of infecting a larger population. In such situations, medical imaging techniques such as chest CT and X-ray imaging play a significant role in COVID-19 diagnosis. As chest CT is a more precise technique for imaging the chest, with higher sensitivity and efficiency than chest X-rays, it is being used to diagnose COVID-19 [5]. In recent years, deep learning models have become very popular in healthcare applications, where they diagnose diseases automatically from medical images. With more generalization ability, deep learning allows a model to be fed with raw images and to automatically discover the unique patterns requisite for further analysis [6]. Since COVID-19 is spreading widely, many studies have focused on developing CAD tools to diagnose the disease at an early stage. The intention is to find a more complex and abstract representation of the input data hierarchically by passing the data through multiple hidden layers. On the other hand, it is quite difficult and challenging to select a suitable deep learning model that will optimally predict COVID-19 from chest CT images. Nowadays, deep convolutional neural network models have been applied to a wide variety of medical imaging problems with great success, as they are effective in learning data representations automatically [7]. On the other hand, effective training of a deep CNN model requires more data. With less data, the network parameters are underdetermined, which results in poor generalization. This problem could be alleviated by using data augmentation methods which increase the diversity of data [8].
1.1 Contributions
1. Deep convolutional generative adversarial network is employed to synthetically create more CT images of COVID-19.
2. Genetic algorithm is implemented to optimize the hyperparameters of DCGAN, namely number of neurons, learning rate, momentum, alpha, and dropout probability.
3. A deep CNN model is trained with real and generated images to predict COVID-19 more accurately.
The paper is structured as follows. Section 2 elaborately discusses the related work of machine learning models for COVID-19 prediction. The proposed CAD system using a deep CNN model with genetic algorithm optimized DCGAN augmentation is presented in Sect. 3. The experimental settings and the results of the model quantitatively are discussed in Sect. 4. Finally, conclusions are drawn and future directions are described in Sect. 5.
2 Literature Review Silva et al. [9] developed an efficient COVID-Net model with a voting-based approach to classify COVID-19 images. Here, two datasets of CT images, namely SARS-CoV-2 CT scan and COVID-CT, were considered for testing the model. To improve the accuracy, the images were normalized and augmented with standard transformations such as rotation, horizontal flip, and scaling. Accuracies of 97.38% and 86.0% were attained with the SARS-CoV-2 CT scan and COVID-CT datasets, respectively. Further, a cross-dataset analysis was performed by combining the samples of the two datasets during training and testing. In [10], the COVIDC model was presented to classify COVID images. The CT images were denoised and contrast stretched for further analysis. Then, the more discriminative features were extracted using various pre-trained models. Finally, the classification was performed with SVM, random forest, and XGBoost classifiers. Additionally, the severity of the positive cases was classified into mild and severe with the same model. An F1 score of 0.94 was achieved with the features extracted using DenseNet121 and an SVM classifier. Rohila et al. [11] introduced the ReCOV-101 network, which addresses the vanishing gradient problem by exploiting skip connections. The ResNet-101 pre-trained model was used to extract features from lung CT images, and an accuracy of 94.9% was obtained for COVID-19 classification. In [12], a decision-based approach was constructed to improve the accuracy of COVID-19 prediction from chest CT images. In this work, various pre-trained models such as VGG16, InceptionV3, ResNet50, DenseNet121, and DenseNet201 were employed for classification. Finally, a majority voting scheme was applied to make the final decision, achieving an F1-score of 0.86. Khadidos et al. [13] combined a deep CNN and a recurrent neural network into the DeepSense model to predict COVID-19 from CT images. The developed model was tested on three different publicly available datasets, COVID-CT, CORD-19, and IEEE8023. Sen et al. [14] constructed a CNN model that extracts features from the chest CT images. Then the most appropriate features were selected with
mutual information, relief-F, and the dragonfly algorithm. Finally, the images were classified into COVID-19 and non-COVID using a support vector machine; prediction rates of 98.39% and 90.0% were obtained with the SARS-CoV-2 CT and COVID-CT datasets, respectively. In [15], various pre-trained models were explored to detect COVID-19 from CT images; the experiments were conducted on the SARS-CoV-2 CT scan and COVID19-CT datasets, and the extracted features were visualized using the t-SNE and Grad-CAM methods. Singh et al. [16] deployed SVM for analyzing COVID-19 data, Kumari et al. [18] predicted cases for the future, and Bhatnagar et al. [17] performed a descriptive analysis of the COVID-19 scenario. Hence, based on the above literature, CT images have been chosen in this research to detect COVID-19 at an early stage using the optimized DCGAN with the deep CNN model.
3 Materials and Method 3.1 Data Augmentation: Why? Generally, image augmentation is employed to synthetically increase the size of the training dataset. The generalization ability of a deep learning model is highly dependent on the size and variety of the images fed during the training process. Data availability is the biggest challenge in the medical imaging domain while training a machine learning model. Even though some medical datasets are available publicly, most datasets are limited in size and diversity. Further, collecting COVID-19 images is more complex during this pandemic as it requires effective collaboration with radiologists. More recently, researchers have been employing data augmentation to overcome these challenges.
3.2 GAN: Image Augmentation The standard image augmentation methods such as translation, rotation, flipping, and scaling produce only limited convincing alternative data. From the state of the art, the generative adversarial network is a promising approach to generate synthetic images with more diversity [19, 20]. The GAN model consists of two networks that are trained in an adversarial process, where one network (the generator) generates fake images and the other network (the discriminator) discriminates between real and fake images repeatedly. The basic idea of a GAN is to create more realistic images from a random distribution.
3.3 GA: Optimization Generally, the genetic algorithm is an evolutionary optimization algorithm which has been effectively applied to a variety of real-world complex problems to generate best-quality solutions [21]. It has many advantages over conventional optimization algorithms as it can operate on large, noisy, and discontinuous data irrespective of the type of fitness function. It operates on a population of individuals to find an optimal solution by employing biologically inspired operators such as mutation, crossover, and selection.
3.4 Generating Synthetic CT Images with GA Optimized DCGAN In this work, a variant of the conventional GAN, the deep convolutional GAN, is implemented, where the generator and discriminator are deep CNNs. The generator (G) of the DCGAN is a neural network which generates images through a complex mapping procedure. It takes noisy samples z drawn from a uniform distribution P_z as the input and outputs new data G(z). The distribution P_g(z) of the generated data should be similar to the probability distribution of the real data, P_data(x). The generator is trained to fool the discriminator by generating synthetic images that resemble the actual images. The discriminator (D) of the DCGAN is a classifier whose task is to decide whether an image is real or fake. It differentiates between a real data sample x ∼ P_data(x) and a generated data sample G(z), z ∼ P_g(z). The discriminator loss function penalizes the discriminator for misclassifying a real instance as fake or a fake instance as real, and updates the discriminator's weights through back-propagation. Similarly, the generator loss function penalizes the generator for failing to fool the discriminator and updates its weights. The generator and discriminator are trained individually, i.e., the weights of the generator remain constant while it produces samples for the discriminator to train, and vice versa while training the generator [22]. When the generator becomes accurate, its loss function decreases in contrast to the discriminator loss. The DCGAN is trained by optimizing the loss function of a two-player minimax game as given in Eq. (1):

min_G max_D = E_{x ∼ P_data(x)}[log(D(x))] + E_{z ∼ P_g(z)}[log(1 − D(G(z)))]   (1)
Here, the discriminator is trained to maximize D(x) for images drawn from P_data(x) and to minimize D(x) for generated images. The generator is trained to fool the discriminator by producing images G(z) whose distribution approaches P_data(x); hence, the generator is trained to maximize D(G(z)), or equivalently to minimize 1 − D(G(z)). The architecture of the DCGAN is shown in Fig. 1.

Fig. 1 Deep convolutional GAN architecture (Z: noise, G: generator, Xn: real images, Yn: fake images, D: discriminator, output 0: real / 1: fake)

Hyperparameters Tuning of DCGAN. The hyperparameters are used to control the learning process of GAN models, and their values significantly affect the model performance. Hence, selecting an optimal set of values in order to build a DCGAN model is essential. In this research work, the essential hyperparameter values are optimally tuned with the genetic algorithm. The optimized DCGAN hyperparameters are summarized as
• Learning rate of Adam optimizer: generator and discriminator
• Alpha parameter of ReLU: generator and discriminator
• Batch normalization momentum: generator
• Dropout probability: discriminator
• Number of neurons in dense layer: discriminator.
The genetic algorithm optimizes the hyperparameters of DCGAN by minimizing the generator loss (g_loss) which is the fitness function. As a consequence, the discriminator accuracy (d_acc) and loss (d_loss) will be minimized and maximized, respectively. The complete procedure of COVID-19 CT image augmentation with the proposed DCGAN with genetic algorithm is presented in algorithm 1.
Algorithm 1: Pseudocode of Genetic Algorithm Optimized DCGAN
begin
  Input: A set of GA parameters, image dataset, GAN architecture
  Output: The best set of hyper-parameters for DCGAN
  1  Generate the initial population P(0)
  2  Evaluate individuals in the initial population P(0) as:
  3    for each individual
  4      build generator and discriminator architecture
  5      train DCGAN
  6      return g_loss
  7  i ← 0
  8  repeat
  9    randomly select individuals P(i) in initial population P(0)
  10   generate offspring Q(i)
  11   evaluate new offspring Q(i) by training DCGAN
  12   perform selection from P(i) ∪ Q(i) for next generation; individuals with maximum g_loss will be discarded
  13   i ← i + 1
  14 until max_num_iteration reached
  15 return the best set of hyper-parameters for DCGAN; individual having the best fitness in P(i)
end procedure
3.5 COVID-19 Prediction with Deep CNN Model After image augmentation, the deep CNN model is trained with a series of convolutional and pooling layers to extract the optimal features as learned filter weights from a raw chest CT image. These weights serve as input to the dense architecture of the deep network for COVID-19 prediction [23]. Subsequent feature map values are calculated using Eq. (2); here, the input image is represented as I and the kernel or filter as F:

C[m, n] = (I ∗ F)[m, n] = Σ_j Σ_k F[j, k] · I[m − j, n − k]   (2)
The proposed CAD system exploits rectified linear units (ReLUs) as the activation function at the coding network, which is nonlinear in nature and trains the network many times faster. Moreover, it increases the nonlinear properties of the decision function without affecting the receptive fields of the convolution layer, as given in Eq. (3). The block diagram of the proposed model is presented in Fig. 2.

f(x) = 0 for x < 0;  f(x) = x for x ≥ 0   (3)
Fig. 2 Workflow of the proposed CAD system (GA hyper-parameter optimization block: initialize population, evaluate population, generate offspring, survival selection until the maximum iteration, return optimal set using g_loss as fitness; DCGAN block: generator G and discriminator D with real/fake output augmenting the CT dataset; deep CNN with dense layer producing the final classification output)
Here, the CT images are augmented using the DCGAN, and a deep convolutional neural network is implemented to classify COVID-19 from non-COVID-19 images. Further, the hyperparameters of the DCGAN have been optimized by using the genetic algorithm.
4 Experimental Results The proposed COVID-19 prediction system has been implemented and analyzed. The experiments were conducted to validate the proposed GA optimized DCGAN on generated CT images with deep CNN model.
4.1 Dataset The COVID-CT scan dataset is used, which is publicly available at a GitHub repository [24]. It contains 746 images: 349 COVID-19 and 397 non-COVID-19 CT scans, as presented in Fig. 3. The performance of the deep CNN model is highly dependent on the size of the dataset. Therefore, to increase the dataset size, 600 images (300 COVID-19 and 300 non-COVID-19 CT scans) are generated synthetically with the genetic algorithm optimized DCGAN.

Fig. 3 Sample of CT images in COVID-CT scan dataset
4.2 Experimental Setup Generator and Discriminator Architecture: The DCGAN is a generative adversarial network with convolutional neural networks as its generator and discriminator. The generator network comprises a dense layer to interpret the input vector, followed by three fractional-strided convolution layers with a kernel size of 3 × 3. It generates a CT image of dimension 150 × 150 × 1 from a random vector of size 100 × 1 drawn from a uniform distribution. The discriminator network consists of two convolution and max-pooling layers with 3 × 3 filters, followed by a fully connected layer with sigmoid activation to classify the input image as real or fake. The ReLU activation is applied to the generator network layers except the output layer, which uses the tanh function, and the internal layers of the discriminator network use the LeakyReLU activation function. Further, batch normalization is performed at all layers of the generator and discriminator. The parameter initialization of the genetic algorithm used to optimize the DCGAN hyperparameters is given in Table 1. Deep CNN Architecture: The developed CAD system classifies the chest CT images using the flattened weighted feature vector acquired from the deep CNN model. It computes the loss and updates the weights of the internal nodes accordingly. The hyperparameters of the proposed model with convolution, pooling, and fully connected (FC) layers are reported in Table 2. Transfer Learning Architecture: Deep transfer learning models, in particular deep CNNs, have shown tremendous success in classifying medical images with high sensitivity. The parameters and architectures of the transfer learning methods such as AlexNet, VggNet, and ResNet used in this work [25] are presented in Table 3.
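The Keras sketch below shows one plausible way to realize the generator and discriminator just described (100 × 1 noise input, three fractional-strided 3 × 3 convolutions up to a 150 × 150 × 1 image, and a two-block convolutional discriminator with a sigmoid dense head). The filter counts and the 25 × 25 seed resolution are assumptions chosen so the upsampling arithmetic works out; the alpha, dropout, dense width, batch-norm momentum, and learning rate follow the tuned values reported in the Discussion, and discriminator batch normalization is omitted for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_generator(noise_dim=100):
    z = layers.Input(shape=(noise_dim,))
    x = layers.Dense(25 * 25 * 128)(z)                     # interpret the noise vector
    x = layers.BatchNormalization(momentum=0.9)(x)
    x = layers.ReLU()(x)
    x = layers.Reshape((25, 25, 128))(x)
    x = layers.Conv2DTranspose(128, 3, strides=3, padding="same")(x)  # 25 -> 75
    x = layers.BatchNormalization(momentum=0.9)(x)
    x = layers.ReLU()(x)
    x = layers.Conv2DTranspose(64, 3, strides=2, padding="same")(x)   # 75 -> 150
    x = layers.BatchNormalization(momentum=0.9)(x)
    x = layers.ReLU()(x)
    img = layers.Conv2DTranspose(1, 3, strides=1, padding="same", activation="tanh")(x)
    return Model(z, img, name="generator")

def build_discriminator(img_shape=(150, 150, 1)):
    img = layers.Input(shape=img_shape)
    x = layers.Conv2D(32, 3, padding="same")(img)
    x = layers.LeakyReLU(0.1)(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, padding="same")(x)
    x = layers.LeakyReLU(0.1)(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(128)(x)
    x = layers.LeakyReLU(0.1)(x)
    x = layers.Dropout(0.25)(x)
    out = layers.Dense(1, activation="sigmoid")(x)          # real (1) vs fake (0)
    d = Model(img, out, name="discriminator")
    d.compile(optimizer=tf.keras.optimizers.Adam(2e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
    return d
```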
Table 1 Genetic algorithm parameters
Parameter               Value
population_size         150
crossover_probability   0.5
mutation_probability    0.1
elit_ratio              0.01
parents_portion         0.3
crossover_type          uniform
max_num_iteration       3000
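The parameter names in Table 1 match the configuration dictionary of the open-source Python geneticalgorithm package, so the optimization can plausibly be set up as below. The search-space bounds and the dcgan_generator_loss fitness function (which would build and briefly train a DCGAN with the candidate hyperparameters and return its generator loss) are hypothetical placeholders, not values taken from the paper.

```python
import numpy as np
from geneticalgorithm import geneticalgorithm as ga

def dcgan_generator_loss(h):
    # h = [learning_rate, alpha, momentum, dropout, dense_neurons]
    # Hypothetical helper assumed to exist: builds the DCGAN with these
    # hyperparameters, trains it briefly, and returns the final g_loss.
    return train_dcgan_and_return_g_loss(h)

bounds = np.array([[1e-5, 1e-2],   # Adam learning rate (assumed range)
                   [0.01, 0.30],   # ReLU/LeakyReLU alpha
                   [0.50, 0.99],   # batch-norm momentum
                   [0.10, 0.50],   # discriminator dropout
                   [32, 512]])     # discriminator dense neurons

params = {"max_num_iteration": 3000, "population_size": 150,
          "mutation_probability": 0.1, "elit_ratio": 0.01,
          "crossover_probability": 0.5, "parents_portion": 0.3,
          "crossover_type": "uniform", "max_iteration_without_improv": None}

model = ga(function=dcgan_generator_loss, dimension=5, variable_type="real",
           variable_boundaries=bounds, algorithm_parameters=params)
model.run()   # minimizes g_loss; the best hyperparameter set is in model.output_dict
```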
Table 2 Hyperparameters of the proposed CAD system
Layer parameters   Layer 1 convolution   Layer 2 pooling   Layer 3 convolution   Layer 4 pooling   Layer 5 FC-1   Layer 6 FC-2   Layer 7 FC-3
No. of filters     64                    –                 32                    –                 –              –              –
Filter size        3 × 3                 –                 3 × 3                 –                 –              –              –
Conv. stride       1 × 1                 –                 1 × 1                 –                 –              –              –
Pooling size       –                     2 × 2             –                     2 × 2             –              –              –
Pooling stride     –                     1 × 1             –                     1 × 1             –              –              –
No. of nodes       –                     –                 –                     –                 512            256            2
Activation         ReLU                  –                 ReLU                  –                 ReLU           ReLU           Softmax
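Read directly off Table 2, the classification network can be written in Keras as below. The 150 × 150 × 1 input shape and the 'same' padding are assumptions (the input size matches the DCGAN-generated images); stride-1 pooling follows the table even though stride-2 pooling is the more common choice and would greatly reduce the size of the flattened vector.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_deep_cnn(input_shape=(150, 150, 1)):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(64, (3, 3), strides=(1, 1), activation="relu", padding="same"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(1, 1)),
        layers.Conv2D(32, (3, 3), strides=(1, 1), activation="relu", padding="same"),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(1, 1)),
        layers.Flatten(),                      # large with stride-1 pooling
        layers.Dense(512, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dense(2, activation="softmax"),  # COVID-19 vs non-COVID-19
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```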
Table 3 Parameters of deep transfer learning models
Network   Depth   Parameters (in millions)   Image input size
AlexNet   8       60                         227 × 227
VggNet    16      138                        224 × 224
ResNet    50      25                         224 × 224
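For comparison with Table 3, a typical way to fine-tune one of the pre-trained backbones (here ResNet50, which is available in Keras; AlexNet would need a separate implementation) is sketched below. The global-average-pooling head and the frozen backbone are assumptions about the experimental setup, which the paper does not spell out.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# grayscale CT slices would be repeated to three channels before being fed in
base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                      input_shape=(224, 224, 3))
base.trainable = False                            # freeze the pre-trained backbone

x = layers.GlobalAveragePooling2D()(base.output)
out = layers.Dense(2, activation="softmax")(x)    # COVID-19 vs non-COVID-19
transfer_model = Model(base.input, out)
transfer_model.compile(optimizer=tf.keras.optimizers.Adam(),
                       loss="categorical_crossentropy", metrics=["accuracy"])
```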
Fig. 4 Synthetic CT images: a COVID-19 images, b non-COVID images
4.3 Discussion All the experiments and evaluations were performed in the Google Colab environment with the Keras framework and GPU. Training: The DCGAN is trained with the 349 COVID-19 CT images to synthetically generate COVID-19 CT images, and with the 397 non-COVID-19 images to generate non-COVID-19 CT images, respectively. For COVID-19 prediction, 80% of the data has been used for training and the remaining 20% for testing the deep CNN model with fivefold cross-validation. Tuned Hyperparameters: The optimized hyperparameter values of the DCGAN obtained with the genetic algorithm are as follows: the learning rate of the Adam optimizer for the DCGAN architecture is 0.0002; the alpha value of the ReLU for both generator and discriminator is 0.1; the momentum of batch normalization is 0.9; the discriminator dropout probability is 0.25; and the number of neurons in the discriminator dense layer is 128. A sample of the images generated with the DCGAN is presented in Fig. 4. After image augmentation, while training the deep CNN model for COVID-19 prediction, the weights are initialized randomly and updated in every epoch to increase the overall accuracy of the model. In particular, the change in weights with respect to the loss function is controlled by the optimizer. In this work, the optimizers stochastic gradient descent (SGD), root mean square propagation (RMSProp), Adadelta, Adamax, and adaptive moment estimation (Adam), as presented in Table 4, have been implemented to predict COVID-19. To classify the chest CT image into COVID-19 and non-COVID-19, the Softmax function as in Eq. (4) is exploited at the dense layer of the deep CNN model:

f(x_i) = exp(x_i) / Σ_j exp(x_j)   (4)
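A minimal way to reproduce the optimizer comparison behind Tables 5 and 6 is to re-compile the same network with each optimizer in turn, as sketched below. The build_deep_cnn helper is the Table 2 network sketched earlier, and the random arrays stand in for the preprocessed CT images; both are assumptions for illustration only.

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(64, 150, 150, 1).astype("float32")          # placeholder CT batch
y = tf.keras.utils.to_categorical(np.random.randint(0, 2, 64), num_classes=2)

results = {}
for name in ["SGD", "RMSprop", "Adamax", "Adadelta", "Adam"]:
    model = build_deep_cnn()                                    # sketched after Table 2
    model.compile(optimizer=tf.keras.optimizers.get(name),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    hist = model.fit(x, y, epochs=2, batch_size=16, verbose=0)
    results[name] = hist.history["accuracy"][-1]
print(results)
```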
The performance of machine learning models is commonly evaluated with the results of a confusion matrix [26]. Hence, quantitative metrics such as precision,
Table 4 Deep CNN parameters
Optimizer                      Formula
Stochastic gradient descent    θ = θ − η ∇J(θ; x, y)
Root mean square propagation   θ_{t+1} = θ_t − η / √((1 − γ) g²_{t−1} + γ g²_t + ε) · g_t
Adadelta                       Δθ_t = −(RMS[Δθ]_{t−1} / RMS[g]_t) · g_t;  θ_{t+1} = θ_t + Δθ_t
Adamax                         u_t = max(β₂ · v_{t−1}, |g_t|);  θ_{t+1} = θ_t − (η / u_t) · m̂_t
Adaptive moment estimation     θ_{t+1} = θ_t − η / (√v̂_t + ε) · m̂_t;  m̂_t = m_t / (1 − β₁^t);  v̂_t = v_t / (1 − β₂^t)
recall, F-measure, accuracy, and error rate obtained from the confusion matrix have been utilized to evaluate the proposed model. The experimental results of the deep CNN model trained with the actual images of the COVID-CT scan dataset are presented in Table 5 and Fig. 5. The classifier provides accuracies of 74.50%, 76.10%, 77.50%, 83.00%, and 90.14% with the SGD, RMSProp, Adamax, Adadelta, and Adam optimizers, respectively. From the results, it is observed that the weights converge more quickly with the Adam optimizer than with

Table 5 Deep CNN model with actual images only
Dense activation   Optimizer   Precision   Recall   F-measure   Accuracy   Error rate
Softmax            SGD         74.50       74.50    74.50       74.50      25.50
                   RMSProp     76.50       76.90    76.69       76.10      23.90
                   Adamax      77.50       76.72    77.10       77.50      22.50
                   Adadelta    84.20       78.90    81.46       83.00      17.00
                   Adam        89.24       91.60    90.14       90.14      09.86
Fig. 5 Relative quantitative measures of the deep CNN model with actual images only (precision, recall, F-measure, accuracy, and error rate per optimizer: SGD, RMSProp, Adamax, Adadelta, Adam)
SGD, RMSProp, Adamax, and Adadelta. The model exhibits the highest and lowest precision of 84.20% and 74.50%, respectively. The experimental results of the deep CNN model trained with both the actual and augmented images are presented in Table 6. The classifier provides accuracies of 84.67%, 87.30%, 87.70%, 92.90%, and 94.50% with the SGD, RMSProp, Adamax, Adadelta, and Adam optimizers, respectively. The model exhibits the highest and lowest F-measure of 97.89% and 82.99%, respectively. From the results, it is found that the quantitative measures of the deep CNN model improve when it is trained with both actual and augmented images generated from the GA optimized DCGAN. The graphical illustration of the developed model is presented in Figs. 5 and 6. The validation and testing accuracy of the deep CNN before and after image augmentation are provided in Table 7. The benchmark transfer learning methods AlexNet, VggNet, and ResNet have also been explored in this research work and produce accuracies of 90.70%, 91.20%, and 92.00%, respectively. In summary, the proposed deep CNN model trained with images generated from the genetic algorithm optimized DCGAN produces the highest accuracy of 94.50% and a lower error rate of 5.50% compared with the benchmark transfer learning methods AlexNet, VggNet, and ResNet, as presented in Table 8.

Table 6 Deep CNN model with actual and augmented images
Dense activation   Optimizer   Precision   Recall   F-measure   Accuracy   Error rate
Softmax            SGD         83.33       82.67    82.99       84.67      15.33
                   RMSProp     87.30       87.30    87.30       87.30      12.70
                   Adamax      90.20       85.60    87.80       87.70      12.30
                   Adadelta    92.30       92.50    92.39       92.90      07.10
                   Adam        97.80       98.00    97.89       94.50      05.50
Fig. 6 Relative quantitative measures of the deep CNN model with actual and augmented images (precision, recall, F-measure, accuracy, and error rate per optimizer: SGD, RMSProp, Adamax, Adadelta, Adam)
Table 7 Validation and testing accuracy of deep CNN model
Dataset               Accuracy (%)
                      Validation   Testing
Before augmentation   94.85        90.14
After augmentation    98.31        94.50
Table 8 Computational results of deep transfer learning methods using Softmax with Adam optimizer
Transfer learning     Precision   Recall   F-measure   Accuracy   Error rate
AlexNet               89.10       91.70    90.38       90.70      09.30
VggNet                91.20       91.20    91.20       91.20      08.80
ResNet                92.00       93.00    92.49       92.00      08.00
Proposed CAD system   97.80       98.00    97.89       94.50      05.50
5 Conclusion Currently, the early diagnosis of COVID-19 is one of the main challenges across the world. Medical imaging such as chest CT plays an important role in screening lung-related problems, which helps to combat COVID-19 at an early stage and control community outbreaks. In this research, an improved image augmentation model was proposed which optimizes the hyper-parameters of the deep convolutional generative adversarial network via the proven population-based genetic algorithm. The constructed deep CNN classifier predicts COVID-19 from chest CT images with high sensitivity. Moreover, the extensive experiments show that the proposed method exhibits an accuracy of 94.50%, which is higher than the pre-trained CNN models such as AlexNet, VggNet, and ResNet. In future, the developed CAD model could be extended to predict COVID-19 alongside other lung abnormalities such as viral pneumonia and bacterial pneumonia, which can assist radiologists in making decisions. Acknowledgements Authors would like to thank UGC, New Delhi, for the financial support received under UGC-SAP No. F.5-6/2018/DRS-II (SAP-II).
References 1. Phelan AL, Katz R, Gostin LO (2020) The novel coronavirus originating in Wuhan, China: challenges for global health governance. JAMA 323(8):709–710 2. Lai CC, Shih TP, Ko WC, Tang HJ, Hsueh PR (2020). Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and corona virus disease-2019 (COVID-19): the epidemic and the challenges. Int J Antimicrob Agents 105924 3. Burdorf A, Porru F, Rugulies R (2020) The COVID-19 (Coronavirus) pandemic: consequences for occupational health. Scand J Work Environ Health 46(3):229–230 4. Jawerth N (2020) How is the COVID-19 virus detected using real time RT-PCR. IAEA Bull, 8–11 5. Li K, Fang Y, Li W, Pan C, Qin P, Zhong Y, Li S (2020) CT image visual quantitative evaluation and clinical classification of coronavirus disease (COVID-19). Eur Radiol 30(8):4407–4416 6. Sekhar R, Sasirekha K, Raja PS, Thangavel K (2021) A novel GPU based intrusion detection system using deep autoencoder with Fruitfly optimization. SN Appl Sci 3(6):1–16 7. Li Q, Cai W, Wang X, Zhou Y, Feng DD, Chen M (2014) Medical image classification with convolutional neural network. In: 2014 13th International conference on control automation robotics & vision (ICARCV). IEEE, pp 844–848 8. Frid-Adar M, Diamant I, Klang E, Amitai M, Goldberger J, Greenspan H (2018) GANbased synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing 321:321–331 9. Silva P, Luz E, Silva G, Moreira G, Silva R, Lucio D, Menotti D (2020) COVID-19 detection in CT images with deep learning: a voting-based scheme and cross-datasets analysis. Inform Med Unlocked 20:100427 10. Abbasi WA, Abbas SA, Andleeb S, Ul Islam G, Ajaz SA, Arshad K, Abbas A (2021) COVIDC: an expert system to diagnose COVID-19 and predict its severity using chest CT scans: application in radiology. Inf Med Unlocked 23:100540 11. Rohila VS, Gupta N, Kaul A, Sharma DK (2021) Deep learning assisted COVID-19 detection using full CT-scans. Internet of Things 14:100377 12. Mishra AK, Das SK, Roy P, Bandyopadhyay S (2020) Identifying COVID19 from chest CT images: a deep convolutional neural networks-based approach. J Healthc Eng 13. Khadidos A, Khadidos AO, Kannan S, Natarajan Y, Mohanty SN, Tsaramirsis G (2020) Analysis of COVID-19 infections on a CT image using deep sense model. Front Public Health 8 14. Sen S, Saha S, Chatterjee S, Mirjalili S, Sarkar R (2021) A bi-stage feature selection approach for COVID-19 prediction using chest CT images. Appl Intell 1–16 15. Alshazly H, Linse C, Barth E, Martinetz T (2021) Explainable COVID-19 detection using chest CT scans and deep learning. Sensors 21(2):455 16. Singh V, Poonia RC, Kumar S, Dass P, Agarwal P, Bhatnagar V, Raja L (2020) Prediction of COVID-19 corona virus pandemic based on time series data using support vector machine. J Discrete Math Sci Crypt 23(8):1583–1597 17. Bhatnagar V, Poonia RC, Nagar P, Kumar S, Singh V, Raja L, Dass P (2021) Descriptive analysis of COVID-19 patients in the context of India. J Interdisc Math 24(3):489–504 18. Kumari R, Kumar S, Poonia RC, Singh V, Raja L, Bhatnagar V, Agarwal P (2021) Analysis and predictions of spread, recovery, and death caused by COVID-19 in India. Big Data Min Anal 4(2):65–75 19. Frid-Adar, Maayan et al (2018) GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing 321:321–331 20. 
Goel T, Murugan R, Mirjalili S, Chakrabartty DK (2021) Automatic screening of COVID-19 using an optimized generative adversarial network. Cognitive Comput 1–16 21. Mithuna KT, Sasirekha K, Thangavel K (2017) Metaheuristic optimization algorithms based feature selection for fingerprint image classification. In Proceedings of the international conference on intelligent computing systems (ICICS 2017–Dec 15th –16th 2017) organized by Sona College of Technology, Salem, Tamilnadu, India
22. Alarsan FI, Younes M (2021) Best selection of generative adversarial networks hyperparameters using genetic algorithm. SN Comput Sci 2(4):1–14 23. Zhang S, Gong Y, Wang J, Zheng N (2016) A biologically inspired deep CNN model. Advances in multimedia information processing, Lecture Notes in Computer Science, vol 9916 24. https://github.com/UCSD-AI4H/COVID-CT. Last accessed on 05.10.2021 25. Talo M, Yildirim O, Baloglu UB, Aydin G (2019) Acharya, U. R.: Convolutional neural networks for multi-class brain disease detection using MRI images. Comput Med Imaging Graphics 78:101673 26. Sasirekha K, Thangavel K (2020) Biometric face classification with the hybridised rough neural network. Int J Biometrics 12(2):193–217
Intelligent Fractional Control System of a Gas-Diesel Engine Alexandr Avsievich , Vladimir Avsievich , and Anton Ivaschenko
Abstract The paper presents a new intelligent control system aimed at improving the operational and technical characteristics of an internal combustion engine running on a mixture of diesel fuel and natural gas. The proposed solution is intended for use in large power units, which place high demands on efficiency and reliability, for example, diesel locomotives and vessels. A new digital computing algorithm is proposed for fractional proportional–integral–differential control to improve the stability and quality of transient processes in a gas-diesel engine. The controller coefficients are determined by an intelligent algorithm, and the integral link is replaced with a differintegral that takes the prehistory into account. The conclusions and results of the study substantiate the advantages of implementing the proposed control algorithm in terms of the transient process time and the integral assessment of quality in comparison with the classical algorithms. The developed control system makes it possible to reduce fuel consumption and increase the safety of the gas-diesel internal combustion engine while reducing the time of the transient process by implementing fractional control of the crankshaft rotation frequency. Keywords Automation systems · Intelligent control systems · Transportation · Automated transport · Simulation
1 Introduction In modern operating conditions of internal combustion engines (ICE) of large power units, increased requirements are imposed on power, efficiency, reliability and environmental friendliness. For example, to improve the technical and operational characteristics of the locomotive diesel engine, it is being re-equipped to operate on
A. Avsievich · V. Avsievich Samara State Transport University, 1 Bezymyanny, 16, Samara, Russia A. Ivaschenko (B) Samara State Technical University, Molodogvardeyskaya, 244, Samara, Russia © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 M. Saraswat et al. (eds.), Congress on Intelligent Systems, Lecture Notes on Data Engineering and Communications Technologies 111, https://doi.org/10.1007/978-981-16-9113-3_28
gas. At the same time, unlike gasoline engines, the modernization of a diesel engine requires significant changes in the standard power supply system. Improvement of existing and development of new gas-diesel structures are associated with the measurement and control of the parameters of the gas-diesel mode and the formation of fuel supply models on this basis. An adequate choice of control and executive regulating devices makes it possible to increase the efficiency of the gas diesel. To control a propulsion system of this type, it is relevant to use modern control algorithms. This paper proposes to implement a fractional proportional–integral–differential (PID) controller, whose features and the advantages of whose practical implementation are described below.
2 Motivation The main difficulty in the transition of a diesel internal combustion engine [1] to gas [2, 3] is associated with the way the fuel is ignited in the combustion chamber. This process in diesel engines occurs due to the high pressure of the fuel–air mixture; however, strong compression of the gas does not create conditions for its combustion. To solve this problem, both types of fuel are used: diesel fuel in a small amount of an ignition dose in the mixture is assigned the role of a detonator, and then the engine completely switches to gas. In this case, the maximum reduction in fuel costs is achieved with a high percentage of replacement of diesel fuel with gas, which, in turn, is determined by the gas supply control system and high-precision limitation of the ignition dose of diesel fuel. Thus, the urgent task is to increase the efficiency of the internal combustion engine of a diesel locomotive, taking into account its high power by improving the control system [4, 5]. The supply of gaseous fuel to the combustion chamber is carried out in a metered manner, for which an electric gas valve with nozzles or metering devices is additionally installed. The calculation of the dose of gaseous fuel is carried out in an electronic control unit based on a microcontroller and sensors that control the operation of the engine. Existing automatic fuel control systems are based on the implementation of a PID controller. The disadvantage of implementing a PID controller is the impossibility of reducing the time of the transient process when switching from one load speed mode to another without additional significant fuel consumption. Implementation of intelligent technologies is frequently considered to improve modern PID controllers [6–8]. Intelligent PID controllers are able to consider uncertainty of the unknown factors, which might be highly nonlinear and/or time varying. One of the ways to introduce intelligence is to implement the fractional order [9–11]. Study of the existing research results allowed identifying the issue, which consists in low utilization of intelligent PID controllers in locomotive gas-diesel engines. Controllers of this type were not previously used due to the excessive complexity
of incomplete integration and differentiation in the control system. However, for modern digital microcontrollers, this problem is insignificant. Considering this experience, we propose to implement a fractional PID algorithm for intelligent control of the crankshaft rotation frequency. A hypothesis was proposed about replacing the integrating element with a differintegral that takes the prehistory into account. The main idea is to change the integrating and differentiating components of the regulator so as to eliminate the overshoot of the crankshaft speed above the target value. When changing the dynamic operating mode of the engine, it is necessary to compensate for large deviations. At the end of the transient process, the deviation has already been reduced, but the integrator still remembers large deviations and tries to compensate for them. The differentiating component does not correct the situation, which is typical for objects with delay. It was decided to add a power-law function to the integrator, which would reduce the contribution of earlier deviations as the deviation decreases. This can be done by generalizing the Cauchy repeated integration formula. The combined operator of integration and differentiation is called fractional integro-differentiation. Therefore, in contrast to the classical PID controller, in the fractional PID controller [12–14], when determining the current impact, a different contribution of the components is assessed taking into account the history. First ideas of fractional PID controller implementation for a locomotive gas-diesel engine are presented in [15, 16]. This paper presents new results of the digital implementation of the algorithm and its testing. The proposed solution makes it possible to more accurately calculate the doses of gaseous fuel when changing the speed and load modes of the power unit. Considering the high power of locomotive engines, this modification improves the efficiency and safety of the locomotive gas-diesel internal combustion engine while maintaining its reliability.
3 Control System The internal combustion engine of a diesel/gas power unit (see Fig. 1) operates on ultra-lean mixtures and demonstrates high efficiency with a high percentage of diesel fuel substitution with gas, which, in turn, is determined by the gas control system and high-precision limitation of the ignition dose of diesel fuel. In the modernized gas-diesel engine, gas is supplied to the combustion chamber of each cylinder in a metered manner, for which an electric gas valve with nozzles is additionally installed. Ignition is provided by an unchanged pilot dose of diesel fuel. The calculation of the dose of gaseous fuel at each cycle is carried out in an electronic control unit based on a microcontroller and sensors that control the crankshaft rotation speed. Improvement of existing and creation of new gas-diesel designs are associated with the development and implementation of modern elements and devices
Fig. 1 Gas-diesel engine
of computer technology and control systems. The general scheme of the control system operation is shown in Fig. 2. The supply of gas fuel to the combustion chamber is carried out in a metered manner, for which an electric gas metering device is additionally installed. The calculation of the dose of gaseous fuel is carried out in an electronic control unit based on a microcontroller.
Fig. 2 Modified control system
To improve the control system for fuel supply of the diesel engine, the architecture was extended by the air flow sensor, exhaust gas temperature sensor, the block for diagnosis of coking of engine outlet windows and the block for generating a correction signal of engine operation. The control unit receives signals from sensors that monitor the operation of the engine, analyzes the received data and corrects the system operation. Main sensors are temperature sensor, phase mark sensor and crankshaft speed control sensor. The main dynamic characteristics of a locomotive duel fuel engine that determine the fuel consumption include the time of the transient process and the corresponding overshoot that appears due to excessive correction. In this regard, it is promising to modernize the control system of a gas-diesel internal combustion engine by implementing new elements and devices of a digital control system. In addition, the existing control systems do not take into account the coking of engine outlet windows, which also leads to increased fuel consumption and an increase in the likelihood of a diesel igniting. To solve this problem, we propose to implement diagnostics of outlet windows by indirect signs.
4 Digital Computing Algorithm Description

To solve the stated problem, a digital recurrent algorithm for fractional proportional–integral–differential control is proposed. To substantiate the algorithm, a generalization of frequency stability criteria for a fractional PID control system for controlling an internal combustion engine was carried out, and a simulation model of a control system for a gas-diesel generator was constructed and investigated. Let us introduce the following notation:
w(t) — crankshaft rotation frequency at time t, rpm;
w* — target value of the crankshaft rotation speed, rpm;
w_max — maximum value of the crankshaft rotation speed, rpm;
e(t) = (w* − w(t)) / w_max · 100% — mismatch value, %;
u(t) — control action, %;
V_DT(t) = const — the proportion of diesel fuel, l;
V_G(t) = K_G · u(t) — gas fraction, taking into account the engine operating conditions, m³.
At present, the best quality of automatic frequency control is provided by a microprocessor system, in which a classical PID controller or its modifications are used as a control algorithm. A known disadvantage of the PID controller is the negative effect of differentiation, which consists in an excessive increase in the section of the amplitude–frequency characteristic of the circuit, in which the phase shift is already so great that differentiation does not correct the situation, which is typical for objects with delay. As a result, a reduction in the transient time leads to an increase in the overshoot of the crankshaft speed w(t) above the target value w*.
This problem can be solved by digital implementation of the fractional PID controller:

u(t) = K_P e(t) + \frac{1}{T_I^{\alpha}} I_{0t}^{\alpha}(e(t)) + T_D^{\beta} D_{0t}^{\beta}(e(t)),   (1)
where K_P is a proportional coefficient (gain), T_I is a constant of integration, T_D is a constant of differentiation, and t is time. In contrast to the classical PID controller, in which the value of the integrating component is affected by the initial deviation of the controlled value, and the differentiating component is proportional to the rate of change of the deviation, in the fractional PID controller, when determining the current effect, a different contribution of the components is assessed taking into account the history. When implementing the digital algorithm of fractional PID control, the transition from the continuous form of the description to the discrete one is made. In this case, the central task is to choose a method for calculating its integral and differential components, which can provide the required accuracy. For this, numerical methods for calculating the fractional integral were proposed, studied and compared, as a result of which a method was chosen that has the smallest error:

I_{0t}^{\alpha}(e(t)) \approx \frac{T_c^{\alpha}}{\Gamma(\alpha)} \sum_{i=1}^{n} \frac{\Gamma(i + \alpha - 1)}{\Gamma(i)}\, e_{n-i},   (2)
where T_c is the sampling interval. When calculating the fractional differential component of Eq. (1), the Hölder derivative of the function e(t) was used:

D_{0t}^{\beta}(e(t)) \approx \frac{1}{T_c^{\beta}\,\Gamma(-\beta)} \sum_{i=1}^{n} \frac{\Gamma(i + \beta - 1)}{\Gamma(i)}\, e_{n-i},   (3)
Thus, from the continuous fractional PID control law (1), a digital one is obtained by replacing the integral link with expression (2) and the differential link with expression (3). In this case, the expression for calculating the control action will be as follows (4). In practice, instead of calculating the absolute values of the control signal, it is more convenient to calculate its increments Δu_n at each clock cycle. For this, a recurrent control algorithm was developed:
u_n = u_{n-1} + \Delta u_n,   (5)
where n is the step of the algorithm. As a result, we get the expression:

u_n = u_{n-1} + q_0 e_n + q_1 e_{n-1} + q_2 e_{n-2} + \varepsilon_n,   (6)
As can be seen from (6), the refinement coefficient ε_n contains a sum that needs to be recalculated at each new sample, but the calculation of the refinement coefficient can be limited to |ε_n| < ς, thereby reducing the number of operations, where ς is the permissible calculation error.
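To illustrate the discrete algorithm, the following minimal Python sketch evaluates the control action of Eq. (1) with the fractional integral (2) and derivative (3) computed over a stored error history. It is only an illustration of the published formulas, not the authors' controller firmware: the sampling interval Tc = 0.1 s and the error values in the example are assumptions, while the gains and fractional orders are taken from Table 1.

```python
import math

def gl_weights(order, n):
    # Kernel weights Gamma(i + order - 1) / Gamma(i) for i = 1..n, as in Eqs. (2)-(3)
    return [math.exp(math.lgamma(i + order - 1) - math.lgamma(i)) for i in range(1, n + 1)]

def fractional_pid(err, Kp, Ti, Td, alpha, beta, Tc):
    """Control action u_n for an error history err = [e_0, ..., e_n], oldest first."""
    n = len(err) - 1
    u = Kp * err[-1]                                      # proportional term of Eq. (1)
    if n == 0:
        return u
    past = [err[n - i] for i in range(1, n + 1)]          # e_{n-1}, e_{n-2}, ..., e_0
    wi, wd = gl_weights(alpha, n), gl_weights(beta, n)
    integ = (Tc ** alpha / math.gamma(alpha)) * sum(w * e for w, e in zip(wi, past))   # Eq. (2)
    deriv = sum(w * e for w, e in zip(wd, past)) / (Tc ** beta * math.gamma(-beta))    # Eq. (3)
    return u + integ / (Ti ** alpha) + (Td ** beta) * deriv

# Illustrative call with the gains from Table 1 (alpha = 0.43, beta = 0.68); the error
# history and Tc are assumed values, not experimental data.
u_n = fractional_pid([6.0, 3.5, 1.2, 0.4], Kp=2, Ti=0.01, Td=0.005,
                     alpha=0.43, beta=0.68, Tc=0.1)
```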
5 Implementation and Tests

The transient process is one of the key characteristics of a control system. Providing intelligent control requires an adaptive reaction to various deviations. This feature can be formalized by memory: the ability of an intelligent control system to consider time when generating the feedback. The integral element of the regulator is forced to compensate for the large deviation at the beginning of the transient process. At the end of the transient process, the deviation has already been reduced, and the integrator still "remembers" large deviations and tries to compensate for them. The differential element partially solves this problem, but is inefficient in inertial systems. The proposed fractional PID controller allows introducing a forgetting function that reduces the contribution of earlier deviations as the output comes closer to the target value. This idea is illustrated in Fig. 3. Figure 4 shows the resulting difference between the formation of the control action of the classical PID controller and the fractional PID controller. A study of the operation of a fractional PID controller as part of an automatic control system based on a laboratory bench was carried out. The experimental data (see Figs. 5 and 6) were obtained over the same time interval from t = 0 to t = 3.5 × 10^5 ms with a change in the crankshaft speed setting at a time interval of 10 s.
Fig. 3 Differences between PID and fractional PID control action
Fig. 4 Differences between PID and fractional PID crankshaft rotation frequency
To process the results of the experiment, we used the software included in the laboratory setup. The results of a comparative analysis of the PID and fractional PID control algorithms are shown in Table 1. It can be seen that the use of fractional PID control provides a fuel economy Q of 5%, while the crankshaft speed overshoot above the target value is less by 59.1%, and the time of the transient process is less
Fig. 5 Implementing PID and fractional PID controllers with 770 rpm
Fig. 6 Experimental results
Table 1 Comparing PID and fractional PID

|       | K_P | T_I  | T_D   | α    | β    | w, % | t_p, s | Q      |
|-------|-----|------|-------|------|------|------|--------|--------|
| PID   | 2   | 0.01 | 0.005 | –    | –    | 8.3  | 2.46   | 0.0057 |
| Fract | 2   | 0.01 | 0.005 | 0.43 | 0.68 | 3.4  | 2.31   | 0.0053 |
by 7% (t_p, s). Recurrent implementation of the control algorithm allows the fractional PID controller to achieve a time complexity equal to that of the classical one. Thus, the advantage of the fractional PID control algorithm with respect to the time of the transient process in comparison with the classical PID controller has been revealed. Based on the results obtained, it was concluded that the proposed modernization provides 1–2 classes of speed control accuracy in accordance with the standard.
The calculations showed that when implementing the proposed control system for internal combustion engines of diesel locomotives operating on mixed fuel, the annual economic effect per diesel locomotive will amount to 25,200 rubles. According to the inventory, the Russian Railways fleet contains 536 passenger diesel locomotives, 3750 freight locomotives and 5989 shunting locomotives. As a result, the expected savings from the proposed modernization of the control system are 258,930,000 rubles per year.
6 Conclusion

Thus, the advantage of the model with the fractional PID algorithm in terms of the time of the transient process and the integral estimation in comparison with the model of the classical PID algorithm is revealed. The main result is that when using the fractional PID algorithm, fuel economy is obtained, and the controller reduces the crankshaft speed overshoot over the target value while also reducing the transient time. The results of the study of the digital control system of a gas-diesel internal combustion engine of a diesel locomotive, which implements the fractional PID control algorithm, in the course of simulation showed an average advantage in the overshoot of the crankshaft rotation speed over the target value of 45.5%, in the transient time of 33.3% and in the integral quality assessment of 6.8% compared to the classic PID controller. Tests on the basis of a laboratory stand showed that when using a classical PID controller, the fuel consumption was 0.0057 m3/h, and when using a fractional PID controller, 0.0053 m3/h, which gives fuel savings up to 5% while increasing maneuverability and reliability. The idea of implementing differintegrals in intelligent control units turns out to be promising for complex systems. Therefore, the next step of this research is to apply the proposed approach of fractional regulation, which considers the transient process pre-history, in other problem domains.
References

1. Taylor CF (1985) The internal combustion engine in theory and practice, vol 1, 2nd edn, The MIT Press, revised, 584p
2. Benajes J, Antonio G, Monsalve-Serrano J, Boronat V (2017) Dual-fuel combustion for future clean and efficient compression ignition engines. Appl Sci 7:36. https://doi.org/10.3390/app7010036
3. Baêta J, Amorim R, Valle R, Barros J, Carvalho R (2005) Multi-fuel spark ignition engine. Optim Perform anal. https://doi.org/10.4271/2005-01-4145
4. Shatrov M, Sinyavski V, Dunin A, Shishlov I, Vakulenko A, Yakovenko A (2018) Using simulation for development of the systems of automobile gas diesel engine and its operation control. Int J Eng Technol (UAE) 7:288–295. https://doi.org/10.14419/ijet.v7i2.28.12947
5. Volpato FO, Theunissen F, Mazara R (2006) Control system for diesel—compressed natural gas engines. https://doi.org/10.4271/2006-01-3427
6. Fliess M, Join C (2008) Intelligent PID controllers. In: 16th Mediterranean conference on control and automation, pp 326–331. https://doi.org/10.1109/MED.2008.4601995
7. Huang C-N, Chung A (2016) An intelligent design for a PID controller for nonlinear systems. Asian J Control 18(2):447–455
8. Arunachalam SP, Kapa S, Mulpuru SK, Friedman PA, Tolkacheva EG (2016) Intelligent fractional-order PID (FOPID) heart rate controller for cardiac pacemaker. In: 2016 IEEE HI-POCT conference, pp 105–108. https://doi.org/10.1109/HIC.2016.7797708
9. Pan I, Das S (2013) Intelligent fractional order systems and control: an introduction. Stud Comput Intell 438:298p
10. Guan H, Chen Y (2018) Design of fractional order PID controller for velocity of micro intelligent vehicles. ICIC Express Lett 12(1):87–96
11. Kanagaraj N, Al-Dhaifalla M, Nisar KS (2017) Design of intelligent fuzzy fractional-order PID controller for pressure control application. In: 2017 International conference on intelligent computing, instrumentation and control technologies (ICICICT), pp 525–530. https://doi.org/10.1109/ICICICT1.2017.8342618
12. Caponetto R, Dongola G, Fortuna L et al (2010) Fractional order systems. Modelling and control applications. World Scientific, Singapore, 200p
13. Bettoua K, Charef A (2009) Control quality enhancement using fractional PIλDμ controller. Int J Syst Sci 40(8):875–888
14. Panteleev AV, Letova TA, Pomazueva EA (2018) Parametric design of optimal in average fractional-order PID controller in flight control problem. Autom Remote Control 79(1):153–166
15. Ivaschenko A, Avsievich V, Avsievich A (2020) Fractional controlling system of an autonomous locomotive multifuel engine. In: 2020 IEEE conference on industrial cyber-physical systems (ICPS), Tampere, pp 425–428
16. Ivaschenko A, Avsievich V, Avsievich A (2020) Fractional control system simulation to modernize a locomotive dual-fuel engine. In: Proceedings of the 34th annual European simulation and modelling conference 2020, pp 242–244
Analysis on Advanced Encryption Standard with Different Image Steganography Algorithms: An Experimental Study Alicia Biju , Lavina Kunder, and J. Angel Arul Jothi
Abstract In this ever-changing world of technology, data security is of utmost importance. This research paper focuses on identifying the best combination of cryptography and steganography algorithms for securing data. The proposed approach developed a complete end-to-end system that encrypted text message using the Advanced Encryption Standard algorithm. The encrypted message was then embedded onto images using steganography techniques like Least Significant Bit, Discrete Cosine Transform and Discrete Wavelet Transform. The message was later decrypted and extracted. The performance of the algorithms was evaluated using various metrics. The best performing combination of algorithms for each metric was then identified. Keywords Image steganography · Cryptography · Advanced encryption standard (AES) · Least significant bit (LSB) · Discrete cosine transform (DCT) · Discrete wave transform (DWT) · Cyber security
A. Biju · L. Kunder (B) · J. Angel Arul Jothi
Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus, DIAC, Dubai, United Arab Emirates
J. Angel Arul Jothi
e-mail: [email protected]

1 Introduction

1.1 Cryptography

The study of classified and protected communication methods that allows only the transmitter (source) and intended receiver (destination) of a message to view its contents with the help of different codes is described as Cryptography [1]. It can be used to safeguard data from being stolen or modified and for user authentication. It is closely related to cryptology, as well as cryptanalysis [2]. Cryptology is the study of mathematics, such as number theory, and the use of formulas and algorithms to
Fig. 1 Secret key cryptography. Note the same secret key is shared between the sender and the receiver
solve problems and Cryptanalysis is the study of ciphertext, ciphers, and cryptosystems with the goal to learn how it works [3, 4]. Cryptography is mostly identified with encoding plaintext (ordinary text or clear text) into ciphertext with the help of encryption and then decrypting it [2]. The four major objectives of cryptography are confidentiality, integrity, non-repudiation and authentication [5]. Secret key cryptography (Symmetric Cryptography), public key cryptography and hash functions are the three different types of cryptography. In secret key cryptography, a unique secret key is allocated to both the sender and the receiver. The sender gets the unencrypted message, encrypted using this common secret key and the respective algorithm. The encrypted message is then sent to the destination. The recipient uses the same key to decrypt the encrypted message. This helps extract the decrypted message. Some examples include Data Encryption Standard (DES), Advanced Encryption Standard (AES) and Blowfish. Figure 1 details the secret key cryptography. In public key cryptography (Asymmetric Cryptography) there are two associated keys (public and private). The public key can be freely distributed and is used for encryption, but the private key is kept a secret and is used for decryption. The sender gets the unencrypted message, encrypted using the public key and the respective algorithm. The encrypted message is then sent to the destination. The recipient uses the private key to decrypt the encrypted message. This helps extract the decrypted message. Some of the examples are Diffie–Hellman key exchange and RSA (Rivest–Shamir–Adleman). Figure 2 details the public key cryptography. Hash Functions do not use a key. The plain text is hashed with a fixed-length hash value that prevents the plain text’s data from being retrieved. Some examples of hash functions include SHA-1 and SHA-2 [6].
Fig. 2 Public key cryptography. Note different keys are shared between the sender and the receiver
1.2 Steganography Steganography is defined as the procedure of hiding secret or encrypted data (message) in an ordinary file, message, or object [7]. This technique can be used to disguise any type of content, such as images, audio, video, text, etc., within any other type of digital content. In simple words, it is the art of hiding one file within another [8]. Invisibility, undetectability, signal-to-noise ratio, capacity, tamper resistance and robustness are the factors of good steganography [9]. There are mainly 3 different types of steganography techniques. They are: Least Significant Bit (LSB), Discrete Cosine Transform (DCT) and Discrete Wave Transform (DWT). Based on the category of the cover objects used, steganography can be split into 5 main types. Text Steganography involves hiding sensitive information in text files. Video Steganography is very useful for hiding large amounts of data within moving images and sounds that constitute a digital video format. Network/Protocol Steganography is the embedding of information in network control protocols used in data communication, e.g., TCP, ICMP, UDP, etc. Most commonly, the data is hidden in the headers of the TCP/IP packets or in other fields that are unimportant and optional. Image Steganography, the main focus of this work, involves disguising data using images. This technique is widely used due to the vast number of bits present in images that are easy to manipulate therefore generating a new “stego” image that looks similar to the original image to the naked eye but hides sensitive information within. In Audio Steganography the data is hidden by altering the binary sequence of an audio signal. This is a much more exhausting process as compared to Image Steganography despite the similarity due to the ease by which the difference in the binary sequence can be detected [10].
Fig. 3 Working of steganography
Figure 3 shows the steps in steganography. A cover file (X) of any digital format is chosen along with the secret message (M) to be embedded. For added protection, the message can be encrypted. The algorithm used in the steganographic encoder is dependent on the type of cover file and the confidential information, both of which are fed into the algorithm to generate a key (K) and “stego” object. The key serves as the secret keeper and is used for both encrypting and decrypting the file. Size of the key depends on the method used. The “stego” object generated is similar to the original object and is impossible to distinguish using human senses. There are very subtle changes in the binary sequences within the file that hide the secret text. The stego object is then sent to the intended receiver through any communication channel or platform available. The receiver also needs to have access to the key generated while encoding, otherwise he/she will be unable to decode the file. The key as well the stego object is fed into the steganographic decoder. The decoder also contains the same algorithm used for encrypting except this time, it is used to access the hidden binary sequence. If the key entered is correct, then the decoder generates the secret message that was embedded into the stego object. If the key is incorrect, the receiver loses access to the secret message [10]. This paper is organized as follows: Sect. 2 details the literature survey. Section 3 provides the methodology. The performance metrics are elaborated in Sect. 4. Results and discussions are given in Sect. 5. The conclusion is given in Sect. 6.
2 Literature Survey

This section summarizes the related works. Srivatsava, J. et al. had explored an implementation of the triple DES algorithm. An image compression procedure is additionally
utilized here to compress and store substantial amount of information. As per the algorithm used, 2 keys are generated. The first key is used to hide data. The second is used to update or eliminate unessential data; therefore, reducing content length as well as building security. LSB and pixel value differencing (PVD) steganography was mainly employed here to embed the encrypted keys [11]. Different cryptography algorithms have different encryption times, output images and security level. Damrudi M. et al. have explored AES, RSA, DES, 3DES, and blowfish algorithms using peak signal-to-noise ratio (PSNR), mean square error (MSE) and the histogram comparison of the input and output image. The experimental results looked for high PSNR and low MSE for best results; for which all the algorithms had appropriate qualities of output image but when ranked, 3DES showed the best results. RSA had the highest key length and encryption time whereas DES had the lowest in both fields. AES and blowfish on the other hand shared the same key length but AES proved to have a much faster encryption time. Histogram analysis was used to measure the possibility of a successful attack the result of which showed that all the algorithms were equally susceptible to attacks but the success rate was low across all of them [12]. Saikumar, I. et al. elaborated on the 16 rounds of operation of DES cryptography techniques that proved to be very useful in providing an increased level of security. Since DES is a symmetric algorithm, only a singular key is generated for both encrypting and decrypting. Although, more keys can be generated for increased security [13]. Steganography and cryptography implemented together give the best security. Which is why Mittal, S. analyzed the difference when they were both implemented together and then separately based on various parameters and then implemented them simultaneously and observed the differences based on the same parameters. For steganography, LSB technique was used and for cryptography, Rivest–Shamir– Adleman (RSA) algorithm was used. Through space complexity analysis, histograms for input and output images were analyzed which showed that embedding plain text or ciphertext in the cover image, both generate similar histograms. Hence it was established that the image is not altered as much making it safe against hackers who broke into encrypted data by observing the differences in pixels of the images [14]. Bhargava S. et al. have implemented a user interface which was established for easier encryption and decryption using RSA and LSB techniques. Multiple images were also used to check the differences in the output images for the same algorithm. The MSE and PSNR values were obtained from different images to establish which kind of images could be best used as cover input images. It was also established that the brute force attack was hard on RSA algorithms as they held the capability of using both LSB and DWT embedding techniques. When the algorithms were used simultaneously, it was harder to forcibly decode the data when deprived of the key [15]. AbdelWahab, O. et al. explored the LSB and DCT steganographic techniques. They proceeded to draw a comparison between the two to understand which one among them would be the best choice. In the first case, only LSB technique was used with no encryption whereas in the second, the message was encrypted and then
LSB technique was used. Furthermore, DCT technique was used on the stego image generated. Their execution was evaluated on the basis of mean square error and peak signal-to-noise ratio. It was found that the combination of the encryption, LSB and DCT technique reduced the overall size of the file, made it easy for transmission and also ensured safety of the data [16]. Zagade, S et al. implemented a Discrete Wavelet Transform steganography technique in collaboration with biometrics. Difference in skin tones over a region was the biometric feature used for steganography. Skin tone detection was done using hue, saturation and value (HSV) color space. Data was secretly concealed in one of the high frequency sub-bands of the skin tone by tracing the pixels present there. Another technique used was cropping of images to hide data. It was concluded that tracing tone differences allowed higher security in different images and almost made data perceptually invisible [17]. Authors Goel S. et al. focused on comparing the least significant bit (LSB), discrete cosine transform (DCT) and discrete wavelet transform (DWT) based steganography techniques. The work explained how the LSB technique embedded data onto images by manipulating the least significant bits of the pixels in the cover image while the DCT and the DWT techniques converted the cover image into the frequency domain where the data was embedded into the respective frequency components. The execution of these techniques was evaluated based on MSE, PSNR, capacity and robustness measures. It was found that while LSB allowed the maximum capacity of data to be embedded, DCT provided better quality of the image and DWT offered better robustness and security. DCT and DWT also seemed to have a higher invisibility as compared to LSB and were therefore chosen as the better techniques [18]. Vimal G. et al. showed that discrete wavelet transform (DWT) could be used for image steganography. The transform had resolved the issue of security and adaptability. Even though a shift in the wavelet was used, no stenographic structure was completely immune to the attacks. The wavelet space which was a frequency attribute of localization made strong geometric attacks that helped analyze image characters well. This property would help increase the embedded area while also increasing protection. The DWT results for high imperceptibility, accuracy and PSNR were in the range of 30–54 db [19]. The work by Rabie T. and et al. proposed a discrete cosine transform based steganography approach. To help optimize the uniformity level among pixels in a segment, a precise segmentation process which used a region-based growing segmentation method was used to maximize the hiding potential while achieving an improved stenographic output. In contrast to challenging segmentation-based techniques, the suggested methodology was able to attain higher efficiency and PSNR values [20]. The comparison of the DCT and DWT methods was the basis for the article written by Desai J. et al. The cover image was converted from the spatial domain to the frequency domain, and the hidden picture was inserted into the frequency attributes of the cover image, by using the DCT and DWT algorithms. The parameters MSE, PSNR, and MSE/PSNR were used to test the efficiency and contrast of these two techniques [21].
The applications of deep neural networks have expanded to steganography which is seen in the work of Pham Huu Q. et al. Deep learning techniques were used for hiding audio into images and the architecture contained two sub-models. Two convolutional neural networks were used to conceal the secret audio into an image and then recover the data later. The integrity of both the image and the audio was retained. Through experimental results the length of the hidden audio was improved in comparison with the traditional methods [22]. Muhammad H. et al. have delved into using Stretch Technique combined with LSB to hide secret messages. The stretch technique was extensively used to enlarge the image size horizontally on the right most pixel block, in proportion to the length of the secret message. It was found that the combination of LSB steganography and the Stretch technique can enlarge an image automatically according to the number of secret messages. This technique overcomes the size issue of the message but results in increased size of the generated stego image. A solution can be found by researching implementation of image stretch both horizontally and vertically [23]. The aim of this work is to integrate both cryptography and steganography. There is a lack of significant research in this area. Nevertheless, we were able to make our work unique by using an extensive image repository which consists of colored and grayscale images. We implemented three different combinations of algorithms and a comparison study was drawn to narrow down to the best possible values for the chosen performance metric.
3 Methodology Cryptography is the way of concealing the meaning of the data and steganography is the way of disguising the data’s existence. In this work we aim to experimentally analyze and find the efficiency of combining the concepts of cryptography and steganography in order to discreetly and safely send sensitive data using images. We used the Advanced Encryption Standard (AES) method as the cryptography technique. This was experimented with three different image steganography techniques like Least Significant Bit (LSB), Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT). Figure 4 details the proposed method which includes two components. Initially, a password and a secret message are input by the user. The first component is the encryption and embedding block which encrypts and encodes the secret message. For encryption, a 256-bit AES encryption key is given to the AES encryption block. The message entered by the user is then encrypted using the 256-bit AES encryption algorithm. After the message is encrypted, it is then embedded and encoded into the cover image using any one of the steganography techniques (LSB, DCT, DWT) mentioned earlier to generate the stego image. The second component is the decoding and the decryption block. The stego image is given as an input to undergo a process of decoding using the steganography techniques (LSB, DCT, DWT). This helps extract the previously embedded encrypted
Fig. 4 Proposed method
message. The encrypted message is then decrypted using the same 256-bit AES encryption key that goes through AES decryption. Ultimately the secret message embedded was retrieved.
3.1 Advanced Encryption Standard (AES)

The Advanced Encryption Standard (AES) is a symmetric-key block cipher published by the National Institute of Standards and Technology (NIST) in December 2001. AES is a non-Feistel cipher that encrypts and decrypts data blocks of 128 bits. It has three definite versions: with 10, 12 and 14 rounds. Each of these three versions uses various cipher key sizes of 128, 192 or 256 bits. However, the round keys are constantly 128 bits. The AES encryption algorithm begins with SubBytes as the first transformation, where a byte is substituted as two hexadecimal digits in the matrix. This operation involves sixteen independent byte-to-byte transformations, which is done by referring to the SubBytes transformation table, as shown mathematically by Eq. (1). The inverse transformation for SubBytes is InvSubBytes, which refers to the InvSubBytes table and is carried out in the AES decryption algorithm, as represented in Eq. (2).

\text{SubBytes} \rightarrow d = X S_{r,c}^{-1} \oplus y   (1)

\text{InvSubBytes} \rightarrow \left(X^{-1}(d \oplus y)\right)^{-1} = \left(X^{-1}\left(X S_{r,c}^{-1} \oplus y \oplus y\right)\right)^{-1} = \left(S_{r,c}^{-1}\right)^{-1} = S_{r,c}   (2)
where X denotes the matrix, S_{r,c} represents the individual byte of the matrix, d is the substituted byte and y is the operation term. The second transformation for AES encryption is ShiftRows, where a Shift Left process has been implemented on each row of the matrix. This process is also known as Permutation. In the decryption process the InvShiftRows transformation is used with elements in the matrix being shifted to the right. The third transformation for AES encryption is MixColumns where the bits inside each byte are changed based on the neighboring bytes. The bytes are changed to provide diffusion. This transformation works at the column level where each column is changed to a new column. The InvMixColumns is used for decryption. The last transformation is AddRoundKey. It proceeds one column at a time, adding a round key to each state column in the matrix. The operation performed is addition. This transformation is an inverse of itself [24]. In this work, we use the 256-bit AES encryption algorithm. The password input to the algorithm was converted into its equivalent binary hexadecimal format to generate the 256-bit AES encryption key. This key was then used to encrypt the message that was entered by the user. The next stage was AES decryption which began with a password check wherein if the correct password was entered the message would be decrypted, else the decryption process would be rejected. A few advantages of AES are as follows: Since AES was formulated after DES, many of the known breaches on DES were already tested on AES. AES is certainly more secure than DES due to the larger-size key. The algorithms used in AES are so straightforward that they can be easily executed using inexpensive processors and the least amount of memory. AES can be easily employed in software, hardware, and firmware. The implementation can use a table lookup process or routines that use a well-constructed algebraic structure [24].
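The chapter does not reproduce its source code, so the sketch below only approximates the encryption stage with the PyCryptodome library; deriving the 256-bit key as a SHA-256 digest of the password and using CBC mode with a random IV are assumptions made for illustration, not necessarily the authors' exact choices.

```python
import hashlib
from Crypto.Cipher import AES              # PyCryptodome
from Crypto.Util.Padding import pad, unpad

def derive_key(password: str) -> bytes:
    # Assumption: the 256-bit key is a SHA-256 digest of the user's password
    return hashlib.sha256(password.encode()).digest()

def aes_encrypt(message: str, password: str) -> bytes:
    cipher = AES.new(derive_key(password), AES.MODE_CBC)
    ciphertext = cipher.encrypt(pad(message.encode(), AES.block_size))
    return cipher.iv + ciphertext          # prepend the IV so decryption can recover it

def aes_decrypt(blob: bytes, password: str) -> str:
    iv, ciphertext = blob[:16], blob[16:]
    cipher = AES.new(derive_key(password), AES.MODE_CBC, iv)
    return unpad(cipher.decrypt(ciphertext), AES.block_size).decode()
```

The hexadecimal form of the returned bytes is then what would be handed to the steganographic encoder.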
3.2 Least Significant Bit (LSB)

Text messages can be hidden within an image through LSB substitution. In this method, the least significant bit values of the image pixels are replaced with bits of the text message that needs to be kept as a secret. The concept behind LSB steganography is that changing a pixel's most significant bit (MSB) has a greater effect on the overall value of the pixel, while changing the LSB has a smaller impact; therefore, least significant bit steganography is used to hide secret messages. In grayscale images the pixels that do change in the image are the topmost pixels, which are dependent upon the size of the secret message. Usually, programmers choose to change the LSBs of the topmost pixels [25]. In this work, the LSB embedding technique began with the user inputting their secret message and cover image of choice. The process then moved forward to understand whether the image was a grayscale or an RGB image. Based on this distinction the algorithm proceeds to convert the encrypted message into an 8-bit binary format from ASCII (American Standard Code for Information Interchange).
The image pixels were modified according to the 8-bit binary secret message data and then the stego image was formed. For decoding the previously entered confidential information, the user would start with entering the image name whose pixels would be scanned to extract the embedded information. Functions within the program worked on extracting the contents of the image as a sequence object that contained the modified pixel values, which was the secret message. This was then converted from binary to ASCII.
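A minimal sketch of the LSB embedding and extraction described above, written with Pillow, is given below. It handles only RGB covers, writes one bit per colour channel and marks the end of the payload with a zero byte; these simplifications, and the assumption that the payload (for example the hexadecimal ciphertext string) contains no zero bytes, are choices made for the example rather than details taken from the authors' code.

```python
from PIL import Image

def lsb_embed(cover_path: str, stego_path: str, payload: bytes) -> None:
    img = Image.open(cover_path).convert("RGB")
    bits = "".join(f"{b:08b}" for b in payload) + "0" * 8      # zero byte as terminator
    channels = [c for px in img.getdata() for c in px]
    if len(bits) > len(channels):
        raise ValueError("payload too large for this cover image")
    for i, bit in enumerate(bits):
        channels[i] = (channels[i] & ~1) | int(bit)            # overwrite the least significant bit
    pixels = [tuple(channels[i:i + 3]) for i in range(0, len(channels), 3)]
    stego = Image.new("RGB", img.size)
    stego.putdata(pixels)
    stego.save(stego_path, "PNG")                              # lossless format preserves the LSBs

def lsb_extract(stego_path: str) -> bytes:
    channels = [c for px in Image.open(stego_path).convert("RGB").getdata() for c in px]
    out = bytearray()
    for i in range(0, len(channels) - 7, 8):
        byte = int("".join(str(c & 1) for c in channels[i:i + 8]), 2)
        if byte == 0:
            break
        out.append(byte)
    return bytes(out)
```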
3.3 Discrete Cosine Transform (DCT)

Computer image data is transformed from the spatial to the frequency domain using DCT which is a mathematical function. This technique is very powerful, and it helps divide the picture into sections of varying significance. It converts the spatial domain of the picture, be it grayscale or an RGB image, to the frequency domain and divides the image into three frequency components: high, middle, and low [26]. The standard Eq. (3) for a two-dimensional DCT image is given as

C(u, v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\frac{(2x + 1)u\pi}{2N} \cos\frac{(2y + 1)v\pi}{2N}   (3)
where C(u, v) is the cosine transform, α is a constant, f (x, y) is the image and the cos functions are basis vectors (that is, sampled cosine functions). The value of α is calculated in Eq. (4) as

\alpha(k) = \begin{cases} \sqrt{1/N}, & k = 0 \\ \sqrt{2/N}, & \text{otherwise} \end{cases}   (4)
where k can be u or v, these are the individual pixel coordinates. As the most significant information is stored in the low frequency coefficients, we can see that DCT has energy conservation. We move forward to the DCT embedding technique where we used user-friendly code such that the individual would input a secret message which was converted from ASCII to binary. They could then choose a cover image of their selection and a name for the new stego image was also given. The image was then read, its dimensions were compelled to be 8 × 8 acquiescent and it was then resized to meet the DCT requirements. These requirements included a forward DCT stage for the creation of DCT blocks, a quantization stage, and an arrangement of DCT coefficients in terms of frequency where the confidential data was encoded by changing the coefficients of the middle frequency sub-band, which would help not to alter the image’s visibility.
The secret message was then embedded into a Luminance layer, a dequantization stage was implemented, an inverse DCT stage was applied, and the image channel was rebuilt. This component ended with the creation of a new stego image with a name that was provided before. The second component of our program was extraction of the previously entered confidential information. We started with entering the image name that we wanted to extract the information from. A forward DCT stage which investigated the Luminance layer, a quantization stage, and an arrangement of DCT coefficients in terms of frequency was executed. We then extracted the secret message from the DCT coefficients which was then converted from binary to ASCII.
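As a sanity check of Eqs. (3) and (4), the transform can be written out directly, as in the sketch below. The chapter's pipeline applies it per 8 × 8 block of the luminance layer and hides bits in quantised mid-frequency coefficients; the coefficient index mentioned in the closing comment is therefore only an illustrative assumption.

```python
import numpy as np

def alpha(k: int, N: int) -> float:
    # Normalisation factor from Eq. (4)
    return np.sqrt(1.0 / N) if k == 0 else np.sqrt(2.0 / N)

def dct2(block: np.ndarray) -> np.ndarray:
    """Direct 2-D DCT of an N x N block following Eq. (3)."""
    N = block.shape[0]
    x = np.arange(N)
    C = np.zeros((N, N))
    for u in range(N):
        for v in range(N):
            basis = np.outer(np.cos((2 * x + 1) * u * np.pi / (2 * N)),
                             np.cos((2 * x + 1) * v * np.pi / (2 * N)))
            C[u, v] = alpha(u, N) * alpha(v, N) * np.sum(block * basis)
    return C

# A secret bit could then be written into a mid-frequency coefficient of each block,
# e.g. by forcing the parity of the quantised value of C[4, 3] (illustrative index).
```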
3.4 Discrete Wavelet Transform (DWT) It is a transform that breaks down any given image, that is grayscale or RGB, into a finite number of sets, where each set is a sequence of coefficients depicting the time evolution of the image in the equivalent frequency band. When DWT is applied to images; for each frequency band, 4 sub-bands are created: LL, LH, HL and HH. The LL is a low frequency component that the human eye is most receptive to and therefore contains the most important details about the image. As the other 3 subbands are high frequency; they mostly contain insignificant data that is not very receptive to the eye. Therefore, embedding of the secret data is mostly done in these 3 parts as it doesn’t degrade the quality of the image. Embedding can also be done in the LL sub-band but it can lead to distortions in stego image, making it easily identifiable and traceable. Although it does not generate the best stego image as compared to other algorithms; DWT is well known for its robustness [27]. We then move forward to the DWT embedding technique where we used userfriendly code such that the individual could choose a cover image of his or her selection and a name for the new stego image was also given. The user inputs a secret message as well which was converted from ASCII to binary. The image was then read, and it was converted to the YCbCr color space where Y is the luma component and CB and CR are the blue-difference and red-difference chroma components. The blue-difference chroma component was extracted and the Haar wavelet was composed. The data is hidden in the high frequency sub-bands (LH, HL, HH). This component ended with the creation of a new stego image with name that was provided before. The second component of our program was extraction of the previously encoded confidential information. We started with entering the image name that we wanted to extract the information from. The stego image was then read, it was converted to the YCbCr color space, the blue-difference chroma component was extracted and the Haar wavelet was designed. The data was extracted from the same high frequency sub-bands as mentioned before and the text was then converted from binary to ASCII.
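A compact sketch of the wavelet stage using PyWavelets is given below. Hiding one bit per coefficient through parity in the diagonal detail sub-band of the Cb channel mirrors the description above, but the exact quantisation rule and the use of all three high-frequency sub-bands in the authors' program are not reproduced, and rounding during the colour-space round trip can disturb such fine-grained embedding; the sketch is illustrative only.

```python
import numpy as np
import pywt
from PIL import Image

def dwt_embed(cover_path: str, stego_path: str, bits: str) -> None:
    ycbcr = np.array(Image.open(cover_path).convert("YCbCr"), dtype=float)
    cb = ycbcr[:, :, 1]                                   # blue-difference chroma channel
    LL, (LH, HL, HH) = pywt.dwt2(cb, "haar")              # single-level Haar decomposition
    flat = HH.flatten()
    assert len(bits) <= flat.size, "message too long for this cover image"
    for i, bit in enumerate(bits):
        flat[i] = 2.0 * np.round(flat[i] / 2.0) + int(bit)    # encode the bit in coefficient parity
    rec = pywt.idwt2((LL, (LH, HL, flat.reshape(HH.shape))), "haar")
    ycbcr[:, :, 1] = rec[:cb.shape[0], :cb.shape[1]]
    stego = np.clip(ycbcr, 0, 255).astype(np.uint8)
    Image.fromarray(stego, "YCbCr").convert("RGB").save(stego_path, "PNG")
```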
4 Performance Metrics

The proposed method was evaluated using metrics like peak signal-to-noise ratio (PSNR), mean squared error (MSE), execution time and storage space required.
4.1 Mean Squared Error (MSE)

The MSE is the average of the squared difference between the predicted quantity and the actual quantity. It signifies the squared error between the cover and the stego image. The lower the value of MSE, the lower the error. The MSE is always positive and is used in the calculation of the PSNR metric.

MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} [I(i, j) - K(i, j)]^2   (5)
Equation (5) gives the MSE, where I is the cover image, K is the stego image, m is the number of rows in the images, n is the number of columns in the images, and (i, j) are the individual pixel coordinates.
4.2 Peak Signal-to-Noise Ratio (PSNR)

PSNR stands for peak signal-to-noise ratio, which is the measure of a signal's highest achievable power to the power of degrading noise that influences its quality. The PSNR value is used as a quality estimation between the cover and a stego image. The greater the PSNR, the higher the similarity between the two images. A higher value also stands for a better standard of the stego image.

PSNR = 10 \log_{10}\!\left(\frac{MAX_I^2}{MSE}\right) = 20 \log_{10}\!\left(\frac{MAX_I}{\sqrt{MSE}}\right) = 20 \log_{10}(MAX_I) - 10 \log_{10}(MSE)   (6)
The PSNR (in dB) is defined by Eq. (6), where MAX_I is the cover image's highest achievable power and MSE is the mean squared error.
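Both metrics follow directly from Eqs. (5) and (6) and can be computed with NumPy, as in the short sketch below (MAX_I = 255 is assumed for 8-bit images).

```python
import numpy as np

def mse(cover: np.ndarray, stego: np.ndarray) -> float:
    # Eq. (5): mean of the squared pixel-wise differences
    return float(np.mean((cover.astype(float) - stego.astype(float)) ** 2))

def psnr(cover: np.ndarray, stego: np.ndarray, max_i: float = 255.0) -> float:
    # Eq. (6): peak signal-to-noise ratio in decibels
    m = mse(cover, stego)
    return float("inf") if m == 0 else 10.0 * np.log10(max_i ** 2 / m)
```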
4.3 Execution Time

The execution time includes the time spent during run-time. A lower execution time is preferred as it stands for a faster implementation of the given method.
4.4 Storage Space

It is the storage requirement of the stego or the cover images. Smaller storage space is preferred for algorithms as an algorithm that can fit into the cache memory of a device can attain very high execution speeds. An increase in speed also results in minimization of execution time.
5 Results and Discussions

In this work, the concepts of cryptography and steganography have been combined and implemented to secure the data. The following experiments were conducted: (a) AES + LSB, (b) AES + DCT and (c) AES + DWT and the best combination of algorithms were identified. We used a repository of 5 images for all the experiments. Out of the 5 images, 3 were RGB images and 2 were grayscale images. Figure 5
Fig. 5 Image repository used
Table 1 Performance metric values for LSB, DCT and DWT with AES encryption

| Method | Metric       | Bird (212 KB) | Circle (48 KB) | Lena (68 KB) | Pup (80 KB) | Straw (720 KB) | Average (225.6 KB) |
|--------|--------------|---------------|----------------|--------------|-------------|----------------|--------------------|
| LSB    | PSNR (dB)    | 84.79         | 45.02          | 75.003       | 48.63       | 84.32          | 67.55              |
| LSB    | MSE          | 0.0002        | 0.14           | 0.002        | 0.89        | 0.0002         | 0.21               |
| LSB    | Time (s)     | 49.65         | 50.95          | 47.01        | 43.95       | 48.15          | 47.94              |
| LSB    | Storage (KB) | 256           | 184            | 75.8         | 756         | 761            | 406.56             |
| DCT    | PSNR (dB)    | 40.38         | 39.28          | 33.71        | 38.03       | 37.59          | 37.80              |
| DCT    | MSE          | 5.98          | 7.67           | 27.71        | 10.23       | 11.33          | 12.58              |
| DCT    | Time (s)     | 54.58         | 47.58          | 45.18        | 49.28       | 49.47          | 49.22              |
| DCT    | Storage (KB) | 476           | 216            | 124          | 847         | 740            | 480.60             |
| DWT    | PSNR (dB)    | 56.95         | 51.22          | 56.45        | 52.10       | 52.18          | 53.78              |
| DWT    | MSE          | 0.13          | 0.49           | 0.15         | 0.41        | 0.39           | 0.31               |
| DWT    | Time (s)     | 25.46         | 22.67          | 22.90        | 23.22       | 23.71          | 23.59              |
| DWT    | Storage (KB) | 248           | 112            | 77           | 800         | 704            | 388.20             |

The best values are represented in bold in the original table.
shows all the images used for this work. In order to have a fair comparison and to maintain uniformity the same password and secret message were used across all the experiments. Table 1 shows the results obtained for all the experiments. For each experiment, the 4 performance metric values were tabulated for each image. The last column of the table shows the average value computed over all the 5 images for each metric for each experiment. From Table 1 it can be observed that the AES + LSB method achieved average values of 67.55, 0.21, 47.94 s, and 406.56 KB for the PSNR, MSE, time and storage respectively. The AES + DCT method achieved average values of 37.80, 12.58, 49.21 s and 480.60 KB for the PSNR, MSE, time and storage respectively. The AES + DWT method achieved average values of 53.78, 0.31, 23.59 s and 388.20 KB for the PSNR, MSE, time and storage respectively. The results of Table 1 show that the best PSNR and MSE values of 67.552 and 0.21 respectively were obtained by AES + LSB method. The best execution time value of 23.59 s was given by AES + DWT method. Also, the AES + DWT method exhibited the lowest storage space requirements. In an attempt to choose the best combination of algorithms, we had set a few parameters that had been used across all the programs, so as to maintain uniformity. This ensured that we obtained the best possible dataset with no bias toward any particular program. The secret message used was “The quick brown fox jumps over the
lazy dog,” the size of which was 119 bytes. The password used was “Alina,” the size of which was 43 bytes. Upon AES encryption, the ciphertext generated on entering the message and the password, which would later be embedded into the image was b’7898e7119c93b3b8b42d57fa0537f1422f0e8797dac6f811286e5898dea7c6706a 1db0364d15c644afb015d9’. This ciphertext was then embedded onto the images using the above-mentioned steganography techniques. It was then extracted from the images which was later decrypted to return the secret message.
6 Conclusion

Security and safety of data will always be a priority in everyone's lives especially with the way the world is advancing in technology. This paper discussed how data could be concealed and protected with the help of cryptography and steganography. The AES encryption method was applied as the cryptography technique and LSB, DCT and DWT were explored as the steganography techniques. In future, this work can be further expanded by increasing the number of images and including different types of images in the repository. Also, it would be interesting to analyze the use of recent stretch techniques and deep neural networks along with steganography techniques for hiding data.
References 1. Kaspersky, Cryptography Definition, http://kaspersky.com/resource-center/definitions/whatis-cryptography, 2021 2. SearchSecurity, What is cryptography?—Definition from WhatIs.com, https://searchsecurity. techtarget.com/definition/cryptography, 2021 3. SearchSecurity, What is cryptology?—Definition from WhatIs.com, https://searchsecurity.tec htarget.com/definition/cryptology, 2021 4. SearchSecurity, What is cryptanalysis?—Definition from WhatIs.com, https://searchsecurity. techtarget.com/definition/cryptanalysis, 2021 5. The Economic Times, What is Cryptography? Definition of Cryptography, Cryptography Meaning—The Economic Times, https://economictimes.indiatimes.com/definition/cry ptography, 2021 6. Fruhlinger J (2021) What is cryptography? How algorithms keep information secret and safe, CSO Online. https://www.csoonline.com/article/3583976/what-is-cryptography-how-alg orithms-keep-information-secret-and-safe.html 7. Codr J, Unseen, An overview of steganography and presentation of associated java application c-hide, https://www.cse.wustl.edu/~jain/cse571-09/ftp/stegano/index.html 8. Dickson B (2020) What is steganography? A complete guide to the ancient art of concealing messages, https://portswigger.net/daily-swig/what-is-steganography-a-completeguide-to-the-ancient-art-of-concealing-messages, 2020/02/06 9. Steganography Tutorial (2020) A complete guide for beginners, https://www.edureka.co/blog/ steganography-tutorial, 2020/11/25
10. Semilof M, Clark C (2018) What is steganography?—Definition from whatis.com, https://sea rchsecurity.techtarget.com/definition/steganography#:~:text=Steganography%20is%20the% 20technique%20of,for%20hiding%20or%20protecting%20data, 2018/12/28 11. Srivatsava J, Sheeja R (2020) Implementation of triple DES algorithm in data hiding and image encryption techniques. Int J Adv Sci Technol 29(3):10549–10559 12. Damrudi M, Jadidy Aval K (2019) Image Steganography using LSB and encrypted message with AES, RSA, DES, 3DES and Blowfish. Int J Eng Adv Technol (IJEAT) 8(6S3):2249–8958 13. Saikumar I (2017) DES–data encryption standard. Int Res J Eng Technol (IRJET) 4(3):1777– 1782 14. Mittal S, Arora S, Jain R (2016) Data security using RSA encryption combined with image steganography. In: 1st India international conference on information processing (IICIP) IEEE, pp 1–5 15. Bhargava S, Mukhija M (2019) Hide image and text using LSB, DWT and RSA based on Image steganography. ICTACT J Image Video Process 9(3):1940–1946 16. AbdelWahab O, Hussein A, Hamed H, Kelash H, Khalaf A, Ali H (2019) Hiding data in images using steganography techniques with compression algorithms. TELKOMNIKA 17(03):1168– 1175 17. Zagade S, Bhosale A (2014) Secret data hiding in images by using DWT techniques. Int J Eng Adv Technol (IJEAT) 03(05):230–235 18. Goel S, Rama A, Kaur M (2013) A review of comparison techniques of image steganography. Glob J Inc. (USA) 13(4):9–14 19. Vimal Kumar Murugan G, Uthandipalayam Subramaniyam R (2020) Performance analysis of image steganography using wavelet transform for safe and secured transaction. Multim Tools Appl 79:9101–9115 20. Baziyad M, Rabie T, Kamel I (2020) Achieving stronger compaction for DCT-based steganography: a region-growing approach. In: Trends and innovations in information systems and technologies. WorldCIST 2020. Advances in intelligent systems and computing, vol 1160. Springer Nature Switzerland, pp 251–261 21. Desai J, Hemalatha S, Shishira S (2014) Comparison between DCT and DWT steganography algorithms. Int J Adv Inf Sci Technol (IJAIST) 24(24):2319–2682 22. Huu Q, Dinh T, Tran N, Van T, Minh T (2021) Deep neural networks based invisible steganography for audio-into-image algorithm. In: IEEE 8th Global conference on consumer electronics (GCCE), pp 423–427 23. Harahap M, Khairina N (2020) Dynamic steganography least significant bit with stretch on pixels neighborhood. J Inf Syst Eng Bus Intell 6(2):151–158 24. Forouzan B, Mukhopadhyay D (2007) Cryptography and network security. Tata McGraw-Hill Publishing Company Limited, New Delhi 25. Chan CK, Cheng LM (2004) Hiding data in images by simple LSB substitution. Pattern Recogn 37(3):469–474 26. Ahmed N, Natarajan T, Rao KR (1974) Discrete cosine transform. IEEE Trans Comput C23(1):90–93 27. Sari W, Rachmawanto E, Setiadi I, Sari C (2017) A good performance OTP encryption image based on DCT-DWT steganography. TELKOMNIKA 15(4):1987–1995
Diabetes Prediction Using Logistic Regression and K-Nearest Neighbor Ami Oza and Anuja Bokhare
Abstract Diabetes is a long-term illness that has the ability to become a worldwide healthcare crisis. Diabetes mellitus, sometimes known as diabetes, is a metabolic disorder characterized by an increase in blood sugar levels. It is one of the world’s most lethal diseases, and it is on the rise. Diabetes can be diagnosed using a variety of traditional approaches complemented by physical and chemical tests. Methods of data science have the potential to benefit other scientific domains by throwing new light on prevalent topics. Machine learning is a new scientific subject in data science that deals with how machines learn from experience. Several data processing techniques have been developed and utilized by academics to classify and predict symptoms in medical data. The study employs well-known predictive techniques such as K-nearest neighbor (KNN) and logistic regression. A predicted model is presented to improve and evaluate the performance and accuracy by comparing the considered machine learning techniques. Keywords Diabetes · Machine learning · K-nearest neighbor · Logistic regression
A. Oza · A. Bokhare (B)
Symbiosis Institute of Computer Studies and Research, Symbiosis International (Deemed University), Atur Centre, Gokhale Cross Road, Model Colony, Pune, Maharashtra 411016, India

1 Introduction

Currently, in a global setting, various chronic diseases are spreading across the globe, both inside developing and developed countries. Diabetes mellitus (DM) is one of the chronic maladies that kills people at an early age, among other things. Diabetes mellitus (DM) is a term coined by medical practitioners. The diabetes condition is now spreading at a breakneck pace, particularly in India. It is not difficult to anticipate what percentage of diabetes is severe and chronic. The standard identifying procedure requires people to attend a diagnostic center, consult with their doctor, and stay for at least a day to get their reports. Diabetes is frequently misdiagnosed in the early stages, leading to severe symptoms that necessitate hospitalization. Diabetes, if left untreated, can cause a variety of health
problems. Machine learning is the scientific study of an entity with the main attribute of intelligence in the broadest sense of the word. Machine learning seeks to create computer systems that can adapt and learn from their experiences. In this situation, computers are given the power to think via learning and growing intelligence. To classify datasets, a variety of machine learning techniques are applied. There are algorithms available for supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, evolutionary learning, and deep learning. Diabetes mellitus is diagnosed in this study using the K-nearest neighbor (KNN) algorithm. In the current study, results are obtained from a classification and prediction approach in which the KNN algorithm is used to classify the data and the logistic regression method is trained for prediction.
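As a hedged illustration of the two techniques named above, the following scikit-learn sketch trains both classifiers on the Pima Indians Diabetes dataset; the file name diabetes.csv, the 80/20 split, the feature scaling and k = 5 are assumptions made for the example rather than the exact settings of this study.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Assumption: the Pima Indians Diabetes dataset in CSV form with an "Outcome" label column
df = pd.read_csv("diabetes.csv")
X, y = df.drop(columns="Outcome"), df["Outcome"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

for name, model in [("KNN (k=5)", KNeighborsClassifier(n_neighbors=5)),
                    ("Logistic regression", LogisticRegression(max_iter=1000))]:
    model.fit(X_train_s, y_train)
    print(name, "accuracy:", accuracy_score(y_test, model.predict(X_test_s)))
```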
2 Previous Study Saxena [1] applied artificial intelligence techniques and algorithms in her research work, which can be used to accurately anticipate and diagnose numerous diseases. These artificial intelligence algorithms have proven to be cost-effective and timesaving for diabetes patients and doctors. The accuracy rate reflects how many of the test dataset’s outputs are identical to various characteristics of the training dataset’s outputs. The rate of mistake/error shows how many data outputs from the test dataset differ from data outputs from distinct training dataset features. The results show that when the value of k increases, so does the accuracy and mistake rate. In the future, they recommend a simulation method that may be performed using tools other than MATLAB, such as WEKA, to obtain better and more accurate findings. Sharmila et al. [2] intend to evaluate data from patients’ medical records in order to anticipate diabetes. According to the report, nearly 40 million Indians currently suffer from diabetes. This work used decision trees with statistical implications and the R tool to analyze diabetes from massive medical records. Decision trees were also used in the study because they are straightforward to grasp, inexpensive to create, simple to interface with database systems, and relatively accurate in a variety of applications. R was used in this study to perform a detailed examination of the diabetic dataset. This study’s findings can also be applied to create excellent models of prediction. Sadhana et al. [3] emphasize the significance of analyzing previously available big diabetic datasets in order to uncover some critical facts that may contribute in the construction of any prediction model. This project will analyze datasets using Hadoop, hive, and R, in addition to data mining methodologies (as previously done). The data came from the National Institute of Diabetes and Digestive and Kidney Diseases’ Pima Indians Diabetes Database. A total of eight criteria were used to predict a diabetic patient’s health (number of pregnancies, glucose plasma concentration, blood pressure, serum insulin, body mass index, age, diabetes pedigree, and skin fold depth). The results were astoundingly fast, with hive analyzing 768 records in only 19s. R’s graphs can help you better understand the results.
Gowsalya et al. [4] intend to offer a system that can predict the probability of readmission for diabetic patients in the following 30 days utilizing the MapReduce approach. The article proposes a way for analyzing large datasets using Hadoop MapReduce. The data is collected directly from patients (through body sensors) and their providers. After that, the data was saved in the Hadoop distributed file system (HDFS), which employs the MapReduce algorithm. An analysis is carried out on datasets containing information about hospital admission, diabetes encounters, laboratory testing, medications, and length of stay in the hospital. If the value is larger than 8%, the likelihood of being readmitted is strong. A distributed file system is used to build this suggested system, which uses low-cost existing technologies to store data between nodes. This predictive approach assists hospitals and other healthcare organizations in effectively allocating clinicians, nurses, machinery, and other resources. Veena Vijayan [5] discussed that for diagnoses diabetes there was the traditional method which was not convenient as they are not accurate so they proposed a data mining algorithm in which they have talking about the KNN algorithm. The sampling algorithm is used to identify and optimize the hope in succeeding repeated cycles. The KNN method is used to categorize objects and forecast labels depending on some criteria nearby training samples in the feature space. There have been numerous data mining techniques developed to classify, predict, and diagnose diabetes. An expert can uncover unexpected and erroneous values by examining the data using the values. Pradeep and Naveen [6] examined and measured the performance of machine learning algorithms based on their accuracy in this study. As discovered in this experiment, the technique’s precision differs before to and following pre-processing. This shows that in disease prediction the pre-processing of the data collection affects the forecast’s execution and precision. The decision tree technique increases precision. In this research, using a diabetes dataset, random forest and support vector machine give improved post-processing prediction. Machine learning techniques, such as those developed by Ioannis et al. [7], are critical for predicting various medical datasets, including diabetic illness datasets (DDD). Using tenfold cross-validation, support vector machines (SVM), logistic regression, and naive Bayes were utilized to predict different/various medical datasets, including diabetes datasets (DD). Based on their findings, the researchers analyzed the accuracy and performance of the algorithms and concluded that the support vector machine (SVM) algorithm outperforms the other algorithms stated. Song et al. [8] are among those who have contributed to this study. Authors have described and explained several classification methods based on characteristics such as glucose, blood pressure, skin thickness, insulin, body mass index (BMI), diabetes pedigree, and age. The researchers did not incorporate pregnant factors in their prediction of diabetes illness (DD). In this study, the researchers used only a limited sample size to predict diabetes. In this paper, five distinct algorithms were used, namely GMM, ANN, SVM, EM, and logistic regression. Finally, the researchers believe that artificial neural network (ANN) provided high accuracy for diabetes prediction. 
Nirmala Devi [9] addressed the creation of an amalgam model for classifying the Pima Indian Diabetes Database (PIDD). This hybrid model utilizes k-means, k-nearest neighbor (KNN), and a multi-step preprocessing procedure. The quality of the data in this amalgam model is enhanced
by minimizing noisy data, which helps to increase the accuracy and efficiency of the KNN method. K-means clustering is applied to discover and delete wrongly classified cases. The goal of the research is to use the amalgam KNN to determine the value of k for PIDD that improves classification accuracy. The experimental results reveal that the suggested amalgam KNN with pre-processing produces the best results for a wide range of k values; when the k value is increased, the proposed model obtains 97.4% classification accuracy, and tenfold cross-validation with a larger k value further improves PIDD classification accuracy. For the same k values, the result is also compared with standard KNN, cascaded k-means, and KNN. Manivannan and Kavitha [10] investigated numerous computerized technologies, the drawbacks of traditional approaches, and how machine learning can aid in the diagnosis of diabetes. AI is mostly used to forecast and identify diseases. In the suggested model, diabetes was identified using the k-nearest neighbor algorithm, one of the most widely used models in artificial intelligence; the authors found that KNN gives better accuracy and results than other techniques. According to Abdulhakim Salum Hassan [11], classification techniques such as decision tree, k-nearest neighbors, and support vector machines were used to categorize people with diabetes mellitus. Precision, accuracy, sensitivity, and specificity were used to evaluate the performance of the applicable techniques, and SVM surpassed decision trees and KNN in terms of accuracy, with a maximum of 90.23%. The performance of various classification algorithms was evaluated in order to identify which technique should be used to examine the provided dataset in the future [12]. The classification methods J48, CART, SVM, and KNN were applied to a medical dataset to determine the best answer for diabetes, and accuracy, specificity, sensitivity, precision, and error rate were calculated for the provided dataset. A good data pre-processing approach can improve the classifier's accuracy; the data normalization step had a considerable impact on classification performance and significantly improved J48's performance, while the KNN algorithm showed a low level of accuracy. In [13], four machine learning algorithms were utilized to predict diabetes, and the performance of each is evaluated using several accuracy criteria, with actual outcomes recorded by the conventional invasive method. After comparing each method to the real results, it is found that the decision tree algorithm outperforms all others, with a precision of 89.97% on dataset 1 for diabetes prediction; using Clarke error grid analysis, 94.27% of clinically relevant data points fell in the approved regions A and B for dataset 2. In [14], the authors investigated the development of a diabetes model utilizing several machine learning classification techniques such as support vector machine, logistic regression, decision tree, random forest, and a voting classifier. A classification model was trained using a dataset that included 768 samples of both male and female individuals. When compared to other established models, the SVM classifier demonstrated good accuracy for both female and male data. Table 1 indicates that most of the authors have used the KNN technique to predict diabetes.
Table 1 Inferences analyzed from the previous study

S. No. | Technique used | Limitations | References
1 | AI technique | Simulations can be done using tools other than MATLAB, such as WEKA, to get better and more accurate results | [1]
2 | Decision tree | Forms hybrid classification models, so it is unstable; derivative features for better measurement results are not present; generally leads to overfitting of the data | [2–4, 6, 13]
3 | KNN | An adaptive neuro-fuzzy inference system could be used for better accuracy; future work is to add more datasets to check prediction accuracy for scalable data, to improve the model, to test the classification techniques with larger datasets, and to build hybrid classification models by combining data mining techniques | [5, 9–12]
4 | SVM | Not suitable for large datasets; training time is higher | [7, 14]
5 | ANN | Does not give accurate results; accuracy can be increased by applying time series, clustering, and association rules | [8]
The current study focuses on a comparative study of the k-nearest neighbor algorithm and the logistic regression algorithm, testing the correctness of the results and evaluating the models in order to verify which gives better accuracy.
3 Methodology

In the current study, a comparative study of KNN and logistic regression is carried out to determine which gives better accuracy. KNN is used to classify the data, and a supervised machine learning-based logistic regression approach is used to train the prediction model. KNN is a simple data mining approach used for categorization. It is an instance-based learning technique, sometimes known as lazy learning, in which the function is approximated only locally and all computation is deferred until classification. It can be useful to weight the contributions of the neighbors, so that nearer neighbors contribute more to the average than more distant ones. The Euclidean distance formula is commonly used to calculate distance. Here, k is a fixed integer that usually takes an odd value between 1 and 5. KNN is one of the most basic machine learning algorithms and can be used to predict any sort of label.
• It is analytically tractable and highly adaptable to local structure.
• An object is classified by a majority vote of its neighbors, being assigned to the class most common among its closest k neighbors, where k is a positive integer.
• The KNN method estimates using the closest data points, and parallel implementation is possible: because the approach is instance-based, it searches the training table for each data point's k-nearest neighbors, and since each data point is independent of the others, the search and scoring can be done at the same time.

For the training data, the k-fold cross-validation procedure is utilized. This technique is most commonly employed when the goal is prediction and the aim is to explore how well a predictive model performs in practice, particularly in terms of accuracy. A training dataset of known data instances is supplied to a model, together with a set of unseen data against which the model is evaluated in the prediction task (the testing dataset). To evaluate predictive models with this strategy, the original sample dataset is divided into two parts: a training set for training the model and a test set for evaluating it. In k-fold cross-validation, the original sample is randomly divided into k equally sized subsamples. One of the k subsamples is labeled as validation data and is used to test the model, while the remaining k−1 subsamples are used as training data. The cross-validation process is then repeated k times (the folds), with each of the k subsamples serving as validation data exactly once. One advantage of this technique is that every observation is used for both training and validation, and each observation is used for validation exactly once. Figure 1 depicts the prediction model for the procedure. Logistic regression is a classification technique based on linear regression; it is best suited for classification rather than regression, and the target variable may be binary or multi-class. To use the logistic regression model for classification, the LogisticRegression() function is used: the model is first trained on (X_train, y_train) and then assessed on (X_test, y_test), so that predictions for X_test can be compared with y_test.
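A minimal sketch of this comparative set-up is given below, assuming scikit-learn and a CSV copy of the Pima dataset; the file name, the 75/25 split, and the choice of 10 folds follow the description in the text, while everything else (scaling, random seed, max_iter) is an illustrative assumption rather than the study's exact pipeline.

# Minimal sketch of the KNN vs. logistic regression comparison described above.
# Assumes a CSV copy of the Pima Indians Diabetes dataset with an "Outcome" column.
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

data = pd.read_csv("diabetes.csv")            # illustrative file name
X = data.drop(columns=["Outcome"])
y = data["Outcome"]

# 75/25 split, as used later in the study
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Scale features so the Euclidean distance in KNN is not dominated by large-valued columns
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# k-fold cross-validation on the training data to gauge KNN stability
knn = KNeighborsClassifier(n_neighbors=5)
cv_scores = cross_val_score(knn, X_train_s, y_train, cv=10)
print("KNN 10-fold CV accuracy: %.3f" % cv_scores.mean())

# Fit both classifiers and score them on the held-out test set
knn.fit(X_train_s, y_train)
logreg = LogisticRegression(max_iter=1000).fit(X_train_s, y_train)
print("KNN test accuracy:    %.3f" % knn.score(X_test_s, y_test))
print("LogReg test accuracy: %.3f" % logreg.score(X_test_s, y_test))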
4 Dataset and Experiment Discussion

In the current study, the PIMA Indian diabetes database is used. The National Institute of Diabetes and Digestive and Kidney Diseases provided this dataset. The goal is to determine whether a patient is diabetic based on various diagnostic measures in the dataset. The selection of these cases from a larger database was subject to a number of constraints: all patients are women of Pima Indian ancestry who are at least 21 years old. The dataset includes several medical predictor variables as well as one target variable, Outcome. The patient's number of pregnancies, BMI, insulin level, age, and so on are the predictor variables.
Fig. 1 Diabetic prediction model using KNN algorithm
The dataset consists of nine attributes, which are as follows: (1) Pregnancies, (2) Glucose, (3) Blood Pressure, (4) Skin Thickness, (5) Insulin, (6) Body Mass Index, (7) Diabetes Pedigree Function, (8) Age, and (9) Outcome (0 or 1). The implementation is done using Python libraries. To screen the bivariate relationships between variables, a pair plot of individual scatter plots is used. Figure 2 depicts the correlation between attributes in the dataset. From the scatter plots, only BMI and skin thickness, and pregnancies and age, appear to have positive linear relationships; another likely pair is glucose and insulin. No clear nonlinear relationships are observed. Figure 3 shows the heatmap of the correlations.
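The pair plot and heatmap screening could be produced along the following lines, assuming pandas, seaborn, and matplotlib; the file name is the same illustrative one used earlier and the column names follow the standard Pima file, so both are assumptions rather than the study's exact script.

# Sketch of the bivariate screening described above: a pair plot of scatter
# diagrams and a heatmap of pairwise correlations.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv("diabetes.csv")            # illustrative file name

# Scatter plots for every pair of attributes, coloured by the Outcome label
sns.pairplot(data, hue="Outcome")
plt.show()

# Heatmap of the correlation matrix (compare with Figs. 2 and 3)
sns.heatmap(data.corr(), annot=True, fmt=".2f", cmap="coolwarm")
plt.show()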
5 Result Analysis

In this study, classification and regression techniques are used to obtain accurate results. The case study uses a fairly large dataset, so KNN is
Fig. 2 Correlation between attributes
applied for the classification of the data. The error rate for each K value is depicted in Fig. 4, with the K value on the X-axis and the mean error on the Y-axis. Applying the KNN classifier gives 109 true positives (TP), 36 true negatives (TN), 21 false positives (FP), and 26 false negatives (FN), as displayed in Fig. 5; the algorithm predicts 75.52% of the test data correctly. Applying logistic regression gives 115 true positives (TP), 37 true negatives (TN), 15 false positives (FP), and 25 false negatives (FN), as displayed in Fig. 6; the algorithm predicts 79.16% of the test data correctly. Tables 2 and 3 show the performance comparison between the KNN and logistic regression machine learning algorithms.
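The error-rate curve of Fig. 4 and the confusion matrices of Figs. 5 and 6 could be generated roughly as below, reusing the split and the fitted models from the earlier sketch; the 1 to 30 range for K follows the discussion later in the paper, the rest is an assumption.

# Sketch of the result analysis: mean error of KNN for K = 1..30 and
# confusion matrices / accuracy for both trained models.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix, accuracy_score

mean_error = []
for k in range(1, 31):
    pred_k = KNeighborsClassifier(n_neighbors=k).fit(X_train_s, y_train).predict(X_test_s)
    mean_error.append(np.mean(pred_k != y_test))

plt.plot(range(1, 31), mean_error, marker="o")
plt.xlabel("K value")
plt.ylabel("Mean error")
plt.show()

for name, model in [("KNN", knn), ("Logistic regression", logreg)]:
    y_pred = model.predict(X_test_s)
    print(name, "confusion matrix:\n", confusion_matrix(y_test, y_pred))
    print(name, "accuracy: %.2f%%" % (100 * accuracy_score(y_test, y_pred)))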
Fig. 3 Heatmap of feature (and outcome) correlations
Fig. 4 Error rate K value
Fig. 5 Confusion matrix for K-nearest neighbor
Fig. 6 Confusion matrix for logistic regression
Table 2 Performance metrics of the KNN classifier

Performance metric name | Value (%)
Precision | 63.15
Recall | 58.06
F1-score | 60.50
Accuracy | 75.52

Table 3 Performance metrics of logistic regression

Performance metric name | Value (%)
Precision | 71.15
Recall | 59.67
F1-score | 64.91
Accuracy | 79.16
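The tabulated metrics can be re-derived from the confusion-matrix counts quoted in the result analysis, provided the diabetic class (Outcome = 1) is treated as the positive class; under that reading, 36 and 37 are the correctly identified diabetics for KNN and logistic regression respectively. A small check, using only the counts reported in the text:

# Re-deriving Tables 2 and 3 from the confusion-matrix counts in Sect. 5,
# treating the diabetic class as positive.
def metrics(tp, fp, fn, tn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# KNN: 36 correctly predicted diabetics, 21 false alarms, 26 missed, 109 correct non-diabetics
print([round(100 * m, 2) for m in metrics(36, 21, 26, 109)])   # ~[63.16, 58.06, 60.50, 75.52]
# Logistic regression
print([round(100 * m, 2) for m in metrics(37, 15, 25, 115)])   # ~[71.15, 59.68, 64.91, 79.17]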
6 Discussion and Interpretation

As medical records have increasingly become electronic databases, the use of vast amounts of data has been recognized as clinically helpful. However, if these data are not therapeutically relevant to day-to-day clinical practice, they are ineffective; to be useful, data must be evaluated, interpreted, and translated into a healthcare approach. Machine learning is a technology for processing and analyzing large amounts of data, and supervised learning is a strategy that helps to enhance prediction accuracy by training the machine on well-labeled data. KNN models (using 1, 10, and 100 neighbors with the Euclidean distance measure) were developed, and cross-validation was carried out for k values from 1 to 30 in order to find the most stable cut-off. Logistic regression was also used to classify individuals suffering from type 1 and type 2 diabetes. The outcome is predicted using logistic regression and KNN with 75% of the data for training and 25% for testing. After training, accuracies of 79.16% and 75.52% are achieved by the respective algorithms on the testing data. The importance of comparing conventional regression modeling with machine learning is highlighted in this work, especially when only a few well-understood, strong predictor variables are used.
7 Conclusion

Several data mining approaches and applications were explored in prior case studies, and machine learning methods have been applied to a number of medical datasets, including the diabetes dataset. The goal of machine learning here is to make the detection process easier, and machine learning methods perform differently on different datasets. This study carried out a comparative analysis: the proposed system focuses on feature analysis and classification, for which the KNN and logistic regression algorithms are used. KNN achieves an accuracy of 75.52%, and logistic regression achieves 79.16%. It is concluded that logistic regression gives better diabetic disease prediction than the KNN algorithm.
8 Future Scope

The proposed system can be developed in a variety of ways, giving it a wide range of potential enhancements: improving the algorithms' correctness, improving the algorithms to increase the system's efficiency and operation, and working on additional characteristics in order to fight diabetes even more effectively, so that the system develops into a complete healthcare diagnosing tool for use in hospitals.
References 1. Saxena K, Khan Z, Singh S (2014) Diagnosis of diabetes Mellitus using K nearest neighbor algorithm 2. Sharmila K, Manickam S (2015) Efficient prediction and classification of diabetic patients from bigdata using R. Int J Adv Eng Res Sci 2 3. Sadhana S, Savitha S (2014) Analysis of diabetic data set using hive and R. Int J Emerg Technol Adv Eng 4 4. Gowsalya M, Krushitha K, Valliyammai C (2014) Predicting the risk of readmission of diabetic patients using MapReduce, pp 297–301 5. Veena Vijayan V, Ravikumar A (2014) Study of data mining algorithms for prediction and diagnosis of Diabetes Mellitus 6. Pradeep KR, Naveen NC (2016) Predictive analysis of diabetes using J48 algorithm of classification techniques. In: 2016 2nd International conference on contemporary computing and informatics (IC3I). IEEE, pp 347–352 7. Kavakiotis I, Tsave O, Salifoglou A, Maglaveras N, Vlahavas I, Chouvarda I (2017) Machine learning and data mining methods in diabetes research. Comput Struct Biotechnol J 8. Komi M, Li J, Zhai Y, Zhang X (2017) Application of data mining methods in diabetes prediction. In: 2017 2nd International conference on image, vision, and computing (ICIVC). IEEE, pp 1006–1010 9. Nirmala Devi M, Appavu Alias Balamurugan S, Swathi UV (2013) An amalgam KNN to predict diabetes Mellitus. In: IEEE international conference on emerging trends in computing, communication and nanotechnology (ICECCN), pp 691–695 10. Manivannan D, Kavitha M (2021) Health monitoring system for diabetic patients. Turk J Comput Math Educ 11. Hassan AS, Malaserene I, Anny Leema A (2020) Diabetes Mellitus prediction using classification techniques. ResearchGate 12. Saravananathan K, Velmurugan T (2016) Analyzing diabetic data using classification algorithms in data mining. Indian J Sci Technol 13. Bavkar VC, Shinde AA (2021) Machine learning algorithms for diabetes prediction and neural network method for blood glucose measurement. Indian J Sci Technol 14. Shruthi U, Kumar TN, Ajay A, Charan A, Yadav ADK, Rupesh K (2021) Diabetes prediction using machine learning technique. Int Res J Modernization Eng Technol Sci
Linear Regression for Car Sales Prediction in Indian Automobile Industry Rohan Kulkarni and Anuja Bokhare
Abstract The automobile industry is one of the leading industries in our economy. A sudden rise in the demand for automobile vehicles and the growth in profits are the main factors that have made it one of the major industries. The industry is also coming up with various financial aids and schemes for the general population, which encourages people to buy vehicles, creating a ripple effect that maximizes profits and growth. The industry has been a great contributor to our economy, which is why it is important to accurately predict automobile sales. Every industry or organization wants to predict results using its own past data and various machine learning algorithms; this helps to visualize past data, determine future goals, and plan accordingly, making sales prediction a current trend in the market. The current study presents a prediction of sales in the automobile industry using machine learning techniques. Keywords Automobile · Profits · Economy · Sales prediction · Learning algorithm · Visualize · Machine learning
1 Introduction

In the Indian car sales industry, planning is regarded as a crucial practice. The demand for vehicles grows as the market increases each year. However, the business plan needs appropriate goals based on revenue targets, which requires the use
Fig. 1 Sales of automobiles in India from financial year 2011 to 2020 (in million units) ( Source www.statista.com)
of sales forecasting, since the growth of sales in some segments has slowed in recent years. As a result, sales forecasting is an important part of making appropriate decisions [1]. Figure 1 shows the growth of sales in the Indian automobile industry. Predicting or forecasting sales is a major game changer in the business world: it helps in understanding what is currently going on in the market and what steps can be taken to lead it. The first step in any endeavor is planning, and the same applies here; with this information and the resulting predictions, proper planning can be done, which leads to the right steps and helps the enterprise in sales and marketing. Sales prediction will always be a strong pillar supporting the industry [2]. The dataset used in this case study is a car sales dataset taken from Kaggle.com. The car sales dataset does not come with a predefined set of independent variables. Consider the decision to buy a car: many factors affect it, such as the car's power, comfort, and mileage; in the same way, such factors affect sales. In this study, the focus is on identifying the factors that have a significant effect on sales, using a machine learning algorithm (linear regression). This process helps in identifying patterns that were hidden before. The final product of the study gives enterprises a view of the factors affecting sales and how to deal with them, which helps in gaining an advantage with customers as well as over competitors. The objective of the study is to predict future car sales and to provide various visualizations that help to understand what drives them and how.
2 Previous Study

The study in [1] focuses on the concept of machine learning (ML): what ML is, and how to develop machines that can mimic the human brain's way of thinking. Just as humans learn from life experiences, such machines also learn from their experiences, which is one of the reasons ML is growing so quickly [1]. The author of [2] suggests that machine learning today has lost some of its connection with the greater world of science and society, which means there may be large limitations in the data being processed, the attributes considered while predicting, and the outputs obtained, and suggests ways in which the impact of these data limitations can be reduced [2]. A glimpse of four such areas is presented in [3]: the importance of accuracy in various ML algorithms, a process to push supervised algorithms toward further development, ensuring that the model being trained is learning properly, and helping the model learn complicated algorithms [3]. The perspective in [4] is that ML is a pillar of AI and the reason for much of the accomplishment in the digital world; the paper contains a thorough study of commonly used ML algorithms and lists the advantages and disadvantages of each, which helps in choosing the best one for a given study [4]. According to [5], data mining is one of the trends of today's era because it can be applied in any domain to push it toward evolution; classification is a good example of supervised learning, assigning labels to the data in a dataset on the basis of relations and pre-defined specifications, and the paper gives a summarized comparison of various algorithms [5]. A summary of various learning methods is presented in [6], giving an idea of which should be used for what purpose; the paper applies ML models to an ecological dataset, identifies solutions to the questions under study, explains the approach, and shows various outputs of these models [6]. The work in [7] gives a detailed comparison between supervised and unsupervised learning models with respect to their ability to sort data and recognize hidden patterns; classification is an important step in any ML study, and the paper provides a guide to obtaining the most accurate results [7]. The authors of [8] describe supervised learning and its various algorithms, which were applied to the same diabetes dataset of about 700+ entries with eight attribute columns and then compared to see which was the most efficient; they conclude that an algorithm classified as supervised should be accurate [8]. The author of [9] studies the relation between writing patterns and age, using linear regression with a combination of learning models to predict an author's age from attributes such as blogging style, the way they talk over the phone, and the way they type online [9].
Linear regression has been used to differentiate between dependent and independent variables, and it is one of the most popular and preferred choices for analysis [10]. It helps in identifying attributes and their relations to each other and can be used to explain a dependent attribute by means of many independent attributes [11]. A new approach for forecasting sales growth is illustrated by researchers who used a hybrid model to predict sales from the output of a time series; seeing the need to increase the functional power of the algorithm, they composed a mixture of AHW and BPNN methods, which improved the prediction results. In that study, the dataset consisted of the last few years' sales data and various previous predictions, and the same data was used to train and test the model; the most important observation was that the output of this method was better than that of any other model. Qingyu et al. [12] note that sales prediction is a trend in the business world and every company wants to apply it to increase its profits. The data used in their study was derived from various sources, together with the many factors that affect sales, giving an idea of which methodology to apply and how well the dataset will work. The authors applied multiple algorithms such as linear regression, random tree, and random forest, and finally TOPSIS was applied to make a final decision and choose between the various patterns identified. Madhuvanthi et al. [13] highlight what the industry dynamics are and how correct decisions can be made if one can forecast which factors will cause a shift in car sales. It is important to understand the critical variables in order to construct and train an accurate system, which contributes to production while also forecasting market changes; a model based on historical data is needed to forecast longer-term car sales. The main aim of that research is to decide what cut-off point for each variable would result in a major shift in car sales and to obtain reliable forecasting. The paper in [14] shows the importance of the automobile industry and why it is important to predict car sales as accurately as possible. The study uses a dataset that is a mixture of Web searches and past sales data; the system was designed to use the available keywords and auto-recommend results, using the keywords in a regression model that identifies the dependency between a car and its price and how it affects sales. Gegic et al. [15] explain why car price prediction is important and why it is gaining attention so fast. A hybrid model that makes use of three different machine learning algorithms was suggested to obtain a high level of accuracy; the dataset was obtained from the Internet using the PHP scripting language, the efficiency of all the algorithms was compared to see which was best suited and most accurate, and the final model was combined with a Java application, achieving 87% accuracy. The study in [16] evaluates factors and methods used for predicting sales; the processes used were AHP and neural networks.
Questionnaires were created for people experienced in this domain, who were asked which attributes affect car sales. This data was used to train the model with various methods and to obtain more accurate
output. The model's initial steps were to find the attributes affecting sales and then to identify the patterns; the results showed that the accuracy of the neural networks was higher than that of any other method applied. The current study focuses on modeling, implementing, and analyzing a linear regression model to predict car sales based on various attributes.
3 Methodology

The current study uses the linear regression learning algorithm, a subcategory of supervised learning in which the expected result is continuous and follows a linear trend. The algorithm predicts results in continuous form rather than assigning them to categories. In the current study, the independent variable is denoted by 'X' and the dependent variable by 'Y', and a linear relationship is built between these variables [17]. Linear regression can be understood using Eq. (1):

Y = β0 + β1 * X    (1)
where X and Y are the independent and dependent variables, respectively, β1 is the coefficient of the independent variable, and β0 is the constant term. The dataset considered for the study is a car sales dataset consisting of model, type, manufacturer, fuel capacity, etc. This data is used to identify the factors affecting car sales and to generate scatter diagrams showing the relations between attributes such as sales, model, price, and type.
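A minimal sketch of Eq. (1) fitted with scikit-learn is shown below; the file and column names ("Car_sales.csv", "Horsepower", "Sales_in_thousands") are assumptions based on the Kaggle car sales dataset described in Sect. 4, not the study's exact script.

# Simple linear regression of sales on one predictor, illustrating Eq. (1).
import pandas as pd
from sklearn.linear_model import LinearRegression

cars = pd.read_csv("Car_sales.csv").dropna()          # illustrative file name
X = cars[["Horsepower"]]                              # independent variable X
y = cars["Sales_in_thousands"]                        # dependent variable Y

model = LinearRegression().fit(X, y)
print("beta_0 (intercept):", model.intercept_)
print("beta_1 (coefficient):", model.coef_[0])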
Fig. 2 Prediction model for car sales system
Fig. 3 Sale versus resale value
4 Dataset and Experiment Discussion

The dataset obtained from Kaggle and considered for the study is a car sales dataset with attributes such as model, type, manufacturer, fuel capacity, etc. This data is used to identify the factors affecting car sales and to generate scatter diagrams showing the relations between attributes such as sales, price, width, wheelbase, resale value, engine size, and horsepower. The dataset has several columns that are unnamed, and some named columns are not required for the study; such features are filtered out and the corresponding columns are dropped from the dataset so that no further complications arise as the study proceeds. This can be called the data cleaning stage; in this step, missing values are identified and outliers are capped. Scatter diagrams are drawn for the various relations: from Figs. 3, 4, 5, 6, 7, 8 and 9, it can be observed that price, engine size, horsepower, and wheelbase have a noticeably negative correlation with sales, while resale value and price have a higher positive correlation with each other.
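A sketch of this cleaning stage is given below, assuming pandas; which columns are dropped, how missing values are filled, and where outliers are capped are not spelled out in the text, so the choices here (median fill, 1st/99th percentile caps, illustrative column names) are assumptions.

# Sketch of the data cleaning stage: drop unnamed / unused columns, fill missing
# values, cap outliers, and draw a scatter diagram (compare with Fig. 4).
import pandas as pd
import matplotlib.pyplot as plt

cars = pd.read_csv("Car_sales.csv")
# Drop unnamed or irrelevant columns
cars = cars.loc[:, ~cars.columns.str.startswith("Unnamed")]
cars = cars.drop(columns=["Model"], errors="ignore")

# Fill missing numeric values with the column median
num_cols = cars.select_dtypes("number").columns
cars[num_cols] = cars[num_cols].fillna(cars[num_cols].median())

# Cap outliers at the 1st and 99th percentiles, column by column
for col in num_cols:
    lo, hi = cars[col].quantile([0.01, 0.99])
    cars[col] = cars[col].clip(lo, hi)

# Scatter diagram, e.g. sales versus price
cars.plot.scatter(x="Price_in_thousands", y="Sales_in_thousands")
plt.show()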
5 Result Analysis

The objective of the study was to analyze and predict future car sales using various combinations of attributes and to identify the attributes that affect sales the most. The model was successful in achieving that goal to some extent; after performing the analysis and interpreting the results, it is also clear that the model needs improvement.
Fig. 4 Sale versus price
Fig. 5 Sale versus engine size
Figure 10 shows the heat map plotted to see the correlation between all the attributes/factors that can affect car sales. The most important observation is that the price of the car is not highly correlated with sales, which is contrary to what is observed in real life.
• Attributes highly correlated with sales: horsepower, wheelbase, fuel capacity, width, length, and fuel efficiency
• Attributes with low correlation to sales: curb weight, engine size, and price.
Ordinary least squares (OLS) is the method used to estimate the parameters. The R-square is 51.4%, i.e., 51.4% of the variation in y is explained by the independent variables, namely
Fig. 6 Sale versus horsepower
Fig. 7 Sale versus wheelbase
horsepower, wheelbase, fuel capacity, width, length, and fuel efficiency; the adjusted R-square is 49.7%, as shown in Table 1. Figure 11 shows the comparison between the predicted output of the trained linear regression model and the actual result. From Fig. 11, it is observed that the model has done reasonably well considering that only one algorithm was used for training, but it is also clear that the model can be improved.
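One way to obtain the reported R-square and adjusted R-square is an OLS fit with statsmodels, which reports both directly; the predictors follow the list above, but the column names and the use of statsmodels rather than another library are assumptions.

# OLS fit over the six predictors named in the text, reporting R-squared and
# adjusted R-squared (compare with Table 1).
import statsmodels.api as sm

predictors = ["Horsepower", "Wheelbase", "Fuel_capacity",
              "Width", "Length", "Fuel_efficiency"]       # illustrative column names
X = sm.add_constant(cars[predictors])
y = cars["Sales_in_thousands"]

ols_model = sm.OLS(y, X).fit()
print("R-squared:", ols_model.rsquared)
print("Adjusted R-squared:", ols_model.rsquared_adj)
print(ols_model.summary())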
Fig. 8 Sale versus length
Fig. 9 Price versus resale value
6 Discussion

After studying the case study dataset and applying a machine learning algorithm to it, it is found that car sales are affected by variables such as the type of car, the fuel capacity, the manufacturer, and the resale value after one year, but the most important factor affecting sales is the price of the car. A linear model is created for car sales with respect to various factors such as wheelbase, horsepower, engine size, and resale value after a year. Applying a single machine learning algorithm to the dataset gives an accuracy of less than 50%. The mean absolute error obtained is 0.62989, and the root mean square error
Fig. 10 Heat map
Table 1 Statistical values of the model

Performance metric name | Value (%)
R-square | 51.4
Adjusted R-square | 49.7
obtained is 0.82699. Therefore, an ensemble of multiple machine learning algorithms is suggested for future work. Although this car sales prediction has done reasonably well, it can be improved in the future (Table 2).
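The error metrics quoted here can be computed as sketched below on a held-out split; the small magnitudes reported in the text suggest the target was scaled or transformed before modelling, a step that is assumed rather than shown, so the numbers from this sketch are only illustrative.

# Computing MAE and RMSE for the fitted linear model on a 75/25 split.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error

X = cars[predictors]                      # predictors as in the OLS sketch above
y = cars["Sales_in_thousands"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

reg = LinearRegression().fit(X_train, y_train)
pred = reg.predict(X_test)

mae = mean_absolute_error(y_test, pred)
rmse = np.sqrt(mean_squared_error(y_test, pred))
print("MAE: %.5f  RMSE: %.5f" % (mae, rmse))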
7 Conclusion

The automobile industry has been an asset to our economy. This study has helped to analyze organizations' past data, which will help them study and identify the factors that affect and contribute most to their car sales. As
Fig. 11 Comparison between actual result and predicted result
Table 2 Error values

Error description | Values
Mean absolute error | 0.62989
Root mean square error | 0.82699
Accuracy | 49%
discussed above, and in view of the results obtained after applying the model, it can be said that the model was successful in achieving its goal to some extent, with vast scope for improvement in the future.
8 Future Scope

Future work can use a bigger dataset with more values, which can give more accurate predictions, and can make the data more representative so that it is easier to understand and well defined. Applying different types of machine learning algorithms, such as clustering, random forest, decision tree, and support vector machine, may also help to achieve better results with the same dataset; the outputs would then be compared to identify which algorithm is best for the dataset under consideration.
References 1. Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349(6245):255–260 2. Wagstaff KL, Machine learning that matters. Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109 USA 3. Dietterich TG (1997) Machine-learning research. AI Mag 18(4):97 4. Ray S (2019) A quick review of machine learning algorithms. In: 2019 international conference on machine learning, big data, cloud and parallel computing (COMITCon), Faridabad, India, pp 35–39. https://doi.org/10.1109/COMITCon.2019.8862451 5. Bhavsar H, Ganatra A (2012) A comparative study of training algorithms for supervised machine learning. Int J Soft Comput Eng (IJSCE) 2(4). ISSN: 2231-2307 6. Crisci C, Ghattas B, Perera G (2012) A review of supervised machine learning algorithms and their applications to ecological data. Ecol Modell 240:113–122. ISSN 0304-3800 7. Sathya R, Abraham A (2013) Comparison of supervised and unsupervised learning algorithms for pattern classification (IJARAI). Int J Adv Res Artif Intell 2(2):34 8. Akinsola O, Awodele, Hinmikaiye J, Akinjobi O (2017) Supervised machine learning algorithms: classification and comparison. J Int J Comput Trends Technol (IJCTT) 48(3) 9. Nguyen D, Smith NA, Rose CP (2011) Author age prediction from text using linear regression. In: Proceedings of the 5th ACL-HLT workshop on language technology for cultural heritage, social sciences, and humanities, Portland, pages 115–123 10. Kumari K, Yadav S (2018) Linear regression analysis study. J Pract Cardiovasc Sci 4:33–36 11. Bin Othman M, Sokkalingam R, Thangarasu G, Subramanian K (2020) A new approach for forecast sales growth in automobile industry. Int J Sci Technol Res 9(01). ISSN 2277-8616 12. Qingyu Y, Geng P, Ying L, Benfu L (2011) A prediction study on the car sales based on web search data. In: 2011 international conference on E-business and E-government (ICEE), Shanghai, China, pp 1–5. https://doi.org/10.1109/ICEBEG.2011.5882762 13. Madhuvanthi K, Nallakaruppan MK, Senthilkumar NC, Siva Rama Krishnan S (2019) Car sales prediction using machine learning algorithms. Int J Innov Technol Explor Eng (IJITEE) 8(5). ISSN: 2278-3075 14. Auto car sales prediction: a statistical study using functional data analysis and time series. Honors Thesis by Yuchen Lin Advised by Professor Ed Rothman University of Michigan Department of Statistics 15. Gegic E, Isakovic B, Keco D, Masetic Z, Kevric J (2019) Car price prediction using machine learning techniques. Int Burch Univ Sarajevo Bosnia and Herzegovina Tem J 8(1):113–118. ISSN 2217-8309. https://doi.org/10.18421/Tem81-16 16. Farahani DS, Momeni M, Amiri NS (2016) Car sales forecasting using artificial neural networks and analytical hierarchy process case study: Kia and Hyundai Corporations in the USA. In: The fifth international conference on data analytics data analytics 17. Rahimi R (2017) Organizational culture and customer relationship management: a simple linear regression analysis. J Hosp Market Manag 26(4):443–449
Agent-Driven Traffic Light Sequencing System Using Deep Q-Learning Palwai Thirumal Reddy and R. Shanmughasundaram
Abstract Reinforcement learning (RL) is a machine learning technique in which an agent successively improves its control policies through feedback. It can address complex real-world problems with minimum development effort, as the agent understands the environment by itself. One such complex scenario is controlling the traffic flow in areas with high traffic density. This work automates the sequencing of traffic lights to provide less waiting time at an intersection. The agent is a computer program that acts by observing the traffic at an intersection with the help of sensors, and it learns over time based on its interactions with the environment. The Deep Q-Learning technique is chosen to build this agent because of its better performance. The set-up is implemented in Python in the SUMO simulator environment, and a comparison is drawn between static traffic-light sequencing and the RL traffic agent; the traffic agent performs better than static sequencing. Keywords Reinforcement learning · Machine learning · Deep Q-Learning · Traffic light
1 Introduction

According to the United Nations, 68% of the global population will live in cities by 2050 [1]. This percentage is constantly rising and is in turn increasing the road traffic in cities. The traffic flow in urban areas needs attention, as conventional traffic control methods are causing transport delays. RL can be implemented to manage traffic flow efficiently in multiple scenarios [2]. In general, a machine learning problem can be completely or partially observable, deterministic or stochastic, one-time or sequential, static or dynamic, and discrete or continuous [3]. Recent studies show that an RL agent modelled to make a series of sequential decisions with occasional feedback from its surroundings performs best in many control-based scenarios [4]. An RL
agent creates a mapping between the input and output of the system using rewards and punishments as signals for positive and negative behaviour. It uses the trial-and-error method to learn interactively from its own actions and experiences, and it is said to be intelligent when it can learn the right action through interactions with its environment [5]. The agent tends to find an action model that maximizes the total cumulative reward. The accessibility of multiple traffic simulators makes this research cost-effective [6]. This work handles traffic at a four-way junction using RL. The RL agent observes the traffic density and takes an appropriate action based on its previous experience of the environment. The actions are initially random, so the agent learns by exploration and then uses the acquired data to exploit the environment. The simulations are carried out in a virtual simulator called Simulation of Urban MObility (SUMO), and the RL agent is coupled with SUMO using its application programming interface (API). In the following section, i.e. Sect. 2, the ideology of RL and the Deep Q-Learning algorithm is described. Section 3 covers the mathematical representation of the agent, Sect. 4 the implementation of the agent in the SUMO simulator, and Sect. 5 a comparison between static traffic sequencing and the RL traffic agent, followed by the conclusion in Sect. 6.
2 Reinforcement Learning

The objective of an RL agent is to match the intelligence of humans. To achieve this, it should interact with the environment and learn to perform appropriate actions. Recently, RL proved its capability by playing complex virtual games [7]. An RL agent first observes the environment and represents its state at time t using the state-space matrix S_t. The agent then performs an action A_t to transform the environment to a new state S_t+1. After every action, the agent gets a reward R_t+1, which helps the agent to analyse the action performed in the previous step, as shown in Fig. 1. Any RL agent aims to attain a policy π which maximizes the cumulative reward over the set of actions taken by following that policy.
2.1 Q-Learning

Q-Learning is an RL algorithm in which Q(S_t, A_t) denotes the Q-value of the agent at time t. The learning rate α and the discount factor γ are the parameters used to tune the agent's training. The learning rate controls the learning ability of the agent: an agent with a lower α hardly adjusts to new data, while an agent with a higher α relies only on newer data. The discount factor controls the weight given to the reward: an agent with a lower γ gives more weight to immediate rewards, and an agent with a higher γ gives more weight to future rewards. An optimal value between 0 and 1 needs to be chosen for α and γ based on the environment.
Fig. 1 Loop of reinforcement learning process
2.2 Deep Q-Learning

The Deep Q-Learning algorithm is a combination of the Q-Learning algorithm and deep neural networks. A deep neural network is composed of multiple layers of artificial neurons [8]; these neurons form a mathematical function that relates the input and the output of the system. Deep Q-Learning is a deep RL algorithm in which a neural network is implemented to approximate the Q-value function. Instead of performing classification, this neural network produces Q-values from the input states.
3 Mathematical Representation of Traffic Agent

The four-way junction considered is shown in Fig. 2. Each road extends from that junction, making a total of four different legs. Each leg has two different sides: traffic on one side of the leg leaves the junction, and traffic on the other side approaches the junction. The traffic agent needs to control traffic only on the side of each leg where vehicles are approaching the junction.
Fig. 2 Extension of the legs from a four-way junction
3.1 State Matrix

The state matrix is formed by representing the presence of a vehicle by 1 and its absence by 0 along the lane up to a considerable distance. This matrix helps the traffic agent to perceive the environment. Each leg at the intersection is divided into smaller units, and each unit is one variable of the state matrix. A road length of 90 m from the junction is considered; this part of the road is divided into 60 units per leg, with 15 units in each lane, giving a 15 × 4 state matrix per leg in which each unit may hold a vehicle. This discretisation approximately estimates the traffic density. The distance of 90 m is chosen on the assumption that traffic queued beyond 90 m already indicates heavy traffic and need not be analysed separately.
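A sketch of how such an occupancy state could be read out of SUMO through TraCI is shown below; the lane IDs, the cell size, the number of lanes listed, and the connection set-up are assumptions for illustration, not the study's actual network from netedit.

# Building a binary occupancy state from vehicle positions via TraCI.
import numpy as np
import traci

INCOMING_LANES = ["N_in_0", "N_in_1", "S_in_0", "S_in_1",
                  "E_in_0", "E_in_1", "W_in_0", "W_in_1"]   # illustrative lane IDs
CELLS_PER_LANE = 15          # 90 m split into 15 cells of 6 m, as an example
ROAD_LENGTH = 90.0

def get_state():
    state = np.zeros((len(INCOMING_LANES), CELLS_PER_LANE), dtype=int)
    for veh_id in traci.vehicle.getIDList():
        lane_id = traci.vehicle.getLaneID(veh_id)
        if lane_id not in INCOMING_LANES:
            continue
        # Distance of the vehicle from the stop line at the junction
        dist = traci.lane.getLength(lane_id) - traci.vehicle.getLanePosition(veh_id)
        if dist < ROAD_LENGTH:
            cell = int(dist // (ROAD_LENGTH / CELLS_PER_LANE))
            state[INCOMING_LANES.index(lane_id), cell] = 1
    return state.flatten()    # 1-D vector fed to the neural network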
3.2 Action Matrix

The action matrix is represented by the appropriate traffic light sequence. This matrix is used to take an action, after analysing the current state, in order to move to the next state. It contains a predefined set of actions that the traffic agent can act upon. The traffic agent controls the vehicles using the standard RYG light sequence, where yellow stays for 3 s between a change of state from red to green or vice versa. The traffic agent checks the environment every 10 s to either retain or change the state. According to the standard ring and barrier traffic flow mechanism, action sets A1 or A2 as in Eqs. 1 and 2, or combinations of them, can be used for the smooth flow of traffic. In this paper, the action set A1 is chosen as per the European standards. The visual representation of action set A1 is given in Fig. 3. A1 = {1and5, 2and6, 3and7, 4and8}
(1)
Fig. 3 Action set followed by the traffic agent
A2 = {1and6, 2and5, 3and8, 4and7}
(2)
3.3 Reward Matrix

The reward matrix helps the traffic agent to analyse the effect caused by its last action. It is based on a function that considers the number of waiting vehicles and their waiting time. Feedback is taken from the traffic environment to evaluate the effect of the taken action; this step is important in RL to justify the traffic agent's actions and to learn accordingly. The reward in RL is like a score while playing a game: a positive value represents an outcome that aids the traffic agent's objective, and a negative value corresponds to an unfavourable action. The cumulative waiting time H(t) is used for the reward function. It is given as the sum of the waiting times h_i(t) of all n individual vehicles at the junction, as in Eq. 3:

H(t) = Σ_{i=1}^{n} h_i(t)    (3)
The reward function is defined in Eq. 4 as a function of the cumulative waiting time of all the vehicles:
R(t) = H(t − 1) − H(t)    (4)
When more vehicles keep waiting at the junction, the reward tends towards a negative value, so the traffic agent always tries to reduce the delay in the journey.
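A small sketch of Eqs. (3) and (4) computed from TraCI's accumulated waiting times is shown below; the use of a module-level bookkeeping variable is an illustrative choice, not the study's implementation.

# Reward as the decrease in total accumulated waiting time between two
# consecutive agent decisions, following Eqs. (3) and (4).
import traci

previous_total_wait = 0.0

def get_reward():
    global previous_total_wait
    total_wait = sum(traci.vehicle.getAccumulatedWaitingTime(veh_id)
                     for veh_id in traci.vehicle.getIDList())      # H(t) in Eq. (3)
    reward = previous_total_wait - total_wait                       # R(t) = H(t-1) - H(t)
    previous_total_wait = total_wait
    return reward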
4 Implementation

The motivation for using RL to handle the traffic flow comes from recent developments in this field [9]. RL agents can handle complex situations by exploring the environment without any hard-coding; an RL agent is given only a simplified version of the environment through the state-space representation. Traditionally, traffic lights have hard-coded sequencing which is static irrespective of the traffic flow [10]; this system causes unnecessary delay at the junction, which is a regular occurrence. This implementation integrates RL at a four-way junction [11]. As the initial step, the netedit tool of the SUMO simulator package is used to create a four-way junction road layout; a typical layout is shown in Fig. 4. The implementation of RL and its integration with the SUMO simulator through its application programming interface (API) is the important phase of this work; the steps followed are shown in Fig. 5. The Python programming language is used to realize the algorithm in the SUMO environment. Pedestrians, footpaths, and zebra crossings are also taken into account. The road layout follows the regular city intersection standards of Europe [12]; the European standards are chosen as they can be modified easily into either Indian or American layouts with minor changes. The traffic agent runs in a continuous loop to obtain the traffic density, provide a proper traffic light sequence, and calculate the cumulative waiting time. The training process consists of 100 episodes, where each episode accounts for 30 min of traffic; this provides the traffic agent with two days of traffic experience. The process took about 5 h of computational time on an Intel Core i7-5500U processor. The Weibull distribution is used to spawn the vehicles so that the agent experiences a range of traffic densities, and each spawned vehicle is given a random path so that every episode presents a unique traffic scenario. Deep Q-Learning utilizes concepts of both deep neural networks and Q-Learning [13]. Q-Learning works around a parameter called the Q-value, which is updated as given by Eq. 5:

Q_new(S_t, A_t) = Q(S_t, A_t) + α (R_t+1 + γ · max_a Q(S_t+1, a) − Q(S_t, A_t))    (5)
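Eq. (5) in isolation can be written as a plain NumPy update over a vector of Q-values, one entry per action; the learning rate and discount factor are the values quoted just below, while the example numbers are purely illustrative.

# One application of the update in Eq. (5): the Q-value of the chosen action is
# moved towards R_t+1 + gamma * max_a Q(S_t+1, a).
import numpy as np

ALPHA, GAMMA = 0.0005, 0.8        # learning rate and discount factor from the text

def updated_q(q_current, q_next, action, reward):
    """q_current, q_next: Q-value vectors for S_t and S_t+1 (one entry per action)."""
    q_new = q_current.copy()
    td_target = reward + GAMMA * np.max(q_next)
    q_new[action] = q_current[action] + ALPHA * (td_target - q_current[action])
    return q_new

print(updated_q(np.array([0.1, 0.4, 0.2, 0.0]),
                np.array([0.3, 0.1, 0.0, 0.2]), action=1, reward=-5.0))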
The Q-value is updated in every iteration with a learning rate of 0.0005, and the reward function guides the agent to explore and exploit the environment properly with a discount factor of 0.8. Here, Deep Q-Learning is implemented with the help of a convolutional neural network (CNN) that maps the complex representation of the state matrix directly into Q-values [14]. The CNN utilized here contains 60 neurons in the input layer, 4 hidden layers with 300 neurons each, and 4 neurons in the output layer, as shown in Fig. 6. The input to the CNN is the state matrix, and the output
Fig. 4 Road layout of four-way junction
is the set of Q-values for the actions in the action matrix [15]. The epsilon-greedy policy is used to choose between exploration and exploitation, which helps the traffic agent to explore more initially and exploit more in the final iterations [16].
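A sketch of the described network and the epsilon-greedy action choice is given below, using Keras as one possible implementation; the layer sizes and learning rate follow the text, while the activations, optimizer, and loss are assumptions.

# Network with 60 inputs, four hidden layers of 300 neurons, and 4 outputs
# (one Q-value per action in set A1), plus epsilon-greedy action selection.
import numpy as np
import random
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

def build_model():
    model = Sequential()
    model.add(Dense(300, activation="relu", input_shape=(60,)))
    for _ in range(3):
        model.add(Dense(300, activation="relu"))
    model.add(Dense(4, activation="linear"))           # one Q-value per action
    model.compile(loss="mse", optimizer=Adam(learning_rate=0.0005))
    return model

def choose_action(model, state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if random.random() < epsilon:
        return random.randrange(4)
    q_values = model.predict(state.reshape(1, -1), verbose=0)
    return int(np.argmax(q_values[0]))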
5 Results

The implementation in the SUMO simulator is shown in Fig. 7. The left lane in each leg has a different signal compared to the other three lanes: the left-lane signal corresponds to vehicles taking a left turn, and the signal in the other three lanes controls vehicles moving forward or taking a right turn. To evaluate the traffic agent's performance, its result is compared with static sequencing of the traffic lights. The same traffic density pattern is used in both cases to ensure a fair comparison. Each sequence has a delay of 30 s in the static sequencing
Fig. 5 Flow of traffic agent
Fig. 6 Traffic agent workflow
considered here for comparison, the same as in most conventional systems. The average waiting time of the vehicles and the pedestrians is the parameter compared between the two cases, as shown in Table 1. The traffic agent has a better traffic handling mechanism than static traffic sequencing, although it is observed that as the traffic density increases, the difference in performance between the traffic agent and static sequencing disappears.
Fig. 7 Implementation of traffic agent in SUMO
Table 1 Comparison of the average waiting time of the vehicles and the pedestrians
Traffic density
Static sequencing (s)
Agent-driven (s)
Low traffic (