Lecture Notes in Networks and Systems Volume 444
Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Advisory Editors
Fernando Gomide, Department of Computer Engineering and Automation—DCA, School of Electrical and Computer Engineering—FEEC, University of Campinas—UNICAMP, São Paulo, Brazil
Okyay Kaynak, Department of Electrical and Electronic Engineering, Bogazici University, Istanbul, Turkey
Derong Liu, Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, USA; Institute of Automation, Chinese Academy of Sciences, Beijing, China
Witold Pedrycz, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada; Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Marios M. Polycarpou, Department of Electrical and Computer Engineering, KIOS Research Center for Intelligent Systems and Networks, University of Cyprus, Nicosia, Cyprus
Imre J. Rudas, Óbuda University, Budapest, Hungary
Jun Wang, Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and others. Of particular value to both the contributors and the readership are the short publication timeframe and the worldwide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them. Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science. For proposals from Asia please contact Aninda Bose ([email protected]).
I. Jeena Jacob · Selvanayaki Kolandapalayam Shanmugam · Robert Bestak Editors
Expert Clouds and Applications Proceedings of ICOECA 2022
Editors I. Jeena Jacob Department of Computer Science and Engineering GITAM University Bangalore, India
Selvanayaki Kolandapalayam Shanmugam Department of Mathematics and Computer Science Ashland University Ashland, OH, USA
Robert Bestak Department of Telecommunication Engineering Czech Technical University in Prague Prague, Czech Republic
ISSN 2367-3370 ISSN 2367-3389 (electronic) Lecture Notes in Networks and Systems ISBN 978-981-19-2499-6 ISBN 978-981-19-2500-9 (eBook) https://doi.org/10.1007/978-981-19-2500-9 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
We are honored to dedicate the proceedings of ICOECA 2022 to all the participants, organizers and editors of ICOECA 2022.
Foreword
RV Institute of Technology and Management was honored to host the Second International Conference on Expert Clouds and Applications (ICOECA 2022), held in Bangalore, India, from February 3 to 4, 2022. The main purpose of this conference series is to provide a research forum that establishes communication among researchers, academicians and industrial users of computing technologies and applications. It is equally gratifying that ICOECA 2022 received an ample number of research submissions. The conference program included keynote sessions by two speakers: Dr. Joy Iong-Zong Chen, Professor, Electrical Engineering, Dayeh University, Taiwan, and Dr. Archana Patel, Department of Software Engineering, School of Computing and Information Technology, Eastern International University, Vietnam. The conference was attended by 57 participants, and RV Institute of Technology and Management gladly acknowledges the conference organizing committee of faculty members, non-faculty members and student volunteers for their tremendous and generous support in making it all happen. A special note of gratitude goes to the reviewers, who contributed behind the scenes for many months in preparing the conference and in the final delivery of high-quality research papers in the conference proceedings.
Dr. J. Anitha
HOD/CSE
RV Institute of Technology and Management
Bengaluru, India
Preface
It is a great privilege for us to present the proceedings of the Second International Conference on Expert Clouds and Applications (ICOECA 2022) to the readers, delegates and authors of the conference. We hope that all readers will find it useful and resourceful for their future research endeavors. ICOECA 2022 was held in Bangalore, India, from February 3 to 4, 2022, with the aim of providing a platform for researchers, academicians and industrialists to discuss state-of-the-art research opportunities, challenges and issues in intelligent computing applications. The revolutionizing scope and rapid development of computing technologies will create new research questions and challenges, which in turn result in the need to create and share new research ideas and encourage significant awareness in this futuristic research domain. The proceedings of ICOECA 2022 shine a light on creating a new research landscape for intelligent computing applications; the support received and the research enthusiasm have truly exceeded our expectations, and we are delighted to present the proceedings. The response from researchers in India and overseas was overwhelming: we received 282 manuscripts from prestigious universities and institutions across the globe, of which 57 were shortlisted based on the review outcomes and conference capacity constraints. We express our deep gratitude and commendation to the entire conference review team, who helped us select the high-quality research works included in the ICOECA 2022 proceedings published by Springer. We also extend our appreciation to the organizing committee members for their continual support.
We are pleased to thank Springer publications for publishing the proceedings of ICOECA 2022 and maximizing the reach of the research manuscripts across the globe.
Finally, we wish all the authors and participants of the conference the best success in their future research endeavors.
Bengaluru, India
Ashland, USA
Prague, Czech Republic
Dr. I. Jeena Jacob Dr. Selvanayaki Kolandapalayam Shanmugam Dr. Robert Bestak
Acknowledgments
We wish to express our gratitude and appreciation to our beloved President, Dr. M. K. Panduranga Setty, and dynamic Secretary, A. V. S. Murthy, for their constant encouragement and guidance in successfully organizing the conference in this series. They strongly encouraged us in conducting the conferences at RV Institute of Technology and Management, Bangalore, India. We would also like to thank the Principal, Dr. R. Jayapal, and all the board of directors for their perpetual support during the conduct of ICOECA 2022, from which this proceedings book has evolved. We thank all our review board members, who assured the research novelty and quality from the initial to the final selection phase of the conference. We are especially thankful to our eminent speakers, reviewers and guest editors. Furthermore, we acknowledge all the session chairs of the conference for their seamless contribution in evaluating the oral presentations of the conference participants. We would like to mention the hard work put in by the authors to revise and update their manuscripts according to the review comments to meet the conference and publication standards. Finally, we acknowledge Springer publications for their constant and timely support throughout the publication process.
Contents
An Approach for Summarizing Text Using Sentence Scoring with Key Optimizer . . . 1
G. Malarselvi and A. Pandian
Image Classification of Indian Rural Development Projects Using Transfer Learning and CNN . . . 17
Aditya Mangla, J. Briskilal, and D. Senthil Kumar
Design of a Mobile Application to Deal with Bullying . . . 31
Vania Margarita, Agung A. Pramudji, Benaya Oktavianus, Randy Dwi, and Harco Leslie Hendric Spits Warnars
Selection of Human Resources Prospective Student Using SAW and AHP Methods . . . 43
Ahmad Rufai, Diana Teresia Spits Warnars, Harco Leslie Hendric Spits Warnars, and Antoine Doucet
Implementation of the Weighted Product Method to Specify Scholarship’s Receiver . . . 65
Chintia Ananda, Diana Teresia Spits Warnars, and Harco Leslie Hendric Spits Warnars
A Survey on E-Commerce Sentiment Analysis . . . 81
Astha Patel, Ankit Chauhan, and Madhuri Vaghasia
Secured Cloud Computing for Medical Database Monitoring Using Machine Learning Techniques . . . 91
M. Balamurugan, M. Kumaresan, V. Haripriya, S. Annamalai, and J. Bhuvana
Real-Time Big Data Analytics for Improving Sales in the Retail Industry via the Use of Internet of Things Beacons . . . 111
V. Arulkumar, S. Sridhar, G. Kalpana, and K. S. Guruprakash
Healthcare Application System with Cyber-Security Using Machine Learning Techniques . . . 127
C. Selvan, C. Jenifer Grace Giftlin, M. Aruna, and S. Sridhar
Energy Efficient Data Accumulation Scheme Based on ABC Algorithm with Mobile Sink for IWSN . . . 143
S. Senthil Kumar, C. Naveeth Babu, B. Arthi, M. Aruna, and G. Charlyn Pushpa Latha
IoT Based Automatic Medicine Reminder . . . 157
Ramya Srikanteswara, C. J. Rahul, Guru Sainath, M. H. Jaswanth, and Varun N. Sharma
IoT Based Framework for the Detection of Face Mask and Temperature, to Restrict the Spread of Covids . . . 173
Ramya Srikanteswara, Anantashayana S. Hegde, K. Abhishek, R. Dilip Sai, and M. V. Gnanadeep
Efficient Data Partitioning and Retrieval Using Modified ReDDE Technique . . . 187
Praveen M. Dhulavvagol, S. G. Totad, Nandan Bhandage, and Pradyumna Bilagi
Stock Price Prediction Using Data Mining with Soft Computing Technique . . . 199
R. Suganya and S. Sathya
A Complete Analysis of Communication Protocols Used in IoT . . . 211
Priya Matta, Sanjeev Kukreti, and Sonal Sharma
Image Captioning Using Deep Learning Model . . . 225
Disha Patel, Ankita Gandhi, and Zubin Bhaidasna
Recommender System Using Knowledge Graph and Ontology: A Survey . . . 237
Warisahmed Bunglawala, Jaimeel Shah, and Darshna Parmar
A Review of the Multistage Algorithm . . . 253
Velia Nayelita Kurniadi, Vincenzo, Frandico Joan Nathanael, and Harco Leslie Hendric Spits Warnars
Technology for Disabled with Smartphone Apps for Blind People . . . 271
Hartato, Riandy Juan Albert Yoshua, Husein, Agelius Garetta, and Harco Leslie Hendric Spits Warnars
Mobile Apps for Musician Community in Indonesia . . . 283
Amadeus Darren Leander, Jeconiah Yohanes Jayani, and Harco Leslie Hendric Spits Warnars
A Genetic-Based Virtual Machine Placement Algorithm for Cloud Datacenter . . . 301
C. Pandiselvi and S. Sivakumar
Visual Attention-Based Optic Disc Detection System Using Machine Learning Algorithms . . . 317
A. Geetha Devi, N. Krishnamoorthy, Karim Ishtiaque Ahmed, Syed Imran Patel, Imran Khan, and Rabinarayan Satpathy
An Overview of Blue Eye Constitution and Applications . . . 327
Jadapalli Sreedhar, T. Anuradha, N. Mahesha, P. Bindu, M. Kathiravan, and Ibrahim Patel
Enhancement of Smart Contact on Blockchain Security by Integrating Advanced Hashing Mechanism . . . 337
Bharat Kumar Aggarwal, Ankur Gupta, Deepak Goyal, Pankaj Gupta, Bijender Bansal, and Dheer Dhwaj Barak
Evaluation of Covid-19 Ontologies Through OntoMetrics and OOPS! Tools . . . 351
Narayan C. Debnath, Archana Patel, Debarshi Mazumder, Phuc Nguyen Manh, and Ngoc Ha Minh
Recognition of the Multioriented Text Based on Deep Learning . . . 367
K. Priyadarsini, Senthil Kumar Janahan, S. Thirumal, P. Bindu, T. Ajith Bosco Raj, and Sankararao Majji
Fruit and Leaf Disease Detection Based on Image Processing and Machine Learning Algorithms . . . 377
S. Naresh Kumar, Sankararao Majji, Tulasi Radhika Patnala, C. B. Jagadeesh, K. Ezhilarasan, and S. John Pimo
A Survey for Determining Patterns in the Severity of COVID Patients Using Machine Learning Algorithm . . . 385
Prachi Raol, Brijesh Vala, and Nitin Kumar Pandya
Relation Extraction Between Entities on Textual News Data . . . 393
Saarthak Mehta, C. Sindhu, and C. Ajay
A Comprehensive Survey on Compact MIMO Antenna Systems for Ultra Wideband Applications . . . 403
V. Baranidharan, S. Subash, R. Harshni, V. Akalya, T. Susmitha, S. Shubhashree, and V. Titiksha
A Review of Blockchain Consensus Algorithm . . . 415
Manas Borse, Parth Shendkar, Yash Undre, Atharva Mahadik, and Rachana Yogesh Patil
A Short Systematic Survey on Precision Agriculture . . . 427
S. Sakthipriya and R. Naresh
Improving System Performance Using Distributed Time Slot Assignment and Scheduling Algorithm in Wireless Mesh Networks . . . 441
K. S. Mathad and M. M. Math
Comparison of PRESENT and KLEIN Ciphers Using Block RAMs of FPGA . . . 453
S. Santhameena, Edwil Winston Fernandes, and Surabhi Puttaraju
Remote Controlled Patrolling Robot . . . 467
Samarla Puneeth Vijay Krishna, Vritika Tuteja, Chatna Sai Hithesh, A. Rahul, and M. Ananda
Novel Modeling of Efficient Data Deduplication for Effective Redundancy Management in Cloud Environment . . . 479
G. Anil Kumar and C. P. Shantala
A Survey on Patients Privacy Protection with Steganography and Visual Encryption . . . 491
Hussein K. Alzubaidy, Dhiah Al-Shammary, and Mohammed Hamzah Abed
Survey on Channel Estimation Schemes for mmWave Massive MIMO Systems – Future Directions and Challenges . . . 505
V. Baranidharan, B. Moulieshwaran, V. Karthick, M. K. Munavvar Hasan, R. Deepak, and A. Venkatesh
Fake News Detection: Fact or Cap . . . 517
C. Sindhu, Sachin Singh, and Govind Kumar
The Future of Hiring Through Artificial Intelligence by Human Resource Managers in India . . . 529
Ankita Arora, Vaibhav Aggarwal, and Adesh Doifode
A Computer Vision Model for Detection of Water Pollutants Using Deep Learning Frameworks . . . 543
Anaya Bodas, Shubhankar Hardikar, Rujuta Sarlashkar, Atharva Joglekar, and Neeta Shirsat
Medical IoT Data Analytics for Post-COVID Patient Monitoring . . . 555
Salka Rahman, Suraiya Parveen, and Shabir Ahmad Sofi
SNAP—A Secured E Exchange Platform . . . 569
Neeta Shirsat, Pradnya Kulkarni, Shubham Balkawade, and Akash Kasbe
An AES-Based Efficient and Valid QR Code for Message Sharing Framework for Steganography . . . 581
Abhinav Agarwal and Sandeep Malik
Categorization of Cardiac Arrhythmia from ECG Waveform by Using Super Vector Regression Method . . . 599
S. T. Sanamdikar, N. M. Karajanagi, K. H. Kowdiki, and S. B. Kamble
Performance Analysis of Classification Algorithm Using Stacking and Ensemble Techniques . . . 615
Praveen M. Dhulavvagol, S. G. Totad, Ashwin Shirodkar, Amulya Hiremath, Apoorva Bansode, and J. Divya
Conceptual Study of Prevalent Methods for Cyber-Attack Prediction . . . 631
S. P. Sharmila and Narendra S. Chaudhari
Performance Analysis of Type-2 Diabetes Mellitus Prediction Using Machine Learning Algorithms: A Survey . . . 643
B. Shamreen Ahamed, Meenakshi Sumeet Arya, and V. Auxilia Osvin Nancy
Dynamic Updating of Signatures for Improving the Performance of IDS . . . 659
Asma Shaikh and Preeti Gupta
M-mode Carotid Artery Image Classification and Risk Analysis Based on Machine Learning and Deep Learning Techniques . . . 675
P. Lakshmi Prabha, A. K. Jayanthy, and Kumar Janardanan
Analyzing a Chess Engine Based on Alpha–Beta Pruning, Enhanced with Iterative Deepening . . . 691
Aayush Parashar, Aayush Kumar Jha, and Manoj Kumar
Intelligent Miniature Autonomous Vehicle . . . 701
Dileep Reddy Bolla, Siddharth Singh, and H. Sarojadevi
Energy and Trust Efficient Cluster Head Selection in Wireless Sensor Networks Under Meta-Heuristic Model . . . 715
Kale Navnath Dattatraya and S Ananthakumaran
Performance Analysis of OTFS Scheme for TDL and CDL 3GPP Channel Models . . . 737
I. S. Akila, S. Uma, and P. S. Poojitha
Design of Smart Super Market Assistance for the Visually Impaired People Using YOLO Algorithm . . . 749
D. Jebakumar Immanuel, P. Poovizhi, F. Margret Sharmila, D. Selvapandian, Aby K. Thomas, and C. K. Shankar
Investigating the Effectiveness of Zero-Rated Website on Students on E-Learning . . . 765
Asemahle Bridget Dyongo and Gardner Mwansa
Role of Machine Learning Algorithms on Alzheimer Disease Prediction . . . 779
V. Krishna Kumar, M. S. Geetha Devasena, G. Gopu, and N. Sivakumaran
Author Index . . . 791
Editors and Contributors
About the Editors
I. Jeena Jacob is a Professor in the Department of Computer Science and Engineering at GITAM University, Bengaluru, India. She actively contributes to the development of the research field by organizing international conferences, workshops and seminars. She has published many articles in refereed journals and has guest-edited an issue of the International Journal of Mobile Learning and Organisation. Her research interests include mobile learning and computing.
Selvanayaki Kolandapalayam Shanmugam is currently a Professor in the Department of Mathematics and Computer Science, Ashland University, Ashland, OH 44805, USA. She has over 15 years of experience lecturing on theoretical subjects and conducting experimental and instructional procedures for laboratory subjects. She has presented numerous research articles at national and international conferences and in journals. Her research interests include image processing, video processing, soft computing techniques, intelligent computing, Web application development, object-oriented programming (C++, Java), scripting languages (VBScript, JavaScript), data science, algorithms, data warehousing and data mining, neural networks, genetic algorithms, software engineering, software project management, software quality assurance, enterprise resource planning, information systems and database management systems.
Robert Bestak obtained his Ph.D. degree in Computer Science from ENST Paris, France (2003), and his M.Sc. degree in Telecommunications from the Czech Technical University in Prague (CTU), Czech Republic (1999). Since 2004, he has been an Assistant Professor at the Department of Telecommunication Engineering, Faculty of Electrical Engineering, CTU. He has participated in several national, EU and third-party research projects. He is the Czech representative in the IFIP TC6 organization and chair of working group TC6 WG6.8.
He annually serves as Steering and Technical Program Committee Member of numerous IEEE/IFIP conferences (networking, WMNC,
NGMAST, etc.), and he is Member of editorial board of several international journals (Computers and Electrical Engineering, Electronic Commerce Research Journal, etc.). His research interests include 5G networks, spectrum management and big data in mobile networks.
Contributors
Abed Mohammed Hamzah College of Computer Science and Information Technology, Al-Qadisiyah University, Al-Dewaniyah, Iraq
Abhishek K. Nitte Meenakshi Institute of Technology, Bengaluru, India
Agarwal Abhinav Department of Computer Science, Singhania University, Jhunjhunu, India
Aggarwal Bharat Kumar Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak, Haryana, India
Aggarwal Vaibhav O.P. Jindal Global University, Sonipat, India
Ahmed Karim Ishtiaque Computer Science, Bahrain Training Institute, Higher Education Council, Ministry of Education, Manama, Bahrain
Ajay C. Department of Computing Technologies, SRM Institute of Science and Technology Kattankulathur, Kattankulathur, India
Akalya V. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India
Akila I. S. Department of ECE, Coimbatore Institute of Technology, Coimbatore, India
Al-Shammary Dhiah College of Computer Science and Information Technology, Al-Qadisiyah University, Al-Dewaniyah, Iraq
Alzubaidy Hussein K. College of Computer Science and Information Technology, Al-Qadisiyah University, Al-Dewaniyah, Iraq
Ananda Chintia Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia
Ananda M. Department of Electronics and Communication (UGC), PES University Electronic City Campus (UGC), Bangalore, India
Ananthakumaran S School of CSE, VIT Bhopal University, Madhya Pradesh, India
Anil Kumar G. Department of Computer Science and Engineering, Channabasaveshwara Institute of Technology Gubbi, Tumkur, India; Visvesvaraya Technological University, Belagavi, Karnataka, India
Annamalai S. School of Computing Science and Engineering, Galgotias University, Greater Noida, India
Anuradha T. Department of Electrical and Electronics Engineering, KCG College of Technology, Chennai, India
Arora Ankita Apeejay School of Management, Delhi, India
Arthi B. Department of Computer Science and Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Kanchipuram, Chennai, Tamil Nadu, India
Arulkumar V. School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamilnadu, India
Aruna M. Department of Computer Science and Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Kanchipuram, Chennai, Tamil Nadu, India
Arya Meenakshi Sumeet SRM Institute of Science and Technology, Vadapalani Campus, Vadapalani, TN, India
Auxilia Osvin Nancy V. SRM Institute of Science and Technology, Vadapalani Campus, Vadapalani, TN, India
Balamurugan M. Department of Computer Science and Engineering, School of Engineering and Technology, Christ (Deemed to Be University), Bangalore, India
Balkawade Shubham Department of Information Technology, PVG’s College of Engineering and Technology, Pune, India
Bansal Bijender Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak, Haryana, India
Bansode Apoorva School of Computer Science and Engineering, KLE Technological University, Hubballi, India
Barak Dheer Dhwaj Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak, Haryana, India
Baranidharan V. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India
Bhaidasna Zubin Parul Institute of Engineering and Technology, Vadodara, India
Bhandage Nandan School of Computer Science and Engineering, KLE Technological University, Hubballi, India
Bhuvana J. School of Computer Science and IT, Jain (Deemed to Be) University, Bangalore, India
Bilagi Pradyumna School of Computer Science and Engineering, KLE Technological University, Hubballi, India
Bindu P. Department of Mathematics, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India
Bodas Anaya Department of Information Technology, PVG’s College of Engineering and Technology, Pune, India
Bolla Dileep Reddy Department of CSE, Nitte Meenakshi Institute of Technology, Yehahanka, Bangalore, India
Borse Manas Pimpri Chinchwad College of Engineering, Pune, India
Briskilal J. SRM Institute of Science and Technology, Chennai, Tamil Nadu, India
Bunglawala Warisahmed Parul Institute of Engineering and Technology, Vadodara, India
Charlyn Pushpa Latha G. Department of IT, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India
Chaudhari Narendra S. Department of Computer Science and Engineering, Indian Institute of Technology, Indore, India
Chauhan Ankit Parul Institute of Engineering and Technology, Vadodara, India
Dattatraya Kale Navnath Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India
Debnath Narayan C. Department of Software Engineering, Eastern International University, Binh Duong, Vietnam
Deepak R. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India
Dhulavvagol Praveen M. School of Computer Science and Engineering, KLE Technological University, Hubballi, India
Divya J. School of Computer Science and Engineering, KLE Technological University, Hubballi, India
Doifode Adesh Institute for Future Education, Entrepreneurship and Leadership, Pune, India
Doucet Antoine Laboratoire L3i, Université de La Rochelle, La Rochelle, France
Dwi Randy Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia
Dyongo Asemahle Bridget Walter Sisulu University, East London, South Africa
Ezhilarasan K. Department of ECE, CMR University, Bangalore, Karnataka, India
Fernandes Edwil Winston Department of Electronics and Communication Engineering, PES University, Bangalore, India
Gandhi Ankita Parul Institute of Engineering and Technology, Vadodara, India
Garetta Agelius Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia
Geetha Devasena M. S. Department of Computer Science and Engineering, Sri Ramakrishna Engineering College, Coimbatore, India
Geetha Devi A. Department of ECE, PVP Siddhartha Institute of Technology, Vijayawada, Andhra Pradesh, India
Gnanadeep M. V. Nitte Meenakshi Institute of Technology, Bengaluru, India
Gopu G. Department of Electronics and Communication Engineering, Sri Ramakrishna Engineering College, Coimbatore, India
Goyal Deepak Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak, Haryana, India
Gupta Ankur Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak, Haryana, India
Gupta Pankaj Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak, Haryana, India
Gupta Preeti Amity University Maharashtra, Mumbai, Maharashtra, India
Guruprakash K. S. Department of Computer Science and Engineering, K. Ramakrishnan College of Engineering, Trichy, India
Hardikar Shubhankar Department of Information Technology, PVG’s College of Engineering and Technology, Pune, India
Haripriya V. School of Computer Science and IT, Jain (Deemed to Be) University, Bangalore, India
Harshni R. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India
Hartato Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia
Hasan M. K. Munavvar Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India
Hegde Anantashayana S. Nitte Meenakshi Institute of Technology, Bengaluru, India
Hiremath Amulya School of Computer Science and Engineering, KLE Technological University, Hubballi, India
Hithesh Chatna Sai Department of Electronics and Communication (UGC), PES University Electronic City Campus (UGC), Bangalore, India
Husein Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia
Jagadeesh C. B. New Horizon College of Engineering, Bangalore, India Janahan Senthil Kumar Department of CSE, Lovely Professional University, Phagwara, Punjab, India Janardanan Kumar Department of General Medicine, SRM Medical College Hospital and Research Centre, Chennai, India Jaswanth M. H. Nitte Meenakshi Institute of Technology, Bengaluru, India Jayani Jeconiah Yohanes Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia Jayanthy A. K. Department of Biomedical Engineering, SRM Institute of Science and Technology, Chennai, Tamil Nadu, India Jebakumar Immanuel D. Department of Computer Science and Engineering, SNS College of Engineering, Coimbatore, TamilNadu, India Jenifer Grace Giftlin C. Department of Information Technology, Sri Krishna College of Engineering and Technology, Coimbatore, TN, India Jha Aayush Kumar Delhi Technological University, Delhi, India Joglekar Atharva Department of Information Technology, PVG’s College of Engineering and Technology, Pune, India Kalpana G. Rajalakshmi Institute of Technology, Chennai, India Kamble S. B. Electronics Department, PDEA’s College of Engineering, Manjari, Pune, India Karajanagi N. M. Instrumentation Department, Government College of Engineering and Research, Awasari Khurd, Pune, India Karthick V. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Kasbe Akash Department of Information Technology, PVG’s College of Engineering and Technology, Pune, India Kathiravan M. Department of Computer Science and Engineering, Hindustan Institute of Technology and Science, Padur, Kelambaakkam, Chengalpattu, India Khan Imran Computer Science, Bahrain Training Institute, Higher Education Council, Ministry of Education, Manama, Bahrain Kowdiki K. H. Instrumentation Department, Government College of Engineering and Research, Awasari Khurd, Pune, India Krishna Kumar V. 
Department of Computer Science and Engineering, Sri Ramakrishna Engineering College, Coimbatore, India
Krishna Samarla Puneeth Vijay Department of Electronics and Communication (UGC), PES University Electronic City Campus (UGC), Bangalore, India Krishnamoorthy N. MCA Department, SRM Institute of Science and Technology, Chennai, India Kukreti Sanjeev Department of Department of Computer Science and Engineering, Graphic Era University, Dehradun, India Kulkarni Pradnya Department of Information Technology, PVG’s College of Engineering and Technology, Pune, India Kumar Govind Department of Computing Technologies, School of Computing, SRM Institute of Science and Technology, Kattankulathur, India Kumar Manoj Delhi Technological University, Delhi, India Kumaresan M. School of Computer Science and Engineering, Jain (Deemed to Be) University, Bangalore, India Kurniadi Velia Nayelita Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia Leander Amadeus Darren Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia Mahadik Atharva Pimpri Chinchwad College of Engineering, Pune, India Mahesha N. Department of Civil Engineering, New Horizon College of Engineering, Bangalore, India Majji Sankararao Department of Electronics and Communication Engineering, GRIET, Hyderabad, India Malarselvi G. SRM Institute of Science and Technology, Chennai, India Malik Sandeep Department of Computer Science, Oriental University, Indore, India Mangla Aditya SRM Institute of Science and Technology, Chennai, Tamil Nadu, India Manh Phuc Nguyen Department of Software Engineering, Eastern International University, Binh Duong, Vietnam Margarita Vania Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia Margret Sharmila F. Department of Computer Science and Engineering, SNS College of Engineering, Coimbatore, TamilNadu, India Math M. M. KLS Gogte Institute of Technology, Belagavi, Karnataka, India Mathad K. S. 
KLS Gogte Institute of Technology, Belagavi, Karnataka, India
Matta Priya Department of Department of Computer Science and Engineering, Graphic Era University, Dehradun, India Mazumder Debarshi Department of Software Engineering, Eastern International University, Binh Duong, Vietnam Mehta Saarthak Department of Computing Technologies, SRM Institute of Science and Technology Kattankulathur, Kattankulathur, India Minh Ngoc Ha Department of Software Engineering, Eastern International University, Binh Duong, Vietnam Moulieshwaran B. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Mwansa Gardner Walter Sisulu University, East London, South Africa Naresh Kumar S. School of Computer Science and Artificial Intelligence, SR University, Waragal, Telangana, India Naresh R. Department of Computer Science and Engineering, SRM Institute of Science and Technology(SRMIST), Kattankulathur, Chennai, India Nathanael Frandico Joan Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia Naveeth Babu C. Department of Computer Science, Kristu Jayanti College, (Autonomous), Bangaluru, Karnataka, India Oktavianus Benaya Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia Pandian A. SRM Institute of Science and Technology, Chennai, India Pandiselvi C. 
Department of Computer Science, Cardamom Planters’ Association College, Bodinayakanur, India Pandya Nitin Kumar Parul Institute of Engineering and Technology, Vadodara, India Parashar Aayush Delhi Technological University, Delhi, India Parmar Darshna Parul Institute of Engineering and Technology, Vadodara, India Parveen Suraiya Department of Computer Science and Engineerin, Jamia Hamdard University New Delhi, Delhi, India Patel Archana Department of Software Engineering, Eastern International University, Binh Duong, Vietnam Patel Astha Parul Institute of Engineering and Technology, Vadodara, India Patel Disha Parul Institute of Engineering and Technology, Vadodara, India
Patel Ibrahim Department of ECE, B V Raju Institute of Technology, Narsapur, Medak, India Patel Syed Imran Computer Science, Bahrain Training Institute, Higher Education Council, Ministry of Education, Manama, Bahrain Patil Rachana Yogesh Pimpri Chinchwad College of Engineering, Pune, India Patnala Tulasi Radhika Department of Electronics and Communication Engineering, GITAM University, Hyderabad, India Pimo S. John St. Xavier’s Catholic College of Engineering, Nagercoil, India Poojitha P. S. Department of ECE, Coimbatore Institute of Technology, Coimbatore, India Poovizhi P. Department of Information Technology, Dr. N.G.P. Institute of Technology, Coimbatore, TamilNadu, India Prabha P. Lakshmi Department of Biomedical Engineering, SRM Institute of Science and Technology, Chennai, Tamil Nadu, India Pramudji Agung A. Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia Priyadarsini K. Department of Data Science and Business Systems, School of Computing, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chennai, India Puttaraju Surabhi Department of Electronics and Communication Engineering, PES University, Bangalore, India Rahman Salka Department of Computer Science and Engineerin, Jamia Hamdard University New Delhi, Delhi, India Rahul A. Department of Electronics and Communication (UGC), PES University Electronic City Campus (UGC), Bangalore, India Rahul C. J. Nitte Meenakshi Institute of Technology, Bengaluru, India Raj T. Ajith Bosco Department of Electronics and Communication Engineering, PSN College of Engineering and Technology, Tirunelveli, Tamil Nadu, India Raol Prachi Parul Institute of Engineering and Technology, Vadodara, India Rufai Ahmad Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia Sai R. 
Dilip Nitte Meenakshi Institute of Technology, Bengaluru, India Sainath Guru Nitte Meenakshi Institute of Technology, Bengaluru, India Sakthipriya S. Department of Computer Science and Engineering, SRM Institute of Science and Technology(SRMIST), Kattankulathur, Chennai, India
Sanamdikar S. T. Instrumentation Department, PDEA’s College of Engineering, Manjari, Pune, India Santhameena S. Department of Electronics and Communication Engineering, PES University, Bangalore, India Sarlashkar Rujuta Department of Information Technology, PVG’s College of Engineering and Technology, Pune, India Sarojadevi H. Department of CSE, Nitte Meenakshi Institute of Technology, Yehahanka, Bangalore, India Sathya S. Department of Information Technology, School of Computing Sciences, Vels Institute of Science, Technology and Advanced Studies, (VISTAS), Chennai, Tamilnadu, India Satpathy Rabinarayan CSE (FET), Sri Sri University, Cuttack, Odisha, India Selvan C. Department of Computer Science & Engineering, New Horizon College of Engineering, Bangalore, Karnataka, India Selvapandian D. Department of Computer Science and Engineering, Karpagam Academy of Higher Education, Coimbatore, TamilNadu, India Senthil Kumar D. Sri Sairam Engineering College, Chennai, India Senthil Kumar S. Department of Computer Science, Sri Ramakrishna Mission Vidyalaya College of Arts and Science (Autonomous), Coimbatore, India Shah Jaimeel Parul Institute of Engineering and Technology, Vadodara, India Shaikh Asma Amity University Maharashtra, Mumbai, Maharashtra, India; Marathwada Mitra Mandal College of Engineering, Pune, India Shamreen Ahamed B. SRM Institute of Science and Technology, Vadapalani Campus, Vadapalani, TN, India Shankar C. K. Department of Electrical and Electronics Engineering, Sri Ramakrishna Polytechnic College, Coimbatore, TamilNadu, India Shantala C. P. Department of Computer Science and Engineering, Channabasaveshwara Institute of Technology Gubbi, Tumkur, India; Visvesvaraya Technological University, Belagavi, Karnataka, India Sharma Sonal Uttaranchal University, Dehradun, India Sharma Varun N. Nitte Meenakshi Institute of Technology, Bengaluru, India Sharmila S. P. 
Department of Information Science and Engineering, Siddaganga Institute of Technology, Tumakuru, India Shendkar Parth Pimpri Chinchwad College of Engineering, Pune, India
Shirodkar Ashwin School of Computer Science and Engineering, KLE Technological University, Hubballi, India Shirsat Neeta Department of Information Technology, PVG’s College of Engineering and Technology, Pune, India Shubhashree S. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Sindhu C. Department of Computing Technologies, School of Computing, SRM Institute of Science and Technology, Kattankulathur, India Singh Sachin Department of Computing Technologies, School of Computing, SRM Institute of Science and Technology, Kattankulathur, India Singh Siddharth Department of CSE, Nitte Meenakshi Institute of Technology, Yehahanka, Bangalore, India Sivakumar S. Cardamom Planters’ Association College, Bodinayakanur, India Sivakumaran N. Instrumentation and Control Engineering, NIT, Trichy, India Sofi Shabir Ahmad Department of Information Technology, National Institute of Technology Srinagar, Jammu and Kashmir, India Sreedhar Jadapalli EEE Department, Vignana Bharathi Institute of Technology, Hyderabad, India Sridhar S. Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India Srikanteswara Ramya Nitte Meenakshi Institute of Technology, Bengaluru, India Subash S. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Suganya R. Research Scholar, Department of Information Technology, School of Computing Sciences, Vels Institute of Science, Technology and Advanced Studies (VISTAS), Chennai, Tamilnadu, India Susmitha T. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Thirumal S. Department of Computer Science and Engineering, Vels Institute of Science Technology and Advanced Studies, Chennai, India Thomas Aby K. 
Department of Electronics and Communication Engineering, Alliance College of Engineering and Design, Alliance University, Bangalore, India Titiksha V. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Totad S. G. School of Computer Science and Engineering, KLE Technological University, Hubballi, India
Tuteja Vritika Department of Electronics and Communication (UGC), PES University Electronic City Campus (UGC), Bangalore, India Uma S. Department of ECE, Coimbatore Institute of Technology, Coimbatore, India Undre Yash Pimpri Chinchwad College of Engineering, Pune, India Vaghasia Madhuri Parul Institute of Engineering and Technology, Vadodara, India Vala Brijesh Parul Institute of Engineering and Technology, Vadodara, India Venkatesh A. Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India Vincenzo Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia Warnars Diana Teresia Spits Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia Warnars Harco Leslie Hendric Spits Computer Science Department, Binus Graduate Program-Doctor of Computer Science, Bina Nusantara University, Jakarta, Indonesia Yoshua Riandy Juan Albert Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta, Indonesia
An Approach for Summarizing Text Using Sentence Scoring with Key Optimizer G. Malarselvi and A. Pandian
Abstract There is an enormous amount of textual content rolled out over the web, which makes efficient automatic text summarization necessary. Specifically, extracting multi-keywords from the textual content produces a summary from the source document by reducing isolated text. In recent research, these summarization approaches and the problems related to the process are addressed with optimization approaches. Most existing investigations concentrate on single-objective solutions; however, multi-objective approaches provide solutions to various issues during summarization. This work adopts a Keyword-based Elephant Yard Optimization (KEY) approach that improves the summarization process. In KEY, the analysis of elephant movement is performed based on the group (cluster) of elephants. The significance of the movement relies on the priority given to the head. Accordingly, the textual contents are optimized based on the clustering priority. The analysis is performed over the available online datasets for text summarization to provide multiple solutions by handling multi-objective problems. Predominant metrics like the ROUGE-1 and ROUGE-2 scores and the kappa coefficient are evaluated and attain superior outcomes. Keywords Textual content · Text summarization · Keyword-based elephant yard optimization · Multi-objective problems · Multi-keywords
1 Introduction The growth of textual information has accelerated due to the expansion of internet applications and smartphones in our everyday routine. The explosion of generated data makes it impossible for humans to summarize it, and machines also struggle to manage the huge data from various applications, technologies, and firms [1]. The G. Malarselvi · A. Pandian (B) SRM Institute of Science and Technology, Chennai, India e-mail: [email protected] G. Malarselvi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_1
most challenging task is to analyze the huge amount of unstructured data, which is nearly impossible to manage. These massive information documents require automated document summarization to generate more concise documents while maintaining the important data [2]. Hence, the summarization technique is important for acquiring awareness from data and making decisions, and in today’s world, text summarization is one of the best methods for this purpose. Many social media applications like Facebook and Twitter are now used for marketing, political, and personal purposes. Today, social media is used in political campaigns to reach supporters [3]. Extracted textual information plays an essential role in outstanding marketing and political scenarios, and the real-time applications of automatic text summarization for better marketing or political campaigns are unrestricted. Summarization is used to gain observations from data and to ease decision making. Text summarization is also used to obtain shorter illustrations of search results over search engines and in keyword-based subscriptions [4]. Based on user navigation between various contents [5], text summarization of social media facilitates users’ trust effectively. Different categorization methods are available for the summarization of documents. Automatic summarization models are classified into two types: (i) abstractive and (ii) extractive. An abstractive technique understands the text deeply to exhibit it in a shorter alternate form. Contrarily, an extractive technique aims to choose the most substantial parts. Although it is difficult for a machine to generate an abstractive summary that is understandable by humans, extractive techniques are mostly utilized in practice. In the proposed summarization task, the suggested technique eliminates the requirement of feature engineering. 
Since machine learning algorithms have feature extraction as their most important phase, they mostly concentrate on sentence selection. For the summarization process, a few efforts have been taken recently to identify an optimum feature set. This technique examines the importance of every feature as a binary problem of whether to include each feature in the feature set or not. The practical outcome shows that KEY can capture the basic design of features. A group of similar samples is found as the outcome, and features are weighted locally in every group. These feature weights explain the significance of every feature in sub-spaces (clusters). In the summarization issue, these spaces correspond to being or not being in the summary. The major contributions can be summarized as follows: • The KEY optimization acquires general performance. To illustrate this, we give proof through experiments where the model is trained, and the discrimination of the significance of all features for diverse classes on different datasets is reported. • This work executes human evaluation experiments to evaluate summaries from the human point of view, which provides ground truth. This experiment shows that the KEY approach’s summaries are less redundant and more descriptive than summaries generated by competing techniques. • Besides being a state-of-the-art performer, the KEY approach has the added benefit of being more understandable. In the optimization process, the separated terms
permit tracing of the output summary. This makes it effective to describe the decisions produced by the system to the end-user.
2 Related Works To get the most needed sentences, most studies compute a score for each sentence [6]. There are two steps to evaluate the score for every sentence: (i) sentence scoring [7] and (ii) word scoring [8]. Based on statistical principles [9], these steps form summary generation models. The features should evaluate the score for each word, and the sentence score is the summation of the scores of each word [10]. Features like word frequency [11] and TF [12] are used to evaluate the significance of words. Sentence position [13] and sentence length [14] are two essential elements for computing a sentence’s score. Nengroo et al. [15] utilize the similarity of title and sentence, sentence length, and sentence position to enhance the correctness of summary sentences. Mikolov et al. [16] suggested three methods for hybrid and single-document text summarization: (a) a domain knowledge approach, (b) a statistical features approach, and (c) genetic algorithms. The domain knowledge approach organizes the knowledge base domain corpus manually and summarizes the text based on the domain keywords. The statistical features method uses heuristic methods to concentrate on ranking and segment extraction, which allocates weighted scores to text segments. The genetic algorithm-based method considers the automated text summary work as a classification issue. Categorization utilizes GA’s ML approach to generate a cohesive, readable, and better summary identical to the document title. The description of documents depends on the set of attributes. Single-text summary models are presented in this model [2]. A combination of three news features (person, time, and location) and a new text feature selection method are suggested with the news text features. A combination of fuzzy logic systems and the genetic algorithm is established. The genetic algorithm weights the text features. 
These features are tuned twice to obtain more accurate summary sentences using the fuzzy logic system to produce a high-quality summary [17]. Table 1 depicts the comparison of various prevailing text summarization approaches.
3 Methodology The anticipated model is composed of three predominant stages to perform the summarization process. The redundant contents from the available documents are filtered, and the successive stage is the optimization process. The weights of the available document features are optimized to highlight the important sentences in the provided document. Finally, the text summary is presented by summarizing the basic sentences based on connectivity and similarity. Consider a document set represented as doc =
Table 1 Comparison of various text summarization approaches

Topic | Problems | Techniques | Methods
Extractive | Semantic; Optimization; Hybrid; Clustering; Topic modeling | Fuzzy-based statistics; Machine learning; Machine learning; Graphical; Machine learning; Topic | Fuzzy hypergraph; LSA and NMF; LSA + ANN; DL; CGS; DQN; LDA
Abstractive | Extraction; Ambiguity; Clustering; Sentence scoring | Machine learning | NAT-SUM; TF-IDF; NeuFuse; Markov
Unsupervised learning | Optimization; Noise; Ambiguity; Semantic | Machine learning | Bi-RNN; Round robin and POS; FROM; UAE and Deep AE
Single document | Similarity; Redundancy | Graphical; Machine learning | NLP parser; AE; NN
Multi-document | Extraction; Redundancy; Hybrid; Clustering | Statistics; Fuzzy-based; Machine learning; Machine learning | TEDU + COMP; ANFISA; ExpQuery; RA-MDS
Optimization | Optimization; Similarity; Ambiguity | Machine learning; Machine learning; Machine learning | ABC and MOABC; Shark smell optimization; MMI diversity
Real-time | Similarity; Redundancy; Keywords; Feature and real-time; Semantic clustering | Fuzzy-based; Machine learning; Machine learning; Statistics; Machine learning | Fuzzy formal concept; IncreSTS; DTM; RSE; IS, NS; IncreSTS
Domain | Extraction; Similarity; Clustering; Keyword; Feature; Semantic analysis; Redundancy; Latent; Selection | Fuzzy-based; Statistics; Fuzzy-based; Machine learning; Statistics; Statistics; Machine learning; Statistics; Statistics | Fuzzy AHP; WordNet + common sub-summer; CUBS; Term frequency; Sentence scoring; LSA; SVD; MWI-sum
$\{doc_1, doc_2, \ldots, doc_n\}$ with a sentence set $doc_i = \{sen_1, sen_2, \ldots, sen_{|doc_i|}\}$, where $|doc_i|$ specifies the total number of sentences in the document. The available document is provided to pre-processing steps like stop word removal, stemming, and sentence separation. As an outcome, each sentence is transformed into a word set $s_j = \{wd_1, wd_2, \ldots, wd_{|sen_j|}\}$, where $|sen_j|$ specifies the complete word count of the sentence. The redundancies over the provided sentence set are removed, and the word similarity
Fig. 1 Block diagram of the KEY model
is measured with the Keyword-based Elephant Yard Optimization (KEY). Here, the less similar sentences from each document are taken, and the formed document $Doc = \{sen_1, sen_2, \ldots, sen_L\}$ contains several sentences, where $L \leq \sum_{i=1}^{n} |doc_i|$. The summarized text is provided with sentences of length $l$, i.e., $C = \frac{l}{|doc|} \times 100$, where $C$ specifies the compression value. The summarized text is extracted from the weighted features, and the optimal weight is measured using the KEY concept. The flow diagram of the anticipated model is shown in Fig. 1.
3.1 Data Acquisition The text summarization process begins with the selection of documents from online resources. The PubMed medical dataset is considered, providing coherent text segments beneficial for human comprehension and information retrieval. PubMed is an open resource for searching the biomedical literature, and word vectors (in word2vec binary format) trained on all PubMed abstracts are used.
3.2 Essential Pre-processing Steps

To initialize the process, the input is taken from the dataset. The essential pre-processing steps are sentence separation, stemming, and stop word removal.

(i) Sentence separation: The provided document is partitioned into sentences, and it assists in predicting the initialization of sentences.

(ii) Stemming: Here, the process maps the provided words to common root words; this work adopts the Porter stemming algorithm. The process shows some deviation, i.e., a negative or insignificant influence on the system related to semantic analysis. It is an optional step, performed with or without stems.

(iii) Stop word removal: Here, the repetitive words in the document are removed, i.e., common words with no important information, such as prepositions, question words, helping verbs, conjunctions, articles, and so on. The text content is limited to the essential word summary.
3.3 Similarity Computation with Weighted Features

The summaries from the provided input dataset are merged, where the redundant content over the document is filtered via maximal content coverage. The summaries are examined using a semantic similarity measure among the sentences, which is expressed as in Eq. (1):

$$sim(s_j, s_{j'}) = \alpha \cdot sim_{\text{weighted measure}}(s_j, s_{j'}) + (1 - \alpha) \cdot sim_{\text{normalized weighted measure}}(s_j, s_{j'}) \quad (1)$$
The weighted parameter $\alpha \in [0, 1]$ specifies the relationship among the sentence information from the weighted feature measure, and the weighted measure between sentences is provided as in Eq. (2):

$$sim_{\text{weighted measure}}(s_j, s_{j'}) = \frac{\sum_{w_p \in s_j} \sum_{w_{p'} \in s_{j'}} sim_{\text{weighted measure}}(w_p, w_{p'})}{|s_j| \cdot |s_{j'}|} \quad (2)$$
The sentence similarity is determined with the weighted measure, and the outcomes specify the summary execution. The objective is to reduce redundancy and maximize coverage. The summary should hold the essential sentence information to maximize the content coverage; the superior coverage measure reflects the closeness of the summary sentences to the actual content of the gathered document, and thus it needs to be maximized. The summary has to avoid sentence replication from the document set to reduce redundancy. A pair of sentences is measured as non-redundant when the similarity among the two sentences stays below a provided threshold $T \in (0, 1)$. At last, the condition on non-redundant sentences is provided as in Eq. (3):

$$sim(s_j, s_{j'}) \leq T \quad (3)$$
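Under the notation above, Eqs. (1) to (3) can be sketched as follows; the word-level similarity here is a hypothetical exact-match stand-in for the paper's weighted word measure, and the normalized measure defaulting to the weighted one is also a simplifying assumption:

```python
# Word-level similarity is a hypothetical exact-match stand-in (1.0 if the
# words match, else 0.0) for the paper's weighted word measure.
def word_sim(wp, wq):
    return 1.0 if wp == wq else 0.0

def weighted_measure(sj, sk):
    # Eq. (2): sum of pairwise word similarities, normalised by |s_j| * |s_k|.
    if not sj or not sk:
        return 0.0
    total = sum(word_sim(wp, wq) for wp in sj for wq in sk)
    return total / (len(sj) * len(sk))

def sentence_sim(sj, sk, alpha=0.5, normalized_measure=weighted_measure):
    # Eq. (1): convex combination of the weighted and normalised measures;
    # the normalised measure defaults to the weighted one here (assumption).
    return alpha * weighted_measure(sj, sk) + (1 - alpha) * normalized_measure(sj, sk)

def is_non_redundant(sj, sk, threshold=0.3):
    # Eq. (3): a sentence pair is non-redundant when sim(s_j, s_k) <= T.
    return sentence_sim(sj, sk) <= threshold
```

For instance, `weighted_measure(["a", "b"], ["a", "c"])` yields 0.25, since one of the four word pairs matches.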
3.4 Feature Weight Scoring

Here, the features chosen for text summarization are sentence similarity, sentence position, sentence length, title words, numerical data, proper nouns, frequent words, and sentence significance. Consider that $T_{fx}$ specifies the $x$th text feature, $s_j$ a sentence of the sentence set, and $ST_{fx}$ the text feature score for every sentence. The text feature characteristics are discussed below:

(i) Sentence position ($T_{f1}$): It is depicted as an essential feature for phrase extraction. It is expressed as in Eq. (4):

$$ST_{f1}(s_j) = \left( \frac{|doc_i| - j}{|doc_i|} \right)^2 \quad (4)$$

(ii) Sentence length ($T_{f2}$): It is considered to eliminate longer or shorter sentences. It is expressed as in Eq. (5), and the average sentence length is expressed as in Eq. (6):

$$ST_{f2}(s_j) = 1 - \frac{AL(S) - |s_j|}{\max_j |s_j|} \quad (5)$$

$$AL(S) = \frac{\min_j |s_j| + \max_j |s_j|}{2} \quad (6)$$

(iii) Title words ($T_{f3}$): These are the essential phrases for the resulting summary. Sentences composed of title words represent the essential content of the document; they form the baseline of the document and are given a high score. It is expressed as in Eq. (7):

$$ST_{f3}(s_j) = \frac{\sum_{w_p \in s_j} \sum_{w_{p'} \in tw} |w_p \cap w_{p'}|^2}{|s_j| \cdot |tw|} \quad (7)$$

Here, $|tw|$ specifies the total number of title words.

(iv) Numerical data ($T_{f4}$): Numerical data specifies important information for the summary, and it is expressed as in Eq. (8):

$$ST_{f4}(s_j) = \frac{\text{number of available numerical data in } s_j}{|s_j|} \quad (8)$$

(v) Proper noun ($T_{f5}$): A sentence that includes an organization, place, or person is given more weight, and it is evaluated as in Eq. (9):

$$ST_{f5}(s_j) = \frac{\text{number of proper nouns in } s_j}{|s_j|} \quad (9)$$
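A sketch of the feature scores, under one plausible reading of Eq. (4); the function names, the 0-based sentence index j, and the digit-only test for numerical tokens are illustrative assumptions:

```python
import re

# Illustrative assumptions: sentences are word lists, j is a 0-based index,
# and Eq. (4) is read as a squared relative-position score.
def position_score(j, n_sentences):
    # Eq. (4): earlier sentences score higher.
    return ((n_sentences - j) / n_sentences) ** 2

def length_score(sentence_lengths, j):
    # Eqs. (5)-(6): penalise sentences far from the average length AL(S).
    avg = (min(sentence_lengths) + max(sentence_lengths)) / 2
    return 1 - abs(avg - sentence_lengths[j]) / max(sentence_lengths)

def title_score(sentence, title_words):
    # Eq. (7): squared overlap with title words, normalised by |s_j| * |tw|.
    overlap = len(set(sentence) & set(title_words))
    return overlap ** 2 / (len(sentence) * len(title_words))

def numeric_score(sentence):
    # Eq. (8): fraction of numerical tokens in the sentence.
    return sum(1 for w in sentence if re.fullmatch(r"\d+", w)) / len(sentence)
```

Each score lies in [0, 1], so the per-sentence feature vector can be weighted directly by the optimizer in the next stage.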
3.5 Sentence Scoring with KEY Optimizer

Generally, elephants are measured as extroverted creatures with a huge population (multi-document), and it is partitioned into various clans. Every individual lives within the clan under the leadership of a female matriarch, and when individuals reach adulthood, some number of males leave the clan they belong to. The search agents' behavior is characterized by two operators, known as the updation and separation operators.

(i) Updation operator: The position of the individual elephant $j$ in clan $c_i$ is expressed as in Eq. (10):

$$x_{new, c_i, j} = x_{c_i, j} + \alpha \cdot (x_{best, c_i} - x_{c_i, j}) \cdot r \quad (10)$$

Here, $x_{new, c_i, j}$ and $x_{c_i, j}$ specify the new and original positions of every individual $j$ in clan $c_i$, and $\alpha$ and $r$ are both random numbers within $[0, 1]$.

(ii) Separation operator: It is provided to update the worst individual in the clan, and it is expressed as in Eq. (11):

$$x_{worst, c_i} = x_{min} + (x_{max} - x_{min} + 1) \cdot r \quad (11)$$

Here, $x_{max}$ and $x_{min}$ specify the upper and lower bounds of every clan's position, and $r$ specifies a random number in $[0, 1]$. The novelty of the work relies on the foraging strategies, with initialization and agent updation performed sequentially over iterations to predict the superior (best) solution.

(iii) Chain-based foraging strategy: This chain process is performed sequentially in an up-to-down manner. The agent position is updated using the optimal solution and the solutions attained during prior iterations. The weighted coefficient $\alpha$ is expressed as in Eq. (12):

$$\alpha = 2 \cdot r \cdot \sqrt{|\log(r)|} \quad (12)$$

(iv) Cyclone-based foraging strategy: It performs swim forwarding in a spiral pattern; every individual approaches its prey and adjusts its current position using the prior best agents while maintaining swim forwarding. The weighted coefficient $\beta$ is expressed as in Eq. (13):

$$\beta = 2 e^{r_1 \frac{T - t + 1}{T}} \cdot \sin(2\pi r_1) \quad (13)$$

Here, $r_1$ specifies a random number $r_1 \in [0, 1]$, and $T$ specifies the maximal number of iterations; the position is updated by searching a random position to improve exploration abilities.

(v) Flip-flop foraging strategy: This strategy considers the food as the central point (hub). Every agent performs a flipping motion around the central point.
An Approach for Summarizing Text Using Sentence …
The agent updates its position toward the present optimal solution to improve exploitation. Here, S specifies the flipping factor and defines the range of the flip-flop motion; the parameter is set to one, and r2 and r3 specify random numbers in [0, 1]. Algorithm 1 summarizes the Keyword-based optimizer.

Algorithm 1 KEY optimizer
Input: population size N, maximal number of generations t_max, upper and lower bounds x_max and x_min;
Output: best optimal solution x_best;
1. Initialize parameters S, α, β;
2. Compute the fitness of every agent based on the fitness value;
3. while t < t_max do
4.   if rand < 0.5 and t/t_max < rand, then
5.     update x_i; // as in Eq. (18)
6.   else
7.     update x_i; // as in Eq. (19)
8.   end if
9.   compute the fitness value based on the updation strategy;
10.  for i = 1 : N do
11.    update the position;
12.    if f(x_i(t + 1)) < f(x_best), then
13.      substitute x_best with x_i(t + 1);
14.    end if
15.  end for
16.  compute the fitness value based on the position updation;
17.  sort the new population based on every updation;
18.  t = t + 1;
19. end while
20. return x_best;
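Algorithm 1 can be sketched in Python. The sphere fitness, the MRFO-style chain/cyclone position updates standing in for Eqs. (18)–(19), and all parameter settings below are illustrative assumptions, not the paper's exact configuration:

```python
import math
import random

def key_optimizer(fitness, dim=5, n=20, t_max=50, x_min=-1.0, x_max=1.0, seed=0):
    """Minimal sketch of the KEY optimizer loop (Algorithm 1)."""
    rng = random.Random(seed)
    # initialize the population inside the bounds
    pop = [[rng.uniform(x_min, x_max) for _ in range(dim)] for _ in range(n)]
    best = list(min(pop, key=fitness))
    for t in range(1, t_max + 1):
        for i in range(n):
            r = max(rng.random(), 1e-12)  # guard against log(0)
            r1 = rng.random()
            if rng.random() < 0.5:
                # chain-style update, weight from Eq. (12): a = 2*r*sqrt(|log r|)
                a = 2 * r * math.sqrt(abs(math.log(r)))
                pop[i] = [x + a * (b - x) * r for x, b in zip(pop[i], best)]
            else:
                # cyclone-style update, weight from Eq. (13)
                beta = 2 * math.exp(r1 * (t_max - t + 1) / t_max) * math.sin(2 * math.pi * r1)
                pop[i] = [x + beta * (b - x) * r1 for x, b in zip(pop[i], best)]
            # clamp back into the clan bounds
            pop[i] = [min(max(x, x_min), x_max) for x in pop[i]]
            if fitness(pop[i]) < fitness(best):
                best = list(pop[i])
    return best

# usage: minimize the sphere function
sphere = lambda x: sum(v * v for v in x)
best = key_optimizer(sphere)
```

With the seeded RNG the run is deterministic; x_best is only ever replaced by a strictly better agent, so the returned fitness can only improve on the best initial sample.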
3.6 Summarization Process

Finally, text summarization is performed to extract the summarized outcomes. It provides optimal outcomes with the weighted coefficient β. The weighting of the summarized text and the sentence score are based on the individual text summary, expressed as in Eq. (14):
G. Malarselvi and A. Pandian

Score(s_j) = Σ_{q=0, x=1}^{dim} a_{qx} ∗ SummarizedText(s_j)    (14)
At last, the highest-scored sentences are extracted from the provided document and considered for the summarization process. This is performed based on the generated similarities, and each selected sentence possesses a significant position relative to the main sentence: the first extracted sentence corresponds to the first sentence, the next to the successive sentence, and so on.
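The extraction step just described (keep the highest-scoring sentences, then restore document order) can be sketched as follows; the sentences, scores, and summary length are made-up illustrations, with `scores` standing in for Eq. (14):

```python
def summarize(sentences, scores, k=3):
    """Pick the k highest-scoring sentences, then restore document order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]

doc = ["S1 intro.", "S2 detail.", "S3 key finding.", "S4 aside.", "S5 conclusion."]
scores = [0.9, 0.2, 0.8, 0.1, 0.7]
summary = summarize(doc, scores, k=3)
# → ["S1 intro.", "S3 key finding.", "S5 conclusion."]
```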
4 Numerical Results and Discussion

The simulation is done in the MATLAB 2016b environment, and the experimental settings and metric analysis are performed on the PubMed medical dataset. Here, 70% of the data is used for training and 30% for testing. The ROUGE toolbox is used to compute the evaluation scores. ROUGE-1 evaluates single-word (unigram) overlap between the system and reference summaries, and ROUGE-2 evaluates successive-word (bigram) overlap against the baseline summarization. Table 2 depicts the ROUGE-1 and ROUGE-2 score computations. Metrics such as precision, recall, and F1-score are evaluated for various methods: Secure Squirrel Optimizer (SSO), Particle Swarm Optimization (PSO), and Elephant Herd Optimization (EHO). In Table 2, the similarity measure of KEY is 0.325, SSO is 0.305, and EHO is 0.312, respectively; this analysis proves that the model works more effectively than the SSO and EHO models. Three different metrics are observed: precision (P), recall (R), and F1-score (F). The evaluation is based on the summarization process, where P specifies how close the summary information is to the reference summary, R specifies the amount of information in the system summaries that is close to the reference summary, and F specifies the similarity between the reference and system summaries, giving equal weight to the P and R scores. It is expressed as in Eq. (15):

F = 2 ∗ P ∗ R / (P + R)    (15)
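Eq. (15) is the harmonic mean of precision and recall; a one-line helper (the zero-division guard is an added convention, not part of the text's formula):

```python
def f1_score(p, r):
    """F1 as in Eq. (15): harmonic mean of precision and recall."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

f1_score(0.5, 0.5)  # → 0.5
```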
Table 2  ROUGE-1 and ROUGE-2 computation

Techniques | ROUGE-1 (P / R / F1)  | ROUGE-2 (P / R / F1)  | Summarization process: similarity measure
SSO        | 0.345 / 0.125 / 0.195 | 0.155 / 0.065 / 0.085 | 0.305
EHO        | 0.352 / 0.128 / 0.202 | 0.161 / 0.070 / 0.092 | 0.312
KEY        | 0.364 / 0.130 / 0.210 | 0.168 / 0.085 / 0.095 | 0.325
Fig. 2 Similarity measure computation of KEY with SSO and EHO
Table 3  Similarity measure computation

Techniques | ROUGE-1 (P / R / F1)  | ROUGE-2 (P / R / F1)  | Summarization process: similarity measure
PSO        | 0.330 / 0.127 / 0.183 | 0.186 / 0.039 / 0.064 | 0.250
SSO        | 0.347 / 0.146 / 0.205 | 0.165 / 0.062 / 0.089 | 0.235
EHO        | 0.350 / 0.148 / 0.208 | 0.167 / 0.063 / 0.090 | 0.240
KEY        | 0.364 / 0.130 / 0.210 | 0.168 / 0.085 / 0.095 | 0.325
Table 3 depicts the similarity measure computation of the anticipated model against various existing approaches: PSO, SSO, and EHO. The similarity measure of KEY is 0.325, PSO is 0.250, SSO is 0.235, and EHO is 0.240, which shows that the KEY model is superior to the other approaches. The proposed and existing approaches are presented in a readable manner (see Figs. 3, 4, 5 and 6, and Table 4). The kappa coefficient of this process is expressed as in Eq. (16):

Kappa coefficient = (p(c) − p(s)) / (1 − p(s))    (16)

Here, p(c) specifies the proportion of times the judges concurred, and p(s) specifies the proportion of agreement expected by chance. A kappa coefficient in the interval [2/3, 1] is considered acceptable.
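Eq. (16) and the [2/3, 1] acceptability interval translate directly to code (the sample values are illustrative):

```python
def kappa(p_c, p_s):
    """Kappa coefficient as in Eq. (16)."""
    return (p_c - p_s) / (1 - p_s)

def acceptable(k):
    """The text treats a kappa in [2/3, 1] as acceptable."""
    return 2 / 3 <= k <= 1

k = kappa(0.95, 0.5)  # ≈ 0.9, comfortably inside [2/3, 1]
```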
Fig. 3 ROUGE-1 computation of KEY with others
Fig. 4 ROUGE-2 computation of KEY with others
5 Conclusion

This research concentrates on modeling an efficient Keyword-based Elephant Yard Optimization (KEY) approach for the automatic text summarization process. The model performs three preliminary phases: redundant-content elimination by filtering, text-feature optimization, and text summarization based on keyword connectivity and similarity. Here, KEY optimizes the weights of the enormous text functionality, and the optimizer intends to predict the proper sentences in a document. The summarization process
Fig. 5 Similarity measure of KEY with others
Fig. 6 Kappa coefficient computation of KEY with others
Table 4  Kappa coefficient evaluation

Techniques | Judgment mode | Kappa coefficient
PSO        | Readable      | 0.723
SSO        | Readable      | 0.690
EHO        | Readable      | 0.802
KEY        | Readable      | 0.990
14
G. Malarselvi and A. Pandian
is generated by summarizing appropriate sentences based on certain similarities. The anticipated model is evaluated against various existing approaches, namely PSO, SSO, and EHO, using the PubMed dataset. The experimental outcomes reveal that the anticipated model provides superior performance to the general approaches. The performance of the summaries is measured with metrics such as recall, precision, and F1-score over the PubMed dataset, and features such as redundancy and coverage give better performance than other approaches. The major research constraint is the selection of a dataset for the summarization process; in the future, the similarity methods will be examined with deep network concepts to attain superior generalization over comprehensible summaries.
Image Classification of Indian Rural Development Projects Using Transfer Learning and CNN Aditya Mangla, J. Briskilal, and D. Senthil Kumar
Abstract In recent years, convolutional neural networks have been demonstrated to be effective in classifying data from animals to objects, as well as human hand signs. The Convolutional Neural Network (CNN) shows high performance on image classification, and recent CNN trends that are used extensively are transfer learning and data augmentation. In this paper, Indian Government rural projects that promote agricultural activities, such as check dams, farm ponds, and soak pits, are classified based on images. These projects, built in the rural parts of India, have similar features, and hence classifying them is a challenging task. A Remote-Sensing (RS) model is proposed for the classification of these projects and is compared with the DenseNet-121 model on the same task over different data sizes and numbers of layers; their influence on the classification of these images is also examined. Using the proposed RS model, a test accuracy of 0.9150 has been achieved.

Keywords Convolutional neural network · Deep learning · Image augmentation · DenseNet121 · Image classification · Transfer learning
1 Introduction The research and development of remote-sensing images are of significant importance in many areas. Remote-sensing agriculture-related projects lead to precision agriculture and rural planning. Since the past decade, data available for A. Mangla · J. Briskilal (B) SRM Institute of Science and Technology, Potheri, Kattankulathur, Chengalpattu District, Chennai, Tamil Nadu 603203, India e-mail: [email protected] A. Mangla e-mail: [email protected] D. Senthil Kumar Sri Sairam Engineering College, Chennai, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_2
remote-sensing research has increased drastically, which makes it easier to train convolutional neural network models to a decent accuracy. Convolutional neural networks have shown great efficiency and accuracy in image classification tasks and are breaking boundaries day after day. From the work of Razavian et al. [1], it is determined that, along with CNNs, transfer learning has also gained huge significance due to the reduced training time and dataset size required to achieve decent results [2]. Rather than training a model with a random initialization of weights, if weights already trained on the same network architecture, for the same or another dataset, are used as a pre-training task and then transferred to a new task, the proposed model will progress faster in less time [3].
1.1 Dataset

The dataset used in this research is acquired from ISRO's Geoportal, Bhuvan: 6000 images for training across six classes, 1500 images for validation with a 450 × 450 × 3 input size, and 1250 pictures per class for the six classes are obtained. To experiment with the effectiveness of augmentation techniques, a restricted set of 550 images per class for six classes is also trained, with and without DenseNet. The projects constructed under the Mahatma Gandhi National Rural Employment Guarantee Act include cement roads in rural villages, check dams, farm ponds, soak pits, horticulture for agricultural activities, latrines for sanitation, and more. The images of all these projects are maintained by ISRO's Geoportal, Bhuvan. The data is initially categorized by region and then by the projects built in that region. The size of the dataset is intentionally decreased in order to apply augmentation to the images and to evaluate how well the model works with a smaller dataset plus augmentation versus a larger dataset without augmentation. While there are advanced data augmentation techniques such as GANs (Generative Adversarial Networks), as given by Perez et al. [4], traditional data augmentation transformations are used in this work. The model experimented with is DenseNet, given by Liu et al. [5]. As shown by Zhang et al. [6], transfer learning is widely used for COVID-19 diagnosis, since a model already trained on chest X-rays tends to perform better on COVID X-rays than a model trained from scratch; in some cases, however, such as COVID-Net, training directly on chest X-ray images has been shown to outperform transfer learning models. Here, an RS model that outperforms the transfer learning model DenseNet-121 is proposed.
2 Background

Object recognition with convolutional neural networks was first proposed by LeCun et al. [7], who elaborated how learning the right features can produce better accuracy and good performance. The convolutional network architecture showed how a pixel of an image can be mapped to a feature, which helps in prediction [8]. One of the state-of-the-art models is DenseNet-121, proposed by Huang et al. [9]. Work done in the past decade has shown that convolutional neural networks can be denser while also being more accurate and efficient. In a vanilla convolutional neural network, the input image is passed through the neurons of the network to produce the prediction, and the forward pass is direct. The DenseNet architecture, however, changes this standard CNN design: a dense block of five layers with a growth rate of k = 4 has been used [10]. As the name suggests, the DenseNet design connects each layer to every other layer.
3 Literature Review This literature review examines the works related to image classification of rural projects using remote-sensing data or image datasets. Additionally, it illustrates how data augmentation was performed in traditional ways, along with transfer learning that provides decent results even on a small dataset.
3.1 Works Related to Image Classification of Rural Projects

The task of image classification of rural projects is used for village formation, which leads to better rural planning and resource allocation. Previous research by Singh et al. [11] proposed the formation of a village information system for the Moga district in Punjab using geoinformatics. The work focused on land use: built-up, pond, river, wasteland, plantation, and other information about the Moga district was extracted from satellite imagery with the help of a supervised classification technique. A Support Vector Machine was applied to the IRS-P6 LISS-III data acquired in 2007, and visually interpreted land-use information from 2002 was used to generate the training set for the classifier. Disaster-prone areas and the amenities required for their management were mapped, along with the identification of poor infrastructure. Al-doski et al. [12] examined the image classification process and procedures, and image classification techniques, namely the K-means classifier and Support Vector
Machine, respectively. Supervised and unsupervised methods were used to obtain accuracy over the dataset, utilizing multiple remote-sensing features, including spectral, multitemporal, and spatial features. Varshney et al. [13] proposed a model to distinguish metal rooftops from thatched rooftops, based on the fact that metal rooftops are bright white or dark, whereas thatched rooftops are brownish, with less crisp and somewhat less straight edges. Random forest regressions were trained for the total number of rooftops in the image patch, and the trained model was applied to the satellite imagery. An absolute error of 1.95 on the number of rooftops and 0.162 on the proportion of metal was achieved. The above research mainly used Support Vector Machines on satellite data to classify images. However, it is argued here that a convolutional neural network can be much more effective than an SVM, since it captures different features of an image and applies different filters to it, rather than making predictions based on distance from a hyperplane as an SVM does. In this paper, an RS model is proposed that extracts features at a much better level while strengthening feature propagation. In the model, feature propagation is dynamic, which benefits the seamless flow of information; furthermore, it requires fewer computational resources than ResNet [14].
3.2 Works Related to Data Augmentation Techniques

Wang et al. [4] explored various data augmentation solutions for image classification. The data was artificially constrained to a small subset of the ImageNet dataset, and the performance of each augmentation technique was studied; the tiny-imagenet-200 data and MNIST were utilized. Initially, traditional transformations were applied to the dataset, for example producing a duplicate image that is zoomed in/out, rotated, flipped, distorted, and so on, turning a dataset of size N into one of size 2N. Generative Adversarial Networks were then applied for style transfer using six distinct styles: Cezanne, Enhance, Monet, Ukiyo-e, Van Gogh, and Winter. To test the effectiveness of the different augmentations, ten experiments were run on the ImageNet data. On classifying dogs versus goldfish, the traditional transformations performed better than the GANs; however, the best result was given by Neural+. In the case of Dogs versus Cats, traditional augmentation gave 0.775 validation accuracy, which was the highest. Traditional transformation methods thus proved more useful for data augmentation than GANs. Geometric transformation, flipping, color-space changes, cropping, and similar operations help fill the gap left by a small number of images. Many such traditional transformations were explored by Shorten et al. [15], where a baseline accuracy of 48.13 was increased to 61.95 using different augmentation techniques. Baseline results from Wong et al. [16] show that a CNN benefits from additional training samples, as they improve both training and validation error. Applying
data augmentation provides the RS model with images under different transformations, such as distortion and zoom, which helps the model learn better even with fewer images, alongside the additional training samples.
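The traditional transformations discussed above (flipping, cropping/zooming) can be illustrated on a toy nested-list "image"; real pipelines would operate on arrays or tensors, so this is purely conceptual:

```python
def hflip(img):
    """Horizontal flip: reverse each pixel row."""
    return [row[::-1] for row in img]

def center_crop(img, h, w):
    """Crop an h x w window from the image center (a crude 'zoom')."""
    top = (len(img) - h) // 2
    left = (len(img[0]) - w) // 2
    return [row[left:left + w] for row in img[top:top + h]]

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
hflip(img)[0]           # → [4, 3, 2, 1]
center_crop(img, 2, 2)  # → [[6, 7], [10, 11]]
```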
3.3 Gaps Identified in the Related Works

In the above related work, the major gap found is the use of traditional machine learning methods instead of advanced ones: the researchers used SVM and K-means. At present, much more advanced computer vision models are available, such as deep learning models and convolutional neural networks. Furthermore, data augmentation was not used, which leads to low accuracy: if the same image is flipped or zoomed, the model might fail to recognize it. Moreover, the models were trained from scratch instead of using a pre-trained model, which consumes a lot of computational power and is inefficient. These issues result in longer training times, inefficient use of computational resources, and poor efficiency.
3.4 Contributions of This Paper

To overcome the above-mentioned problems, a convolutional neural network is used to train on the images, which lets the model learn useful features instead of random numbers and helps it train in less time. The dataset has been created from scratch. The use of the latest techniques, such as transfer learning and data augmentation, makes the model robust against cropped, blurred, or otherwise altered images. Moreover, with transfer learning methods, it is not required to train the model from the start; a model previously trained on another dataset can be reused for the classification task and trained further on the target dataset.
4 Proposed Work

The classification of Indian Government rural projects that promote agricultural activities, based on images, is proposed. These projects are built in rural places and share similar features; thus, classifying them is difficult. In particular, a convolutional neural network model is first trained to perform a rudimentary classification [17], followed by data augmentation along with transfer learning. Then, DenseNet-121 pre-trained weights are used for further training. Later, the exploration and proposal of an image classification model to classify agricultural and
Fig. 1 Conceptual and architectural design
rural development projects for precision agriculture and rural planning are performed. Finally, the RS model is trained to outperform the DenseNet-121 model's accuracy. The architectural diagram is shown in Fig. 1. Experiments with the RS model are run over different data sizes and numbers of layers; the dataset is constrained to fewer images in order to apply image augmentation to those images and optimize the RS model accordingly.
4.1 Experimental Setup and Analysis

4.1.1 Dataset Creation
The dataset is created by web-scraping images from the Bhuvan Indian Geo-Platform of ISRO. Images of the six classes to be classified are obtained from the portal: 1400–1500 images were downloaded for each class. After dataset cleaning, 1000 images per class of varying dimensions, all greater than 500 × 500, were gathered. In total, 6000 images for six classes, i.e., Cement roads in rural
areas, Check Dams in rural areas, Farm Ponds in rural areas, Soak pits in rural areas, Indian Household Latrines in rural areas, and Horticulture land in rural areas, taken from the state of Tamil Nadu, a southern part of India, were collected.
4.1.2 Dataset Validation and Evaluation
For the dataset validation, Cohen's kappa score is used to measure inter-rater agreement and the reliability of the dataset. It is a statistical coefficient that quantifies the accuracy and reliability of a categorical assignment.

• A => both raters agree to include
• B => only the first rater wants to include
• C => only the second rater wants to include
• D => both raters agree to exclude

To work out the kappa value, the probability of agreement is needed: the number of samples on which the raters agree, divided by the total number of samples, i.e., (A + D)/(A + B + C + D):

Po = number in agreement / total
P(correct) = ((A + B)/(A + B + C + D)) ∗ ((A + C)/(A + B + C + D))
P(incorrect) = ((C + D)/(A + B + C + D)) ∗ ((B + D)/(A + B + C + D))
Pe = P(correct) + P(incorrect)

The values of A, B, C, and D for each class are given in Table 1; using these values in the formula, the results in Table 2 are obtained.

Table 1  Values of A, B, C, D for Cohen's kappa

Class           | A    | B   | C  | D
Cemented road   | 851  | 33  | 29 | 87
Check dam       | 837  | 34  | 28 | 101
Indian latrines | 980  | 10  | 3  | 7
Farm pond       | 880  | 17  | 3  | 100
Horticulture    | 900  | 12  | 8  | 80
Soak pits       | 953  | 5   | 1  | 41
Total           | 5401 | 111 | 72 | 416
Table 2  Cohen's kappa result

% of agreement | 96.95%
Cohen's k      | 0.8030722168974801
Agreement      | Almost perfect agreement
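Plugging the totals from Table 1 (A = 5401, B = 111, C = 72, D = 416) into the formulas above reproduces the kappa value reported in Table 2:

```python
def cohens_kappa(a, b, c, d):
    """Cohen's kappa from the 2x2 agreement counts A, B, C, D defined above."""
    n = a + b + c + d
    po = (a + d) / n  # observed agreement, Po
    # chance agreement, Pe = P(correct) + P(incorrect)
    pe = ((a + b) / n) * ((a + c) / n) + ((c + d) / n) * ((b + d) / n)
    return (po - pe) / (1 - pe)

cohens_kappa(5401, 111, 72, 416)  # ≈ 0.80307, matching Table 2
```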
4.2 DenseNet121

DenseNet comprises dense blocks connected by transition layers. The feature maps of all preceding layers are used as sources of input to each layer, and a layer's own feature maps are used as input for each of the following layers [18]. The DenseNet architecture is divided into multiple dense blocks, each design including four dense blocks with a varying number of layers; DenseNet-121 has 6, 12, 24, and 16 layers in its four dense blocks [19–22]. A global spatial average pooling layer is added at the end of the DenseNet-121 base model, followed by a dense output layer with a softmax activation function. Softmax is a mathematical function that converts a vector of numbers into a vector of probabilities, where the probability of each value is proportional to the relative scale of that value in the vector:

softmax(z)_i = e^(z_i) / Σ_{j=1}^{K} e^(z_j)
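The softmax just described, written as a short numerically stable helper (subtracting the maximum is a standard added safeguard, not part of the text's formula):

```python
import math

def softmax(z):
    """Map a vector of scores to probabilities that sum to 1 (order-preserving)."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

p = softmax([2.0, 1.0, 0.1])
# probabilities sum to 1; the largest score keeps the largest probability
```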
The input image is denoted as x_0, the output of the ith layer as x_i, and each convolutional module as a function H. The input to the ith layer is the outputs of all previous layers, x_i = H([x_0, x_1, …, x_{i−1}]). Images of size 450 × 450 are taken for training, and data augmentation is then applied to the dataset [23]. A batch size of 30 is used for training and 25 for validation, with 1000 images per class for training and 250 for validation; in total, 6000 images are used to train and 1500 to validate. The model is trained for 30 epochs, taking 200 steps per epoch for training and 60 for validation. Categorical cross-entropy is used as the loss, RMSprop is the optimizer with a learning rate of 1e-4, and accuracy is the evaluation metric. After training for 30 epochs, 82.40% accuracy is obtained on the training dataset and 84.20% on the validation dataset; the training loss is 0.4962 and the validation loss is 0.4345. The plot of training and validation accuracy against epochs is shown in Fig. 2.
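The dense connectivity x_i = H([x_0, …, x_{i−1}]) can be mimicked with a toy block in which each "layer" consumes the concatenation of all earlier feature lists and emits k = 4 new values; the arithmetic inside `layer` is arbitrary, only the wiring matters:

```python
GROWTH_RATE = 4  # k: new "feature maps" each layer adds

def layer(features):
    """Toy H: look at all concatenated inputs, emit k new values."""
    s = sum(features)
    return [s + j for j in range(GROWTH_RATE)]

def dense_block(x0, num_layers):
    """Each layer receives the concatenation of all previous outputs."""
    features = list(x0)
    for _ in range(num_layers):
        features += layer(features)  # concatenate new maps onto the stack
    return features

out = dense_block([1.0, 2.0], num_layers=3)
len(out)  # → 2 + 3 * 4 = 14
```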
4.3 Remote-Sensing (RS) Architecture

The RS model architecture is proposed to further boost the network's performance. The input layer takes images of dimensions 350 × 350 × 3, where three represents the RGB channels of the image. The images are then passed through a 2D convolutional layer with 32 filters of dimension 3 × 3, with ReLU as the activation function [24]. In artificial neural networks, the rectifier or ReLU activation function is defined as the positive part of its argument:

f(x) = x⁺ = max(0, x)

where x is the input to a neuron. MaxPooling layers are used between the convolutional layers. MaxPooling is a discretization process whose goal is to down-sample an input representation (an image, a hidden-layer output matrix, and so on), reducing its dimensionality and allowing assumptions to be made about the features contained in the binned sub-regions. With a 4 × 4 matrix representing the input and a 2 × 2 filter run over it with a stride of 2 (meaning the step (dx, dy) over the input is (2, 2)), the regions do not overlap. For every region covered by the filter, the maximum of that region is taken, and a new output matrix is created in which every element is the maximum of a region in the original input.

Total params: 26,458,822
Trainable params: 26,458,822
Non-trainable params: 0

The output is derived using the argmax function, which responds with the class of greatest probability among those predicted by the model; the probabilities are obtained using the softmax function. The softmax function transforms a vector of K real values into a vector of K real values that sum to 1. The input values can be positive, negative, zero, or greater than one, but the softmax transforms them into values between 0 and 1 so that they can be interpreted as probabilities.
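The ReLU and non-overlapping 2 × 2 max-pooling operations just described, as a pure-Python sketch (the sample matrix is illustrative):

```python
def relu(x):
    """f(x) = max(0, x), applied element-wise."""
    return max(0.0, x)

def maxpool2x2(m):
    """2x2 max pooling with stride 2 over a matrix (regions do not overlap)."""
    return [[max(m[i][j], m[i][j + 1], m[i + 1][j], m[i + 1][j + 1])
             for j in range(0, len(m[0]), 2)]
            for i in range(0, len(m), 2)]

m = [[1, 3, 2, 4],
     [5, 7, 6, 8],
     [-1, -2, 0, 1],
     [3, 4, 2, 2]]
maxpool2x2(m)  # → [[7, 8], [4, 2]]
```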
If one of the inputs is small or negative, the softmax turns it into a small probability, and if an input is large, it turns it into a large probability, but the result always remains between 0 and 1. Confidence is obtained by computing the softmax of the inputs at the softmax layer. The softmax function is mostly used as the final activation function in neural networks for classification problems; it normalizes an input vector into a range that often admits a probabilistic interpretation. While it is true that the output of the softmax function can be treated as a probability vector, it is easy to fall into the trap of conflating this with statements about confidence in the statistical sense. Since the softmax function does not change the ordering of the output values, the largest value before normalization will still be the largest after normalization; even when the prediction is correct, there may be a penalty associated with the output value. A batch size of 30 is used for training and 25 for validation, with 1000 images per class for training and 250 for validation. So in total, 6000 images are considered to train
A. Mangla et al.

Table 3 Precision and recall for each class

Class                       Precision   Recall
Cement road                 0.90        0.889
Check dam                   0.92        0.889
Farm pond                   0.89        0.923
Horticulture                0.907       0.900
Indian household latrines   0.91        0.892
Soak pits                   0.887       0.921
and 1500 images for validation. The model is trained for 30 epochs, with 200 steps per epoch for training and 60 steps per epoch for validation. Categorical cross-entropy is used as the loss function, and RMSprop as the optimizer with a learning rate of 1e-4. Accuracy is used as the evaluation metric. After training for 30 epochs, an accuracy of 88.93% is obtained on the training dataset and 90.00% on the validation dataset. The training loss is 0.2583 and the validation loss is 0.2541. The following plot is obtained by plotting training accuracy and validation accuracy against epochs. On the test dataset, the RS model achieves an evaluation accuracy of 91.50% with a loss of 0.2272 (Table 3). The precision and recall results in Table 3 were achieved for each class with the RS model.
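As a sanity check on the figures above, the steps per epoch follow directly from the dataset and batch sizes, and per-class precision and recall follow from the true positive, false positive, and false negative counts. The counts in this sketch are illustrative, not the paper's actual confusion matrix:

```python
# Steps per epoch = number of images / batch size.
train_steps = 6000 // 30   # 6000 training images, batch size 30
val_steps = 1500 // 25     # 1500 validation images, batch size 25
print(train_steps, val_steps)  # 200 60

def precision_recall(tp, fp, fn):
    # precision = TP / (TP + FP): fraction of predicted positives that are correct
    # recall    = TP / (TP + FN): fraction of actual positives that are found
    return tp / (tp + fp), tp / (tp + fn)

# Illustrative counts for one class (hypothetical, not from the paper).
p, r = precision_recall(tp=90, fp=10, fn=11)
print(round(p, 2), round(r, 3))  # 0.9 0.891
```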
4.4 Comparative Analysis

Figure 2 shows the result of image classification using the DenseNet architecture. Using the DenseNet-121 model, an accuracy of 82.40% with a loss of 0.4962 and a validation accuracy of 84.20% with a validation loss of 0.4345 are achieved.

Fig. 2 Plot for training and validation accuracy for DenseNet-121
Image Classification of Indian Rural Development …

Table 4 Accuracy of models

Model             Train     Validation
DenseNet-121      0.8240    0.8420
RS Architecture   0.8893    0.9000
Fig. 3 Plot for training and validation accuracy for RS Model
Figure 3 shows the result of image classification using the RS model architecture. Using the RS model, an accuracy of 88.93% with a loss of 0.2583 and a validation accuracy of 90.00% with a validation loss of 0.2541 are achieved (Table 4). On the test dataset, the RS model architecture obtains an accuracy of 0.9150 (Fig. 3).
5 Conclusion

Data augmentation and transfer learning have been shown to enhance model accuracy significantly. Although only traditional data augmentation has been used, it has proved very effective. Transfer learning using DenseNet-121 increases the accuracy, and the pre-trained DenseNet-121 model proved to be of great help when trained further on the dataset. The RS model takes less time to train than the former model and produces better performance with the help of data augmentation techniques and hyperparameter tuning, so that the model neither underfits nor overfits the dataset at hand. This shows that, although DenseNet-121 is a state-of-the-art model, the RS model performs better on this specific task in terms of computation time and accuracy. Different hyperparameters were experimented with to obtain the most efficient overall results. There has been previous research using SVM and K-means, but using convolutional neural networks has produced an edge over those models. Moreover, using the RS model, a better accuracy has been achieved, which can further lead to the formation of village information systems and better resource allocation.
Data augmentation techniques may be utilized not only to tackle the problem of insufficient data but also to help improve the present state-of-the-art classification algorithms. Furthermore, the above work can be applied to satellite imagery to identify rural development projects and to check resource allocation and utilization in a more fitting manner.
References

1. A.S. Razavian, H. Azizpour, J. Sullivan, S. Carlsson, et al., CNN features off-the-shelf: an astounding baseline for recognition (2014)
2. M. Tripathi, Analysis of convolutional neural network based image classification techniques. J. Innov. Image Process. (JIIP) 3(02), 100–117 (2021)
3. E.-Y. Huan, G.-H. Wen, Transfer learning with deep convolutional neural network for constitution classification with face image. Multimedia Tools Appl. 79(17–18), 11905–11919 (2020)
4. L. Perez, J. Wang, et al., The effectiveness of data augmentation in image classification using deep learning (2017)
5. G. Huang, Z. Liu, L. van der Maaten, K.Q. Weinberger, Densely connected convolutional networks, in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
6. Y.-D. Zhang, S.C. Satapathy, X. Zhang, S.-H. Wang, COVID-19 diagnosis via DenseNet and optimization of transfer learning setting. Cogn. Comput. (2021)
7. Y. LeCun, P. Haffner, L. Bottou, Y. Bengio, et al., Object recognition with gradient-based learning (1999)
8. S. Smys, J.I.Z. Chen, S. Shakya, Survey on neural network architectures with deep learning. J. Soft Comput. Paradigm (JSCP) 2(03), 186–194 (2020)
9. T. Guo, J. Dong, H. Li, Y. Gao, Simple convolutional neural network on image classification, in 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA) (2017). https://doi.org/10.1109/icbda.2017.8078730
10. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019)
11. H. Singh, K. Krishan, P.K. Litoria, et al., Creation of village information system of Moga district in Punjab using Geoinformatics (2009)
12. J. Al-doski, S.B. Mansor, H.Z.M. Shafri, et al., Image classification in remote sensing (2013). (Department of Civil Engineering, Faculty of Engineering, Universiti Putra Malaysia)
13. K.R. Varshney, G.H. Chen, B. Abelson, K. Nowocin, V. Sakhrani, L. Xu, B.L. Spatocco, Targeting villages for rural development using satellite image analysis. Big Data 3(1), 41–53 (2015)
14. T. Vijayakumar, Comparative study of capsule neural network in various applications. J. Artif. Intell. 1(01), 19–27 (2019)
15. C. Shorten, T.M. Khoshgoftaar, A survey on image data augmentation for deep learning. J. Big Data 6(1) (2019)
16. S.C. Wong, A. Gatt, V. Stamatescu, M.D. McDonnell, Understanding data augmentation for classification: when to warp? CoRR, abs/1609.08764 (2016)
17. E. Maggiori, Y. Tarabalka, G. Charpiat, P. Alliez, Convolutional neural networks for large-scale remote-sensing image classification. IEEE Trans. Geosci. Remote Sens. 55(2), 645–657 (2017). https://doi.org/10.1109/tgrs.2016.2612821
18. W. Liu, K. Zeng, SparseNet: a sparse DenseNet for image classification (2018). Sun Yat-sen University
19. M. Miu, X. Zhang, M.A.A. Dewan, J. Wang, et al., Aggregation and visualization of spatial data with application to classification of land use and land cover (2017)
20. Y. Kubo, G. Tucker, S. Wiesler, Compacting neural network classifiers via dropout training. ArXiv e-prints (2016)
21. T. Wellmann, A. Lausch, E. Andersson, S. Knapp, C. Cortinovis, J. Jache, S. Scheuer, P. Kremer, A. Mascarenhas, R. Kraemer, A. Haase, Remote sensing in urban planning: contributions towards ecologically sound policies? Landscape Urban Plann. 204, 103921 (2020)
22. J. Briskilal, C.N. Subalalitha, An ensemble model for classifying idioms and literal texts using BERT and RoBERTa. Inf. Process. Manage. 59(1), 102756 (2022)
23. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
24. L. Wang, Z.Q. Lin, A. Wong, COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images (2020)
Design of a Mobile Application to Deal with Bullying Vania Margarita, Agung A. Pramudji, Benaya Oktavianus, Randy Dwi, and Harco Leslie Hendric Spits Warnars
Abstract Any type of bullying can leave its victims traumatized, and not all victims can overcome it without professional help, particularly teenagers, who are at a vulnerable age for bullying. Hence, a mobile application named "Protect ur smile" has been created so that victims can consult a psychologist of their choice. Moreover, they can share their stories with other victimized youngsters to gain mental support. The proposed mobile application is designed using a use case diagram, a class diagram, and a user interface as a prototype. Furthermore, the application has several phases, such as registering, login, creating a bully report, consultation, a forum, and an information column. Using this mobile application to gain control over bullying will increase a victim's confidentiality and decrease the number of suicides.

Keywords Bullying mobile application · Anti-bully mobile application · Anti-bullying technology · Software engineering · Information systems
V. Margarita · A. A. Pramudji · B. Oktavianus · R. Dwi Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] A. A. Pramudji e-mail: [email protected] B. Oktavianus e-mail: [email protected] R. Dwi e-mail: [email protected] H. L. H. S. Warnars (B) Computer Science Department, Graduate Program, Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_3
1 Introduction

In the last few years, "bullying" has been increasingly heard of in Indonesia. Bullying is intentionally aggressive behavior directed at abusing weaker people. There are several types of bullying:

1. Physical oppression: Victims of physical oppression receive various abusive physical treatments, ranging from blocking the victim's path, tripping, pushing, hitting, and grabbing to throwing things at them.
2. Verbal oppression: Verbal oppression is carried out with painful or teasing words, statements, nicknames, etc., that cause psychological stress.
3. Exclusion: Exclusion victims may not be physically or verbally abused, but are instead treated with hostility and ignored by their social environment. Victims become isolated, are forced to be alone, and find it difficult to make friends, since the bully has a strong influence and persuades others to isolate the victim.
4. Cyber-oppression: Bullying in cyberspace is known as cyberbullying. Because of its open nature, the victim may receive oppression from someone they do not know or from someone with a disguised username. This oppression occurs in cyberspace, for example, through social media and chat applications. The bullying usually takes the form of insults or satire; it can also be gossip about victims spread through social media.
5. Sexual oppression: The oppressor comments on, teases, tries to peek at, or may touch the victim sexually. Moreover, sexual oppression also includes secretly spreading photos of victims to satisfy the sexual arousal of the perpetrators, or forcing victims to watch or see pornography [1].
Based on data from the KPAI, the Indonesia Child Protection Commission, 73 cases of bullying occurred among children aged 12 to 17 years in 2017. In 2018, KPAI reported that bullying on social media rose to 112 cases [2]. Bullying has many adverse effects on victims, for example:

a. Mental disorders:
   • Depression
   • Inferiority
   • Anxiety
   • Difficulty sleeping well
   • Urge to self-harm
b. Visible changes in victims:
   • Loss of enthusiasm to go to school/work
   • Declining achievement
   • Decreased appetite
   • Running away from home
   • Stress when returning home
   • Physical injury to the victim
Bullying is troublesome and should not have to be experienced. Bullying also damages the future, hurting the victim and those around them. People who like to bully should be given a deterrent so that they do not repeat their actions. Because of the rampant bullying cases, this study aims to design a mobile-based application to help victims get psychological help from psychiatrists. This research has been carried out together with the Indonesian Psychiatrist Association, which has helped solve the problem of bullying in society by implementing online consultation in the application with guaranteed confidentiality. Additionally, the application assists victims in reporting to the local police if they want the case to be legally processed, so that the authorities may take deterrent actions against the bullying perpetrators. This application would have an impact on many people and advance the country's growth by making small changes to the younger generation. If bullying continues without serious measures to stop it, the younger generation may become embarrassed and unable to build a better future. This application can positively impact victims of bullying by letting them express their feelings.
2 Literature Review

Previous research papers from both national and international journals and conferences have been reviewed. According to recent research, young people in this era are very vulnerable to emotional problems, which lead to various kinds of disorders, for example, depression and anxiety. Disorders are caused by various factors, such as genetic issues, other diseases, or environmental influences [3]. There are cases of bullying beginning in the school environment; where schools are supposed to be a place to learn and shape one's character, they can be a terrible place for some youngsters [4]. Most perpetrators commit such immoral acts because of a lack of moral education. Communication within the family is essential to overcoming bullying. According to [5], which studied bully-victim behavior and communication within families, the role of the family is vital to subduing bullying. According to data that classifies young people based on bullying, 23% are bullied via the Internet, 16% are perpetrators of cyberbullying, and the remaining 61% are bullied directly, both at school and in the community [6]. Bully victims become depressed and tend to be quiet and unconfident. The depression rate from verbal bullying is higher than that from cyberbullying because verbal bullying is done directly to victims and cannot be avoided, unlike bullying through social media or the Internet [7]. According to research, the impact of verbal bullying is harrowing.
Victims can be traumatized, severely depressed, or stressed, and the worst outcome is suicide. Mocking or spreading a rumor can be counted as indirect bullying that causes victims to become stressed and depressed [8]. Research indicates that the urge to bully can arise from one's personal character, family problems, and mental health problems. Moral education is therefore critical to teach in childhood, and it requires the willingness of parents and the people around to model good behavior [9]. Schools are also expected to pay attention to their students. Schools must employ professionals to observe students who display characteristics such as being reticent, anxious, uncomfortable, or abandoned by the community. Sufferers are not very confident, always see the negatives inside themselves, and think of themselves as stupid or as failures [10]. Bullying has long had severe adverse effects. People who are victims of bullying can turn into bullies due to severe stress, depression, and trauma [11]. According to the data, most direct bullying perpetrators are boys, who are more involved in verbal or physical bullying, while female perpetrators are more involved in indirect bullying and cyberbullying, such as insulting and spreading rumors that hurt victims. Cyberbullying happens a lot in schools nowadays. The perpetrators are usually not aware that their behavior is considered bullying, and victimized students usually become afraid and embarrassed to go to school [12]. Many anti-bullying programs are now being run in schools, in the hope that they can reduce the incidence of bullying. In these programs, students are taught to understand the concept of bullying, which is a stressful life event. According to research, most bully perpetrators at school had been victims of bullying on the streets, which makes them want to dominate the school area by bullying other students [13].
So, the role of the social environment becomes essential, especially for teenagers, who are usually searching for their identity. Holt and Espelage [13] surveyed 784 young people about their experiences with bullying. It was found that 61.6% of young people had never been involved in bullying, 14.3% were perpetrators, 12.5% were victims, and 11.6% were bully-victims, proving that there are teenagers who have good character and do not wish to be involved in bullying [14]. Apart from school, bullying also occurs in the workplace, which leaves victims uncomfortable at work. This behavior severely impacts individuals and companies, and in many cases such bullying incidents go unnoticed [15]. Therefore, the right approach is needed to deal with these incidents. Cyberbullying also occurs in relationships; the usual consequences include breaking up, envy, and jealousy. Therefore, bullying actions carried out by perpetrators need to be discussed and addressed, so that no one else's life is made miserable [16].
3 Proposed Work

The proposed mobile application is made for bully victims so that they can consult a psychologist and share their stories or experiences. This mobile-friendly application
Fig. 1 Use case diagram of the anti-bully mobile application
called "Protect your Smile", with the tagline "Be Happy Now", helps to reduce the level of depression or trauma experienced. Figure 1 shows the use case diagram of the proposed application, where the user can register, log in, create a bully report, consult psychologists, create forum threads, comment in forum threads, and read information regarding bullying. Moreover, the class diagram showing the attributes of each database class is seen in Fig. 2, where there are seven database tables: Survivor, Report, Forum, Admin, Consultation, Psychiatrist, and Report Detail. The survivor table consists of eight attributes, which serve the purpose of storing the user's data. The table is connected to three tables: consultation, report, and forum. The relationship between the consultation table and the survivor table is 1..* to 1 because one survivor can make one or more consultations; similarly for the relationship between the survivor table and the forum table, because one survivor can create one or more forum threads and comments. The report table is used to store report data, such as descriptions and supporting evidence. The relationship between the survivor and report tables is 1 to 1..* because one survivor can have one or many reports. The consultation table is used to store the consultation details between users and psychiatrists. It is related to the psychiatrist table, which stores the details of the psychiatrist; a psychiatrist can handle one or more consultations.
Fig. 2 Class diagram of the anti-bully mobile application
Meanwhile, the forum table stores news, events, and thread information. It is related to the admin table because one admin can handle one or many forum posts. The admin table is used to store the admin's data. The admin table is related to the report detail table because the admin has to check and follow up on the submitted reports.
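The seven tables and relationships described above could be realized, for example, as the following relational schema, here sketched with Python's built-in sqlite3. The column names and types are assumptions for illustration, since the paper only names the tables and their cardinalities:

```python
import sqlite3

# In-memory sketch of the Fig. 2 schema; column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE survivor (
  survivor_id INTEGER PRIMARY KEY,
  name TEXT, email TEXT, password TEXT
);
CREATE TABLE psychiatrist (
  psychiatrist_id INTEGER PRIMARY KEY,
  name TEXT, education TEXT
);
CREATE TABLE consultation (
  consultation_id INTEGER PRIMARY KEY,
  survivor_id INTEGER REFERENCES survivor(survivor_id),         -- one survivor : many consultations
  psychiatrist_id INTEGER REFERENCES psychiatrist(psychiatrist_id)
);
CREATE TABLE report (
  report_id INTEGER PRIMARY KEY,
  survivor_id INTEGER REFERENCES survivor(survivor_id),         -- one survivor : many reports
  description TEXT, evidence BLOB
);
CREATE TABLE admin (
  admin_id INTEGER PRIMARY KEY, name TEXT
);
CREATE TABLE forum (
  forum_id INTEGER PRIMARY KEY,
  survivor_id INTEGER REFERENCES survivor(survivor_id),         -- one survivor : many threads/comments
  admin_id INTEGER REFERENCES admin(admin_id),                  -- one admin : many forum posts
  content TEXT
);
CREATE TABLE report_detail (
  report_id INTEGER REFERENCES report(report_id),
  admin_id INTEGER REFERENCES admin(admin_id),                  -- admin checks and follows up reports
  status TEXT
);
""")

# Minimal usage: one survivor files one report.
conn.execute("INSERT INTO survivor VALUES (1, 'anon', 'a@x', 'hash')")
conn.execute("INSERT INTO report (survivor_id, description) VALUES (1, 'verbal bullying')")
count = conn.execute("SELECT COUNT(*) FROM report WHERE survivor_id = 1").fetchone()[0]
print(count)  # 1
```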
4 Results and Discussion

The user interfaces of the mobile application are shown in Figs. 3, 4 and 5. The main menu of the application is displayed in Fig. 3a. As seen in the use case diagram (Fig. 1), the activities are elaborated as follows:

(1) Register: Register is used by new users when they use this mobile application for the first time. Users have to fill in their data on this page, such as name, email, password, and password confirmation, as shown in Fig. 3b.
Fig. 3 a Main menu, b register form
Fig. 4 a Login form, b create a bully report
Fig. 5 a Consultation page, b forum menu
(2) Login: The login page is used by registered users to open the mobile application. They have to fill in their registered email and password, as shown in Fig. 4a.
(3) Create a Bully Report: This page is used by users who want to report bullying that happened to themselves or others. They can attach evidence such as photos, as shown in Fig. 4b. Users can upload files in different formats such as jpg, pdf, movies, mp4, mp3, gif, png, etc. Moreover, the admin can access this page and has the right to view the report and contact the police if required.
(4) Consultation: This page is used by users to consult the psychologists. Figure 5a shows that consultation is done via text message. Users can choose to display their names or appear anonymous, so that they do not feel scared or embarrassed because of privacy leaks.
(5) Forum: This page is used by users to communicate with other users, as shown in Fig. 5b. Users can create threads and comment on others.
(6) Information: This page displays the user's profile, as shown in Fig. 6a, and the psychologists' profile, as shown in Fig. 6b. Users can see the psychologists' educational backgrounds and career history. Moreover, there is various information about bullying. One of the regulations on bullying in Indonesia (shown in Fig. 7), which emphasizes child protection, is that perpetrators of bullying can be convicted under Law Number 23 of 2002. Article 54 of Law 35/2014 states that children in and within the education unit are required to receive protection from acts disturbing their physical and mental growth.

Fig. 6 a User profile, b psychologists' profile
5 Conclusion

This application is intended to help survivors of bullying report to the police, consult with psychiatrists, and share their experiences and incidents with others. In the future, there are opportunities to improve and expand this application by adding more features, such as emergency contacts and emergency records. This work hopes to reduce intimidation because, without intimidation, the community would be a better place, development would not be disturbed, and everyone could feel safe. In the future, Artificial Intelligence (AI) technology would be applied by implementing machine learning or deep learning algorithms. Using sensors as the
Fig. 7 Regulation page of anti-bully mobile application
Internet of Things (IoT) implementation is also a part of AI technology implementation, where machines equipped with sensors will be able to talk with other machines; this could be pursued in upcoming research.
References

1. F. Dehue, C. Bolman, T. Vollink, Cyberbullying: youngsters' experiences and parental perception. CyberPsychol. Behav. 11(2) (2008)
2. T.L. Faiza, Perbedaan Tingkat Depresi pada Korban Bullying Verbal dan Cyberbullying pada Remaja, University of Muhammadiyah Malang Bachelor Thesis (2019)
3. A.C. Baldry, The impact of direct and indirect bullying on the mental and physical health of Italian youngsters. Aggressive Behav. 30(5), 343–355 (2004)
4. L. Arseneault, L. Bowes, S. Shakoor, Bullying victimization in youths and mental health problems: 'Much ado about nothing?' Psychol. Med. 40(5), 717–729 (2010)
5. I. Rivers, V.P. Poteat, N. Noret, N. Ashurst, Observing bullying at school: the mental health implications of witness status. Sch. Psychol. Q. 24(4), 211–223 (2009)
6. P.R. Smokowski, K.H. Kopasz, Bullying in school: an overview of types, effects, family characteristics, and intervention strategies. Child. Sch. 27(2), 101–110 (2005)
7. D. Wolke, S.T. Lereya, Long-term effects of bullying. Arch. Dis. Child. 100(9), 879–885 (2015)
8. J. Wang, R.J. Iannotti, T.R. Nansel, School bullying among adolescents in the United States: physical, verbal, relational, and cyber. J. Adolesc. Health 45(4), 368–375 (2009)
9. P.W. Agatston, R. Kowalski, S. Limber, Students' perspectives on cyber bullying. J. Adolesc. Health 41(6), S59–S60 (2007)
10. M.M. Ttofi, D.P. Farrington, Effectiveness of school-based programs to reduce bullying: a systematic and meta-analytic review. J. Exper. Criminol. 7(1), 27–56 (2011)
11. S.M. Swearer, S. Hymel, Understanding the psychology of bullying: moving toward a social-ecological diathesis-stress model. Am. Psychol. 70(4), 344–353 (2015)
12. H. Andershed, M. Kerr, H. Stattin, Bullying in school and violence on the streets: are the same people involved? J. Scand. Stud. Criminol. Crime Prev. 2(1), 31–49 (2010)
13. M.K. Holt, D.L. Espelage, Perceived social support among bullies, victims, and bully-victims. J. Youth Adolesc. 36(8), 984–994 (2007)
14. L. Hidayati, Pembulian di Tempat Kerja dalam konteks Asia, in Research Report of Seminar Nasional dan Gelar Produk (SENASPRO) (2016), pp. 133–142
15. D.L. Hoff, S.N. Mitchell, Cyberbullying: causes, effects, and remedies. J. Educ. Adm. 47(5), 652–665 (2009)
16. D. Olweus, S.P. Limber, Bullying in school: evaluation and dissemination of the Olweus Bullying Prevention Program. Am. J. Orthopsychiatry 80(1), 124–134 (2010)
Selection of Human Resources Prospective Student Using SAW and AHP Methods Ahmad Rufai, Diana Teresia Spits Warnars, Harco Leslie Hendric Spits Warnars, and Antoine Doucet
Abstract The role of Human Resources (HR) in an organization is significant, and the world of education plays an essential role in producing and educating qualified and competent human resources. In this paper, 20 students were evaluated using Simple Additive Weighting (SAW) and the Analytic Hierarchy Process (AHP), applying six criteria: experience, recommendations, interviews, discipline, skills, and physical health. The SAW and AHP methods are techniques for determining the best value from several predetermined criteria, so they are suitable for selecting the HR candidates a company will accept. The results showed that the consistency of similarity between the SAW and AHP methods had a score of (7 + 8)/20 = 0.75, or a similarity consistency index of 75%. The similarity consistency index is evaluated using both the same rank order and the reversed rank order.

Keywords Simple additive weighting · Analytic hierarchy process · Human resource selection · Decision support system
A. Rufai · D. T. S. Warnars Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] D. T. S. Warnars e-mail: [email protected] H. L. H. S. Warnars (B) Computer Science Department, Binus Graduate Program-Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] A. Doucet Laboratoire L3i, Université de La Rochelle, La Rochelle, France e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_4
1 Introduction

Looking at these problems, the role of the Human Resources (HR) division is considered not yet maximal in handling the problem of selecting human resources. Selecting or recruiting human resources is still not done professionally, but through bribery, friendship, or family relations. This happens because there is no human resource selection process with a standard, systematic method to assess the eligibility of prospective HR. Therefore, to achieve professional recruitment of human resources, the role of the HR division is to establish cooperation with educational institutions/schools. In this collaboration, the company (HR division) relies fully on the educational institutions/schools to select quality prospective human resource students. Based on the background of the problem outlined above, several problems can be formulated, namely:

1. How does a decision support system for selecting prospective HR students using the SAW and AHP methods work?
2. Regarding the calculation process, which method is easiest to understand and more suitable for selecting HR candidates?
3. From the side of the final results obtained, which method is more accurate?

It is necessary to limit the problem so that the issue under study is straightforward and not complex. In this case, the restrictions taken are:

1. The research focuses on selecting HR candidates and on the final results of each method's calculation.
2. An example application is carried out to evaluate the selection of twenty prospective HR students.
2 Current and Previous Similar Research Papers

The Decision Support System (DSS) was first introduced in the early 1970s by Michael S. Scott Morton under the term Management Decision System. It is a computer-based system intended to help decision-makers by utilizing specific data and models to solve various unstructured problems. Meanwhile, Human Resources (HR) is the most crucial asset in a company or organization. If managed correctly and adequately, employees can be an asset, but they will be a burden if mismanaged. The SAW method is also often called the weighted sum method. The basic concept of the SAW method is to find the weighted sum of performance ratings for each alternative over all attributes. The SAW method requires normalizing the decision matrix (X) to a scale on which all alternative ratings can be compared. The Analytic Hierarchy Process (AHP) is a decision support model developed by Thomas L. Saaty. In essence, AHP takes both qualitative and quantitative matters into account; the concept is to convert qualitative values into quantitative values so that decisions can be taken more objectively.
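As a small illustration of the two techniques (the pairwise judgments, criteria, and candidate scores below are made up for the sketch, not the paper's data): AHP can derive criterion weights from a pairwise comparison matrix by averaging the normalized columns, and SAW then ranks alternatives by the weighted sum of max-normalized ratings:

```python
def ahp_weights(pairwise):
    # Normalize each column of the pairwise comparison matrix, then average
    # each row: a standard approximation of the AHP priority vector.
    n = len(pairwise)
    col_sums = [sum(pairwise[i][j] for i in range(n)) for j in range(n)]
    norm = [[pairwise[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return [sum(row) / n for row in norm]

def saw_scores(matrix, weights):
    # For benefit criteria, SAW normalizes each column by its maximum
    # (r_ij = x_ij / max_i x_ij), then takes the weighted sum per alternative.
    n_alt, n_crit = len(matrix), len(matrix[0])
    col_max = [max(matrix[i][j] for i in range(n_alt)) for j in range(n_crit)]
    return [sum(weights[j] * matrix[i][j] / col_max[j] for j in range(n_crit))
            for i in range(n_alt)]

# Pairwise comparison of 3 illustrative criteria on Saaty's 1-9 scale.
pairwise = [[1,   3,   5],
            [1/3, 1,   3],
            [1/5, 1/3, 1]]
w = ahp_weights(pairwise)
print([round(x, 3) for x in w])  # priority weights, summing to 1

# Three hypothetical candidates rated on the same 3 benefit criteria.
ratings = [[80, 70, 90],
           [90, 60, 80],
           [70, 90, 85]]
scores = saw_scores(ratings, w)
print(scores.index(max(scores)))  # index of the best-ranked candidate
```

This mirrors the common hybrid setup described in the surveyed papers: AHP supplies the criterion weights, and SAW produces the final ranking.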
A combination of the SAW and AHP methods has been implemented in many papers; we limit ourselves to recent publications such as:

1. SAW and AHP methods are used for weighting in school e-learning readiness evaluation, using eight criteria: psychological readiness, sociological readiness, environmental readiness, human resource readiness, financial readiness, technological skill (aptitude) readiness, equipment readiness, and content readiness [1].
2. SAW and AHP methods were compared for the tender process of television transmission stations based on four criteria: price, quality, service, and reliability. The paper noted that the SAW method is easier to apply than the AHP method [2].
3. A comparative study compared SAW-AHP with SAW, TOPSIS, and TOPSIS-AHP for determining students' tuition fees, where the SAW-AHP method had a 74% accuracy measurement [3].
4. 29 international museum websites were assessed using a combination of SAW and AHP methods with 14 criteria in 3 categories: content, usability, and functionality [4]. The criteria include currency/clarity/text comprehension, completeness/richness, quality content, support of research, consistency, accessibility, structure/navigation, ease of use/simplicity, user interface/overall presentation/design, efficiency, multilingualism, multimedia, interactivity, and adaptivity.
5. SAW and AHP methods were used for food crop planting recommendations, using three criteria: rainfall, temperature, and humidity. This paper used AHP for weighting the criteria and SAW for ranking the alternatives [5].
6. SAW and AHP methods were combined to determine the location of a cake shop retail business using criteria such as revenue and distance, with 50 cake shops around Java island assessed using specific criteria such as competitors, infrastructure, distance, rental price, population density, size, road condition, and culture [6].
7. One paper discusses the combination of SAW and AHP methods for assessing natural sandy gravel, using ten criteria over 19 mining locations, categorized into natural, environmental, and esthetic factors [7].
8. Groundwater quality in the Al-Shekhan area of Mosul city, Iraq, was assessed using SAW and AHP methods. Thirty wells were assessed as alternative water resources using 12 drinking-water parameters: depth, Ca, Mg, Cl, Na, SAR, SO4, HCO3, NO3, TDS, EC, and pH [8].
9. A decision support system application for a rental car system on Bali island was developed using SAW and AHP methods for recommendations, including a Geographic Information System (GIS) for location positioning. AHP was used to calculate the weights of each criterion, and SAW was then used to determine the rating weights specified in the criteria [9].
10. A hybrid decision support system using the SAW and AHP methods weighted bandwidth management for faculties at Semarang University [10]; the alternatives were seven faculties (FTIK, Psychology, Laws, Engineering, Economic, THP, and POST) assessed on five criteria: number of lab computers, number of lecturers, number of students, weekly teaching time, and number of study programs.
11. The SAW and AHP methods were combined in a decision support system for satisfaction measurement of a service-provided e-procurement, evaluating customer service quality with criteria such as parts, employees, system acceptance, usefulness, and ease of use [11].
12. The quality of the Indonesian salt industry was assessed by comparing the SAW and AHP methods to determine the highest-quality salt; SAW performed better, with 80% accuracy against 76% for AHP, using three criteria to weight salt quality: NaCl, water content, and salt color [12].
13. A decision-making application for keyword selection was created, in which a keyword-selection website improves search engine visibility using a combination of the SAW and AHP methods. Webmasters used the system to prioritize the optimal keywords, assessed on four criteria: volume, result, CPC, and competition; the AHP method weighted the parameters, and the SAW method produced the final value and rank [13].
14. A decision support system for new-student selection at Buddhi high school was built with a combination of the SAW and AHP methods, where SAW weighted the performance rating score of each alternative and AHP carried out the Pairwise Comparison Matrix (PCM), including the Normalized PCM (NPCM). The process used four criteria: middle school report score, national exam score, academic potential test score, and interview score [14].
15. A novel hybrid combination of the SAW and AHP methods was proposed to measure and evaluate preeminent brake friction composite formulations, seeking the formulation that yields optimized tribological properties across all performance attributes at a single time; the formulation NF-1 with 5 wt% ramie fiber proved the best combination [15].
16. The SAW and AHP methods were combined in "E-private", a decision support system for a private tutoring business as non-formal education, using five criteria: education, experience, cost, discipline, and teaching. The AHP stage takes the criteria and comparison scale as input, performs the PCM activity and priority weighting, and computes the consistency ratio, looping back to new inputs if inconsistent; once consistent, the SAW stage values the alternatives, creates and normalizes the matrix, and shows the ranking result [16].
17. Another decision support application was built by comparing the SAW and AHP methods for selecting recipients of subsidized home loans, which is used by
a developer company using six criteria: type of work, income, number of dependents, down payment, Indonesian Bank (BI) checking, and completeness of documents [17]. In this paper, the SAW method gave a better result than the AHP method.
18. A simple application was developed with the SAW and AHP methods to help a music studio or music group in the singer selection process and find the best singer, using experimental results and consistency results; AHP handled the weighting of the criteria and SAW scored the participants [18].
19. Risk analysis at PT. PLN Persero, anticipating hazards and developing actions to reduce risk, was carried out with a combination of the SAW and AHP methods, where SAW measured the subjective concept of human-related uncertainty and AHP determined the criteria weights; in this paper, the SAW method ran in 9 steps and the AHP method in 4 [19].
20. A decision support system for employee placement was created and compared across four methods: SAW, AHP, TOPSIS, and the Preference Ranking Organization Method for Enrichment of Evaluations (PROMETHEE). Five criteria were used (knowledge, skill, ability, physical, and attitude), and in experiments on 60 employee placement datasets, AHP reached 50% accuracy, SAW 81%, TOPSIS 95%, and PROMETHEE 93.44% [20].
21. Groundwater potential (GWP) in the United Arab Emirates (UAE) and Oman was modeled from the physiographic variables affecting groundwater accumulation using three methods: SAW, AHP, and Probabilistic Frequency Ratio (PFR). SAW and AHP were valid for well potential zones as water resources at 98% and 92%, and for springs at 63% and 86%, respectively [21].
3 Simple Additive Weighting Implementation

The SAW method begins by classifying the criteria chosen to support the decision. The criteria fall into two attribute categories, cost and benefit: for a benefit criterion, a higher match value is better, while for a cost criterion, a smaller match value is better. The SAW method has four steps:

1. List the weighting criteria, including the range of each criterion and the normalized range of criteria.
2. Score the sample data using the normalized range of criteria.
3. Normalize the performance rating scores.
4. Rank the scores.
Table 1 Weighting criteria

No. | Criteria             | Criteria weight (%) | Range of criteria | Normalized range of criteria
1   | Experience (K1)      | 5                   | 0, 1              | 1–2
2   | Recommendations (K2) | 20                  | 0, 1, 2, > 2      | 1–4
3   | Interview (K3)       | 10                  | 1–5               | 1–5
4   | Discipline (K4)      | 10                  | 1–4               | 1–4
5   | Skills (K5)          | 25                  | 0–5               | 1–6
6   | Physical health (K6) | 30                  | 1–4               | 1–4
3.1 List of Weighting Criteria, Including the Range of Criteria and Normalization of Criteria Range

As determined above, there are six weighting criteria, each with its own range. The ranges are as follows:

a. Experience ranges over 0 or 1, for without or with experience.
b. Recommendations range over 0, 1, 2, > 2, for minor, adequate, very adequate, and substantial.
c. Interview ranges over 1, 2, 3, 4, 5, for very bad, bad, adequate, good, and very good.
d. Discipline ranges over 1, 2, 3, 4, for deficient, adequate, good, and very good.
e. Skills range over 0, 1, 2, 3, 4, 5, for bad, deficient, adequate, passable, good, and very good.
f. Physical health ranges over 1, 2, 3, 4, for bad, deficient, satisfactory, and very satisfactory.
The normalized range of each criterion is given in the last column of Table 1: after normalization, every range starts at 1, with maxima between 2 and 6.
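The shift is mechanical: zero-based scales move up by one so every criterion starts at 1, and, as we read the table, the open-ended recommendation category "> 2" is capped at the top grade. A minimal sketch (the function names are ours, not the paper's):

```python
def shift_scale(value):
    """Zero-based scales such as experience (0-1) and skills (0-5)
    shift up by one, giving the normalized ranges 1-2 and 1-6."""
    return value + 1

def normalize_recommendation(value):
    """Recommendations use the categories 0, 1, 2, > 2, which map
    to the normalized grades 1, 2, 3, 4 (values above 2 are capped)."""
    return min(value, 3) + 1
```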
3.2 Scoring of Sample Data Using a Normalized Range of Criteria

After establishing the weighting criteria table, the students' scores are normalized using the last column of Table 1; Table 2 shows the resulting scores for the 20 students used as examples in this paper.
Table 2 Results of collecting student assessment data

Alternative | K1 | K2 | K3 | K4 | K5 | K6
Student 1   | 2  | 1  | 2  | 3  | 3  | 2
Student 2   | 1  | 1  | 3  | 2  | 3  | 2
Student 3   | 1  | 2  | 4  | 3  | 2  | 3
Student 4   | 1  | 1  | 3  | 4  | 3  | 3
Student 5   | 2  | 4  | 3  | 2  | 4  | 4
Student 6   | 1  | 2  | 3  | 2  | 3  | 3
Student 7   | 2  | 3  | 3  | 2  | 3  | 2
Student 8   | 1  | 1  | 2  | 2  | 2  | 2
Student 9   | 2  | 3  | 4  | 3  | 5  | 3
Student 10  | 1  | 1  | 3  | 3  | 6  | 3
Student 11  | 2  | 1  | 2  | 3  | 3  | 3
Student 12  | 1  | 1  | 3  | 3  | 6  | 3
Student 13  | 1  | 2  | 2  | 1  | 1  | 3
Student 14  | 1  | 1  | 2  | 2  | 4  | 2
Student 15  | 1  | 2  | 3  | 2  | 4  | 3
Student 16  | 2  | 1  | 4  | 3  | 3  | 4
Student 17  | 1  | 1  | 2  | 2  | 3  | 2
Student 18  | 1  | 1  | 1  | 1  | 2  | 2
Student 19  | 1  | 2  | 2  | 3  | 4  | 2
Student 20  | 2  | 4  | 5  | 3  | 5  | 3
3.3 Normalization of Performance Rating Score

To find the normalized performance rating, Eq. (1) is used. It has two cases, for benefit and cost attributes; in this paper we restrict attention to benefit attributes only, so the first case, $r_{ij} = X_{ij} / \max_i X_{ij}$, applies.

$$
r_{ij} =
\begin{cases}
\dfrac{X_{ij}}{\max_i X_{ij}}, & \text{if attribute } j \text{ is a benefit} \\[6pt]
\dfrac{\min_i X_{ij}}{X_{ij}}, & \text{if attribute } j \text{ is a cost}
\end{cases}
\tag{1}
$$

where
r_ij is the normalized performance rating score,
X_ij is the weighting criteria score,
Max X_ij is the highest weighting criteria score,
Min X_ij is the lowest weighting criteria score,
i indexes the alternatives (students), and
j indexes the criteria.
Table 3 Normalized matrix

Alternative | K1   | K2   | K3   | K4   | K5   | K6   | Vi    | Ranking order
Student 1   | 1.00 | 0.25 | 0.40 | 0.75 | 0.50 | 0.50 | 0.49  | 14
Student 2   | 0.50 | 0.25 | 0.60 | 0.50 | 0.50 | 0.50 | 0.46  | 16
Student 3   | 0.50 | 0.50 | 0.80 | 0.75 | 0.33 | 0.75 | 0.588 | 8
Student 4   | 0.50 | 0.25 | 0.60 | 1.00 | 0.50 | 0.75 | 0.585 | 9
Student 5   | 1.00 | 1.00 | 0.60 | 0.50 | 0.67 | 1.00 | 0.827 | 2
Student 6   | 0.50 | 0.50 | 0.60 | 0.50 | 0.50 | 0.75 | 0.585 | 10
Student 7   | 1.00 | 0.50 | 0.60 | 0.50 | 0.50 | 0.50 | 0.585 | 11
Student 8   | 0.50 | 0.25 | 0.40 | 0.50 | 0.33 | 0.50 | 0.398 | 19
Student 9   | 1.00 | 0.75 | 0.80 | 0.75 | 0.83 | 0.75 | 0.788 | 3
Student 10  | 0.50 | 0.25 | 0.60 | 0.75 | 1.00 | 0.75 | 0.685 | 4
Student 11  | 1.00 | 0.25 | 0.40 | 0.75 | 0.50 | 0.75 | 0.565 | 12
Student 12  | 0.50 | 0.25 | 0.60 | 0.75 | 1.00 | 0.75 | 0.685 | 5
Student 13  | 0.50 | 0.50 | 0.40 | 0.25 | 0.17 | 0.75 | 0.457 | 17
Student 14  | 0.50 | 0.25 | 0.40 | 0.50 | 0.67 | 0.50 | 0.482 | 15
Student 15  | 0.50 | 0.50 | 0.60 | 0.50 | 0.67 | 0.75 | 0.627 | 7
Student 16  | 1.00 | 0.25 | 0.80 | 0.75 | 0.50 | 1.00 | 0.68  | 6
Student 17  | 0.50 | 0.25 | 0.40 | 0.50 | 0.50 | 0.50 | 0.44  | 18
Student 18  | 0.50 | 0.25 | 0.20 | 0.25 | 0.33 | 0.50 | 0.353 | 20
Student 19  | 0.50 | 0.50 | 0.40 | 0.75 | 0.67 | 0.50 | 0.557 | 13
Student 20  | 1.00 | 1.00 | 1.00 | 0.75 | 0.83 | 0.75 | 0.858 | 1
As seen in Table 2, Max X_ij for criteria K1, K2, K3, K4, K5, and K6 is 2, 4, 5, 4, 6, and 4, respectively. For example, for student 1 (the first row of Table 3), applying r_ij = X_ij / Max_i X_ij with i = 1 and j = 1 to 6 gives r_11 = 2/2 = 1, r_12 = 1/4 = 0.25, r_13 = 2/5 = 0.4, r_14 = 3/4 = 0.75, r_15 = 3/6 = 0.5, and r_16 = 2/4 = 0.5. The first row of Table 3 therefore shows the result of applying Eq. (1): 1, 0.25, 0.4, 0.75, 0.5, and 0.5. Table 3 as a whole shows the result of applying Eq. (1) to all 20 students' data from Table 2.
3.4 Ranking Score

The final step is to rank the alternatives using Eq. (2):

$$
V_i = \sum_{j=1}^{n} w_j \, r_{ij}
\tag{2}
$$
where
V_i is the ranking score for each alternative (student),
w_j is the weight score of each criterion,
r_ij is the normalized performance rating value,
i indexes the alternatives (students), and
j indexes the criteria.
w_j is the criteria weight, as seen in the third column of Table 1, with criteria K1, K2, K3, K4, K5, and K6 carrying weight scores 0.05, 0.2, 0.1, 0.1, 0.25, and 0.3, respectively. For example, for student 1 (the first row of Table 3), applying Eq. (2) with i = 1 and j = 1 to 6 gives V_1 = (0.05 × 1) + (0.2 × 0.25) + (0.1 × 0.4) + (0.1 × 0.75) + (0.25 × 0.50) + (0.3 × 0.5) = 0.05 + 0.05 + 0.04 + 0.075 + 0.125 + 0.15 = 0.49. The last column of Table 3 shows V_i as the ranking score for each student; the highest score, 0.858, marks student 20 as the best student, and the lowest score, 0.353, belongs to student 18.
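The worked example for student 1 can be checked numerically; the scores, column maxima, and weights below are read off Tables 1 and 2:

```python
# student 1's scores from Table 2, and the column maxima over all 20 students
x = [2, 1, 2, 3, 3, 2]
col_max = [2, 4, 5, 4, 6, 4]
weights = [0.05, 0.20, 0.10, 0.10, 0.25, 0.30]  # Table 1 weights as fractions

# Eq. (1), benefit attributes only: r_ij = x_ij / max_i x_ij
r = [xi / m for xi, m in zip(x, col_max)]
# Eq. (2): V_i = sum over j of w_j * r_ij
v1 = sum(w * ri for w, ri in zip(weights, r))
```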
4 Analytical Hierarchy Process Implementation

The AHP method is run in two steps:

1. Processing criteria.
2. Processing alternatives.

The AHP method processes the six criteria in this paper (K1, K2, K3, K4, K5, and K6) and the 20 alternatives (student 1 to student 20), as seen in Table 3. Each step applies a Pairwise Comparison Matrix (PCM) and a Normalized Pairwise Comparison Matrix (NPCM). Based on the NPCM, the alternatives are scored to obtain the best criteria and the best alternatives.
4.1 Processing Criteria

Processing criteria handles the six criteria in this paper (K1, K2, K3, K4, K5, and K6), as seen in Table 3, in three sub-steps:

a. Pairwise Comparison Matrix.
b. Normalized Pairwise Comparison Matrix.
c. Calculating the consistency.
4.1.1 Pairwise Comparison Matrix
The pairwise comparison matrix (PCM) is created from the criteria weights in the third column of Table 1 (criteria K1, K2, K3, K4, K5, and K6 with weight scores 0.05, 0.2, 0.1, 0.1, 0.25, and 0.3, respectively). Each entry is the row criterion's weight divided by the column criterion's weight, per Eq. (3):

$$
\text{PCM}_{ij} = \frac{w_i}{w_j}
\tag{3}
$$
In Eq. (3), the row and column refer to the criteria of Table 4. For example, the first row is K1/K1 = 0.05/0.05 = 1, K1/K2 = 0.05/0.2 = 0.25, K1/K3 = 0.05/0.1 = 0.5, K1/K4 = 0.05/0.1 = 0.5, K1/K5 = 0.05/0.25 = 0.2, and K1/K6 = 0.05/0.3 = 0.17, so the first row of Table 4 holds the scores 1, 0.25, 0.5, 0.5, 0.2, and 0.17. The last row of Table 4 is the sum of each column, which is used in the next step to build the NPCM. Table 5 is the scale of relative importance (Saaty scale) for Table 4, giving meaning to the PCM entries; for example, the score 5 at row K5, column K1 (K5/K1) indicates essential importance on the Saaty scale shown in Table 5.

Table 4 Pairwise comparison matrix for criteria

Row/col | K1   | K2   | K3   | K4   | K5   | K6
K1      | 1.00 | 0.25 | 0.50 | 0.50 | 0.20 | 0.17
K2      | 4.00 | 1.00 | 2.00 | 2.00 | 0.80 | 0.67
K3      | 2.00 | 0.50 | 1.00 | 1.00 | 0.40 | 0.33
K4      | 2.00 | 0.50 | 1.00 | 1.00 | 0.40 | 0.33
K5      | 5.00 | 1.25 | 2.50 | 2.50 | 1.00 | 0.83
K6      | 6.00 | 1.50 | 3.00 | 3.00 | 1.20 | 1.00
Sum     | 20   | 5    | 10   | 10   | 4    | 3.33
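Since the PCM is generated directly from the weight vector by Eq. (3), Table 4 can be reproduced and its column sums checked in a few lines:

```python
w = [0.05, 0.20, 0.10, 0.10, 0.25, 0.30]  # criteria weights from Table 1

# Eq. (3): each PCM entry is the row weight divided by the column weight
pcm = [[wi / wj for wj in w] for wi in w]
col_sums = [sum(row[j] for row in pcm) for j in range(len(w))]
```

Because the weights sum to 1, each column sum is simply 1/w_j, which is why the last row of Table 4 reads 20, 5, 10, 10, 4, and 3.33.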
Table 5 Saaty scale

Comparative scale | Definition
1                 | Equal importance
3                 | Weak importance of one over another
5                 | Essential or strong importance
7                 | Demonstrated importance
9                 | Extreme importance
2, 4, 6, 8        | Intermediate values between the two adjacent judgments
Reciprocal        | The opposite
4.1.2 Normalized Pairwise Comparison Matrix
Normalization of the pairwise comparison matrix (NPCM) is done with Eq. (4), dividing each score by the sum of its column (the last row of Table 4):

$$
\text{NPCM}_{ij} = \frac{\text{PCM}_{ij}}{\text{sum\_col}_j}
\tag{4}
$$

Table 6 shows the normalized PCM. Its criteria weights column is obtained with Eq. (5), as the average of each row's scores; the criteria weights are also called the priority vector or eigenvector.

$$
w_i = \frac{1}{n} \sum_{j=1}^{n} \text{NPCM}_{ij}
\tag{5}
$$

$$
w_i = \frac{EV_i^{1/n}}{\sum_{k} EV_k^{1/n}}
\tag{6}
$$

where EV_i (the eigenvalue of row i) is the product of all scores in that row of the PCM. The eigenvector can also be found via Eq. (6) by first computing each row's eigenvalue in the PCM of Table 4. For example, for the first row of Table 4, EV = 1 × 0.25 × 0.5 × 0.5 × 0.2 × 0.17 = 0.0021, and raising it to the power 1/n with n = 6 criteria gives 0.0021^(1/6) = 0.359, as shown in the first row of the "Eigen Value" column in Table 6. Summing that column gives 7.146 (its last row), so the criteria weight for the first row is 0.359/7.146 = 0.05, matching the first entry of the "Criteria weights" column in Table 6.
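Equations (4)–(6) can be cross-checked numerically. Because this PCM is built from the weights themselves, it is perfectly consistent, so both the column-normalization route (Eqs. 4 and 5) and the root-of-row-products route (Eq. 6) recover the original weights; a sketch under that assumption:

```python
import math

w = [0.05, 0.20, 0.10, 0.10, 0.25, 0.30]
n = len(w)
pcm = [[wi / wj for wj in w] for wi in w]  # Eq. (3)

# Eq. (4): divide each entry by its column sum
col_sums = [sum(row[j] for row in pcm) for j in range(n)]
npcm = [[pcm[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

# Eq. (5): criteria weight = average of each NPCM row
w_avg = [sum(row) / n for row in npcm]

# Eq. (6): n-th root of each PCM row product, then normalize
roots = [math.prod(row) ** (1 / n) for row in pcm]
w_root = [r / sum(roots) for r in roots]
```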
4.1.3 Calculating the Consistency
The consistency calculation checks whether the computed scores are correct. Each score in the PCM of Table 4 is multiplied by the "criteria weights" column of Table 6, and each row is then summed; the result is the "weighted sum value" column in Table 6, obtained with Eq. (7):

$$
\text{WSV}_i = \sum_{j=1}^{n} \text{PCM}_{ij} \, w_j
\tag{7}
$$

$$
w_i = EV_i^{1/n}
\tag{8}
$$

where, in Eq. (8), EV_i is the product of all scores in row i of the normalized PCM.
Table 6 Normalized PCM for criteria

Row/col | K1   | K2   | K3   | K4   | K5   | K6   | Eigen Value | Weighted sum value | Criteria weights/Eigenvector | Ratio
K1      | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 | 0.359       | 0.30               | 0.05                         | 6
K2      | 0.20 | 0.20 | 0.20 | 0.20 | 0.20 | 0.20 | 1.431       | 1.20               | 0.20                         | 6
K3      | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 0.714       | 0.60               | 0.10                         | 6
K4      | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 0.714       | 0.60               | 0.10                         | 6
K5      | 0.25 | 0.25 | 0.25 | 0.25 | 0.25 | 0.25 | 1.786       | 1.50               | 0.25                         | 6
K6      | 0.30 | 0.30 | 0.30 | 0.30 | 0.30 | 0.30 | 2.144       | 1.80               | 0.30                         | 6
Sum     | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 7.146       | 6.00               | 1.00                         | λ max = 6
Table 7 Random consistency index (RI)

n  | 1 | 2 | 3    | 4   | 5    | 6    | 7    | 8    | 9    | 10
RI | 0 | 0 | 0.58 | 0.9 | 1.12 | 1.24 | 1.32 | 1.41 | 1.45 | 1.49
Furthermore, the criteria weights can also be found with Eq. (8) applied to the normalized PCM of Table 6, where EV is the product of all scores in the same row. For example, for the first row of Table 6, EV = 0.05 × 0.05 × 0.05 × 0.05 × 0.05 × 0.05 ≈ 0.000000016, and EV^(1/n) with n = 6 criteria gives 0.000000016^(1/6) = 0.05, the same value as the first entry of the "Criteria weights" column in Table 6.

$$
\text{Ratio}_i = \frac{\text{WSV}_i}{w_i}
\tag{9}
$$

$$
\lambda_{\max} = \operatorname{avg}(\text{Ratio})
\tag{10}
$$

$$
CI = \frac{\lambda_{\max} - n}{n - 1}
\tag{11}
$$

$$
CR = \frac{CI}{RI}
\tag{12}
$$

The ratio column is obtained with Eq. (9) by dividing the "weighted sum value" by the "criteria weights." Lambda maximum (λ max), the largest eigenvalue, is the average of all ratio scores per Eq. (10) and here equals 6. The Consistency Index (CI) follows from Eq. (11): CI = (λ max − n)/(n − 1) = (6 − 6)/(6 − 1) = 0/5 = 0. The Consistency Ratio (CR) of Eq. (12) is the quotient of CI and the Random Consistency Index (RI), which depends on the number of criteria n and is read from Table 7; since n = 6, the RI score is 1.24, so CR = CI/RI = 0/1.24 = 0. If CR ≤ 0.1, the consistency is acceptable; if CR > 0.1, the comparisons need to be reevaluated. A value of 0 is the lowest and most consistent possible; the value cannot be negative, and a negative value would indicate an error in the calculation process. Since CR = 0 here, the comparisons are consistent.
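The whole consistency check (Eqs. 7 and 9–12) fits in a short script; the RI lookup below reproduces Table 7:

```python
w = [0.05, 0.20, 0.10, 0.10, 0.25, 0.30]
n = len(w)
pcm = [[wi / wj for wj in w] for wi in w]

# Eq. (7): weighted sum value = PCM row dotted with the weights
wsv = [sum(pcm[i][j] * w[j] for j in range(n)) for i in range(n)]
# Eqs. (9) and (10): per-row ratios and their average, lambda_max
ratios = [wsv[i] / w[i] for i in range(n)]
lambda_max = sum(ratios) / n
# Eqs. (11) and (12): consistency index and consistency ratio
RI = [0, 0, 0, 0.58, 0.9, 1.12, 1.24, 1.32, 1.41, 1.45, 1.49]  # Table 7, indexed by n
CI = (lambda_max - n) / (n - 1)
CR = CI / RI[n]
```

A perfectly consistent matrix gives lambda_max = n, hence CI = CR = 0; any CR above 0.1 would call for re-judging the comparisons.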
4.2 Processing Alternatives

After processing the criteria, processing alternatives handles the 20 alternatives (student 1 to student 20), as seen in Table 3, in two sub-steps:

a. Pairwise Comparison Matrix.
b. Normalized Pairwise Comparison Matrix.
4.2.1 Pairwise Comparison Matrix
The pairwise comparison matrix (PCM) for alternatives is created from the 20 students' assessment data in Table 2. Each student has weighted criteria scores, and one PCM is built and scored per criterion; with six criteria (K1, K2, K3, K4, K5, and K6), there are six tables of the form of Table 8, which shows the PCM of alternatives for criterion K1, the "experience" criterion of Table 2. Attributes S1 to S20 in Table 8 represent students 1 to 20. Each entry of Table 8 pairs one student's K1 score against another's. For example, in the first row (S1), S1/S1 = 2/2 = 1, where 2 is student 1's score in column K1 of Table 2; likewise S1/S2 = 2/1 = 2, S1/S3 = 2/1 = 2, S1/S4 = 2/1 = 2, S1/S5 = 2/2 = 1, and so on, where S2 = 1, S3 = 1, S4 = 1, and S5 = 2 are the respective students' scores in column K1 of Table 2. The last row of Table 8, the sum of each column, is used in the next step to normalize the PCM.
4.2.2 Normalized Pairwise Comparison Matrix
Normalization of the PCM (NPCM) is done with Eq. (4) by dividing each score by the sum of its column, shown in the last row of Table 8: Normalized PCM = score/sum_col, where score is each entry and sum_col is the corresponding column sum in Table 8. Since there are six criteria (K1 through K6), there are six PCM tables and likewise six NPCM tables; Table 9 is an example of the NPCM of alternatives for criterion K1, the "experience" criterion of Table 1. For example, in the first row (S1) of Table 9, S1/S1 = 1/13.5 = 0.074, where 1 is the score at position (S1, S1) and 13.5 is the column sum from Table 8; likewise S1/S2 = 2/27 = 0.074, where 2 is the score at position (S1, S2) and 27 is the sum of column S2 in Table 8. The last row of Table 9 sums each column; every sum equals 1, which confirms the equations were applied correctly. From each of the six NPCM tables, the criteria weights (eigenvector) are then computed using Eq. (5), (6), or (8); Table 10 shows the eigenvectors for all six criteria. As with the NPCM column sums in Table 9, the last row of Table 10 should equal 1 in each column, confirming the success of the computation.
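For a single criterion, the whole PCM-to-eigenvector chain can be verified with the K1 column of Table 2; because each PCM column is just the score vector rescaled, the eigenvector entry for a student reduces to that student's score over the column total (2/27 ≈ 0.074 for student 1):

```python
# K1 (experience) scores of students 1..20, read from Table 2
k1 = [2, 1, 1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 2]
n = len(k1)

pcm = [[a / b for b in k1] for a in k1]                    # pairwise score ratios
col_sums = [sum(row[j] for row in pcm) for j in range(n)]  # 13.5 or 27.0
npcm = [[pcm[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
eig = [sum(row) / n for row in npcm]                       # Eq. (5): row average
```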
Table 8 Pairwise comparison matrix for alternatives based on criteria K1

Alternative | S1   | S2   | S3   | S4   | S5   | S6   | S7   | S8   | S9   | S10
S1          | 1    | 2    | 2    | 2    | 1    | 2    | 1    | 2    | 1    | 2
S2          | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S3          | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S4          | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S5          | 1    | 2    | 2    | 2    | 1    | 2    | 1    | 2    | 1    | 2
S6          | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S7          | 1    | 2    | 2    | 2    | 1    | 2    | 1    | 2    | 1    | 2
S8          | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S9          | 1    | 2    | 2    | 2    | 1    | 2    | 1    | 2    | 1    | 2
S10         | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S11         | 1    | 2    | 2    | 2    | 1    | 2    | 1    | 2    | 1    | 2
S12         | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S13         | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S14         | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S15         | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S16         | 1    | 2    | 2    | 2    | 1    | 2    | 1    | 2    | 1    | 2
S17         | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S18         | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S19         | 0.5  | 1    | 1    | 1    | 0.5  | 1    | 0.5  | 1    | 0.5  | 1
S20         | 1    | 2    | 2    | 2    | 1    | 2    | 1    | 2    | 1    | 2
Sum         | 13.5 | 27.0 | 27.0 | 27.0 | 13.5 | 27.0 | 13.5 | 27.0 | 13.5 | 27.0

Alternative | S11  | S12  | S13  | S14  | S15  | S16  | S17  | S18  | S19  | S20
S1          | 1    | 2    | 2    | 2    | 2    | 1    | 2    | 2    | 2    | 1
S2          | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S3          | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S4          | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S5          | 1    | 2    | 2    | 2    | 2    | 1    | 2    | 2    | 2    | 1
S6          | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S7          | 1    | 2    | 2    | 2    | 2    | 1    | 2    | 2    | 2    | 1
S8          | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S9          | 1    | 2    | 2    | 2    | 2    | 1    | 2    | 2    | 2    | 1
S10         | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S11         | 1    | 2    | 2    | 2    | 2    | 1    | 2    | 2    | 2    | 1
S12         | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S13         | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S14         | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S15         | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S16         | 1    | 2    | 2    | 2    | 2    | 1    | 2    | 2    | 2    | 1
S17         | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S18         | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S19         | 0.5  | 1    | 1    | 1    | 1    | 0.5  | 1    | 1    | 1    | 0.5
S20         | 1    | 2    | 2    | 2    | 2    | 1    | 2    | 2    | 2    | 1
Sum         | 13.5 | 27.0 | 27.0 | 27.0 | 27.0 | 13.5 | 27.0 | 27.0 | 27.0 | 13.5
Moreover, each column of Table 10 corresponds to its criterion, where column K1 uses criterion K1. For example, the position student 1/K1 in Table 10 holds the score 0.07407407, and by Eq. (5), where the criteria weight is the average of the row scores, 0.07407407 is the average of the first row (S1) of Table 9. The "Total Eigenvector" column is likewise the average of each row's six values by Eq. (5), giving the total eigenvector for alternatives student 1 to student 20. Finally, the ranking order is created from the "Total Eigenvector" column: the highest total eigenvector score, 0.0778149, belongs to student 20, and the lowest, 0.0283161, to student 18.
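Because every per-criterion PCM here is a consistent ratio matrix, each criterion's eigenvector entry reduces to score/column-total, and the total eigenvector is the row average over the six criteria. A sketch reproducing the extremes of Table 10 from the raw scores of Table 2:

```python
# Table 2 scores, one row per student (1..20), columns K1..K6
scores = [
    [2, 1, 2, 3, 3, 2], [1, 1, 3, 2, 3, 2], [1, 2, 4, 3, 2, 3],
    [1, 1, 3, 4, 3, 3], [2, 4, 3, 2, 4, 4], [1, 2, 3, 2, 3, 3],
    [2, 3, 3, 2, 3, 2], [1, 1, 2, 2, 2, 2], [2, 3, 4, 3, 5, 3],
    [1, 1, 3, 3, 6, 3], [2, 1, 2, 3, 3, 3], [1, 1, 3, 3, 6, 3],
    [1, 2, 2, 1, 1, 3], [1, 1, 2, 2, 4, 2], [1, 2, 3, 2, 4, 3],
    [2, 1, 4, 3, 3, 4], [1, 1, 2, 2, 3, 2], [1, 1, 1, 1, 2, 2],
    [1, 2, 2, 3, 4, 2], [2, 4, 5, 3, 5, 3],
]
n_crit = 6
col_tot = [sum(row[j] for row in scores) for j in range(n_crit)]
# per-criterion eigenvector entry = score / column total; total = row average
total_eig = [sum(row[j] / col_tot[j] for j in range(n_crit)) / n_crit
             for row in scores]
best = max(range(len(scores)), key=lambda i: total_eig[i])
```

Here best lands on index 19 (student 20), and the totals match Table 10 to the printed precision.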
5 Calculation Result of SAW and AHP Method

After carrying out all the stages of the calculation process for both methods, the results lead to the following conclusions:

• The SAW method has four steps, while the AHP method has two steps.
• The SAW method uses two equations, while the AHP method uses eight; the criteria weight (eigenvector) equations (5), (6), and (8) are similar to one another but use different variables.

The stages of the two methods also differ. The SAW method:

• weighs the criteria, the ranges of criteria, and the normalization of criteria;
• scores the alternatives against the criteria as a matrix, using the normalized range of criteria;
• normalizes the performance rating scores of the matrix;
• ranks the scores.

Meanwhile, the AHP method:
Table 9 NPCM for alternatives based on criteria K1

Alternative | S1    | S2    | S3    | S4    | S5    | S6    | S7    | S8    | S9    | S10
S1          | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S2          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S3          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S4          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S5          | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S6          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S7          | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S8          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S9          | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S10         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S11         | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S12         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S13         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S14         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S15         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S16         | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S17         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S18         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S19         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S20         | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
Sum         | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1

Alternative | S11   | S12   | S13   | S14   | S15   | S16   | S17   | S18   | S19   | S20
S1          | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S2          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S3          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S4          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S5          | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S6          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S7          | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S8          | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S9          | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S10         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S11         | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S12         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S13         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S14         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S15         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S16         | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
S17         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S18         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S19         | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037 | 0.037
S20         | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074 | 0.074
Sum         | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1     | 1
• processes criteria and alternatives separately, building a Pairwise Comparison Matrix (PCM) and a Normalized Pairwise Comparison Matrix (NPCM) for each;
• builds the PCM for criteria as a comparison matrix among the criteria scores, and the PCM for alternatives as a comparison matrix among the alternatives for each criterion;
• computes the NPCM as the percentage of each PCM score; each NPCM has a row average and a column total, where the row average gives the criteria weights (eigenvector) and a column total of 1 confirms a legitimate NPCM. Specifically for the criteria process, a consistency check is additionally carried out with the Random Consistency Index (RI) to verify that the calculated scores are correct.

Based on Table 11, six students have the same ranking order under both methods: students 20, 3, 19, 1, 8, and 18. Meanwhile, there are tied SAW scores: students 10 and 12 share 0.685, and students 4, 6, and 7 share 0.585. For similarity consistency, the tied ranking orders are revised to ranks 9, 10, and 11 for students 7, 4, and 6, respectively. Since students 4, 6, and 7 share the same score across ranks 9, 10, and 11, student 4 contributes one more matching ranking order, for a total of seven students with the same ranking order; the SAW method thus shows 7/20 = 0.35, or 35%, similarity consistency. There is also a tied AHP score: students 10 and 12 share 0.0538194. Interestingly, there are four pairs of opposite (swapped) ranking orders: at ranks 2 and 3, the SAW method orders student 5 then student 9, while the AHP method orders student 9 then student 5.
The next swapped pair is at ranks 11 and 12, where the SAW method orders student 6 then student 11 while the AHP method orders student 11 then student 6. Another is at ranks 15 and 16, where the SAW method orders student 14 then student 2 while the AHP method orders student 2 then student 14. Furthermore, another swapped pair is at ranks 17 and 18, where the SAW
Table 10 Criteria weights or Eigenvector for 20 students (alternatives) upon six criteria

Alternative | K1         | K2         | K3         | K4         | K5         | K6         | Total Eigenvector | Ranking order
Student 1   | 0.07407407 | 0.02857143 | 0.03571429 | 0.06122449 | 0.04347826 | 0.03703704 | 0.0466833         | 14
Student 2   | 0.03703704 | 0.02857143 | 0.05357143 | 0.04081633 | 0.04347826 | 0.03703704 | 0.0400853         | 15
Student 3   | 0.03703704 | 0.05714286 | 0.07142857 | 0.06122449 | 0.02898551 | 0.05555556 | 0.0518957         | 8
Student 4   | 0.03703704 | 0.02857143 | 0.05357143 | 0.08163265 | 0.04347826 | 0.05555556 | 0.0499744         | 10
Student 5   | 0.07407407 | 0.11428571 | 0.05357143 | 0.04081633 | 0.05797101 | 0.07407407 | 0.0691321         | 3
Student 6   | 0.03703704 | 0.05714286 | 0.05357143 | 0.04081633 | 0.04347826 | 0.05555556 | 0.0479336         | 12
Student 7   | 0.07407407 | 0.08571429 | 0.05357143 | 0.04081633 | 0.04347826 | 0.03703704 | 0.0557819         | 5
Student 8   | 0.03703704 | 0.02857143 | 0.03571429 | 0.04081633 | 0.02898551 | 0.03703704 | 0.0346936         | 19
Student 9   | 0.07407407 | 0.08571429 | 0.07142857 | 0.06122449 | 0.07246377 | 0.05555556 | 0.0700768         | 2
Student 10  | 0.03703704 | 0.02857143 | 0.05357143 | 0.06122449 | 0.08695652 | 0.05555556 | 0.0538194         | 6
Student 11  | 0.07407407 | 0.02857143 | 0.03571429 | 0.06122449 | 0.04347826 | 0.05555556 | 0.0497697         | 11
Student 12  | 0.03703704 | 0.02857143 | 0.05357143 | 0.06122449 | 0.08695652 | 0.05555556 | 0.0538194         | 7
Student 13  | 0.03703704 | 0.05714286 | 0.03571429 | 0.02040816 | 0.01449275 | 0.05555556 | 0.0367251         | 18
Student 14  | 0.03703704 | 0.02857143 | 0.03571429 | 0.04081633 | 0.05797101 | 0.03703704 | 0.0395245         | 16
Student 15  | 0.03703704 | 0.05714286 | 0.05357143 | 0.04081633 | 0.05797101 | 0.05555556 | 0.050349          | 9
Student 16  | 0.07407407 | 0.02857143 | 0.07142857 | 0.06122449 | 0.04347826 | 0.07407407 | 0.0588085         | 4
Student 17  | 0.03703704 | 0.02857143 | 0.03571429 | 0.04081633 | 0.04347826 | 0.03703704 | 0.0371091         | 17
Student 18  | 0.03703704 | 0.02857143 | 0.01785714 | 0.02040816 | 0.02898551 | 0.03703704 | 0.0283161         | 20
Student 19  | 0.03703704 | 0.05714286 | 0.03571429 | 0.06122449 | 0.05797101 | 0.03703704 | 0.0476878         | 13
Student 20  | 0.07407407 | 0.11428571 | 0.08928571 | 0.06122449 | 0.07246377 | 0.05555556 | 0.0778149         | 1
Sum         | 1          | 1          | 1          | 1          | 1          | 1          | 1                 |
Table 11 Ranking order similarity consistency of SAW and AHP methods (A. Rufai et al.)

Ranking order | SAW method          | AHP method
Rank = 1      | Student 20 = 0.858  | Student 20 = 0.0778149
Rank = 2      | Student 5 = 0.827   | Student 9 = 0.0700768
Rank = 3      | Student 9 = 0.788   | Student 5 = 0.0691321
Rank = 4      | Student 10 = 0.685  | Student 16 = 0.0588085
Rank = 5      | Student 12 = 0.685  | Student 7 = 0.0557819
Rank = 6      | Student 16 = 0.68   | Student 10 = 0.0538194
Rank = 7      | Student 15 = 0.627  | Student 12 = 0.0538194
Rank = 8      | Student 3 = 0.588   | Student 3 = 0.0518957
Rank = 9      | Student 4 = 0.585   | Student 15 = 0.050349
Rank = 10     | Student 6 = 0.585   | Student 4 = 0.0499744
Rank = 11     | Student 7 = 0.585   | Student 11 = 0.0497697
Rank = 12     | Student 11 = 0.565  | Student 6 = 0.0479336
Rank = 13     | Student 19 = 0.557  | Student 19 = 0.0476878
Rank = 14     | Student 1 = 0.49    | Student 1 = 0.0466833
Rank = 15     | Student 14 = 0.482  | Student 2 = 0.0400853
Rank = 16     | Student 2 = 0.46    | Student 14 = 0.0395245
Rank = 17     | Student 13 = 0.457  | Student 17 = 0.0371091
Rank = 18     | Student 17 = 0.44   | Student 13 = 0.0367251
Rank = 19     | Student 8 = 0.398   | Student 8 = 0.0346936
Rank = 20     | Student 18 = 0.353  | Student 18 = 0.0283161
method places student 13 before student 17 while the AHP method places student 17 before student 13. Since there are four reversed pairs, treating each reversal as a match adds eight more students to the count of similar ranking orders: students 5 and 9 at ranking orders 2 and 3, students 6 and 11 at ranking orders 11 and 12, students 14 and 2 at ranking orders 15 and 16, and students 13 and 17 at ranking orders 17 and 18. Finally, the similarity consistency between the SAW method and the AHP method scores (7 + 8)/20 = 0.75, i.e., a 75% similarity consistency index.
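The (7 + 8)/20 counting above can be sketched generically: exact positional matches count as agreements, and both members of an adjacently transposed pair are also credited. A minimal sketch (the function name and the toy rankings are illustrative, not from the paper):

```python
def similarity_consistency(rank_a, rank_b):
    """Fraction of positions that agree between two ranking lists,
    where an adjacent transposition (X, Y) vs (Y, X) also counts
    both of its positions as agreements."""
    n = len(rank_a)
    credited = [rank_a[i] == rank_b[i] for i in range(n)]
    for i in range(n - 1):
        if (not credited[i] and not credited[i + 1]
                and rank_a[i] == rank_b[i + 1]
                and rank_a[i + 1] == rank_b[i]):
            credited[i] = credited[i + 1] = True
    return sum(credited) / n

# Toy example: positions 1 and 4 match exactly, positions 2 and 3 are
# an adjacent swap, so all four positions are credited.
print(similarity_consistency(["s1", "s2", "s3", "s4"],
                             ["s1", "s3", "s2", "s4"]))  # 1.0
```

Applied to two 20-student rankings, the returned fraction plays the role of the similarity consistency index.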
6 Conclusions

From the discussion and review of the decision support system model for selecting HR candidates, the following conclusions can be drawn. The decision support system simplifies data collection and the calculation of scores for prospective HR students, so that candidates with good competency scores can be selected. Using the SAW and AHP methods, the decision support system produces accurate decisions that improve the effectiveness of the selection process for prospective HR students. Moreover, the experiment shows that the SAW and AHP methods have a similarity consistency of 75%, as both produce closely matching results. Notably, both methods agree on the first ranking order, with student 20 obtaining the highest score, and on the last ranking order, with student 18 obtaining the lowest score. The current experiment with 20 students is an initial one; the actual implementation will be a web-based application with PHP as the server-side language and a MySQL database, using HTML, Cascading Style Sheets (CSS), and JavaScript on the client side.
Implementation of the Weighted Product Method to Specify scholarship’s Receiver Chintia Ananda, Diana Teresia Spits Warnars, and Harco Leslie Hendric Spits Warnars
Abstract Scholarship assistance for children of educational age, especially during the COVID-19 pandemic, has provided some relief for families affected by the economic turmoil of the period, during which some parents were laid off or had their salaries reduced. Such a scholarship at least helps students pursue the education they are undertaking so that, in the end, they can continue their dreams for a brighter future and for the welfare of themselves, their families, and the nation. On the other hand, new problems arise from the limited number of scholarships awarded and from the manipulation of student data to obtain them. Therefore, this paper discusses the implementation of a decision-making application for scholarships using the weighted product method, which underlies the awarding of scholarships with four criteria: participation of the student's parents in the Family Hope Program, orphan status, number of dependents of the parents, and parents' income. Keywords Decision support systems · Weighted product (WP) · Information systems
1 Introduction In this life, humans are often faced with moments where they have to make decisions, and one of them is when they realize that education is critical where getting a proper education and following the wishes of the individual will at least increase his dignity C. Ananda · D. T. S. Warnars Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] D. T. S. Warnars e-mail: [email protected] H. L. H. S. Warnars (B) Computer Science Department, Graduate Program, Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_5
C. Ananda et al.
as a human being and, besides that, improve the human capital of the nation. Furthermore, the state has an interest in and responsibility for facilitating its citizens to get a proper education according to the preferences and abilities of each community; in addition, considering that the Indonesian population is very large, educational activities are also delivered by the private education sector, which provides education to the highest possible level. However, as humans we are inseparable from destiny, which is determined by the creator and by human behavior itself, and this ultimately impacts the education that is pursued, especially in society. Moreover, children whose education is funded by their family or parents may, at the end of the day, be unable to continue their current education. Layoffs and reduced working hours due to working from home during the COVID-19 pandemic also reduced income, part of which is usually allocated to continuing education. This is where scholarships come in to help finance these educational activities. Scholarships can be defined as allocations of funds that do not come from private funds but from the government, companies, and educational foundations [1]. In addition, scholarship funds can be provided through exemption from tuition fees, categorized as a substitute for work obligations or official ties to be fulfilled after the scholarship recipient completes the period of education. Scholarships are given to recipients entitled to receive them, as determined by criteria stipulated by each scholarship-granting organization. Scholarships are awarded according to the interests of each scholarship provider, which reflect the institution's long-term goals as stated in its vision and mission.
For example, a religious organization will also look at religious affiliation, and a local government organization will prioritize giving scholarships to students from its own region. In this paper, the implementation focuses on SMA Negeri 8 Serang City in Banten Province, a government-owned school which, based on semester 2020/2021 data, has 54 teachers and 823 male and 1,337 female students in 30 classes, with a student–teacher ratio of 40. So far, SMA Negeri 8 Kota Serang has experienced difficulties because scholarship selection is still carried out manually, so it takes a very long time. Based on observation of the scholarship acceptance process, applicants only submit the required registration documents, and from these files a meeting is held to compare prospective students manually. Based on the results of the teacher meeting and its discussion, the students entitled to receive scholarships are announced through wall magazines. Based on interviews, observations, and group discussion forums with teachers, students, and parents, it is concluded that the current scholarship selection process is very ineffective and unfair in determining recipients. There are accusations of nepotism, with scholarship recipients coming from the families of teachers and employees, even when those recipients are entitled, intelligent, and deserving of scholarships. Given the problems mentioned above, it is essential to build a system supported by information technology, where decisions are based
on the output of a computer program. It is hoped that this decision-making system can help principals and teachers determine which students are eligible for scholarships objectively and avoid improper scholarship distribution. Also, it avoids accusations of nepotism among teachers and school employees in giving scholarships.
2 Current and Previous Similar Research Papers

To survey current and past similar research, we used Google Scholar to search for the term "decision support system," restricted to papers between 2017 and 2021, which produced 52,900 papers. We also restricted the search to "decision support system" + scholarships for papers between 2018 and 2021, yielding 6,580 results. We then expanded the search for papers between 2017 and 2021 with the term "decision support system" + "weighted product", resulting in 938 papers, continued with "weighted product" + scholarships (313 papers), and finally with "decision support system" + "weighted product" + scholarships (82 papers). However, after going through each search result, not all of the generated titles actually match the search terms, so the usable number is smaller than the number of results reported. In this literature review section, similar topics from the current and previous five years that use the weighted product (WP) method for decision support systems are considered. The WP method has been applied in many areas such as education, health systems, agriculture, restaurants and hotels, housing, offices, and business. In this paper, the literature review is limited to papers in education, housing, culinary systems, offices, business, health systems, and agriculture. The Weighted Product (WP) method is a multiplicative weighting method for combining attribute ratings, in which the rating of each attribute is raised to the power of the associated attribute weight; this process is sometimes also called normalization. The WP algorithm has three stages: first, determining the weight values W (weighting); second, determining the vector S values (the preference values); and third, determining the vector V values as the final ratings.
In the education system, decision support systems as decision support tools for top management have been implemented in every field of education, for example using the WP method combined with the Alkin model to evaluate educational services in a computer college library, implemented as a desktop application that evaluates the effectiveness level of digital libraries. That implementation uses the Borg & Gall development model, including interviews and questionnaires as user requirement tools [2]. Moreover, WP has been implemented to determine exemplary high school students using nine criteria: average report card score, ranking, number of absences, competition participation, active extracurricular activities, extracurricular positions, discipline, morals, and accumulated points of violation [3].
Decision-making applications built with the WP method determine which students are most eligible for a scholarship using four criteria: average report card, attendance, attitude, and extracurricular activities [6]. Moreover, a decision support system using the WP method was applied to search for achieving students [4], where one paper used five criteria: average class score, discipline, attendance, extracurricular, and non-academic [5]. Another implementation was built as a mobile application with Android Studio and the Java Development Kit, using the WP method to select the most suitable university in Pekanbaru with three criteria: distance, accreditation value, and number of lecturers [7]. In addition, another paper determines which students are most eligible for scholarships using the WP method with 14 criteria, combined with the K-Nearest Neighbor (KNN) method to calculate the final recommendation of scholarship recipients [8]. Another implementation uses the Multi-Objective Optimization Method based on Ratio Analysis (MOORA) to determine scholarship recipients with eight criteria, namely parents' income, number of dependents, academic grades, diploma achievement, monthly electricity bills, home status, type of house, and the value of the interview results [9]. Another implementation also uses the MOORA method to determine scholarship recipients for high school students, applying the same eight criteria [10]. Yet another implementation for determining high school scholarship recipients uses the Simple Additive Weighting (SAW) method with seven criteria, namely age range, parents' income, number of dependents of the parents, number of siblings, average report card score, organizational activities, and distance of residence [11].
In papers that discuss housing and culinary topics, the WP method has been applied in various decision support system applications, such as selecting the cooking products best suited for a housewife preparing food for the table, and likewise for chefs and restaurant owners choosing which cooking products are suitable to serve in their restaurants [12]. Also, a mobile application with the WP method is used to determine the culinary venue that best suits the budget of culinary connoisseurs in the city of Kudus, Central Java, using five criteria, namely menu variations, affordable prices, availability of wifi, availability of chargers, and an easy distance to travel [13]. The WP method is also applied to the granting of industrial permits for household businesses issued by the health office, ensuring healthy and quality food for the community, using five criteria, namely the production method used, the type of food sold, the product packaging used, participation in healthy and clean food preparation counseling, and completion of the profile file [14]. In papers discussing the application of the WP method in the office and business area, there are several applications combined with other methods. For example, the WP method is applied to find the best employees based on 13 criteria such as teamwork, craft, applying superiors' instructions, working diligently, responsibility, understanding the job, work achievement, average working speed, multi-skills, attendance, absence without information, frequency of leaving the workplace, and compliance with
company regulations [15]. Another best-employee selection is applied using the WP method for the Pringsewu district revenue office, Yogyakarta province, using five criteria: employee attendance, employee behavior, employee experience, employee discipline, and employee teamwork [16]. Besides, another implementation combines the WP method with the ELECTRE (Elimination Et Choix Traduisant la Réalité) method for employee recruitment using three initial criteria, namely age range, educational background, and experience, and four further criteria if employees are to be interviewed, namely psychological scores, ability and skill values, TOEFL scores, and interview scores [17]. In addition, a Simpeunan savings and loans cooperative in the city of Tasikmalaya uses the weighted product method and the Simple Multi-Attribute Rating Technique (SMART) to determine creditworthiness using five criteria such as the amount of savings, total income, character rating, form of guarantee, and condition of guarantee [18]. Furthermore, k-means clustering is combined with the WP method to map crime-prone locations, assisting security forces and the public in controlling crime in Kudus Regency, using three criteria, namely the number of crime incidents, the number of houses in an area, and the distance between the scene and the police station [19]. There are currently three papers in the health system area that use the WP method, one of which creates a decision-making system to inform pregnant women about nutritious foods for their babies, paying attention to the composition of the food they eat across five criteria such as vegetables, fruit, meat, nuts, and milk [20].
After that, another paper discusses a system to help decision-making at the GrandMed Hospital in Deli Serdang for selecting the suppliers or drugstores that will provide medicines to patients, paying attention to four criteria: price worthiness, delivery performance, drug variation, and delivery distance [21]. Another paper discusses an expert system that helps determine the types of diseases babies suffer from, considering seven criteria of baby health conditions, namely excessive crying, fever, diarrhea, skin problems, weight gain, height growth, and communication development [22]. In agriculture, two recent papers use the WP method: one discusses a system to help chili farmers determine the success of planting chilies by considering four criteria, namely soil elevation above sea level in meters, soil pH (power of hydrogen), nutritional value, and ambient temperature in Celsius [23]. The following paper discusses a system to assist rice farmers in choosing the rice varieties to plant, considering five criteria: potential success, average yield, harvest time, resistance to brown planthoppers, and resistance to bacterial leaf blight [24].
3 Results and Discussion

In this paper, the discussion is carried out in three stages: system modeling, weighted product implementation, and system implementation (Fig. 1).
Fig. 1 Use case diagram of the proposed application
3.1 System Modeling

In this system modeling, the system is modeled with two Unified Modeling Language (UML) diagrams, namely a use case diagram and a class diagram, where the use case diagram shows the processes in the application being built, while the class diagram shows the database design model to be used. There are four activities in the use case diagram, namely registration, login, entering student data, and running the results. When users want to use the system, they must first enter their data, such as username, name, gender, email, date of birth, and password, as seen in the table "User" in the class diagram in Fig. 2. In the login use case activity, the user must enter the username and password they previously registered, and the system checks the database for a match in the user table. If there is no match, the user cannot enter the system, and a confirmation message states that the username or password does not match. In the entry student data use case activity, the user can enter the students to be assessed by entering the student code and student name attributes, and the data will be stored in the database in the alternative table, as seen in the class diagram in Fig. 2. The entry process includes the scoring criteria for each student, which are saved in the scoring table shown in the class diagram in Fig. 2. In this use case activity, there are two sub-use case activities, drawn with the include symbol, which means that the subcases must be carried out; the two sub-use case

Fig. 2 Class diagram of the proposed application (classes: User with IdUser, UserName, Name, Gender, Email, DateOfBirth, Password; Alternative with IdStudent, Name; Scoring with IdStudent, IdCriteria, Score, S_score; Criteria with IdCriteria, NameCriteria, Weight, Percentage; Alternative 1 to 1..* Scoring, Scoring 1..* to 1 Criteria)
activities include entering the criteria that will be used to determine the predictions of students who will receive the scholarship, and entering the weight applied to each given criterion. The process of entering criteria and weights is stored in the criteria table, as shown in the class diagram in Fig. 2. When the criteria are entered, the percentage attribute in the table is generated automatically from each weight's share of the total weight. Finally, the running-the-result use case activity is used to compute the result based on the data entered in the previous use case activities, such as student data, criteria, and weights. Figure 2 shows the application class diagram, which contains four classes mapped to four database tables: user, alternative, scoring, and criteria, where the user table is not related to the other tables. The alternative table stores student data and has a one-to-many relationship with the scoring table. The scoring table in turn has a many-to-one relationship with the criteria table; in other words, the alternative table has a many-to-many relationship with the criteria table.
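The four tables and their relationships can be written out as DDL (a sketch only: column names follow the class diagram in Fig. 2, the SQL types are assumptions, and sqlite3 stands in for the MySQL database the paper targets so the example is runnable):

```python
import sqlite3

# In-memory database with the four tables from the class diagram.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    IdUser INTEGER PRIMARY KEY, UserName TEXT, Name TEXT,
    Gender TEXT, Email TEXT, DateOfBirth TEXT, Password TEXT);
CREATE TABLE alternative (
    IdStudent TEXT PRIMARY KEY, Name TEXT);
CREATE TABLE criteria (
    IdCriteria TEXT PRIMARY KEY, NameCriteria TEXT,
    Weight REAL, Percentage REAL);
-- scoring resolves the many-to-many link between alternative and criteria
CREATE TABLE scoring (
    IdStudent TEXT REFERENCES alternative(IdStudent),
    IdCriteria TEXT REFERENCES criteria(IdCriteria),
    Score REAL, S_score REAL,
    PRIMARY KEY (IdStudent, IdCriteria));
""")
conn.execute("INSERT INTO alternative VALUES ('A1', 'Andini Putri')")
conn.execute("INSERT INTO criteria VALUES ('C1', 'Family Hope Program', 4, 0.25)")
conn.execute("INSERT INTO scoring VALUES ('A1', 'C1', 1, 1.0)")
row = conn.execute("""
    SELECT a.Name, c.NameCriteria, s.Score
    FROM scoring s
    JOIN alternative a ON a.IdStudent = s.IdStudent
    JOIN criteria c ON c.IdCriteria = s.IdCriteria""").fetchone()
print(row)
```

The composite primary key on the scoring table enforces one score per student per criterion, which is what realizes the many-to-many relationship described above.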
3.2 Weighted Product Implementation

The Weighted Product (WP) method has four steps:
1. Determination of criteria, criteria names, weights, and ranges of criteria.
2. Finding the percentage of criteria weights.
3. Finding the preference value for each alternative.
4. Finding the ranking score.
The first step is determining the criteria, criteria names, weights, and ranges of criteria, as shown in Table 1, with four criteria, namely the involvement of parents in the Family Hope Program, status as an orphan, the number of dependents of the parents, and the parents' earnings, with weight scores of 4, 4, 3, and 5, respectively, and a total weight score of 4 + 4 + 3 + 5 = 16.

Table 1 Equalization of criteria weights

Criteria | Criteria name                                     | Weight | Range of criteria | Percentage of criteria weights
C1       | Involvement of parents in the hope family program | 4      | 1, 2              | 0.25
C2       | Orphans                                           | 4      | 1, 2              | 0.25
C3       | Number of dependents of parents                   | 3      | 1–5               | 0.1875
C4       | Parents earning                                   | 5      | 1–5               | 0.3125
Total    |                                                   | 16     |                   | 1

These four criteria and the weighting composition
of each of these criteria were obtained from interviews and discussion group forums with the school, represented by teachers, and with students, represented by their parents. Moreover, as seen in the fourth column of Table 1, the range of each criterion is also based on a discussion group forum between teachers and parents.
1. The first criterion, "parental involvement in the Family Hope Program," checks whether the parents are registered in the Family Hope Program organized by the government as a marker of underprivileged families; a score of 1 means not registered, while a score of 2 means registered.
2. The second criterion is orphan status, which has two choice scores, 1 and 2, where a score of 1 indicates that a student has only one parent, either mother or father, while a score of 2 indicates a student who has neither father nor mother.
3. The third criterion is the number of dependents of the parents, which has five choice scores, where scores of 1, 2, 3, 4, and 5 indicate that the parents have 1, 2, 3, 4, and more than 4 dependents, respectively.
4. The fourth criterion is the parents' earnings, which has five choice scores, where scores of 1, 2, 3, 4, and 5 indicate parents with incomes of more than 4 million, 3–3.99 million, 2–2.99 million, 1–1.99 million, and less than 1 million, respectively.
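The scoring rules above can be written as small helpers (a sketch; the function and parameter names are illustrative, not from the paper, and the boundary case of exactly 4 million income is assumed to take the lowest score):

```python
def score_c1(registered_in_family_hope_program: bool) -> int:
    # C1: 2 if the parents are registered in the Family Hope Program, else 1
    return 2 if registered_in_family_hope_program else 1

def score_c2(parents_alive: int) -> int:
    # C2: 1 if the student has only one parent, 2 if neither parent
    return 2 if parents_alive == 0 else 1

def score_c3(dependents: int) -> int:
    # C3: 1-4 dependents map to scores 1-4; more than 4 dependents map to 5
    return min(dependents, 5)

def score_c4(monthly_income_millions: float) -> int:
    # C4: lower income earns a higher score (>= 4M -> 1, ..., < 1M -> 5)
    if monthly_income_millions >= 4:
        return 1
    if monthly_income_millions >= 3:
        return 2
    if monthly_income_millions >= 2:
        return 3
    if monthly_income_millions >= 1:
        return 4
    return 5

print(score_c1(True), score_c2(1), score_c3(5), score_c4(1.5))  # 2 1 5 4
```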
The second step is to find the percentage of criteria weights (PCW_j), as shown in the last column of Table 1, using Eq. (1), where the weight of each criterion (w_j) is divided by the total of the criteria weights, \(\sum_j w_j\), in this case 4 + 4 + 3 + 5 = 16. For example, criterion C1 in the first row of Table 1 has \(\mathrm{PCW}_1 = w_1 / \sum_j w_j = 4/16 = 0.25\). The remaining criteria are processed with Eq. (1) in the same way, giving 0.25, 0.1875, and 0.3125, respectively, and the percentages of criteria weights total 0.25 + 0.25 + 0.1875 + 0.3125 = 1.

\[ \mathrm{PCW}_j = \frac{w_j}{\sum_j w_j} \quad (1) \]
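As a quick check, Eq. (1) applied to the Table 1 weights takes only a few lines:

```python
# Eq. (1): PCW_j = w_j / sum(w), with the Table 1 weights for C1..C4.
weights = [4, 4, 3, 5]
total = sum(weights)                 # 16
pcw = [w / total for w in weights]
print(pcw, sum(pcw))  # [0.25, 0.25, 0.1875, 0.3125] 1.0
```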
where:
PCW_j = percentage of criteria weights
w_j = weight of criterion j
\(\sum_j w_j\) = total of the criteria weights
j = criterion index

The third step is to find the preference value of each alternative (S_i), shown in the Total S_i column of Table 2, which covers 21 students/alternatives (i) scored against the four criteria (j) of Table 1, where the assessment of each criterion for each student (X_ij) was made from the scholarship registration forms filled in by the students and their parents. Meanwhile, the weight W_j is applied as a positive exponent for benefit criteria and a negative exponent for cost criteria; in this paper, C1 is a cost criterion while the others are benefit criteria.
Table 2 Matrix of the combination of alternatives and criteria, with the total S_i and V_i scores

No. | Name                 | Alternative | C1 | C2 | C3 | C4 | Total S_i | V_i
1   | Andini Putri         | A1          | 1  | 1  | 5  | 2  | 1.6793018 | 0.053637
2   | Sentanu Eka Prasetyo | A2          | 2  | 1  | 1  | 2  | 1.0442738 | 0.033354
3   | Iin Sumiati          | A3          | 2  | 1  | 1  | 2  | 1.0442738 | 0.033354
4   | Alexander HB         | A4          | 2  | 1  | 1  | 3  | 1.1853399 | 0.037859
5   | Mar Ah               | A5          | 2  | 1  | 1  | 3  | 1.1853399 | 0.037859
6   | Jumanah              | A6          | 2  | 1  | 1  | 3  | 1.1853399 | 0.037859
7   | Munihah              | A7          | 2  | 1  | 3  | 4  | 1.5934795 | 0.050895
8   | Siti Fatonah         | A8          | 2  | 1  | 2  | 3  | 1.3498516 | 0.043114
9   | Yanah                | A9          | 1  | 2  | 5  | 2  | 1.9970376 | 0.063785
10  | Lilis                | A10         | 2  | 1  | 3  | 3  | 1.4564753 | 0.046519
11  | Hilman               | A11         | 1  | 1  | 4  | 2  | 1.6104903 | 0.051439
12  | Ahmad Riyan Rifai    | A12         | 2  | 1  | 2  | 4  | 1.4768261 | 0.047169
13  | Astriyah             | A13         | 1  | 1  | 5  | 1  | 1.3522496 | 0.043191
14  | Irmawati             | A14         | 1  | 1  | 3  | 2  | 1.5259212 | 0.048738
15  | Aryadi Wijaya        | A15         | 1  | 1  | 4  | 3  | 1.828044  | 0.058387
16  | Nazwa Marisa Awalia  | A16         | 1  | 1  | 3  | 3  | 1.7320508 | 0.055321
17  | TB Aldiron           | A17         | 1  | 1  | 4  | 3  | 1.828044  | 0.058387
18  | Muhamad Fiqi         | A18         | 2  | 1  | 4  | 4  | 1.6817928 | 0.053716
19  | Deviani              | A19         | 2  | 1  | 2  | 2  | 1.1892071 | 0.037983
20  | Siti Irnawati        | A20         | 2  | 1  | 4  | 4  | 1.6817928 | 0.053716
21  | Sunariah             | A21         | 2  | 1  | 4  | 4  | 1.6817928 | 0.053716
    | Total                |             |    |    |    |    | 31.308925 | 1
For example, the alternative A1, the student Andini Putri shown in the first row of Table 2, has a score Si = ∏(j=1..n) Xij^Wj, where S1 = (X11^−W1)(X12^W2)(X13^W3)(X14^W4) = (1^−0.25)(1^0.25)(5^0.1875)(2^0.3125) = 1 × 1 × 1.3522496 × 1.2418578 = 1.6793018, which yields V1 = 1.6793018/31.308925 = 0.053637. In this case, the exponent for the first criterion C1 is negative (X11^−W1) since C1 is a cost criterion, while the other criteria C2, C3, and C4 take positive exponents (X12^W2)(X13^W3)(X14^W4) because they are benefit criteria. The other alternatives are also processed using Eq. (2), and each has its own Si result, as shown in the Total Si column of Table 2.

Si = ∏(j=1..n) Xij^Wj    (2)

where:
Si = preference value for each alternative
74
C. Ananda et al.
i = alternative
j = criterion
n = number of criteria
∏(j=1..n) = product over the criteria (j), i.e., multiplication across the criteria
Xij = assessment of criterion j for alternative i
Wj = percentage of criteria weights
Xij^Wj = Xij raised to the power Wj

The fourth and last step is to find the ranking score using Eq. (3) or (4), where the numerator ∏(j=1..n) Xij^Wj in Eq. (3) is the Si of Eq. (2), and the divisor is the total of all Si values, Σi ∏(j=1..n) Xij^Wj, which is 31.308925 as shown in Table 2. Therefore Vi, as shown in Eq. (4), is the value of Si from Eq. (2) divided by the total value of Si; as shown in the lower right corner of Table 2, the total of all Vi values for the 21 students/alternatives is 1.

Vi = ∏(j=1..n) Xij^Wj / Σi [∏(j=1..n) Xij^Wj]    (3)

Vi = Si / ΣSi    (4)
Based on Eq. (3) or (4), the highest ranking score belongs to the 9th student, Yanah, with Vi = 0.063785, and the lowest ranking scores belong to the second and third students, Sentanu Eka Prasetyo and Iin Sumiati, with the same Vi = 0.033354. This ranking indicates that the student with the highest score is the top priority for receiving the scholarship, while the student with the lowest score will be the last to receive it. Figure 3 is the graphic for the matrix combination of alternatives and criteria seen in Table 2; it confirms what Table 2 describes, with the student named Yanah having the highest score and the students named Sentanu Eka Prasetyo and Iin Sumiati having the lowest.
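The four steps above can be reproduced end to end with a short Python sketch over the data of Tables 1 and 2 (for illustration only; the paper's application is implemented in PHP, and the variable names here are ours):

```python
# Weighted product method, Eqs. (1)-(4), using the weights of Table 1
# and the criterion scores of Table 2.
weights = [4, 4, 3, 5]                        # C1..C4
pcw = [w / sum(weights) for w in weights]     # Eq. (1): [0.25, 0.25, 0.1875, 0.3125]
signs = [-1, 1, 1, 1]                         # C1 is a cost criterion, the rest are benefits

scores = {
    "Andini Putri": [1, 1, 5, 2],    "Sentanu Eka Prasetyo": [2, 1, 1, 2],
    "Iin Sumiati": [2, 1, 1, 2],     "Alexander HB": [2, 1, 1, 3],
    "Mar Ah": [2, 1, 1, 3],          "Jumanah": [2, 1, 1, 3],
    "Munihah": [2, 1, 3, 4],         "Siti Fatonah": [2, 1, 2, 3],
    "Yanah": [1, 2, 5, 2],           "Lilis": [2, 1, 3, 3],
    "Hilman": [1, 1, 4, 2],          "Ahmad Riyan Rifai": [2, 1, 2, 4],
    "Astriyah": [1, 1, 5, 1],        "Irmawati": [1, 1, 3, 2],
    "Aryadi Wijaya": [1, 1, 4, 3],   "Nazwa Marisa Awalia": [1, 1, 3, 3],
    "TB Aldiron": [1, 1, 4, 3],      "Muhamad Fiqi": [2, 1, 4, 4],
    "Deviani": [2, 1, 2, 2],         "Siti Irnawati": [2, 1, 4, 4],
    "Sunariah": [2, 1, 4, 4],
}

def preference(xs):
    """Eq. (2): Si = product of Xij raised to +/-Wj (negative exponent for cost criteria)."""
    s = 1.0
    for x, w, sign in zip(xs, pcw, signs):
        s *= x ** (sign * w)
    return s

s = {name: preference(xs) for name, xs in scores.items()}
total_s = sum(s.values())                              # 31.308925 in Table 2
v = {name: si / total_s for name, si in s.items()}     # Eq. (4)

best = max(v, key=v.get)
print(best)    # → Yanah, the top-ranked student, matching Table 2
```

Running this reproduces the Total Si and Vi columns of Table 2, including the top rank for Yanah and the tied lowest rank for Sentanu Eka Prasetyo and Iin Sumiati.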
3.3 System Implementation

In this implementation, the user interface menus are displayed as seen in Figs. 4, 5, 6, 7 and 8, representing the implementation of the system modeling described with the use case and class diagrams of Figs. 1 and 2. The implementation uses personal home pages (PHP) as the web server programming language and the MySQL database to store the tables modeled in Fig. 2. Figures 4, 5, 6, 7 and 8 show the user interface of the application, which helps the user understand and interact with the application developed using personal
[Bar chart comparing the criteria scores C1–C4 and the Vi value for each of the 21 students, from Andini Putri to Sunariah]
Fig. 3 Graphic of matrix of a combination of alternatives and criteria
Fig. 4 Main menu user interface
home pages (PHP) on the web together with the MySQL database. Figure 1 shows the use case diagram, which contains activities such as registration, login, entry of student data, and running the result. The entry student data use case has two sub-use case activities: entry criteria and entry weight. Entry student data is the part of the application where the alternatives, i.e., the students' data, are entered through a set of field columns.
Fig. 5 Login menu user interface
Fig. 6 Menu entry criteria and weight
Meanwhile, Fig. 4 is the main menu showing the application's content, which includes all the submenus seen in the use case diagram of Fig. 1. Figure 5 is the login menu, where the user enters a username and password; a new user first needs to register by entering user data such as email, name, address, gender, date of birth, username, and password. After registration, a confirmation is sent to the user's email address, and the user can then enter the system with the username and password, as seen in Fig. 5. Moreover, Fig. 6 shows the entry of student data as alternatives, such as the name, together with the input of the criteria contents, including the weight of each criterion. Figure 6 thus shows the user interface for the entry student data use case, including the sub-use case activities entry criteria and entry weight, as seen in Fig. 1. Furthermore, Figs. 7
Fig. 7 Form selection vector S
Fig. 8 Form selection vector V
and 8 show the process of the running the result use case seen in Fig. 1: after the user pushes the button, the application automatically applies and runs Eqs. (1), (2), and (3). Figure 7 shows the computation of the Si vector using Eq. (2), while Fig. 8 shows the computation of the Vi vector using Eq. (3). Figure 8 is the screen where the running the result use case of Fig. 1 is executed, after which it shows the result based on the student data
input as the alternatives, with the criteria as filtering conditions for finding the best result, the ranking score.
4 Conclusion

Based on the research conducted in building the decision-making system application to determine prospective scholarship recipients at SMA Negeri 8 Serang City, the conclusions are as follows:
1. Determining prospective scholarship recipients can no longer be done manually. This decision-making system can help the school target scholarship recipients accurately among all the students of Serang City 8 High School and no longer requires a long time.
2. Ranking decisions at Serang City 8 High School become more optimal by applying the decision-making system using the weighted product method, making it easier for schools and application users to determine the highest weighting value among the students on the list of potential scholarship recipients.
3. Using four criteria determined from the Ministry of Education and Culture indicators, in line with the Ministry of Education and Culture regulation of the Republic of Indonesia Number 6 of 2016, the web-based decision support system in SMA Negeri 8 Kota Serang can be realized.
A Survey on E-Commerce Sentiment Analysis Astha Patel, Ankit Chauhan, and Madhuri Vaghasia
Abstract Sentiment analysis for products and services available on various e-commerce websites and applications has been an important and crucial research task in the current era. With the ease of e-commerce and m-commerce, every seller tries to showcase their services and products by promoting them on various platforms. Opinion mining, also referred to as sentiment analysis, is a significant element of natural language processing. It is used to evaluate what individuals or audiences think about the products and services currently offered on collective media channels, social media platforms, or e-commerce sites. To detect sentiment polarity, a suitable method should be chosen. We have reviewed some of the research works that have been tested and proven as good research on sentiment analysis, since reviews of products are available on websites or applications related to products or services. However, it is difficult to determine sentiments when a large number of reviews from various sources are collected; different reviews can be found from different sources, and there are lakhs of product and service reviews on various e-commerce portals. Checking reviews manually is a difficult task; an automated review system without bias is the need of the hour. Keywords Machine learning · E-commerce · Review · Sentiment analysis · Support vector machine · Naïve Bayes · Opinion mining · Natural language processing
A. Patel (B) · A. Chauhan · M. Vaghasia Parul Institute of Engineering and Technology, Vadodara, India e-mail: [email protected] A. Chauhan e-mail: [email protected] M. Vaghasia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_6
1 Introduction

In the era of e-commerce and m-commerce, online shopping for various products and services has been booming. The recent lockdowns due to COVID in various countries and areas also boosted online shopping. Users' comments on various platforms for various products and services have become an important source of feedback, and their volume is increasing day by day. People can easily share their views on the pros and cons of products, as well as of sellers and selling platforms. With the advancement of technology and the widespread use of computers and mobile devices, end users state their opinions and thoughts about products and services on social media platforms in an unorganized (unstructured) way. Opinions expressed on social media channels such as Facebook can be classified into positive, negative, and neutral evaluations of the given review text. It is a challenging task to analyze sentiments from the large number of reviews on various sites for so many products, and machine learning can play an important role here. In this survey paper, we review some of the research work that has been carried out on sentiment analysis for e-commerce. The main challenges for sentiment analysis of e-commerce and mobile commerce are dimension mapping (mapping with other contents) and the double meaning of words (e.g., 'which'). The dimension mapping problem primarily refers to mapping opinionated text words with the correct dimensions of the content. The sentiment word disambiguation problem is the scenario in which a sentiment word can be linked with two or more content dimensions. As reviews are unstructured data, structured data is required for research purposes.
Using NLP, it is possible for computers and devices to read text or hear speech, interpret it, measure the sentiment of the content, and determine which parts are important. NLP plays a crucial role in sentiment analysis by generating the dataset and extracting informational features from textual data; the data processing stage of sentiment analysis is performed mainly with NLP. The final step in sentiment analysis is classification, where a suitable classifier is applied to classify each review into classes such as positive, negative, or neutral.
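The data processing steps named throughout this survey (tokenization, stop word removal, stemming, and negation handling) can be sketched in a few lines of Python. This is a simplified, self-contained stand-in for the NLTK-based pipelines the surveyed papers use: the stop word list, suffix rules, and negator set below are illustrative, not NLTK's actual resources. Note that negators such as "not" are deliberately kept out of the stop word list so that negation can be marked.

```python
import re

# Illustrative stop word list; negators ("not", "no", "never") are excluded on purpose.
STOP_WORDS = {"the", "a", "an", "is", "are", "this", "that", "it", "of", "and", "to"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    """Crude suffix-stripping stemmer (a toy stand-in for e.g. Porter stemming)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def mark_negation(tokens):
    """Tag the token that follows a negator, so 'not good' is not scored as 'good'."""
    out, negate = [], False
    for t in tokens:
        if t in {"not", "no", "never"}:
            negate = True
            out.append(t)
        elif negate:
            out.append("NOT_" + t)
            negate = False
        else:
            out.append(t)
    return out

review = "This product is not good and the battery is failing"
tokens = mark_negation(remove_stop_words(tokenize(review)))
print([stem(t) for t in tokens])   # → ['product', 'not', 'NOT_good', 'battery', 'fail']
```

The `NOT_` prefix is one common way of realizing the negation phrase identification described for research [1] below.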
2 Related Works

In reviewing work related to sentiment analysis, we found that researchers have used various techniques, each with its own pros and cons; we also discuss the strong features and limitations of each work. In research [1], the SVM algorithm is applied to build a model for sentiment analysis, using an Amazon dataset. Part-of-speech (POS) tagging is used, along with negation phrase identification to detect phrases like "not good" and "not bad". Tokenization, stop word removal, POS tagging, and stemming are included in the data processing. Sentiment classification of information, also known as polarity categorization (PC),
Fig. 1 General System Flow [1]
is the process of detecting and classifying a particular viewpoint in terms of polarity (positive, negative, or neutral). In their research, they used the Natural Language Toolkit (NLTK) as well as the Scikit-learn Python library for the SVM implementation (Fig. 1). Another study [2] employed the PySpark platform and a resilient distributed dataset (RDD) based sentiment analysis utilizing the Spark NLP tool to solve scalability and availability challenges in sentiment analysis (SA) on a conventional e-commerce platform. They also utilized Python's Scrapy framework for web scraping to extract relevant data from e-commerce websites (Fig. 2). In another research work [3], an Amazon-based customer review dataset is applied. The emphasis is mostly on discovering aspect phrases from each review in the dataset, determining the parts of speech (POS), and using a particular classification algorithm to assess the positivity, negativity, and neutrality of each review. They applied three levels of sentiment analysis. At the document level, the entire document instance is processed at once. When using the sentence-level technique, the research data must be separated into sentences; subjectivity categorization refers to this process of dividing a document into sentences. The primary goal of
Fig. 2 FLASK based SA system [2]
Table 1 Result analysis of aspect level [3]

Parameter   Naïve Bayes   SVM
Accuracy    90.423        83.423
F-Measure   0.952         0.841
Precision   0.947         0.852
Recall      0.959         0.83
this stage of analysis in their research is to determine the features of a product. By applying the Naïve Bayes classifier, they achieved 90% accuracy (Table 1). In research [4], a supervised learning method is applied for sentiment analysis over a large-scale dataset extracted from Amazon, tested on 48,500 products, mostly from the electronics and musical instrument categories. Figure 3 explains the workflow for large-scale sentiment analysis: the data is extracted and preprocessed, feature extraction is applied, and then the supervised learning method is applied and classification is done. In research [5], a sentiment dictionary approach is applied: emotional resources are used to build the sentiment dictionary, and text emotion analysis is done with machine learning algorithms (Fig. 4). In another research [6], product reviews are assessed manually by gathering data in an Excel file; a feature extraction matrix (FEM) is applied, and product recommendations and a list of products based on the greatest FEM value for the searched features are generated, depending on the feature the user searched for. In research [7], the support vector machine as well as the Naïve Bayes approach produced good and competent outcomes. According to the authors, the part-of-speech (POS) principle boosts sentiment potency; as a result, integrating POS with SVM is useful for sentiment analysis in e-commerce.
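The Naïve Bayes classification compared in Table 1 can be illustrated with a from-scratch multinomial Naïve Bayes sketch. This is a toy example on hand-made data, not the implementation of any of the surveyed papers; function names and the tiny training set are ours.

```python
import math
from collections import Counter

def train_nb(docs):
    """Train a multinomial Naive Bayes model with Laplace smoothing.

    docs: list of (text, label) pairs.
    """
    labels = sorted({lab for _, lab in docs})
    word_counts = {lab: Counter() for lab in labels}
    doc_counts = Counter(lab for _, lab in docs)
    vocab = set()
    for text, lab in docs:
        tokens = text.lower().split()
        word_counts[lab].update(tokens)
        vocab.update(tokens)
    model = {}
    for lab in labels:
        total = sum(word_counts[lab].values())
        model[lab] = {
            "prior": math.log(doc_counts[lab] / len(docs)),
            "loglik": {w: math.log((word_counts[lab][w] + 1) / (total + len(vocab)))
                       for w in vocab},
            "unseen": math.log(1 / (total + len(vocab))),
        }
    return model

def classify(model, text):
    """Return the label with the highest posterior log-probability."""
    scores = {}
    for lab, params in model.items():
        score = params["prior"]
        for w in text.lower().split():
            score += params["loglik"].get(w, params["unseen"])
        scores[lab] = score
    return max(scores, key=scores.get)

# Tiny hand-made training set, purely for illustration.
reviews = [
    ("good product great quality", "positive"),
    ("excellent battery love it", "positive"),
    ("great price good value", "positive"),
    ("bad quality poor battery", "negative"),
    ("terrible product waste", "negative"),
    ("poor value bad price", "negative"),
]
model = train_nb(reviews)
print(classify(model, "great battery good price"))   # → positive
```

An SVM pipeline like the one in [1] would replace this classifier with TF-IDF features and a linear SVM (e.g., scikit-learn's LinearSVC), but the review-level polarity framing is the same.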
Fig. 3 Workflow for supervised learning [4]
In another research [8], SLCABG is applied. The study presented in that paper is based on a sentiment lexicon and merges a convolutional neural network (CNN) with an attention-based bidirectional gated recurrent unit (BiGRU). This work is tested only on the Chinese language. Hybrid recommendation is one of the important modules in research [9]; to improve accuracy, the authors used a supervised learning technique (support vector machine) together with a cluster-based approach (Fig. 5). The deep learning modified neural network (DLMNN) [10] is a strategy in which the system uses a neural network to portray the results as negative, positive, and neutral ratings; this system also provided significant result improvements on its particular dataset. After reviewing the various research works, the findings are summarized in Table 2.
Fig. 4 Dictionary approach for SA [5] Fig. 5 Cluster based SA system [9]
Table 2 Summary of literature survey

Real-time sentiment analysis on E-commerce application [1] (May 2019)
  Dataset: Amazon review data from Amazon.com. Method: SVM.
  Findings: two levels of categorization (review level and sentence level); no weightage given to features.
  Limitations: tough to manage complex sentence structures and different languages; no weightage given to features; no advanced deep learning approach used.

Sentiment analysis for e-commerce products using natural language processing [2] (May 2021)
  Dataset: Amazon. Method: NLP, Flask.
  Findings: NLP enables better sense from reviews; web scraping is dependent and may fail to get data every time.
  Limitations: data fetching might result in failure; generated sentiments can be biased.

Aspect-level sentiment analysis on E-commerce data [3] (2018)
  Dataset: Amazon. Method: SVM.
  Findings: aspect-level classification applied; limited dataset tested.
  Limitations: issues with different languages.

Sentiment analysis on large scale Amazon products reviews [4] (May 2018)
  Dataset: Amazon. Method: multinomial Naïve Bayes (MNB) and support vector machine (SVM).
  Findings: big data tested with better results; feature weighting not applied.
  Limitations: data collection problem, as not enough data is provided publicly by e-commerce sites; cannot scrape enough data to treat it as real-life public reviews over different products.

Sentiment analysis of E-commerce text reviews based on sentiment dictionary [5] (Jun 2020)
  Dataset: Amazon. Method: TF-IDF.
  Findings: TF-IDF may be used for extracting exact keywords; other classifiers with TF-IDF may improve results.
  Limitations: part-of-speech tagging turns out to be biased and affects the output results.

An experimental analysis on E-commerce reviews, with sentiment classification using opinion mining on web [6] (Mar 2021)
  Dataset: e-commerce websites. Method: feature extraction matrix.
  Findings: better system for low quantities of data; manual data collection is not a proper method for large-scale products.
  Limitations: data collection of mobile products only, selecting top reviews only.

Sentiment analysis on product reviews [7] (2019)
  Dataset: online API. Method: SentiWordNet.
  Findings: POS + SVM got better results; a bigger dataset should be tested.
  Limitations: no multilanguage reviews taken into consideration; challenges in finding the sentiments of sarcastic reviews.

Sentiment analysis for e-commerce product reviews in Chinese based on sentiment lexicon and deep learning [8] (2020)
  Dataset: dangdang.com. Method: CNN.
  Findings: sentiment lexicon and deep learning applied; only the Chinese language tested.
  Limitations: only categorizes sentiments into positive and negative; not suitable for high-preference sentiment refinement.

Sentiment analysis in E-commerce using recommendation system [9] (2021)
  Dataset: Amazon. Method: SVM.
  Findings: clustering approach applied; weighted features can improve the result.
  Limitations: product-feature-wise sentiments are not carried out.

Sentiment analysis of online product reviews using DLMNN and future prediction of online product using IANFIS [10] (2020)
  Dataset: e-commerce. Method: deep learning modified neural network.
  Findings: DLMNN is better than traditional classification; hyperparameters should be tuned.
  Limitations: cannot understand the complete context of the entire piece while keyword processing to get correct sentiments.
3 Conclusion

After reviewing some of the recent research work on sentiment analysis in e-commerce, we found several important issues. One common issue is reliance on a single source of data, which can bias the system-generated sentiments. Because the entire review is analyzed, particular features of the product are not extracted in most cases; product features, e.g., laptop battery, processor, speed, etc., could be separated for sentiment analysis. Multilanguage review extraction and translation for enriching the dataset can also be applied, and region-wise sentiment analysis can be helpful for the end user. As future work, better preprocessing methods, NLP methods for enriching the dataset, and the feature selection process can be improved before classification. Fake review detection can also help the entire system perform better for the end user.
References

1. J. Jabbar et al., Real-time sentiment analysis on E-commerce application, in 2019 IEEE 16th International Conference on Networking, Sensing and Control (ICNSC) (IEEE, 2019)
2. B.K. Jha, G.G. Sivasankari, K.R. Venugopal, Sentiment analysis for E-commerce products using natural language processing. Ann. Romanian Soc. Cell Biol. 166–175 (2021)
3. S. Vanaja, M. Belwal, Aspect-level sentiment analysis on e-commerce data, in 2018 International Conference on Inventive Research in Computing Applications (ICIRCA) (IEEE, 2018)
4. T.U. Haque, N.N. Saber, F.M. Shah, Sentiment analysis on large scale Amazon product reviews, in 2018 IEEE International Conference on Innovative Research and Development (ICIRD) (IEEE, 2018)
5. Y. Zhang et al., Sentiment analysis of E-commerce text reviews based on sentiment dictionary, in 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA) (IEEE, 2020)
6. S.S. Latha, An experimental analysis on E-commerce reviews, with sentiment classification using opinion mining on web. Int. J. Eng. Appl. Sci. Technol. 5(11), 2143–2455 (2021)
7. X. Fang, J. Zhan, Sentiment analysis using product review data. J. Big Data 2(1), 1–14 (2015)
8. L. Yang et al., Sentiment analysis for E-commerce product reviews in Chinese based on sentiment lexicon and deep learning. IEEE Access 8, 23522–23530 (2020)
9. R. Krithiga, M. Anbumalar, D. Swathi, Sentimental analysis in E-commerce using recommendation system. Int. Sci. J. Contemp. Res. Eng. Sci. Manag. 6(1), 53–58 (2021)
10. P. Sasikala, L. Mary Immaculate Sheela, Sentiment analysis of online product reviews using DLMNN and future prediction of online product using IANFIS. J. Big Data 7, 1–20 (2020)
11. Kaggle, Consumer Reviews of Amazon Products
Secured Cloud Computing for Medical Database Monitoring Using Machine Learning Techniques M. Balamurugan, M. Kumaresan, V. Haripriya, S. Annamalai, and J. Bhuvana
Abstract A growing number of people are calling on the health-care industry to adopt new technologies that are becoming accessible on the market in order to improve the overall quality of their services. Telecommunications systems are integrated with computers, connectivity, mobility, data storage, and information analytics to make a complete information infrastructure system. It is the order of the day to use technology that is based on the Internet of Things (IoT). Given the limited availability of human resources and infrastructure, it is becoming more vital to monitor chronic patients on an ongoing basis as their diseases deteriorate and become more severe. A cloud-based architecture that is capable of dealing with all of the issues stated above may be able to provide effective solutions for the health-care industry. With the purpose of building software that would mix cloud computing and mobile technologies for health-care monitoring systems, we have assigned ourselves the task of designing software. Using a method devised by Higuchi, it is possible to extract stable fractal values from electrocardiogram (ECG) data, something that has never been attempted previously by any other researcher working on the development of a computer-aided diagnosis system for arrhythmia. As a result of the results, it is feasible to infer that the support vector machine has attained the best classification accuracy attainable for fractal features. When compared to the other two classifiers, M. Balamurugan Department of Computer Science and Engineering, School of Engineering and Technology, Christ (Deemed to Be University), Bangalore, India e-mail: [email protected] M. Kumaresan School of Computer Science and Engineering, Jain (Deemed to Be) University, Bangalore, India e-mail: [email protected] V. Haripriya · J. Bhuvana (B) School of Computer Science and IT, Jain (Deemed to Be) University, Bangalore, India e-mail: [email protected] V. Haripriya e-mail: [email protected] S. 
Annamalai School of Computing Science and Engineering, Galgotias University, Greater Noida, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_7
the feed-forward neural network model and the feedback neural network model, the support vector machine outperforms them both. Furthermore, it should be noted that the sensitivities of the feed-forward neural network and the support vector machine yield results of comparable quality (92.08% and 90.36%, respectively). Keywords Neural network · Cloud database · Electrocardiogram · Fractal features
1 Introduction

One of the world's most remarkable networked systems, the Internet enables devices to communicate with one another across the globe by utilizing an established set of standard protocols, connecting a diverse range of networks belonging to academic institutions and commercial and government organizations, among others. Early on, the Internet consisted largely of static websites and electronic mail communication. Today, many different modes of Internet use are visible everywhere we look, integrated into many aspects of our lives, providing an endless array of services and applications that attempt to meet the needs of every user, wherever they are and at whatever time of day. Consumers now have more access to and mobility with Internet technologies than ever before, as shown by the fact that virtually all of their devices incorporate Internet technologies in some way. The availability of smart devices, which allow us to maintain a constant connection with other areas of the world, is considered a vital feature of everyday life in modern society. As a consequence, the number of connected devices continues to increase at a rapid pace year after year, and it will therefore be necessary to establish an autonomous device communication system. The Internet of Things (IoT) is one of the most promising options available right now. The IoT is an information network of real-world objects that allows information about them to be retrieved and allows the objects to connect directly with one another. The IoT cloud is a platform meant to store and analyze data from the Internet of Things.
Devices, sensors, websites, applications, users, and business partners all generate massive volumes of data, more than conventional systems can keep up with. An IoT cloud platform is meant to ingest this information and initiate operations so that it can respond in real time. IoT cloud systems are dynamic, on-demand service frameworks. The global market now offers many IoT cloud platforms tailored to the needs of a wide range of user and application groups: businesses, governments, farmers, health-care providers, communication and transportation companies, manufacturers, and others.
Secured Cloud Computing for Medical Database …
93
Different cloud computing methods are already used in biomedical and signal processing, and this study anticipates a future health-care sector built on current cloud architecture and the Internet of Things (IoT). The objective is to design and manage biomedical signal processing systems in conjunction with IoT and cloud systems, and to form a network with minimal latency, so that waiting time is reduced and information can be accessed quickly. Many researchers have experimented with a variety of methodologies without fully using the resources available to them. Our goal is a system model that is efficient and effective for data collection, analysis, transmission, and utilization. Specifically:
• In a cloud-based system, the data allocation process should be more efficient, and the present network should be utilized more effectively as a consequence. To keep time complexity to a minimum, a lightweight, REST-style method should be followed for data transmission and for contact with the end system (the end user).
• To minimize both the overall transmission delay and the quantity of redundant information, the available network bandwidth must be used completely, with no idle time.
• Software that integrates cloud computing with mobile technologies must be developed in order to implement a health-care monitoring system.
The second goal is to develop a rudimentary prototype model for ECG signal detection.
The third goal is to create a feature extraction neural network model and a support vector machine model to classify arrhythmia, and the fourth is to evaluate the performance of the suggested method when implemented on a cloud-based computing platform. In ECG signal analysis, the fundamental goal of the feature extraction and classification approach is to distinguish five forms of arrhythmia, following the usual five-class grouping: Class N, Class S, Class V, Class F, and Class Q.
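The five classes N, S, V, F, and Q correspond to the AAMI EC57 grouping commonly applied to MIT-BIH beat annotations. As an illustration (the symbol grouping below is the conventional AAMI one, assumed rather than taken from this chapter), the mapping can be sketched as:

```python
# Map MIT-BIH beat annotation symbols to the five AAMI EC57 classes
# (N: normal, S: supraventricular ectopic, V: ventricular ectopic,
#  F: fusion, Q: unknown/paced). Grouping follows the usual AAMI convention.
AAMI_CLASSES = {
    "N": ["N", "L", "R", "e", "j"],   # normal and bundle-branch-block beats
    "S": ["A", "a", "J", "S"],        # atrial/nodal premature beats
    "V": ["V", "E"],                  # ventricular ectopic beats
    "F": ["F"],                       # fusion of ventricular and normal
    "Q": ["/", "f", "Q"],             # paced, fusion-of-paced, unclassifiable
}

def aami_class(symbol: str) -> str:
    """Return the AAMI class for an MIT-BIH beat annotation symbol."""
    for cls, symbols in AAMI_CLASSES.items():
        if symbol in symbols:
            return cls
    return "Q"  # treat anything unrecognized as unknown
```

A classifier trained on these labels then reports its results per class, as done in the evaluation later in this chapter.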
2 Literature Survey

A cloud computing architecture is the design of a computer system that makes use of cloud computing technologies. Such a system consists of two separate components, a front end and a back end, which work together and are connected through a network, most typically the Internet. The front end is the client side and is accessed by the user through a web browser; the back end comprises the cloud computing services themselves, including multiple computers, servers, and data storage facilities. The front end consists of the client's computer and the
94
M. Balamurugan et al.
application used to access the cloud. Managing traffic and responding to client requests fall within the purview of the central server, which complies with a set of standards known as protocols and uses specialized software known as middleware; middleware allows computers linked to a network to communicate with one another and exchange information. Lin et al. examined in detail how systems biology perceives the living organism as an organic network. Systems biology is distinct from reductionist approaches in that it concentrates on the interactions between molecules located at multiple omics levels. A variation of this concept has been applied to research on network biomarkers and network medicines, which combine clinical data with knowledge of network science and thereby support the study of disease in the age of biomedical informatics; the former is concerned with identifying precise signals for disease detection and diagnosis in humans. Das et al. [11] describe electronic health as a rising star representing the collaboration of medical research and information technology, a ray of hope for a healthier and more prosperous future, and a straightforward option to rely on when medical assistance is needed. The question arises, however, whether e-health is a truly dependable replacement for traditional health care. Their paper evaluates the ethics with which e-health is practised, whether its code of principles is being carefully obeyed, and, if not, whether e-health as implemented is ineffective.
It is vital to encourage the motivated thinkers who, by cooperating with health-care providers and information-technology providers, have taken the initiative to deliver better and faster solutions to all health-related concerns. Chatman pointed out the most significant cloud computing offerings that are transforming the landscape of health-care information technology in the United States, addressing the relevance of cloud computing and the role it plays in that industry. Health-care institutions, public-sector organizations, and a range of other well-being facilities are adopting these offerings for a multitude of reasons, one of which is the speed with which they can be delivered. In 2014, Knut Haufe and colleagues observed that cloud computing is quickly becoming one of the most popular areas in information systems research. Health-care companies in particular need to assess and handle the specific risks connected with cloud computing in their information security management systems, taking into consideration the nature of the information being processed. Their article presents an overview of the most important security procedures in the context of cloud adoption in the health-care industry. The most important information security processes for health-care organizations using cloud computing are identified and categorized according to the general information security management processes derived from the ISO 27000 family of standards, taking into
consideration the primary risks associated with cloud computing and the nature of the data being handled. The processes so defined will help a health-care organization using cloud computing to focus on the most vital tasks and to create and operate them at a suitable level of maturity given the limited resources available. Kwon and associates, in a paper published by the IEEE, describe prognostics and health management (PHM) as a supporting discipline that employs sensors to assess the health of systems, detect anomalous behaviour, and forecast the remaining useful life of the asset. Enabled by the Internet of Things (IoT), PHM can now be applied to any kind of resource across any industry sector, a paradigm shift that is opening up significant new economic prospects for enterprises of all sizes. Their paper summarizes PHM and the opportunities offered by the IoT, within a framework being built for engineering change and consumer goods. With the rising use of IoT-based PHM, the number of related challenges has also surged, covering topics such as appropriate analytical methodologies, security, IoT platforms, sensor energy harvesting, IoT business models, and verification procedures. Ghanavati and associates suggest, in a Springer publication available online, that the IoT, making use of connected sensors such as wireless body area networks (WBAN), may provide opportunities for real-time monitoring of patient health status and for the management and treatment of patients.
Consequently, the IoT will play a crucial role in the development of the next-generation health-care organization. Remote monitoring of patient health conditions through the IoT has gained popularity in recent years, but monitoring patients outside hospital settings requires enhancing the IoT's capabilities with additional resources for health data storage and processing, which are currently lacking. Their research presents a continuous patient health status monitoring framework that connects WBANs through smartphones to cloud computing, together with an IoT-based service-oriented background for the framework. Experimentation showed that the proposed architecture substantially outperforms baseline WBANs, as measured by sensor lifetime, operating cost, and energy usage. According to the findings, the SQA methodology delivers promising results in recognizing ECG signals of unsatisfactory quality and surpasses existing techniques based on morphological features and machine learning in detecting unacceptable quality. Finally, they accomplished the transmission of only ECG signals of acceptable quality, which has the potential to dramatically extend the battery life of IoT-enabled devices. The quality-aware IoT model is therefore well placed to judge the clinical acceptability of electrocardiogram data, increasing the accuracy and reliability of automated diagnostic systems and other applications.
3 Materials and Methodology

This research proposes that three possible enhancements for creating intelligent architectural designs be explored further. These developments strengthen the sensor framework and allow the client to analyze the information acquired by the sensors. A basic current process is regulated at control stations, with variables such as level, weight, and temperature being monitored. For the measuring process to succeed, the physical property being measured must be translated into an electrical quantity such as current or voltage; this is necessary for transferring such measurements to control stations that monitor sites in remote areas. Various industries gain from the monitoring, analysis, and control carried out, as well as from the technology employed to carry out these operations. The fundamental objective of this approach is to take ECG signals as input and identify the R peaks in the data. The service-oriented sensor network architecture (SOSA) enables heterogeneous sensor systems to communicate with the server in the network and provides services for sensor developers. The sensor services are published into an ebXML registry, which uses a service description language to facilitate their use. Client sensor applications can discover and invoke the essential operations of such services. This network configuration is founded on a small component known as an integration controller, which links the two technologies, SOSA and the cloud. A detailed description of this approach, adapted from the client's PC to the remote computing setting, follows.
In this architecture, each integration controller is responsible for relaying the information received from the different sensor systems to the cloud infrastructure outlined above. Figure 1 depicts the layered engineering of the system. To analyze this sensed data and convert it into XML, extension software is built and stored on a web server accessible from anywhere over the Internet. This engineering also allows sensor clients to take an active role in the process and to search large amounts of sensor data across many systems at the same time. To ensure that sensor data is captured throughout the sensor's lifespan, the data is kept in the system's back-end storage. Sensor data is expected to continue moving from a loosely regulated system to a highly managed cloud, completing the data management system created for it. A cloud-based sensor system is created by using vast quantities of information and queries to produce the sensors needed for the cloud, together with improvements in information technology that give a superior presentation for such cloud-based sensors. Sensor information is stored in the sensor cloud and then made available to different sensor customer applications
Fig. 1 Architecture of Big data system (UDP data is fetched and refreshed every 60 s, classified into counters, timers, sets, and gauges, and passed to N back-end processing stages that emit statistical data)
in order to better address the needs of the customers. These applications must sustain massive-data conditions while delivering the important information at a fast rate of sensor data transfer. Because the sensors proposed in this framework provide information continuously, vast quantities of stored information accumulate for statistical reporting. Businesses can identify and forecast from such large volumes of sensor information by conducting a predictive assessment of it from a future-oriented perspective. This massive sensor information in the Sensor Cloud possesses many important capabilities, including capture into the cloud, management and control of the information in accordance with industry requirements (in this case, paper mills), and the combination of performance information that delivers the right sensor reading at the right location and time. For building the storage structure, this relational model is based on a relational database, which uses tables of information as its source for data collection. In this scenario it is vital to distinguish each of the tables and to describe them formally in line with the relational model. The primary key is used to distinguish every row of a table, and rows are linked to another table through a foreign
key," which can be a column or a group of columns that references the primary key of the other table. Using the T-SQL language, this method processes each piece and stores it after checking its syntax against the T-SQL received in the next step. What is the idea of a web-based health-care service? To distribute and discover Web Services on the Web, reusable components are needed that can be disseminated and found on the Internet. This is accomplished through open standards (such as XML and SOAP) for the web communication protocol, alongside information that provides services to a variety of applications. A web service's capabilities are restricted to replying to any calls sent by the client to its server. The key enabler is that applications on different platforms get an opportunity to interact with one another. When this application is used, it examines the information sources and then registers the associated web service with the application. The user interface for this project is designed in the C# programming language, and the application is assembled using the .NET framework. As for language forms and structures, XML is a markup syntax that can be extended by the client and is independent of any particular computer program; it is derived from SGML (Standard Generalized Markup Language).
If XML is concerned with effective ways of communicating content in an organized manner, it can complement, and in some uses replace, HTML, which is concerned with the appearance of documents in browsers. SOAP, the Simple Object Access Protocol, is a standard that works alongside the HTTP, FTP, SMTP, and TCP protocols. Because LINQ provides an approach for interfacing the dialects of different programs that support structured programming, T-SQL can be made almost as readable as C#, in which an imperative language generates the related information.

Pre-processing: Pre-processing of the raw ECG data is necessary to remove unwanted components such as muscle noise and 60 Hz interference; baseline wander and T-wave interference are also removed at this stage. Standardization and filtering are carried out during the pre-processing step: the signal's amplitude is normalized, and the signal is then passed through a band-pass filter for noise rejection. The best pass band for maximizing the QRS energy has a frequency range of around 5–15 Hz.
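The pre-processing stage just described (amplitude normalization followed by a 5–15 Hz band-pass) can be sketched as follows; the 360 Hz sampling rate is an assumption matching the MIT-BIH recordings, and SciPy's Butterworth design stands in for whatever filter the authors used:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess_ecg(signal, fs=360.0, low=5.0, high=15.0):
    """Normalize amplitude and band-pass filter an ECG segment.

    fs=360 Hz is assumed (MIT-BIH sampling rate); the 5-15 Hz pass band
    follows the text's recommendation for maximizing QRS energy.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                       # remove DC offset
    peak = np.abs(x).max()
    x = x / (peak if peak > 0 else 1.0)    # normalize amplitude to [-1, 1]
    b, a = butter(2, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, x)               # zero-phase band-pass filtering
```

Zero-phase filtering (`filtfilt`) is chosen here so that QRS peak locations are not shifted by the filter's group delay.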
4 Implementation

Rather than a fixed predictor, the proposed JQDC method uses a linear predictor that adjusts itself in response to the statistics of the incoming signal. The predictor is realized with a tapped-delay-line structure, and the LMS algorithm is used to update the predictor weights. Order of the linear predictor: Because the order of the linear predictor affects the performance of the proposed JQDC, the correlation between predictor order and performance must be investigated. This was done using ECG signals collected from the MIT-BIH database. As the predictor order is increased, the compression ratio (CR) improves. The QRS detection performance based on SE and +P, however, shows a different pattern of behaviour: performance increases with order, but by the fourth order an instantaneous error appears in the signal component, reducing detection accuracy to an unsatisfactory level. With too few orders, prediction accuracy is weak, which may result in low-frequency baseline fluctuations and
asymmetric P/T wave components, and the erroneous output affects the accuracy of the QRS detection algorithms. Startup and step-size parameters: Adaptive systems require a considerable number of cycles before they reach their optimum point. The optimum is dictated by the characteristics of the incoming signal, and to speed up this process the adaptation must be initiated early. The SSLMS predictor is used in combination with previously derived data to provide a more accurate prediction, as seen in Fig. 2. The detected data will be captured and analyzed with the help of the WSNEDU2110CB Wireless Sensor Network Educational Kit, which operates at 2.4 GHz
Fig. 2 Cloud with medical data
and contains the data acquisition boards with sensors as well as the PC interface boards (MIB520). According to Qusay H. Mahmoud (2004), the JAX-RPC 1.1 API on the J2EE 1.4 platform provides complete support for services available via that platform. Configuration files, written so that the XML namespace is specified, are compiled to generate the WSDL, which contains inputs such as the server address for the client reference, and a mapping file that contains a port number and the endpoint location of the service for the server reference. The deployment tool generates war files from the services that have been built, and these are then deployed to the server, as shown in Fig. 2. Sensor system registry: The sensor system registry makes the aggregation of services inside applications simpler by allowing information to be passed between the applications and the sensor system. The ebXML registry is used to make the services accessible for distribution. For the sensor services to be registered in the repository, WSDL files with service bindings were required. A breakdown of the available services is shown in Fig. 3. Sensor as a cloud-based service: The integration controller uploads the sensed data to the cloud server, where it is stored for later retrieval; the sensed data is disseminated in the cloud as XML. The sensor data is retrieved from the sensor cloud via the Hive query command tool supplied with the Hadoop installation.
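Returning to the adaptive predictor that opens this section: a minimal sketch of the tapped-delay-line LMS linear predictor follows. The order and step size here are illustrative assumptions, not the authors' tuned values:

```python
import numpy as np

def lms_predict(x, order=4, mu=0.01):
    """Adaptive linear prediction of a signal with the LMS update.

    Each sample is predicted from the previous `order` samples held in a
    tapped delay line; the weight vector is nudged along the error gradient.
    Returns the per-sample prediction error, which a JQDC-style coder
    would quantize and transmit instead of the raw signal.
    """
    x = np.asarray(x, dtype=float)
    w = np.zeros(order)
    err = np.zeros_like(x)
    for n in range(order, len(x)):
        taps = x[n - order:n][::-1]   # most recent sample first
        e = x[n] - w @ taps           # prediction error
        w += mu * e * taps            # LMS weight update
        err[n] = e
    return err
```

On a slowly varying, oversampled signal the error shrinks as the weights converge, which is what makes the residual cheaper to code than the raw samples; the order-versus-accuracy trade-off discussed above appears when the order is pushed higher.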
Fig. 3 Input register (data center network diagram: routers, switches, a firewall, and database, file, and communication servers connecting clients, finance, record, and customs stations to the data center over the Internet)
Fig. 4 File collected in cloud
The sensor data is stored in the sensor cloud until it is needed. A query command is executed on Hive to collect and analyze sensor data from the monitored device; the sensor file is then analyzed using the MapReduce algorithm. Connections between a user control centre and a Web Service are built in two separate ways: one path for data and another for the panorama map. End-user control panel (Fig. 4): To store the vast quantity of data produced by the front-end sensors, a virtual machine is constructed that stores, among other things, humidity, temperature, and pH values. The Web Service is later used to get the contents of the virtual machine. The display window for this operation is similar to the Web Service's display window: it extracts the quality and amount of data from the database, after which the cloud calculates the results and shows them in the Web Service's display window. First, a list is made of all the elements to include in the panorama map. Through this software, the start instruction is communicated to the cloud side, which is thereby notified of the user's location; the cloud computes the image result and transmits it back to the user control centre. Before the map can be used, seven parameters must be specified: the width and height of the user control centre display window, the X and Y coordinates of the current map location, the X and Y coordinates of the displacement, and the newly shown map hierarchy. The map may be accessed and manipulated via the "Operate" button on the user control centre's main display window.
Any changes the user makes to this map, such as zooming in or out, cause the control centre to display a new map that restores the zoom level from when the user initially logged in. Images
of farmland and the corresponding sensor data may be shown side by side when the map's display is enlarged to its maximum size. When the sensor is activated, the user gets access to the information and data gathered by the device. UI is an abbreviation for User Interface. The design of the data curve falls into three categories: the original design, the static design, and the dynamic design, of which the original design is the most basic and the most common. As shown in Fig. 5, the QRS complex, which involves multiple waves in succession, results in the formation of a wave combination. Figure 6 compares the input ECG signal along the x-axis with the measured amplitude of the signal along the y-axis. The decomposed
Fig. 5 ECG trace
Fig. 6 Input signal
signal is referred to as the decomposition signal because the data used in the simulation is compressed. As Fig. 7 shows, reconstruction of the data is carried out numerous times, with the prediction error decreasing at each level of reconstruction; the x-axis shows the passage of time and the y-axis the amplitude of the signal. QRS peaks are recognized at the end of each level of prediction, and the total number of peaks is then calculated. Figure 8 depicts the calculation of the overall peak value from the ECG signal, with the peak value highlighted to illustrate the initial step of QRS detection. Figure 9 shows the high and low peak values of the electrocardiogram signals, together with the associated values used to distinguish between distinct peak values. Figure 10 illustrates the final detected signal, known as the QRS, whose peak values P, Q, R, S, and T are separated by different colours to aid understanding. This pattern illustrates that the fractal feature of the ECG arrhythmia is the most suitable characteristic for the classification technique. The feedback neural network yields the lowest classification accuracy for this problem; even so, it achieves an accuracy of around 90% in discriminating between the five forms of arrhythmia. The observed differences in classification efficiency between the fractal features are mostly due to the continual change in the morphological elements of the arrhythmia, as previously established.
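The fractal feature referred to above can be computed directly from a beat segment. This sketch follows Higuchi's standard formulation of the fractal dimension; `kmax = 8` is an illustrative tuning choice, not the chapter's setting:

```python
import numpy as np

def higuchi_fd(x, kmax=8):
    """Higuchi fractal dimension of a 1-D signal (Higuchi, 1988).

    Builds kmax coarse-grained curve lengths L(k) and fits the slope of
    log L(k) against log(1/k); that slope estimates the fractal dimension.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    lk = []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)               # subsampled curve x[m::k]
            if len(idx) < 2:
                continue
            dist = np.abs(np.diff(x[idx])).sum()   # length of subsampled curve
            norm = (n - 1) / ((len(idx) - 1) * k)  # length normalization factor
            lengths.append(dist * norm / k)
        lk.append(np.mean(lengths))
    k_vals = np.arange(1, kmax + 1)
    slope, _ = np.polyfit(np.log(1.0 / k_vals), np.log(lk), 1)
    return slope
```

A smooth curve gives a value near 1 and white noise a value near 2, so the measure captures the waveform irregularity that distinguishes arrhythmia morphologies.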
Table 1 presents an overall comparison of the proposed work with several existing algorithms: Naive Bayes, Decision Tree, Bayes Net, and Fuzzy Petri Net. Accuracy, sensitivity, and specificity are the measures compared. The overall result shows that the proposed technique outperforms the other existing algorithms in all areas, as seen in Table 1 (Fig. 11). Figure 11, which compares the accuracy of the current and proposed techniques, indicates unequivocally that the proposed Feed Forward NN and SVM approaches achieve higher accuracy than the other existing approaches.
5 Conclusion A fundamental analysis of QRS signal detection is presented in this chapter, together with a discussion of specific challenges associated with wireless sensor networks and their integration into cloud infrastructure. When compared
Secured Cloud Computing for Medical Database …
Fig. 7 Reconstruction data
Fig. 8 Selected overall peak
to intra-body communication, wireless communication provides more flexibility while also lowering the total cost of ownership. In addition, as a consequence of technical improvements, the use of BAN is becoming more widespread. The MIT-BIH arrhythmia dataset, comprising 48 half-hour two-channel ambulatory ECG recordings from 47 patients, was used in this study, and the analysis was performed on these recordings. The Higuchi fractal approach was used to extract the fractal-dimension characteristic from the ECG arrhythmia records. Three different classifier models were employed to assess the performance of the computer-aided diagnostic system, and their results were compared: SVM outperforms the other classification algorithms in terms of classification accuracy, sensitivity, and specificity.
Fig. 9 Generated classes
Fig. 10 Train phase
Table 1 Performance metrics

Classifier                     Accuracy   Sensitivity   Specificity
Naive Bayes                    83.8       85.0          83.4
Decision tree                  93.2       87.5          89.5
Bayes Net                      90.8       85.0          82.4
Fuzzy Petri net                85.4       92.5          83.4
Associative Petri Net (APN)    93.5       85.0          85.9
Feed forward NN                94.34      92.08         90.24
Feedback NN                    92.23      90.0          88.65
SVM                            95.65      92.36         91.35
Fig. 11 Comparison graph for accuracy
Real-Time Big Data Analytics for Improving Sales in the Retail Industry via the Use of Internet of Things Beacons V. Arulkumar, S. Sridhar, G. Kalpana, and K. S. Guruprakash
Abstract Findings from applying Apache Spark in healthcare institutions show that it is well suited to large-scale data processing; several educational and research-based healthcare organizations are either experimenting with big data or already using it in front-line research. The medical industry generates an almost unlimited amount of data: the Electronic Medical Record (EMR) alone gathers a vast quantity of information. The goal of applying new trends and technologies such as the IoT and big data to real-time, beacon-based sensor data is to help shopping-mall-based retail shops, or any other physical retail store, compete with online shopping in terms of customized sales promotion, customer relations, and predictive, diagnostic, and preventive analysis of customer-sales data. We tested a number of beacon-based sensor systems for identifying nearby mobile phones. In the proposed approach, Apache Spark Streaming was used for the various analyses, with sample Amazon sales data as input.

Keywords Ant colony optimization · Mobile sink · Wireless sensor networks · Efficiency · Internet of Things
V. Arulkumar (B) School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamilnadu, India e-mail: [email protected] S. Sridhar Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India G. Kalpana Rajalakshmi Institute of Technology, Chennai, India K. S. Guruprakash Department of Computer Science and Engineering, K. Ramakrishnan College of Engineering, Trichy, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_8
V. Arulkumar et al.
1 Introduction The phrase “Internet of Things,” abbreviated IoT, is composed of two words: “Internet” and “Things.” The Internet is a global arrangement of linked computer networks that use the standard Internet protocol suite (TCP/IP) to serve billions of users worldwide. It is a network of networks consisting of millions of private, public, academic, commercial, and government networks, ranging in scale from local to global, all linked together by an extensive array of electronic, wireless, and optical networking technologies. Through the Internet, more than 100 countries are now linked together in the exchange of information, news, and opinions. As for the “Things,” these may be any object or person that can be identified in the real world. Everyday items include not only the electronic gadgets that we use daily, but also “things” we do not ordinarily consider electronic at all: food, clothing, and furniture; materials and parts; stock and specialized items; landmarks, monuments, works of art, and so on, across the variety of commerce, culture, and modernity. These systems generate enormous amounts of data, which is then sent to computers for analysis. When items are capable of both sensing and transmitting information, they become tools for comprehending and responding to the complexities of their environment in a timely manner [1]. In the digital age, enormous amounts of information have become readily available to decision makers.
Massive information refers to datasets that are not only large in size, but also high in variety and velocity, which makes them difficult to manage using traditional tools and processes. Because such information is created so rapidly, procedures must be developed for handling it and for extracting value and knowledge from these datasets. Furthermore, decision makers must be able to derive important insights from such varied and rapidly changing information, which may range from day-to-day transactions to customer relationships and social network data [2]. This value can be delivered through big data analytics, the application of advanced analysis techniques to massive amounts of information. This article sets out to examine several of the many analysis methods and tools that may be applied to big data, as well as the opportunities provided by big data analytics in a variety of distinct decision sectors [3]. After big data storage comes the systematic processing and preparation of the information. As previously noted, there are four fundamental requirements for big data processing. The first is the ability to load information quickly. Because disk and network traffic interfere with query execution during data loading, it is critical to reduce the time spent loading data. The second
requirement is fast query processing. Many queries are response-time critical, so the data placement structure must be capable of maintaining high query-processing speeds even as the number of queries rises sharply under heavy workloads and continuous requests. The third requirement for large-scale information management is highly efficient use of storage space: the rapid growth in user activities demands scalable storage capacity and computing power, so how the information is stored so that space utilization is maximized must be addressed during planning. Finally, the fourth requirement is robust handling of highly dynamic workloads. Because enormous datasets are analyzed by different applications and users, for different purposes, and in different ways over time, the underlying framework must be highly adaptable to sudden changes in data processing and not be restricted to a specific workload pattern or set of applications and users. Retailers, meanwhile, must alter their approach to the market to succeed in a customer-driven marketplace. Products of superior quality at a reasonable price, together with welcoming after-sale service, are the most important areas on which a company must concentrate. Additional services must be provided to buyers to retain them and build a reputation for dependability, which would ensure consistent sales in the years ahead. The very essence of retail has changed over time: today, retailing means entering strip malls, going online, and being mobile in approach [4].
In all of these channels, small merchants miss out on a significant opportunity somewhere. The next-door shop remains the most important for reasons of familiarity in all seasons, and such shops must make use of modern technologies. Online shops and brick-and-mortar stores must both exist, but neither can do so at the expense of the other. The study's contribution is to identify nearby mobile phones using sensor devices that are already available, to obtain information on the identified phones, to analyze the history data of the identified mobile-phone users, to deliver personalized sales promotions based on the analyzed data in real time, and to examine the sales history of each category to support prediction [5].
2 Literature Survey Much research on the deployment of mobile sinks for data collection has been conducted and published in the past. Organizing nodes into optimal clusters is an NP-hard problem [6] that is very difficult to solve effectively. Optimizing the cluster size is worth considering, since it is an effective
method of lowering the overall size of the cluster. To maintain control over the size of a cluster, it is essential to keep the distances between member nodes and CHs as short as possible [7], and thereby the amount of energy used by the whole network. To find the optimum cluster size, PSO minimizes the distance between member nodes and cluster heads (CHs); the proposed method using PSO is described in great depth. Based on the findings of Bifet [8], GA has been used in wireless sensor networks to build the most energy-efficient clusters possible. The authors of [9] recommend a self-clustering method for heterogeneous WSNs combined with a GA to prolong the lifetime of the network. Several methods of modifying cluster size are used to accomplish a range of objectives. They argue for a routing tree that maximizes network lifetime while keeping the routing path between each device and the sink to an absolute minimum [10]. In [11], an energy-consumption cost model is used in conjunction with a routing tree to evaluate the queries that are issued. The study in [12] aims to determine the most effective route for each mobile sink, as well as the sojourn length of each mobile sink along its trajectory, to maximize the network's longevity. In [13], the use of a single mobile sink to prolong the life of a network was studied in more depth. In [14], researchers proposed a technique for extending the life of mobile base stations while simultaneously managing network demand. Chen et al. [15] and Cohen et al. [16] propose, for example, that the mobile base station be operated according to a greedy algorithm.
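The distance-minimization objective that PSO optimizes in these clustering schemes can be sketched with a minimal particle swarm optimizer over candidate cluster-head coordinates. This is a generic illustration, not the implementation from [7]; the node layout, particle count, and PSO coefficients are all assumed values.

```python
import math
import random

def cluster_cost(ch_flat, nodes):
    """Mean distance from each node to its nearest cluster head.

    ch_flat packs k cluster-head positions as [x1, y1, x2, y2, ...]."""
    chs = [(ch_flat[i], ch_flat[i + 1]) for i in range(0, len(ch_flat), 2)]
    return sum(min(math.dist(n, c) for c in chs) for n in nodes) / len(nodes)

def pso_cluster_heads(nodes, k=2, particles=20, iters=60,
                      w=0.7, c1=1.5, c2=1.5, seed=1):
    """Standard PSO: each particle is a candidate set of k CH positions."""
    rng = random.Random(seed)
    dim = 2 * k
    pos = [[rng.uniform(0.0, 100.0) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cluster_cost(p, nodes) for p in pos]
    g = min(range(particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cluster_cost(pos[i], nodes)
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost
```

On a field with two well-separated groups of sensor nodes, the swarm converges toward one cluster head per group, which is exactly the member-node-to-CH distance criterion described above.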
The use of joint mobility and routing in the case of a stationary base station with a limited range of capabilities is proposed in [17–19]. Finally, [20–22] control the route via the mobile-sink routing protocol, which is implemented in several different ways. Mobile sinks, however, are useful only in a restricted number of real-world situations. Because a WSN is often located in highly populated regions, mobile sinks can serve only a small proportion of the total population. Owing to several variables, including road conditions and the mobile equipment itself, these autonomous vehicles can travel only at a certain pace. In addition, since mobile tools have a limited quantity of available energy, the maximum distance they can cover on a trip is also restricted. To maximize network lifespan while still maintaining speed, these limitations must be considered when developing a routing protocol. The monitored network of n randomly distributed, homogeneous device nodes is represented as the undirected graph G = (V ∪ MS, E), where V is the set of n sensor nodes in the monitoring zone and MS is the set of mobile sinks. When there are k mobile sinks, there are k connections between sensors and sinks. Two sensors u and v are considered connected when their broadcast ranges coincide; otherwise the system is said to be unconnected. Mobile sinks are unique in that they can both receive sensed information from device nodes and transmit the sensory
Fig. 1 Overall flow diagram of the proposed work
data they have gathered to an offsite monitoring center. Each sensor node has a unique identity and a limited initial energy capacity Q, while each mobile sink can replenish its energy from an effectively infinite supply stored at the depot. For simplicity, assume there are k mobile sinks in the system at any one moment. The road map in the monitoring area is represented by R = (Vr, Wr), where each vertex in Vr represents a road junction and each edge in Wr represents a road segment (see Fig. 1). Map R must accommodate at most k mobile sinks, each with an energy capacity of eM. The energy of each mobile sink is consumed by travel and by the transmission and receipt of the gathered information: let et denote the sink energy consumed per unit length of travel and ec the energy consumed for communication at the sojourn sites along the route. For the sensor nodes, the wireless communication between them accounts for a large proportion of the energy consumed; the rest, comprising sensing and processing, is insignificant.
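The network and sink-energy model just described can be made concrete with a small sketch. The symbols et, ec, and eM follow the text; the coordinates, transmission range, and rate values below are illustrative assumptions, not figures from the chapter.

```python
import math

def build_links(sensors, tx_range):
    """Return the edge set E: two sensors are connected when they lie
    within each other's transmission range."""
    n = len(sensors)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if math.dist(sensors[i], sensors[j]) <= tx_range]

def sink_energy(travel_len, collect_time, et=2.0, ec=1.0):
    """Energy a mobile sink spends travelling (et per unit length) plus
    collecting data at its sojourn sites (ec per unit time)."""
    return et * travel_len + ec * collect_time

def tour_feasible(travel_len, collect_time, eM=500.0, **kw):
    """A sink tour is feasible when it fits within the energy capacity eM."""
    return sink_energy(travel_len, collect_time, **kw) <= eM
```

For example, sensors at (0, 0), (5, 0), and (20, 0) with a 10-unit transmission range yield a single link between the first two nodes; a tour of length 100 with 50 time units of collection costs 250 energy units and is feasible under eM = 500.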
3 Materials and Methodology IoT Beacons: Beacons are tiny wireless sensor devices that constantly broadcast a basic radio signal. Most of the time, the signal is picked up by nearby smartphones equipped with Bluetooth Low Energy (BLE) or WiFi. Apple Inc. trademarked the term “iBeacon” to identify its product: a technology that Apple has
incorporated into its location framework in iOS 7 and later operating systems. It enables Apple phones running iOS 7 (or later) to continuously scan the surroundings for Bluetooth Low Energy devices such as beacons.
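The iBeacon advertisements that such phones scan for follow a fixed manufacturer-specific layout: Apple's company ID 0x004C, type 0x02, length 0x15, a 16-byte proximity UUID, big-endian major and minor identifiers, and a signed measured-power byte (RSSI at 1 m). The sketch below decodes one such payload; the sample bytes are invented for illustration.

```python
import struct
import uuid

def parse_ibeacon(mfg_data: bytes):
    """Decode an iBeacon manufacturer-specific BLE payload.

    Returns a dict with the proximity UUID, major, minor, and measured
    TX power, or None if the payload is not an iBeacon advertisement."""
    if len(mfg_data) < 25 or mfg_data[:4] != b"\x4c\x00\x02\x15":
        return None
    proximity_uuid = uuid.UUID(bytes=mfg_data[4:20])
    major, minor = struct.unpack(">HH", mfg_data[20:24])  # big-endian
    tx_power = struct.unpack("b", mfg_data[24:25])[0]     # signed dBm at 1 m
    return {"uuid": str(proximity_uuid), "major": major,
            "minor": minor, "tx_power": tx_power}

# Invented sample payload: UUID bytes 00..0f, major=1, minor=2, power=-59 dBm.
payload = b"\x4c\x00\x02\x15" + bytes(range(16)) + b"\x00\x01\x00\x02\xc5"
info = parse_ibeacon(payload)
```

The (UUID, major, minor) triple is what a retail app uses to decide which store, aisle, or shelf a customer is near.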
3.1 Apache Spark for Real-Time Data Analytics: Apache Spark is a strong, high-performance open-source processing engine that can handle both real-time stream processing and batch processing; it is up to 100 times faster than MapReduce. APIs are available in three programming languages: Java, Python, and Scala. For generality, Spark is used in conjunction with Spark SQL, which allows streaming and sophisticated analyses. As part of our investigation, we use the Spark ecosystem for both real-time data analytics and batch processing. Flume: This tool is used for gathering and aggregating real-time event data from many sources. It is reliable, scalable, manageable, and configurable, and provides excellent performance; we use it to access the beacon data in real time. MLlib: MLlib is a Spark subproject that provides primitives for machine learning applications. Built on Apache Spark, a fast and versatile engine for large-scale data processing developed by the Apache Software Foundation, it is Spark's scalable machine learning library and contains standard machine learning techniques and utilities such as classification, regression, and clustering. To detect WiFi or Bluetooth signals from any nearby mobile phone using Bluetooth Low Energy, we use sensor devices (beacons) linked to the IoT. We then use Flume/Kafka to collect sensor data from the Internet of Things devices and convert it into JSON format. This JSON data is examined in real time using Spark Streaming and historically using the Apache Spark ecosystem.
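At scale this aggregation would run as a Spark Streaming job; the standard-library sketch below shows the same per-category transformation applied to a micro-batch of JSON sales events. The field names and sample records are assumptions for illustration, not the actual Amazon sales schema.

```python
import json
from collections import defaultdict

def aggregate_sales(json_lines):
    """Sum revenue per product category from a batch of JSON sales events,
    mirroring a groupBy/sum stage in a Spark Streaming job."""
    totals = defaultdict(float)
    for line in json_lines:
        event = json.loads(line)
        totals[event["category"]] += event["price"] * event["qty"]
    return dict(totals)

events = [
    '{"category": "electronics", "price": 199.0, "qty": 1}',
    '{"category": "books", "price": 12.5, "qty": 2}',
    '{"category": "electronics", "price": 49.0, "qty": 3}',
]
print(aggregate_sales(events))  # {'electronics': 346.0, 'books': 25.0}
```

In the proposed pipeline, such per-category totals feed the dashboards and the personalized promotion logic described below.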
Our goal is to extract useful information from the JSON data, display it on a dashboard, and link it to the Internet of Things, using it to create personalized sales promotions in real time and to increase total sales in the physical retail store, allowing it to compete with Internet shopping (Fig. 2). For gathering and transmitting recognition information, this system uses only standard, off-the-shelf access-point technology; in addition to achieving high identification rates, it therefore achieves low equipment and setup costs. However, the long range and sparse character of the passively collected WiFi broadcasts present a significant localization challenge. We propose a trajectory-estimation technique based on the Viterbi algorithm that takes the second-by-second location of a moving device as input and produces the most likely spatio-temporal path taken. In addition, they
Fig. 2 Flow chart of proposed work
demonstrate a few methods that cause passing devices to transmit more messages, increasing detection rates and improving the quality of the usable signal for better accuracy. Taking everything into consideration, an unmodified cell phone observed by WiFi monitors can be tracked practically, efficiently, and accurately. Based on estimates from a few real-world deployments, a large number of mobile phones may be tracked using the methods described here. The precision of passive WiFi tracking depends strongly on the density and geometry of the deployment; comparing our results with GPS ground truth, we achieved a mean error of under 70 m using monitors that were more than 400 m apart. Because of the low hardware cost, high precision, and broad reach of the suggested framework, it may be suitable for wide-scale deployment in metropolitan areas. Providing near-continuous, high-coverage estimates of surface-street activity flow would certainly benefit commuters and planners in such zones. This technique allows us to identify, track, and locate mobile phones by detecting
the wireless signatures of the WiFi signals that they emit. For example, if we need to know how many people came into a certain store, we may use this technology to find out. It also allows these consumers to be monitored without being identified: How long did they stay in the building? What are the typical activity patterns? We can take advantage of the wireless signatures that mobile phones broadcast from time to time via WiFi, so no application or client participation is needed. The technology proceeds in the following stages. Detecting and counting smartphones: for many uses, all we need is the total number of people passing. A detector is installed at the location and begins compiling a list of mobile-phone IDs. Not every visitor will carry a phone with WiFi enabled; nevertheless, based on our previous experience of the relationship between the total number of consumers and the number of phone IDs collected, people can be counted with more than 95% accuracy. Tracking the location of smartphones: guest identification and tracking are required in more advanced applications where visitors must be distinguished and counted, for example to learn how often a customer visits a retail area.
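The counting and dwell-time questions above reduce to grouping probe records by device ID, and a rough distance can be obtained from signal strength with the standard log-distance path-loss model. The sketch below illustrates both; the sample records, the hashing used for anonymization, and the path-loss exponent are illustrative assumptions, not the chapter's calibrated system.

```python
import hashlib
from collections import defaultdict

def anonymize(mac):
    """Hash the MAC address so devices can be counted, not identified."""
    return hashlib.sha256(mac.encode()).hexdigest()[:12]

def visitor_stats(probes):
    """probes: iterable of (mac, timestamp_s) pairs from WiFi probe requests.
    Returns the unique-device count and per-device dwell time in seconds."""
    seen = defaultdict(list)
    for mac, ts in probes:
        seen[anonymize(mac)].append(ts)
    dwell = {dev: max(times) - min(times) for dev, times in seen.items()}
    return len(seen), dwell

def rssi_to_distance(rssi, tx_power=-59, n=2.0):
    """Log-distance path loss: d = 10 ** ((tx_power - rssi) / (10 * n)),
    where tx_power is the expected RSSI at 1 m and n the loss exponent."""
    return 10 ** ((tx_power - rssi) / (10 * n))
```

With tx_power = -59 dBm, a reading of -59 dBm maps to 1 m and -79 dBm to roughly 10 m in free space; in practice the exponent n must be tuned per deployment, which is why the text stresses monitor density and geometry.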
3.2 Proposed Ideal Beacon Selection The solution section demonstrates the arrangement, which includes the mobile applications that end users and business owners require, as well as the API that developers can use to create proximity-based mobile applications for these two types of customers. This section describes in detail how each component of the design is realized (devices, programming languages, and so on). Because two examples were created to test the concept, we also discuss how they may be used. The code is hosted on GitHub, a platform that provides hosting for software projects. The following outlines the critical considerations behind our Smart Places strategy, then summarizes the devices used in developing the arrangement, and clarifies which technology was used to detect tags, following our definition of a Smart Place. We then demonstrate how the APIs can be used by developers, and finally describe how the Smart Retail Shops came to be. Smart Locations: A Smart Place is marked with tags that allow for location-based services; the Smart Retail Shop is an example of a Smart Place in action, developed following our definition. Customers gain access to these proximity-based services through a mobile application, as demonstrated in Sect. 3. These services, on the other hand, may be considered mobile apps in their own right. Alternatively, we might provide an SDK for each platform that would allow developers to include proximity-based features in their own apps. If we take this strategy,
we would end up in a situation where customers would have to install a separate application for each Smart Place they wanted to use. To understand why an application is required to access any Smart Place, we must first examine the approaches that might be used. Smart Places may be native mobile apps or web applications that run inside an embedded web browser, depending on the situation. A native mobile application is a program developed using the native tools available on the platform; it runs only on the platform for which it was designed. A web application runs inside a web browser and can therefore run on any platform where a browser is available. If Smart Places were developed as native apps instead of web applications, customers would be required to install a separate application for each one, and while walking around they would be unable to discover new proximity-based services. To get over this obstacle, we decided that Smart Places would be delivered as web applications. Any web application may thus be configured to react to the presence of tags in a Smart Place. This also means that clients need only a single application to access any Smart Place, because each one is a web application running inside an embedded browser in the client's mobile application. This is done using the WebView, a widget provided by the Android SDK that allows websites to be embedded within any user interface in an Android program. Smart Places are thus accessed via the client's mobile application: they are web applications that run in a WebView inside the client's mobile application, which is how the Smart Restaurant and Smart Museum examples operate.
Clients are not required to install two separate apps; they can reach either of these examples through our mobile application. Because we chose web apps as the means of supporting Smart Places, we expect them to be able to detect nearby tags. How can a web application, running inside an embedded browser within a native mobile application, detect tags in a Smart Place? In our approach, the tag advertisements (BLE beacon signals) are handled natively, and the essential information is passed to the server-side web application running inside the embedded browser.
3.3 Analysis and Results A whole new era is emerging in the retail industry, propelled by the Internet of Things. The IoT, along with massive data analysis, is altering both the path that consumers take and the way merchants interact with them to provide better service. The IoT is a unique benefit for the retail industry, providing merchants with the tools and insights they need to transform their organizations. By implementing a successful IoT strategy, retailers can significantly improve, automate, and refine business processes. They can also reduce operational costs, integrate channels, and,
Fig. 3 Proposed beacon architecture
most importantly, better understand and engage with customers. According to projections, the Internet of Things will comprise some 30 billion connected autonomous things by 2020, and merchants may find themselves overwhelmed by the possibilities this technology provides, particularly in the retail sector. It is essential first to develop persuasive business cases with clearly defined objectives, followed by execution plans that make use of existing resources and structure (Fig. 3). To make use of beacons, the proper hardware and software are necessary: beacons require equipment and software that can transmit and receive beacon signals via Bluetooth Low Energy (BLE). BLE capabilities have been included in iPhone hardware since the iPhone 4s, and Apple integrated the ability to communicate with BLE equipment into iOS 5 and the ability to send messages using the iBeacon standard into iOS 7. Devices running Android 4.3 or later can also communicate via BLE. At present, 30% of mobile phones in the United States are equipped to connect with beacons through BLE, a percentage expected to rise to 80% over the next year and a half as older phones are replaced with more modern ones (see Fig. 3). The iBeacon protocol may help expedite the deployment of BLE-enabled mobile apps: iBeacon is a BLE standard developed by Apple to speed the development of indoor-location capabilities in a variety of applications. This standard
Real-Time Big Data Analytics for Improving …
enables BLE-equipped mobile phones and other devices to communicate in a common language with mobile apps that use BLE. Having this standard in place helps developers speed up the development of beacon-aware mobile apps, allowing new functionality to be released more quickly. In addition, the iBeacon standard takes advantage of some of the location capabilities built into the iPhones themselves. We used Python for this step, and we produced the JSON in a nonstandard format, which means that Spark's internal reader cannot read it directly. For the evaluation, our analysis requires two pieces of information: the review text, which we must interpret, and the overall rating, which tells us what score the customer actually assigned to the item. Because review texts range from as few as 53 words to more than 500, the model should weight which terms are important rather than how often they appear. In this sentiment analysis we use only the review text and overall rating, but a stronger model might be developed by including other features in the analysis. First, since Spark is designed for iterative computation, it facilitates efficient implementations of large-scale machine learning algorithms, which are by nature iterative. Low-level improvements in Spark often translate into performance gains in MLlib, with no immediate changes to the library itself. A second benefit is Spark's vibrant open-source community, which has driven the rapid development and adoption of MLlib, with more than 140 contributors to the project. Third, MLlib is one of a small number of higher-level libraries built on top of the Spark framework.
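The point about weighting which terms are important rather than how often they appear is the idea behind TF-IDF weighting. The following is a minimal pure-Python sketch of that weighting, not MLlib's implementation (a real Spark pipeline would use MLlib's feature transformers); the sample reviews are invented:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized review texts.

    TF is the raw count of a term in one document; IDF down-weights
    terms that appear in many documents, so frequent but uninformative
    words score low even when their raw counts are high.
    """
    n = len(docs)
    df = Counter()                      # number of docs containing each term
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights

reviews = [["great", "battery", "great", "screen"],
           ["poor", "battery", "life"],
           ["great", "value"]]
w = tf_idf(reviews)
# "great" appears twice in the first review but also in another review,
# so IDF tempers its weight; "screen" is unique to the first review.
```

Here "screen", which occurs once but only in one review, ends up with a higher weight than "battery", which occurs once but in two reviews; that is exactly the importance-versus-frequency distinction the text describes.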
MLlib is a component of Spark's diverse ecosystem. It is an API for pipeline construction that provides engineers with a broad range of tools to simplify the practical development of machine learning pipelines. Modern datasets are growing in size and complexity at an alarming rate, and there is an increasing need for solutions that address this avalanche of information with statistical methods. For large-scale data processing, several "next-generation" data-flow engines that generalize MapReduce have been developed, and the question of how to make machine learning efficient on top of these engines is an intriguing one. Apache Spark, in particular, has grown into a widely used open-source distributed engine. Spark is a fault-tolerant and widely used cluster computing framework that provides APIs in Java, Scala, Python, and R, together with an advanced engine supporting general execution graphs. Spark is also very efficient at iterative computation, making it an excellent choice for the development of large-scale machine learning applications. In this work, we demonstrate MLlib, Spark's distributed machine learning library, the largest of its kind. The library is designed for large-scale environments that exploit data parallelism or model parallelism to store and
operate on data or models at large scale. Classification, regression, collaborative filtering, clustering, and dimensionality reduction are among the basic learning computations that may be performed quickly and efficiently in common learning situations using MLlib. It also provides a variety of fundamental statistics, linear algebra, and optimization primitives. Built on a Scala linear algebra library that uses native (C++-based) linear algebra libraries on each node, MLlib includes APIs for the Java, Scala, and Python programming languages and is distributed as part of the Spark project under the Apache 2.0 license. The close integration of MLlib with Spark provides the benefits already noted: faster development and execution of iterative large-scale machine learning algorithms, performance gains inherited from low-level Spark improvements, a thriving open-source community with more than 140 contributors, and a place among the few higher-level libraries built on top of the Spark framework (Fig. 4; Table 1). Figure 4 shows the top seven companies by maximum values. Our second objective is to identify the ten organizations with the highest number of transactions, using Amazon as an example in our investigation. This query provides the ten largest volumes for "Amazon", along with the month and year in which they occurred.
In this case, the query is as follows:

SELECT date, volume FROM nyse WHERE symbol = 'AMAZON' ORDER BY volume DESC LIMIT 10;

(Table 2). The results of the aforementioned analysis reveal the organizations that have benefited in each business. A significant number of investors
Fig. 4 Top 7 companies by maximum values
Table 1 Input sectors with preferred values

Company symbol    Max_value
BFX               232,356,236
CNM               2,306,458,999
NML               123,456,789
BHN               112,223,555
MML               1,002,000,321
SLL               6,566,666
HAL               3,232,356

Table 2 Volumes traded for Amazon

Month/Year    Volume
AUG-09        236,336,173
AUG-09        167,235,666
FEB-09        321,654,159
MAR-09        121,233,808
NOV-08        151,515,565
MAR-09        337,333,212
AUG-09        173,121,233
AUG-09        16,723,566
FEB-09        1,596,355
MAR-09        255,867,900
DEC-09        247,893,200
MAR-09        256,110,600
MAR-09        255,867,900
are interested in putting their money into a company that is performing well in the equity market. The information should be useful for financial analysts and clients who operate in the money markets and want to review all of an organization's historical records in order to guide business decisions. Our second finding, for WFC, shows that trading in the organization's stock was most active in 2008 and 2009 when the years 2000 to 2014 are compared (Fig. 5); these two years saw the highest trading volumes in the market. In this study, we demonstrated the applicability of Big Data technologies such as Hadoop and Hive to the financial services sector. This approach should be particularly beneficial for business intelligence, where a company tracks its past performance and other relevant information. Our future study will explore additional financial data sets in order to uncover further useful relationships or patterns (Fig. 6).
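The Hive query above is a filter-sort-limit operation. As a hedged illustration, the same operation over in-memory (symbol, month/year, volume) rows can be sketched in plain Python; the sample rows and function name below are ours, not from the paper:

```python
def top_volumes(rows, symbol, k=10):
    """Return the k largest trading volumes for one symbol:
    equivalent to SELECT date, volume FROM nyse
                  WHERE symbol = ? ORDER BY volume DESC LIMIT k."""
    matches = [(date, vol) for sym, date, vol in rows if sym == symbol]
    return sorted(matches, key=lambda r: r[1], reverse=True)[:k]

# Invented sample data in the shape of the NYSE rows discussed above
rows = [("AMZN", "AUG-09", 236_336_173),
        ("AMZN", "FEB-09", 321_654_159),
        ("WFC",  "MAR-09", 121_233_808),
        ("AMZN", "NOV-08", 151_515_565)]
print(top_volumes(rows, "AMZN", k=2))
# → [('FEB-09', 321654159), ('AUG-09', 236336173)]
```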
Fig. 5 Comparison with iBeacon Technology
Fig. 6 Comparison with iBeacon Technology in certain years
4 Conclusion

iBeacon technology is more than simply a means of closing a transaction. A fashion store, for example, may increase consumer confidence in-store by displaying the latest outfits or celebrity styles featured in leading fashion blogs and magazine articles. This may include suggestions on how to "complete the outfit" by advising which shoes go well with which trousers, which bag, which top, all of which are available to purchase in-store. Another example is a DIY shop, which can provide customers with set-up advice as well as seasonal tasks for the house or garden, all of which they can complete with goods available in the store. These are all examples of messages designed to increase sales and strengthen customer loyalty. The resources and technologies described in this study give customers additional reasons to visit a shop or shopping area, and can enhance a brand's loyalty program even further. If a customer finds that an item that he or she has tried on
does not suit their requirements, a beacon-enabled app may suggest other products and indicate where they are located in the shop. Customers may also request assistance from a shop employee instantly by pressing a button on their device. Finally, most shops send out weekly notifications about discounts or offers that provide additional benefits for loyalty-card holders. In any event, customers tend not to mind such messages while engrossed in their in-store shopping experience, with few exceptions. By deploying beacons and linking them to the Internet of Things, businesses can notify customers about current offers as soon as they walk through the door.
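As a concrete sketch of the beacon signals this chapter relies on: an iBeacon advertisement carries a 16-byte proximity UUID plus 2-byte major and minor fields (for example, store and aisle identifiers) and a signed TX-power byte. The parser below assumes Apple's published iBeacon payload layout; the sample UUID and field values are invented for illustration:

```python
import struct
import uuid

def parse_ibeacon(mfg_data: bytes):
    """Parse the manufacturer-specific data of an iBeacon advertisement.

    Layout: company ID 0x004C (little-endian), beacon type 0x02,
    data length 0x15, 16-byte proximity UUID, 2-byte major and
    2-byte minor (big-endian), 1-byte signed TX power at 1 m.
    """
    if len(mfg_data) < 25:
        raise ValueError("too short for an iBeacon frame")
    company, btype, length = struct.unpack_from("<HBB", mfg_data, 0)
    if company != 0x004C or btype != 0x02 or length != 0x15:
        raise ValueError("not an iBeacon frame")
    prox_uuid = uuid.UUID(bytes=mfg_data[4:20])
    major, minor = struct.unpack_from(">HH", mfg_data, 20)
    (tx_power,) = struct.unpack_from("b", mfg_data, 24)
    return {"uuid": str(prox_uuid), "major": major, "minor": minor,
            "tx_power": tx_power}

# Invented example frame: major = store ID, minor = aisle number
frame = (b"\x4c\x00\x02\x15"
         + uuid.UUID("f7826da6-4fa2-4e98-8024-bc5b71e0893e").bytes
         + struct.pack(">HH", 1, 42) + struct.pack("b", -59))
print(parse_ibeacon(frame))
```

An app that recognises the proximity UUID of its own beacon fleet can then use major/minor to decide which in-store message to show.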
References
1. L. Hand, Business strategy: developing IoT use cases for retail. Published by International Data Corporation (IDC). http://www.idc.com/getdoc.jsp?containerId=RI250069
2. L. Dunbrack, S. Ellis, L.H. Kimberly, K.V. Turner, IoT and digital transformation: a tale of four industries. Published by International Data Corporation, IDC #US41040016 (2016)
3. A. Saha, A study on "The impact of online shopping upon retail trade business". IOSR J. Bus. Manag. (IOSR-JBM) 74–78 (2015). e-ISSN: 2278-487X, p-ISSN: 2319-7668
4. A. Cuzzocrea, I. Song, K.C. Davis, Analytics over large-scale multidimensional data: the big data revolution!, in Proceedings of the ACM International Workshop on Data Warehousing and OLAP (2011), pp. 101–104
5. E.A. Kosmatos, N.D. Tselikas, A.C. Boucouvalas, Integrating RFIDs and smart objects into a unified Internet of Things architecture. Adv. Internet Things 1, 5–12 (2011). https://doi.org/10.4236/ait.2011.11002
6. M. Zaharia, M. Chowdhury, M.J. Franklin, S. Shenker, I. Stoica, in HotCloud 2010, June 2010
7. U. Han, J. Ahn, Dynamic load balancing method for Apache Flume log processing. Adv. Sci. Technol. Lett. 79 (IST 2014), 83–86 (2014). https://doi.org/10.14257/astl.2014.79.16
8. A. Bifet, Architectures for massive data management: Apache Kafka, Samza, Storm (University Paris-Saclay, Télécom ParisTech, 2015)
9. A. Zaslavsky, C. Perera, D. Georgakopoulos, Sensing as a service and big data, in IEEE International Conference on Computational Intelligence and Computing Research (2013), pp. 1–6
10. F. Chen, P. Deng, J. Wan, D. Zhang, A.V. Vasilakos, X. Rong, Data mining for the Internet of Things: literature review and challenges. Int. J. Distrib. Sensor Netw. 2015, Article ID 431047, 14 pp. (2015)
11. S. Madakam, R. Ramaswamy, S. Tripathi, Internet of Things (IoT): a literature review. J. Comput. Commun. 3, 164–173 (2015)
12. S.M. Barakat, Internet of Things: ecosystem and applications. J. Curr. Res. Sci. (2016). ISSN 2322-5009
13. L. Yao, Q.Z. Sheng, S. Dustdar, Web-based management of the Internet of Things. Published by the IEEE Computer Society (IEEE, 2015)
14. D. Evans, The Internet of Things: how the next evolution of the Internet is changing everything. White paper, Cisco Internet Business Solutions Group (2011)
15. C. Chen, B. Das, D.J. Cook, Energy prediction based on resident's activity, Washington, DC, USA (2010). ACM 978-1-4503-0224-1
16. J. Cohen, B. Dolan, M. Dunlap, J.M. Hellerstein, MAD skills: new analysis practices for big data, in VLDB '09, 24–28 Aug 2009, Lyon, France (VLDB Endowment, 2009)
17. H. Zhou, B. Liu, P. Dong, The technology system framework of the Internet of Things and its application research in agriculture, in 5th Computer and Computing, ed. by D. Li, Y. Chen (2011)
18. W. Liang, J. Luo, Network lifetime maximization in sensor networks with multiple mobile sinks, in Proceedings of the 36th Conference on Local Computer Networks (LCN '11), Oct 2011 (IEEE, Bonn, 2011), pp. 350–357
19. M. Moody, Analysis of promising beacon technology for consumers. Elon J. Undergraduate Res. Commun. 6(1) (2015)
20. F. Zafari, I. Papapanagiotou, K. Christidis, Micro-location for Internet of Things-equipped smart buildings. IEEE Internet Things J. 3(1) (2015)
21. V.H. Bhide, A survey on the smart homes using Internet of Things (IoT). Int. J. Adv. Res. Comput. Sci. Manag. Stud. ISSN: 2321-7782 (Online) (2014)
22. H.S. Bhosale, D.P. Gadekar, A review paper on big data and Hadoop. Int. J. Sci. Res. Publ. 4(10) (2014). ISSN 2250-3153
Healthcare Application System with Cyber-Security Using Machine Learning Techniques C. Selvan, C. Jenifer Grace Giftlin, M. Aruna, and S. Sridhar
Abstract Each year, the Internet of Things (IoT) brings a significant increase in the number of intelligent "things" connected to the Internet. Home appliances now appear alongside mobile devices such as smartphones and tablets, and many of these devices are available in markets all over the globe. The Internet connectivity these devices provide raises the familiar problems of the current Internet, and artificial intelligence methods are applied to extract value from the resulting research capacity. Globally, healthcare services are among the most significant applications that the Internet of Things (IoT) has made possible. Advanced sensors may be worn on the body or implanted into organs so that patients can monitor their health in real time. The information may then be analysed, grouped, and prioritised as necessary. When physicians work with algorithms, they can adjust their treatment plans while simultaneously ensuring that patients receive cost-effective health care. Keywords IoT · Health care · Artificial intelligence · Remote patient monitoring · Machine learning C. Selvan (B) Department of Computer Science & Engineering, New Horizon College of Engineering, Bangalore, Karnataka, India e-mail: [email protected] C. Jenifer Grace Giftlin Department of Information Technology, Sri Krishna College of Engineering and Technology, Coimbatore, TN, India M. Aruna Department of Computer Science and Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Kanchipuram, Chennai, TN, India S.
Sridhar Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_9
1 Introduction

The world's urban population is growing at an exponential pace, and the more people there are, the greater the strain on resources. Even though medical resources and facilities exist in cities, and they are increasing in number every day, an adequate level has not yet been reached, placing an excessive burden on government resources. Urban health-care systems have matured, bringing with them answers to pressing problems. This research represents a new step in e-health: a large number of sensors are used to develop a new multidimensional method for monitoring the activity of a wide range of diseases. The system is built on the Raspberry Pi ARM11 (BCM2837) and supports a comprehensive process of care for illnesses such as cardiac convulsions, diabetes, and fever [1]. The sensors continuously capture physiological signals; the data acquired wirelessly over the network are saved, processed, and evaluated against available health records. In the absence of doctors, current data records and decision-support systems may be used to produce a reasonable diagnosis for first-aid advice, and the devices can flag health problems as they occur. Using the database system, we can not only anticipate the need for medicines and medical equipment, but also monitor the effect of contemporary technology on the quality of life and health of everyone involved. Each step taken towards more accurate disease prediction serves to reduce the overall cost of healthcare.
Patients will benefit from this article, since it offers them a financial model for technical services as well as the underlying concepts. The real-world medical sector [2] poses a significant obstacle to IoT integration. By reducing medical costs, the approach improves patient safety while also significantly improving the accuracy of illness prediction in general. This article presents a technical model, together with the cost considerations involved in patient comfort and deployment, for implementing IoT in the real world of healthcare [3]. The document's main goals are briefly summarized as follows:
• Medical information about a patient can be acquired via the Internet of Things, and the collected patient data can be observed.
• Diseases and conditions are identified and characterized via data mining, starting from specific data, which allows more effective decision-making.
• Healthcare choices based on the Internet of Things can be made at any time and from any place.
A number of industries, including manufacturing, transportation, and government, have benefited from Machine Learning (ML)/Deep Learning (DL) systems [3]. Since a
few years, DL has surpassed the state of the art in several fields, among them computer vision, text analysis, and natural language processing. ML/DL algorithms are widely used; for example, the social media platforms we use in our everyday lives (such as Facebook) rely on them. The ML/DL computing approach is having an increasingly positive impact on health-care delivery as well. Large-scale problems have long posed a barrier to the treatment of disorders [4]. Body-organ identification in medical imaging, classification of pneumonia, detection of lung cancer, image reconstruction, and segmentation of brain tumours are just a few of the domains where machine learning and deep learning have made major strides. To mention further examples, smart software looks for nearby patients, and machine learning interprets medical test results (Fig. 1). Clinician-assisted analysis has risen to the top of the list of possible application areas for ML/DL models, and a number of models have already been developed to fill this need. In fields such as clinical pathology, radiation therapy, eye disorders, and skin conditions, DL models are beginning to take over tasks from human doctors. Many studies have been published on DL models which, on average, outperform human physicians [5, 6] in various areas, according to the findings. Additionally, ML/DL technology may assist with outcome assessment and the development of intelligent solutions modelled on human reasoning. Peripheral medical services play an important role in the modernization of health-care technology in rural and low-income areas, and they contribute significantly to this effort [7, 8]. Fig. 1 Machine learning clinical workflow
2 Related Work

Multiple researchers have proposed models for IoT in healthcare, as well as techniques for predicting various types of diseases using a variety of data sources and methods. The purpose of this section is to highlight the work that has been done in this area. One approach uses a smart chair that implements an ECG/BCG system in the seating area to consistently detect physiological issues; by supplying reference signals, the bio-signals and the monitoring organization can be used exactly as intended, without modification. Several instances of Internet of Things (IoT) applications in healthcare are discussed in this article. Almutairi and his associates proposed a mobile health system that gathers real-time data from mobile devices and enables patients to store it on Internet-enabled network servers, with access restricted to certain users; this information is generated and made available for use. Berger et al. [9], experts in their field, built their structure on smart-home sensors; prototypes are also being evaluated as part of a network that will monitor and track the movements of patients in the future. Their primary aim is to verify the behaviour patterns of their system and to be able to manage it during day-to-day operation. Chiuchisan and colleagues created a framework for an intelligent intensive-care system that warns patients' families and doctors of any deviation in the patient's health status or physical movements, or in the surroundings of the home. Dwivedi et al. [10] provide a stable foundation for trust: clinical data must be delivered via the Internet to be included in the Electronic Patient Record (EPR) system, and a multi-level security information system is proposed.
Passwords are a universal entry mechanism, and the framework also incorporates smart cards and biometric technology, to name a few measures. Gupta et al. [11] suggested and implemented a model for measurement and recording that takes into account the patient's ECG as well as the patient's other vital signs. A Raspberry Pi can be deployed at hospitals and other places to serve patients and their families and improve their overall well-being. Intel offers a Galileo-based strategy as a starting point, illustrated by a graph depicting the evolution of different data and load types. Physicians can and do make use of electronic medical-record databases. Efforts are being made to minimise patients' pain, and their health indicators are monitored on a regular basis [12]. Lopez and his associates proposed an IoT-based framework because he and his team were unable to find existing IoT solutions that could be helpful to them and their community. For the most recent Internet of Things test, two use cases were
developed, the first concentrating on the technologies that could be utilised and the second on the applications that could not be used. Magarelli and Rao proposed a new process for assessing the severity of a patient's disease as documented in the patient's medical record. They reached this result using a statistical technique based on illness-probability threshold mining. Moreover, their primary goal is to improve the algorithm and to reduce the weight of hyperlinks on websites. Sahoo et al. [13] examined the health-care system as well as a massive quantity of patient data drawn from a variety of reports. They acquired a greater knowledge of health attitudes, which they can use to predict a patient's current or prospective health condition, taking advantage of large information stores in the cloud; a variety of approaches may be applied on the same analytics platform. Taigi et al. [14] investigated the health applications of the Internet of Things and proposed the use of cloud computing. With permission, medical data and patient records can be transmitted in a secure manner. To establish a bond between the patient and his or her family, hospitals, doctors, labs, and a variety of other organisations are all engaged in this process; in other words, the patient is financially liable to the clinic for a period of one month. Wang and others have recently developed online applications tailored to specific medical equipment using the Internet of Things, where data retrieval depends on the quality of the connection. UDA-IoT design methods are described in detail below, and the use of information in medical applications has been demonstrated [15].
P2P frameworks and Internet of Things therapy technologies are used to keep patients engaged with, and inside, the monitoring system. Web Real-Time Communication (WebRTC) by Kahiki et al. [16] is tested in a variety of conditions, with test results given for each. Priority is placed on reliable information: a routine turns on Bluetooth control with the help of a sphygmomanometer. A software function, SBP, kept track of measures such as systolic blood pressure, diastolic blood pressure, and cold symptoms, so that the data from the software can be transmitted more easily. Mobile devices and data are archived, disputes are recorded, and public comments are made, among other things. In real-world applications, an end-to-end, real-time Internet architecture poses a challenge because of the large number of moving parts. In one system, dedicated devices monitor the difficulties users encounter while wearing a Bluetooth headset, resulting in a device specifically designed for the visually impaired. The inclusion of another blind-assistance sensor [17] provides the user with a comprehensive communication system that ensures precision and readiness during the glove's extended life. Despite these previous attempts, there is a limited amount of data connectivity across various cloud environments, making it difficult to assess and analyse the data. Given this restriction, we offer a potential solution in this article that takes it into consideration.
3 Materials and Methods

A patient's electrocardiogram (ECG), temperature, electromyography, muscle activity in breathing, perspiration, and blood sugar, as well as conditions such as arrhythmias, heart and muscular disorders, blood pressure, obesity, and diabetes, may all be monitored with this device. Today, sensors can be easily applied to the skin, and many parts of the body have seen significant improvement, though caution is still needed when using them. A variety of physiological data, including various physiological parameters, is collected via sensors attached to the bodies of patients [18]. The data is then sent using data and communications software running on a small handheld device. To avoid getting in the way of the patient's movements, sensors should be small and light in weight. These sensors should be powered by small, low-energy batteries, which ensures that they can be used for long periods without the need for replacement or recharging. With the right transmission components, patient data can be transmitted accurately and securely to the health centre; transmission may be accomplished via Bluetooth. The information can also be requested through the online health centre. The devices linked to the system can be activated via the hub, for example using a smartphone [19] (Fig. 2). Fig. 2 Data collection transmission
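The paper does not specify a wire format for the readings forwarded from the hub to the health centre. As an illustrative assumption only, one sensor sample could be packaged as a small JSON payload like this (all field names below are ours, not from the paper):

```python
import json
import time

def package_reading(patient_id, sensor, value, unit):
    """Wrap one sensor sample in a small JSON payload suitable for
    forwarding from the hub (e.g. a smartphone) to the health centre.

    The schema is a hypothetical sketch: the paper only says data are
    collected by body sensors and transmitted via Bluetooth/Wi-Fi.
    """
    return json.dumps({
        "patient_id": patient_id,
        "sensor": sensor,            # e.g. "ecg", "temperature", "glucose"
        "value": value,
        "unit": unit,
        "timestamp": int(time.time()),  # epoch seconds at the hub
    })

msg = package_reading("P-1024", "temperature", 37.2, "celsius")
```

A compact, self-describing payload like this keeps the battery-powered sensor side simple (it sends raw values; the hub adds identity and time) while remaining easy to store, process, and evaluate against health records on the server side.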
3.1 Proposed Methodology

This work targets government hospitals, where there are never enough physicians to manage blood sugar and blood pressure; pulse and body temperature are also monitored. The work is intended for use in the administration of a wide range of pharmaceutical products. Blood pressure, blood sugar, and heart rate are all indicators of different conditions that may be detected via monitoring. The sensors used to measure these parameters are divided into four categories: a blood glucose sensor, a blood pressure sensor, a heart-rate sensor, and a temperature sensor. A single module [20] connects all of the sensors. The ARM11 processor determines whether or not the patient is recognised from the patient's physical features and the available information. When the patient accesses his or her profile, the microprocessor starts to communicate via the speaker, teaching the patient how to use the medical equipment and showing a video demo. The system handles a wide range of responsibilities. The pressure sensor continuously checks the fluid pressure in the body and transmits the information to the computer; heart rate and body temperature are likewise measured and made available, in the same manner as blood sugar levels. All readings are calculated from data collected by the sensors. The MCP3008 is an analog-to-digital converter that may be used in a variety of applications: the sensor's analog measurements are converted into discrete numbers, which are then sent over the configured channels. This section may be connected to programmes that facilitate the exchange of information between a doctor and a patient (see also the Equipment Section). The system implements the Internet of Things concept by assigning a unique IP address to a patient database page. Following their doctor's instructions, patients may use this address to monitor their reading levels. In the last step of the process, a Wi-Fi unit sends the readings to the patient's location (Fig. 3). A camera is linked to the equipment to register the patient's face or add facial information to the record. When the patient arrives, a comparison between the stored face and the captured face is performed. If the patient's face information is available, a window opens with the patient's information [21]; it may be used to keep track of the patient's record. If no information is found, a column is displayed for entering the patient's details, as for a new registration. Once a patient is identified, the system starts to interact with the patient as well as other members of the team, and monitoring is used to track and engage with the illness at the location.
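Since the MCP3008 is a 10-bit converter, each sample it produces is an integer in 0-1023. The sketch below shows, under stated assumptions, how a raw sample could be converted to a voltage and how a derived vital sign could be flagged against a range; the reference voltage and threshold values are illustrative, not clinical guidance:

```python
def adc_to_voltage(raw, vref=3.3):
    """Convert a 10-bit MCP3008 sample (0-1023) to volts.

    Assumes a 3.3 V reference by default; the actual reference depends
    on how the converter is wired in the device.
    """
    if not 0 <= raw <= 1023:
        raise ValueError("MCP3008 samples are 10-bit (0-1023)")
    return raw * vref / 1023

def check_vital(name, value, low, high):
    """Flag a reading outside its configured range.

    The thresholds are illustrative placeholders for the criteria the
    text says physicians establish for each parameter.
    """
    return "normal" if low <= value <= high else "alert"

v = adc_to_voltage(1023)                                   # full scale: 3.3 V
status = check_vital("heart_rate", 130, low=60, high=100)  # out of range
```

In a deployment, the per-parameter thresholds would come from the criteria the physicians configure, and an "alert" result would trigger the notification path described above.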
134
C. Selvan et al.
Fig. 3 Proposed technology
The staff module is responsible for keeping all patients' medical records up to date, and the Wi-Fi module is kept operational. Keeping track of all patient information and readings is important for physicians, since the records may be transmitted over medical and Ethernet networks at any time. Threshold criteria have been established to help physicians interpret the medical data and prescribe treatment for their patients. Doctors first enter their information and create a user ID and password for future reference; logging in with these credentials provides an additional layer of security for the website [22]. The doctor may give a prescription orally or in writing: audio captured through a microphone is relayed to the patient through a speaker, and patients may also receive prescriptions and health cards from the pharmacy by phone, email, or text, paying close attention to the doctor's directions or reading the prescription on the computer screen. In this way patients can manage their health, and be registered properly, without depending on nurses or other professionals [9]. To determine the lowest theoretical signal strength required by a receiver for a particular data rate, an established formula is used: −154 dBm + 10 log10(bit rate)
(4.1)
If the data rate is 1 Mbps, then a receiver must have a sensitivity of −94 dBm in order to have a reasonable chance of receiving good data. For example, if the
Healthcare Application System …
135
received signal is in reality at −84 dBm, the fade margin is 10 dB, which means that the received signal may fade by up to 10 dB and still be decoded. Once the data rate has been determined, it is possible to compute the minimum allowable receiver power. The relationship between RSSI and distance uses the following quantities:
Fm — fade margin (dB)
n — path-loss exponent, ranging from 2.7 to 4.3
P0 — signal power (dBm) at zero distance
Pr — signal power (dBm) at distance r
F — signal frequency in MHz (2412–2483.5 MHz)
pi = p0 + 10 n log10(ri / r0)
(4.2)
where p0 is the received power in decibels (dBm) at the reference distance r0, and the path-loss exponent n varies from 2 to 4 depending on the transmission medium. The second mode is very essential, since it is only during this phase that the user's authentication is confirmed and they are either granted or refused access to the e-health system. This procedure is carried out by the authentication server, which validates the authentication by receiving a packet from any one of the IPs listed in the authentication request. When a packet is received, the server checks whether the specific IP address is on the registered list and also appears on the authentication list; if it does not, the packet is rejected. The server then extracts the RSS value, which is compared against the authentication list, and if the value falls within the minimum and maximum threshold values, the authentication is successful and the data is transmitted to the cloud server or received from the cloud server (Fig. 4).
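The link-budget relations above can be worked through in a few lines. The function names are mine, and the sign convention for the path-loss model follows Eq. (4.2) exactly as printed; the RSS window check mirrors the authentication-server step:

```python
import math

def min_sensitivity_dbm(bit_rate_bps):
    """Eq. (4.1): minimum theoretical receiver sensitivity (dBm)."""
    return -154 + 10 * math.log10(bit_rate_bps)

def fade_margin_db(received_dbm, sensitivity_dbm):
    """How much the received signal may fade and still be decoded."""
    return received_dbm - sensitivity_dbm

def received_power_dbm(p0_dbm, n, r, r0=1.0):
    """Eq. (4.2) as printed: pi = p0 + 10 n log10(ri / r0)."""
    return p0_dbm + 10 * n * math.log10(r / r0)

def authenticate(rss, rss_min, rss_max):
    """Setup-phase check: accept only if the packet's RSS falls in the
    registered [min, max] window for that IP."""
    return rss_min <= rss <= rss_max
```

At 1 Mbps, `min_sensitivity_dbm` reproduces the −94 dBm figure in the text, and a −84 dBm signal gives the quoted 10 dB fade margin.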
4 Implementation The Internet of Things (IoT) is a computing paradigm in which each physical object is equipped with a sensor, a microcontroller, and a transmitter, built on a common protocol stack, that allow the objects to communicate with one another and with clients. In Internet-based healthcare, a variety of distributed devices collect, analyse, and transmit medical data to the cloud, enabling a massive amount of data to be gathered, stored, and analysed in a number of new ways, independent of context. The Innovative Information Finding Model (IIFM) offers continuous and comprehensive access to medical information through any Internet connection. The
Fig. 4 Setup phase flowchart
restricted battery life of all Internet-connected devices means that overall energy usage must be kept to a minimum. The use of the ZigBee mesh protocol to bring the Internet health system into hospitals is discussed in this article. Once the healthcare system is implemented, it will be possible to monitor the pathological parameters of hospitalised patients on a regular basis. Well-tested and proven equipment improves overall care quality while simultaneously lowering overall maintenance costs and actively participating in data collection and analysis [23]. The Raspberry Pi is programmed in Python and transmits health data to a web server through an Ethernet connection, where the name of the patient as well as his or her health condition may be viewed online. The programme makes use of a number of different components.
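The Pi-to-web-server step can be sketched with the standard library alone. The endpoint URL and the JSON field names below are my assumptions, not a schema the chapter defines:

```python
import json
from urllib import request

SERVER_URL = "http://192.168.1.10/readings"  # hypothetical endpoint

def build_payload(patient_id, vitals):
    """Serialize one set of sensor readings as a JSON document."""
    return json.dumps({"patient": patient_id, **vitals}).encode()

def post_reading(patient_id, vitals, url=SERVER_URL):
    """POST the readings to the web server over the Ethernet link."""
    req = request.Request(url, data=build_payload(patient_id, vitals),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req, timeout=5) as resp:
        return resp.status
```

A scheduler (e.g. a simple loop with a sleep) would call `post_reading` with the latest MCP3008-derived values for each monitored parameter.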
4.1 Camera Specifications These documents detail the product's specifications in order to verify that it is designed to meet the needs of customers. The AH5020B23-S1-2Z1 is a USB-class camera with video capability developed for laptop photography. It comprises the CMOS sensor, lens, holder, support, PCB, image-processing pipeline, and interface, together with access to digital video devices, and is intended as a reliable module integrated into a laptop for transferring video data over a USB connection. The AH5020B23-S1-2Z1 not only provides UXGA resolution (1600 × 1200) for still-image applications, but also provides a video stream for the end user to watch and record through the USB 2.0 interface; in YUY2 mode it can handle VGA resolution (640 × 480) at 30 frames per second. The AH5020B23-S1-2Z1 provides AE, AWB, and AGC for the CMOS sensor, enabling automatic image control, and it exposes a standard UVC user interface (UI) for picture-quality adjustment.
4.2 Comparison with Existing Algorithm The proposed Enhanced Secure Patient Monitoring Algorithm for session-expiry identity verification in Wireless Personal Area Networks (ESHMA) is compared with the Priority Based Secure Cluster-based Patient Monitoring Algorithm (PBSCHCMA) (Sethupathi et al. 2015) [24], the Secure Priority Based Treatment Monitoring Algorithm (SPBHCMA) (Sethupathi et al. 2015), and the Light Weight Security Architecture (LSA) (Sethupathi et al. 2015). Table 1 shows the results of measuring the delay time with various key lengths. Because of improved routing and decreased queuing time, the proposed ESHMA algorithm performs well in measuring the delay when compared with current methods. With a key length of 50, a delay of 0.64 s is produced, which is smaller than those of LSA, SPBHCMA, and PBSCHCMA. ESHMA has a shorter delay period than the other algorithms: even with a key length of 250, ESHMA's delay is 4.19 s, while LSA, SPBHCMA, and PBSCHCMA have delays of 4.8 s, 4.32 s, and 4.2 s, respectively (Fig. 5; Table 2).

Table 1 Key length versus delay time (s)

Key length   LSA   SPBHCMA   PBSCHCMA   ESCHMA   ESHMA
50           0.9   0.7       0.65       0.63     0.64
100          2.0   1.78      1.56       1.49     1.52
150          4.0   3.69      3.27       3.27     3.23
200          4.2   4.06      3.82       3.82     3.78
250          4.8   4.32      4.2        4.2      4.19
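Transcribing the Table 1 delays into a small lookup makes the comparison checkable; the helper function is purely illustrative:

```python
# Table 1 data: delay (s) per algorithm at each key length.
DELAY = {
    50:  {"LSA": 0.9, "SPBHCMA": 0.7, "PBSCHCMA": 0.65, "ESCHMA": 0.63, "ESHMA": 0.64},
    100: {"LSA": 2.0, "SPBHCMA": 1.78, "PBSCHCMA": 1.56, "ESCHMA": 1.49, "ESHMA": 1.52},
    150: {"LSA": 4.0, "SPBHCMA": 3.69, "PBSCHCMA": 3.27, "ESCHMA": 3.27, "ESHMA": 3.23},
    200: {"LSA": 4.2, "SPBHCMA": 4.06, "PBSCHCMA": 3.82, "ESCHMA": 3.82, "ESHMA": 3.78},
    250: {"LSA": 4.8, "SPBHCMA": 4.32, "PBSCHCMA": 4.2, "ESCHMA": 4.2, "ESHMA": 4.19},
}

def best_algorithm(key_length):
    """Return the algorithm with the lowest delay at this key length."""
    row = DELAY[key_length]
    return min(row, key=row.get)
```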
Fig. 5 Key length versus delay time
Table 2 Key length versus throughput (bits)

Key length   LSA       SPBHCMA   PBSCHCMA   ESCHMA    ESHMA
50           504,402   542,425   584,526    586,424   586,420
100          424,892   444,088   486,744    492,754   492,744
150          408,446   429,829   442,426    452,209   452,098
200          457,984   468,795   484,687    484,687   487,445
250          267,892   277,688   284,567    284,567   284,646
Throughput: Table 2 shows the throughput values obtained when simulating the ESHMA. When compared to existing techniques, the proposed ESHMA algorithm outperforms them in terms of throughput measurement, as shown in Fig. 5. Comparing the results of the ESHMA's throughput measurement with those obtained from LSA, PBSCHCMA, and SPBHCMA, the results are promising; the red line in Fig. 6 shows that ESHMA attains a higher throughput value than the other techniques considered. Figure 6 also shows that, compared with the current methods, the proposed ESHMA algorithm performs better in measuring the PDR, since transmission congestion is lower. With a key length of 50, the PDR is 95.14 for LSA, 96.28 for SPBHCMA, 98.02 for PBSCHCMA, and 98.43 for the ESHMA algorithm; ESHMA produces comparably better outcomes with key lengths of 100, 150, 200, and 250.
Fig. 6 Key length versus packet delivery ratio
5 Conclusion Using the cloud-based information system, the suggested solution aspires to provide patients with better-connected, economical health services, allowing specialists and doctors to build on this knowledge and reach faster, more favourable conclusions for their patients. The final model contains all of the characteristics that a doctor looks for in a patient at any given time, so the relevant expert can treat the patient in the clinic on the basis of this connected, economical assistance, resulting in shorter hospital queues, direct consultation with physicians, reduced contextual dependence, and full use of the website. The primary goal of the approach is to provide patients with high-quality, affordable care connected to their services via the network's cloud of information, which brings together professionals from many fields; doctors may use this information to provide their patients with a quick and cost-effective resolution. Several characteristics of the final product enable the doctor to examine the patient from any place and at any time. This yields an economic advantage for sick people who would otherwise have to travel to hospital for consultations with doctors, and it reduces their families' healthcare costs. The proposed method would thus be cost-effective in terms of ensuring proper health management in public hospitals, which is the goal. Through replication, more components of the artificial-intelligence side of the system, involving both patients and physicians, may be enhanced. The majority of people's medical histories include parameters and corresponding outcomes, and data mining can continuously search them for patterns, as well as for systemic disease connections, among other things.
For example, if a patient's health measurements change in the same manner as those of previous patients in the database, it is possible to predict the outcome. If such a trend exists, the system can recognise it immediately, and
physicians will have an easier time identifying it, giving medical experts a way out of this problem.
References

1. H.T. Sullivan, S. Sahasrabudhe, Envisioning inclusive futures: technology-based assistive sensory and action substitution. Futur. J. 87, 140–148 (2017)
2. Y. Yin, Y. Zeng, X. Chen, Y. Fan, The Internet of Things in healthcare: an overview. J. Ind. Inf. Integr. 1, 3–13 (2016)
3. H.N. Saha, S. Auddy, S. Pal, Health monitoring using Internet of Things (IoT). IEEE J. 69–73 (2017)
4. S.F. Khan, Health care monitoring system in Internet of Things (IoT) by using RFID, in IEEE International Conference on Industrial Technology and Management (2017), pp. 198–204
5. M. Hassanalieragh, A. Page, T. Soyata, G. Sharma, Health monitoring and management using Internet-of-Things (IoT) sensing with cloud-based processing: opportunities and challenges (2015)
6. M.S.D. Gupta, V. Patchava, V. Menezes, Healthcare based on IoT using Raspberry Pi, in 2015 International Conference on Green Computing and Internet of Things (ICGCIoT), Oct 2015, pp. 796–799
7. P. Gupta, D. Agrawal, J. Chhabra, P.K. Dhir, IoT based smart healthcare kit, in 2016 International Conference on Computational Techniques in Information and Communication Technologies (ICCTICT), Mar 2016, pp. 237–242
8. N.V. Lopes, F. Pinto, P. Furtado, J. Silva, IoT architecture proposal for disabled people, in 2014 IEEE 10th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), Oct 2014, pp. 152–158
9. R. Nagavelli, C.V. Guru Rao, Degree of disease possibility (DDP): a mining based statistical measuring approach for disease prediction in health care data mining, in International Conference on Recent Advances and Innovations in Engineering (ICRAIE-2014), May 2014, pp. 1–6
10. P.K. Sahoo, S.K. Mohapatra, S.L. Wu, Analyzing healthcare big data with prediction for future health condition. IEEE Access 4, 9786–9799 (2016). ISSN 2169-3536
11. B. Krishnan, S.S. Sai, S.B. Mohanthy, Real time internet application with distributed flow environment for medical IoT, in International Conference on Green Computing and Internet of Things, Noida (2015), pp. 832–837
12. V. Arulkumar, C. Puspha Latha, D. Jr Dasig, Concept of implementing big data in smart city: applications, services, data security in accordance with Internet of Things and AI. Int. J. Recent Technol. Eng. 8(3) (2019)
13. D. Azariadi, V. Tsoutsouras, S. Xydis, D. Soudris, ECG signal analysis and arrhythmia detection on IoT wearable medical devices, in 5th International Conference on Modern Circuits and Systems Technologies, Thessaloniki (2016), pp. 1–4
14. A. Mohan, Cyber security for personal medical devices Internet of Things, in IEEE International Conference on Distributed Computing in Sensor Systems, Marina Del Rey, CA (2014), pp. 372–374
15. L.Y. Yeh, P.Y. Chiang, Y.L. Tsai, J.L. Huang, Cloud-based fine-grained health information access control framework for lightweight IoT devices with dynamic auditing and attribute revocation. IEEE Trans. Cloud Comput. 99, 1–13 (2015)
16. V. Arulkumar, An intelligent technique for uniquely recognising face and finger image using learning vector quantisation (LVQ)-based template key generation. Int. J. Biomed. Eng. Technol. 26(3/4), 237–249 (2018)
17. P. Porambage, A. Braeken, A. Gurtov, M. Ylianttila, S. Spinsante, Secure end-to-end communication for constrained devices in IoT-enabled ambient assisted living systems, in IEEE 2nd World Forum on Internet of Things, Milan (2015), pp. 711–714
18. K. Yelamarthi, B.P. DeJong, K. Laubhan, A Kinect-based vibrotactile feedback system to assist the visually impaired (2017)
19. X.-W. Chen, X. Lin, Big data deep learning: challenges and perspectives. IEEE Access 2, 514–525 (2014)
20. M. Siekkinen, M. Hiienkari, J. Nurminen, J. Nieminen, How low energy is Bluetooth low energy? Comparative measurements with ZigBee/802.15.4, in Wireless Communications and Networking Conference Workshops (WCNCW) (IEEE, 2012), Apr 2012, pp. 232–237
21. N. Bui, M. Zorzi, Health care applications: a solution based on the Internet of Things, in Proceedings of the 4th International Symposium on Applied Sciences in Biomedical and Communication Technologies, ser. ISABEL'11 (ACM, New York, 2011), pp. 131:1–131:5
22. K. Laubhan, M. Trent, B. Root, A. Abdelgawad, K. Yelamarthi, A wearable portable electronic travel aid for the blind, in IEEE International Conference on Electrical, Electronics, and Optimization Techniques (2016)
23. M. Li, S. Yu, Y. Zheng, K. Ren, W. Lou, Scalable and secure sharing of personal health records in cloud computing using attribute based encryption. IEEE Trans. Parallel Distrib. Syst. 24(1), 131–143 (2013)
24. V. Arulkumar, C. Selvan, V. Vimal Kumar, Big data analytics in healthcare industry: an analysis of healthcare applications in machine learning with big data analytics. IGI Global Big Data Analyt. Sustain. Comput. 8(3) (2019)
Energy Efficient Data Accumulation Scheme Based on ABC Algorithm with Mobile Sink for IWSN S. Senthil Kumar, C. Naveeth Babu, B. Arthi, M. Aruna, and G. Charlyn Pushpa Latha
Abstract The current trend in Wireless Sensor Networks (WSN) is multihop networking, which is used to transmit data through various networks. The use of multihop forwarding in large-scale WSNs causes an energy hole problem, which results in a considerable amount of transmission overhead. In this paper, a multiple mobile sink-based information gathering method that combines energy-balanced clustering with Artificial Bee Colony-based data gathering is proposed in order to address these concerns. The remaining energy of a node is used to determine whether that node will serve as the cluster head. According to the findings of this research, mobile sink balancing may be approached from three different perspectives: data gathering expansion, mobile route distance reduction, and network reliability optimization. The study is conducted on a large, dense WSN in which a certain level of data delay can be tolerated. The paper proposes the Artificial Bee Colony optimization technique, which can reduce losses in data communication, improve network lifetime, save the energy of the system, maintain the reliability of the system, and increase the network efficiency.

Keywords Artificial bee colony optimization · Mobile sink · Wireless sensor networks · Efficiency

S. Senthil Kumar (B) Department of Computer Science, Sri Ramakrishna Mission Vidyalaya College of Arts and Science (Autonomous), Coimbatore, India e-mail: [email protected]
C. Naveeth Babu Department of Computer Science, Kristu Jayanti College (Autonomous), Bengaluru, Karnataka, India
B. Arthi · M. Aruna Department of Computer Science and Engineering, Faculty of Engineering and Technology, SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Kanchipuram, Chennai, Tamil Nadu, India
G. Charlyn Pushpa Latha Department of IT, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_10
143
144
S. Senthil Kumar et al.
1 Introduction The IWSN is one of the most popular WSN applications, and it makes use of numerous sensor nodes. Temperature sensors, sound sensors, vibration sensors, pressure sensors, and other types of sensor nodes are used to monitor and track data. If anything gets out of range, the sensor nodes alert the control system by sending data to it. Here the most significant parameter is time: communication latency must be minimized in an industrial setting, since it is critical to the operation of the business [1]. Because of their low energy capacity, the sensors' energy consumption must be kept to a bare minimum in order to save as much energy as possible throughout the operation; the more efficiently energy is conserved, the longer the network's lifespan. A number of methods are described in the literature for reducing the amount of energy used by data transmission. The use of a sink node is one of the most frequently used methods to improve communication quality in networks [2]. The sink node may be either stationary or mobile. According to Lu et al. and Gungor et al., stationary sink nodes stay in a particular position at all times, and as a result the nodes near the sink node drain more energy due to the data flow [3]. This disadvantage is addressed by the mobile sink, a device that travels around the network and collects data, which reduces the energy consumption of the nodes across the network. To exploit the advantages of mobile sinks, this study utilizes a mobile sink node for data collection [4]. The mobile sink, however, must travel the network to gather data from every single node. This increases the latency and the amount of time it takes for data to reach the sink node.
To cope with this problem, this work makes use of clustering: the nodes are grouped into clusters, each controlled by a node known as the cluster head, which is in charge of multiple member nodes. All of a cluster's member nodes can communicate with the cluster head node, and the cluster head node can communicate with the mobile sink. This concept improves manageability while reducing communication latency to a bare minimum; consequently, energy consumption is decreased and the network's life expectancy is extended. The primary goal of this study is to reduce communication latency while also reducing energy usage in order to extend the network's lifetime. Two critical stages of this study are cluster formation and data collection by a mobile sink node, both of which are discussed in detail below. The mobile sink node accumulates data by reaching out to all the clusters, and the Artificial Bee Colony (ABC) [5] method is used to determine the optimal route for data collection by the mobile sink node. The work's performance is then evaluated in terms of communication throughput, latency, energy usage, time consumption, and network lifespan, among other metrics. Currently existing methods for balancing the energy consumption of devices are too complicated to be used in practice because of the many constraints imposed on WSNs for different purposes, rendering them infeasible for the majority of the
Energy Efficient Data Accumulation Scheme …
145
applications. In a broad variety of real applications, the performance of existing algorithms is significantly hampered by these limitations. For example, when a sensor network with many mobile sinks is built to monitor a remote region and there is a local connected road map for all of the mobile sinks in the monitored zone, there is a problem in scheduling these mobile sinks to collect the abundant data from the devices feasibly so as to extend the system's lifetime, and this problem must be resolved. In this paper, a new constrained optimization problem for wireless sensor networks is formulated, which may be used to solve the energy management optimization problem of many mobile sinks with limited energy availability. An effective approach is then proposed for dealing with the problem, referred to as the Artificial Bee Colony based mobile sink assignment algorithm. Using this method, not only is the assignment between the mobile sinks and sensor nodes balanced, but the energy consumption across the sensor nodes is also stabilized. It is proposed in this article that a mobile wireless sensor network data gathering technique, based on the artificial bee colony procedure, may be utilized to collect data from a mobile wireless sensor network in an efficient and reliable manner. The most significant contributions of this paper are summarized as follows: it discusses the data collection technique for mobile sinks, the cluster head selection problems, and mobile sink route optimization. To improve the efficiency of network data gathering, the route optimization of the mobile sink may be phrased as a shortest path discovery problem in order to maximize data collection efficiency. The artificial bee colony technique may then be used to search for the structure of the optimal solution, namely the shortest route of the mobile sink, in order to find the best solution.
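The cluster-head selection step mentioned above (and in the abstract, where remaining energy decides the cluster head) can be sketched as follows. The node data layout and the `num_clusters` parameter are my assumptions; this is an illustration of energy-balanced clustering, not the authors' exact procedure:

```python
import math

def form_clusters(nodes, num_clusters):
    """Pick the num_clusters nodes with the highest residual energy as
    cluster heads, then attach every other node to its nearest head.
    'nodes' maps node id -> {"pos": (x, y), "energy": joules}."""
    heads = sorted(nodes, key=lambda n: nodes[n]["energy"],
                   reverse=True)[:num_clusters]
    clusters = {h: [] for h in heads}
    for nid, info in nodes.items():
        if nid in heads:
            continue
        nearest = min(heads,
                      key=lambda h: math.dist(info["pos"], nodes[h]["pos"]))
        clusters[nearest].append(nid)
    return clusters
```

The mobile sink then only needs to visit the head of each cluster rather than every node, which is what shortens its route.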
In their natural habitat, honeybees engage in a range of complex behaviours, including mating, breeding, and foraging. It has since become feasible to simulate honeybee behaviour in optimization techniques, which has been done in several different approaches. Because of this complex behaviour, a route optimization technique based on bee colony optimization is employed in the proposed study. The remainder of this article is organized as follows: Sect. 1 gives the introduction. Section 2 surveys important literature in the area of research. Section 3 describes the system model and its components. Section 4 presents the mobility-based energy-efficiency algorithm. Sections 5 and 6 present the numerical results and the conclusions.
2 Literature Survey Many studies have been performed and published in the past on the deployment of mobile sinks for data gathering. An NP-hard issue [6] is a problem in which arranging
nodes into optimum clusters is very difficult to solve successfully. It is worthwhile to consider optimizing the size of the cluster, since this is an efficient way of reducing overall energy consumption. To control the size of a cluster, it is necessary to minimize the distances between member nodes and Cluster Heads (CHs) [7], as well as the amount of energy used by the network as a whole. PSO decreases the distance between member nodes and CHs in order to determine the optimal cluster size. Based on the results of [8], a GA has been utilized to construct the most energy-efficient clusters feasible in wireless sensor networks. The authors of [9] suggested that a self-clustering technique for heterogeneous WSNs be employed in conjunction with a GA to extend the lifespan of the network. Other ways of altering cluster size are used to achieve a variety of goals. The researchers believe that a routing tree that maximizes network lifespan while keeping the routing path between each device and the sink to the bare minimum should achieve this [10]. An energy consumption cost model is presented in [11], with an evaluation of the queries carried out via a routing tree. The aims of the research in [12] include finding the most efficient path for each mobile sink and computing the sojourn duration of each mobile sink along its trajectory in order to maximize the network's lifespan. In [13], the usage of a single mobile sink to extend the life of a network was investigated in more detail. The findings of a recent study published in [14] indicate a method for prolonging the life of mobile base stations while also controlling network demand at the same time. As an example, [15, 16] recommend that the mobile base station be operated in the manner of a greedy algorithm. Joint mobility and routing are suggested [17–19] in the situation of a stationary base station with a restricted range of capabilities.
Finally, the route is controlled via the mobile sink routing protocol [20–22]. Mobile sinks, on the other hand, have a very limited range of practical applications. The fact that a WSN is often situated in densely populated areas means that mobile sinks can only serve a small part of the population. Because of a variety of factors, including road conditions and the vehicles themselves, these autonomous vehicles can only travel at a particular speed. Aside from that, since mobile vehicles have a limited amount of available energy, the greatest distance they can travel on a journey is also limited. These limitations on mobile sinks are taken into account while developing a routing protocol in order to maximize network lifespan while still maintaining speed. The monitored region contains a collection of n randomly deployed homogeneous sensor nodes, represented mathematically by the undirected graph G = (V ∪ MS, E), where V is the set of n randomly deployed sensor nodes in the monitored region and MS is the set of mobile sinks. There are k mobile sinks in MS, and E is the set of links between sensors and sinks [23]. Two sensors u and v are considered linked when their broadcast ranges coincide; otherwise, they are said to be unlinked. When it comes to remote monitoring, mobile sinks are special sensors, since they can both receive sensed information from device nodes and send the sensory data they have collected to an offsite monitoring centre. If each sensor node has a unique identity and a limited initial
energy capacity Q, it is assumed that each mobile sink has a limited energy capacity but an unlimited supply of energy at the depot. To make things easier to understand, it is assumed that there are k mobile sinks in the system at any one point in time. The road map in the monitoring region is represented by R = (Vr, Wr), where each vertex in Vr represents a road junction and each edge in Wr represents a road segment (Fig. 1). Map R can include a maximum of k mobile sinks, each of which has an energy capacity of eM, and as a result each sink is limited in its capacity. Travel, data transmission, and the reception of the supplied information each account for about one-third of the total energy used by each mobile sink: 'et' denotes the sink's energy consumption per unit length of travel, while 'ec' denotes its energy consumption for receiving data at the sojourn sites along the route [24]. On the sensor side, the wireless communication between sensor nodes consumes the vast majority of the energy; the remainder, which comprises sensing and processing, is negligible.
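The network and energy model just described can be sketched in a few lines: edges of G = (V ∪ MS, E) exist within transmission range, and a sink's consumption is split between travel (et per unit length) and data reception (ec per bit) against its budget eM. The function names and the per-bit reception cost are my assumptions:

```python
import math

def build_graph(positions, tx_range):
    """Connectivity edges: two nodes are linked when they lie within
    each other's transmission range."""
    edges = set()
    ids = list(positions)
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            if math.dist(positions[u], positions[v]) <= tx_range:
                edges.add((u, v))
    return edges

def sink_energy(travel_len, bits_received, e_t, e_c):
    """Energy drawn from a mobile sink: e_t per unit length of travel
    plus e_c per bit received at the sojourn sites."""
    return e_t * travel_len + e_c * bits_received

def tour_is_feasible(travel_len, bits_received, e_t, e_c, e_m):
    """A tour is admissible only if it fits in the budget eM."""
    return sink_energy(travel_len, bits_received, e_t, e_c) <= e_m
```

A route planner would reject any candidate tour for which `tour_is_feasible` returns False.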
Fig. 1 Overall flow diagram of the proposed data accumulation scheme (control room, mobile sink node, cluster heads, and numbered node clusters)
3 Materials and Methodology This section describes the proposed data accumulation system, which is based on the stages of clustering and optimal route selection, in more detail. The clustering phase aims to bring the nodes together under the control of a cluster head node, which is in charge of managing the cluster; the task selection method is used to carry out the clustering process. Figure 1 depicts the overall flow diagram of the planned work. As the figure shows, the nodes are first grouped, and the cluster head nodes communicate with the mobile sink node over the optimal route selected by the ABC algorithm. Periodically, the data is sent from the mobile sink to the control room. This concept reduces the computational complexity and the time consumed by data transmission.
3.1 ABC Algorithm ABC is a metaheuristic algorithm, developed by Karaboga, that imitates the natural foraging behaviour of honeybees. The food sources and the numbers of employed and unemployed bees are the essential elements of the ABC algorithm, whose primary goal is to locate the most suitable food source. Food source: a food source is evaluated by taking many factors into consideration, including the distance between the food and the hive, the quality of the food, and the ease with which the energy can be extracted; the best food source is determined by these criteria. Employed bees: the employed bees are responsible for spreading critical information about a food source, chiefly the location of the food supply and its distance from the hive. When a swarm of bees is considered, the employed bees make up half of the swarm, with the other half consisting of onlooker bees. Unemployed bees may be divided into two types: scout bees and onlooker bees. The scout bees search for new food sources in the vicinity of the beehive, while the onlooker bees remain in the hive and use the information obtained from the employed bees to identify a source of food for the colony. The location of a food source corresponds to a candidate solution of the optimization problem, and the quantity of nectar it holds determines the quality of that solution; the fitness function of the algorithm is thus determined by the quality of the food, and each employed bee is concerned with one food source. The standard pseudocode for the ABC algorithm is shown in Table 1.
Table 1 ABC algorithm
Standard Pseudocode of the ABC Algorithm
1: Produce initial population, ip = 1 to MC
2: Calculate the fitness function of the population
3: Fix counter = 1
4: Do
   // Employed bees phase
5: Search for the food source;
6: Calculate the fitness function;
7: Employ greedy selection process;
8: Compute the probability for the food source;
   // Onlooker bees phase
9: Select food source based on the probability values;
10: Generate new food source;
11: Calculate the fitness function;
12: Apply greedy selection process;
   // Scout bees phase
13: If a food source drops out, then swap it with a new food source;
14: Save the best food source;
15: counter += 1;
16: Until counter = MC;
Scout bees are responsible for conducting the first search for food sources. As soon as this phase is completed, the onlooker and employed bees begin to exploit the food sources. A food source becomes depleted after continuous exploitation, and its employed bee is then relegated to the role of scout bee. The number of employed bees is always the same as the number of food sources, since each employed bee is linked with a single food source. Generally speaking, the basic ABC method is divided into an initialization phase followed by the employed, onlooker, and scout bees phases, and the phases are repeated until the maximum number of iterations is reached. Initially, the total number of solutions as well as the control parameters must be determined. The employed bees look for new food sources of higher quality in the vicinity of the old food source to which they were previously attached. Next, the new food source is evaluated for its fitness, and the result is compared to the previous food source with the aid of greedy selection. The gathered information about the food source is disseminated among the onlooker bees present in the hive. As part of their decision-making process, the onlooker bees use a probabilistic method to choose food sources based on the information provided by the employed bees during the second phase. This is followed by the calculation of the fitness function of a food source situated close to the food source selected in the previous stage, and greedy selection again compares the old and new food sources. Finally, when the solutions cannot be improved within a certain number of iterations, the employed bees are promoted to scout bees and the stagnant solutions are discarded. At this point, the scout bees begin their hunt for a new food source, and the bad options are removed from consideration. According to Karaboga and colleagues, all the stages are repeated until the stopping criterion is reached.
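The phases above can be condensed into a compact loop. The following is a minimal, generic Python sketch of ABC for maximizing a positive fitness function, not the authors' implementation; the colony size, scout limit, and cycle count are illustrative defaults.

```python
import random

def abc_optimize(fitness, dim, bounds, n_food=10, limit=20, max_cycles=100):
    """Minimal Artificial Bee Colony maximizer (illustrative sketch)."""
    lo, hi = bounds
    foods = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_food)]
    trials = [0] * n_food
    best = list(max(foods, key=fitness))

    def neighbour(i):
        # Employed/onlooker move: perturb one dimension toward a random partner.
        k = random.randrange(n_food)
        j = random.randrange(dim)
        cand = list(foods[i])
        cand[j] += random.uniform(-1.0, 1.0) * (foods[i][j] - foods[k][j])
        cand[j] = min(max(cand[j], lo), hi)
        return cand

    def greedy(i, cand):
        # Greedy selection: keep the better of the old and new food source.
        nonlocal best
        if fitness(cand) > fitness(foods[i]):
            foods[i], trials[i] = cand, 0
        else:
            trials[i] += 1
        if fitness(foods[i]) > fitness(best):
            best = list(foods[i])

    for _ in range(max_cycles):
        for i in range(n_food):                       # employed bees phase
            greedy(i, neighbour(i))
        total = sum(fitness(f) for f in foods)        # weights for onlookers
        for i in range(n_food):                       # onlooker bees phase
            if random.random() < fitness(foods[i]) / total:
                greedy(i, neighbour(i))
        for i in range(n_food):                       # scout bees phase
            if trials[i] > limit:
                foods[i] = [random.uniform(lo, hi) for _ in range(dim)]
                trials[i] = 0
    return best

# Example: maximize 1/(1 + (x - 3)^2); best[0] converges toward 3.0.
best = abc_optimize(lambda v: 1.0 / (1.0 + (v[0] - 3.0) ** 2),
                    dim=1, bounds=(0.0, 10.0))
```

The greedy comparisons realize steps 7 and 12 of Table 1, and the scout replacement realizes step 13.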
3.2 Proposed Ideal Path Selection by ABC Algorithm
The mobile sink node has to handle multiple cluster head nodes for accumulating the data. Let the cluster head nodes be represented by ϕ = {CH1, CH2, CH3, ..., CHn}. The traversing path of the sink node must be determined so as to reach all the CH nodes. The choice of route is finalized by considering two parameters: the distance between the mobile sink and the CH node, and the energy cost. With these parameters, the fitness function is built as follows.

fv(i) = ER(PT)_i / Dis(i, ms)    (1)
In the above equation, i is the cluster head, ER(PT)_i is the energy required to transmit a message, and Dis(i, ms) is the distance between the cluster head and the mobile sink node. Based on this fitness function, the proposed ABC-based path selection algorithm is presented as follows. The probability of the food source is computed by the following equation.

P(FS) = LN(OS) + α(LN(OS) − LN(NS))    (2)

In the above equation, LN(OS) and LN(NS) are the locations of the old and new cluster heads, and α is a parameter that ranges between −1 and 1.

prob_i = fv_i / Σ_{j=1}^{n} fv_j    (3)

In the above equation, fv_i is the fitness value of the ith cluster head node and n is the total number of cluster head nodes. The proposed ideal path selection algorithm is given in Table 2. Using the algorithm, the ideal path is selected with the help of the energy gain and the distance between the mobile sink and the cluster heads. This idea conserves energy and reduces the communication delay. The performance of the proposed work is evaluated in the following section.
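Equations (1) and (3) can be exercised with a short sketch. The transmission energy and distances below are illustrative assumptions, not values from the paper.

```python
def fitness(er_pt, dis):
    """Eq. (1): fv(i) = ER(PT)_i / Dis(i, ms)."""
    return er_pt / dis

def selection_probabilities(fvs):
    """Eq. (3): prob_i = fv_i divided by the sum of all fitness values."""
    total = sum(fvs)
    return [fv / total for fv in fvs]

# Illustrative: three cluster heads with the same transmission energy
# (50 nJ) at different distances from the mobile sink.
fvs = [fitness(50e-9, d) for d in (10.0, 25.0, 40.0)]
probs = selection_probabilities(fvs)
print([round(p, 3) for p in probs])  # → [0.606, 0.242, 0.152]
```

With equal transmission energy, the nearer cluster heads receive the larger selection probabilities, and the probabilities sum to 1 as Eq. (3) requires.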
Table 2 Proposed ideal path selection algorithm
Proposed Ideal Path Selection Algorithm
Input: Clustered nodes
Output: Ideal path selection
Begin
1: Produce initial population, ip = 1 to MC
2: Calculate the fitness function of the population by Eq. (1)
3: Fix counter = 1
4: Do
   // Employed bees phase
5: Search for the food source;
6: Calculate the fitness function by Eq. (1);
7: Employ greedy selection process;
8: Compute the probability for the food source by Eq. (3);
   // Onlooker bees phase
9: Select food source based on the probability values;
10: Generate new food source;
11: Calculate the fitness function;
12: Apply greedy selection process;
   // Scout bees phase
13: If a food source drops out, then swap it with a new food source;
14: Save the best food source;
15: counter += 1;
16: Until counter = MC;
3.3 Analysis and Results
The performance of the work is evaluated by implementing it in NS2 on a standalone computer. The parameters chosen to carry out the simulation are presented in Table 3.

Table 3 Simulation parameters

Parameter                      Settings
Network area                   200 × 200
Initial energy of nodes        2 J
Communication radius           50 m
Energy for data transmission   50 nJ/bit
Speed of mobile sink           2 m/s
The performance of the proposed work is analyzed in terms of throughput, latency, energy consumption, time consumption, and network lifetime. The results attained by the proposed work are compared with the existing approaches DDRP and SGDD. From Figs. 2, 3, 4 and 5, the proposed work's throughput, average delay, energy consumption, and network lifespan can be determined. The experimental findings demonstrate that the suggested approach increases throughput and network lifespan while reducing latency and energy consumption to an acceptable level. The total amount of data sent in a particular period of time is referred to as throughput; regardless of the data transfer method used, the throughput must be as high as possible. The latency of data transmission should be kept to a minimum for the data to be delivered on time. Owing to the demands of industrial applications, it is essential to guarantee that data transmissions have the shortest possible latency. This study achieves minimum latency via the application of two concepts: the use of mobile sink nodes and the selection of the optimal route for the sink nodes to travel to and from the cluster head nodes.

Fig. 2 Throughput analysis

Fig. 3 Average latency analysis

Fig. 4 Energy consumption analysis

Fig. 5 Network lifetime analysis

After that, the energy consumption of the proposed scheme is evaluated against the simulation duration. Energy consumption of the nodes increases with time, as shown in Fig. 4. When the suggested method is compared with the existing ones, it uses the least amount of energy, owing to the optimal route selection that takes both the energy gain and the distance metrics into account. As the energy usage decreases, the network's lifespan is extended. The lifespan of a network is assessed in terms of the number of nodes that are still operational. About 180 nodes remain alive in the proposed work at the end of the 500th second, while the competing methods have 124 and 163 alive nodes, respectively. The time taken for the final node to die is calculated and reported in Table 4.
Table 4 Network lifetime analysis with respect to time

Sensor count   DDRP (s)   SGDD (s)   Proposed (s)
100            2134       2304       2473
200            2581       2864       3021
300            2867       3215       3543
400            3142       4046       4873
500            3314       4628       6043
600            4017       5537       6943
700            4324       6242       8261
800            6219       7634       10,262
The proposed work achieves a total network lifetime of 10,262 s, while the competing works DDRP and SGDD last 6219 s and 7634 s, respectively. According to the experimental findings, the suggested work has a longer lifespan as a consequence of the use of mobile sink nodes and the optimal route selection concept.
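For the 800-node row of Table 4, the relative lifetime gains of the proposed scheme over DDRP and SGDD work out as follows (a quick check of the figures quoted above):

```python
# Network lifetime at 800 nodes, in seconds, from Table 4.
ddrp, sgdd, proposed = 6219, 7634, 10_262

for name, base in (("DDRP", ddrp), ("SGDD", sgdd)):
    gain = (proposed - base) / base * 100  # relative improvement in percent
    print(f"vs {name}: +{gain:.1f}%")
# → vs DDRP: +65.0%
# → vs SGDD: +34.4%
```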
4 Conclusion
This work describes an energy-efficient data accumulation strategy for IWSNs that uses a mobile sink to collect information. The process is based on clustering and optimal route selection: the nodes are grouped with the assistance of the task selection algorithm, and the ABC algorithm is used to determine the optimal route for the mobile sink to take in order to reach the cluster heads. The work's overall performance is superior in terms of throughput, average delay, energy consumption, and network lifespan.
References
1. G. Xing, M. Li, T. Wang et al., Efficient rendezvous algorithms for mobility-enabled wireless sensor networks. IEEE Trans. Mob. Comput. 11(1), 47–60 (2012). https://doi.org/10.1109/TMC.2011.66
2. Y. Yue, J. Li, H. Fan et al., Optimization-based artificial bee colony algorithm for data collection in large-scale mobile wireless sensor networks. J. Sens. 2016, Article ID 7057490, 12 pp. (2016)
3. L. Malathi, R.K. Gnanamurthy, K. Channdrasekaran, Energy efficient data collection through hybrid unequal clustering for wireless sensor networks. Comput. Electr. Eng. 48, 358–370 (2015). https://doi.org/10.1016/j.compeleceng.2015.06.019
4. S.J. Tang, J. Yuan, X.Y. Li et al., DAWN: energy efficient data aggregation in WSN with mobile sinks, in Proceedings of the IEEE 18th International Workshop on Quality of Service (IWQoS '10) (IEEE, Beijing, China, 2010), pp. 1–9
5. M. Ma, Y. Yang, SenCar: an energy-efficient data gathering mechanism for large-scale multihop sensor networks. IEEE Trans. Parallel Distrib. Syst. 18(10), 1476–1488 (2007). https://doi.org/10.1109/TPDS.2007.1070
6. S. Basagni, A. Carosi, E. Melachrinoudis et al., Controlled sink mobility for prolonging wireless sensor networks lifetime. Wirel. Netw. 14(6), 831–858 (2008). https://doi.org/10.1007/s11276-007-0017-x
7. S. Basagni, A. Carosi, E. Melachrinoudis et al., A new MILP formulation and distributed protocols for wireless sensor networks lifetime maximization, in Proceedings of the IEEE International Conference on Communications (ICC '06) (IEEE, Istanbul, Turkey, 2006), pp. 3517–3524
8. H.T. Nguyen, L. Van Nguyen, H.X. Le, Efficient approach for maximizing lifespan in wireless sensor networks by using mobile sinks. ETRI J. 39(3), 353–363 (2017). https://doi.org/10.4218/etrij.17.0116.0629
9. J. Luo, J.-P. Hubaux, Joint sink mobility and routing to maximize the lifetime of wireless sensor networks: the case of constrained mobility. IEEE/ACM Trans. Netw. 18(3), 871–884 (2010). https://doi.org/10.1109/TNET.2009.2033472
10. J. Luo, J. Panchard, M. Piórkowski et al., Routing towards a mobile sink for improving lifetime in sensor networks, in Proceedings of Distributed Computing in Sensor Systems: 2nd IEEE International Conference, DCOSS 2006, vol. 4026 (San Francisco, CA, USA, 2006)
11. B. Bhushan, G. Sahoo, E2SR2: an acknowledgement-based mobile sink routing protocol with rechargeable sensors for wireless sensor networks. Wirel. Netw. 25(5), 1–25 (2019). https://doi.org/10.1007/s11276-019-01988-7
12. R. Mitra, S. Sharma, Proactive data routing using controlled mobility of a mobile sink in wireless sensor networks. Comput. Electr. Eng. 70, 21–36 (2018). https://doi.org/10.1016/j.compeleceng.2018.06.001
13. J. Wang, J. Cao, R.S. Sherratt et al., An improved ant colony optimization-based approach with mobile sink for wireless sensor networks. J. Supercomput. 74(12), 6633–6645 (2018). https://doi.org/10.1007/s11227-017-2115-6
14. R. Akl, U. Sawant, Grid-based coordinated routing in wireless sensor networks, in 4th IEEE Consumer Communications and Networking Conference (CCNC '07) (2007), pp. 860–864
15. S. Sharma, On energy efficient routing protocols for wireless sensor networks. Ph.D. thesis, National Institute of Technology Rourkela (2016)
16. V. Arulkumar, C. Selvan, V. Vimal Kumar, Big data analytics in healthcare industry: an analysis of healthcare applications in machine learning with big data analytics. IGI Glob. Big Data Anal. Sustain. Comput. 8(3) (2019)
17. V. Arulkumar, C. Puspha Latha, D. Dasig Jr., Concept of implementing big data in smart city: applications, services, data security in accordance with Internet of Things and AI. Int. J. Recent Technol. Eng. 8(3)
18. V. Arulkumar, M.A. Lakshmi, B.H. Rao, Super resolution and demosaicing based self learning adaptive dictionary image denoising framework, in 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS) (2021), pp. 1891–1897. https://doi.org/10.1109/ICICCS51141.2021.9432182
19. V. Arulkumar, An intelligent face detection by corner detection using special morphological masking system and fast algorithm, in 2021 2nd International Conference on Smart Electronics and Communication (ICOSEC) (2021), pp. 1556–1561
IoT Based Automatic Medicine Reminder
Ramya Srikanteswara, C. J. Rahul, Guru Sainath, M. H. Jaswanth, and Varun N. Sharma
Abstract Monitoring healthcare is a major issue that requires attention. In underdeveloped countries, the number of nurses per patient is relatively low, and the availability of 24-h medical supervision is also uncertain, resulting in easily avoidable deaths as well as urgent situations and causing a disturbance in the health sector. The medical dispensers that are currently available are expensive, and devices that combine a reminder and a dispenser are hard to come by. The major goal of the medicine reminder is to automatically transmit an alarm and dispense medicine to the correct individual at the stated time from a single machine. An automatic medicine dispenser is developed for persons who take medicine without expert direction. It can be used by a single patient or by a group of patients, and it relieves the individual of the error-prone job of taking the incorrect medicine at the incorrect time. The main goal is to keep the device simple to use and affordable. The working software is reliable and stable. The older population will benefit greatly from the device because it can substitute for expensive medical care and the money spent on a personal nurse.

Keywords Internet of Things · Medical update frameworks · GSM module · Arduino
1 Introduction
Medicines were not required as much in years past [1–6], but now, in our day-to-day lives, most individuals are required to take their prescribed medicine at a specified time, since diseases are on the rise. As a result, the majority of people come into contact with these diseases sooner or later. Among these fast- or slow-spreading diseases, some are not permanent, while many others are permanent, life-threatening diseases [7]. Human life expectancy is reduced as a result of various disorders. To have a better life, patients have to take the prescribed medicines at the prescribed time

R. Srikanteswara (B) · C. J. Rahul · G. Sainath · M. H. Jaswanth · V. N. Sharma
Nitte Meenakshi Institute of Technology, Bengaluru, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_11
Fig. 1 Medicine reminder pictorial representation
[8]. They are advised by the doctor to take the intended medicines in the desired doses. Patients therefore have problems such as forgetting to take medicine at the correct time, and when the doctor changes or updates the prescription, patients have to remember the new list of medicines [9]. According to studies, people who take prescription medications, whether for a simple virus or something more serious, only take half of the required doses [10] (Fig. 1). In addition, one-third of kidney transplant recipients fail to take anti-rejection medications, and 41% of heart attack patients fail to take blood pressure medications. This non-compliance is predicted to result in 125,000 deaths and at least 10% of hospitalizations each year, costing healthcare systems billions of dollars [11]. Medication administration necessitates meticulous attention to detail and concentration. It may be difficult for some people to achieve this on their own, hence a medicine dispensing model would be useful. Many errors occur during the administration of drugs, according to studies [12]. Drug errors can also be caused by medical personnel, putting patients' health and even lives in jeopardy. Hence we have proposed an Arduino-Uno-based smart medicine box that leverages a real-time clock to reduce difficulties such as taking medicines at the wrong time, taking the wrong dose, and unknowingly taking expired medications, all of which result in unnecessary health complications. In addition to the existing medicine reminder functionality, the newly added feature in our project is that the system is capable of sensing whether the patient has taken the medicine or not, so the patient will not forget to take the tablet at the prescribed time [13]; this smart pillbox also assembles the response from the caretaker or the patient and sends the purchase order to the pharmacy [14].
The user can set the distribution range of the pills and the number of pills at each interval using the application and onboard keys.
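The per-interval settings entered through the application and onboard keys can be modelled as a simple mapping from dispensing time to pill count. The schedule layout below is an illustrative assumption, not the authors' data format.

```python
# Illustrative schedule: dispensing time (HH:MM) -> number of pills due.
schedule = {"08:00": 2, "14:00": 1, "20:00": 2}

def pills_due(now_hhmm, schedule):
    """Pills to dispense at the given time, or 0 if no interval matches."""
    return schedule.get(now_hhmm, 0)

print(pills_due("14:00", schedule))  # → 1
print(pills_due("09:30", schedule))  # → 0
```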
2 Background
The current framework is based on the Android operating system, which prompts the client to take the prescribed medicine on time by displaying a notification and ringing an alarm. Android is a Linux-based operating system designed specifically for touchscreen devices such as smartphones, tablets, and PCs, developed by Google in collaboration with the Open Handset Alliance. Android was created from the beginning to allow developers to create flexible applications that take advantage of all of the features the handset brings to the table. The framework is built on the Android operating system because Android is the most popular platform in the business. Android also provides an application development framework, which includes services for creating graphical user interface applications and accessing data, as well as an application programming interface for application development. The framework is designed to make component reuse and composition more efficient. A mandatory XML manifest file is used in the development of Android apps; its declarations are packaged with the application at build time. This file gives fundamental information to the Android platform for dealing with an application's life cycle, such as declarations of the application's components along with other structural and configuration properties. Activities, Services, Broadcast Receivers, and Content Providers are the main categories of such components. After logging in, the patient/user will be able to view a list of all the registered specialists, including their names, contact information, phone numbers, medical clinic location, availability of specialists as needed, and any other information that the specialist registers at the time of logging in to the framework. Users can get a drop-down list for the disease and can easily go through the doctors' reviews.
It also displays the date and time of the next doctor's appointment. This helps patients/users understand the doctor's diagnosis, and the management guides them in following the framework correctly to make it useful and valuable. Prescription updates aid in the reduction of drug administration errors and mismeasurements. The Android interface allows simple changes to medication intervals and also provides a notification to the caregiver if the medication is not provided on time. Setting an alarm and receiving notifications are the two aspects of the update framework. If the patient does not take the pill, the alert sounds louder and the color of the LED strip changes. After a specified amount of time, notifications are sent to the patient on the Android app to dispense and administer the pill and turn off the alarm [15]. With rapid implementations in automotive, home automation, smart cities, and other areas where everything connects to the cloud and makes one's life easier, the Internet of Things is attracting the interest of a lot of consumers and the enterprise electronics market. Developers have access to a number of power-efficient and low-cost sensors for use in a variety of applications [16, 17]. There are a few existing systems that use IoT concepts and collect data on the type and timing of pill consumption by patients
and store it in cloud storage, where it is analyzed through various applications. In addition, numerous separate pill dispensers are linked together to form a network of pill dispensers that can be monitored via the Internet of Things. The status of each medicine dispenser may be examined in real time via a web interface or an Android interface, which provides useful information on the treatments and medications the patients are receiving. Hospitals and other organizations can easily keep track of their patients using these technologies. The network of pill dispensers provides vital data to hospitals, allowing them to deliver the best possible service to their patients [1].
2.1 Cloud Computing and IoT
Cloud computing is based on resource sharing, which is essential for IoT platforms. Cloud computing is not only about sharing resources but also about maximizing them. It is also location independent; customers can access cloud services over an Internet connection from any location and with any device, and an Internet of Things platform should likewise be accessible from anywhere at any time. Another important feature is the virtualization of physical devices; virtualization allows users to effortlessly share gadgets. The multitenancy characteristic of cloud computing allows multiple users to share resources across space and time. Furthermore, the cloud provides elasticity and scalability of resources and applications, as well as easy access and availability of services and resources. As a result, the confluence of Cloud and IoT has the potential to provide enormous opportunities for both technologies [18] (Figs. 2 and 3).

Fig. 2 Cloud and IoT
Fig. 3 Existing system
3 Proposed System
The primary purpose is to assist the patient in taking prescribed drugs and to prevent missed doses due to neglect or poor care. When the pills are not retrieved from the tray, the medicine reminder system illuminates an LCD, sounds an alarm, and sends a notification to an Android application [1]. The patient must take the specified dose from the medicine reminder box on time; otherwise, the system will continue to generate the sound, considerably louder than before, until the patient takes the medicine out of the box. This kind of alerting feature increases the life years of the person [2]. It can be of massive benefit to elderly people and also to people diagnosed with chronic diseases, who are required to take medicines at regular intervals [2]. The medicine reminder architecture consists of a medicine box with a set of different columns, one for each person, so that each person can use it regularly for their medicine. The control structure of the medicine box has LEDs for notifying the patient, in the form of alarms, to take their medicine properly. A ringer in the architecture notifies the patient with an alarm sound. The alarm rings for a particular specified time, within which the person must take the tablet and press the button; otherwise, a message is sent via GSM to report that the patient has not consumed the tablet at the time prescribed by the doctor [19]. The ringer and LEDs give the reminder at the particular time set by the family. The medicine reminder system thus regularly analyzes the patient's health by using the IoT (Internet of Things), and it can also record the patient's daily dosage level of the tablets. Each medicine has its own timing, which is compared with the real-time clock; when the stored timing matches the clock, the buzzer gives the alarm sound, reminding the patient to take the medicine. The patient's health data and the daily prescribed medicines to consume can be recorded [20] (Figs. 4 and 5).
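The alarm-then-escalate behaviour described above can be sketched as a small, hardware-free simulation. The function below is an illustrative model, not the authors' firmware; the 60-second escalation window and the action names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Dose:
    time_hhmm: str      # scheduled time, e.g. "08:00" (zero-padded HH:MM)
    taken: bool = False  # set when the patient presses the button

def reminder_tick(dose, now_hhmm, seconds_ringing, escalate_after=60):
    """One pass of the reminder loop; returns the action the box would take."""
    if dose.taken or now_hhmm < dose.time_hhmm:
        return "idle"
    if seconds_ringing < escalate_after:
        return "buzzer+led"          # normal alarm at the scheduled time
    return "gsm-alert"               # no response: notify the caretaker via GSM

dose = Dose("08:00")
print(reminder_tick(dose, "07:59", 0))   # → idle
print(reminder_tick(dose, "08:00", 10))  # → buzzer+led
print(reminder_tick(dose, "08:05", 90))  # → gsm-alert
```

Zero-padded HH:MM strings compare correctly with plain string ordering, which keeps the sketch free of time-parsing code.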
Fig. 4 Proposed system
Fig. 5 Proposed block diagram
4 System Requirements
• ESP32 CP2102 WiFi Bluetooth Development Board
• LCD (Liquid Crystal Display)
• I2C protocol
• Arduino board
• GSM (Global System for Mobile Communications)
• RTC (Real-Time Clock)
• Switch
• Buzzer

5 Hardware Requirements
• Arduino
• APR speaker
• LCDs
• IR sensor
• DC motor
• H-bridge
• NodeMCU
• Buzzer
6 Hardware Used

6.1 Arduino Board
Arduino is an open-source electronics platform producing single-board microcontrollers and microcontroller kits for building digital devices. It includes both a programmable circuit board and a software program, or IDE (Integrated Development Environment), that runs on a PC and is used to write and upload code to the physical board. The Arduino platform has become popular with people just starting out in electronics, and for good reason. Unlike most previous programmable circuit boards, the Arduino does not need any separate hardware (referred to as a programmer) in order to load new code onto the board; you can simply use a USB cable. Furthermore, the Arduino IDE uses a simplified dialect of C++, which makes it easier to learn to program. Finally, it provides a standard form factor that breaks out the capabilities of the microcontroller into a more accessible package [21]. The Arduino specifications are shown in Fig. 6 and Table 1.
Fig. 6 Pin specification

Table 1 Arduino specifications

Microcontroller              ATmega328P
Operating voltage            5 V
Input voltage                7–12 V
Input voltage limit          6–20 V
Digital I/O pins             14 (of which 6 provide PWM output)
Analogue input pins          6
DC current per I/O pin       20 mA
DC current for 3.3 V pin     50 mA
Flash memory                 32 KB (of which 0.5 KB used by bootloader)
SRAM                         2 KB
EEPROM                       1 KB
Clock speed                  16 MHz
Length                       68.6 mm
Width                        53.4 mm
Weight                       25 g
6.2 GSM (Global System for Mobile Communications)
The GSM (Global System for Mobile Communications)/GPRS (General Packet Radio Service) modem used is the SIM900, an RS232 quad-band GSM/GPRS modem operating at 850, 900, 1800, and 1900 MHz. It is very small in size and easy to use. The modem provides 3.3 V and 5 V DC TTL interfacing hardware, which allows users to seamlessly interface it with 5 V microcontrollers (PIC, AVR, 8051, Arduino, and so on) and 3.3 V microcontrollers (ARM, ARM Cortex, and many more). The modem can also interface with a microcontroller through a USART (Universal Synchronous Asynchronous Receiver Transmitter) serial connection [22].
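At the firmware level, sending the missed-dose SMS through a SIM900-class modem is a short AT-command exchange. The sketch below only builds the standard text-mode command sequence (AT+CMGF, AT+CMGS); actually writing it over the serial/USART link is left out, and the phone number and message are illustrative.

```python
def sms_commands(number, text):
    """AT-command sequence for sending one SMS in text mode (SIM900-style)."""
    return [
        "AT+CMGF=1",                  # select SMS text mode
        f'AT+CMGS="{number}"',        # start a message to the recipient
        text + "\x1a",                # body terminated by Ctrl+Z (0x1A)
    ]

cmds = sms_commands("+911234567890", "Dose missed: morning tablet")
print(cmds[0])  # → AT+CMGF=1
```

Each command would be written to the modem followed by CR, waiting for the modem's `>` prompt before the message body.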
6.3 LCD (Liquid Crystal Display)
An LCD (Liquid Crystal Display) is a flat-panel display technology commonly used in televisions and computer monitors, as well as in mobile devices such as laptops and smartphones. LCDs come in a wide range of display sizes compared to the CRT (cathode ray tube), and they were an essential leap over the display technologies around them, which include light-emitting diode (LED) and plasma displays. Small LCD screens are common in LCD projectors and portable consumer gadgets such as digital cameras, watches, digital clocks, calculators, and smartphones. The LCD has replaced the bulky cathode ray tube (CRT) in most applications [23].
6.4 RTC (Real-Time Clock)
RTC stands for Real-Time Clock. RTC modules are electronic devices that keep track of the time and date. They have a battery arranged inside, which keeps the module running even in the absence of external power and keeps the time and date updated, so the correct time and date can be obtained from the RTC module whenever needed. For the DS3231 module used here, ready-made libraries are the best option; making use of libraries keeps the device code simple. These libraries must be downloaded and called in the programs. Once the header file is included, the controller communicates with the module and obtains the date and time; the alarm can also be set or modified through the libraries. In addition, when the main power goes down, the RTC chip automatically draws current from the battery connected to it, so the time keeps being maintained. Later, when the device restarts, the controller can get the present working time from the module without any error [24].
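Polling the RTC and comparing it against a stored alarm time can be modelled as follows. The alarm time, one-minute window, and function names are illustrative assumptions, and the RTC read is replaced by a fixed datetime.

```python
import datetime

ALARM = datetime.time(8, 0)  # alarm time set by the family (illustrative)

def alarm_due(now, alarm=ALARM, window_s=60):
    """True while the current RTC reading falls inside the alarm window."""
    due = datetime.datetime.combine(now.date(), alarm)
    return 0 <= (now - due).total_seconds() < window_s

# In firmware, `now` would come from the DS3231 over I2C each loop iteration.
now = datetime.datetime(2022, 1, 1, 8, 0, 30)
print(alarm_due(now))  # → True
```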
7 Results
The medicine reminder system is useful for the users who need it and all related users. We conclude that our project is very useful for people who take medicines regularly, for whom the list of medicines is long and hard to remember. Our product also promotes a stress-free life for both the patients and their loved ones, which in turn results in a healthier life. Since the device takes over the care of such patients' medication, a caretaker has no tension about their health, and both can live a healthy and tension-free life [25].
7.1 Unit Testing
See Figs. 7, 8 and 9 and Tables 2, 3, 4 and 5.
Fig. 7 Working system
Fig. 8 LCD display
8 Conclusion
This medicine reminder aims to prevent an unhealthy and stressful life for the patients or users who take medicines regularly, and to offer the system at a reasonable and low cost. Our project can replace other medicine boxes that have only alerting systems and are unusable or expensive compared to our product [25]. This pill dispensing system is also very easy to understand, user-friendly, and sustainable, so that the older generation can use this product without the help of a tech-savvy person. The product has good scalability, is very reliable, and also sends regular notifications about the patient's well-being through the Android application to their loved ones, hence they can live a worry-free life.
Fig. 9 Message displaying through the Blynk app
Table 2 Unit test case for LCD
Table 3 Unit test case for RFID
Table 4 Unit test case for GSM
Table 5 Integration test case for RFID card and LCD display

Sl # test case: UTC-1
Name of test: LCD testing
Items being tested: LCD
Sample input: Power supply
Expected output: LCD should display "Welcome to pillbox" message
Actual output: LCD displays "Welcome to pillbox"
Remarks: Pass

Sl # test case: UTC-2
Name of test: Checking of DC motor
Items being tested: DC motor
Sample input: Run the DC motor
Expected output: DC motor should run in both directions
Actual output: First it rotates forward, then in reverse
Remarks: Pass

Sl # test case: UTC-3
Name of test: NodeMCU testing
Items being tested: NodeMCU module
Sample input: Power supply; send the message
Expected output: Message should be sent to the Blynk app given in the program
Actual output: Message sent to the Blynk app
Remarks: Pass

Sl # test case: ITC-1
Name of test: Working of Arduino and LCD
Items being tested: DC motor and LCD display
Sample input: Pillbox
Expected output: LCD should display "morning tablet"
Actual output: LCD displays "morning tablet"
Remarks: Pass
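The tabulated test cases can also be expressed as automated checks. A minimal sketch, where `lcd_boot_message` and `dispense_slot` are hypothetical stand-ins for the real LCD and motor drivers:

```python
# Sketch of the tabulated test cases as automated checks.
# lcd_boot_message() and dispense_slot() are hypothetical stand-ins
# for the real hardware drivers, which would talk to the LCD and
# the DC motor over the microcontroller.

def lcd_boot_message():
    # The real driver would write to the LCD on power-up.
    return "Welcome to pillbox"

def dispense_slot(slot):
    # The real driver would rotate the DC motor; here we only echo
    # the message the LCD should show for the slot.
    messages = {"morning": "morning tablet", "night": "night tablet"}
    return messages[slot]

def test_utc1_lcd_boot():
    # UTC-1: LCD should display "Welcome to pillbox" on power supply.
    assert lcd_boot_message() == "Welcome to pillbox"

def test_itc1_lcd_and_motor():
    # ITC-1: LCD should display "morning tablet" when the morning
    # slot is dispensed.
    assert dispense_slot("morning") == "morning tablet"

test_utc1_lcd_boot()
test_itc1_lcd_and_motor()
```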
References

1. M.V. Moise, P.M. Svasta, A.G. Mazare, Programmable IoT pills dispenser, in 2020 43rd International Spring Seminar on Electronics Technology (ISSE) (2020), pp. 1–4. https://doi.org/10.1109/ISSE49702.2020.9121107
2. B. Ayshwarya, R. Velmurugan, Intelligent and safe medication box in health IoT platform for medication monitoring system with timely reminders, in 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS) (2021), pp. 1828–1831. https://doi.org/10.1109/ICACCS51430.2021.9442017
3. B. Jimmy, J. Jose, Patient medication adherence: measures in daily practice. Oman Med. J. (2019)
4. S. Bryant, Principles: medicine dispensary. IOSR J. Electr. Electron. 1.03(6) (2014). Retrieved February
5. D. Raskovic, T. Martin, E. Jovanov, Medical reminder applications for wearable computing. PC J. 47(4), 498–506 (2020)
6. Google, "Activity". Retrieved Jan 2014. Available: Google, "Fragments". Retrieved Jan 2019. Available: Google, "Fragment Transaction"
7. H.-W. Kuo, Research and implementation of intelligent medical box, M.S. thesis, Department of Electrical Engineering, I-Shou University, Kaohsiung, TW (2017)
8. N.B. Othman, P.E. Ong, Pill dispenser with alarm through smart phone notification using IoT, in IEEE 5th Global Conference on Consumer Electronics (2016)
9. D.H. Mrityunjaya, K.J. Uttarkar, B. Teja, K. Hiremath, Automatic pill dispenser. Int. J. Adv. Res. Comput. (2018)
10. M. Viswanathan, C.E. Golin, C.D. Jones, M. Ashok, S.J. Blalock, R.C. Wines et al., Interventions to improve adherence to self-administered medications for chronic diseases in the United States: a systematic review. Ann. Intern. Med. 157(11), 785–795 (2012)
11. L. Rosenbaum, W.H. Shrank, Taking our medicine—improving adherence in the accountability era. N. Engl. J. Med. 369, 694–695 (2013)
12. "JONA". J. Nurs. Adm. 39(5), 204–210 (2009)
13. G.B.-Z. Joram (3 Dec 2010), Online smart pill container dispensing system, US 20090299522A1
14. S. Shinde, T. Kadaskar, P. Patil, R. Barathe, A smart pill box with remind and consumption using IoT. Int. Res. J. Eng. Technol. 4(12), 152–154 (2017)
15. A. Uttamrao (25 Jan 2015), Intelligent medication container, US 7877268 B2
16. C.G. Rodriguez-Gonzalez, A. Herranz-Alonso, V. Escudero-Vilaplana, M.A. Ais-Larisgoitia, I. Iglesias-Peinado, M. Sanjurjo-Saez, Robotic dispensing improves patient safety, inventory management, and staff satisfaction in an outpatient hospital pharmacy. J. Eval. Clin. Pract. 25(1), 28–35 (2019)
17. F. Mattern, C. Floerkemeier, From the internet of computers to the internet of things. ETH Zurich, pp. 1–6, Oct 2016
18. M.V. Moise, A.-M. Niculescu, A. Dumitrașcu, Integration of internet of things technology into a pill dispenser, in 2020 IEEE 26th International Symposium for Design and Technology in Electronic Packaging (SIITME) (2020), pp. 270–273. https://doi.org/10.1109/SIITME50350.2020.9292283
19. A.R. Biswas, R. Giaffreda, IoT and cloud convergence: opportunities and challenges, in 2014 IEEE World Forum on Internet of Things (WF-IoT) (2014), pp. 375–376. https://doi.org/10.1109/WF-IoT.2014.6803194
20. E. Peel, M. Douglas, J. Lawton, Self monitoring of blood glucose in type 2 diabetes: longitudinal qualitative study of patients' perspectives. BMJ 335(7618), 493 (2007)
21. K. Bhavya, B. Ragini, A smart medicine box for medication management using IoT. Department of Engineering, Karpaga Vinayaga College of Engineering and Technology, Padalam, Tamilnadu, India
22. J.P. Solanke, S.K. Lakshman, Smart medicine and monitoring system for secure health using IoT
23. M.V. Kondawar, A. Manusmare, Review on smart pill box monitored through internet with remind, secure and temperature controlled system
24. B. Jimmy, J. Jose, Patient medication adherence: measures in daily practice. Oman Med. J. (2016)
25. S. Bryant, OO principles: encapsulation and decoupling. IOSR J. Electr. Electron. 1.03(6) (2014). Retrieved January
IoT Based Framework for the Detection of Face Mask and Temperature, to Restrict the Spread of Covids Ramya Srikanteswara , Anantashayana S. Hegde, K. Abhishek, R. Dilip Sai, and M. V. Gnanadeep
Abstract Coronavirus is a novel virus responsible for causing the deadly disease COVID-19. It was first detected in December 2019 in the city of Wuhan, China, and due to its contagious nature it has infected people all over the world. COVID spreads quickly and easily through the air from one person to another, affecting much of the world's population in a short span. Wearing a mask in public is therefore a necessary preventive measure against the viral disease. Moreover, body temperature is an important factor in determining whether an individual may be affected by the virus. Manually checking whether a person wears a mask outdoors, or measuring the temperature of each individual in a crowded area, is a tedious task and urgently needs a solution. Internet of Things technology is well suited to this Covid-19 situation because it can transmit data without any human interaction. This article provides a road map for how this technology can be utilized for a better cause. In this work, an IoT based framework is designed to restrict the entry of a Covid affected individual into the premises, by detecting whether a mask is worn and whether the temperature is normal, thereby avoiding the spread of the disease. Moreover, this method protects the safety of the staff performing checks at the entry point. Keywords Corona-virus · COVID-19 · Deep-learning · Face-mask-detection · Temperature detection · Tensor-Flow · Automatic functioning of gate
1 Introduction COVID is a contagious respiratory disease caused by the Severe Acute Respiratory Syndrome CoronaVirus (SARS-CoV). COVID is currently spreading quickly in all countries across the world, having affected over 181 million people [1], with 3.9 million deaths recorded, based on the report from the WHO (World Health Organization) on 28 June 2021. In order to avoid a worldwide disaster, a wise and clear-cut method R. Srikanteswara (B) · A. S. Hegde · K. Abhishek · R. D. Sai · M. V. Gnanadeep Nitte Meenakshi Institute of Technology, Bengaluru, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_12
to prevent the spread of COVID is urgently desired globally. A few familiar human coronaviruses [2] are NL63, HKU1, OC43 and 229E. Earlier similar viruses, such as SARS-CoV, 2019-nCoV and MERS-CoV, weakened people and affected wildlife, and have since evolved into human coronaviruses [3]. People who have respiratory difficulties can infect anyone near them through their infectious droplets, so the surroundings of infected people spread the virus even without close contact. To curb the transmission [4], wearing a face mask when going to any public place is very much necessary, and the WHO stresses the wearing of face masks as a priority, as suggested by health care assistants. The reason for using the Internet of Things (IoT) for this model is that IoT can transmit data without any human intervention. It is an intelligent system which can collect data through cameras and sensors, analyze the data, and send it anywhere in the world. IoT can merge with other technologies like deep learning, machine learning and AI to combat COVID-19. Face mask detection has become a significant job in international civilization. Mask recognition includes finding and locating the face in the image and then the crucial step of identifying whether the detected face is covered with a mask or not; the underlying problem is object detection. The face detection process deals with various characteristics to identify the face, like identifying colors, locating key points, joining the points to identify objects, and recognizing objects even when they vary in size, and so on [5]. Accuracy is essential in Deep Learning, and TensorFlow is used for the computation. For face detection, a CNN model trained on a dataset of more than 60 k images of faces with or without masks is utilized.
During training, the images are segregated as with mask and without mask. Next, a testing model is created which checks the given image. The image processing is done by extracting images bit by bit. The proposed method provides a simplified approach toward face mask detection by using Deep Learning packages like TensorFlow, Keras and OpenCV. The main objectives of this proposed work are as follows: • Allowing a person to enter the premises only if he has a normal body temperature and is wearing a mask, and automatically opening the gate only if these conditions are satisfied. • Ensuring the safety of the security person in charge, as a security person near the gate may not be required.
2 Related Background Covid has become one of the most life-threatening diseases. Many techniques are used to evade the virus; one of them is face detection. A face is recognized from a picture by several different attributes. Consistent with [6], analysis into the detection of faces needs identification of expressions, following
the face, and pose estimation. Identifying the face in an image is the major challenge, as faces vary in color, size and shape and are not absolute. It becomes more difficult with unclear pictures, and is further hindered by low quality capturing cameras. The authors of [7] say that masked face detection has two major challenges: (1) sizable datasets containing pictures of masked and unmasked faces are not available; (2) facial expressions are excluded in the occluded region. Using the locally linear embedding rule and dictionaries trained on a large set of masked faces and synthesized normal faces, many missing facial expressions can often be recovered, and the dominance of facial cues can be lessened to a fine degree. Consistent with the work reported in [8], there is a strict constraint that comes with Convolutional Neural Networks (CNN) in computer vision regarding the size of the input image. Before feeding the pictures into the network, the pictures are resized in order to overcome this limitation. Generally, identifying a face in a picture accurately and determining whether the person is wearing a mask is a major challenge; however, the proposed method identifies a face wearing a mask even in motion. Although various methods exist, there is a need for a cost effective and efficient method for preventing the spread of Covid. The proposed method considers both these important factors.
3 Proposed Work Dataset In the proposed method, two datasets are used for testing the technique. Dataset 1 [9] contains 1918 pictures of people wearing face masks and 1915 pictures of people not wearing face masks. The captured images are single-face pictures in various surroundings, with a variety of masks of different colors being worn, as shown in Fig. 1. Dataset 2 [10] contains 1915 pictures of people without masks. In Fig. 2, the collected faces are slightly tilted, with multiple expressions. The main challenges in identifying people with or without masks are that the images are at varying angles and lack clarity; indistinct moving faces in video make it more difficult. The system can efficiently detect partially covered faces, whether covered by a mask, hair or a hand. It considers the occlusion degree of four regions (nose, mouth, chin and eyes) to differentiate between an annotated mask and a face covered by a hand; otherwise, people could use this loophole to get past the sensors. Packages Incorporated a.
TENSORFLOW TensorFlow acts as an interface used for implementing the algorithms of Machine Learning (ML). It is also used in the implementation of the ML systems
Fig. 1 Few pictures from dataset 1 with faces wearing masks
with several fields of computer science, including sentiment analysis, voice recognition, extraction of geographic information, computer vision, text summarization, information retrieval, computational drug discovery and flaw detection for research purposes [11]. In the proposed model, the entire Convolutional Neural Network (CNN) architecture, which is sequential, uses TensorFlow as a backend. It is also used to resize the images, that is, to scale them during image processing. b. KERAS Keras provides the basic vital structure with easy building blocks for developing and shipping ML solutions, with the scalability and cross-platform capabilities of TensorFlow as an advantage. Layers and models are the core data structures of Keras [12]. Using Keras, the layers in the CNN model are implemented. The overall compilation of the model is done by the process of
Fig. 2 Few pictures from dataset 2 with faces not wearing masks
conversion of the vector classes into a matrix of binary format during image processing. c. OPENCV The Open Source Computer Vision Library, known as OpenCV, is an open source computer vision and ML software library. It is employed to compare and differentiate, to identify objects and faces, to classify human actions in recordings, track moving objects, trace eye motion, track camera actions, remove red eyes from photos taken using flash, find similar pictures in an image database, recognize scenery and establish markers in order to overlay it with augmented reality [13], and for resizing and color conversion of data images. OpenCV is used in the projected technique.
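The class-vector-to-binary-matrix conversion that Keras performs (via `keras.utils.to_categorical`) can be sketched in plain Python; `to_binary_matrix` is an illustrative name, not a Keras function:

```python
def to_binary_matrix(labels, num_classes):
    """Convert integer class labels (e.g. 0 = mask, 1 = no mask)
    into one-hot rows, mirroring what keras.utils.to_categorical does."""
    matrix = []
    for label in labels:
        row = [0] * num_classes    # start with an all-zero row
        row[label] = 1             # set the position of the class
        matrix.append(row)
    return matrix

# Two-class example: 0 = "with mask", 1 = "without mask".
print(to_binary_matrix([0, 1, 1], 2))  # [[1, 0], [0, 1], [0, 1]]
```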
Proposed Method In the proposed method, 2D convolution layers are linked to a dense-neuron layer, with a cascade classifier and a pre-trained CNN. The process of image preprocessing involves data conversion from a given format to another format that is user friendly. The subject could be in any desired format, such as tables, images, videos, graphs and so on. The ordered data fits in a model called an info-model, which arranges and records relationships among different objects [14]. This proposed method uses Numpy and OpenCV for image and video data arrangements. Conversion of image from RGB to Grayscale: Image recognition systems based on modern descriptors usually use grayscale images, hence the color image is converted to grayscale. While using robust descriptors, the color-to-grayscale conversion is a little complicated, since nonessential data that remains may increase the training dataset size. With grayscale, the algorithm is simplified and the computational requisites are diminished, and descriptors can be utilized instead of working on color images in parallel [15]. The quality of image recognition can be improved by constructing a novel framework with deep learning and Principal Component Analysis (PCA) to build IoT image identification; that research conducted many tests and delivered the best image-based recognition results [16]. The face mask detection algorithm is as follows:
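The RGB-to-grayscale step described above can be sketched in plain Python using the standard luminosity weights (the same BT.601 weighting OpenCV's grayscale conversion uses); this illustrates only that one preprocessing step, not the paper's full detection algorithm:

```python
def rgb_to_grayscale(pixel):
    """Luminosity-weighted gray value for one (R, G, B) pixel,
    using the BT.601 weights 0.299, 0.587, 0.114."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def grayscale_image(image):
    # image is a 2-D list of (R, G, B) tuples; returns 2-D floats.
    return [[rgb_to_grayscale(p) for p in row] for row in image]

# Pure red, green and blue map to different gray levels.
row = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
print(grayscale_image([row])[0])
```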
Deep Learning Deep-learning can be defined as a part of Machine Learning (ML) algorithms which uses several layers to extract features of high-level, progressively from raw-input. As
an example from image processing, the lower layers may identify edges, while the higher layers may identify concepts meaningful to humans such as faces, digits or letters. Deep learning, an element of ML, is reinforced by the use of Artificial Neural Networks (ANN) along with representation learning. Learning may be categorized as supervised, semi-supervised or unsupervised [17]. Deep-learning architectures comprising Deep Neural Networks, Belief Networks, Recurrent Neural Networks and Convolutional Neural Networks are applied in several fields like computer and machine vision, NLP, machine translation, speech recognition, audio recognition, social network filtering, bioinformatics, drug design, medical image analysis, material inspection, and also board-game programs, where they yield results surpassing the performance of human experts in some cases. The processing of information and the communication nodes of distributed biological systems are the main inspiration for ANN [18]. However, neural networks differ from biological brains in many respects. Specifically, brains are dynamic and analog in living organisms, whereas neural networks are symbolic and static. In deep learning, the word "deep" refers to the use of several layers in the network. A linear perceptron cannot be a universal classifier; deep learning is a variation that deals with an unbounded number of layers of bounded size, which permits practical application and optimized implementation while retaining theoretical universality under mild conditions. The layers in deep learning are allowed to differ from each other and from biologically informed models.
This is for trainability, understandability and efficiency; hence it can be called the "structured" part. Deep Neural Networks A DNN contains neurons, biases, functions, weights and synapses. The functioning of these parts is comparable to that of the human brain, and it can be trained like other Machine Learning algorithms [19]. Consider an example: a DNN trained to identify the breed of a cat studies and analyzes the given image and calculates the probability that the cat is of a particular breed. Users can examine the generated outcomes and select which probabilities the network should exhibit above threshold values, etc. Each such measured manipulation is considered a layer. A complex Deep Neural Network contains many layers; hence the name "deep" networks. DNNs are capable of modeling complex relationships that are non-linear in nature. These neural network architectures create a compositional model in which the object is represented as compositional layers of primitives. Features from lower layers are composed by additional layers, effectively modeling complex data with fewer units than a shallow network of similar performance. Deep network architectures contain several varieties of complex approaches; each architecture reaches success in its respective domain.
Comparing the performance of multiple architectures is impossible unless the same data sets are used for evaluation. DNNs are feed-forward networks in which information flows from the input to the output layer without looping back. First, the DNN creates a map of connections among virtual neurons and assigns random numbers [20], or "weights", to the connections. The products of the weights and inputs are summed, and an output value in the range 0 to 1 is returned. If the network does not recognize a certain pattern, the weights are adjusted by a learning algorithm; in this way, certain parameters are made more influential. This process repeats until a mathematical function that processes the data correctly is determined. Recurrent Neural Networks (RNNs) are neural networks within which data can flow in any direction and are utilized for applications like language modeling; long short-term memory is particularly effective here. Convolutional Neural Networks (CNNs) are neural networks used in computer vision, and also for acoustic modeling in automatic speech recognition. Convolution Neural Network The Convolutional Neural Network, shortly known as CNN and also called ConvNet, is a class of DNN mostly applied to image processing. Most CNNs are only equivariant, i.e., a variation in the input produces an equal variation in the output, as opposed to invariant, i.e., no variance under translation [10]. Its applications include video and image recognition, image segmentation, natural language processing, image classification and financial time series. CNNs are multilayer perceptrons arranged according to certain rules. These networks can be completely connected, i.e., each neuron in one layer has a connection to all neurons of the next layer.
The entire connectivity of these fully connected networks [11] puts them in peril of overfitting data (Fig. 3). Convolution Neural Networks take a different approach toward regularization: they exploit the hierarchical patterns found in data and assemble patterns [21] of increasing complexity from simpler and smaller patterns embossed in their filters. Hence, on the scales of connectivity and complexity, Convolution Neural Networks are on the lower extreme.
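The weighted-sum-and-squash computation described for DNN neurons can be sketched for a single unit; the weights below are arbitrary illustration values, not trained parameters:

```python
import math

def neuron(inputs, weights, bias=0.0):
    """One feed-forward unit: a weighted sum of the inputs squashed
    by a sigmoid, so the output always lies in the range (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Arbitrary inputs and weights; the result is strictly between 0 and 1.
out = neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.1])
print(out)
assert 0.0 < out < 1.0
```

During training, a learning algorithm (e.g. gradient descent) would adjust the weights whenever the output does not match the expected pattern, exactly as the text describes.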
Fig. 3 Steps followed by convolutional neural network
The idea of these networks is inspired by the nerves and neurons in living organisms that act as message carriers connecting all body parts to the brain. The connectivity pattern between neurons resembles the organization [12] of the animal visual cortex. Individual cortical neurons respond to stimuli only within a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap so that together they cover the entire visual field. CNNs use less preprocessing compared to other image classification algorithms: the network learns to optimize its filters through automatic learning, whereas in traditional algorithms these filters are engineered manually. This independence from prior knowledge and human intervention is a major advantage in feature extraction. Histogram of Oriented Gradients (HOG) HOG is a feature descriptor technique used in deep learning for object detection in Computer Vision and Image Processing. In this procedure, the count of occurrences of gradient orientations in local portions of an image is considered [15]. The concept is related to shape contexts and edge orientation histograms; the main difference is that it is computed on a dense grid of uniformly spaced cells and uses overlapping local contrast normalization for improved accuracy. In 1986, Robert K. McConnell described the concepts behind HOG, without using the name HOG, in a patent application. In 1994, Mitsubishi Electric Research Laboratories used these concepts. Navneet Dalal [22] and Bill Triggs, researchers at the French National Institute for Research in Computer Science and Automation (INRIA), presented their work on HOG descriptors at the 2005 Conference on Computer Vision and Pattern Recognition (CVPR), after which the method became widespread.
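The core of a HOG descriptor, a magnitude-weighted histogram of gradient orientations over one cell, can be sketched with the standard library. This is a minimal illustration, not Dalal and Triggs' full pipeline (no block normalization or interpolated voting):

```python
import math

def cell_histogram(cell, bins=9):
    """Orientation histogram for one HOG cell: finite-difference
    gradients, unsigned orientation folded into 0-180 degrees, and
    magnitude-weighted votes into `bins` equal-width bins."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            dy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            mag = math.hypot(dx, dy)
            ang = math.degrees(math.atan2(dy, dx)) % 180.0
            hist[int(ang / (180.0 / bins)) % bins] += mag
    return hist

# A vertical edge: all gradient energy falls into the 0-degree bin.
cell = [[0, 0, 10, 10]] * 4
print(cell_histogram(cell))
```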
The main focus of Navneet Dalal and Bill Triggs was the detection of pedestrians in static images, later extended to the detection of humans in videos; in their tests they also advanced to detecting several varieties of vehicles and animals in static images (Fig. 4). Flow of the Proposed Work The framework works as a two-factor authentication to let a person enter the locale. In the first step, face mask detection is performed: a web cam captures the image of the person who wants to enter. The captured image is forwarded to the face mask detection algorithm, where the trained data set is used to compare the face of the person and determine whether the person is wearing a mask or not. If the person wears a mask, the face is bounded by a green rectangular box with the accuracy shown; if the person does not wear a mask, the face is bounded by a red rectangular box with the accuracy shown, and a security alert sound indicates that there is a person near the gate without a mask (Fig. 5). The second step of authentication is temperature detection, using a hardware setup comprising a microcontroller, temperature sensor, servo motor and LCD display. The temperature sensor senses the temperature of the person, which is forwarded to the LCD display by the microcontroller and is further used to
Fig. 4 Images before and after application of HOG algorithm
Fig. 5 Flow of the proposed work: Start → capture image using web cam → apply face mask detector → check if mask worn (no: sound alert; yes: temperature detection) → check if temperature is normal (no: gate closes; yes: gate opens)
determine whether the servo motor should open the gate. If the temperature detected by the sensor is high, the gate does not open and the person is not allowed inside. If the temperature is normal, the servo motor opens the gate.
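The two-factor gate logic described above can be sketched as a decision function; the 100.4 °F fever cutoff is an illustrative assumption, not a value stated in the paper:

```python
FEVER_THRESHOLD_F = 100.4  # illustrative cutoff (~38 C), not from the paper

def gate_decision(mask_detected, temperature_f):
    """Return (open_gate, alert): the gate opens only when a mask is
    detected AND the temperature is below the fever threshold; a
    missing mask triggers the security alert sound."""
    alert = not mask_detected
    open_gate = mask_detected and temperature_f < FEVER_THRESHOLD_F
    return open_gate, alert

print(gate_decision(True, 98.6))    # (True, False)  -> gate opens
print(gate_decision(False, 98.6))   # (False, True)  -> alert sounds
print(gate_decision(True, 101.2))   # (False, False) -> gate stays shut
```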
4 Results The proposed method performs two main functions to avoid the spread of Covid: it checks whether the person entering the organization has a mask on, and checks the temperature as well. This can be monitored by the admin on his system. Only when both criteria are fulfilled does the gate open. Figures 6 and 7 are snapshots of the mask detector detecting persons with and without masks. Fig. 6 Mask detector detecting a person wearing mask
Fig. 7 Mask detector detecting a person without mask
Fig. 8 Temperature detected displayed in Fahrenheit using LCD display
The mask detection method using Keras and OpenCV attains accuracies of up to 95.77% and 94.58% on the two datasets, respectively, where dataset 2 is more versatile than dataset 1. The optimized parameter values are obtained using the Sequential Convolutional Neural Network model to detect the presence of masks correctly without causing over-fitting. Figure 8 shows the detected temperature displayed in Fahrenheit on the LCD display.
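Accuracy figures like those reported above are simply the fraction of test images classified correctly; the toy predictions below are illustrative:

```python
def accuracy(predictions, labels):
    """Fraction of images whose predicted class matches the label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# 1 = mask, 0 = no mask; 5 of these 6 toy predictions are right.
preds  = [1, 0, 1, 1, 0, 1]
labels = [1, 0, 1, 0, 0, 1]
print(round(accuracy(preds, labels), 4))  # 0.8333
```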
5 Conclusion The proposed technique mainly provides safety against the current deadly disease COVID-19, a contagious disease spreading through the air we breathe and affecting the lungs, with primary symptoms of cough, cold, fever and breathing problems, ultimately leading to death. Even though the government has implemented lockdowns, due to unavoidable reasons people still come out. As preliminary precautions, it is essential to keep social distance, wash hands regularly, avoid touching the nose and mouth, wear a mask before going to any public place, check body temperature frequently, and consult a doctor immediately if there are any symptoms. It is the responsibility of the institution/locale authority to ensure that people are healthy and are following these precautionary measures. The safety of the staff members involved in this process is also very important. An Internet of Things (IoT) based biotelemetry system can monitor patients by analyzing ECG signals; the information is classified using an artificial neural module to obtain clinical decisions [23]. This method aims at allowing a person to enter the premises only with a normal temperature and with a mask on. Entry is allowed or restricted by an automatically opening/closing gate based on the result of these factors. Since these precautionary measures restrict affected people from entering the premises, they protect healthy people and help avoid the spread of the disease. Moreover, it ensures the safety of all the people belonging to the institution, including the
staff authorities and members involved in the security process. This project has a vital role in reducing contact between affected and unaffected persons, which is a major step in the fight against COVID-19. The basic Convolutional Neural Network (CNN) model, using TensorFlow with the Keras library, detects whether the person has a mask on or off the face. The temperature is detected by the sensors, and the gate functions automatically based on these two criteria. It ensures that no outsider enters institutes/schools/companies/public places without a mask, by raising an alarm. The model can be upgraded to mail the authority when it finds a person without a mask.
References 1. WHO, Coronavirus disease 2019 covid-19. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200812-covid-19-sitrep-205.pdf?sfvrsn=627c9aa8_2 (2020) 2. "Coronavirus disease 2019 (COVID-19)—Symptoms", Centers for disease control and prevention, https://www.cdc.gov/coronavirus/2019-ncov/symptomstesting/symptoms.html (2020) 3. Coronavirus—Human coronavirus types—CDC, https://www.cdc.gov/coronavirus/types.html (2020) 4. WHO, Advice on the use of masks in the context of COVID-19: interim guidance (2020) 5. M. Jiang, X. Fan, H. Yan, RetinaMask: a face mask detector, https://arxiv.org/abs/2005.03950 (2020) 6. N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), vol. 1 (IEEE, 2005) 7. S. Ge, J. Li, Q. Ye, Z. Luo, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2682–2690 8. S. Ghosh, N. Das, M. Nasipuri, Reshaping inputs for convolutional neural network: Some common and uncommon methods. Pattern Recogn. 93, 79–94 (2019) 9. prajnasb/observations, GitHub, https://github.com/prajnasb/observations/tree/master/experiements/data (2020) 10. S. Bharathi et al., An automatic real-time face mask detection using CNN, in 2021 Smart Technologies, Communication and Robotics (STCR) (IEEE, 2021) 11. A. Das, M.W. Ansari, R. Basak, Covid-19 face mask detection using TensorFlow, Keras and OpenCV, in 2020 IEEE 17th India Council International Conference (INDICON) (IEEE, 2020) 12. X. Deng et al., A classification–detection approach of COVID-19 based on chest X-ray and CT by using keras pre-trained deep learning models. Comput. Model. Eng. Sci. 125(2), 579–596 (2020) 13. A. Dumala, A. Papasani, S. Vikkurty, COVID-19 face mask live detection using OpenCV, in Smart Computing Techniques and Applications (Springer, Singapore, 2021), pp. 347–352 14. B. Suvarnamukhi, M. Seshashayee, Big data concepts and techniques in data processing. Int. J. Comput. Sci. Eng. 6(10), 712–714 (2018). https://doi.org/10.26438/ijcse/v6i10.712714 15. C. Kanan, G. Cottrell, Color-to-Grayscale. https://www.researchgate.net/publication/221755665 (2012) 16. I.J. Jacob, P.E. Darney, Design of deep learning algorithm for IoT application by image based recognition. J. ISMAC 3(03), 276–290 (2021) 17. R. Memisevic, Deep learning: Architectures, algorithms, applications, in 2015 IEEE Hot Chips 27 Symposium (HCS), Aug 2015. https://doi.org/10.1109/HOTCHIPS.2015.7477319
186
R. Srikanteswara et al.
18. F. Hohman, M. Kahng, R. Pienta, D.H. Chau, Visual analytics in deep learning: An interrogative survey for the next frontiers. IEEE Trans. Visual. Comput. Graph. 25(8), 2674–2693 (2019). https://doi.org/10.1109/TVCG.2018.2843369 19. H. Yi, S. Shiyu, D. Xiusheng, C. Zhigang, A study on deep neural networks framework, in 2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), 2016, pp. 1519–1522. https://doi.org/10.1109/IMCEC.2016.7867471 20. P. Sharma, A. Singh, Era of deep neural networks: A review, in 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT), 2017, pp. 1–5. https://doi.org/10.1109/ICCCNT.2017.8203938 21. T.S. Rao, S.A. Devi, P. Dileep, M.S. Ram, A novel approach to detect face mask to control covid using deep learning. https://ejmcm.com/article_2807_b1e004fc8cf0f8080144eb4707a0b85a.pdf 22. R. Yamashita, M. Nishio, R. Do, K. Togashi, Convolutional neural networks: an overview and application in radiology. https://insightsimaging.springeropen.com/articles/10.1007/s13244-018-0639-9 (2018) 23. V. Balasubramaniam, IoT based biotelemetry for smart health care monitoring system. J. Inf. Technol. Digit. World 2(3), 183–190 (2020)
Efficient Data Partitioning and Retrieval Using Modified ReDDE Technique Praveen M. Dhulavvagol, S. G. Totad, Nandan Bhandage, and Pradyumna Bilagi
Abstract I.T. industries and private organizations generate a massive volume of data every day. Storing and processing Big data is challenging due to scalability and performance issues. Nowadays, a distributed architecture, in which several nodes/systems communicate to store and process data, is used to process Big data. Search engines use a distributed architecture to store and retrieve documents for the user query. Elasticsearch is an open-source search engine that uses a distributed architecture. The main goal of this paper is to configure Elasticsearch clusters, implement shard selection algorithms, and perform a comparative study of the existing shard selection techniques against the proposed shard selection technique. The sharding technique is applied to partition the data and retrieve relevant data from the nodes. Shards are created on each data node of the cluster; a shard is the smallest unit of storage in the memory of a data node. Data is horizontally partitioned by topic and stored on different shards. This paper proposes a Modified ReDDE shard selection algorithm that enhances throughput by searching only the relevant shards in the distributed processing architecture instead of all the shards. The results show that the Modified ReDDE algorithm improves the performance parameters by 26% compared to existing shard selection techniques. Keywords Shard · Cluster · Node · Index · Elasticsearch · ReDDE
1 Introduction In today’s digital era, search engines are the medium for connecting to the digital world. A search engine serves the data the user wants in milliseconds. The data retrieved will be in the millions, given the data collections that data centers P. M. Dhulavvagol (B) · S. G. Totad · N. Bhandage · P. Bilagi School of Computer Science and Engineering, KLE Technological University, Hubballi, India e-mail: [email protected] S. G. Totad e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_13
have nowadays. The browser returns, as the top result, the most relevant data for the search query. In general, the data is fetched from storage by comparing the search query with the data in storage. This is called the traditional search technique; it suffers from low throughput, high latency, and low fault tolerance, which reduces the efficiency of a search engine. Nowadays the data is stored in a distributed architecture, where huge servers are located at different geographical places across the world. These servers communicate with each other and fetch the relevant data for the users. The load of searching for data on a single server is thus reduced and distributed across multiple servers. This distributed architecture motivates the Elasticsearch search engine. Elasticsearch follows a distributed (master node–data node) architecture. One single server acts as the master node, and all the other connected servers are called data nodes. In turn, these data nodes are divided into shards. Shards are the small stores where the actual data resides, and each shard belongs to a different topic. When the user issues a search query, the query is first passed to the master node, which is connected to all nodes. The master node holds the information of each data node: the number of shards and replicas, and which topic is stored in which shard. The master node searches only those shards that belong to the query’s topic. The relevant files for the search query are fetched from the shards and returned to the user. This distributed architecture is called an Elasticsearch cluster.
2 Literature Survey Storing and retrieving a large dataset is challenging in a distributed architecture; therefore, researchers have proposed techniques to overcome the challenges [1]. Elasticsearch can be a good search engine for medium and large-scale data. The CORI shard selection algorithm can be used for shard selection; it was reported that in the distributed architecture the nodes communicate during the search operation for a particular query, with each data node responsible for only a few parts of the dataset in searching and storing. This algorithm fetches the data from the shards relevant to the search query, enhancing latency, throughput, and scalability. The CORI algorithm is the first lexicon-based algorithm and uses many assumptions and hard-coded values [2]. Dividing the data by topic allows the cluster to search for the query in only the few shards that contain the relevant data; documents with similarities can be grouped together. Kulkarni and Callan [2] described selective search as an approach for dividing the data into topic-based shards and searching only in the few shards relevant to the query. The experimental results showed that selective search is more effective than the exhaustive (traditional) search method. A machine learning algorithm tries to divide the data according to similarity and store it accordingly; at search time, it fetches the data by the similarity of the query to the data. Machine learning’s ability to learn and
make predictions can draw insights from a large dataset. L’Heureux [3] reported that Big data is the future: humans produce 2.5 quintillion bytes every day, and storing, processing, fetching, and maintaining it efficiently is difficult. Machine learning can be a solution because of its ability to analyze the data and store it according to the predictions made by the learning algorithms. Praveen et al. [4] explained that in a distributed network the systems communicate with each other for information retrieval. The sharding technique can be applied for information retrieval, where scalability, consistency, and fault-tolerance issues can be addressed. The paper’s main aim is to enhance processing and information retrieval in large-scale databases. The authors propose the ReDDE shard selection algorithm, which focuses on the database collection and content similarity for information retrieval. The comparative analysis of the proposed algorithm (ReDDE) and the existing shard selection algorithm (CORI) shows that ReDDE is more efficient than CORI and reduces the overall cost by 28%. Federated search is a technique of searching multiple text collections simultaneously. A query is submitted to the sub-collections estimated to contain relevant data, and finally the results from every collection are merged and listed. Further, [5] identified the three major challenges of federated search: collection selection, collection representation, and result merging. The authors explained representation sets in a cooperative environment, lexicon-based collection selection, and federated search merging to overcome these challenges. Prior research under various conditions has shown CORI to be an effective collection selection algorithm, but Callan et al. [6] reported that the CORI algorithm does not perform well in environments mixing small and large databases.
A new collection selection algorithm based on database size and database contents was proposed; the experimental results showed that database size estimation is a more accurate method for a large database, and the ReDDE algorithm showed better performance in collection ranking than the CORI algorithm [7]. The existing shard selection techniques (CORI, exhaustive methods, and Rank-S) fail to retrieve the data relevant to the user query optimally and efficiently. We propose a Modified ReDDE shard selection technique to overcome these challenges, enhancing the performance parameters throughput, scalability, and latency.
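The sample-based estimation at the heart of ReDDE can be sketched in a few lines. The shard names, sample sizes, and matching rule below are illustrative assumptions, not the algorithms' exact implementation:

```python
# Simplified ReDDE-style shard ranking (illustrative sketch only).
# A small centralized sample holds a few documents from each shard;
# shards are ranked by the estimated number of relevant documents,
# i.e. sample hits scaled up to the full shard size.

def rank_shards(query_terms, csi, shard_sizes, sample_sizes):
    """csi maps shard name -> list of sampled documents (term sets)."""
    scores = {}
    for shard, sample_docs in csi.items():
        # Count sampled documents that share at least one term with the query.
        hits = sum(1 for doc in sample_docs if query_terms & doc)
        # Scale sample hits to the full shard size (the ReDDE estimate).
        scores[shard] = hits * shard_sizes[shard] / sample_sizes[shard]
    return sorted(scores, key=scores.get, reverse=True)

csi = {
    "sports":  [{"match", "goal"}, {"team", "score"}],
    "finance": [{"stock", "price"}, {"bank", "loan"}],
}
shard_sizes = {"sports": 10_000, "finance": 50_000}
sample_sizes = {"sports": 2, "finance": 2}

ranking = rank_shards({"stock", "market"}, csi, shard_sizes, sample_sizes)
print(ranking)  # ['finance', 'sports']
```

Only the "finance" sample contains a document matching the query, so that shard is ranked first; the size scaling is what distinguishes ReDDE-style estimates from a plain sample-hit count.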
3 Methodology The methodology for implementing the Modified ReDDE shard selection and ranking algorithm in a distributed architecture follows. The distributed architecture contains several nodes, of which one is the master node and the remaining are data nodes. Initially, the user query is passed to the master node. The master node contains the CSI (Centralized Sample Index), against which the query is matched. As a result of the comparison, the master node creates a list of documents estimated to be relevant to
Fig. 1 Proposed system architecture
the user query. The document with the highest similarity score, and the shard that contains it, are considered the most relevant to the user. The unstructured data is fed into the master node, which first pre-processes the data and performs a clustering operation on it. The steps followed to process the data are:
• The pre-processed data is clustered by topic.
• The clustered data is loaded into the data nodes by the master node.
• The user gives a query to the master node, which analyzes which cluster the query belongs to.
• The master node sends the query to the shard that contains the relevant data.
• The data relevant to the query is fetched and displayed to the user.
Fig. 1 shows the proposed system diagram. It mainly includes 6 modules.
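The steps above can be sketched as a toy in-memory flow. The topics, routing table, and substring matching are assumptions for illustration, not the Elasticsearch implementation:

```python
# Toy end-to-end flow: documents are grouped by topic, each topic is
# assigned to a shard, and a query is routed only to its topic's shard.

shards = {0: [], 1: []}                        # two data-node shards
topic_to_shard = {"sports": 0, "finance": 1}   # master node's routing table

def load(doc, topic):
    # Master node places each clustered document on its topic's shard.
    shards[topic_to_shard[topic]].append(doc)

def search(query, topic):
    # Master node forwards the query only to the relevant shard.
    shard = topic_to_shard[topic]
    return [d for d in shards[shard] if query in d]

load("stock prices rallied today", "finance")
load("the home team won the match", "sports")

print(search("stock", "finance"))  # ['stock prices rallied today']
```

Because the query touches a single shard, the other shards do no work; this is the source of the throughput gain the chapter measures.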
3.1 Data Node The data node is a storage entity in the cluster where the data is indexed and stored. Fig. 2 shows the data node, which mainly consists of two types of shards:
• Primary shard: contains the actual data. When we index documents, they are stored in the primary shards.
• Replica shard: a copy of a primary shard, held on another data node.
Efficient Data Partitioning and Retrieval Using Modified ReDDE …
191
Fig. 2 Data node in a distributed architecture
Fig. 3 Shard
3.2 Shard The shard is the unit at which Elasticsearch distributes data around the cluster. A shard contains a Lucene index and segments (Fig. 3). Each Elasticsearch index contains the number of shards specified when the documents are indexed. The Lucene index stores statistics about terms to make term-based search more efficient. Lucene’s index belongs to the family of indexes known as inverted indexes, because for each term it can list the documents that contain it. The segments store the actual text files. The Lucene index and segments are interconnected: when the query is matched by the Lucene index, the text files are fetched from the segments. ReDDE is a resource selection algorithm that estimates the relevant documents in a large dataset by finding the distribution of relevant data within it. The ReDDE algorithm considers content similarity and the size of the database when estimating the relevant documents.
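The inverted index described above can be illustrated with a minimal from-scratch sketch. The documents are toy data; Lucene's real index additionally stores term frequencies, positions, and other statistics:

```python
# Minimal inverted index of the kind Lucene maintains inside a shard:
# each term maps to the set of document ids that contain it.

from collections import defaultdict

docs = {
    1: "elasticsearch distributes data around the cluster",
    2: "a shard contains a lucene index and segments",
    3: "the lucene index is an inverted index",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

# A term lookup returns, without scanning any document text,
# exactly the documents that contain the term.
print(sorted(index["lucene"]))  # [2, 3]
```

This is why term-based search is fast: the per-term posting list replaces a scan over every stored document.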
4 Implementation 4.1 Pre-Processing of Data Pre-processing consists of tokenizing, stemming, and the removal of stop words. Normalizing the data into such a format helps in getting accurate results.
Algorithm 1—Pre-processing of data
Input—The data should be unstructured data in text format.
Output—The whole data is pre-processed and stored for future use.

def remove_words(list_of_tokens, list_of_words):
    return [token for token in list_of_tokens if token not in list_of_words]

def apply_stemming(list_of_tokens, stemmer):
    return [stemmer.stem(token) for token in list_of_tokens]

def two_letter_words(list_of_tokens):
    # collect tokens of two characters or fewer so they can be removed
    two_letter_word = []
    for token in list_of_tokens:
        if len(token) <= 2:
            two_letter_word.append(token)
    return two_letter_word
4.2 Clustering of Data This involves dividing the data by topic with the help of the k-means clustering algorithm. Each cluster of data is loaded into a shard in the data nodes through the master node. Algorithm 2—Clustering of data
Input—All the text files should be pre-processed.
Output—The k-means clustering algorithm, run on term frequency–inverse document frequency (tf-idf) features, clusters the text files.

from sklearn import cluster

def run_kmeans(max_k, data):
    kmeans_results = dict()
    for k in range(2, max_k + 1):
        # k-means++ initialization over the tf-idf matrix
        kmeans = cluster.KMeans(n_clusters=k, init='k-means++', n_init=10,
                                tol=0.0001, random_state=1, algorithm='lloyd')
        kmeans_results.update({k: kmeans.fit(data)})
    return kmeans_results
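The tf-idf weighting named in Algorithm 2 can be computed from scratch as follows. The documents are toy data and the plain tf-idf formula below is the textbook form; a library vectorizer (e.g. scikit-learn's) applies additional smoothing and normalization:

```python
import math

# Plain tf-idf weighting over a toy corpus: a term is weighted highly
# when it is frequent in one document but rare across the collection.
docs = [
    "stock price stock market",
    "football match score",
    "market price index",
]
tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

def tfidf(term, doc_tokens):
    tf = doc_tokens.count(term) / len(doc_tokens)          # term frequency
    df = sum(1 for d in tokenized if term in d)            # document frequency
    idf = math.log(n_docs / df)                            # inverse doc freq
    return tf * idf

# "stock" occurs twice in doc 0 and in no other document,
# so it gets a higher weight there than the shared term "market".
print(round(tfidf("stock", tokenized[0]), 3))  # 0.549
```

Vectors of such weights, one component per vocabulary term, are what the k-means step above clusters.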
4.3 Data Loading In this module, the data is loaded into the shards according to topics. Each shard contains a unique topic. Replica shards are loaded with data from other data nodes.
Input—The data should be clustered by topic, the number of shards specified, and the data converted to JSON format. Output—Data is loaded into the shards by topic.
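Loading by topic can be expressed with Elasticsearch's newline-delimited `_bulk` body format (an action line followed by a source line per document). The index name and topics below are illustrative assumptions, and no live cluster is contacted; the sketch only builds the request body:

```python
import json

# Build an Elasticsearch _bulk request body that routes each document
# by its topic, so documents sharing a topic land on the same shard.
def bulk_body(docs_by_topic, index_name="documents"):
    lines = []
    for topic, docs in docs_by_topic.items():
        for doc in docs:
            # Action line: the routing value controls shard placement.
            lines.append(json.dumps(
                {"index": {"_index": index_name, "routing": topic}}))
            # Source line: the document itself.
            lines.append(json.dumps({"topic": topic, "text": doc}))
    return "\n".join(lines) + "\n"   # _bulk bodies are newline-terminated

body = bulk_body({"finance": ["rates rose"], "sports": ["team won"]})
print(body)
```

The resulting string could be POSTed to the cluster's `_bulk` endpoint; replica shards are then populated by Elasticsearch itself.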
4.4 Input Query The user enters the query in a search box; when the user presses the search button, the documents relevant to the query are fetched. Input—Data should be present in the data nodes, organized by topic, and the user gives the query. Output—The query is sent on for retrieving relevant data.
4.5 Shard Selection Selecting the shards which contain the relevant documents for the query is implemented with the help of the shard selection algorithm. Algorithm 3—Shard selection algorithm
Input—Data should be present in the data nodes by topic and the user should give the query; the query is compared with the already created clusters.
Output—Gives the shard number(s) on which the relevant data is stored.

def get_routing_keys(files, routing_keys):
    # each block of seven file indices maps to one shard's routing key
    routings = set()
    for i in range(0, len(files)):
        if 0 <= files[i] < 7:
            routings = routings.union({routing_keys[0]})
        elif 7 <= files[i] < 14:
            routings = routings.union({routing_keys[1]})
        elif 14 <= files[i] < 21:
            routings = routings.union({routing_keys[2]})
        elif 21 <= files[i] < 28:
            routings = routings.union({routing_keys[3]})
        elif 28 <= files[i] < 35:
            routings = routings.union({routing_keys[4]})
    return routings
4.6 Data Retrieval The data relevant to the query is fetched from the shards selected by the Modified ReDDE shard selection algorithm. The relevant documents are fetched based on the similarity score of each document to the user query.
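As a stand-in for the engine's scoring, cosine similarity over simple term-count vectors illustrates how documents can be ranked against the query. The data is a toy example and Elasticsearch's actual scoring model differs:

```python
import math

# Rank documents against a query by cosine similarity of term-count
# vectors (an illustrative stand-in for the engine's similarity score).
def counts(text):
    vec = {}
    for term in text.split():
        vec[term] = vec.get(term, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a.get(t, 0) * b.get(t, 0) for t in set(a) | set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = ["stock market price", "football final score"]
query = counts("stock price")
ranked = sorted(docs, key=lambda d: cosine(query, counts(d)), reverse=True)
print(ranked[0])  # 'stock market price'
```

The document sharing the most query terms scores highest and is returned first, which mirrors how the selected shards' hits are ordered for the user.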
5 Results and Discussion The experiment is performed on 3 systems running the Linux operating system, each with 32 GB RAM and 512 GB storage capacity. The dataset contains 12 GB of unstructured text files, collected from the Gutenberg dataset. A comparative study of the shard selection algorithms CORI, ReDDE, and Modified ReDDE was carried out for 10 and 20 queries, measuring throughput and similarity score.

Table 1 shows that as the number of shards increases, the time taken to retrieve the files reduces. Among the three shard selection algorithms, Modified ReDDE gives better results than CORI and ReDDE for 10 queries. From Fig. 4 we can observe that retrieval time decreases as the number of shards increases: with a single shard, the time taken to retrieve the data is 864 ms, falling to 516 ms with 7 shards for 10 queries.

Table 1 Average time taken to retrieve files for 10 queries by shard selection algorithms

Shard selection algorithm | No. of shards | Multi-node un-clustered network throughput (ms) | Multi-node clustered network throughput (ms) | Score
CORI | 1 | 4653 | 2611 | 10.050
CORI | 3 | 3957 | 2304 | 9.568
CORI | 5 | 3455 | 1804 | 13.685
CORI | 7 | 3409 | 1661 | 15.263
ReDDE | 1 | 3422 | 1365 | 11.256
ReDDE | 3 | 3956 | 1217 | 10.878
ReDDE | 5 | 3250 | 1113 | 13.686
ReDDE | 7 | 2753 | 637 | 15.668
Modified ReDDE | 1 | 1533 | 864 | 12.056
Modified ReDDE | 3 | 1298 | 843 | 11.911
Modified ReDDE | 5 | 989 | 691 | 13.865
Modified ReDDE | 7 | 756 | 516 | 15.979

Fig. 4 Shard selection algorithms versus retrieval time

Table 2 shows that the same trend holds for 20 queries: as the number of shards increases, the time taken to retrieve the files reduces, and Modified ReDDE again gives better results than CORI and ReDDE.

Table 2 Average time taken to retrieve files for 20 queries by shard selection algorithms

Shard selection algorithm | No. of shards | Multi-node un-clustered network throughput (ms) | Multi-node clustered network throughput (ms) | Score
CORI | 1 | 6859 | 4583 | 10.050
CORI | 3 | 6258 | 4257 | 9.568
CORI | 5 | 5756 | 3758 | 13.685
CORI | 7 | 5214 | 3599 | 15.263
ReDDE | 1 | 5344 | 3591 | 11.256
ReDDE | 3 | 5122 | 3052 | 10.878
ReDDE | 5 | 4752 | 2651 | 13.686
ReDDE | 7 | 4487 | 2257 | 15.668
Modified ReDDE | 1 | 2586 | 1576 | 12.056
Modified ReDDE | 3 | 2159 | 1322 | 11.911
Modified ReDDE | 5 | 1739 | 1049 | 13.865
Modified ReDDE | 7 | 1583 | 955 | 15.979

Fig. 5 shows the retrieval time for the proposed shard selection algorithm: with a single shard, the time taken to retrieve the data is 1576 ms, falling to 955 ms with 7 shards for 20 queries.

Fig. 5 Shard selection algorithm versus retrieval time
6 Conclusion Elasticsearch is used for storing and processing large datasets, and shard selection algorithms help enhance system throughput. The graphs and tables show that document retrieval time depends on the number of shards per data node in the Elasticsearch cluster: as the number of shards increases, each shard holds fewer documents, so searching time reduces and data is fetched faster for a given query in all cluster types. Because the shard selection algorithm selects only the shards related to the query, the document similarity score is higher for clustered data than for un-clustered data. The results show that the Modified ReDDE shard selection algorithm performs better than the CORI and exhaustive shard selection algorithms, reducing overall cost by 26%.
References 1. P. Berglund, Shard selection in distributed collaborative search engines, a design, implementation and evaluation of shard selection in ElasticSearch, University of Gothenburg (2013) 2. A. Kulkarni, J. Callan, Selective Search: Efficient and Effective Search of Large Textual Collections (San Francisco State University, 2015) 3. A. L’Heureux, Machine Learning with Big Data: Challenges and Approaches (The University of Western Ontario, 2017) 4. M.D. Praveen, Vijay, S.G. Totad, Performance Analysis of Distributed Processing System using Shard Selection Technique in Elasticsearch (KLE Technological University, 2019) 5. E. Rodrigues, R. Morlay, Run Time Prediction for Big Data Iterative ML Algorithms: A K-Means Case Study (Faculty of Engineering, University of Porto, Porto, 2017) 6. J.P. Callan, Z. Lu, W.B. Croft, Searching distributed collections with inference networks, in Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM, 1995)
7. P. Dhulavvagol, V. Bhajantri, S. Totad, Performance analysis of distributed processing system using shard selection techniques on elasticsearch. Procedia Comput. Sci. 167, 1626–1635 (2020). https://doi.org/10.1016/j.procs.2020.03.373
Stock Price Prediction Using Data Mining with Soft Computing Technique R. Suganya and S. Sathya
Abstract Stock price forecasting is an interesting and valuable research topic. It provides huge revenue when the prediction is right and causes a huge loss when it goes wrong. The economies of developed nations are gauged by their stock markets. At present, financial exchanges are viewed as a prominent revenue-generating field because they frequently give large profits with minimal loss, and are hence generally considered a safe avenue of return. With its tremendous and dynamic data-handling capacity, the financial exchange is a predominant place for profit seekers as well as for research scientists. In this paper, the k-nearest neighbor method is applied to select data from the dataset, and the selected data are then used to predict the future stock price using a soft computing technique. Within the soft computing technique, a genetic algorithmic approach is implemented with a novelty for predicting and improving the performance of the result. Stocks of various companies are analyzed and the results fine-tuned so that a trustworthy technique is developed. The experimental results show that this combined technique is well suited for predicting the stock price. Keywords Soft computing · Long short-term memory · Stock price · Data mining · Prediction · Clustering technique
1 Introduction Anticipating returns on the securities exchange is a significant problem for financial organizations, and a challenging one. Share price forecasting is a challenging task: an organization’s share price doesn’t R. Suganya (B) Research Scholar, Department of Information Technology, School of Computing Sciences, Vels Institute of Science, Technology and Advanced Studies (VISTAS), Chennai, Tamilnadu, India e-mail: [email protected] S. Sathya Department of Information Technology, School of Computing Sciences, Vels Institute of Science, Technology and Advanced Studies, (VISTAS), Chennai, Tamilnadu, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_14
actually depend solely on the financial status of the organization, but also on the country’s socio-economic situation, being driven by the country’s specific monetary developments and products. Stock price forecasting has therefore always been a tedious and competitive task. For many reasons, such as political events, company-related news, and catastrophic events, share prices keep varying. A great deal of research into finding stock prices or stock trends has been going on for a long time. It involves examining fundamental details and historical data that is openly available, and discovering the connection between the predictive inputs is always a difficult undertaking. The price predominantly depends on the buy-to-sell ratio: the price rises sharply when purchase demand is very high, and drops when there are more sellers than buyers. Generally, individuals trade through an exchange or with a broker’s assistance. A broker’s information may help traders in picking good stocks; for most stocks, dealers have recommendations based on company data and general expectations. The efficient market hypothesis states that stock prices reflect not only historical prospects but all related parameters. Stock markets are a constantly changing, turbulent business arena where prediction plays a significant part. A forecast gives current information related to the stock price, which clients can use to decide whether or not to buy shares of a particular scrip. To produce outcomes and expectations, a clear predictive analysis of the recorded information is needed, built from frequently observed and calculated expectations of the stock price.
The decision to buy or sell depends on the predicted strategies: a labeled set of data forms the basis of the classification method and is used for it. Popular strategies such as decision trees, linear programming, neural networks, and statistics are used by these methods. The paper is organized as follows: the introduction is given in Sect. 1 and a survey of related work in Sect. 2; the proposed model is presented in Sect. 3, results and discussion in Sect. 4, and the conclusion in Sect. 5.
2 Related Work Research by Zhauo et al. [1, 2] worked out how to discover anomalies in trading information between the rise and fall of a stock with respect to the volume traded on the exchange, finding that there are unusual groups that determine the stock price. The researchers [1] performed data mining and found it consistent with similarity-based clustering (k-means); the ordinary return and profit rate are analyzed against a validity calculation, and the estimate will be either a profit or a loss. Baviskar and Namdev [1, 3] focused on understanding stock-market-related facts for purchase
and used National Stock Exchange data to anticipate future stock movements. Data mining has been widely used to extract significant records from historical stock data in order to explore and anticipate future values. Several categories of technical indicators are used for better stock analysis: price-based and volume-based. Weerachart and Benamas [1, 4] put forth ideas that improve data mining for analyzing and investing at various stock levels. The authors used the gain factor for predicting future value, checked for the presence of valid data, and implemented a rank search method and wrapper selection using a greedy approach, along with a novel step-search technique. They then used subset elimination by incrementing the known value against the standards, which improved the predictive quality considerably. Qasim, Assif and Alnaqi [1, 5] used decision tree classification, one of the novel data mining procedures, and the CRISP data mining approach to build their proposed model. That study proposes using the decision tree classifier on historical stock prices to produce decision rules that give stock market buy or sell recommendations. The predicted future value was not very accurate, as many factors influence it, including political events, demonetization, and other financial setbacks. Using historical time series stock market data and data mining strategies, Amgde and Kulgarni [1, 6] developed a prediction model for forecasting share market patterns based on technical analysis. The experimental results showed the capability of the ARIMA model to predict stock value indices on a short-term basis. A stock market channels investments and savings in ways that improve the public economy’s effectiveness.
R is a programming language and a graphics and processing environment; RStudio lets the user run R scripts in a more user-friendly setting. Navale et al. [1, 7] used artificial intelligence and data mining to predict results precisely. In automation, most analysts have used such methods to achieve precision; furthermore, the limits and execution were improved using the power of data mining, combining data mining and human intelligence for better performance. Desai and Gandhi [1, 8] introduce a novel text mining approach to measure the impact of news on stocks. They demonstrated a model that predicts changes in stock value in proportion to the impact of non-quantifiable data representing wealth rather than mere numbers. This advance helps financial agents with models newly developed from historical data that is most likely to have an impact. Prasanna and Ezhilmaran [1, 9] presented their work on finding new stock costs; in that survey, the authors attempted to produce remarkable results using data mining techniques to foretell the stock price, contributing to the field of stock prediction and prompting others to study its performance. The authors of [1, 10] (P. K. Sahoo and Krishna Charlapally) investigated the use of an auto-regressive approach to forecast stock rates; the auto-regression design is used because of its simplicity and broad acceptability, and they also researched the auto-regressive model’s effectiveness. The Moore–Penrose method is used to estimate the regression
coefficients. What’s more, they additionally concentrated on expectation precision by contrasting the anticipated qualities throughout some stretch of time with the real qualities. Charkha [11] used the typical neural network in conjunction with the stock market learning algorithm [12, 13] and the neruo network patterns for prediction. In this review, key parameters [14] and secret datas [15] are not used as a criteria for prediction [16] and hence it had broken the concept of data selection [17] in stock market analysis for future price. The high end concept of sigmoidal [18] neuto function with variable bias can also be used for classification.
3 Proposed Work Most of the existing models do not include external parameters such as political, social, and environmental factors. In this model, these external factors are considered so that the accuracy of the system can be improved considerably, along with various other factors that affect stock market behavior. The power of data mining is used to select the required data, and soft computing methods are used to fine-tune the accuracy of the prediction. The research has been carried out by collecting the historical data of the scrip named CITI (City Union Bank) from the NSE website (Fig. 1). Fig. 1 Schematic system architecture
Stock Price Prediction Using Data Mining …
Fig. 2 Proposed model for predicting the stock price
3.1 Procedure
Step 1: Stock data collection: The required data is collected from the NSE website for a period of six months and downloaded as an Excel sheet.
Step 2: Data selection: The required parameters are selected and the other parameters are eliminated; this is the process of selecting valid data for stock price prediction.
Step 3: An effective classification method is applied: the k-means data mining algorithm is implemented.
Step 4: The resulting values of the different algorithms are then trained by the soft computing model, i.e. the LSTM-based method.
Step 5: By combining the results of the data mining and soft computing approaches, the future stock price is found.
3.2 Proposed Model
The proposed model selects the required data from the data set. Among the parameters, twelve effective parameters were selected based on technical analysis and on their impact on the future prediction. The following data mining techniques were implemented separately on the selected data and the output was noted (Fig. 2). Data selection is done by means of a technical analysis forum; the novel clustering technique from data mining is then executed, and the novel LSTM algorithm is finally used to predict the future outcome of the stock.
3.3 Clustering Algorithm
The clustering algorithm implemented for the selected data is of the k-means type, and the results are tabulated as a new data set. It is executed as follows.
1. Cluster the given data into k known groups.
2. Select random cluster centers.
3. Assign each object to its nearest cluster.
4. Calculate the mean of all objects in every cluster.
5. Iterate steps 2-4 until the cluster assignments no longer change.
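The steps above can be sketched in a few lines; this is an illustrative NumPy implementation of the pattern, not the authors' code.

```python
import numpy as np

def kmeans(data, k, iters=100, seed=0):
    """Illustrative k-means following the five steps above."""
    rng = np.random.default_rng(seed)
    # Step 2: select random cluster centers from the data points.
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    labels = None
    for _ in range(iters):
        # Step 3: each object joins its nearest cluster center.
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 5: stop once the assignments no longer change.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 4: recompute each center as the mean of its members.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = data[labels == j].mean(axis=0)
    return centers, labels
```

In practice the tabulated cluster labels, rather than the raw closes, are what the next phase consumes.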
3.4 Clustering Technique
Clustering is the task of grouping a set of objects so that objects in the same group (called a cluster) are more similar to each other than to those in other groups. A cluster is a group of objects that belong to the same class: similar objects are gathered in one cluster and dissimilar objects in another. In cluster analysis, the data set is first partitioned into groups based on data similarity, and labels are then assigned to the groups. It is an unsupervised learning technique used to group similar samples on the basis of their features. On this basis the results are tabulated in Table 1. With the defined input data, three more sets of training data are passed to the next phase, i.e. LSTM, a soft computing approach. The input for the proposed system is given below (Fig. 3); these figures show the various inputs that are considered for stock price prediction.

Table 1 Predictive comparison of the various models

Date       | Time series analysis | Technical analysis | Data mining (formula based) | Proposed method (data mining and soft computing) | Actual (real) value
17/05/2021 | 165.34 | 167.89 | 168.75 | 169.25 | 167.95
18/05/2021 | 167.99 | 168.93 | 169.00 | 169.23 | 170.80
19/05/2021 | 168.92 | 169.01 | 169.06 | 169.68 | 169.95
20/05/2021 | 170.00 | 170.04 | 170.94 | 171.00 | 171.65
21/05/2021 | 173.21 | 173.88 | 174.33 | 174.98 | 175.05
24/05/2021 | 174.06 | 174.58 | 175.78 | 176.02 | 176.35
25/05/2021 | 171.84 | 172.09 | 172.64 | 173.01 | 173.10
26/05/2021 | 171.95 | 172.94 | 173.91 | 174.11 | 174.35
27/05/2021 | 170.03 | 171.96 | 172.89 | 173.39 | 173.90
28/05/2021 | 172.01 | 172.05 | 172.91 | 173.67 | 173.75
Fig. 3 Input for stock price prediction
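The scaling and windowing steps of the procedure can be sketched as follows; the function names are illustrative, not from the paper.

```python
import numpy as np

def scale_minmax(prices):
    """Scale a price series into the range 0-1 (min-max normalisation)."""
    lo, hi = prices.min(), prices.max()
    return (prices - lo) / (hi - lo)

def make_windows(series, steps=60):
    """Build samples with sixty time stamps as input and one output each."""
    X, y = [], []
    for i in range(steps, len(series)):
        X.append(series[i - steps:i])  # the previous `steps` closes
        y.append(series[i])            # the next close, to be predicted
    return np.array(X), np.array(y)
```

Each row of X then feeds the first LSTM layer, with the corresponding y value as the training target.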
3.6 Implementation of Novel LSTM Approach A framework which, with the aid of long short time memory, learns online to anticipate close stock costs. The Long-Short-filled Term Memory is a preliminary example for the RNN methodology for deeper learning purposes, LSTM has input associations, as opposed to traditional feed-in neural network approach. This is one of the information technique used. But also on all types of file such as (e.g. sound
206
R. Suganya and S. Sathya
or movie file) for information. An important step is the collection of information from the market before the processing of the data. In this proposed framework the import of data from advertising clearance organizations such as BSE (Bombay Stock Exchange) and NSE is the main phase (National Stock Exchange). In order to be separated from multiple views, the data set used in the market expectancies must be used. Additionally, the information set adds up to enhance the data set with additional outside information. Most of the information includes the costs of stock for the previous year. Python packages are available for the stock price prediction. The following phase is the processing of the data; pre-processing is an important step forward in information mining, which requires the modification of rough information into a basic setup. The material obtained from the source is contradictory, fragmentary and contains errors. The pre-processing procedure will purify the information; the highlights have to be scaled in order to limit the factors. The model preparation includes cross-approval, a well-founded, anticipated model implementation with the information on preparation. In the computation itself, the target of tuning models is to unequivocally tune the estimation preparing and training.
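To make the gating behind an LSTM's feedback connections concrete, here is a single LSTM time step written out in NumPy. The stacked weight layout is one common convention, and the weights are illustrative, not the trained model from this work.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step: gates decide what the cell state keeps.

    W: (4n, input_dim), U: (4n, n), b: (4n,) hold the stacked
    input/forget/output/candidate parameters for a cell of size n.
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[:n])         # input gate: how much new information enters
    f = sigmoid(z[n:2 * n])    # forget gate: how much old state survives
    o = sigmoid(z[2 * n:3 * n])  # output gate: how much state is exposed
    g = np.tanh(z[3 * n:])     # candidate cell update
    c = f * c_prev + i * g     # new cell state (the "long" memory)
    h = o * np.tanh(c)         # new hidden state, fed back at the next step
    return h, c
```

The feedback of h and c into the next call is exactly what a feed-forward network lacks.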
3.7 Data Set
The required data set was derived from the official website of the NSE for a duration of eight months, from Nov 2020 to July 2021. Sample data for the script named MARUTHI SUZUKI is shown in Fig. 4.
Fig. 4 Sample data set for the script maruthi suzuki
4 Results
For example, consider the bank script named CUB. From Table 1 it is found that the proposed model of data mining with a soft computing technique proved to be accurate when compared with the other models (Figs. 5 and 6).
Fig. 5 Predicted results with the proposed model
Fig. 6 Result comparison of the predicted price vs actual price
Table 2 Performance analysis for different stocks

Stock name           | LSTM (RMS error value), existing model | Data mining with LSTM (RMS error value), proposed model
Indian Overseas Bank | 0.49                                   | 0.17
Bank of Baroda       | 0.31                                   | 0.22
City Union Bank      | 0.89                                   | 0.35
Axis Bank            | 0.56                                   | 0.12
From the above graph it is found that the proposed model gives a good range of accuracy in predicting the future price, and as the number of samples increases, the system accuracy also keeps increasing. The performance of the proposed system is compared with the results of the existing LSTM model for four different stocks, and the results are tabulated (Table 2).
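The RMS error figures in Table 2 come from a comparison of this kind; a minimal sketch, applied here to the proposed-method and actual columns of Table 1:

```python
import numpy as np

def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual closing prices."""
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

# Proposed-method and actual columns of Table 1 (CUB, May 2021).
proposed = [169.25, 169.23, 169.68, 171.00, 174.98,
            176.02, 173.01, 174.11, 173.39, 173.67]
actual = [167.95, 170.80, 169.95, 171.65, 175.05,
          176.35, 173.10, 174.35, 173.90, 173.75]
error = rmse(proposed, actual)
```

The same function, evaluated on each stock's test window, yields the per-stock values of Table 2.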
5 Conclusion
An extended and realistic research study was conducted, and the results show that the proposed model gives very good accuracy in predicting the future stock price. The experiment further shows that as the number of layers increases, the accuracy of the system also improves. Hence there is very good scope for future researchers in this field of stock price prediction.
References
1. P. Garg, S.K. Vishwakarma, An efficient prediction of share price using data mining techniques. Int. J. Eng. Adv. Technol. (IJEAT) 8(6), 3110–3115 (2019)
2. L. Zhao, L. Wang (Japan Advanced Institute of Science and Technology), Price trend prediction of stock market using outlier data mining algorithm, in 2015 IEEE Fifth International Conference on Big Data and Cloud Computing (Baylor University, 2015)
3. S. Baviskar, N. Namdev, Analyzing and predicting stock market using data mining techniques. IJIRT 2(3), 76 (2015)
4. W. Lertyingyod, N. Benjamas, Stock price trend prediction using artificial neural network techniques, in IEEE (2016)
5. Q. Al-Radaideh, A.A. Assaf, E. Alnagi, Predicting stock prices using data mining techniques, in The International Arab Conference on Information Technology (2013)
6. M.C. Angdi, A.P. Kulkarni, Time series data analysis for stock market prediction using data mining technique with R. Int. J. Adv. Res. Comput. Sci. 6(6) (2015)
7. G.S. Navale, N. Dudhwala, K. Jadhav, P. Gabda, B.K. Vihangam, Prediction of stock market using data mining and artificial intelligence. Int. J. Comput. Appl. 134 (2016)
8. R. Desai, S. Gandhi, Stock market prediction using data mining. Int. J. Eng. Dev. Res. (IJEDR) 2(2) (2014)
9. S. Prasanna, D. Ezhilmaran, A survey of stock price prediction and estimation using data mining techniques. Int. J. Appl. Eng. Res. 11(6), 4097–4099 (2016). ISSN 0973-4562
10. P.K. Sahoo, K. Charlapally, Stock price prediction using regression analysis. Int. J. Sci. Eng. Res. 6(3) (2015)
11. P.R. Charkha, Stock price prediction and trend prediction using neural networks. IEEE (2008)
12. A.S. Kumar, R. Suganya, Stock price prediction using tech news based soft computing approach. Int. J. Adv. Trends Comput. Sci. Eng. 9(2), 2049–2054 (2020)
13. S. Sathya, R. Suganya, Stock price prediction using datamining with novel computing approach. Turk. J. Comput. Math. Educ. 12(10), 2345–2351 (2021)
14. C. Anand, Comparison of stock price prediction models using pre-trained neural networks. J. Ubiquitous Comput. Commun. Technol. (UCCT) 3(2), 122–134 (2021)
15. P. Karthigaikumar, Industrial quality prediction system through data mining algorithm. J. Electron. Inf. 3(2), 126–137 (2021)
16. M. Tripathi, Sentiment analysis of Nepali COVID19 tweets using NB, SVM and LSTM. J. Artif. Intell. 3(03), 151–168 (2021)
17. J.I.-Z. Chen, K.-L. Lai, Deep convolution neural network model for credit-card fraud detection and alert. J. Artif. Intell. 3(02), 101–112 (2021)
18. S.R. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm (JSCP) 3(02), 70–82 (2021)
A Complete Analysis of Communication Protocols Used in IoT Priya Matta, Sanjeev Kukreti, and Sonal Sharma
Abstract With the proliferation of wireless networking, the miniaturization of computers, the incorporation of computing technologies into everyday objects, and the emergence of Internet connectivity everywhere, the Internet-of-Things (IoT) has become the world's most influential framework. IoT, like all other network-based paradigms, deals with communication. Besides the expansion of connectivity and the growth in the count of connected devices, there is an increased requirement for efficient and better solutions to connect devices and make them communicate without failure. To accomplish this communication, there are a number of well-defined and standardized sets of rules, known as protocols. Various researchers have defined different layers in an IoT system, and each layer has one or more protocols; in each layer, one can find a number of protocols forming a protocol family for that specific layer. These protocols are discussed in sequence corresponding to the different layers. Technical aspects such as expenses in terms of size and speed, and suitability to the application domain, are elaborated in detail. Some of the majorly discussed protocols are 6LowPAN, IPv6, RPL, Wi-Fi, Bluetooth, mDNS, DNS-SD, MQTT, CoAP, AMQP, WebSocket, and OMA-DM. This paper also offers the wide-ranging and most inclusive features of the above-mentioned renowned protocols available in the market, with a proper comparison among them. Keywords Wireless-networking · Internet-of-Things · Communication · Protocols · Application domain
1 Introduction
Internet of Things is a paradigm where interrelated objects are interconnected to each other via a global network, the Internet. The things that are present in this type of
P. Matta (B) · S. Kukreti
Department of Computer Science and Engineering, Graphic Era University, Dehradun, India
e-mail: [email protected]
S. Sharma
Uttaranchal University, Dehradun, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_15
P. Matta et al.
scenario consist of circuits, electronics, software, sensors and various types of tools that are used for connecting with, as well as exchanging data with, other devices. In this era of technology, we are all surrounded by smart gadgets such as smart phones, smart watches and smart home appliances. We are always connected to the Internet by one means or another; the main reason behind this is that IoT plays a vital role in our lives, making things easier, faster, more precise and more convenient. According to Elhadi et al. [1], "the concept of a network of smart devices was discussed as early as 1982, with a modified Coke machine at Carnegie Mellon University becoming the first internet-connected appliance, able to report its inventory and whether newly loaded drinks were cold." In 1999, Kevin Ashton introduced a new term, Internet-of-Things, to define a scenario in which all the things existing in the physical world are connected via the Internet. The expansion of Internet connectivity from computing devices to comparatively unintelligent devices such as a chair, table, dryer or refrigerator is the main aim behind the concept of IoT. According to Sungeetha et al. [2], "users and things can be connected at all times and locations using IoT." According to Jacob et al. [3], "IoT is an ecosystem comprised of multiple devices and connections, a large number of users, and a massive amount of data." IoT converts everything into a "smart thing" and therefore advances every facet of our living. IoT is usually described as a "dynamic global network infrastructure with self-configuring capabilities based on standards and communication protocols." For large-scale scenarios, Hanes et al.
[4] described the paradigm as, "Internet of Things envisions a self-configuring, adaptive, complex network that interconnects things to the Internet through the utilization of standard communication protocols." According to Arseni et al. [5], "following the complete trend, the IT paradigm called IoT aims to group each technological end-point that has the ability to communicate, under the same umbrella." According to Miorandi et al. [6], "since the first day it was conceived, the IoT has been considered as the technology for seamlessly integrating classical networks and networked objects." These smart objects are interconnected with embedded electronics, computing capabilities, and the ability to sense and correspondingly actuate, and can have a unique existence in the physical as well as the information world. After introducing the concept of IoT, the paper proceeds as follows. Section 2 discusses the characteristics of IoT. Section 3 covers background and related work. Section 4 presents some of the already proposed architectures for an IoT system and identifies their layers. Section 5 discusses the protocols corresponding to those layers in detail. Section 6 concludes the paper.
2 Characteristics of IoT
The concept of "smart things" aims at implementing technological tools in each and every "thing" that physically exists and is feasible in every domain. At every
A Complete Analysis of Communication Protocols …
level and aspect of connected things, IoT should be considered as a major constituent. According to Chen and Kunz [7], since the IoT is a set of tools and technologies, it is necessary to take into account each level and aspect of a connected object, modeled as a system in its own right. Although various components assemble together to design and develop an IoT system, the complete IoT system has some key characteristics, which are mentioned below.
Connectivity: In IoT, connectivity means all devices are able to communicate with each other and exchange data among themselves. They can be connected via any type of network; they are not required to be connected through predominantly major providers. Devices can form their own networks that exist on a much smaller and cheaper scale while still being functional; IoT creates these small networks between its system devices.
Small devices: Nowadays the size of devices is decreasing, devices are available at low cost, and they become more powerful over time. IoT works on making devices small, precise, scalable, portable, and versatile.
Sensors and actuators: Micro-Electro-Mechanical Systems (MEMS) are unavoidable components of an IoT system. Sensors and actuators are MEMS that convert functional energy to electrical signals and electrical signals to functional energy, respectively. They form a crucial component that turns a passive network of things into an active network of smart devices.
Convenient control: Devices at home, in offices, in small- or large-scale industries, hospitals, schools and institutes can be conveniently handled and controlled.
Active engagement: All connected technologies, products and services have to be actively engaged with each other to act as active components of a real-time IoT system.
Artificial intelligence: Today, IoT has made every device smart enough to know how to react or respond to a particular situation. For example, a refrigerator without butter and bread can itself order the required amount of bread and butter from the appropriate shopping complex.
3 Background and Related Work
Whenever a paradigm emerges, its components, underlying platforms and supporting technologies are the major considerations. Similarly, whenever a new system of that paradigm is to be designed and developed, the major focus is on the architecture of the system. As there is no single agreed generalized architecture of an IoT system, various researchers have proposed different architectures for different applications, and various suggestions have been made to design and develop domain-specific IoT architectures. A static architecture cannot serve conditions in which the requirements of diverse applications are completely dynamic in nature. Beier et al. [8] proposed that, "EPC global IoT architecture mainly focuses
on the RFID network and smart logistics system can be considered as solution for architecture." According to some researchers, the IoT model can easily be defined in terms of different layers. For example, Gronbaek [9] and Dai et al. [10] have proposed that "architectural model similar to the open systems interconnection (OSI) architecture are fruitful on IoT too." Gronbaek [9] also declared that "the architecture will support ubiquitous services on an end-to-end basis," deliberating it on the basis of four levels: Things layer, Adaptation layer, Internet layer, and Application layer. Tan and Wang [11] proposed "a five-layered architecture to represent an IoT system completely, consisting of Application layer, Middleware layer, Coordination Backbone layer, Access layer, Edge Technology layer." Ma [12] argued that "the architecture of IoT should be studied from the viewpoint of users, network providers, application developers and service providers, providing the basis for defining a wide variety of interfaces, protocols and standards." Jing et al. [13] proposed that "a three-layered architecture is enough to handle all the tasks of an IoT system. The three layers are namely: Perception layer, Transportation layer and Application layer." Similarly, Bandyopadhyay and Sen [14] proposed that "a generic five-layer architecture consisting of edge technology layer, access gateway layer, internet layer, middleware layer, and application layer, can describe the overall design of IoT." Sarkar et al.
[15] offered that "the functionalities of IoT infrastructure are grouped into three layers, which are: (1) virtual object layer (VOL); (2) composite virtual object layer (CVOL); and (3) service layer (SL)." According to Ning and Wang [16], "a centralized web-resource-based architecture can be generated for decoupling the application development from the domain of heterogeneous devices." They gave the concept of using a thin client, where a thin server is used as a server and does not contain any application logic; such an IoT model can therefore support a variety of IoT applications. Castellani et al. [] proposed a specialized architecture, designed and developed for applications in smart offices. It includes a smart door for authenticated entry using unique identification techniques such as RFID and a secure network. The proposed model is composed of three types of nodes, namely, Base Station Node (BSN), Mobile Node (MN), and Specialized Node (SN). Yashiro et al. [17] proposed the uID architecture, which can be regarded as a well-defined platform for things-oriented as well as semantic-oriented IoT systems. Its two main constituents are uCode and the uID database. According to them, the "concept of uIDCoAP design is to mitigate the burdens of manufacturers to add IoT communication functions to their existing products." Ungurean [18] proposed an architecture based on OPC.NET specifications, built on two main modules: the data server and the HMI application. According to them, "the data server acquires data from a network of sensors (fieldbus) and sends commands to the actuators (such as relays, electro valves, etc.) that are connected on the fieldbuses." According to Rahman et al. [19], the architecture of IoT consists of five layers, namely: physical layer, MAC layer, adaptation layer, network layer, and application layer.
4 IoT Architecture
Many researchers have proposed a three-layered architecture, while others have proposed four-layered and five-layered architectures too. The different architectures proposed by different researchers [20–23] are given below (Figs. 1, 2, 3 and 4). The three-layer architecture is the most widely accepted IoT architecture, covering almost all the functionalities of an IoT system. As the name suggests, it has three layers, namely the perception layer, the network layer, and the application layer. The perception layer is also referred to as the physical layer, as it deals with all the hardware components and physical devices; it is sometimes also termed the sensing layer, as it contains the sensors that are embedded into physical things. The main task of this layer is to gather data from smart devices and transfer the collected data to the network layer. The network layer forms the second layer of the architecture and basically acts as an interface between the perception layer and the application layer. The major issues of consideration in this layer are connection technologies, bandwidth, and secured connections; connectivity may be wired or wireless, and a connection may be simplex, half duplex or full duplex. The application layer forms the last layer; it receives the data from the network layer, then analyzes and processes it to make decisions. This analysis sends the response and result back to the perception layer via the network layer, to act accordingly.
Fig. 1 Three-layered architecture [20]
Fig. 2 Three-layered architecture [21]
Fig. 3 Three-layered architecture [22]
Fig. 4 Four-layered architecture [23]
5 IoT Protocols
Whenever a system is composed of multiple layers, there are some well-defined rules of communication among those layers. Each layer has to follow a set of rules to communicate with the other layers as well as with its peer on the other end. Communication among smart devices also requires such rules. To understand and accomplish this communication, there are a number of well-defined and standardized sets of rules, referred to as protocols. They enable devices embedded in different systems and using different networks to communicate with each other. The more massive and complex the Internet-of-Things system, the more the protocols are required to ensure seamless and flawless communication. In a general three-layered IoT architecture, one can find a number of protocols forming a protocol family for each specific layer. In the previous section, we identified a number of layers; each layer has its own set of rules and regulations, or more precisely, protocols. These protocols are discussed in sequence corresponding to the different layers.
CoAP: The Internet Engineering Task Force (IETF) designed and developed the Constrained Application Protocol (CoAP). According to R6, "CoAP has been defined as a technology enabler to allow applications to interact with physical objects." CoAP also enables constrained devices to be merged into the IoT via constrained networks; this integration is possible even if network availability and bandwidth are quite low. CoAP is based on UDP, so its overall implementation is considered very lightweight. CoAP makes use of all the commands provided by HTTP and therefore efficiently supports the client/server architecture. According to Rodrigues
[24], "Where TCP-based protocols like MQTT fail to exchange information and communicate effectively, CoAP can continue to work." According to Yashiro et al. [25], "CoAP HTTP Mapping enables CoAP clients to access resources on HTTP servers through a reverse proxy that translates the HTTP Status codes to the Response codes of CoAP." Some of the well-known characteristics of CoAP are as follows:
• Can work with networks having low bandwidth.
• CoAP is a request/response protocol.
• CoAP makes proper use of synchronous as well as asynchronous messages.
• CoAP runs over UDP.
• Lowers the overheads encountered with TCP.
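CoAP's lightweight design is visible in its fixed four-byte header (RFC 7252); a small sketch of packing one in Python:

```python
import struct

def coap_header(msg_type, code, message_id, version=1, token_length=0):
    """Pack the fixed 4-byte CoAP header (RFC 7252, Section 3).

    msg_type: 0=CON, 1=NON, 2=ACK, 3=RST; code 0x01 is GET.
    """
    # Byte 0: version (2 bits) | type (2 bits) | token length (4 bits).
    byte0 = (version << 6) | (msg_type << 4) | token_length
    # Byte 1: request/response code; bytes 2-3: message ID, big-endian.
    return struct.pack("!BBH", byte0, code, message_id)

# A confirmable GET request, message ID 0x1234: just four bytes of overhead.
header = coap_header(msg_type=0, code=0x01, message_id=0x1234)
```

Compare this with the dozens of bytes a minimal HTTP request line alone consumes.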
MQTT: Message Queuing Telemetry Transport (MQTT), designed and developed by IBM, has emerged as a universal protocol for all kinds of IoT projects. It is basically meant for lightweight M2M communications. MQTT functions over TCP and is categorized as an asynchronous publish/subscribe protocol. Instead of request/response protocols, IoT systems favor publish/subscribe protocols, as the requirements of IoT projects are more easily met by them. According to [23], "Publish/subscribe protocols meet better the IoT requirements than request/response since clients do not have to request updates thus, the network bandwidth is decreasing and the need for using computational resources is dropping." Some of the well-known characteristics of MQTT are as follows:
• Low energy requirement.
• The technique used by MQTT is message passing.
• The size of data packets is very small, resulting in lower bandwidth requirements.
• MQTT is a lightweight protocol, and therefore generally implementable in all kinds of projects.
• MQTT runs over TCP.
• A well-known example: MQTT is used by Facebook.
AMQP: To facilitate encrypted messaging among organizations and applications, and asynchronous messaging over the wire, the open-source published standard Advanced Message Queuing Protocol (AMQP) was introduced. According to Yokotani and Sasaki [26], "Study shows that comparing AMQP with the REST, AMQP can send a larger number of messages per second." Because of features like portability, security, efficiency, and multichannel support, AMQP is used extensively in IoT device management and client/server messaging. AMQP's store-and-forward feature ensures reliability even after network disruptions and is its key advantage. According to Yashiro et al. [25], "It ensures reliability with the following message-delivery guarantees:
• At most once: means that a message is sent once either if it is delivered or not.
• At least once: means that a message will be definitely delivered one time, possibly more.
• Exactly once: means that a message will be delivered only one time."
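The publish/subscribe decoupling that MQTT (and brokered protocols such as AMQP) rely on can be illustrated with a tiny in-memory broker. This sketches only the pattern, not the MQTT wire protocol, and the names are ours.

```python
from collections import defaultdict

class Broker:
    """Minimal topic-based publish/subscribe broker (pattern sketch only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        # Clients register interest once instead of polling for updates.
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # The broker fans the message out; publisher and subscribers
        # never address each other directly.
        for callback in self.subscribers[topic]:
            callback(topic, payload)

broker = Broker()
received = []
broker.subscribe("sensors/temp", lambda t, p: received.append((t, p)))
broker.publish("sensors/temp", "21.5")
```

Because neither side addresses the other directly, constrained publishers can sleep between messages while the broker handles delivery.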
Some of the well-known characteristics of AMQP are as follows:
• Can ensure reliability even in a disrupted network.
• Used to facilitate encrypted messaging.
• It is an open-source published standard.
WebSocket: The WebSocket was developed as a bi-directional communication protocol, an enhancement to a standard HTTP connection. WebSocket is based on TCP and arrived with the introduction of HTML5. Communication between client and server is full-duplex and message-based. WebSocket is neither a request/response nor a publish/subscribe protocol. To establish a WebSocket session, the client initiates a handshake with the server; the handshake itself is identical to HTTP, so web servers can handle WebSocket sessions along with HTTP connections over the same port. The overhead of WebSocket messages is only 2 bytes during the session. According to Yashiro et al. [25], "It is designed for real-time communication, it is secure, it minimizes overhead and with the use of WAMP it can provide efficient messaging systems." Some of the well-known characteristics of WebSocket are as follows:
• It is an upgrade to HTTP.
• It is a bi-directional communication protocol.
• It operates over TCP.
OMA DM: The Open Mobile Alliance Device Management (OMA DM) protocol, designed for constrained devices with limited bandwidth, was released in 2003. Its specifications, developed for mobile phones, tablets, and PDAs, were the predecessor of LWM2M. OMA DM provides many functions for mobile device management and supports M2M communication through a range of transports including HTTP, WAP, and SMS. OMA-DM defines and depends heavily on the SyncML language. According to Ren et al. [27], "The typical underlying transport protocol is HTTP. This makes OMA-DM in unaltered form infeasible for constrained devices." So basically, OMA-DM should be preferred for IoT devices to support firmware updates, configuration, fault management and provisioning. Some of the well-known characteristics of OMA DM are as follows:
• Designed for constrained devices.
• Supports M2M communication.
• Dependent on SyncML.
Bluetooth: Bluetooth is the most preferred protocol for short-range communication and for building a Personal Area Network (PAN). Bluetooth transmits data over a 2.4 GHz wireless radio link and is a commonly used protocol in IoT for wireless data transfer. Bluetooth fits well where wireless data transfer must be very cheap, short-range and low-power between computing devices. Bluetooth is mainly put to work in smartphones, PAN-based smart wearables and other portable devices, where small segments of data are transferred without using large amounts of memory and power. Another version of Bluetooth, known as Bluetooth Low Energy (BLE), consumes very little energy and plays an important part in connecting IoT devices. Some of the well-known characteristics of Bluetooth are as follows:
• Uses a 2.4 GHz radio wave link for data transfer.
• Very cheap, short-ranged and low in power consumption.
• Used in smartphones, smart wearable devices, etc.
IPv6: To tackle the long-anticipated problem of IPv4 address exhaustion, the Internet Engineering Task Force (IETF) developed the IPv6 protocol. The available address space in IPv6 is 2^128 addresses, far more than in IPv4; IPv6 can facilitate up to 340 undecillion unique IP addresses. IPv6 supports auto-configuration, integrated security, and a variety of new mobility features, which in turn enable a higher degree of network complexity. The architecture of IPv6 is mostly similar to that of IPv4. IPv6 is less vulnerable to malicious activities like IP scanning simply because it provides a much larger address space.
6LowPAN: 6LowPAN is an IP-based technology that came into existence in 2007. It is an IPv6 low-power wireless Personal Area Network that defines encapsulation and header compression mechanisms. It was introduced because a low-power IoT protocol was extensively required at the time; it adapts IPv6 to such networks rather than replacing it. It has a major role in IoT wireless communication. According to Viswanath et al. [28], "It acts by supporting addresses with different lengths, low bandwidth, star and mesh topologies, battery supplied devices, low cost, large number of devices, unknown node positions, high unreliability, and long idle periods during when communications interfaces are turned off to save energy." Some of the well-known characteristics of 6LowPAN are as follows:
• IP-based technology.
• Defines encapsulation and header compression mechanisms.
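The IPv6 address-space figures above can be checked with Python's standard library:

```python
import ipaddress

# IPv6 widens the address space to 2**128 unique addresses,
# i.e. roughly 340 undecillion.
total_addresses = 2 ** 128

# The standard-library ipaddress module parses and compresses IPv6 notation.
addr = ipaddress.ip_address("2001:0db8:0000:0000:0000:0000:0000:0001")
compact = str(addr)  # leading zeros drop and the zero run collapses to "::"
```

The same compression rules are what 6LowPAN-style header compression exploits on the wire, albeit at the bit level rather than in text.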
Wi-Fi: It is a wireless protocol that operates on the 2.4/5 GHz frequency bands and was developed to replace Ethernet with wireless communication, providing easy-to-implement, easy-to-use, short-range wireless connectivity. The spectrum cost of Wi-Fi is zero. Wi-Fi is now an undeniable choice for IoT connectivity due to its zero spectrum cost and near-ubiquitous coverage, but it is not always the correct choice. Right now, 802.11n is the Wi-Fi variant commonly used in homes and businesses; it offers throughput of hundreds of megabits per second, which is good for file transmission but far too power-hungry for many IoT applications. Some of the well-known characteristics of Wi-Fi are as follows: • Easy to implement and use. • Provides throughput of hundreds of megabits per second. RPL: Routing Protocol for Low-Power and Lossy Networks is a distance-vector routing protocol which runs over IPv6 and, as its name suggests, is developed for Low-Power and Lossy Networks (LLNs). Message confidentiality and integrity are also
A Complete Analysis of Communication Protocols …
supported by RPL. This protocol works on IEEE 802.15.4. If link-layer mechanisms are not available or are inappropriate, RPL can use its own mechanisms. According to Viswanath et al. [26], “Network devices running RPL protocol are connected with no present cycles. Due to that, the Destination Oriented Directed Acyclic Graph (DODAG) was built and is called a DODAG root.” mDNS, DNS-SD: Multicast Domain Name System (mDNS) and Domain Name System Service Discovery (DNS-SD) were developed for desktop systems with practically unlimited bandwidth, so they lack optimization for smart-object networks with low data rates. mDNS allows a domain name to be mapped to a network address even without any help from servers in the local network. Using DNS resource records, DNS-SD makes it possible to discover and to advertise services in a network.
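For illustration, the DNS-SD naming convention behind service discovery can be captured in a tiny helper (a hypothetical function for this sketch, not part of any mDNS library): service instances are advertised under names of the form `<Instance>.<_service>._<proto>.<domain>.`:

```python
def dnssd_instance_name(instance: str, service: str, proto: str = "tcp",
                        domain: str = "local") -> str:
    """Build a DNS-SD service instance name of the form
    '<Instance>._<service>._<proto>.<domain>.' (the RFC 6763 layout)."""
    if proto not in ("tcp", "udp"):
        raise ValueError("protocol must be 'tcp' or 'udp'")
    return f"{instance}._{service}._{proto}.{domain}."

# A printer advertising IPP on the local link would be named, e.g.:
print(dnssd_instance_name("Office Printer", "ipp"))
# Office Printer._ipp._tcp.local.
```

An IoT node would multicast such a name over mDNS so that peers can find it without any central DNS server.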
6 Conclusion With the vast emergence of IoT in day-to-day life, it is highly required that IoT systems be highly efficient. Their efficiency is the key to their appropriate use. In any IoT system, the most critical part is communication: IoT, like all other network-based paradigms, deals with communication. To accomplish any communication correctly, there should be a well-defined and standardized set of rules, known as protocols. From time to time, various protocols have been defined and designed by researchers for IoT. Some of the well-known protocols are discussed in this work, among them 6LowPAN, IPv6, EPC, Wi-Fi, Bluetooth, mDNS, DNS-SD, MQTT, CoAP, AMQP, OMA-DM, and JSON-LD. Their technical specifications, advantages, and disadvantages are also discussed. This paper can give a brief overview to those researchers who are going to develop IoT systems for different applications.
References 1. S. Elhadi et al., Comparative study of IoT protocols. Smart Appl. Data Anal. Smart Cities (SADASC’18) (2018) 2. A. Sungheetha, R. Sharma, Real time monitoring and fire detection using internet of things and cloud based drones. J. Soft Comput. Paradigm (JSCP) 2(03), 168–174 (2020) 3. I.J. Jacob, P.E. Darney, Design of deep learning algorithm for IoT application by image based recognition. J. ISMAC 3(03), 276–290 (2021) 4. D. Hanes, G. Salgueiro, P. Grossetete, R. Barton, J. Henry. IoT fundamentals: Networking technologies, protocols, and use cases for the internet of things. Cisco Press (2017) 5. S.-C. Arseni et al., Analysis of the security solutions implemented in current Internet of Things platforms, in 2015 Conference Grid, Cloud & High Performance Computing in Science (ROLCG) (IEEE, 2015) 6. D. Miorandi, S. Sicari, F. De Pellegrini, I. Chlamtac, Internet of things: Vision, applications and research challenges. Ad Hoc Netw. 10(7), 1497–1516 (2012)
7. Y. Chen, T. Kunz, Performance evaluation of IoT protocols under a constrained wireless access network, in 2016 International Conference on Selected Topics in Mobile and Wireless Networking (MoWNeT) (IEEE, 2016) 8. S. Beier et al., Discovery services-enabling RFID traceability in EPC global networks, COMAD 6 (2006) 9. I. Gronbaek, Architecture for the Internet of Things (IoT): API and interconnect, in 2008 Second International Conference on Sensor Technologies and Applications (sensorcomm 2008) (IEEE, 2008) 10. G.P. Dai, Y. Wang, Design on architecture of internet of things, in Advances in Computer Science and Information Engineering (Springer, Berlin, Heidelberg, 2012). pp. 1–7 11. L. Tan, N. Wang, Future internet: The internet of things, in 2010 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE), vol. 5 (IEEE, 2010) 12. H.D. Ma, Internet of things: Objectives and scientific challenges. J. Comput. Sci. Technol. 26(6), 919–924 (2011). https://doi.org/10.1007/s11390-011-1189-5 13. Q. Jing, A.V. Vasilakos, J. Lu, Security of the internet of things: Perspectives and challenges. Wirel. Netw. Springer 20(8), 2481–2501 (2014) 14. D. Bandyopadhyay, J. Sen, Internet of things: Applications and challenges in technology and standardization. Wireless Pers. Commun. 58(1), 49–69 (2011). https://doi.org/10.1007/s11277011-0288-5 15. C. Sarkar, A.U. Akshay, R.V. Prasad, A. Rahim, R. Neisse, G. Baldini, DIAT: A scalable distributed architecture for IoT. IEEE Internet Things J. 2(3), 230–239 (2015). https://doi.org/ 10.1109/JIOT.2014.2387155 16. H. Ning, Z. Wang, Future Internet of Things architecture: Like mankind neural system or social organization framework? IEEE Commun. Let. 15(4), 461–463 (2011) 17. A.P. Castellani, N. Bui, P. Casari, M. Rossi, Z. Shelby, M. Zorzi, Architecture and protocols for the Internet of Things: A case study, in Proceedings of 8th IEEE International Conference on Pervasive Computing and Communication Workshops (PERCOM). 
(Mannheim, Germany, 2010), pp. 678–683 18. T. Yashiro, S. Kobayashi, N. Koshizuka, K. Sakamura, An Internet of Things (IoT) architecture for embedded appliances, in 2013 IEEE Region 10 Humanitarian Technology Conference (Sendai, 2013), pp. 314–319 19. I. Ungurean, N.-C. Gaitan, V.G. Gaitan, An IoT architecture for things from industrial environment, in 2014 10th International Conference on Communications (COMM) (IEEE, 2014) 20. A. Hakiri, P. Berthou, A. Gokhale, S. Abdellatif, Publish/subscribe-enabled software defined networking for efficient and scalable IoT communications. IEEE Commun. Mag. 53(9), 48–54 (2015) 21. P. Lea, Internet of Things for architects: Architecting IoT solutions by implementing sensors, communication infrastructure, edge computing, analytics, and security. Packt Publishing Ltd. (2018) 22. J. Frahim, C. Pignataro, J. Apcar, M. Morrow, Securing the internet of things: a proposed framework. Cisco White Paper (2015) 23. Serving at the edge: A scalable IoT architecture based on transparent computing 24. T.G. Rodrigues et al., Hybrid method for minimizing service delay in edge cloud computing through VM migration and transmission power control. IEEE Trans. Comput. 66(5), 810–819 (2017) 25. T. Yashiro et al., An Internet of Things (IoT) architecture for embedded appliances, in 2013 IEEE Region 10 Humanitarian Technology Conference (IEEE, 2013) 26. T. Yokotani, Y. Sasaki, Transfer protocols of tiny data blocks in IoT and their performance evaluation, in 2016 IEEE 3rd World Forum on Internet of Things (WF-IoT) (Reston, VA, 2016), pp. 54–57 27. J. Ren, H. Guo, C. Xu, Y. Zhang, Serving at the edge: A scalable IoT architecture based on transparent computing. IEEE Netw. 31(5), 96–105 (2017)
28. S.K. Viswanath et al., System design of the internet of things for residential smart grid. IEEE Wirel. Commun. 23(5), 90–98 (2016). https://doi.org/10.1109/MWC.2016.7721747 29. R.A. Rahman, B. Shah, Security analysis of IoT protocols: A focus in CoAP, in 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC) (2016), pp. 1–7
Image Captioning Using Deep Learning Model Disha Patel, Ankita Gandhi, and Zubin Bhaidasna
Abstract Image captioning means that natural language descriptions are generated automatically based on the content of an image. It is an important aspect of scene comprehension, since it integrates computer vision and natural language processing knowledge. Numerous methods and algorithms have been developed by researchers to increase the accuracy of image captioning. However, obtaining optimized results when captioning an image remains one of the major questions for future researchers. Furthermore, there are thousands of gray-scale images that need to be captioned. In this proposed work, different pre-trained models are used to extract features of images through a Convolutional Neural Network (CNN), for both colored and gray-scale images from the dataset; the extracted features are then fed into an LSTM, which generates captions for the images. At last, the model's accuracy on color and gray-scale images is studied to determine the model's capability in captioning both types of images. Keywords CNN—Convolutional neural network · LSTM—Long short-term memory · R, G, B—(Red, Green, Blue) · NLP—Natural language processing
1 Introduction Deep learning is a machine learning and Artificial Intelligence (AI) technique that mimics how humans acquire knowledge. Deep learning is highly useful for data scientists who are concerned with gathering, analyzing, and interpreting massive amounts of data; it speeds up and simplifies the process. Image processing [1] is a technique for performing operations on images in order to improve them or extract relevant information from them. In science and industry, D. Patel (B) · A. Gandhi · Z. Bhaidasna Parul Institute of Engineering and Technology, Vadodara, India e-mail: [email protected] A. Gandhi e-mail: [email protected] Z. Bhaidasna e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_16
D. Patel et al.
image processing [1] has been and will continue to play an essential role. A number of applications, including visual recognition, scene understanding, etc., use image processing. Caption generation [2] is a fascinating artificial intelligence challenge that involves generating a descriptive text for a given image. It uses two computer vision approaches to comprehend the image's content, as well as a language model from the field of Natural Language Processing (NLP) to convert that comprehension into words in the correct order [2]. Computer vision, NLP, and machine learning are all used to solve the problem of image captioning [3]. Caption generation is a difficult artificial intelligence task in which a textual explanation for a given photograph must be generated. Image captioning is a solution to real-life problems faced by people, especially vision- and hearing-impaired persons. This concept is not only implemented with images but can also be used with videos. There are also other applications, such as building new datasets, virtual assistants, image indexing, etc.
2 Review of Literature Here is a summary of the relevant papers, with their details in the literature survey table. In [2], the authors proposed a hybrid method that employs a multilayer CNN to produce features and an LSTM to arrange words into sentences. In that work, the CNN compares the training photos to produce a correct caption for an image. It outperformed state-of-the-art models when evaluated with BLEU metrics, a technique for evaluating model performance (Fig. 1). First, in the above figure, the images are given to a pre-trained model for feature extraction. On the other side, the captions given in the datasets are utilized and pre-processed (finding tokens, vocabulary, etc.). After that, both outputs, i.e., the extracted features as well as the processed caption data, are used while constructing the image captioning model with an RNN. At last, the caption for the respective image is generated by the model. A novel technique was presented in study [4]. The major goal was to caption an image directly from a gray-scale photograph. The features of the photos were extracted using Inception V3 as a pre-trained model. Following that, those features were fed into an LSTM (Long Short-Term Memory) model. However, it achieved only about 46% total accuracy (Fig. 2). The flow of the whole system is seen in the above diagram. Here, Inception V3 is used as the pre-trained model and its features are taken as the first input to the model, followed by Dropout and Dense layers. Second, an embedding layer is applied to the second input, followed by a Dropout layer with probability 0.5 and, last, an LSTM layer of 256 units. At last, the decoder model is built from the extracted inputs and the LSTM output, which pass through a Dense layer of 256 units with ReLU as activation function. Finally, a SoftMax layer produces the output probabilities.
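The generation step common to these CNN-LSTM pipelines, repeatedly asking the decoder for the next word given the partial caption until an end token appears, is usually a simple greedy loop. A minimal framework-free sketch (the `next_word` callable stands in for the trained model and is purely illustrative):

```python
def greedy_decode(next_word, start="<start>", end="<end>", max_len=20):
    """Generate a caption word by word: ask the model for the most likely
    next word given the partial sequence, stopping at the end token."""
    seq = [start]
    for _ in range(max_len):
        word = next_word(seq)
        if word == end:
            break
        seq.append(word)
    return seq[1:]  # drop the start token

# Toy stand-in for a trained decoder: a fixed lookup on the last word.
toy_model = {"<start>": "a", "a": "dog", "dog": "running", "running": "<end>"}
caption = greedy_decode(lambda s: toy_model[s[-1]])
print(" ".join(caption))  # a dog running
```

In a real system, `next_word` would run the LSTM over the image features plus the embedded partial sequence and take the argmax of the softmax output; beam search replaces the single greedy choice with the top-k candidates.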
Fig. 1 Architecture of image captioning [2]
The paper [1] demonstrated comparable ways of captioning both photos and videos. CNN and RNN (LSTM), as well as GAN, are utilized to caption images. For videos, CNN and RNN were employed in caption generation, and an LSTM model with GAN was used after the videos were segmented into frames of pictures to be processed. The work gave a quick rundown of picture and video captioning. In the research work [5], MLADIC is a multitask system that used a dual learning technique to maximize two linked objectives: image captioning and text-to-image synthesis. An encoder-decoder model (CNN-LSTM) was used for image captioning, and a conditional generative adversarial network (C-GAN) was used for picture synthesis. One domain is the target domain, which included the Flickr30k and Oxford-102 datasets, while the other is the source domain, which included the MS-COCO dataset. On the target domain, MLADIC performed better. In future, large-scale unpaired photos and text can be gathered from a variety of web sources, such as Pinterest for photographs and Wikipedia for content. The research work [6] introduced a new picture captioning difficulty, namely, characterizing an image under a specific theme. A cross-modal embedding of images was learned to tackle this challenge. On the MS-COCO and Flickr30K datasets, the suggested method performed well for both caption-based image retrieval and caption production. The new framework gives users more control over how captions for photographs are generated, which could lead to some interesting applications. It was found to be superior to other approaches and delivered good results on both datasets. In [7], captioning is the task of describing the content of RSIs (remote sensing images). To describe an RSI, 5 sentences were used in the caption dataset. In previous approaches, the five sentences were handled separately, which could result in unclear sentences.
To treat five sentences as a whole, a collection of words with the same topic terms should contain the same
Fig. 2 Schematic diagram of LSTM network [4]
information in all five sentences. In the training phase, a topic repository was first supplied to record the subject words. The topic terms were then obtained from the topic repository for testing purposes. Finally, the topic words were fed into a recurrent memory network to help generate sentences. The test image's topic words can also be altered. The question of how to limit the recurrence of unclear sentences will be addressed in future research. In the research work [8], to resolve these issues, an online positive recall and missing concepts mining method were offered. The losses of distinct samples were re-weighted based on their online positive recall forecasts. Missing concept mining involved two stages. An element-by-element selection procedure was used to determine the most appropriate concepts for caption generation at each time step, so that the image can be described very precisely. The relationships between concepts will be explicitly extracted in future work in order to increase performance.
The paper [3] proposed a multilayer dense attention model for image captioning. For the coding part, the image features were extracted using a Faster R-CNN; the multilayer dense attention model was decoded using LSTM-attend, which generated the description text. Parameters were optimized using policy gradient optimization from reinforcement learning. The dense attention methods were used in the coding stage to effectively avoid interference from non-salient information, and the relevant descriptive text is output in the decoding process. As a result, photos were analyzed and text was generated with ease. In the research paper [9], the old way of text generation was found obsolete: it did not capture connections such as those between famous people, institutions, and buildings. In the study, the authors focused on annotating a news image, which can aid in gaining real-world knowledge about the story behind the entire image. They proposed a fresh technique that captions the image more precisely, giving a life-like explanation by exploiting the semantic importance of the listed elements. Furthermore, it extracted text from news and examined the relationships within it; in particular, they recommended extracting information from news using a phrase correlation analysis algorithm. In the work [10], there is no optimal image-text training set. A unique method entitled sgLSTM was designed to overcome the unavoidable imbalance of information in datasets. FlickrNYC was built from the Flickr dataset, and a unique guiding textual feature was acquired using a multimodal LSTM (mLSTM) model. Image content and description were inextricably linked while training the mLSTM. The guiding information was then used as extra input to the network during sgLSTM training, together with the picture representations and corresponding ground-truth descriptions.
These inputs were passed to the multimodal block and the caption was generated. In [11], a new RSI retrieval system was introduced that generates captions by finding the relationships between objects and their attributes present in RS images. The first step was to encode the image's visual features and then generate the caption. The second step was to convert that generated caption into meaningful feature vectors. The last step was to measure the similarity between the vectors of the query picture's caption and those of archive photos, and then retrieve the photos most comparable to the query image. The authors intend to improve the captioning part in future. In [12], prior RL-based picture captioning strategies concentrated mostly on a specific policy network and reward function, an approach that is unsuitable for multilevel (keyword and phrase) and multimodal image captioning. A novel image captioning framework was suggested for optimization that can be readily combined with RNN-based captioning models, language metrics, or visual-semantic functions. It comprises two components: a multilevel policy network that adjusts word- and sentence-level rules for word generation simultaneously, and a multilevel reward function that uses both a vision-language reward and a language reward cooperatively. The research work [13] presented a supervised learning model that integrates multiple deep learning approaches in order to investigate the feasibility of capturing the difference between two images' characteristics so as to generate a language-model probability
distribution. The features of an image pair were extracted using a deep Siamese convolutional neural network, and the salient regions of the feature vector were then detected using an attention mechanism, allowing a bidirectional LSTM decoder to generate a matching and semantically associated caption sequence. The model compared a pair of images and the corresponding caption. In the paper [14], the system was typically a top-down based mechanism. These models queried image features through the LSTM hidden states rather than being optimized directly. A paradigm called VAA (Visual Aligning Attention) was presented to address issues caused by top-down mechanisms, such as losing focus and thus decreased accuracy. During the training phase, a well-designed visual aligning loss optimized the attention layer. The visual aligning loss can be calculated by directly measuring the similarity between attended image features and the embedding vectors of the corresponding words. It filtered the non-visual words in sentences out of the visual vocabulary. In [15], how to effectively increase the “vertical depth” of the encoder-decoder remains to be solved. A new technique called the Deep Hierarchical Encoder-Decoder Network (DHEDN) was created for picture captioning. In this method, the structure is composed of different encoders and decoders; in caption generation, it has the ability to combine high-level semantic vision and language. The model has three layers: the first component is an LSTM decoder, the middle layer has an encoder-decoder to improve the top layer's decoding ability, while the bottom layer has an LSTM for textual input encoding. Qualitative studies showed how this model “translates” an image into a sentence, and a visualization showed how the hidden states of the different hierarchical LSTMs evolve over time. In the research [16], a CNN is the image encoder, extracting features from the image, and an RNN is the caption decoder, generating the output caption from those features.
However, it represented only the co-attention that characterizes inter-modal relations while ignoring the self-attention that characterizes intra-modal interactions. The Multimodal Transformer (MT) model, which evolved from the Transformer model, overcomes this: it captures both inter- and intra-modal interactions in a single attention block. It can make difficult decisions and generate captions. Furthermore, the multi-view visual features introduced in the MT model can improve its performance.
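The attention blocks recurring throughout these works ([3, 14, 16]) all build on the same scaled dot-product operation. A minimal dependency-free sketch for a single query vector (purely illustrative, not any specific paper's model):

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector:
    weights = softmax(q . k / sqrt(d)); output = weighted sum of values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# One query attending over two key/value pairs: the first key matches the
# query, so the output leans toward the first value vector.
out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[10.0, 0.0], [0.0, 10.0]])
print(out)
```

Self-attention applies this with queries, keys, and values all drawn from the same modality; co-attention draws the queries from one modality (e.g. words) and the keys/values from the other (e.g. image regions).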
3 Literature Table

| Title | Journal name | Year | Data set | Algorithm | Limitations |
|---|---|---|---|---|---|
| Image captioning—a deep learning approach [2] | International Journal of Applied Engineering Research (Open Access) | 2018 | Flickr8k, Flickr30k | CNN, LSTM | Accuracy 68%; limited dataset is used; accuracy can be improved |
| DeepCap: A deep learning model to caption black and white images [4] | IEEE | 2020 | Flickr8k | CNN, LSTM | Accuracy 45.77%; accuracy can improve in future using an alternate method |
| Automatic image and video caption generation with deep learning: A concise review and algorithmic overlap [1] | IEEE Access | 2020 | Flickr8k, 9k, 30k and MS-COCO | CNN, RNN/LSTM | Shows a concise review; no implementation is done |
| Multitask learning for cross-domain image captioning [5] | IEEE Transactions on Multimedia | 2019 | Flickr30k, Oxford-102, MS-COCO | CNN, C-GAN, RNN/LSTM | Large-scale dataset needed for better performance; for unpaired data they collected from different sources (Pinterest for images, Wikipedia for text); can improve performance of the model |
| Topic-oriented image captioning based on order-embedding [6] | IEEE Transactions on Image Processing | 2019 | Flickr30k, MS-COCO | CNN, RNN/LSTM | |
| Retrieval topic recurrent memory network for RSI captioning [7] | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing | 2020 | UCM-captions, RSICD | Memory networks, RTRMN | Some images have meaningless actions, so they try to solve that error |
| More is better: precise and detailed image captioning using online positive recall and missing concepts mining [8] | IEEE Transactions on Image Processing | 2019 | MS-COCO, MS-COCO Online Test Server | FCN, MIL | Need to extract relations between concepts to improve performance |
| Multilayer dense attention model for image caption [3] | IEEE Access (Open Access) | 2019 | Chinese AI', Flickr8k, 30k, MS-COCO | Faster R-CNN, LSTM | Accuracy 68%; has to be implemented on a larger dataset |
| Context-driven image caption with global semantic relations of the named entities [9] | IEEE Access | 2020 | Good news, Breaking news | CNN, LSTM | Few outcomes have an average score compared to others |
| Self-guiding multimodal LSTM—when we do not have a perfect training dataset for image captioning [10] | IEEE Transactions on Image Processing | 2019 | FlickrNYC | mLSTM, sgLSTM | Test this model on another dataset instead of the one they have used |
| Toward remote sensing image retrieval under a deep image captioning perspective [11] | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing | 2020 | UAV image captioning dataset, RSICD dataset | CNN, RNN-LSTM | Improve the captioning block for better performance |
| Multi-level policy and reward-based deep reinforcement learning framework for image captioning [12] | IEEE Transactions on Multimedia | 2020 | Flickr30k, MS-COCO | CNN, multi-layer LSTM | Implement a multi-agent algorithm to train the model |
| CaptionNet: Automatic end-to-end siamese difference captioning model with attention [13] | IEEE Access | 2019 | Spot-the-diff baseline | Deep CNN, LSTM | More accurate features need to be obtained; accuracy is not so good |
| VAA: Visual aligning attention model for remote sensing image captioning [14] | IEEE Access | 2019 | UCM-captions, Sydney-Captions | VAA model | Accuracy 81%; can improve the VAA model for complicated images |
| Deep hierarchical encoder-decoder network for image captioning [15] | IEEE Transactions on Multimedia | 2020 | Flickr8k, 30k, MS-COCO | LSTM | More datasets can be taken to test the model's performance |
| Multimodal transformer with multi-view visual representation for image captioning [16] | IEEE Transactions on Circuits and Systems for Video Technology | 2020 | MS-COCO 2015 dataset | Multimodal transformer | Can go for a large dataset to test the model |
4 Proposed Work There are several methods and models created to caption images, yet captioning an image remains a complex task. Some researchers have implemented different methods to caption color images, while only about 1% of researchers have captioned gray-scale images. Overall, proper accuracy has not been attained due to several factors such as models, training, inputs, and outputs. So, in this proposed work, a new approach that can caption both color and gray-scale images is proposed. Additionally, this work can be further utilized on social media platforms for captioning purposes. The dataset named Flickr30k is used to train the model for now; later, a larger dataset can be opted for. The Flickr30k dataset includes 31,000 photos from Flickr, as well as 5 reference sentences per photo contributed by human annotators. The same dataset is also used for gray-scale images, by converting the current dataset's images into gray-scale. For feature extraction, pre-trained models, i.e., Inception V3, VGG16, or VGG19, are used. They are models trained on thousands of images; they are basically used to identify objects in an image and are highly accurate. First, the features are obtained from the colored image dataset. For gray-scale captioning, the current dataset is converted into gray-scale so that the model will learn how to caption gray-scale images, which have only 1 channel. The images are then separated and fed to the pre-trained (Inception) model accordingly. Color images, which have 3 channels, are given to the model in RGB form. The Inception model, however, takes a 3-channel input, hence the 1 channel of a gray-scale image is stacked 3 times. Later, the extracted features are used as input to an LSTM, which generates the caption of the image. From the above procedure, good performance of the model is obtained, and it can identify both types of images.
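The channel-stacking step described above (replicating the single gray channel three times so a 3-channel pre-trained network accepts the image) can be sketched without any deep-learning library; in practice it would be done on NumPy arrays (e.g. with `np.stack`) before feeding the pre-trained model:

```python
def gray_to_rgb(gray):
    """Turn an HxW gray-scale image (nested lists of pixel values) into an
    HxWx3 image by repeating the single channel three times."""
    return [[[px, px, px] for px in row] for row in gray]

img = [[0, 128], [255, 64]]   # tiny 2x2 gray-scale image
rgb = gray_to_rgb(img)
print(rgb[0][1])              # [128, 128, 128]
```

Because all three channels are identical, the pre-trained network sees a valid 3-channel input while the image content stays purely gray-scale.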
With proper methods, more training data and epochs, and proper techniques in the model, the performance of image captioning can be improved beyond that of other methods.
5 Observation and Result During the study of various works, many approaches and algorithms have been presented and used to increase the accuracy of picture captioning. Some attempted to improve accuracy, while others introduced new techniques for a variety of reasons, one of these goals being the creation of a new dataset. Further, the attention mechanism is applied in a few articles [3, 14, 16]; with this strategy, their major objective was to increase the model's captioning accuracy. On the other hand, other researchers utilized the encoder-decoder framework [15], which included building a new network open to distinct encoders and decoders. The research [10] that caught our attention aimed to create a new dataset, called “FlickrNYC,” from the Flickr dataset. Those researchers [10] created two approaches (sgLSTM and mLSTM) for improving caption descriptions in the existing dataset. However, no method has previously been proposed that can caption both color and gray-scale images.
6 Conclusion This paper explains what image captioning is and its benefits. A different approach to captioning colorized images and gray-scale images has been framed and tested with different datasets and pre-trained models. At last, the model's accuracy has been determined to find out whether this model is suitable for captioning both types of images. However, there are a few challenges to overcome. First, the irrelevant use of words in a caption, i.e., describing something that is not present in the scene, is the main problem. Second, evaluation is a problem, since automatic testing is not as good as human testing. Computer vision systems will become more reliable as automatic picture captioning and scene interpretation improve, making them more useful as personal assistants for visually impaired persons and in enhancing their day-to-day lives. From the above survey and study, the benefits as well as the challenges to overcome in the field of image processing are evident.
References 1. S. Amirian, K. Rasheed, T. Taha, H. Arabnia, Automatic image and video caption generation with deep learning: A concise review and algorithmic overlap. IEEE Access 8, 218386–218400 (2020) 2. D.S. Lakshminarasimhan Srinivasan, A.L. Amutha, Image captioning—A deep learning approach. Int. J. Appl. Eng. Res. Open Access 3. K. Wang, X. Zhang, F. Wang, T. Wu, C. Chen, Multilayer dense attention model for image caption. IEEE Access 7, 66358–66368 (2019)
4. V. Pandit, R. Gulati, C. Singla, S. Singh, DeepCap: A deep learning model to caption black and white images, in 2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence) 5. M. Yang, W. Zhao, W. Xu, Y. Feng, Z. Zhao, X. Chen, K. Lei, Multitask learning for cross-domain image captioning. IEEE Trans. Multimedia 21(4), 1047–1061 (2019) 6. N. Yu, X. Hu, B. Song, J. Yang, J. Zhang, Topic-oriented image captioning based on order-embedding. IEEE Trans. Image Process. 28(6), 2743–2754 (2019) 7. B. Wang, X. Zheng, B. Qu, X. Lu, Retrieval topic recurrent memory network for remote sensing image captioning. IEEE J. Sel. Topics Appl. Earth Observations Remote Sens. 13, 256–270 (2020) 8. M. Zhang, Y. Yang, H. Zhang, Y. Ji, H. Shen, T. Chua, More is better: Precise and detailed image captioning using online positive recall and missing concepts mining. IEEE Trans. Image Process. 28(1), 32–44 (2019) 9. Y. Jing, X. Zhiwei, G. Guanglai, Context-driven image caption with global semantic relations of the named entities. IEEE Access 8, 143584–143594 (2020) 10. Y. Xian, Y. Tian, Self-guiding multimodal LSTM—when we do not have a perfect training dataset for image captioning. IEEE Trans. Image Process. 28(11), 5241–5252 (2019) 11. G. Hoxha, F. Melgani, B. Demir, Toward remote sensing image retrieval under a deep image captioning perspective. IEEE J. Sel. Top. Appl. Earth Observations Remote Sens. 13, 4462–4475 (2020) 12. N. Xu, H. Zhang, A. Liu, W. Nie, Y. Su, J. Nie, Y. Zhang, Multi-level policy and reward-based deep reinforcement learning framework for image captioning. IEEE Trans. Multimedia 22(5), 1372–1383 (2020) 13. L. Yang, H. Wang, P. Tang, Q. Li, CaptionNet: A tailor-made recurrent neural network for generating image descriptions. IEEE Trans. Multimedia 23, 835–845 (2021) 14. Z. Zhang, W. Zhang, W. Diao, M. Yan, X. Gao, X. Sun, VAA: Visual aligning attention model for remote sensing image captioning. IEEE Access 7, 137355–137364 (2019) 15. X. Xiao, L. Wang, K. Ding, S. Xiang, C. Pan, Deep hierarchical encoder–decoder network for image captioning. IEEE Trans. Multimedia 21(11), 2942–2956 (2019) 16. J. Yu, J. Li, Z. Yu, Q. Huang, Multimodal transformer with multi-view visual representation for image captioning. IEEE Trans. Circuits Syst. Video Technol. 30(12), 4467–4480 (2020)
Recommender System Using Knowledge Graph and Ontology: A Survey Warisahmed Bunglawala, Jaimeel Shah, and Darshna Parmar
Abstract In recent years, the abundance of available information has made it challenging for users to choose what interests them among the many options offered, and extracting those options from raw data has likewise become a difficult task for organizations. To handle this task, the recommendation system has become an important field of research in computer science. Despite several efforts to make recommender systems (RS) more efficient and personalized, they still face issues such as cold start and data sparsity. Moreover, because much of the underlying data is designed to be readable by humans only, a computer can neither process nor interpret it. Ontology facilitates knowledge sharing, reuse, communication, collaboration, and the construction of knowledge-rich and knowledge-intensive systems, and adding semantically empowered techniques to recommender systems can significantly improve the overall quality of recommendations. There has also been considerable interest in building recommendations using knowledge graphs (KGs) as side information. This not only mitigates the issues of traditional RS but also provides a flexible structure that naturally allows the integration of multiple entities, and it helps in explaining recommended items. In this survey, we collected recently published research papers in this field. We provide fine-grained information on the topic, explain how to use an ontology approach for building a KG, and discuss the challenges faced by both RS and KG systems. Useful datasets and tools are also listed for better accessibility. Keywords Recommender system · Knowledge graph · Ontology · Top-down approach · Bottom-up approach · Challenges in KG · Challenges in RS
W. Bunglawala (B) · J. Shah Parul Institute of Engineering and Technology, Vadodara, India e-mail: [email protected] J. Shah e-mail: [email protected] D. Parmar Prof., Parul Institute of Engineering and Technology, Vadodara, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_17
237
1 Introduction

The rapid progress of digital technology has resulted in a massive increase in data. Data are generated in large quantities on social media platforms such as Twitter, Instagram, and Facebook, and research publications are also increasing day by day [1]. This growth of data brings both advantages and disadvantages. The advantage is that a lot of data is available within seconds; the disadvantage is that, amid this abundance, we are often unable to find the most relevant and required information. To overcome this disadvantage, recommendation systems have been developed and remain an active research topic. A recommendation system is a type of information filtering system that attempts to predict a user's "rating" or "preference" for an item. The task of a recommendation system can be divided into two parts: (i) estimating a prediction value for an item and (ii) recommending items to users [1]. A variety of approaches exist to accomplish this objective, the most popular being content-based recommendation systems and collaborative recommendation systems; hybrid recommendation systems merge these techniques [1–3]. Recommendation systems require constant improvement because of the exponential growth of data and knowledge. In recent years, introducing a knowledge graph as side information in recommendation systems has attracted many researchers and organizations [4, 5]. The knowledge graph was first introduced by Google in May 2012 [6]. Many definitions of the knowledge graph exist, depending on its usage. For ease of comprehension, a knowledge graph is a heterogeneous graph in which the nodes represent entities and the edges reflect relationships between them [2]. A knowledge graph has the advantage of a flexible underlying graph structure, providing a model of how everything is connected.
In this way, a knowledge graph is of great use in RS as side information and helps with explanation and the integration of large data. Ontologies are the foundation of a knowledge graph's formal semantics. They can be thought of as the graph's data schema: they serve as a formal contract between the knowledge graph's creators and its users over the meaning of the data contained within it. A user could be another person or a program that needs to interpret facts in a reliable and exact manner. Ontologies ensure that data and its meaning are understood by everyone [7]. So we can say that the ontology represents the structure or schema and supports rules and reasoning, while the knowledge graph captures the data. The motivation behind this survey is to report the latest research in the area of recommender systems, a very important field of research today and in the upcoming future. To handle the abundance of data being generated nowadays, we need a common and scalable structure, which is why the concepts of knowledge graph and ontology are considered. The study therefore includes not only basic information about such systems but also an overview of the basic steps to follow when building them, along with available datasets and tools. Moreover, this type of system is not limited to e-commerce: it
can be of major benefit in situations like COVID-19, where related data must be available within seconds for knowledge discovery or in case of emergency. We therefore reviewed research papers and articles combining these technologies.
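To make the rating-prediction task described above concrete, here is a minimal user-based collaborative-filtering sketch; the users, items, and ratings are invented for illustration and are not taken from any cited system.

```python
# Minimal user-based collaborative filtering: predict a user's rating for an
# unseen item as the similarity-weighted average of other users' ratings.
from math import sqrt

ratings = {  # user -> {item: rating}; toy data for illustration only
    "alice": {"book_a": 5, "book_b": 3, "book_c": 4},
    "bob":   {"book_a": 4, "book_b": 2, "book_c": 5, "book_d": 4},
    "carol": {"book_a": 1, "book_b": 5, "book_d": 2},
}

def cosine_sim(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    num = sum(ratings[u][i] * ratings[v][i] for i in common)
    du = sqrt(sum(ratings[u][i] ** 2 for i in common))
    dv = sqrt(sum(ratings[v][i] ** 2 for i in common))
    return num / (du * dv)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, prefs in ratings.items():
        if other != user and item in prefs:
            s = cosine_sim(user, other)
            num += s * prefs[item]
            den += abs(s)
    return num / den if den else None

print(round(predict("alice", "book_d"), 2))  # → 3.18
```

Alice's predicted rating for the unseen item lies between Bob's and Carol's actual ratings, pulled toward Bob's because Alice's profile is more similar to his.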
2 Background and Related Work

2.1 Concept of Knowledge Graph

Although several attempts have been made to establish a formal definition of the knowledge graph, none of them can be called the standard definition. The phrase "knowledge graph" can be interpreted in a variety of ways. As an alternative to a definition, the following features of a knowledge graph can be presented [8]:
• It primarily describes real-world things and their interrelationships in the form of a graph.
• It defines the classes and characteristics of entities in a schema.
• It allows for the potential interconnection of arbitrary things.
• It covers a wide range of topics.
As shown in Fig. 1, an entity is a thing present in the real world, while a concept is a collection of individuals that share a characteristic. A literal is simply a specific value or string attached to some relation, and an edge between an entity and a concept is a relation. For example, Yao Ming is an individual entity, and Basketball is a concept, since many players play basketball, such as Kobe Bryant and Stephen Curry. Yao Ming's height, "2.29 m", is a specific value and hence a literal; on the other hand, Yao Ming has a wife, Ye Li, so wife is a relation between those two entities (Fig. 1).
Fig. 1 Knowledge graph [8]
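The figure's example can be written down directly as (subject, predicate, object) triples. The following plain-Python sketch (identifiers are illustrative) shows how entities, concepts, literals, and relations map onto triples.

```python
# The Yao Ming example as (subject, predicate, object) triples.
# Entities and concepts are nodes; literals are plain values on the object side.
triples = [
    ("YaoMing", "instanceOf", "BasketballPlayer"),       # entity -> concept
    ("KobeBryant", "instanceOf", "BasketballPlayer"),
    ("StephenCurry", "instanceOf", "BasketballPlayer"),
    ("YaoMing", "height", "2.29 m"),                     # literal value
    ("YaoMing", "wife", "YeLi"),                         # relation between two entities
]

# A tiny pattern match: who are the basketball players?
players = [s for s, p, o in triples if p == "instanceOf" and o == "BasketballPlayer"]
print(players)  # → ['YaoMing', 'KobeBryant', 'StephenCurry']
```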
It is worth noting that there are two sorts of knowledge in a KG: schematic knowledge and factual knowledge [8]. Schematic knowledge consists of statements about concepts and qualities, such as (AsianCountry, subClassOf, Country), while factual knowledge consists of claims about specific instances; the triples in the graph above are all factual knowledge. The majority of a KG contains a great quantity of factual knowledge and only a minor amount of schematic knowledge. The logical foundation of knowledge graphs rests on ontology languages such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL), both recommendations of the World Wide Web Consortium (W3C). RDF can be used to represent rich and complicated knowledge about entities, properties, and relationships, while OWL can represent both schematic and factual knowledge. So, using an ontology as the basic logical foundation, a knowledge graph can be formed in one of two ways: bottom-up or top-down.
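The schematic/factual distinction can be illustrated with a one-step subClassOf inference over toy triples (the facts are illustrative, not from any cited KG): schematic statements talk about classes, factual statements about instances, and combining the two yields new facts.

```python
# Schematic knowledge (about classes) vs factual knowledge (about instances),
# with a one-step subClassOf inference over toy triples.
schematic = [("AsianCountry", "subClassOf", "Country")]
factual = [("China", "instanceOf", "AsianCountry"),
           ("Japan", "instanceOf", "AsianCountry")]

# Rule: if x instanceOf C and C subClassOf D, then x instanceOf D.
inferred = [(x, "instanceOf", d)
            for (x, _, c) in factual
            for (c2, _, d) in schematic
            if c == c2]
print(inferred)  # → [('China', 'instanceOf', 'Country'), ('Japan', 'instanceOf', 'Country')]
```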
2.2 Literature Work

We reviewed the latest work on recommender systems using knowledge graphs and ontology, in which various researchers employ various models and techniques. Some of the reviewed papers and articles are discussed below. In paper [4], the authors propose the MKR model for KG-enhanced RS, and shortly afterward another work [9] presented an extended MKR model. The system flow described there is as follows: in feature extraction, general features are extracted using an MLP layer and text features using a TextCNN. A recommendation module then takes a user u and an item v as input and predicts the probability of user u engaging with item v. A knowledge graph embedding layer supplies side information, and a cross-compression unit, consisting of a crossing part and a compression part, lets SI-MKR adaptively adjust the weights of knowledge transfer and learn the relevance between the two tasks. In another study [10], the authors applied a hierarchical design based on heterogeneous input features to recommendation systems, learning text features, behavior features, graph-structured features, and spatio-temporal features from massive data. They introduced a classification-model design for recommendation systems divided into three layers: feature input, feature learning, and output. They also discussed evaluation indexes such as mean absolute error (MAE) and root mean square error (RMSE), gave an experimental comparison, and outlined future development directions. Paper [6], by Jiangzhou Liu and Li Duan, presents the basic knowledge of recommendation systems and knowledge graphs.
They then describe the key methods used in recommendation systems with KGs, which include the path-based method, the embedding-based method, and the hybrid method; furthermore, they describe the user interest model and provide some basic future
directions, which include combination with graph neural networks, enhanced representation of the KG, and KG completion and correction. Under the same title, paper [5], by Qingyu Guo and others, categorizes knowledge graph-based recommendation methods into three categories, embedding-based, connection-based, and propagation-based methods, and discusses the pros and cons of the algorithms used in each. Useful datasets are listed and grouped into categories such as movies and books. One of the more detailed papers [11] divides KG-based RS filtering approaches into categories such as ontology-based, linked open data, embedding-based, and path-based. In its results, it distinguishes two groups: KG and Semantic Web approaches, and KG and AI methods. In the first group, the top six approaches include KG with linked data and KGE with ontologies; in the second, different filtering approaches are compared, with hybrid systems found to be the most common. Future directions mentioned by the authors are the interpretability of RS, explainable recommendation, and KG-based dynamic RS. In another paper [12], the authors note that there are two types of approach, bottom-up and top-down, and describe the bottom-up approach for knowledge graph creation. They first show an architecture with layers such as knowledge extraction, knowledge fusion, knowledge graph storage, and KG retrieval, each described in detail together with the methods and tools available for it. In paper [13], the authors create a scenic-spot knowledge graph based on ontology; the paper explains the concept of ontology and why and how it should be used to serve this purpose. They also present an architecture that includes steps such as data gathering and ontology building, entity alignment, and a knowledge graph storage tool.
They used Neo4j for storage, noting that it is one of the great databases for storing structured data in the form of a network, and they describe precision and recall metrics for evaluation; their model outperformed the string-similarity method. In paper [2], the authors argue that a graph database is more efficient and expressive, so they used a property graph, representing a multi-layer graph model, constructing a knowledge graph, and returning various top-N recommendations. They propose a five-layer model: layer one holds users and their details, layer two holds needs, layer three holds features and related details, layer four comprises all nodes related to various item specifications and associated details, and layer five comprises all nodes related to the items themselves and their associated details. The construction of layers two, three, and four can be carried out based on pre-existing knowledge. In the process, the system model is defined as a combination of different recommendation techniques and can hence be called a hybrid RS model, allowing more efficient top-N recommendation. Survey paper [14], along with its review, mentions promising future directions such as bringing more side information into the knowledge graph to enhance its power, connecting social networks to study how social influence affects recommendation, explainable recommendation, and GCNs, which are also in trend. For explainable reasoning over KGs for RS, one paper [15] introduces a new model named KPRN (knowledge-aware path recurrent network).
The model allows effective reasoning over paths to infer the underlying rationale of a user-item interaction. The authors also designed a new weighted pooling operation to discriminate the strength of different paths connecting a user with an item. The datasets used in the paper relate to music and movies, and an LSTM is used to capture sequential dependencies. In paper [16], the authors describe the combination of ontology and collaborative filtering for a MOOC recommender system. They mention the basic components of a personalized system as (1) techniques, (2) items, and (3) personalization. The proposed method is a hybrid method as mentioned above; for computing MOOC similarity, an extended cosine similarity is used, and for learner similarity, the Pearson correlation coefficient (PCC) is used. Finally, an algorithm for generating recommendations is given. We also studied papers beyond those above, some of which focus on solving KG-related problems in the COVID-19 pandemic and are worth mentioning:
• Cone-KG: A Semantic Knowledge Graph with News Content and Social Context for Studying COVID-19 News Articles on Social Media [17].
• Open Research Knowledge Graph: Next Generation Infrastructure for Semantic Scholarly Knowledge [18].
• COVID-19 Knowledge Graph: Accelerating Information Retrieval and Discovery for Scientific Literature [1].
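The two similarity measures reported for [16], cosine similarity for item (MOOC) similarity and Pearson correlation for learner similarity, can be sketched as follows; the rating vectors are invented toy profiles, not data from that paper.

```python
# Cosine similarity and Pearson correlation coefficient (PCC) on toy vectors.
from math import sqrt

def cosine(a, b):
    """Angle-based similarity; ignores differences in rating scale offsets."""
    num = sum(x * y for x, y in zip(a, b))
    den = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def pearson(a, b):
    """Correlation of mean-centered ratings; corrects for per-user bias."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = sqrt(sum((x - ma) ** 2 for x in a)) * sqrt(sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0

u = [4, 5, 1, 3]   # learner A's ratings (toy data)
v = [5, 4, 2, 3]   # learner B's ratings (toy data)
print(round(cosine(u, v), 3), round(pearson(u, v), 3))  # → 0.972 0.832
```

Pearson mean-centers each vector first, which is why it is the usual choice for comparing learners whose rating scales differ.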
3 Observation and Discussion

During the survey, we read many recently published papers and articles and found some very useful details, which we present in this section as simply as possible. We explain the basic terms and challenges that one needs to understand for this work, and we list some useful datasets and tools to get started. Most of the papers mention the traditional algorithms as content-based, collaborative, and hybrid, and then some personalized algorithms such as demographic-based, community-based, and knowledge-based algorithms. Within the knowledge-based family, several papers describe techniques or approaches to achieve recommendation using a knowledge graph, which can be divided into four categories (see Fig. 2). Among these categories, the ontology-based approach is popular because it facilitates knowledge sharing and reuse and provides semantically rich knowledge.
Fig. 2 Recommendation techniques

3.1 Ontology-Based

When creating an ontology-based knowledge graph, there are two approaches: top-down and bottom-up. Neither method is inherently better than the other; the choice depends on the view of the developer. It may be easier for the developer to follow the top-down method if they have a more systematic top-down perspective of the domain, or the bottom-up approach if they have a better understanding at the data level. A combined technique, on the other hand, is easier for most ontology developers, since it leverages notions "in the middle", which are the more descriptive concepts in a domain ontology. For simplicity, we present the bottom-up architecture for building a KG using ontology, just to introduce the terms of the overall process; the steps can be altered depending on the approach the developer uses. To build an ontology, useful software is available that helps with both construction and visualization, such as Protégé, the NeOn Toolkit, SWOOP, Neologism, and Vitro [19]. In the bottom-up approach, knowledge instances are extracted from Linked Open Data (LOD) or other knowledge resources; after knowledge fusion, the top-level ontology is built from the knowledge instances to generate the entire KG [12]. The bottom-up approach is an iterative update process that includes knowledge acquisition, knowledge fusion, knowledge storage, and retrieval. For a better understanding, consider the following architecture of the bottom-up approach. Structured data, unstructured data, and semi-structured data are the three basic sources of knowledge acquisition, as shown in Fig. 3. Knowledge extraction covers attribute, relation, and entity extraction. Following that, knowledge fusion can be defined as an iterative process in which we build the ontology and regularly review it for higher quality. NoSQL databases are the most popular choice for storing and retrieving knowledge graphs. In knowledge extraction, the extracted knowledge is usually represented in machine-readable formats such as RDF and JSON-LD. Many tools are available for knowledge extraction depending on the needs and functions, among them Stanford NER, OpenNLP, AIDA, OpenCalais, and Wikimeta.
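As a toy illustration of the entity-extraction step, far simpler than trained tools such as Stanford NER, a gazetteer lookup can tag known names in text; the gazetteer entries below are hypothetical.

```python
# A toy rule-based entity extractor: dictionary (gazetteer) lookup.
# Real extractors (Stanford NER, OpenNLP, ...) use trained sequence models.
import re

known_entities = {  # hypothetical gazetteer: name -> entity type
    "Yao Ming": "Person",
    "Shanghai": "Location",
    "NBA": "Organization",
}

def extract_entities(text):
    """Return (name, type) pairs for every gazetteer entry found in the text."""
    found = []
    for name, etype in known_entities.items():
        if re.search(re.escape(name), text):
            found.append((name, etype))
    return found

print(extract_entities("Yao Ming played in the NBA and was born in Shanghai."))
```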
While we can extract knowledge from any source, such as a website or any available record or dataset, nowadays most instances are extracted from DBpedia, YAGO, and Wikipedia.

Fig. 3 Bottom-up approach for KG using ontology [12]

For semi-structured and unstructured data sources, we need entity extraction, relation extraction, and attribute extraction.
• Entity extraction is the process of identifying an entity in a large amount of data and categorizing it into predetermined categories such as person, place, or location.
• After entity extraction, the relationships among those entities are analyzed to extract relations conceptually.
• Attribute extraction defines the intensional semantics of an entity and is important for defining the entity's concepts more clearly.
The purpose of knowledge fusion is to achieve entity alignment and ontology creation, and it is an iterative process. Entity alignment is also called entity matching; its goal is to determine whether or not various entities refer to the same real-world object. It is worth noting that entity alignment typically relies on external sources such as manually created corpora or Wikipedia links. After that comes the ontology construction and evaluation step, in which we create the ontology and constantly evaluate it for better application performance. To ensure the KG's quality, general ontologies such as FOAF and general metadata from schema.org are required. For KG storage, the graph is often saved in a NoSQL database, with two basic storage types: RDF-based storage and graph-database storage. The benefit of RDF is that it improves the efficiency of querying and merge-joins for triple patterns; better query results, on the other hand, come at a high expense in storage capacity. Popular RDF-based stores include 4store, RDF Store, TripleT, and so on, and most native storage systems provide SPARQL or similar query languages. Graph-based storage, on the other hand, has the advantage of providing excellent graph query languages and supporting a wide range of graph mining methods, but it also has drawbacks such as slow knowledge updates, expensive maintenance costs, and distributed knowledge inconsistency. SPARQL is the most popular query language for retrieval, and practically every large-scale knowledge graph system has a SPARQL query endpoint. SPARQL produces output in JSON, JSON-LD, RDF/XML, XML, CSV, RDF/N3, and many more formats, practically all of them machine readable. Machine-readable formats necessitate visualization tools; because some query return formats are text-based, browser-based visualizations are the most popular, including IsaViz, RDF Gravity, DBpedia Mobile, and OpenLink Data Explorer. Because ontology is based on description logic, knowledge retrieval is primarily based on logic principles.
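Triple-pattern retrieval of the kind SPARQL performs can be sketched in a few lines of Python; the property names and data below are hypothetical, with the equivalent SPARQL query shown in the comments.

```python
# A SPARQL-style basic graph pattern evaluated over an in-memory triple list.
# Equivalent SPARQL (hypothetical ex: prefix):
#   SELECT ?player ?height
#   WHERE { ?player ex:instanceOf ex:BasketballPlayer .
#           ?player ex:height ?height . }
triples = [
    ("YaoMing", "instanceOf", "BasketballPlayer"),
    ("YaoMing", "height", "2.29 m"),
    ("KobeBryant", "instanceOf", "BasketballPlayer"),
]

def query(triples):
    """Join the two patterns on the shared ?player variable."""
    heights = {s: o for s, p, o in triples if p == "height"}
    return [(s, heights[s]) for s, p, o in triples
            if p == "instanceOf" and o == "BasketballPlayer" and s in heights]

print(query(triples))  # → [('YaoMing', '2.29 m')]
```

KobeBryant matches the first pattern but has no height triple, so, as in SPARQL's default join semantics, he is not in the result.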
3.2 Challenges in Recommender System

The most common issues associated with the development of traditional recommendation systems are cold start and data sparsity [2, 20]. The cold-start problem occurs when the system cannot infer any information about a user or item, typically when a new user or a new item is added to the catalog. In this situation, we cannot predict the new user's taste because no information is available; furthermore, due to insufficient and erroneous results, users are less inclined to rate or purchase items. To avoid the cold-start problem, numerous methods have been suggested, including (a) asking users to rate some items at the start, (b) asking users to indicate their likings, and (c) recommending items based on user demographic data. There may also be sleepers in some circumstances: sleepers are items that are good but not rated. These can be managed by employing metadata or content-based solutions, such as item entropy or item personalization, or by using Linked Open Data (LOD), which eliminates the need for consumers to supply explicit input. Data sparsity can be understood as follows: suppose we form clusters of similar data and recommend products based on those clusters. As more and more variables are included in the dataset, i.e., with a huge amount of data, noise and uncertain data also increase; the data become more uniform, and we struggle to extract anything useful from them. To overcome this issue, techniques such as multidimensional models, SVD techniques, and demographic filtering can be used.
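Data sparsity can be made concrete by measuring the fraction of unobserved cells in a user-item rating matrix; the matrix below is invented for illustration, and real-world matrices are typically far sparser.

```python
# Sparsity of a toy user-item rating matrix: fraction of unobserved cells.
# None marks a missing rating (the user never rated that item).
ratings = [
    [5, None, 3, None, None],
    [None, 4, None, None, 1],
    [2, None, None, None, None],
]

cells = sum(len(row) for row in ratings)
observed = sum(1 for row in ratings for r in row if r is not None)
sparsity = 1 - observed / cells
print(f"{sparsity:.0%}")  # → 67%
```

Even this tiny example is two-thirds empty; production catalogs with millions of items routinely exceed 99% sparsity, which is what makes neighborhood-based methods struggle.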
3.3 Challenges in Knowledge Graph

The most common challenges in knowledge graphs are knowledge completion, harmonization of datasets, and knowledge alignment [8, 21].
Knowledge Completion: A knowledge graph is incomplete when there is a dashed line in the graph, that is, when there appears to be a possible relationship between two entities given the solid facts. Completing a KG is a very challenging task for researchers, which is why knowledge embedding is an active research area in this field. Embedding-based techniques, however, ignore the symbolic compositionality of KG relations, which limits their application in increasingly complicated reasoning tasks; to overcome this, other approaches such as multi-hop paths are being developed [22]. Because of the sheer amount of data and relations, some fraction of incompleteness will always remain in a KG.
Harmonization of Datasets: The ability to harmonize or integrate data from many sources is crucial to the creation of semantic knowledge graphs. However, different authors use different names to describe the same subject, a typical problem that makes it possible to confuse one entity with another. Ontologies help tackle this difficulty since they provide much more than just data harmonization: one of the functions of an ontology is to supply a standard model of knowledge associated with a specific domain, as well as a common identifier that can be used to link an entity to other things in other ontologies.
Knowledge Alignment: Knowledge graphs have become more widely available on the Web in recent decades, but their heterogeneity and multilingualism continue to limit their sharing and reuse on the Semantic Web [2]. Knowledge alignment is essentially the discovery of mappings (i.e., equivalent entities, relationships, and others) between two KGs. Both embedding and reasoning can be used for this type of challenge, but hybrid reasoning promises more encouraging results [2].
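Entity alignment of the kind described can be sketched with a naive string-similarity baseline (real aligners also use embeddings and external corpora such as Wikipedia links); the entity names below are illustrative.

```python
# A naive entity-alignment baseline: match entities across two KGs by
# normalized-name similarity. Real aligners add embeddings and context.
from difflib import SequenceMatcher

kg_a = ["Yao Ming", "United States", "NBA"]
kg_b = ["yao_ming", "USA", "National Basketball Association"]

def norm(name):
    """Lowercase and treat underscores as spaces before comparing."""
    return name.lower().replace("_", " ")

def sim(a, b):
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

def align(entities_a, entities_b, threshold=0.8):
    """For each entity in A, keep its best match in B if above threshold."""
    pairs = []
    for a in entities_a:
        best = max(entities_b, key=lambda b: sim(a, b))
        score = sim(a, best)
        if score >= threshold:
            pairs.append((a, best, round(score, 2)))
    return pairs

print(align(kg_a, kg_b))
```

The surface-form baseline links "Yao Ming" to "yao_ming" but misses "USA" and "National Basketball Association", which share almost no characters with their counterparts; this is exactly the gap that embedding- and corpus-based aligners close.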
3.4 Available Tools and Datasets

Many databases are available for storing knowledge graphs and graph data, most of them NoSQL databases; some are listed in Table 1 [23–25]. There are also many general as well as domain-specific datasets available; some popular general datasets, along with a few COVID-19 datasets, are listed in Table 2 [23, 24].
4 Proposed Idea

The latest pandemic, COVID-19, has claimed many lives worldwide, and it has increased the need for tools that enable researchers to search vast scientific corpora for specific information, visualize connections across the data, and discover related information. Several dedicated search engines have been built to meet the need for information retrieval over the scientific literature on COVID-19, such as Sketch Engine COVID-19, Sinequa COVID-19 Intelligent Search, Microsoft's CORD19 Search, and Amazon's CORD19 search. However, these search
Table 1 List of databases

| Database name   | Link                                                          | Database model |
|-----------------|---------------------------------------------------------------|----------------|
| Neo4j           | https://neo4j.com/                                            | Graph          |
| GraphDB         | https://graphdb.ontotext.com/                                 | Multi-model    |
| CosmosDB: Azure | https://docs.microsoft.com/en-us/azure/cosmos-db/introduction | Multi-model    |
| OrientDB        | https://orientdb.org/                                         | Multi-model    |
| ArangoDB        | https://www.arangodb.com/                                     | Multi-model    |
| JanusGraph      | https://janusgraph.org/                                       | Graph          |
| Virtuoso        | https://virtuoso.openlinksw.com/                              | Multi-model    |
| Amazon Neptune  | https://aws.amazon.com/neptune/                               | Multi-model    |
| Stardog         | https://www.stardog.com/                                      | Multi-model    |
| Dgraph          | https://dgraph.io/                                            | Graph          |
Table 2 List of datasets

| Dataset name                               | Description                                                                                                                     |
|--------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------|
| Kaggle: CORD-19 [26]                       | Approximately 500,000 scholarly publications on COVID-19, SARS-CoV-2, and related coronaviruses, over 200,000 of them with full text |
| Coronavirus (COVID-19) tweets dataset [27] | CSV files with the IDs and sentiment ratings of tweets about the COVID-19 pandemic; the Twitter stream is monitored in real time |
| AYLIEN: COVID-19 [28]                      | Coronavirus news dataset                                                                                                        |
| WordNet [29]                               | A free, extensive lexical database of English offered by Princeton University [8]                                               |
| YAGO [30]                                  | A massive semantic knowledge base built from Wikipedia, WordNet, and GeoNames                                                   |
| DBpedia [31]                               | A community-driven effort to extract structured content from the resources of different Wikimedia projects                      |
| Wikidata [32]                              | A free, multilingual dataset that collects structured data to support Wikimedia Commons and Wikipedia                           |
| Google KG [3]                              | Google's Knowledge Graph contains millions of entries describing real-world entities                                            |
engines return thousands of search results while overlooking inherent relationships such as citations and subject topics [11], and they do not provide tools to visualize relationships, which can be beneficial for knowledge discovery. We therefore need a system built specifically for knowledge discovery and information retrieval, and we should use every unique kind of data we can gather, such as scientific data and social media data. To build this kind of system, the proposed flow diagram is given below. Once enough information about both techniques is in hand, data gathering and data extraction depend on the target system; to build a system for the COVID-19 situation, some useful available datasets are listed in the previous section, along with the details on building a knowledge graph using ontology (Fig. 4).
5 Conclusion and Future Work

With so much data available, we need a fine-grained recommendation system that helps us discover knowledge as efficiently as possible; in situations like COVID-19, it can be very helpful for knowledge discovery and retrieval. To enrich the recommendation system with knowledge, we also need a common structure, and a graph structure like the knowledge graph can handle different types of data easily and efficiently. The proposed system therefore uses a KG for RS, and a basic ontology-based approach with supporting techniques has been described to achieve this task. Some popular databases and datasets have also been listed, along with COVID-19 datasets that can be helpful during a pandemic. For future work, a generalized KG can be constructed using these techniques and applied to a COVID-19 scientific-literature and/or social-media recommender system to help in a pandemic situation. This paper can also serve as a reference for building similar ontology-based applications with different goals and datasets.
Fig. 4 Proposed flow diagram
References

1. C. Wise, V.N. Ioannidis, M.R. Calvo, X. Song, G. Price, N. Kulkarni, G. Karypis, COVID-19 knowledge graph: accelerating information retrieval and discovery for scientific literature. arXiv preprint arXiv:2007.12731 (2020)
2. A.A. Patel, J.N. Dharwa, An integrated hybrid recommendation model using graph database, in 2016 International Conference on ICT in Business Industry & Government (ICTBIG) (IEEE, 2016)
3. Bluepi, Classifying different types of recommender systems, https://www.bluepiit.com/blog/classifying-recommender-systems/
4. H. Wang, F. Zhang, M. Zhao, W. Li, X. Xie, M. Guo, Multi-task feature learning for knowledge graph enhanced recommendation, in The World Wide Web Conference (2019), pp. 2000–2010
5. Q. Guo et al., A survey on knowledge graph-based recommender systems. IEEE Trans. Knowl. Data Eng. (2020)
6. J. Liu, L. Duan, A survey on knowledge graph-based recommender systems, in 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), vol. 5 (IEEE, 2021), pp. 2450–2453
7. Ontotext, What is a knowledge graph, https://www.ontotext.com/knowledgehub/fundamentals/what-is-a-knowledge-graph/
8. W. Li, G. Qi, Q. Ji, Hybrid reasoning in knowledge graphs: Combing symbolic reasoning and statistical reasoning. Semant. Web 11(1), 53–62 (2020)
9. Y. Wang, L. Dong, H. Zhang, X. Ma, Y. Li, M. Sun, An enhanced multi-modal recommendation based on alternate training with knowledge graph representation. IEEE Access 8, 213012–213026 (2020)
10. H. Wang, Z. Le, X. Gong, Recommendation system based on heterogeneous feature: A survey. IEEE Access 8, 170779–170793 (2020)
11. J. Chicaiza, P. Valdiviezo-Diaz, A comprehensive survey of knowledge graph-based recommender systems: Technologies, development, and contributions. Information 12(6), 232 (2021)
12. Z. Zhao, S.-K. Han, I.-M. So, Architecture of knowledge graph construction techniques. Int. J. Pure Appl. Math. 118(19), 1869–1883 (2018)
13. W. Zeng, H. Liu, Y. Feng, Construction of scenic spot knowledge graph based on ontology, in 2019 18th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES) (IEEE, 2019)
14. P.S. Sajisha, V.S. Anoop, K.A. Ansal, Knowledge graph-based recommendation systems: The state-of-the-art and some future directions
15. X. Wang, D. Wang, C. Xu, X. He, Y. Cao, T.S. Chua, Explainable reasoning over knowledge graphs for recommendation, in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01 (2019), pp. 5329–5336
16. K. Rabahallah, L. Mahdaoui, F. Azouaou, MOOCs recommender system using ontology and memory-based collaborative filtering, in ICEIS (1) (2018)
17. F. Al-Obeidat, O. Adedugbe, A.B. Hani, E. Benkhelifa, M. Majdalawieh, Cone-KG: A semantic knowledge graph with news content and social context for studying COVID-19 news articles on social media, in 2020 Seventh International Conference on Social Networks Analysis, Management and Security (SNAMS) (IEEE, 2020), pp. 1–7
18. M.Y. Jaradeh et al., Open research knowledge graph: next generation infrastructure for semantic scholarly knowledge, in Proceedings of the 10th International Conference on Knowledge Capture (2019)
19. W3C, Ontology editors, https://www.w3.org/wiki/Ontology_editors
20. S. Khusro, Z. Ali, I. Ullah, Recommender systems: issues, challenges, and research opportunities, in Information Science and Applications (ICISA) 2016 (Springer, Singapore, 2016), pp. 1179–1189
21. SciBite, Addressing common challenges with knowledge graphs, https://www.scibite.com/news/addressing-common-challenges-with-knowledge-graphs/
22. X.V. Lin, R. Socher, C. Xiong, Multi-hop knowledge graph reasoning with reward shaping. arXiv preprint arXiv:1808.10568 (2018)
23. S. Ji, S. Pan, E. Cambria, P. Marttinen, S.Y. Philip, A survey on knowledge graphs: Representation, acquisition, and applications. IEEE Trans. Neural Netw. Learn. Syst.
24. GitHub, totogo/awesome-knowledge-graph: A curated list of knowledge graph related learning materials, databases, tools and other resources
Recommender System Using Knowledge Graph …
251
25. C#corner, Most popular graph database, https://www.csharpcorner.com/article/most-populargraph-databases/ 26. Kaggle, COVID-19 Open research dataset challenge (CORD-19), https://www.kaggle.com/ allen-institute-for-ai/CORD-19-research-challenge 27. IEEE Data Port, Coronavirus (COVID-19) tweets dataset, https://ieee-dataport.org/open-acc ess/coronavirus-covid-19-tweets-dataset 28. Aylien, Free coronavirus news dataset—Updated, https://aylien.com/blog/free-coronavirusnews-dataset 29. Princeton University, WordNet—A lexical database for english, “What is WordNet?” https:// wordnet.princeton.edu/ 30. Yago, YAGO: A high-quality knowledge base, https://yago-knowledge.org/ 31. DBPedia, Global and unified access to knowledge graphs, https://www.dbpedia.org/ 32. Google Search Central, Google knowledge graph search API, https://developers.google.com/ knowledge-graph
A Review of the Multistage Algorithm Velia Nayelita Kurniadi, Vincenzo, Frandico Joan Nathanael, and Harco Leslie Hendric Spits Warnars
Abstract This paper reviews multistage algorithms and the various concepts and forms in which they appear in prior research. Searches for previous work were conducted using Google Scholar and the Publish or Perish software, and carried out in three steps: exploration of multistage algorithm publications, classification of multistage algorithm publications, and review of multistage algorithm publications. Based on these three steps, the literature review is limited to 26 publications. The results show that the naming of a "multistage algorithm" is usually limited to indicating the number of stages in the proposed algorithm, and that the abbreviation of its name overlaps with terms in other algorithms; for example, MSA is defined as Multistage Algorithm in some studies and expanded as Multiple Sequence Alignment in others. Keywords Multistage algorithm · Multiple sequence alignment · Literature review
1 Introduction

Applying a multistage algorithm helps to reduce the demands of the later stages when selecting among test models, by determining the ideal mix between the models of two different data sets at each calculation phase of the algorithm.

V. N. Kurniadi · Vincenzo · F. J. Nathanael
Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia
e-mail: [email protected]
Vincenzo
e-mail: [email protected]
F. J. Nathanael
e-mail: [email protected]
H. L. H. S. Warnars (B)
Computer Science Department, Graduate Program, Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_18

The multistage algorithm is applied to improve the Park Chen Yu (PCY) algorithm by utilizing several progressive hash tables to reduce the number of candidate sets. PCY itself was created by three researchers named Park, Chen, and Yu; the algorithm is used for frequent itemset mining in large datasets. The trade-off is that the multistage variant takes more passes to find the frequent itemsets. This paper sets out to learn more about the multistage algorithm in its many forms; multistage models can also be multiscale and stochastic. It explores the development and implementation of multistage algorithms, including their meaning, because there are many different implementations. The exploration is carried out in three steps:

1. Exploration of multistage algorithm publications,
2. Classification of multistage algorithm publications,
3. Review of multistage algorithm publications.
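The multistage refinement of PCY described in the introduction can be sketched in Python. The basket data, support threshold, bucket count, and the two hash functions below are illustrative assumptions, not taken from the paper:

```python
from itertools import combinations
from collections import Counter

def multistage_frequent_pairs(baskets, support, n_buckets=101):
    """Multistage (PCY-style) frequent-pair mining with two hash passes.
    The hash functions and bucket count are illustrative choices."""
    h1 = lambda p: hash(p) % n_buckets
    h2 = lambda p: hash(p[::-1]) % n_buckets          # a second, different hash

    # Pass 1: count single items and hash every pair into bucket table 1.
    item_counts = Counter(i for b in baskets for i in b)
    freq_items = {i for i, c in item_counts.items() if c >= support}
    t1 = Counter()
    for b in baskets:
        for p in combinations(sorted(set(b)), 2):
            t1[h1(p)] += 1
    big1 = {k for k, v in t1.items() if v >= support}

    # Pass 2 (the extra multistage pass): rehash only pairs of frequent
    # items whose first-pass bucket was frequent.
    t2 = Counter()
    for b in baskets:
        for p in combinations(sorted(set(b) & freq_items), 2):
            if h1(p) in big1:
                t2[h2(p)] += 1
    big2 = {k for k, v in t2.items() if v >= support}

    # Pass 3: count the surviving candidate pairs exactly.
    pair_counts = Counter()
    for b in baskets:
        for p in combinations(sorted(set(b) & freq_items), 2):
            if h1(p) in big1 and h2(p) in big2:
                pair_counts[p] += 1
    return {p for p, c in pair_counts.items() if c >= support}
```

Each extra hashing pass shrinks the candidate set before the exact counting pass, which is the memory saving the multistage variant buys at the cost of additional passes over the data.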
2 Exploration of Multistage Algorithm Publication

This exploration step uses two approaches: a search for previous publications whose titles contain "multistage algorithm", performed on Google Scholar, and a search using the Publish or Perish software. Exploration started with Google Scholar, finding previous publications whose titles contain the multistage algorithm phrase without quotation marks, in all years of publication; 582 results were returned, and the search can be seen at the following link: https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=allintitle%3A+multistage+algorithm&btnG=. A similar search performed with the Publish or Perish software also showed 582 publications with 7165 citations: 130.27 citations per year, 12.31 citations per paper, 2.59 authors per paper, an h-index of 38, and a g-index of 74. Figure 1 shows the 582 publications from 1966 to 2021, including 12 publications that have no year information; due to presentation limitations, the figure only shows 1987 to 2021. Figure 1 also shows that the papers from 2005 had the maximum total citations (1170), while papers from 1966, 1968, 1969, 1974, 1975, and 1980 have no citations. Based on the titles of these 582 results, most papers do not focus on multistage algorithms but merely use the term multistage in the algorithms they propose or discuss. Therefore, the search for previous publications was refined by searching for titles containing the multistage algorithm phrase in quotation marks: the search was repeated on Google Scholar for previous publications whose titles contain the quoted multistage algorithm phrase, in all years of publication.
The results showed 52 publications, and the search can be seen at the following link: https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=allintitle%3A+%22multistage+algorithm%22&btnG=. A similar search performed with the Publish or Perish software also showed 52 publications with a total
Fig. 1 Graph of 582 results for searching the title of the publication containing the multistage algorithm phrase without quotation marks, in all years of publication
of 1474 citations: 42.11 citations per year, 28.35 citations per paper, 2.62 authors per paper, an h-index of 16, and a g-index of 38. Figure 2 shows the 52 publications from 1986 to 2021, including one publication that has no year information; the 2009 paper had the maximum total citations (482), while papers from 1986, 2008, 2012, and 2020 have no citations. However, after a rigorous inspection of the titles of these 52 publications, it was found that most of the papers do not have multistage algorithm titles
Fig. 2 Graph of 52 results for searching the title of the publication containing the multistage algorithm phrase with quotation marks, in all years of publication
but instead only use the term multistage in the algorithms proposed or discussed in the publication. Therefore, several publications whose titles did not match "multistage algorithm" were omitted; in the end, 26 papers whose titles contain the multistage algorithm phrase remained. Figure 3 is the graphic representation of Table 1, showing the 26 publications from 1989 to 2021; the 2002 paper had the maximum total citations (51), while the 2003 and 2009 papers had no citations.

Fig. 3 Graph of 26 results for searching the title of the publication containing the multistage algorithm phrase with quotation marks, in all years of publication

Table 1 The 26 results for searching the title of the publication containing the multistage algorithm phrase with quotation marks, in all years of publication

Year     Sum of cites    Number of papers
1989     31              1
1994     27              1
1995     7               2
1996     3               2
1997     32              1
2002     51              1
2003     0               1
2007     37              1
2009     0               1
2010     4               1
2011     13              2
2013     1               1
2014     3               2
2016     35              3
2017     37              4
2018     18              1
2021     1               1
Total    300             26

Table 2 Classification of the 26 selected papers

Type of publications          Amount
International conference      9
International journal Q1      6
International journal Q2      3
International journal Q3      2
International journal Q4      1
International journal no Q    4
Doctoral dissertation         1
Total                         26

The limitation to 26 papers using the Publish or Perish software shows that these 26 publications, as seen in Table 1, have 300 citations in total: 9.38 citations per year, 11.11 citations per paper, 3.00 authors per paper, an h-index of 8, and a g-index of 17. These 26 publications are the focus of this paper; their review and classification are given in the following sections.
3 Classification of Multistage Algorithm Publication

At this step, the 26 papers on multistage algorithm implementations are classified by type of publication: international conferences and international journals ranked with the SCImago (scimagojr) journal ratings. The table and graph show the mapping of these papers. The SCImago ranking used for each paper is the journal's rating in the year of publication. In this case, three publications were published before their journals were indexed in the SCImago rank, so these publications are categorized as international journals with no Q. Table 2 shows the classification results, and Fig. 4 is the graphic representation of Table 2.
Fig. 4 Classification of 26 selected papers

4 Review of Multistage Algorithm Publication

In this step, the 26 papers are reviewed starting from the earliest publication. El-Shishiny first proposed a multistage algorithm in 1989, used for image classification implemented on a microcomputer. This algorithm has three steps in its application, namely parallelepiped classification, ellipsoidal separation, and closest-distance classification using the Mahalanobis distance. In its implementation, this multistage algorithm has the following pseudocode:

1. training example
2. parallelepiped classification
3. while (number of clusters whose pattern includes 1) {
4.   if the number of clusters whose pattern includes > 1
5.     then do the ellipsoidal separation
6.   else if the number of clusters whose pattern includes 1
7.     then do Mahalanobis distance classification
8. }
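A minimal sketch of the final stage, nearest-cluster assignment by Mahalanobis distance, assuming 2-D features and hand-specified cluster means and covariances (illustrative, not El-Shishiny's implementation):

```python
def mahalanobis2(x, mean, cov):
    """Squared Mahalanobis distance of a 2-D point from a cluster,
    using an explicit 2x2 inverse of the cluster covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mean[0], x[1] - mean[1])
    return (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))

def nearest_cluster(x, clusters):
    """Assign x to the cluster with the smallest Mahalanobis distance."""
    return min(clusters,
               key=lambda c: mahalanobis2(x, clusters[c]["mean"], clusters[c]["cov"]))
```

In the full three-stage scheme, this distance test is only reached for patterns that the cheaper parallelepiped and ellipsoidal stages could not resolve.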
Parallelepiped classification is done by training the classifier to determine the smallest and largest value of each feature in the cluster, while classification by the Mahalanobis distance between a point X and a cluster i, as a search for the nearest cluster, was introduced by Swain in 1978. Meanwhile, ellipsoidal separation separates classes with ellipsoidal domains by calculating eigenvectors and estimating elliptical parameters [1]. The multistage algorithm in Chen's 1994 paper was used to achieve an improvement by applying the concept of equilibrium, with the aim of measuring the staging effect for a polyethylene fraction with supercritical ethylene and 1-hexene. This multistage algorithm is combined with the SAFT equation in Supercritical AntiSolvent (SAS) fractionation, modeled as polyethylene with ethylene and 1-hexene, where SAFT connects the macroscopic partitions. SAS itself is a process that separates polydisperse mixtures of macromolecules with varying average molecular weights, from which the macromolecules get a chemical microstructure. While implementing the SAS process, temperature and pressure must be controlled to set the selectivity and capacity of the solvent, and the solvent composition must also be considered for the polymer solubility in weight percent. In addition, the multistage algorithm is used to predict the effect of theoretical stages on increasing the selectivity of the solvent, i.e., to reduce the size of the light fraction, where the smaller the light fraction, the greater the sensitivity to staging [2].
Furthermore, the term tomography refers to cross-sectional imaging of objects from either transmitted or reflected data, and most progress toward electromagnetic diffraction tomography has been based on the Fourier diffraction projection theorem. Kartal, a researcher from the Technical University of Istanbul, Turkey, developed a multistage algorithm based on matrix partitioning, run in parallel, to process diffraction tomography information for image reconstruction. The model partitions the reconstruction algorithm into sub-matrices, each sub-matrix is handled by a matrix inversion process, and its operation is almost similar to the approach used in nonlinear neural networks. In addition, each stage has input and output vectors; at each stage's output, the desired output vector is compared with the actual output vector, and the result is used as the output-vector prediction for the next stage [3]. This algorithm has several advantages, such as parallel computation and more resistance to numerical errors than a single-stage algorithm. Multistage algorithms are time-efficient, especially when executed in parallel, and reduce reconstruction errors. In addition, the algorithm produces a reconstruction that is more robust than the single-stage algorithm, and as applied in Digital Signal Processing it suits real-time processing, where memory limitations are overcome by assigning each stage of the multistage algorithm to its own chip [4]. A Master's thesis from the University of Southwestern Louisiana, USA, proposed an improvised multistage algorithm for task-scheduling problems in resource selection so that the execution process is faster.
This master's thesis, by Feng, discusses many types of task scheduling, such as direct scheduling, batch scheduling, static scheduling, dynamic scheduling, preemptive scheduling, and non-preemptive scheduling. Scheduling is the process of allocating resources such as network links, processors, or expansion cards, where the tasks performed can be data flows, threads, or processes. Scheduling activity, which is the basis of the computing process itself, is carried out by a scheduler, here implemented with a multistage algorithm to share resources effectively and achieve excellent quality of service [5]. Meanwhile, Pham, a researcher from the Georgia Institute of Technology, Atlanta, USA, introduced an algorithm to recognize predetermined targets in image data from a Forward-Looking Infrared (FLIR) sensor, with experiments on 147 images from the TRIM2 database using targets such as helicopters, tanks, howitzers, and vehicles. The developed approach starts with scene input and preprocessing; after that, two parallel activities are carried out, a target edge extractor and an interior target region extractor, and then a target detector is applied and the target location is obtained. Preprocessing reduces noise using a 3 × 3 morphological filtering operator, and the target edge information is extracted with an edge detector that calculates the gradient magnitude using the Sobel operator [6]. Another paper by Pham of the Georgia Institute of Technology researched the use of Constant False Alarm Rate (CFAR) detectors and morphological principles to improve detection accuracy and reduce false alarms in Automatic Target Recognition
(ATR) of Synthetic Aperture Radar (SAR) targets. The developed model focuses on efficiency in the prescreening stage, one of the three stages of the ATR problem; prescreening is the essential stage for reducing the computational load by focusing on the potential target areas of an image. The CFAR algorithm is carried out in the following steps: starting from the input scene, three processes that can run simultaneously are performed, namely the two-parameter CFAR detector, global thresholding, and target/clutter variance comparison; next, a majority filter is applied, then the shape/size target discriminator, and the target location is obtained [7]. Dharanipragada, a researcher from the Watson Research Center, Yorktown Heights, USA, presented an algorithm for finding words in human speech, implemented in two stages: the phone-ngram representation stage and the rough-to-detailed search stage. The phone-ngram representation stage, or offline data preparation, represents the speech at the phoneme level so it can be searched efficiently, using phoneme recognition through a vocabulary prefix tree derived from the speech to build an index or phone-ngram table. Meanwhile, the rough-to-detailed search, commonly called the run-time search, finds the sequence of words/phones in the speech by phone-ngram matching [8]. Soraluze, a researcher from Spain, proposed improving the performance of the KNN classifier by using a hierarchical or multistage classifier, modeled by incrementally improving classifier training to form a hierarchy and by using rejection techniques at all levels of the hierarchy.
This multistage classifier based on KNN was implemented and tested with three datasets, one from the University of California Irvine (UCI) repository and two from the Statlog project (Statlog LandSat Satellite and Statlog Shuttle), applying the Training Algorithm and the Incremental Training Algorithm (ITA). Meanwhile, the proposed Multistage Recognition Algorithm (MRA) implements the classification process with classifier training carried out in stages and positioned in a hierarchy according to the number of patterns. In addition, the model is also equipped with a Multistage Recognition Algorithm with Active Memory (MRAAM), which works like MRA but hierarchically retains the data from the previous stage. Using a multistage classifier, either MRA or MRAAM, makes the classification process faster [9]. Furthermore, Jazayeri from the University of Calgary, Canada, used a multistage approach to model dynamic power system loads and overcome the identification problem. The model parameters were estimated by formulating the dynamic load equations using a zero-order hold method followed by a second-order Nonlinear AutoRegressive Moving Average with eXogenous input (NARMAX) polynomial model. After that, a least-squares approach was used to estimate the NARMAX parameters, and the values found in the early stage were used as the starting point for a Levenberg–Marquardt optimization to compute the optimal parameters [10]. Jianshuang Cui from the University of Science and Technology Beijing, China, proposed a multistage algorithm for the Job Shop scheduling Problem (JSP) that starts by randomly generating initial variables and continues by connecting the stages until
a stop condition is found. The algorithm was successfully applied to 162 instances, showing that it is robust and simple to implement [11]. The pseudocode of the multistage algorithm is as follows:

1. Get the initial solution randomly.
2. Call the Generalized Memory Polynomial (GMP)
3. while Iteration counter < MaxTrials {
4.   Call the New Solution Generation Procedure (NSGP) to get a new solution
5.   if the new solution is feasible?
6.     then call the Improved Critical Path Algorithm (ICPA)
7.   Get the Current Best Makespan (CBM)
8.   if CBM < OptimVal?
9.     then OptimVal = CBM
10.   else Undo the swap operation using local search
11. }
12. output: OptimVal and Current Best solution
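The loop structure of the pseudocode above can be illustrated with a runnable sketch. Since the paper's GMP, NSGP, and ICPA procedures are not specified, this sketch substitutes a random-swap move and a simple permutation flow-shop makespan in their place; the job data are made up for illustration:

```python
import random

def makespan(jobs, order):
    """Completion time of a permutation flow shop: jobs[i][m] is the
    processing time of job i on machine m (a simplified stand-in for
    the JSP makespan the paper computes with ICPA)."""
    machines = len(jobs[0])
    finish = [0.0] * machines
    for j in order:
        for m in range(machines):
            start = max(finish[m], finish[m - 1] if m else 0.0)
            finish[m] = start + jobs[j][m]
    return finish[-1]

def multistage_search(jobs, max_trials=200, seed=0):
    """Skeleton of the multistage loop: generate a new solution by a
    random swap, keep it if the makespan improves, otherwise undo."""
    rng = random.Random(seed)
    order = list(range(len(jobs)))
    best = makespan(jobs, order)
    for _ in range(max_trials):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]      # NSGP-style move
        cur = makespan(jobs, order)
        if cur < best:
            best = cur                               # accept the improvement
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
    return best, order
```

The accept/undo pattern mirrors steps 8–10 of the pseudocode; a real implementation would replace the random swap with a neighbourhood defined on the job-shop disjunctive graph.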
Zamyatin from Tomsk Polytechnic University, Russia, proposed a lossless compression algorithm performed in three stages, using a wavelet transform to increase the compression ratio. The three stages are:

1. Finding the transformation coefficients by performing a wavelet transform of the initial data.
2. Forming deviation data sets by considering the functional relationship of albedo values between different image bands.
3. Using a traditional compression algorithm to compress the resulting data.
This three-stage algorithm was implemented in Borland Developer Studio 2006 without special attention to code optimization, on a computer with an Intel Pentium IV processor running at 2.8 GHz, and was tested against 10 Remote Sensing (RS) datasets such as SPOT, ADAR-5000, Airphoto, Landsat-MSS, Landsat-TM, and Flightline C1 [12]. El Attar, a Ph.D. student from Dhar El Mahraz University, in his 2011 paper investigated and proposed an algorithm for an automatic camera calibration process, where the algorithm starts with an initialization stage to get the focal length and then estimates the camera's intrinsic parameters using a multistage algorithm [13]. In the proposed multistage algorithm model, five iterations are carried out, namely:

1. The first iteration estimates the initial focal length, where the formula is run to minimize for a camera with zero tilt focus and a principal point at the center of the image.
2. The second iteration estimates the aspect ratio based on the solution obtained from the first iteration.
3. The third iteration estimates the principal point by estimating its coordinates in the image, using the previous iteration's output as an initial estimate to minimize variations with the two focal distances and the principal point.
4. The fourth iteration re-estimates the focal length, as in the initial iteration, to achieve automatic camera calibration and 3D reconstruction; the previous iteration's output is used as the initial estimate for the minimization.
5. The fifth iteration refines the camera's intrinsic parameters using the output of the fourth iteration to improve all parameters.
He Yuqing, a researcher from the research institute of the Hunan electric power company, Changsha, China, proposed a new, fast multistage algorithm for the reconfiguration of electric power distribution networks. System reliability was measured by index parameters such as the Average Service Availability Index (ASAI) and Energy Not Supplied (ENS), which were then combined with a power-loss index to configure the power distribution network. The fast multistage algorithm was developed for this optimization problem, and the reliability index is not recalculated in the earlier reconfiguration stages [14]. The researcher Nan Ho from Vietnam National University discussed a multistage algorithm divided into three stages, preprocessing, coordinating, and postprocessing, where each stage is a search algorithm. The initial stage consists of two sub-activities, Coarse Optimization (CO) and pre-optimization, whose function is to determine attractive locations. In the second stage, the transition between stages avoids overlapping, with a central coordinator acting as the transition and managing the number of instances of the algorithm. Meanwhile, there are two sub-activities in the third stage, searching for the global optimum and the local optimum, which obtain the optimal solution. This multistage algorithm is called a grid-based hybrid algorithm, formed with Latin Hypercube Sampling (LHS) and a collaboration of three agents, the Coordinator Agent (CoA), Instance Agent (InA), and Evaluator Agent (EvA), where each agent can send and receive information [15].
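The reliability indices named in He Yuqing's model above can be illustrated with the standard distribution-reliability definitions; the paper's exact formulations are not given, so the formulas and the numbers below are the textbook ones and purely illustrative:

```python
def asai(customers, outage_hours, period_hours=8760):
    """Average Service Availability Index over one year: the ratio of
    customer hours of available service to customer hours demanded."""
    demanded = sum(customers) * period_hours
    interrupted = sum(n * u for n, u in zip(customers, outage_hours))
    return (demanded - interrupted) / demanded

def ens(avg_load_kw, outage_hours):
    """Energy Not Supplied in kWh: the average load at each load point
    multiplied by its annual outage duration."""
    return sum(l * u for l, u in zip(avg_load_kw, outage_hours))
```

A reconfiguration search would evaluate these indices (plus a power-loss term) for each candidate network topology and keep the configuration with the best combined score.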
In the first stage, this multistage algorithm performs a classification process with the Linear Discriminant Analysis (LDA) algorithm to detect fricatives in speech recording sentences. In the second stage, the detection results in the first process in the form of phonemes are reclassified using the Decision Tree (DT) algorithm to eliminate false detections. This algorithm was tested on a Texas Instruments Massachusetts Institute of Technology (TIMIT) audio database corpus containing hundreds of audio sentences spoken by various speakers with different dialects, and the detection rates across the entire range of fricative phonemes were obtained. In this case, the data tested using MATLAB with the MatlabADT toolbox was on 1680 sentences by 168 different speakers where the sentences contained more than 4600 fricatives [16]. Furthermore, Wang from China’s Dalian Jiaotong University proposed a Multistage Algorithm (MA) for optimization of loading problems and capacity vehicle
routing problems (LCVRP), i.e., for solving capacitated vehicle routing problems (CVRP). The MA runs in three stages: the first stage handles loading, fitting all consumer demands into a minimum number of vehicles. The first and second stages are intended to produce the best solution; in the second stage, a parallel 2-opt algorithm solves the traveling salesman problem and schedules the consumer routes. The best solution is then optimized using the Tabu Search (TS) algorithm [17]. Guillaume Aupy from INRIA, France, examined a multistage algorithmic model for adjoint computation developed by Stumm and Walther. The Stumm and Walther model is a development of the Griewank and Walther model, but its results are still not optimal. The paper described four problems whose combined solution addresses the main problem, the Adjoint Computing (AC) problem, all aiming to minimize the makespan of the AC problem [18]. Seven lemmas are described in the paper, namely:
1. There is an optimal solution that has a structure.
2. An optimal solution for Problem 2 satisfies (P0) and (P1).
3. There is an optimal solution for Problem 2 that satisfies (P0), (P1), and (P2).
4. There is an optimal algorithm for Problem 2 that satisfies (P0), (P1), (P2), and (P3).
5. Provided the optimal algorithm for Problem 2 that satisfies (P0–3).
6. An optimal algorithm for Problem 2 satisfies (P0–3) and (P4).
7. An optimal algorithm for Problem 2 satisfies (P0–4) and (P5).
Cherneta, a researcher from Tomsk Polytechnic University, Russia, developed a Viola–Jones detector based on the AdaBoost algorithm, an image object detection algorithm that applies Haar features, and tested it on vehicle license plates. The test, for character recognition, used 5000 training images of 28 × 52 pixels captured with different camera angles, contrast, and lighting. The classification was tested using the OpenCV library on 2000 vehicle images; algorithm training took 32.4 h, and the test accuracy reached 98.21%. The algorithm runs in two stages, character segmentation and character recognition, and has a cascade classification architecture with adaptive operating principles, where each stage of the cascade uses Haar features as weak classifiers [19]. Meanwhile, Zhang from Boehringer Ingelheim, Shanghai, China, proposed a multistage AICaps model to select the best subset, implementing AICi, an "enhanced" variant of the Akaike Information Criterion (AIC), and evaluated it by Monte Carlo simulation. The AICaps model has several stages: stage 1 compares the corrected AIC (AICc) for the Sum of Squared Errors (SSE) model of minimum order 1 with AICi for all models of order greater than 1; stage 2 compares AICc for the SSE model of minimum order 2 with AICi for all models of order greater than 2; and in the last stage, a continuation of the previous ones, the candidate model is the SSE model of minimum order S − 1 or the full model of order S, and the model with the smallest AICc is selected [20].
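Zhang's selection step can be illustrated with the standard corrected AIC for least-squares models; the AICi variant itself is specific to [20], so the sketch below uses plain AICc, and the candidate models are made up for illustration:

```python
import math

def aicc(sse, n, k):
    """Corrected Akaike Information Criterion for a least-squares model
    with k parameters fitted to n observations (standard form; the AICi
    variant in [20] modifies the penalty term)."""
    return n * math.log(sse / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)

def select_model(candidates, n):
    """Pick the candidate (name, sse, k) with the smallest AICc."""
    return min(candidates, key=lambda m: aicc(m[1], n, m[2]))[0]
```

The penalty grows with the number of parameters k, so a model with only a marginally smaller SSE but an extra parameter loses to the simpler one, which is exactly the behaviour a stagewise comparison over increasing model orders exploits.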
Furthermore, Paulus from the Georgia Tech Research Institute, USA, presented an algorithm that estimates the multistage linear and nonlinear phase components of an extended-dwell-time target signal for long-duration coherent integration. This multistage algorithm uses inner products against a phase-specific dictionary to detect the signal components, and it additionally estimates unknown target phase components up to the sixth order to generate an accurate extended-dwell multiphase signal model. The first stage determines an approximation of the linear phase component of the signal, compared against a linear-phase signal model; the multistage signal model generated by the algorithm is better at maximizing the output. This multistage algorithm can be applied to target movement parameters such as lane or road environment changes and traffic acceleration conditions; its stationary limit is at the individual target scatter points, and a quiescent 4.0 s is the upper limit [21]. Furthermore, Kaveh and Ghobadi from the Iran University of Science and Technology proposed a multistage model for domain decomposition and allocation to reduce the cost and time of distributing blood components between hospitals and blood centers, applying graph partitioning via the p-median methodology and a metaheuristic optimization algorithm, the Enhanced Colliding Bodies Optimization (ECBO) algorithm. In the first stage, the multistage algorithm forms the adjacency matrix of the graph, and in the second stage it partitions the resulting graph into subdomains. The algorithm was tested on minimizing the distance between hospitals and blood centers in Tehran, Iran, which has an area of 730 km2, a population of 8.4 million, and more than 150 health centers [22].
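The first two stages of Kaveh and Ghobadi's procedure, building an adjacency matrix and splitting the graph around candidate centers, can be sketched as follows; the graph, the centers, and the use of BFS hop distance as the metric are illustrative assumptions, not the paper's actual data or distance model:

```python
from collections import deque

def adjacency_matrix(n, edges):
    """Stage 1: adjacency matrix of an undirected graph on n nodes."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A

def partition_by_centers(n, edges, centers):
    """Stage 2: BFS hop distance from every node to each candidate
    center, then assign each node to its nearest center (a p-median-style
    split of the domain into subdomains)."""
    A = adjacency_matrix(n, edges)

    def hops(src):
        dist = {src: 0}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in range(n):
                if A[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        return dist

    d = {c: hops(c) for c in centers}
    return {v: min(centers, key=lambda c: d[c].get(v, n + 1)) for v in range(n)}
```

In the full method, a metaheuristic such as ECBO would search over the choice of centers to minimize the total travel cost, with this partitioning used to evaluate each candidate set.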
A researcher named Niemi from Stockholm University, Sweden, discussed a multistage algorithm that combines all-atom molecular dynamics with Landau's mean-field theory and describes protein folding and dynamics models with atomic-level precision. This multistage model algorithm is well suited to characterizing the conformational state of intrinsically unstructured proteins; the investigation of the isolated monomeric Myc oncoprotein, which is encountered in cancer, is an example. The Landau model approach was used to investigate why the Myc monomer is unstable and to analyze its highly degenerate structural landscape. Thermal stability properties were analyzed using atomic molecular dynamics, and a group of structures was observed in which the two helical segments of the original leucine zipper lie parallel to each other [23]. Another multistage algorithm aims to accurately diagnose the heart rhythm during compressions generated by a piston-driven device used to deliver Cardiopulmonary Resuscitation (CPR) [25]. Meanwhile, Zambri from California State University, USA, proposed a multistage algorithm model to calibrate the hemodynamic model and predict the parameter values of the biophysiological system and the external stimulus model, respectively. The proposed multistage algorithm adopts a predictive approach to estimate two sets of parameters that have different properties and scales, applying a multi-step strategy that combines the Tikhonov-regularized Newton Method (TNM) and Cubature Kalman Filter (CKF) algorithms, and
A Review of the Multistage Algorithm
265
the combination of the two is called the TNM-CKF algorithm. The proposed multistage algorithm can calibrate hemodynamic models without prior knowledge of the values of the biophysiological parameters and can characterize one or more events. Moreover, different activity levels of different neurons can be distinguished by this multistage algorithm, and the model calibration recovers the control input parameters with excellent accuracy [24]. Furthermore, Isasi from the University of the Basque Country UPV/EHU, Spain, proposed a multistage algorithm that combined two artifact filters, namely a rhythm analysis algorithm based on Electrocardiogram (ECG) slope classification and Recursive Least-Squares (RLS) filtering. This study used data from 230 cardiac arrest patients treated with the LUCAS 2 mechanical CPR device. The data set consisted of 201 shockable and 844 non-shockable ECG segments; of the 844 non-shockable segments, 270 were asystole and 574 were organized rhythm. Two RLS filters were used to reduce CPR artifacts, followed by a three-stage shock/no-shock decision during mechanical piston compressions based on the standard defibrillator algorithm and an ECG slope decision. The data were randomly separated into training and testing sets of 60% and 40%, respectively [25]. Lastly, Harkat, a researcher from Batna University, Algeria, proposed a multistage algorithm that aims to remove noise from the electrocardiogram signal in two stages: in the first stage, the noise variance is estimated using Donoho's method, followed by wavelet-based baseline removal; in the second stage, an adaptive Wiener filter removes the remaining noise. A final Savitzky–Golay (SG) filter was also applied to refine the restoration [26].
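The RLS filtering used in the CPR-artifact study can be illustrated in its simplest, single-coefficient form. This is a generic textbook RLS update, not the authors' two-filter implementation; the forgetting factor, initialization, and signals below are all illustrative.

```python
def rls_single_tap(xs, ds, lam=0.99, delta=100.0):
    """Single-coefficient Recursive Least-Squares: adapt w so that w*x
    tracks the reference d, discounting old samples by forgetting factor lam."""
    w, p = 0.0, delta                   # p is the (scalar) inverse correlation
    for x, d in zip(xs, ds):
        k = p * x / (lam + x * p * x)   # gain
        e = d - w * x                   # a-priori error
        w += k * e                      # coefficient update
        p = (p - k * x * p) / lam       # update inverse correlation
    return w

# Synthetic check: the reference is exactly 3x the input, so the adapted
# coefficient should converge near 3 after a handful of samples.
xs = [1.0, -0.5, 2.0, 1.5, -1.0, 0.7, 1.2, -0.3]
ds = [3 * x for x in xs]
w = rls_single_tap(xs, ds)
print(round(w, 3))
```

The same update, extended to a vector of taps driven by compression-depth reference signals, is what lets RLS subtract the CPR artifact while the underlying rhythm passes through.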
5 Discussion and Opinion

Based on the preceding literature review of 26 publications containing the phrase "multistage algorithm," some ideas, thoughts, and comments related to those works are listed here. The need to efficiently compute the derivatives of a function arises frequently in many areas of scientific computing, including mathematical optimization, uncertainty quantification, and nonlinear systems of equations. When the first derivatives of a scalar-valued function are required, so-called adjoint computations can evaluate the gradient at a cost equal to a small constant multiple of the cost of the function itself, regardless of the number of independent variables. Such an adjoint computation can arise from discretizing the continuous adjoint of a partial differential equation, or from applying the so-called reverse or adjoint mode of algorithmic (also called automatic) differentiation to a program that computes the function. A multistage approach also organizes the Active Distribution Network (ADN) planning process, where planning is concerned with the allocation of jobs to scarce resources (machines). In revenue management, each job has a specific weight (or revenue), and the objective is to select a feasible subset of the jobs with maximum total weight. Moreover, the scheduler must choose to which machine
266
V. N. Kurniadi et al.
each accepted job should be assigned and when it should be executed, within the time interval between its release date rj and deadline dj. The goal is to maximize the total weight of the accepted jobs, and the proposed method optimizes multiple planning options in a structured way. Meanwhile, large-scale multistage stochastic 0–1 optimization problems are exceptionally hard to solve, mainly because of the number of constraints and of 0–1 variables; to improve algorithm performance on large instances, several ideas based on introducing a parallel version have been presented. Multistage stochastic programs also typically exhibit time inconsistency when the objective is not formulated in expectation. The development and widespread use of stochastic programming is closely linked to the growing computing capacity available since the field's inception. A multistage linear stochastic model was designed to optimize electricity generation, storage, and transmission investment over a long planning period. Moreover, multistage stochastic problems arise in a wide variety of real-world applications in the energy, finance, and transportation fields. Of particular interest are multistage stochastic linear programs that satisfy the following conditions:
1. The time horizon length T is finite yet possibly large (there may be hundreds of periods and stages);
2. For each time period, the set of sample realizations of the exogenous data process is finite (and relatively small);
3. The stage cost is a linear function of the decision at each stage.
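The adjoint-mode (reverse) differentiation discussed earlier in this section can be sketched with a minimal tape: one backward sweep yields every partial derivative, so the gradient costs a small constant multiple of one function evaluation regardless of the number of inputs. This toy handles expression trees whose only reused nodes are leaves (a full implementation would propagate in topological order), and all class and function names are illustrative.

```python
class Var:
    """Minimal reverse-mode AD variable: records the computation as a tape
    of (parent, local derivative) pairs for a later backward sweep."""
    def __init__(self, value, parents=()):
        self.value, self.parents, self.grad = value, parents, 0.0
    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])
    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

def backward(out):
    """Propagate d(out)/d(node) from the output back toward the leaves."""
    out.grad = 1.0
    stack = [out]
    while stack:
        node = stack.pop()
        for parent, local in node.parents:
            parent.grad += local * node.grad
            stack.append(parent)

# f(x, y) = x*y + x : one backward sweep gives both df/dx and df/dy.
x, y = Var(3.0), Var(4.0)
f = x * y + x
backward(f)
print(f.value, x.grad, y.grad)   # df/dx = y + 1 = 5, df/dy = x = 3
```

Adding more inputs adds nothing to the backward sweep beyond the operations the forward pass already recorded, which is exactly the constant-factor claim above.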
Random Matching (RM): RM is a simple and effective strategy for computing a maximal matching, and it limits the coarsening level in heuristic strategies such as the greedy algorithm. In adjoint computation, memory and disk are generic terms for a two-level storage system, modeling any platform with a dual memory hierarchy. Rhythm analysis in electrocardiography is beat analysis: filtering should reveal the underlying heart rhythm of the patient, and accordingly the filtered signal sˆecg(n) was used to classify the rhythm as shockable or non-shockable. Fuzzy numbers offer mathematical methods for determining specific solutions, and a comprehensive modular framework has been developed to derive different configurations of fuzzy numbers for fuzzy ranking. Moreover, the data-filtering-based Recursive Least-Squares algorithm used in multistage algorithms makes highly efficient use of every measured datum at each step, so it has excellent parameter estimation accuracy and a high convergence rate. A small percentage of particular nodes are called anchors or beacons; anchors know their locations because they carry GPS equipment or are deployed at known locations, while the remaining nodes are deployed at random locations without location awareness. All-atom molecular dynamics aims to simulate the time evolution of every atom in a given protein, including the solvent, delivering each atom's discrete, piecewise-linear time trajectory by solving a discretized (semi-)classical Newton's equation. An algorithm for the multistage stochastic problem first works through simpler problems to build the reader's intuition, initially using a combination of a lower bound
A Review of the Multistage Algorithm
267
and an upper bound. To generate dual bounds, a Progressive Hedging (PH) bounding approach was used; the first methodology for producing dual bounds applies PH to linear programming relaxations of an adjusted variant of the original MSSP formulation. Multistage stochastic programming (MSP) has found applications in a variety of sectors. In finance, MSP has been applied to portfolio optimization to maximize the expected return while controlling risk, as well as to asset-liability management. In the energy sector, a classical success story is hydrothermal generation scheduling for Brazil, involving the month-to-month planning of the power generation of a system of hydro and thermal plants to meet energy demand in the face of stochastic water inflows into the hydro reservoirs. Various solution approaches to dynamic optimization problems for systems described by ordinary differential equations (ODEs) or differential-algebraic equations (DAEs) have been proposed. The key quality of the full discretization approach outlined above is that the optimization is carried out over the discretized variable space subject to the discretized constraints. A special algorithm based on newly defined energy and volatility parameters (EVPs) as the primary successive approximation (SA) variables has been shown to be remarkably stable and efficient. Using the volatility parameters as SA variables typically leads to the choice of the newly defined relative S-parameters as the variables of an iterative procedure, in which Broyden's quasi-Newton method is efficiently used to accelerate convergence. Using the S-parameters, which are particular combinations of the liquid- and vapor-phase rates and the stage temperatures, avoids the difficulties associated with interactions between these variables. Furthermore, it becomes possible to solve all of the model equations without any of the internal inconsistencies that arise with other methods.
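In one dimension, Broyden's quasi-Newton update mentioned above reduces to the secant iteration: the Jacobian is replaced by a finite-difference slope that is updated from the two most recent iterates. The sketch below is a generic illustration; the cubic solved is a made-up stand-in for a stage equation, not an actual EVP/S-parameter system.

```python
def secant(f, x0, x1, tol=1e-10, max_iter=50):
    """Secant iteration: the one-dimensional special case of Broyden's
    quasi-Newton update, needing no analytic derivative of f."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if abs(f1) < tol:
            return x1
        # Broyden/secant step: approximate f' by the slope through the
        # two latest iterates, then take a Newton-like step.
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
        f0, f1 = f1, f(x1)
    return x1

# Illustrative stage equation: find t with t**3 - 2*t - 5 = 0.
root = secant(lambda t: t**3 - 2*t - 5, 2.0, 3.0)
print(round(root, 6))
```

In higher dimensions, Broyden's rank-one update plays the same role for the full Jacobian, which is why it accelerates convergence of the S-parameter iteration without recomputing derivatives at every step.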
Meanwhile, applying an algorithm to a problem in stages can reduce the time taken to find a solution (relative to the time taken to apply the algorithm to the whole problem) and considerably improve the quality of that solution. The technique offered is a hybrid of heuristic sequencing and evolutionary strategies, which can outperform either strategy alone. To handle large instances of these problems, a k-stage formulation is proposed that yields the global optimum in special cases and is useful for problems in which endogenous uncertainty is revealed during the first few time periods of the planning horizon. To solve more general problems of large size, a NAC relaxation technique is also proposed, based on relaxing the non-anticipativity constraints and adding them back when violated. Finally, a Lagrangian decomposition algorithm that can provide exact lower bounds on the solutions obtained is described. For large-scale problems beyond that range, the new heuristics generate solutions using less than one percent of the computing resources required by the optimal procedures. Moreover, it is possible to combine several heuristics and select the best solution found as the final one. Further, augmenting heuristic algorithms with exact solution procedures may help improve computational efficiency and widen the range of applicability of the exact methods, since several optimization algorithms require a preliminary solution to start the search for an optimal
solution. Any end-point constraints derived from equality path constraints will suffer from their gradients with respect to the optimization parameters being zero at the solution. However, it is interesting to note that these problems can be alleviated by adopting a hybrid approach similar to that used for inequality constraints: in addition to the end-point constraints, every equality constraint is enforced as a point constraint at the stage boundaries. The current approaches to solving the multistage stochastic problem rely on a model that exploits the conservative flow law and may yield an unreasonable outcome. The best-known algorithm implemented a GA using a Prüfer-coded representation, which is a poor choice in evolutionary algorithms for solving the multistage stochastic problem; the principal motivation was therefore to present a more efficient and effective algorithm for finding an approximate multistage stochastic solution. The nested decomposition algorithm for solving multistage stochastic linear programs can effectively be executed in parallel without significant modification. The parallel implementation exploits the independence of subtrees to distribute the solution work among multiple worker tasks, and each worker task communicates with a master task to solve the whole problem. Nested decomposition is therefore well suited to systems with comparatively slow communication times.
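The Prüfer coding mentioned above maps each sequence of length n − 2 over node labels to a unique labelled tree on n nodes, which is what lets a GA search over trees as flat integer strings. A standard decoding sketch (not the cited GA itself) is:

```python
from heapq import heapify, heappop, heappush

def prufer_to_tree(seq):
    """Decode a Prüfer sequence of length n-2 into the edge list of the
    labelled tree on nodes 0..n-1 that it encodes."""
    n = len(seq) + 2
    degree = [1] * n
    for v in seq:
        degree[v] += 1          # each appearance adds one to the degree
    leaves = [v for v in range(n) if degree[v] == 1]
    heapify(leaves)             # always join the smallest current leaf
    edges = []
    for v in seq:
        leaf = heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:      # v has become a leaf itself
            heappush(leaves, v)
    edges.append((heappop(leaves), heappop(leaves)))
    return edges

print(prufer_to_tree([3, 3, 3, 4]))
```

The weakness the text alludes to is locality: a one-symbol mutation of the sequence can rearrange many edges at once, so small genotype changes do not map to small tree changes.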
6 Conclusion

In conclusion, the multistage algorithm takes various concepts and forms, and each form serves many functions; one example, seen across several of the reviewed papers, is the stochastic model of the multistage algorithm. This stochastic model involves optimizations that are quite difficult to solve, and it typically exhibits time inconsistency. The literature review of 26 published papers containing the phrase "multistage algorithm" helps clarify the meaning of the term itself: multistage refers to the development of an algorithm that is carried out in more than one stage of application. The use of multistage algorithms in various fields of human life is undeniable, and across its various concepts and implementations, the multistage algorithm retains its function as a way of representing human-like thinking in an algorithm. The combination of multistage algorithms with other algorithms shows the strength of this approach, as the algorithms involved are of course inseparable. Moreover, each algorithm can support the others, helping to increase efficiency in human life as an implementation of how to think like a human.
References
1. H. El-Shishiny, M.S. Abdel-Mottaleb, M. El-Raey, A. Shoukry, A multistage algorithm for fast classification of patterns. Pattern Recogn. Lett. 10(4), 211–215 (1989)
2. C.K. Chen, M.A. Duran, M. Radosz, Supercritical antisolvent fractionation of polyethylene simulated with multistage algorithm and SAFT equation of state: staging leads to high selectivity enhancements for light fractions. Ind. Eng. Chem. Res. 33(2), 306–310 (1994)
3. Z. Li, Rho-multistage algorithm on elliptic curves. Beijing Ligong Daxue Xuebao/Trans. Beijing Inst. Technol. 15(3), 261–264 (1995)
4. M. Kartal, B. Yazgan, O.K. Ersoy, Multistage parallel algorithm for diffraction tomography. Appl. Opt. 34(8), 1426 (1995)
5. Z. Feng, Improved multistage algorithm for task scheduling. Master thesis, University of Southwestern Louisiana, USA, 1996
6. Q.H. Pham, M.J. Smith, Morphological multistage algorithm for recognition of targets in FLIR data. Proc. SPIE Int. Soc. Opt. Eng. 2756, 14–25 (1996)
7. Q.H. Pham, T.M. Brosnan, M.J. Smith, Multistage algorithm for detection of targets in SAR image data, in Algorithms for Synthetic Aperture Radar Imagery IV, July 1997, vol. 3070 (International Society for Optics and Photonics, 1997), pp. 66–75
8. S. Dharanipragada, S. Roukos, A multistage algorithm for spotting new words in speech. IEEE Trans. Speech Audio Process. 10(8), 542–550 (2002)
9. I. Soraluze, C. Rodriguez, F. Boto, A. Cortes, Fast multistage algorithm for K-NN classifiers, in Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 2905 (2003), pp. 448–455
10. P. Jazayeri, W. Rosehart, D.T. Westwick, A multistage algorithm for identification of nonlinear aggregate power system loads. IEEE Trans. Power Syst. 22(3), 1072–1079 (2007)
11. J. Cui, L. Cheng, T. Li, A multistage algorithm for the job shop scheduling problem, in IEEM 2009—IEEE International Conference on Industrial Engineering and Engineering Management (2009), pp. 808–812
12. A. Zamyatin, Multistage algorithm for lossless compression of multispectral remote sensing images, in ISPRS TC VII Symposium—100 Years ISPRS, Vienna, 5–7 July 2010, vol. XXXVIII, Part 7A (IAPRS, 2010)
13. A. El-Attar, M. Karim, H. Tairi, S. Ionita, A robust multistage algorithm for camera self-calibration dealing with varying intrinsic parameters. J. Theor. Appl. Inf. Technol. 32(1), 46–54 (2011)
14. Y. He, D. Liu, C. Zeng, M. Wen, Novel model and multistage algorithm for distribution network reconfiguration considering system reliability. Dianli Xitong Zidonghua/Autom. Electr. Power Syst. 35(17), 56–60 (2011)
15. T.N. Ho, N. Marilleau, L. Philippe, H.Q. Nguyen, J.D. Zucker, A grid-based multistage algorithm for parameter simulation-optimization of complex system, in Proceedings—2013 RIVF International Conference on Computing and Communication Technologies: Research, Innovation, and Vision for Future, RIVF 2013 (2013), pp. 221–226
16. D. Ruinskiy, Y. Lavner, A multistage algorithm for fricative spotting, in Proceedings—2014 22nd Annual Pacific Voice Conference—Voice Technology: Software, Hardware Applications, Bioengineering, Health and Performance, PVC 2014 (2014)
17. C. Wang, C. Jin, J. Han, A multistage algorithm for multi-objective joint optimization of loading problem and capacitated vehicle routing problem. ICIC Express Lett. Part B Appl. 5(5), 1453–1459 (2014)
18. G. Aupy, J. Herrmann, P. Hovland, Y. Robert, Optimal multistage algorithm for adjoint computation. SIAM J. Sci. Comput. 38(3), C232–C255 (2016)
19. D.S. Cherneta, A.A. Druki, V.G. Spitsyn, Development of multistage algorithm for text objects recognition in images, in 2016 International Siberian Conference on Control and Communications, SIBCON 2016—Proceedings (2016)
20. T. Zhang, J.E. Cavanaugh, A multistage algorithm for best-subset model selection based on the Kullback-Leibler discrepancy. Comput. Stat. 31(2), 643–669 (2016)
21. A.S. Paulus, W.L. Melvin, D.B. Williams, Multistage algorithm for single-channel extended-dwell signal integration. IEEE Trans. Aerosp. Electron. Syst. 53(6), 2998–3007 (2017)
22. A. Kaveh, M. Ghobadi, A multistage algorithm for blood banking supply chain allocation problem. Int. J. Civ. Eng. 15(1), 103–112 (2017)
23. A. Niemi, J. Liu, J. Dai, J. He, N. Ilieva, Towards multistage algorithm to model intrinsically unstructured proteins, in APS March Meeting Abstracts, Mar 2017, vol. 2017, pp. M1–248
24. B. Zambri, R. Djellouli, T.M. Laleg-Kirati, An efficient multistage algorithm for full calibration of the hemodynamic model from BOLD signal responses. Int. J. Numer. Methods Biomed. Eng. 33(11) (2017)
25. I. Isasi, U. Irusta, E. Aramendi, U. Ayala, E. Alonso, J. Kramer-Johansen, T. Eftestol, A multistage algorithm for ECG rhythm analysis during piston-driven mechanical chest compressions. IEEE Trans. Biomed. Eng. 66(1), 263–272 (2019)
26. A. Harkat, R. Benzid, N. Athamena, A multistage algorithm design for electrocardiogram signal denoising. J. Circuits Syst. Comput. 30(4) (2021)
Technology for Disabled with Smartphone Apps for Blind People Hartato, Riandy Juan Albert Yoshua, Husein, Agelius Garetta, and Harco Leslie Hendric Spits Warnars
Abstract Nowadays, technology is developing rapidly. The middle class has access to technology such as smartphones, but unfortunately, many applications are still not friendly to people with disabilities. The closest example is Indonesia, where people with disabilities still receive little attention and development of support facilities does not consider the comfort of disabled users. This paper presents features to help users, especially those who are blind: chat to speech, chat using voice, motion detection for emergency needs, object detection, voice input to a search engine, and weather information. In addition, a use case diagram is used to describe the application process and a class diagram to describe the database design. Keywords Mobile application · Application for disabilities · Vision assistant apps · Blind users application · Information systems
Hartato · R. J. A. Yoshua · Husein · A. Garetta Information Systems Department, School of Information Systems, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] R. J. A. Yoshua e-mail: [email protected] Husein e-mail: [email protected] A. Garetta e-mail: [email protected] H. L. H. S. Warnars (B) Computer Science Department, Graduate Program, Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_19
1 Introduction

Persons with disabilities, who in daily conversation are referred to as people who are lacking or disabled, are often dismissed and ridiculed as unproductive second-class citizens, unable to carry out their duties and obligations, so that their rights as humans and citizens are neglected. Meanwhile, many technological advances, both in software and hardware, have been made for disabilities, especially physical disabilities and individuals with visual or hearing impairments [1]. However, looking at the conditions of people with disabilities around us, applications that support all the problems they face are still needed. Therefore, a suitable, supportive application is still needed to help people with disabilities live their lives better. Moreover, discriminatory treatment of children with disabilities can have long-term and traumatic effects on future employment opportunities and participation in civic life, especially for persons who have had disabilities since birth; this affects their social life, intelligence, mentality, and self-confidence in society. According to data released by UNICEF and WHO from a 2004 study, 93 million children under the age of 14 have disabilities, and for those aged 18 years and over the number exceeds 150 million. In general, disabilities are divided into four categories: first, physical disabilities, such as movement disorders that prevent walking; second, sensory disabilities, such as hearing or vision problems; third, intellectual disabilities, such as memory loss; and fourth, mental disabilities, such as phobia, depression, schizophrenia, or anxiety disorders. Vice President of the Republic of Indonesia K. H.
Ma'ruf Amin, at the Inclusion Indonesia Dialogue held via video conference at the Vice President's Official Residence on Thursday, January 14, 2021, stated that there were 209,604 Indonesians with disabilities according to data compiled by the Ministry of Social Affairs through the Disability Management Information System, or about 0.077% of Indonesia's 2021 population of 271,349,889 [2]. However, data on persons with disabilities in Indonesia differ from one agency to another because the data are scattered. This paper is limited to blind people, whose numbers are also unclear given that data on persons with disabilities are not integrated. Blind people in Indonesia still do not receive enough care from others. For example, many children are hidden by their families because they cannot do much and the families are afraid of them being hurt. Many public services for people with disabilities do not meet standards, which is a big problem, and many adolescents do not continue their education because, as blind people, they lack a support system. Blindness causes sufferers to experience cognitive, motor, emotional, and social problems. These disadvantages can be reduced with the help of the adults around them, so the role of those adults becomes critical: they are a trusted source of strength for living the rest of their lives. These problems will become significant if not addressed soon, as they determine the sufferers' future. So, this paper will focus on one
aspect of the many disabilities, blindness, and discuss how technology can help address this problem and improve sufferers' lives in the future. To understand persons with disabilities, especially those who are blind, it is necessary to interact with them; from those interactions it can be seen whether the disability affects their mentality, IQ, and EQ. Seeing the problems faced by people with visual disabilities, an application was designed and built to help persons with disabilities, especially blind people, carry out their daily activities, so that they can work more comfortably or do activities that were not possible before. There are many mobile applications and devices for people with disabilities, such as Optacon, BrailleType, Aipoly, and Envision AI, which shows that many developers are aware of this problem.
2 Existing Works

To increase self-knowledge in directing their careers, blind people also need to be educated in special schools for people with disabilities, not to segregate them, but to ensure they receive proper service, with the hope that in the future they will find suitable jobs and not depend wholly on someone else for the rest of their lives. Only a small number of persons with disabilities find the decent work they need, because the competition is too fierce. There are many ways to help compensate for their limitations, one of which is technology [3]. Elmar Krajnc from FH Joanneum Kapfenberg, Austria, ran a user study with seven test persons to determine which user interface could be the best option for blind people: two blind users, two with visual disorders, and three sighted users. His research succeeded because the navigation and position of each feature were relatively simple and easy to access, and these methods were used to build a proper application for blind people. The result was that the interface combining a talking touch feature had a success rate of 88%, while the gesture-based application had an 89% success rate. From this research, data and information can be learned that help fix deficiencies in the proposed mobile application, or even help other applications. Approaching blind users also needs to be done with care, especially with children. Webster and Roe describe two approaches to understanding them more deeply: first, treating them like children without disabilities, so that they do not feel different from other people; second, anticipating the possibility of different developmental styles. These approaches give a better understanding of what they need and how to make excellent mobile applications for them. The application design has been made easy to use.
Even users who have never used this application before can learn to use it quickly, and it can be deployed as soon as possible to help their daily life [4]. Mobile applications are IT software artifacts developed explicitly for mobile operating systems installed on handheld devices. Application design sometimes does not pay attention to the aspects
of making a functional application design; effectiveness and efficiency are the two main aspects of a good application design [5]. Many mobile application designs can help blind people overcome this problem in various ways: through cellphones, Google Glass, virtual reality, and even technology integrated with the human body [6]. For example, the Optacon device applied technology directly to humans. This technology, which uses infrared rather than ultrasonic sensing, remains a dilemma, because ultrasound is needed for adequate perception of the obstacles ahead. The main issue is that the technology is quite expensive, around 1500 GBP, so an intermediary device is still the best choice to help people with disabilities [7]. Currently, mobile applications for people with disabilities are growing rapidly because cellphones have moved to touch screens. Interaction with touch screens is even more demanding from a visual point of view, but with the evolution of these devices, application developers will find it easier to make applications that are easy to interact with [8]. In one study, the main difficulty for visually impaired people in using an application was simply familiarizing themselves with it: to use the application, they need to remember where the buttons are. The point of such a mobile application is how it can manage and care for people with disabilities [9]. They also want a support system that helps them and provides the right direction for raising their children. Good qualities are needed to meet the needs of persons with disabilities and sufficiently increase their happiness in life. Several studies relate to services that should be aimed at persons with disabilities who have cognitive problems.
So, this is where mobile applications help people with disabilities do things that are difficult for them to do themselves [10]. A recently published study conducted a literature review of 235 papers on the development of research into mobile applications designed to solve the problems faced by blind people, covering existing problems, challenges, and opportunities for development ahead [11]. A study in Malaysia reviewed 136 papers published between 2013 and 2019, drawn from Google Scholar, IEEE Xplore, and Science Direct, discussing a speech recognition project for blind people using a mobile application equipped with object and distance detection technology for navigation purposes [12]. Likewise, a study in India reviewed published papers discussing smartphone applications used by people with visual impairments to help with the daily problems they face, as well as examining the opportunities and challenges ahead and evaluating the usability of such smartphone applications [13]. Authors from the UK carried out a literature survey compiling published papers that discuss the implementation of smart cities in the UK supporting visually impaired persons in terms of ease of mobility and quality of life. Several assistive technologies were developed to assist visually impaired
persons by applying vision, speech, and augmented reality (AR) technologies, such as BuzzClip, Horus, Sunu Band, eSight, EVA, MAPTIC, 3D soundscape, AIRA, and so on [14]. There are several application implementations to support the blind, including Google's Android application called Lookout, and a smart system consisting of a microcontroller board and solar-powered sensors, as well as mobile applications that utilize the Global Positioning System (GPS), such as Seeing Eye GPS or BlindSquare. Implementations can use technologies such as small cells, beacons, or radio frequency identification (RFID), plus voiceover screen readers, smart furniture, and tactile floors/paving, the so-called blister paving [15]. Meanwhile, another survey was conducted by researchers from Romania, who searched previously published papers on smart systems for the blind implemented through the internet of mobility (IoM) or internet of mobile things (IoMT) using sensors such as vibration, microphone, headphone, ultrasonic, camera, and photoelectric sensors. Researchers from Norway studied the use of smartphones as assistive devices for blind teenagers in Nepal, surveying 21 blind students, both male and female, aged between 15 and 17 years, and assessing the benefit of smartphones as reading aids in carrying out their duties as students, who must always read the subject matter provided, especially when doing homework [16]. Also, a qualitative research study was conducted in Hong Kong, China, proposing a mobile application for visually impaired persons in the tourism sector, through which they can easily access information about tourism in the city of Hong Kong despite their limitations as blind people [17].
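GPS-based navigation aids like those mentioned above ultimately rest on distance computations between position fixes. A standard haversine sketch (illustrative coordinates, mean Earth radius of 6371 km) shows the core calculation:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two GPS fixes,
    using a mean Earth radius of 6371 km."""
    p1, p2 = radians(lat1), radians(lat2)
    dp, dl = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Illustrative fixes roughly spanning central Jakarta, the setting of
# this paper; the exact coordinates are made up for the example.
print(round(haversine_km(-6.2088, 106.8456, -6.1751, 106.8650), 2))
```

A navigation app would feed successive fixes through such a function to announce remaining distance to a waypoint.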
A study in New Zealand examined blind students' use of digital tools, finding that digital tools alone are incomplete without supporting factors such as the involvement of parents or other close persons. A researcher from Malaysia continued this line of research, creating an interactive mobile application for learning mathematics in braille using Nemeth code to help blind students in Bangladesh; with this application, students can quickly learn to do simple calculations, supported by a self-learning facility and interactive features based on hearing and touching physical things [18]. In addition, group learning models, both with other students with disabilities and with non-disabled students, can help them understand their lessons, complete group assignments, and interact digitally as a means of communication within the working group [19]. Meanwhile, Shakya, a professor from Tribhuvan University, Nepal, conducted an in-depth analysis of the application of big data in the banking sector, prioritizing tools, applications, and technologies appropriate for banking, especially in India. The study surveyed Indian banks and their customers, and several results showed the improvements needed in implementing big data in the Indian banking sector [20]. Patel from RCOE, MU, Mumbai, India, built an application architecture to capture complaints from citizens monitoring and experiencing the services of government officials who are lax in serving the people. The smartphone application focuses on problems related to road infrastructure, which often deteriorates quickly, especially during the rainy season. In addition,
276
Hartato et al.
careless maintenance of road infrastructure costs more in the long run and may even cause accidents for residents who use the damaged roads [21]. Moreover, Prasanna from Shri SSS Jain College for Women, India, developed a mobile application for blind and visually impaired people in which communication uses voice-to-text technology: the user's voice is converted into text, so persons with disabilities can compose letters or other writing through speech. Text-to-voice technology is also used, letting blind and visually impaired people listen to written text, such as text received from another person. This is very useful for people with disabilities who like to read books or write, since they can read with their ears instead of their eyes and write through sound. In addition, an accelerometer sensor detects physical shocks or shaking of the smartphone, which can serve as a key to unlock the device, and a high accelerometer value can also indicate that the user has experienced a fall [22].
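The accelerometer-based shake unlock and fall detection described in [22] can be approximated with simple threshold rules on the acceleration magnitude. The sketch below is a minimal illustration; the threshold values and window logic are assumptions chosen for demonstration, not taken from the cited paper:

```python
import math

def magnitude(sample):
    """Euclidean norm of a 3-axis accelerometer reading (m/s^2)."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)

def detect_shake(samples, threshold=25.0, min_peaks=3):
    """Treat a window as a deliberate shake (e.g., an unlock gesture)
    when the magnitude exceeds `threshold` at least `min_peaks` times."""
    peaks = sum(1 for s in samples if magnitude(s) > threshold)
    return peaks >= min_peaks

def detect_fall(samples, impact=30.0, rest=10.5):
    """Flag a possible fall: one hard impact spike followed only by
    near-still readings (the phone lying on the floor)."""
    for i, s in enumerate(samples):
        if magnitude(s) > impact:
            tail = samples[i + 1:]
            if tail and all(magnitude(t) < rest for t in tail):
                return True
    return False
```

A real implementation would read these windows from the platform's sensor API and tune the thresholds empirically against recorded gestures and falls.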
3 Proposed Smartphone Application The proposed system is designed to be friendly for blind users. To meet their needs, the process in this smartphone application must be simple yet effective: the simpler the process, the more easily a blind user can understand it, and the less time is needed to learn the application. The system flow is shown in Fig. 1. The flow of this application is kept as simple as possible for natural use. The user starts by logging in, but without an ID or password, since these would be an obstacle for a blind user. Next, the system requires access to the phone camera as a prerequisite for the application to run. After that, many features become available. The limitation is that using the full feature set requires a monthly payment, which on iOS can be confirmed by scanning a fingerprint. Based on Fig. 2a, the application's main page is simple so that the user can use it easily. The first feature lets the user chat with a friend by speaking into the microphone; when the friend replies, the application reads the chat aloud by detecting the words. Based on Fig. 2b, this page provides a feature to help when the user is home alone and suddenly collapses. It can detect when a user holding the phone collapses and the phone drops to the floor: the gesture motion is detected and an emergency signal is sent to family members and caretakers so they can take immediate action. If the signal is a false alarm, the user can cancel it by shaking the phone three times. This feature greatly helps the user, reduces the chance of someone passing away, and extends services to people with special needs.
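The emergency flow just described (detect a collapse, allow the user to cancel a false alarm by shaking, otherwise notify family and caretakers) can be sketched as a small state machine. The class and method names below, and the simplified timeout handling, are illustrative assumptions rather than part of the proposed design:

```python
class EmergencyMonitor:
    """Minimal sketch of the alert flow: a detected collapse opens a
    confirmation window; shaking the phone three times cancels the
    false alarm, otherwise an alert goes to registered contacts."""

    def __init__(self, contacts, cancel_shakes=3):
        self.contacts = contacts          # family members and caretakers
        self.cancel_shakes = cancel_shakes
        self.pending = False
        self.shake_count = 0

    def on_collapse_detected(self):
        """Called when gesture motion suggests the user collapsed."""
        self.pending = True
        self.shake_count = 0

    def on_shake(self):
        """Each deliberate shake counts toward cancelling a false alarm."""
        if self.pending:
            self.shake_count += 1
            if self.shake_count >= self.cancel_shakes:
                self.pending = False  # false alarm cancelled

    def on_timeout(self):
        """Called when the confirmation window elapses; returns the
        alert messages that would be dispatched."""
        if self.pending:
            self.pending = False
            return [f"ALERT -> {c}" for c in self.contacts]
        return []
```

In a production app, `on_timeout` would be driven by a platform timer and the alerts sent over the network to the emergency-handling server.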
In the future, everyone who needs special care and affection will feel cared for, in many ways and on many occasions, so that all people get what they deserve, because at present,
Fig. 1 Use case diagram of a smartphone application for blind people
many incidents get a late response from the caretaker, putting lives at high risk. The system must be able to minimize all such bad possibilities. This feature can also be used by people without disabilities to get treatment as soon as possible. The emergency signal goes directly to an agency's emergency-handling server without any intermediary, so little time is lost, and the signal is transmitted robustly so that connectivity problems do not delay the emergency response. Based on Fig. 3a, the sensors and camera in the phone can detect objects around the user; this is made possible by an AI component that provides the best experience. Meanwhile, based on Fig. 3b, this feature directs what users want to search for to a pre-existing search engine, so blind people can also surf the internet like anyone else without limitations, because all commands and responses use the smartphone's voice-over. Based on Fig. 4, the application provides information about the local outdoor weather. Even though a blind user cannot see it, they can at least learn the current weather status by hearing a voice notification. So, they
Fig. 2 a Chat to speak menu, b motion detect menu
can choose the best outfit before going outside. The application can also give the weather forecast, including the chance of rain, so the user knows whether to bring an umbrella. Above all, interaction through motion gestures such as shaking the phone, together with voice commands, makes the interaction between the user and the application more natural. To build a proper smartphone application, many technologies can be paired with smartphone features. One of them is the use of sensors: the application can build features that collaborate with the smartphone's sensors, which provide up-to-date data about the user's health and allow interaction with the surroundings, improving the use of the application. RFID-based assistive devices are one example. For instance, when a new user downloads the application, it must ask permission to use the camera and other smartphone features; this is done using voice and a gesture sensor. The application asks by voice, and the user interacts by shaking their
Fig. 3 a Detect object menu, b voice to search engine menu
phone to give permission, or by not shaking the phone for 5 s to deny it. With sensors, the application can implement obstacle detection to prevent unwanted incidents. However, for this to work, RFID receivers must be deployed across an area so that the RFID tag embedded in the smartphone can exchange signals with the receivers around it. The second component is a sensor proxy that acts as an online representative for the mobile sensors: it notifies the user by voice from the mobile phone about the objects in front of them, including updates about the weather in the user's area. The proxy works by relaying the relevant site addresses; it is always available and keeps the full sensor data up to date. For blind people getting to know a new environment, this application will tremendously help them recognize it quickly and precisely through the software's sensors. Activity recognition means the application monitors activity using signal processing in the environment. When users hold their phones, they can interact through activities such as shaking the phone; in this way, the use of the sensors is optimized. For every ongoing activity, there will be
Fig. 4 Weather information menu
a confirmation from the system using a voice notification. Moreover, while the user interacts with the application, it will announce how much time has been spent, the smartphone battery level, and other information. The fall detection system is the feature that gets the user emergency help when they are alone and other family members are away. When the user holds the smartphone and suddenly collapses, the smartphone detects the phone's motion and sends an alert to family members and caretakers about the collapse, so they can act immediately. However, if the phone was dropped accidentally, the user can confirm this by shaking the phone for 5 s. The interactive system's voice-over also helps with many other things, such as reading incoming chats from different social media, helping the user reply to messages, and playing music from a playlist or via integration with Spotify or Joox. The application can announce the current time, and it automatically reads the news when the user says, “read the news today.”
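The voice-over interactions listed above amount to mapping recognized phrases to application actions. The sketch below shows one way to structure such a dispatcher; the phrase set and the response strings are illustrative assumptions, and a real app would feed it output from a speech-recognition engine:

```python
def build_dispatcher():
    """Map recognized phrases to application actions. The phrases and
    canned responses here are examples, not the app's actual commands."""
    return {
        "read the news today": lambda: "reading today's headlines",
        "what time is it": lambda: "announcing the current time",
        "play music": lambda: "starting the playlist",
        "check battery": lambda: "announcing the battery level",
    }

def handle_command(dispatcher, spoken_text):
    """Normalize the recognized text and run the matching action,
    falling back to a spoken error message for unknown phrases."""
    action = dispatcher.get(spoken_text.strip().lower())
    if action is None:
        return "sorry, command not recognized"
    return action()
```

Each returned string would then be passed to the platform's text-to-speech engine so the blind user hears the confirmation.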
4 Conclusion Disabled people, who have long struggled to live their lives, can slowly feel alive again because, over time, more and more assistive devices are appearing to help them. Improvements in technology are meeting their needs and wants, and even more advanced technologies will follow. Through these positive changes, many people and parties feel the extraordinary impact of such technologies. Hopefully, people without disabilities will encourage and help them enjoy their lives; this can happen in many ways and on many occasions, because everyone has the right to live a good life. Things can still improve: mobile applications that support disabled people should be free of charge to ease their life problems. Future work includes designing an excellent interface with simple feature placement, a highly interactive application, a clear voice for blind people to navigate by, and additional touch gesture features; improving existing features to be more interactive, responsive, and clear; and embedding new, fresh features based on future technologies. Collaboration with the Internet of Things can integrate this application with other systems that implement Artificial Intelligence. Another direction is technology that combines the user with the technology itself, for example implants in the user's hands, ears, legs, or eyes, or even near the brain, to send and receive signals between the technology and its user. Technology implanted near the eyes could interact with the user's retina, if the user's retina still has a chance of working, so that the user feels the benefit directly. In this way, a breakthrough in technology for disabled people, especially those who are blind, can be made.
Moreover, all data gathered from users is collected on the server for further research and for improving the technology and health care. Acknowledgements This work is supported by the Research and Technology Transfer Office, Bina Nusantara University, as part of Bina Nusantara University's International Research Grant, contract number No.017/VR.RTT/III/2021, contract date 22 March 2021.
References
1. J. Oliveira, T. Guerreiro, H. Nicolau, J. Jorge, D. Goncalves, BrailleType: unleashing braille over touch screen mobile phones, in INTERACT 2011, Part I, LNCS, vol. 6946 (2011), pp. 100–107
2. A. Csapo, G. Wersenyi, H. Nagy, T. Stockman, A survey of assistive technologies and applications for blind users on mobile platforms: a review and foundation for research. J. Multimodal User Interfaces (2015). https://doi.org/10.1007/s12193-015-0182-7
3. M. Bhuiyan, A. Zaman, M.H. Miraz, Usability evaluation of a mobile application in extraordinary environment for extraordinary people. J. Inst. Inf. Technol. (2014)
4. E. Krajnc, M. Knoll, J. Feiner, M. Traar, A touch sensitive user interface approach on smartphones for visually impaired and blind persons. J. Inf. Qual. e-Health 7058 (2011). https://doi.org/10.1007/978-3-642-25364-5_41
5. T. Guerreiro, P. Lagoa, P. Santana, D. Goncalves, J.A.P. Jorge, NavTap and BrailleTap: nonvisual texting interfaces. J. MIS Q. 39(2), 435–472 (2015)
6. G. Fortino, R. Giannantonio, R. Gravina, P. Kuryloski, R. Jafari, Enabling effective programming and flexible management of efficient body sensor network applications. J. IEEE Trans. Hum.-Mach. Syst. 43(1) (2013)
7. E. Terradas, An overview of the internet of things for people with disabilities. J. Netw. Comput. Appl. 35(2012), 584–596 (2011). https://doi.org/10.1016/j.jnca.2011.10.015
8. N. Pannurat, S. Thiemjarus, E. Nantajeewarawat, Automatic fall monitoring: a review. Sensors (2014)
9. J. Cecilio, K. Duarte, P. Furtado, BlindeDroid: an information tracking system for real-time guiding of blind people, in The 6th International Conference on Ambient Systems, Networks, and Technologies (ANT 2015) (2015). https://doi.org/10.1016/j.procs.2015.05.039
10. T. Loukopoulos, M. Koziri, N. Panagou, P.K. Papadopoulos, D.K. Iakovidis, Cloud video guidance as “Deus ex Machina” for the visually impaired, in Technological Trends in Improved Mobility of the Visual Impaired, ed. by S. Paiva (Springer, 2020), pp. 127–143
11. A. Khan, S. Khusro, An insight into smartphone-based assistive solutions for visually impaired and blind people: issues, challenges, and opportunities. Univ. Access Inf. Soc. 1–34 (2020)
12. A.M. Norkhalid, M.A. Faudzi, A.A. Ghapar, F.A. Rahim, Mobile application: mobile assistance for visually impaired people-speech interface system (SIS), in 2020 8th International Conference on Information Technology and Multimedia (ICIMU), Aug 2020 (IEEE, 2020), pp. 329–333
13. S.S. Senjam, Smartphones for vision rehabilitation: accessible features and apps, opportunity, challenges, and usability evaluation, in Software Usability, ed. by L.A.A. Castro, D. Cabrero (IntechOpen, 2021), pp. 1–18
14. R.M. Aileni, G. Suciu, V. Suciu, S. Pasca, J. Ciurea, Smart systems to improve the mobility of people with visual impairment through IoM and IoMT, in Technological Trends in Improved Mobility of the Visual Impaired, ed. by S. Paiva (Springer, 2020), pp. 65–84
15. D. Sobnath, I.U. Rehman, M.M. Nasralla, Smart cities to improve mobility and quality of life of the visual impairment, in Technological Trends in Improved Mobility of the Visual Impaired, ed. by S. Paiva (Springer, 2020), pp. 3–28
16. P. Sankhi, F.E. Sandnes, A glimpse into smartphone screen reader use among blind teenagers in rural Nepal. Disabil. Rehabil. Assist. Technol. 1–7 (2020)
17. K.L. Lam, C.S. Chan, M. Peters, Understanding technological contributions to accessible tourism from the perspective of destination design for visually impaired visitors in Hong Kong. J. Destin. Mark. Manag. 17, 100434 (2020)
18. L. Nahar, R. Sulaiman, A. Jaafar, An interactive math braille learning application to assist blind students in Bangladesh. Assist. Technol. 1–13 (2020)
19. E. Pacheco, P. Yoong, M. Lips, Transition issues in higher education and digital technologies: the experiences of students with disabilities in New Zealand. Disabil. Soc. 1–23 (2020)
20. S. Shakya, S. Smys, Big data analytics for improved risk management and customer segregation in banking applications. J. ISMAC 3(03), 235–249 (2021)
21. V. Patel, D. Kapadia, D. Ghevariya, S. Pappu, All India grievance redressal app. J. Inf. Technol. Digital World 2(2), 91–99 (2020)
22. S. Prasanna, K. Vani, V. Venkateswari, Design and development of a mobile-based application for automated SMS system using voice conversion technique for visually impaired people, in Innovative Data Communication Technologies and Application (Springer, Singapore, 2021), pp. 307–317
Mobile Apps for Musician Community in Indonesia Amadeus Darren Leander, Jeconiah Yohanes Jayani, and Harco Leslie Hendric Spits Warnars
Abstract Art and music are human needs in terms of socializing with each other, and every human civilization has always been accompanied by music. In Indonesia, every community activity, whether a family, religious, or cultural event, is inseparable from the presence of music as a complement and a means of socializing. People interested in music are not well informed about the variety of musical performances on offer, while music organizers and players have difficulty marketing their musical expertise. In view of these problems, we propose building an online platform where session musicians can interact and share, in order to build strong relationships among them. This paper outlines the design of a mobile app that provides an online forum for discussing all things related to session musicians and helps users find opportunities for musical activities. The design was developed using Unified Modeling Language (UML) diagrams, mainly use case diagrams, class diagrams, and activity diagrams, together with user interface displays as the communication layer between users and the application. The application was built using Android Studio, with MySQL for the database. Keywords Community mobile apps · Information systems · Musician community mobile application
Abstract Art and music are human needs in terms of socializing with each other, and every human civilization is always equipped with music. In Indonesia, every community activity, whether related to family, religious, or cultural events, is inseparable from the presence of music as a compliment and means of socializing society. People interested in music are not well informed about the choice of offers about the variety of music that can be held while music organizers and players have difficulty marketing their musical expertise. Seeing these existing problems, building an online platform where session musicians can interact and share is necessary to build a strong relationship between them was proposed. This paper outlines the design of a mobile app that provides an online forum for discussion of all things session musicians and helps find musical activity opportunities. The design was developed using Unified Modeling Language (UML) diagrams, mainly using use case diagrams, class diagrams, and activity diagrams, including user interface displays as a communication display between users and the applications they use. The application was built using Arduino Studio and MySQL to save the database. Keywords Community mobile apps · Information systems · Musician community mobile application
A. D. Leander · J. Y. Jayani Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] J. Y. Jayani e-mail: [email protected] H. L. H. S. Warnars (B) Computer Science Department, Graduate Program, Doctor of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_20
1 Introduction In the past decades, changes in social and cultural practices in Indonesia have popularized the emergence of different music genres, notably pop, rock, hip-hop, orchestra, and many more [1]. Moreover, well-established businesses are attempting to incorporate musicians in their work for purposes such as promotion or marketing [2]. Many of these large businesses use the services of a particular type of musician, called session musicians, to aid in the development of musical pieces and recordings. Session musicians, also called studio musicians or backing musicians, play an essential role in the music industry and are considered by many the “best-kept secrets” of that industry [3]. By definition, session musicians are contracted to aid in live performances or recording sessions. They primarily play standard instruments such as the guitar, keyboard, drums, and bass, though a few specialize in strings, brass, and woodwinds. Usually, session musicians are not permanently affiliated with a specific solo artist, band, or ensemble but are hired temporarily as required. Great session musicians are remarkably creative, excellent sight-readers, and highly skilled in their respective instruments [4]. Despite all of this, the term session musician is still foreign to the public's ear. This is because session musicians are not always visible and, more often than not, do not receive the spotlight they deserve. Much of the popular music associated with celebrities and recognizable characters can be traced back to hired collaborators who work in the background at recording studios, away from the public eye [5]. A famous example of the unnoticed work of session musicians is Motown Records, a renowned American record label whose professional session musicians could improvise complex arrangements easily.
They released multiple hit singles, yet none of the musicians who worked on those masterpieces, particularly in the 1960s, were individually credited, and their identities remain largely unknown [6]. There is a dilemma in how information travels between session musicians: it is difficult to ensure that everyone gets a reasonable opportunity to propel, or even kickstart, a career as a session musician [7]. Moreover, current and prospective session musicians do not have a robust platform where they can actively share information and experiences with each other. Therefore, session musicians cannot establish broader connections and relationships beyond the small groups they frequently work with. This negatively affects several aspects, one of them being job sharing: without generalized communication, information about jobs will not be spread out equally. This paper discusses a proposed solution to this problem. It outlines methods to implement a mobile-based forum application that enables session musicians to connect. With this app, session musicians will be able to share their knowledge and experiences and promote open jobs among each other through posts on a forum thread. It is a new effort to unite session musicians of all demographics. The following
sections describe similar past research and the methodologies used to achieve the objectives, and present the results and conclusion.
2 Existing Works Since there is only a limited amount of research on the development of mobile applications for musicians, this section mainly discusses topics related and relevant to this research, namely the benefits of music, common approaches to mobile application development and user interface design, and research on the use of online forums for education and learning. In prehistoric ages, music was associated with cultural matters such as rituals, magic, or healing. Over time, the purpose and use of music have evolved into forms of personal enjoyment, artistic expression, and entertainment [8]. In the current age, music and technology are closely tied. The ability to share musical artworks and information on the internet is one possible way to enhance the engagement of Indonesian people in the musical culture itself. This is beneficial, as music is one of the best remedies for the heart, mind, and soul [9]. Generally, musicians report a significantly higher level of subjective well-being than non-musicians, meaning they are more satisfied with their lives and have a more positive outlook and emotions [10]. Mobile applications are currently on the rise and expanding rapidly in the industry. However, the development of mobile applications is still considered complex, as it is difficult to mirror the complete behavior of desktop applications [11]. Moreover, there is no single process for developing solid mobile applications, as each app differs from the others [12]. For this reason, a dedicated framework lifecycle is required to produce high-quality applications. The phases in this lifecycle resemble those of typical application development lifecycles: Identification, Design, Development, Prototyping, Testing, and Maintenance [13].
Aside from a concrete development process, various standard practices must also be followed to improve app performance. Ensuring that the mobile application is compatible and works correctly on multiple platforms is mandatory in this era. In contrast with desktop applications, mobile applications must incorporate more instinctive, gesture-focused capabilities to provide user convenience. Lastly, an agile development approach is most suitable for mobile apps due to their ever-improving nature [14]. A good user interface is a crucial element of satisfaction in the eyes of both end-users and developers. The user interface connects the system with the software users to allow efficient interaction [15]. The user interface will reflect the whole system, as users often think of the user interface as the system itself. Therefore, the usability of the user interface is essential to the overall quality of the software [16]. The enhancement of applications has dramatically impacted how education and information access are provided to people [17]. An effective online application-based tool for learning is the online discussion forum. Online discussion forums demonstrate
promise for online learning, both individually and as a collaborative group [18]. They allow a simple yet effective method of communication between participants, who do not have to be in the same place or interact at the same time [19]. Although there are no face-to-face interactions and discussions between participants, they seem more able to build upon each other's ideas and are more active in expressing their thoughts when they disagree on something [20]. Participants are also more willing to touch on sensitive issues and share honestly. These positive results are due to the social interaction and collaborative nature of online discussion forums [21]. Meanwhile, Vivekanandam, from Lincoln University College, Malaysia, developed a hybrid algorithm for the classification process in machine learning and model identification, in which feature selection is carried out by combining a selection operator with a genetic algorithm. The proposed framework runs in six steps: the first step prepares the population data, and the second step calculates the fitness value. The third step performs selection in the genetic algorithm part, and the fourth step applies mutation. The fifth step decides whether to return to the second step or proceed to the sixth and last step, in which the results are processed using a Support Vector Machine (SVM) [22]. Also, Mugunthan, a researcher from Sri Indu College of Engineering and Technology, India, designed an Extreme Learning Machine (ELM) method using a sigmoidal bias function in the classification process. Since handling stochastic matrices gives low performance in learning rate and robustness of determination, this ELM version is modified in six steps to obtain better accuracy and minimize errors in the classification process [23].
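The six-step loop attributed to [22] can be sketched generically. The code below is a toy illustration under stated assumptions: the population/fitness/selection/mutation parameters are arbitrary, and the toy fitness function in the usage note stands in for the SVM evaluation of step six:

```python
import random

def genetic_feature_selection(n_features, fitness, generations=20,
                              pop_size=10, mutation_rate=0.1, seed=42):
    """Steps 1-5 of the described loop: initialize a population of 0/1
    feature masks, score fitness, select, mutate, and repeat; the best
    mask would then be handed to the classifier (step 6)."""
    rng = random.Random(seed)
    # Step 1: prepare the population (random feature masks).
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: calculate the fitness values.
        scored = sorted(pop, key=fitness, reverse=True)
        # Step 3: selection - keep the better half as parents.
        parents = scored[:pop_size // 2]
        # Step 4: mutation - flip bits in copies of the parents.
        children = [[1 - g if rng.random() < mutation_rate else g
                     for g in p] for p in parents]
        # Step 5: loop back with the new population.
        pop = parents + children
    return max(pop, key=fitness)
```

As a toy usage, a fitness that rewards masks close to a target `[1, 0, 1]` (in place of an SVM cross-validation score) lets the loop converge toward that mask.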
Moreover, Chowdhury, in his master's thesis at the Icahn School of Medicine at Mount Sinai, USA, conducted a systematic community assessment in New York City on the use of Covid-19 applications, particularly mHealth applications that can be downloaded from the App Store or Google Play Store. The study results show that most digital health applications designed to address the Covid-19 problem are not designed properly and do not adapt to the community's literacy needs, meaning that software engineering practices for capturing requirements with user-requirement tools were not applied correctly and adequately [24].
3 Proposed Idea The use case diagram for the proposed mobile application is illustrated in Fig. 1. In total, eight use cases are to be developed within the application: “Register,” “Login,” “Create performance event,” “Manage performance event,” “Register event,” “Forum,” “Information,” and “Rate and comment event.” Each use case is explained further in the following sections. There will be two main types of users in the proposed mobile application: musicians and event organizers. Musicians who want to use the app to gain information
Fig. 1 Use case diagram of mobile apps for session musicians community in Indonesia
surrounding the world of session musicians and register for music events can register as musicians and enjoy the provided services. Users who want to use the mobile app to look for session musicians to perform at their events can register as event organizers. Figure 2a shows the initial landing page of the proposed mobile application, with the app name and logo, “JamSession,” in the center of the screen. This page contains two buttons, “Login” and “Register,” which redirect users to the respective login and register pages. The main menu for musicians is displayed in Fig. 2b. When logged in as musicians, users can view and edit their profiles using the “View Profile” feature. Users also have other feature options: “Create Forum Thread” allows musicians to create new forum threads to serve as discussion platforms; “Search for Forum” provides a way to browse and find forum threads that interest the user; “Register for Event” allows musicians to search, discover, and register for events; and “Information” serves users with news and articles regarding session musicians. The “Logout” feature logs the user out of his/her account. The main menu for event organizers is shown in Fig. 2c. Unlike musicians, event organizers can access fewer features, mostly related to events. Event organizers can create and list new events using the “Create an Event” feature and manage their current active events using the “View Events” feature. Like musicians, event organizers also have “View Profile” and “Logout” features that function in the same way. The “Register” use case is performed when a new user wants to create an account to start using the app, in the case where the user has not previously created one. The “Register” button on the landing page
Fig. 2 a Landing page; b main menu for musicians; c main menu for event organizers
will redirect the user to the register option, as seen in Fig. 3a. Users are prompted to choose which type of user they are registering as, either a Musician or an Event Organizer. Depending on the selected option, the user will be redirected to either the register page for musicians, Fig. 3b, or the register page for event organizers, Fig. 3c. Both register pages ask the user to enter their full name, email, and password. The only difference between these two pages is that the register page for musicians provides an extra field for the musical instrument that the user plays. According to the activity diagram in Fig. 4a, once the user has entered all required personal information into the provided on-screen text boxes, the system validates whether all of the personal information is in the correct format and abides by the previously specified constraints. If the validation fails, the user has to input the appropriate personal information again. If the validation succeeds, the user clicks the “Create your account” button on the screen; the system then creates the user's account, stores it in the database, and redirects the user to the login page. At the same time, the system sends a confirmation email, used to activate the account, to the linked email address. As seen in the activity diagram shown in Fig. 4b, the “Login” use case represents the user's process of logging into his/her account. Initially, the user will need to input his/her email and password. The
Mobile Apps for Musician Community in Indonesia
Fig. 3 a Register option; b register as musician; c register as event organizer
input data will then be passed to the system and validated appropriately. If the data is valid, the user is logged in and is redirected to the appropriate main menu page, depending on whether the user is a Musician or an Event Organizer. Figure 5a represents the page the user will be redirected to when the “Login” button is tapped. The login page prompts users to enter the email address associated with their account and the corresponding password. If the user forgets his/her password, a “Forgot password” feature is also available to reset it. There is no option to specify whether the user is a Musician or an Event Organizer because the user’s role is stored in the database. The first event-related use case is the “Create performance event” use case, which is intended to be used by event organizers. Since the app is dedicated to session musicians, specifically to help them discover and share job opportunities, it provides use cases regarding event performances. Event organizers will be able to use the app to create and add event listings for the musicians to apply to and, hopefully, perform at. Firstly, the event organizer will enter all relevant information about the event according to the required fields on the app. The system will then validate whether all information follows all predefined restrictions. Next, users have the option to upload a file to give more information about the event, such as through posters and
Fig. 4 a Activity diagram for register use case; b activity diagram for login use case
brochures. If the user does choose to upload a file, the file extension and file size will be validated further to ensure that it is acceptable. The user interface of this feature can be seen in Fig. 5b. Fields are provided for filling in the event name, the number of performers needed, the role of each performer, the performance fee, and details. Other relevant details, such as date, time, and venue, can be entered in the details field. Event organizers can upload images through the “Upload file” button to provide more information or raise more interest. The “Create event” button will save and create the event listing. The activity diagram for the “Create performance event” use case is described in Fig. 6a and follows the same flow of entering, validating, and optionally uploading event information.
Fig. 5 a Login; b create event; c manage events
Once all of the previously mentioned steps are completed, the user can choose to save or discard the event, and the system will handle it accordingly. If the user saves the event, it will be saved to the database. Meanwhile, the “Manage performance event” use case aims to allow each event organizer to manage the events they have created. Event organizers will only be able to view the events that they have made. They can edit each event’s information and view and approve the list of musicians who have applied. The illustration of the manage event page in Fig. 5c shows how the events are arranged in a list with previews of their details, combined with the “Confirm Musicians” button and the “Edit” button. The activity diagram in Fig. 6b further describes the “Manage performance event” use case flow: the system shows all the manageable events, the user selects one, and the user can then edit the event information, which will be validated before being saved to the database. The next event-related use case is the “Register event” use case, in which session musicians can search through the available listing of event performances and apply to the ones they are interested in. In this use case, session musicians and event organizers can also discuss the job, potentially regarding the details, the fee, the number of musicians required, and more.
Fig. 6 a Activity diagram for the create performance event use case, b activity diagram for the manage performance event use case
Figure 7a demonstrates the activity diagram of the “Register Event” use case. Upon starting this feature, the app will display all available events that session musicians can still register for. Available events are events that have not fulfilled their quotas
Fig. 7 a Activity diagram for the register event use case, b activity diagram for the forum use case
Fig. 8 a Registered event; b create forum thread; c search for forum
yet. All of the events will be displayed in a list form, as seen in Fig. 8a, so that users can search through and select any of them. When a user sees an event that he/she is interested in, all the user has to do is tap on it, and more detailed information surrounding the event will be displayed. If the user decides that he/she would like to perform at this event, the user can easily tap on the “Join” button, and his/her name will be registered for the event. The event organizer will then be able to manually decide whether the session musician is eligible to perform for this event or not. The subsequent use case is the “Forum” use case. As the name suggests, the functionality of this use case is to provide session musicians with a forum where they can discuss different topics and share experiences. Users will be able to create forum threads regarding a specific title and comment on the numerous existing forum threads. The activity diagram for the “Forum” use case is illustrated in Fig. 7b. Overall, the “Forum” use case performs two main activities: to start this activity diagram, the user can choose either “Create forum thread” or “Reply to forum thread.” The first option, “Create forum thread,” means that the user would like to create a new forum thread that did not exist before. The user will initially be prompted to choose a forum category, and the system will respond with a page for creating new forum threads. The user is then asked to input the thread title and the initial forum message. Users also have the option to upload an image. As usual, the image’s file extension and size are validated. After this is done, if the user chooses to save the forum thread, it will be saved to the database.
The second option is the “Reply to forum thread” path. The system will display a message box for the user to fill in for this activity. The user can set the title of the reply and write whatever message he/she wants to write. Again, he/she can also upload images to support the reply. Once the user is satisfied with the reply, he/she clicks on the “Reply” button, and the reply is saved to the database. The page will automatically refresh to display the newly created thread reply. Figure 8b illustrates the “Create Forum Thread” feature. The user must fill in the thread title and the original comment message. Once all of the details are as desired, the user can tap on the “Create Thread” button to generate the forum thread. Users can also upload images to include in their forum threads. Users will be able to search for specific forum threads using the “Search for Forum” option, as seen in Fig. 8c. Initially, the search page will display all forum threads available to read. A search bar is present for users to type in keywords of the forums they are looking for. The system will then search for those particular keywords and display the appropriate results. Users can simply tap on their desired forum threads to read more about them. This work also proposes an “Information” use case representing the functionality that allows users to search and view various information regarding the session musician community. Users will be able to explore knowledge surrounding session musicians, such as guides on how to become better musicians, performance tips, instrument tips and tricks, etc. This feature aims to educate users about the field they are dealing with and broaden their knowledge of the music industry. In Fig. 9a, the “Information” use case is elaborated in the form of an activity diagram. First and foremost, the system will fetch all the information articles from the database and display them in a grid-like format.
Users can then view all of the article previews and select any of them they want to read. When the user selects an article, the app will redirect the user to a page that shows the complete article and its images or external links. Figure 10a represents the “Information” feature. Users will be able to read articles and guides about the session musician community by simply tapping on any one of the articles, which will redirect to the article’s page. Meanwhile, the “Rate and comment event” use case, as seen in Fig. 10b, is the musician’s process of rating and commenting on events that they have performed at. This way, the app can receive feedback about which events and event organizers are favored by musicians and which event organizers should be blacklisted and rejected in the future. The flow of this use case can be represented using the activity diagram in Fig. 9b and the user interface in Fig. 10b. The user will be able to rate the event on a scale of 1–5 and also has the option to write additional comments. The comments will be validated before they are saved into the database. For the class diagram, there are eight classes, as seen in Fig. 11. The first class is the “Musician” class, which represents the session musicians that will use this app. This class provides the session musicians’ information such as their name, email, gender, and musical instrument, and private information such as their account password. Each musician’s account also has its own unique “userId,” which is the primary key of this class. Musicians can also provide several ratings and comments. Any musician can post zero to many comments, register for zero to many events, and create zero to many forum threads.
Fig. 9 a Activity diagram for the information use case, b activity diagram for the rate and comment event use case
The “ForumThread” class is the representation of the forum. There are three attributes in this class: “forumId,” “category,” and “DateTime.” The attribute “category” is used to categorize the forum to ease any musician’s process of finding a forum thread on a specific topic; categorized forum threads were found to be more convenient than uncategorized ones. A single forum thread must have
Fig. 10 a User interface for information; b user interface for rate and comment event
at least one comment posted, but there is no maximum limit to how many comments can be posted. Each forum thread is created by a single user. The next class is “Comment,” representing each comment in a thread. The “Comment” class stores the “userId” and “forumId,” which identify the user that posts a comment and the forum thread it belongs to. Other than those two attributes, two attributes store the comment title and comment description, and “DateTime” stores the date and time that the comment was posted. Each comment can only be written by a single musician and can only appear inside one forum thread. There is another type of user in the app, other than the session musicians: the event organizer. The “EventOrganizer” class is the representation of this type of user. This class has the same attributes as the musicians, except for the “musical instrument” attribute. The event organizers’ role in the app is to create events for the session musicians to participate and perform in. An event organizer can create many events, with a bare minimum of at least one event. Another class, called “Event,” acts as the connector between session musicians and event organizers. It stores the event’s name, the event’s ID, the organizer’s ID, and the number of performers or musicians. This number represents the quota of musicians needed for the specific event. An event can only be organized by a single event organizer but can be joined by one or more musicians. Each event
Fig. 11 Class diagram of mobile apps for session musicians community in Indonesia
can also only have one set of event details. Each event can also have multiple ratings and comments by multiple musicians. The class “EventDetails” is only associated with the “Event” class. The class “EventDetails” stores the event’s details, for example, the description of the event itself, the role of the performers or musicians, and the fee for the musicians. Each event detail can only be specified for a single event, as every event has different details. The “RegisterEvent” class is used to model the transactions between musicians and events, which are performance agreements since no actual money-related transactions can be made using the app. It stores the associated “userId,” “eventId,” and “DateTime.” Each event registration is made by one musician for one event. Lastly, the “RatingComment” class represents musicians’ ratings and comments on certain events. The class contains the associated “userId” and “eventId” and supporting attributes to store ratings and comments, which are named the same. Each rating and comment is associated with a single event and made by a single musician.
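As a rough, hypothetical sketch (the paper gives no implementation), the main entities of this class diagram could be modeled as plain data classes; the class and attribute names follow the diagram described above, but all types and the snake_case spellings are assumptions:

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical sketch of some classes from Fig. 11; class and attribute names
# follow the paper's class diagram, but the types are assumptions.

@dataclass
class Musician:
    user_id: str        # primary key ("userId")
    name: str
    email: str
    gender: str
    instrument: str     # musical instrument played
    password: str       # private credential

@dataclass
class EventOrganizer:
    user_id: str        # same attributes as Musician, minus the instrument
    name: str
    email: str
    gender: str
    password: str

@dataclass
class Event:
    event_id: str
    organizer_id: str   # each event is organized by exactly one organizer
    name: str
    performers_needed: int  # quota of musicians for this event

@dataclass
class RegisterEvent:
    user_id: str        # one registration per musician per event
    event_id: str
    registered_at: datetime

@dataclass
class RatingComment:
    user_id: str
    event_id: str
    rating: int         # scale of 1 to 5
    comment: str
```

The multiplicities (one organizer per event, one registration per musician per event) would be enforced by the database schema rather than by these classes themselves.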
4 Conclusion This paper proposes a mobile app for session musicians that can hopefully serve as a platform to connect session musicians and event organizers. Session musicians can browse available events to look for job opportunities, while event organizers can create events in the app and get session musicians to perform at their events. An explorative yet straightforward user interface was provided to attract session musicians. Several additional features can aid session musicians in fields other than job searching, such as a forum where they can discuss and share their ideas, and an information section containing much valuable information such as tips and tricks, articles, and more. As for future plans, more features will be added to make the app more supportive toward session musicians. These features include chat or direct messages between session musicians, a social network where they can connect and showcase their music and skills, in-app payment methods for event performances, and many more. In conclusion, it is hoped that this app can answer session musicians’ needs and help them provide for their families.
References
1. M.J. Khadavi, Dekonstruksi Musik Pop Indonesia dalam Perspektif Industri Budaya. J. Humanit. Univ. Muhammadiyah Malang 9(2), 47–56 (2014)
2. A. Bagaskara, Menegosiasi Otentitas: Kancah Musik Independen Indonesia dalam Konteks Komodifikasi oleh Perusahaan Rokok. MASYARAKAT J. Sosiol. 22(2), 235–255 (2017)
3. J. Herbst, T. Albrecht, The skillset of professional studio musicians in the German popular music recording industry. J. Ethnomusicol. 30, 121–153 (2018)
4. I. Campelo, That extra thing—the role of session musicians in the recording industry. J. Art Rec. Prod. 10 (2015)
5. J. Herbst, T. Albrecht, The work realities of professional studio musicians in the German popular music recording industry: careers, practices, and economic situations. J. Int. Assoc. Study Pop. Music 8(2), 18–37 (2018)
6. B. Wright, Reconstructing the history of Motown session musicians: the Carol Kaye/James Jamerson controversy. J. Soc. Am. Music 13(1), 78–109 (2019)
7. W. Wiflihani, Fungsi Seni Musik dalam Kehidupan Manusia. ANTHROPOS J. Antropol. Sos. Budaya 2(1), 101–107 (2016)
8. A. Kusumawardhani, Membangun Musik Indonesia Melalui Budaya Berbagi. J. Ilmu Komun. 11(2), 121–134 (2014)
9. A. Roffiq, I. Qiram, G. Rubiono, Media Musik dan Lagu Pada Proses Pembelajaran. J. Pendidik. Dasar Indones. 2(2), 35–40 (2017)
10. C. Aryanto, S. Hartono, Perbandingan Subjective Well-Being Musisi dan Non-Musisi. J. Ilmiah Psikol MIND SET 6(1), 1–13 (2014)
11. H.K. Flora, X. Wang, S.V. Chande, An investigation into mobile application development processes: challenges and best practices. Int. J. Mod. Educ. Comput. Sci. 1–9 (2014)
12. L. Chandi, C. Silva, D. Martinez, T. Gualotuna, Mobile application development process: a practical experience, in 12th Iberian Conference on Information Systems and Technologies (CISTI), Lisbon, Portugal, June 2017
13. T. Vithani, A. Kumar, Modeling the mobile application development lifecycle, in International MultiConference of Engineers and Computer Scientists (IMECS), Hong Kong, Mar 2014
14. N. Kumar, K. Krishna, R. Manjula, Challenges and best practices in mobile application development. Imp. J. Interdiscip. Res. 2(2), 1607–1611 (2016)
15. N. Shamat, S. Sulaiman, J. Sinpang, A systematic literature review on user interface design for web applications. J. Telecommun. Electr. Comput. Eng. 9(3–4), 57–61 (2016)
16. D. Saha, A. Mandal, User interface design issues for easy and efficient human computer interaction: an exploratory approach. Int. J. Comput. Sci. Eng. 3(1), 127–135 (2015)
17. S. Kocakoyun, Developing of Android mobile application using Java and Eclipse: an application. Int. J. Econ. Mech. Mechatron. Eng. 7(1), 1335–1354 (2017)
18. K. Amano, S. Tsuzuku, K. Suzuki, N. Hiraoka, Learning together for mastery by using a discussion forum, in 2019 International Symposium on Educational Technology (ISET), Hong Kong, July 2019
19. A. Ezen-Can, S. Kellog, K.E. Boyer, S. Booth, Unsupervised modeling for understanding MOOC discussion forums: a learning analytics approach, in 5th International Conference on Learning Analytics and Knowledge, Poughkeepsie, New York, USA, Mar 2015, pp. 146–150
20. J. McDougall, The quest of authenticity: a study of an online discussion forum and the needs of adult learners. Aust. J. Adult Learn. 55(1), 94–113 (2015)
21. M.G. Alzahrani, The effect of using online discussion forums on students’ learning. Turk. Online J. Educ. Technol. 16(1), 164–176 (2017)
22. B. Vivekanandam, Design an adaptive hybrid approach for genetic algorithm to detect effective malware detection in Android division. J. Ubiquitous Comput. Commun. Technol. 3(2), 135–149 (2021)
23. S.R. Mugunthan, T. Vijayakumar, Design of improved version of sigmoidal function with biases for classification task in ELM domain. J. Soft Comput. Paradigm (JSCP) 3(2), 70–82 (2021)
24. S. Chowdhury, Exclusion by design: a systematic assessment of community engagement in COVID-19 mobile apps. Doctoral dissertation, Icahn School of Medicine, Mount Sinai, 2022
A Genetic-Based Virtual Machine Placement Algorithm for Cloud Datacenter C. Pandiselvi and S. Sivakumar
Abstract Cloud computing is a process of renting hardware, software, and platforms over the internet. Users are charged on a pay-per-use model for these services. A cloud service provider provides resources on the basis of the user’s request. To efficiently allocate resources, server consolidation techniques are used. Virtual machine placement (VMP) is a server consolidation technique in which virtual machines are created and mapped to physical servers for allocation of resources. This research work explores the first-fit, best-fit, and genetic algorithms for mapping virtual machines to physical servers. For an optimal VMP mapping, an improved genetic algorithm (IGA) is proposed as a hybridization of the best-fit algorithm and the genetic algorithm. The performance of these VMP algorithms is evaluated with datasets obtained from Microsoft Azure, Google Cloud, and Amazon EC2. The simulation results showed that the proposed IGA algorithm improves resource utilization and decreases the execution time. Keywords Virtual machine placement · Genetic algorithm · Best fit · First fit
1 Introduction Cloud computing allows users to share a pool of resources based on their needs. Virtualization technologies enable live migration of virtual machines (VM), which allows VM resources such as CPU and memory to be freely moved between physical servers (PS) [1]. The server consolidation technique is used to place several VM on a single PS, allowing the PS to operate at maximum resource efficiency. Server consolidation requires four major steps, namely (i) PS overload detection,
C. Pandiselvi (B) Department of Computer Science, Cardamom Planters’ Association College, Bodinayakanur, India e-mail: [email protected] S. Sivakumar Cardamom Planters’ Association College, Bodinayakanur, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_21
(ii) PS underload detection, (iii) VM selection and migration, and (iv) VM placement [2]. Virtual machine placement (VMP) is the most important aspect of server consolidation. VMP is the process of mapping, i.e., deciding which VM should be mapped to which PS. The problem of VMP is to place VM resources reasonably according to the processing capacity of each PS. An improper VMP mapping can cause resources to be placed on unsuitable PS, resulting in unintended consequences. This might damage the cloud service provider’s reputation. To avoid this, an efficient technique should be utilized to accurately allocate resources to the PS in a way that allows their efficient performance. A genetic algorithm (GA) is a form of random searching with improved optimization and internal implicit parallelism. It can obtain and direct the optimized search space, as well as automatically alter the search direction [3]. The GA approach computes in advance the impact that a new VM resource will have on the system after deployment, based on historical data and the current conditions of the system. It then selects the solution with the least amount of impact on the system. This results in better server consolidation and a reduction in the number of dynamic virtual machines. With the advantages of GA, this paper presents a genetic algorithm and an improved genetic algorithm (IGA) for VMP in the cloud computing environment. IGA is used to determine an optimized solution for the problem based on crossover, mutation, and selection. The IGA starts its search from an arbitrary selection of solutions. Every solution is assigned a fitness value that is evaluated by the best-fit algorithm. Thereafter, three operators similar to natural genetic operators, namely crossover, mutation, and selection, are used to change the population of solutions into a new population.
It works iteratively, applying these three operators in sequence in each generation until a termination criterion is satisfied. The effectiveness of the IGA is evaluated using Microsoft Azure, Google Cloud, and Amazon EC2 datasets. Microsoft Azure features instance types with one or more instance sizes that can be scaled to meet the needs of a specific workload. Google Cloud runs virtual machines on Google’s infrastructure. Amazon EC2 provides a wide selection of instance types optimized for different use cases. Premature convergence and a long execution time are the main limitations of genetic algorithms [4]. Premature convergence causes many genetic algorithms to settle on sub-optimal solutions, and most genetic methods take a long time to process multiple generations before reaching the best result. IGA is therefore designed as a hybrid of the best-fit and genetic algorithms, with the goal of overcoming these limitations in the placement problem. The major contributions of this work are:
(i) The performance of the first-fit and best-fit techniques is evaluated to find the fitness value of the VMP problem with heterogeneous datasets.
(ii) A novel placement idea, an improved genetic algorithm (IGA), is proposed, and its performance is compared with the other algorithms.
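The IGA idea described above (seed a genetic search with a best-fit solution, then evolve VM-to-PS assignments with selection, crossover, and mutation) can be sketched minimally as follows. This is an illustration, not the paper's exact procedure: truncation selection, one-point crossover, a CPU-only capacity model, the infeasibility penalty, and all parameter values are assumptions.

```python
import random

def feasible(assign, vm_cpu, ps_cap):
    """Check that no physical server's CPU capacity is exceeded."""
    load = [0.0] * len(ps_cap)
    for vm, ps in enumerate(assign):
        load[ps] += vm_cpu[vm]
    return all(load[i] <= ps_cap[i] for i in range(len(ps_cap)))

def fitness(assign, vm_cpu, ps_cap):
    """Number of servers used; infeasible assignments get a large penalty."""
    used = len(set(assign))
    return used if feasible(assign, vm_cpu, ps_cap) else used + len(ps_cap)

def best_fit_seed(vm_cpu, ps_cap):
    """Best fit: place each VM on the fitting server with least free capacity."""
    free = list(ps_cap)
    assign = []
    for need in vm_cpu:
        ps = min((i for i, f in enumerate(free) if f >= need),
                 key=lambda i: free[i])
        free[ps] -= need
        assign.append(ps)
    return assign

def iga(vm_cpu, ps_cap, pop_size=20, generations=100, pm=0.1, seed=0):
    rng = random.Random(seed)
    n_vm, n_ps = len(vm_cpu), len(ps_cap)
    # Hybrid initialization: one best-fit individual plus random individuals.
    pop = [best_fit_seed(vm_cpu, ps_cap)]
    pop += [[rng.randrange(n_ps) for _ in range(n_vm)]
            for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda a: fitness(a, vm_cpu, ps_cap))
        survivors = pop[: pop_size // 2]      # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_vm)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            for i in range(n_vm):             # per-gene mutation
                if rng.random() < pm:
                    child[i] = rng.randrange(n_ps)
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda a: fitness(a, vm_cpu, ps_cap))
```

Because truncation selection always retains the best individual, the assignment returned by this sketch is never worse than the best-fit seed it starts from.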
2 Related Work Traditional VM placement algorithms in cloud data centers rely solely on the present state of the overall system. They overlook system variability and historical behavioral data, resulting in system overloading and load unbalancing. The GA, however, computes the impact on both system overloading and unbalancing. Kumar and Smys [5] formulate a ubiquitous service for medical care utilizing the cloud; blockchain technology is used to safeguard the reliability of the balanced data. Andi [6] presented a serverless architecture that adds another layer to the cloud computing model, in which the management of servers is abstracted from the developers; this provides two services, Backend as a Service and Function as a Service. Shakya [7] introduced a data security analysis and a privacy protection framework for data migration; the approach provides a strict separation between sensitive and non-sensitive data and encrypts the sensitive data. Bhalaji [8] proposed accurate prediction of the workload and sequence of resources along a time series. Goncalves and Resende [9] proposed a biased random-key genetic algorithm for 2D and 3D bin-packing problems and virtualization management in data centers. Sonklin et al. [10] presented an improved genetic algorithm for virtual machine placement and the VM policy employed in VM consolidation. Lu et al. [11] explored a genetic algorithm, and some of its modified versions, for optimal machine placement based on an improved genetic algorithm in the cloud. Belgacem et al. [12] investigate a virtual machine placement approach based on the micro genetic algorithm in cloud computing. Pandiselvi and Sivakumar [13] proposed a particle swarm optimization bin-packing algorithm to utilize resources and reduce energy consumption. The proposed algorithm is a heuristic method to address the virtual machine placement problem.
Moreover, the sizes of the working set are considered while placing applications on physical machines. Rashida et al. [14] proposed a genetic algorithm based on memetic grouping with the objective of cost-efficient and energy-efficient VM placement in a multi-cloud environment. The fitness function determines the competence of the GA. In GA, the fitness function is a problem-dependent and critical criterion for getting optimal results. When defining the fitness function for a problem is difficult, a simulation can be utilized to determine the fitness function value of a genetic algorithm. Kour et al. investigated the application of fitness functions in various fields. The fitness function is critical in applications because the success of a genetic algorithm is determined by how effective the fitness function is; if the fitness function is poorly designed, the genetic algorithm’s later operations will not provide an optimized output [15]. Theja and Babu presented a fitness-based adaptive
evolutionary method for VM policy that has been employed in VM consolidation [16]. Xu et al. explored multidimensional resource load balancing, in which every PM on the cloud computing platform is focused on maximizing resource utilization; the approach customizes ant colony optimization in the framework of virtual machine sharing to avoid early convergence or slipping into local optima [17]. Li et al. investigate a genetic algorithm with a novel heuristic packing technique that translates genetic material (box-filling and container-loading sequences) into a compressed filling solution [18]. Pandiselvi and Sivakumar, to find the best virtual machine placement, analyzed a bin-packing technique with four different fitness strategies [19]. To better understand the behavior of the fitness function, a review of numerous research articles was conducted, and it was discovered that various fitness values based on genetic algorithms have been examined for diverse applications. The GA algorithm is compared with the best-fit, first-fit, worst-fit, best-fit decreasing, and first-fit decreasing placement algorithms, as tabulated in Table 1.

Table 1 Related works based on genetic algorithm

| Author | Algorithm | Based on | Resources considered | Objective | Performance better than |
|---|---|---|---|---|---|
| Kour et al. [15] | Fitness functions with different domains | Genetic | CPU | Resource utilization, minimizing cost and time | Best-fit, first-fit algorithm |
| Theja and Babu [16] | Adaptive genetic algorithm (A-GA) | Genetic | CPU | VM reuse strategy | Best-fit decreasing algorithm |
| Xu et al. [17] | Ant colony optimization | Genetic | CPU, memory, and storage | SLA violation | First-fit decreasing, worst-fit algorithm |
| Li et al. [18] | Packing heuristic procedure | Genetic | CPU, memory | High consumption time | Greedy algorithm, first-fit algorithm |
| Pandiselvi and Sivakumar [19] | Particle swarm optimization bin packing | Genetic | CPU | Energy consumption, resource utilization | Best fit, first fit |
3 Placement Algorithms
Placement algorithms are one of the key mechanisms in data centers for designing an efficient server consolidation in the cloud. The principle is based on selecting the most suitable PS for each VM, so the placement algorithm aims at determining the most optimal VM-to-PS mapping. The mapping can achieve both an increase in resource utilization and a decrease in PS overloading. The placement algorithms, namely first fit, best fit, and the standard genetic algorithm, are used to minimize the execution time and maximize the resource utilization.
Resource utilization problem: Consider a cloud data center that provides a limited number of PS with resources such as CPU capacity and memory in GB. Since there are a large number of user requests for resources, the cloud service provider must decide how many PS can be allocated to VM based on the cloud users’ requests. Through virtualization technology, the cloud data center has an unlimited number of VM that are used to deploy on PS for maximizing resource utilization. To formulate resource utilization in the VMP problem, a set of destination servers, S, with resource capacities (i.e., CPU or memory), and a set of user requests to be allocated, P, are given. The destination servers are numbered 1, 2, 3, ..., n. The number of user-requested services is m, numbered 1, 2, 3, ..., m. Minimizing the number of PS used can then be formulated as follows:

\min \sum_{i=1}^{n} PS_i \quad (1)

Subject to:

\sum_{i=1}^{n} X_{ij} = 1, \quad j = 1, 2, \ldots, m \quad (2)

PS_i \cdot PScpu_i \ge \sum_{j=1}^{m} Scpu_j \cdot X_{ij} \quad (3)

PS_i \cdot PSmem_i \ge \sum_{j=1}^{m} Smem_j \cdot X_{ij} \quad (4)

where
• In Eq. (1), PS_i specifies whether the ith server is being used, i.e., PS_i = 1 specifies that the ith server is being used; otherwise, it is not being used.
• In Eq. (2), X_{ij} specifies the allocation of resources to servers, i.e., X_{ij} = 1 specifies that resource j is allocated to the ith server; otherwise, resource j is not allocated to the ith server.
306
C. Pandiselvi and S. Sivakumar
• In Eq. (3), PScpu_i is the CPU capacity provided by the ith server, while Scpu_j is the CPU capacity needed by request j.
• In Eq. (4), PSmem_i is the memory capacity provided by the ith server, while Smem_j is the memory capacity needed by request j.

First fit (FF): The FF algorithm places each user request on the first available PS whose capacity is greater than or equal to its size. Each request is allocated to the lowest-indexed PS into which it fits; the FF method does not search for the most appropriate size, it simply allocates the request to the nearest PS with sufficient capacity. The execution times for allocating all resources are summed.

Best fit (BF): The BF algorithm searches the full list of free PS for the one whose capacity is closest to the size actually required. In this approach, the free list of PS is kept in order from smallest to largest capacity. The algorithm's execution time for placing all VM on the machines is then computed and recorded.

Table 2 lists the 12 VM instances of the dataset prepared for general, compute, memory, and storage purposes in this research; for each cloud service provider there are four types of virtual machines. The providers are Amazon EC2, Microsoft Azure, and Google Cloud. Amazon EC2 [20] provides a wide selection of instance types optimized for different use cases. Instance types comprise varying combinations of CPU, memory, storage, and networking capacity, offering the flexibility to choose the appropriate mix of resources for an application; each instance type includes one or more instance sizes, allowing applications to scale resources to the requirements of the target workload. Microsoft Azure [21] improves the accuracy of cloud infrastructure models with publicly available datasets, saving time on data discovery and preparation through curated datasets that are ready for machine learning workflows and easy to access from Azure services. Google Cloud instances [22] are ideal for compute-bound applications that benefit

Table 2 VM instance dataset
Cloud service provider | VM instances  | CPU capacity | Memory in GB
Microsoft Azure        | D1v2/1        | 1            | 3.5
Microsoft Azure        | D2V2/2        | 2            | 7
Microsoft Azure        | D3v2/3        | 4            | 14
Microsoft Azure        | D4v2/4        | 8            | 28
Google Cloud           | G1-STD-1      | 1            | 3.75
Google Cloud           | G1-STD-2      | 2            | 7
Google Cloud           | G1-STD-4      | 4            | 15
Google Cloud           | G1-STD-8      | 8            | 30
Amazon EC2             | EC2.small.1   | 1            | 2
Amazon EC2             | EC2.large.2   | 2            | 8
Amazon EC2             | EC2.xlarge.3  | 4            | 16
Amazon EC2             | EC2.2xlarge.4 | 8            | 32
from high-performance processors. Instances belonging to this family are well suited for batch processing workloads, media transcoding, high-performance web servers, high-performance computing (HPC), scientific modeling, dedicated gaming servers, ad server engines, machine learning inference, and other compute-intensive applications.

Genetic Algorithm (GA)

The GA can be seen as an intelligent probabilistic search of the solution space of a hard problem. As the name suggests, the terminology of the GA is derived from evolutionary biology, where individuals of a population arise from a recombination of the genetic characteristics of their parents, plus a small probability of random genetic mutation. Population initialization, fitness function computation, selection, crossover, and mutation are the steps of the GA.

Population Initialization

Generating the initial population is the first step of the GA. Population initialization is critical since it can affect both the speed of convergence and the quality of the final solution. To make it suitable for genetic operations, each solution in the population is referred to as an individual, and each individual is represented as a chromosome. Individuals are chosen from the initial population and operated on to create the following generation.

Fitness Calculation

A fitness function measures the quality of each chromosome in the population according to the given optimization objective; the fitness value is obtained by evaluating the chromosome. Here the objective function of Eq. (1), minimizing the number of active PS, is used. Calculating the fitness value of every chromosome is itself a difficult problem, hence the GA is utilized to search for the fittest value.

Selection

The next phase of the GA is selection. After the fitness value of each chromosome has been calculated, selection procedures are used to choose chromosomes from the population.
Roulette wheel selection and tournament selection are examples of common selection schemes. To prevent the population from falling into local convergence and degradation, tournament selection, which may lead to a local solution, is avoided. Instead, a selection technique based on fitness-value order is used: the fitness value of each chromosome is determined first, the chromosomes are then sorted from high to low, and those with high fitness values are handed over to the next generation. It must be ensured that no duplicate chromosomes occur during this operation. At the same time, a small portion of the poorest solutions is retained, which helps avoid local convergence.

Crossover

The goal of crossover, also known as recombination, is to produce offspring from two parents while preserving as much relevant information from both parents as possible. A
crossover operator is used to combine the genetic information of two parents and create new offspring. Crossover operators include single-point crossover, double-point crossover, partial crossover, sequential crossover, and so on.

Mutation

Mutation is a process in which a portion of a gene is arbitrarily altered. The mutation operator can recreate good chromosomes that are lost during the selection or crossover stages. It also ensures that the probability of reaching any point in the problem space is never zero, regardless of the dispersion of the initial population. Here the mutation operator consists of picking two chromosomes at random and swapping genes between them; genes are also swapped within the same chromosome to avoid repeated visits. To carry out this mutation, two chromosomes are chosen at random from the array containing the permutation, and their genes are swapped.

Algorithm 1: Procedure of GA
1. Begin
2. Initialize population
3. Evaluate fitness value by objective function
4. Repeat till (termination condition occurs)
5. Perform
   a. Chromosome selection
   b. Parent crossover
   c. Offspring mutation
   d. New chromosome evaluation
   e. Select chromosomes for the next generation
6. End

The GA procedure of Algorithm 1 is used to analyze efficient VM placement. At step 2, the procedure starts with the initial population, i.e., the number of PS and the number of VM with their CPU capacities. At step 3, the fitness value is measured according to the given optimization objective. At step 5a, the two fittest chromosomes are selected; at step 5b, the crossover operator swaps genes between them; and at step 5c, new offspring are produced by mutation. At steps 5d and 5e, the produced offspring are evaluated to obtain a new fitness value, and the PS placement corresponding to that fitness value is adopted.
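As a sketch (not the authors' implementation), Algorithm 1 might look as follows in Python. The chromosome encoding (one PS index per VM), the penalty weight for violating the CPU constraint of Eq. (3), and all numeric parameters are illustrative assumptions.

```python
import random

random.seed(0)

def fitness(chromo, cpu_demand, cpu_cap):
    # Objective (1): number of active PS, plus a penalty whenever a PS's
    # total CPU demand exceeds its capacity (constraint (3)). Lower is better.
    load = {}
    for vm, ps in enumerate(chromo):
        load[ps] = load.get(ps, 0) + cpu_demand[vm]
    overload = sum(max(0, used - cpu_cap[ps]) for ps, used in load.items())
    return len(load) + 10 * overload

def crossover(p1, p2):
    # single-point crossover (step 5b)
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def mutate(chromo, n_ps, rate=0.1):
    # step 5c: move a VM to a random PS with a small probability
    return [random.randrange(n_ps) if random.random() < rate else g for g in chromo]

def ga_placement(cpu_demand, cpu_cap, generations=100, pop_size=20, elite=5):
    n_vms, n_ps = len(cpu_demand), len(cpu_cap)
    # step 2: initial population, one PS index per VM
    pop = [[random.randrange(n_ps) for _ in range(n_vms)] for _ in range(pop_size)]
    key = lambda c: fitness(c, cpu_demand, cpu_cap)
    for _ in range(generations):                  # step 4: repeat
        pop.sort(key=key)                         # step 5a: rank-based selection
        parents = pop[:elite]
        pop = parents + [mutate(crossover(random.choice(parents),
                                          random.choice(parents)), n_ps)
                         for _ in range(pop_size - elite)]
    return min(pop, key=key)                      # step 5e: fittest chromosome

# hypothetical instance: five VM CPU demands, four PS of capacity 8 each
best = ga_placement([1, 2, 4, 1, 2], [8, 8, 8, 8])
```

Folding the capacity constraint into the fitness as a penalty keeps every chromosome evaluable even when it is infeasible, which is one common way to handle constraints in a GA.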
Improved Genetic Algorithm (IGA)

The genetic algorithm is a metaheuristic based on the theory of evolution: a randomized algorithm in which random changes are applied to the current solution to find a new solution. The IGA evaluates the fitness value as a best-fit value using the best-fit algorithm, and the IGA procedure is used to analyze efficient resource utilization. The fitness value is one of the pivotal quantities that the IGA tries to optimize. The IGA begins by generating problem
Fig. 1 Flowchart of IGA
solutions as a population. This population then undergoes an evaluation of fitness value using the best-fit algorithm. Each iteration includes the following processes: (i) selection, (ii) crossover, and (iii) mutation, in which evolution data such as the fitness value are updated, as shown in Fig. 1.

Algorithm 2: Procedure of IGA
1. Begin
2. Initialize population
3. Evaluate fitness value by best-fit algorithm
4. Repeat till (termination condition occurs)
5. Perform
   a. Chromosome selection
   b. Parent crossover
   c. Offspring mutation
   d. New chromosome evaluation
   e. Select chromosomes for the next generation
6. End
The IGA procedure of Algorithm 2 starts by initializing the number of PS and the number of VM with their CPU capacities. In step 3, the fitness value is calculated using the best-fit algorithm: for each VM, the smallest sufficient PS among the free available PS is searched for, i.e., the algorithm picks the VM and finds the minimum PS that can be assigned to
the current VM. The best-fit value is then selected at step 5a, where the selection process chooses the two best values at random. In step 5b, the crossover operator swaps the two fitness values, producing offspring. In step 5c, the produced offspring are interchanged to create a new best-fit value, and the PS is placed at that best-fit value more efficiently.
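One common way to realize "fitness evaluated by best fit" is a permutation encoding: a chromosome is an ordering of the VMs, decoded by the best-fit rule, with the swap mutation described earlier exchanging two genes. The sketch below follows that assumption; it is illustrative, not the authors' code, and the uniform PS capacity of 16 is hypothetical.

```python
import random

random.seed(1)

def best_fit_count(order, cpu_demand, cap):
    # Step 3 of Algorithm 2: decode a chromosome (an ordering of VM indices)
    # with best fit; each VM goes to the open PS it fits most tightly, and a
    # new PS is opened only when none fits. Returns the PS count (fitness).
    free = []
    for vm in order:
        fits = [(free[i] - cpu_demand[vm], i)
                for i in range(len(free)) if free[i] >= cpu_demand[vm]]
        if fits:
            free[min(fits)[1]] -= cpu_demand[vm]
        else:
            free.append(cap - cpu_demand[vm])
    return len(free)

def order_crossover(p1, p2):
    # keep a slice of one parent, fill the rest in the other parent's order
    a, b = sorted(random.sample(range(len(p1)), 2))
    mid = p1[a:b]
    rest = [g for g in p2 if g not in mid]
    return rest[:a] + mid + rest[a:]

def swap_mutation(perm):
    # the swap mutation described above: exchange two randomly chosen genes
    p = perm[:]
    i, j = random.sample(range(len(p)), 2)
    p[i], p[j] = p[j], p[i]
    return p

def iga(cpu_demand, cap, generations=100, pop_size=20, elite=5):
    n = len(cpu_demand)
    pop = [random.sample(range(n), n) for _ in range(pop_size)]
    fit = lambda c: best_fit_count(c, cpu_demand, cap)
    for _ in range(generations):
        pop.sort(key=fit)                       # selection by fitness order
        parents = pop[:elite]
        pop = parents + [swap_mutation(order_crossover(random.choice(parents),
                                                       random.choice(parents)))
                         for _ in range(pop_size - elite)]
    best = min(pop, key=fit)
    return best, fit(best)

# the 12 CPU capacities of Table 2 as demands, on PS of (assumed) capacity 16
perm, used = iga([1, 2, 4, 8, 1, 2, 4, 8, 1, 2, 4, 8], 16)
```

Because every permutation decodes to a feasible packing, no penalty term is needed here, in contrast to the direct VM-to-PS encoding.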
4 Performance Evaluation

4.1 Experimental Setup

NetBeans IDE is an open-source (https://en.wikipedia.org/wiki/Open-source_software) integrated development environment. The language-aware NetBeans editor detects errors and assists with documentation popups and smart code completion, with the speed and simplicity of a text editor. The Java editor in NetBeans is much more than a text editor: it indents lines, matches words and brackets, and highlights source code syntactically and semantically. The NetBeans IDE is therefore used to implement the IGA. The performance of the IGA is evaluated in terms of execution time and resource utilization. Because access to real data centers is challenging, a simulation-based evaluation is used to compare the proposed algorithm's performance to existing works currently used by the majority of cloud service providers. The simulated cloud environment includes a variety of PS as well as a number of randomly created VM resource demands, such as CPU capacity and memory.
4.2 Evaluation of the FF and BF Algorithms

The performance of the FF and BF algorithms is evaluated to find the fitness value. The computed execution time of each algorithm is used as the fitness criterion. Execution time here refers to the simulation processing time to place all VM; the better strategy is the one with the minimum average execution time, since it completes all of the VM placement processes in the data center fastest. The experiments identify the effect of applying the fitness value in the IGA, so the two placement algorithms FF and BF are examined using the three VM instance datasets shown in Table 2. In this scenario, Table 3 compares the computed execution times of the FF and BF algorithms on the physical servers PS1, PS2, PS3, and PS4, and shows that the BF algorithm gives better results than the FF algorithm. The BF algorithm is therefore more suitable for handling VM requests efficiently in the IGA.
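The two heuristics compared in this section can be sketched as follows. The demand list reuses the memory column of Table 2, and the uniform 32 GB PS capacity is an illustrative assumption; wall-clock times measured this way will of course differ from the simulation times of Table 3.

```python
import time

def first_fit(demands, cap):
    # FF: put each request on the first open PS with enough remaining capacity
    free = []
    for d in demands:
        for i, f in enumerate(free):
            if f >= d:
                free[i] -= d
                break
        else:
            free.append(cap - d)      # no PS fits: open a new one
    return len(free)

def best_fit(demands, cap):
    # BF: put each request on the open PS it fits most tightly
    free = []
    for d in demands:
        fits = [(f - d, i) for i, f in enumerate(free) if f >= d]
        if fits:
            free[min(fits)[1]] -= d
        else:
            free.append(cap - d)
    return len(free)

demands = [3.5, 7, 14, 28, 3.75, 7, 15, 30, 2, 8, 16, 32]  # memory (GB), Table 2

t0 = time.perf_counter()
ps_ff = first_fit(demands, 32)
t_ff = time.perf_counter() - t0

t0 = time.perf_counter()
ps_bf = best_fit(demands, 32)
t_bf = time.perf_counter() - t0
```

On this instance both heuristics need at least ceil(166.25/32) = 6 servers, and BF never uses more PS than FF does here, matching the trend reported in Table 3.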
Table 3 Execution time for the BF and FF algorithms for different datasets

Execution time (in s)

        | Azure          | Google         | Amazon EC2
        | BF     | FF    | BF     | FF    | BF     | FF
PS1     | 3.059  | 4.090 | 3.005  | 4.015 | 3.000  | 4.003
PS2     | 3.012  | 4.016 | 3.015  | 4.015 | 3.003  | 4.015
PS3     | 3.015  | 4.015 | 3.003  | 4.015 | 3.014  | 4.015
PS4     | 3.014  | 4.014 | 3.001  | 4.001 | 3.016  | 4.014
Total   | 12.10  | 16.10 | 12.02  | 16.04 | 12.03  | 16.04
Average | 3.025  | 4.025 | 3.005  | 4.010 | 3.000  | 4.010
Figure 2 shows the average execution time of the BF algorithm: 3.025 s for the Azure dataset, 3.005 s for the Google dataset, and 3.000 s for the Amazon EC2 dataset. Figure 3 shows the average execution time of the FF algorithm: 4.025 s for the Azure dataset, 4.010 s for the Google dataset, and 4.010 s for the Amazon EC2 dataset. The comparison of average execution times clearly shows that the BF algorithm, particularly on the Amazon EC2 dataset, performs significantly better than the FF approach. To obtain an efficient fitness value, the BF algorithm is therefore applied in Algorithm 2 (the IGA).

Fig. 2 Average execution time for the best-fit algorithm

Fig. 3 Average execution time for the first-fit algorithm
Fig. 4 Placement of VM by IGA (number of VM placed vs. number of PS used; legend: PS in idle, VM placed)
4.3 Evaluation of the IGA Algorithm

Parameter configuration is critical for evaluating the effectiveness of the IGA. Accordingly, the algorithms were run five times on the same instance to obtain reliable results. The parameters used in the IGA were determined empirically so as to produce a satisfying solution in an acceptable length of time; the maximum number of iterations is 100. To validate the IGA's efficiency, it is tested on the three cloud providers listed in Table 2: Amazon EC2, Microsoft Azure, and Google Cloud. The performance of the IGA is evaluated by resource utilization and the overall execution time of the algorithm. Resource utilization and VM placement are analyzed separately for the IGA and GA algorithms, which take 12 sets of physical servers from the VM instance dataset of Table 2. The placement of the VM CPU capacities on the PS by the IGA, with the remaining PS put in idle, is shown in Fig. 4; the corresponding placement by the GA is shown in Fig. 5. With the IGA, the number of PS needed to place the VM is smaller than with the GA, which demonstrates more efficient placement and resource utilization.
Fig. 5 Placement of VM by GA (number of VM placed vs. number of PS used; legend: PS in idle, VM placed)
Table 4 Simulation result of IGA and GA

Algorithm type | Execution status | No. of VM instances used | No. of VM placed in PS | PS in idle | Execution time in s
IGA            | Success          | 12                       | 03                     | 09         | 5.005
GA             | Success          | 12                       | 04                     | 08         | 5.012
Fig. 6 Comparison of execution time (ms) over the number of iterations for the IGA and GA algorithms
Table 4 displays the overall execution time of the two algorithms, IGA and GA. The table shows that with the IGA, three PS are used for placement and nine PS are put in idle, so their resources are freed; the execution time of the IGA is also lower. As a result, compared to the existing GA, the IGA uses fewer PS, responds faster, achieves a higher resource utilization rate, and consumes less power. Figure 6 compares the simulation execution time to place the VM and shows that the IGA has a lower execution time than the GA. The IGA therefore performs better than the GA algorithm.
5 Conclusion

Virtual machine placement is one of the most important problems in cloud computing, and a suitable placement is necessary for good system performance. Three placement methods are discussed in this paper, best fit, first fit, and the genetic algorithm, together with a new placement algorithm. The experimental results reveal that the proposed improved genetic algorithm performs better than the genetic algorithm in terms of execution time and resource utilization; the proposed algorithm thus outperforms the traditional genetic algorithm. This strategy can be implemented in existing cloud computing systems
to improve resource efficiency while reducing physical server overloading and algorithm execution time.
References

1. P. Srivastava, R. Khan, A review paper on cloud computing. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 8(6) (2018). ISSN: 2277-128X
2. C. Pandiselvi, S. Sivakumar, A review of virtual machine algorithm in cloud data centre for server consolidation. IJERCSE 5(3), 182–188 (2018)
3. M. Chen, M. Li, F. Cai, A model of scheduling optimizing for cloud computing resource services based on buffer-pool agent, in 2010 IEEE International Conference on Granular Computing (GrC), Aug 2010
4. N. Avinash Kumar Sharma, A multi objective genetic algorithm for virtual machine placement in cloud computing. Int. J. Innov. Technol. Explor. Eng. (IJITEE) 8(8) (2019). ISSN: 2278-3075
5. D. Kumar, S. Smys, Enhancing security mechanisms for healthcare informatics using ubiquitous cloud. J. Ubiquitous Comput. Commun. Technol. 2(1), 19–28
6. H.K. Andi, Analysis of serverless computing techniques in cloud software framework. J. IoT Soc. Mob. Anal. Cloud 3(3), 221–234 (2021)
7. S. Shakya, An efficient security framework for data migration in a cloud computing environment. J. Artif. Intell. 1(01), 45–53 (2019)
8. N. Bhalaji, Cloud load estimation with deep logarithmic network for workload and time series optimization. J. Soft Comput. Paradigm 3(3), 234–248 (2021)
9. J.F. Goncalves, M.G. Resende, A biased random key genetic algorithm for 2D and 3D bin packing problems. Int. J. Prod. Econ. 145(2), 500–510 (2013)
10. C. Sonklin et al., An improved genetic algorithm for the virtual machine placement problem. Aust. J. Intell. Inf. Process. Syst. 16(1), 73–80 (2019)
11. J. Lu et al., Optimal machine placement based on improved genetic algorithm in cloud computing. J. Supercomput. (2021). https://doi.org/10.1007/s11227-021-03953
12. A. Belgacem et al., New virtual machine placement approach based on the micro genetic algorithm in cloud computing, pp. 66–72 (2021)
13. C. Pandiselvi, S. Sivakumar, Constraint programming approach based virtual machine placement algorithm for server consolidation in cloud data center. IJCSE 6(8) (2018). E-ISSN: 2347-2693
14. S.Y. Rashida et al., A memetic grouping genetic algorithm for cost efficient VM placement in multi-cloud environment. Clust. Comput. (2019). https://doi.org/10.1007/s10586-019-02956-8
15. H. Kour, P. Sharma, P. Abrol, Analysis of fitness function in genetic algorithms. J. Sci. Tech. Adv. 1(3), 87–89 (2015)
16. P.R. Theja, S.K.K. Babu, An adaptive genetic algorithm based robust QoS oriented green computing scheme for VM consolidation in large scale cloud infrastructures. J. Sci. Technol. (2014). https://doi.org/10.17485/ijst/2015/v8i27/79175
17. P. Xu, G. He, Z. Li, Z. Zhang, An efficient load balancing algorithm for virtual machine allocation based on ant colony optimization. Int. J. Distrib. Sens. Netw. 14(12) (2018). https://doi.org/10.1177/1550147718793799
18. X. Li, Z. Zhao, K. Zhang, A genetic algorithm for the three-dimensional bin packing problem with heterogeneous bins, in Proceedings of the Industrial and System Engineering Research Conference (2014)
19. C. Pandiselvi, S. Sivakumar, Performance of particle swarm optimization bin packing algorithm for dynamic virtual machine placement for the consolidation of cloud server. IOP Conf. Ser. Mater. Sci. Eng. 1110 (2021). https://doi.org/10.1088/1757X/1110/1/012007
20. Amazon EC2, https://aws.amazon.com/ec2/instance-types
21. Microsoft Azure, https://docs.microsoft.com/en-in/azure/open-datasets/dataset-catalog
22. Google Cloud, https://console.cloud.google.com/marketplace
Visual Attention-Based Optic Disc Detection System Using Machine Learning Algorithms A. Geetha Devi, N. Krishnamoorthy, Karim Ishtiaque Ahmed, Syed Imran Patel, Imran Khan, and Rabinarayan Satpathy
Abstract Computer-aided diagnosis relies heavily on the accurate localization of the optic disc (OD) prior to OD segmentation. Medical diagnostic systems using deep learning are being developed, but they typically necessitate a huge amount of computation due to the nature of medical imaging. OD pre-processing therefore uses an algorithm that mimics human visual attention to locate the OD automatically. As humans, we use our visual perception to make sense of our surroundings: when looking at a picture, certain parts of it catch the eye. Human visual perception can be predicted using computational visual attention (CVA) models, which were developed on the basis of notions of human visual perception. For finding the OD in fundus retinal images, the bottom-up (BU) saliency paradigm is tested. When paired with mathematical morphology, the IK saliency model and Otsu's technique are effective tools for detecting the OD in retinal images.

Keywords CVA · Fundus · Human visual perception · Optic disc · Retinal images
A. Geetha Devi (B)
Department of ECE, PVP Siddhartha Institute of Technology, Vijayawada, Andhra Pradesh, India
e-mail: [email protected]

N. Krishnamoorthy
MCA Department, SRM Institute of Science and Technology, 89 Bharathi Salai, Ramapuram Campus, Chennai, India

K. I. Ahmed · S. I. Patel · I. Khan
Computer Science, Bahrain Training Institute, Higher Education Council, Ministry of Education, Manama, Bahrain

R. Satpathy
CSE (FET), Sri Sri University, Cuttack, Odisha, India

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_22

1 Introduction

Today, glaucoma is one of the most pressing health and therapeutic issues of our time because of its rapid progression, and the number of persons affected by the disease is on the rise. Glaucoma affects 1–2% of the population, and over half
of those people are unaware that they have it. Medical images, as we know, play a vital role in conveying information from different sections of the body, in detecting disorders, and in medical research and education [1]. As a result of such studies, glaucoma may now be diagnosed automatically using image processing algorithms. Automated image processing systems can process massive volumes of photos in a short period of time and at low cost while avoiding human errors and other flaws. The optic disc is a clear, almost round region in the retina, with a diameter of about 1.5 mm and a boundary at its periphery. The optic nerve can be found in a yellowish-white region at the center of the optic disc. The donut-like appearance of the optic nerve head is due to the optic cup's depth being greater than the surrounding nerve tissue [2]. Figure 1 shows the optic disc and cup. When axons from the ganglion cells depart the eye, they form the optic nerve at the optic disc. Extracting a visual representation from an optic disc image is meant to make it more meaningful and easier to understand. Extraction of objects and boundaries from images can be done using retinal image extraction [3, 4]: for some characteristic or computed attribute, every pixel in a region is the same. The first step in any image analysis method is commonly called "image extraction"; feature extraction and object recognition then depend significantly on the extraction's performance, and an object may never be recognized unless a good extraction algorithm is used [5]. The goal of image extraction is to divide an image into useful sections for a specific application. It is possible to extract useful information about the scene's surfaces from

Fig. 1 Optic disc and optic cup in the retinal image
simple gray-level images. A sequence of operations aiming at gaining a comprehensive knowledge of an image usually begins with image extraction as an important first step [6, 7].
2 Related Work

Retinopathy diagnosis and fundus image analysis rely on the optic disc's position as a starting point for subsequent processes, such as optic disc segmentation. An enhanced Harris corner location algorithm was presented by Deng et al. in 2021. When examining the retinal fundus image, the optic disc appears to have the most corners, owing to the presence of densely packed vessels and apparent gray shading in the image. Image augmentation, vessel extraction, matched filters, and other approaches are all utilized to pinpoint the precise location of the subject matter [8, 9]. Computer-aided diagnosis relies heavily on the accurate localization of the optic disc (OD) prior to OD segmentation. Deep learning research is encouraging the development of sophisticated medical diagnostic systems, yet medical images frequently require a considerable amount of computation. Automated detection of the location of the OD is performed using an algorithm that mimics human visual attention [10]. Liang and colleagues proceed in four steps, in the order listed [11]. Both healthy and diseased retinal pictures were included in the two datasets used to evaluate their approach: on the MESSIDOR dataset the OD is properly identified in 1195 of the 1200 photos (99.58%), and on DRIVE the detection accuracy is 100%, which exceeds current models. Two datasets of the prospective OD region are presented [12].
3 Proposed Visual Attention-Based Optic Disc Detection System

Figure 2 depicts the main phases of the VAODD system. The STructured Analysis of the REtina (STARE) project dataset was employed as the primary data source for this study [13–15]. Of the 79 fundus retinal images in the chosen subset, 52 are of the disjunctive type and 27 are of the conjunctive type. Saliency maps are computed using the Itti–Koch (IK) (2001) computational BU saliency model, which is closely aligned with the Feature Integration Theory (FIT). As shown in Fig. 2, the system is broken down into the following stages: pre-processing, processing, and post-processing.
Fig. 2 Major steps in the visual attention-based optic disc detection (VAODD) system
Fig. 3 a Actual image, b locations detected by the IK model
3.1 Pre-processing

The foreground of a fundus retinal image can vary greatly from the background in color, intensity, and orientation, as seen in Fig. 3a; Fig. 3b shows the locations detected using the IK model [16]. After inspecting all four corners, the OD region can be distinguished owing to the varying color, intensity, and orientation of the retinal image and its background. Before employing the IK model for OD detection, a pre-processing step is necessary. Pre-processing applies the morphological opening operator (∘) to the original image Im. The opening of Im by a structuring element B is defined as:

Im_pre = Im ∘ B    (1)

where B is a disc-shaped structuring element. The opened image Im_pre is then used as the input to the processing stage.
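The opening of Eq. (1) can be sketched in plain NumPy: grayscale opening is an erosion (a local minimum over B) followed by a dilation (a local maximum). The disc radius is an illustrative choice, and libraries such as scipy.ndimage (grey_opening) or scikit-image provide equivalent routines.

```python
import numpy as np

def disc(r):
    # disc-shaped structuring element B of radius r
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    return x * x + y * y <= r * r

def _local(im, footprint, op):
    # apply op (np.min or np.max) over the footprint at every pixel
    r = footprint.shape[0] // 2
    pad = np.pad(im, r, mode="edge")
    h, w = im.shape
    offsets = np.argwhere(footprint) - r
    stack = np.stack([pad[r + dy:r + dy + h, r + dx:r + dx + w]
                      for dy, dx in offsets])
    return op(stack, axis=0)

def opening(im, r=5):
    # Eq. (1): Im_pre = Im opened by the disc B (erosion, then dilation)
    B = disc(r)
    return _local(_local(im, B, np.min), B, np.max)

# a single bright speck narrower than the disc is removed by the opening
speck = np.zeros((15, 15))
speck[7, 7] = 1.0
pre = opening(speck, 2)
```

Structures larger than the disc keep their shape, which is why the structuring element should stay small enough not to erode the true OD boundary.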
3.2 Processing

The IK model is used to compute the saliency map. The input image Im_pre is decomposed using linear filters tuned to specific stimulus dimensions such as intensity, red, green, blue, and yellow color, or various local orientations. This decomposition is carried out at a variety of spatial scales, created with Gaussian pyramids, so that the channels can respond to both smaller and larger objects. The following equations are used to produce the color and intensity channels:

Red = red − (green + blue)/2    (2)

Green = green − (red + blue)/2    (3)

Blue = blue − (red + green)/2    (4)

Yellow = (red + green)/2 − |red − green|/2 − blue    (5)

Int = (red + green + blue)/3    (6)

Four Gaussian pyramids are constructed from the color channels Red, Green, Blue, and Yellow, and an intensity pyramid Int(σ) is built from Int. Gabor pyramids O(σ, θ) are used to extract local orientation information from Int. Feature maps are obtained as center-surround differences between a fine "center" scale c and a coarser "surround" scale s (the across-scale difference is denoted ⊖); contrasting, e.g., green–red at the center with red–green in the surround over the scale combinations yields six red/green feature maps. In this way the IK model uses the RG(c, s) map to account for red/green and green/red double opponency at once:

RG(c, s) = |(Red(c) − Green(c)) ⊖ (Green(s) − Red(s))|    (7)

BY(c, s) = |(Blue(c) − Yellow(c)) ⊖ (Yellow(s) − Blue(s))|    (8)

Int(c, s) = |Int(c) ⊖ Int(s)|    (9)

The saliency map is created by combining the normalized maps:

Saliency Map = (1/3)(N(Int) + N(Color) + N(Or))    (10)

The model outputs a saliency map whose maxima mark the locations most important to attention. Binary segmentation is then used to remove the undesirable regions. Thresholding requires a threshold value, obtained here with the Otsu algorithm; it transforms the picture into a two-color image, separating light areas from dark ones. A morphological opening is applied to the thresholded binary image and subtracted from it. Thresholding the saliency map in this way yields the image I_thresh, which is used in post-processing.
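Eqs. (2)–(6) translate directly into NumPy. The sketch below assumes a float RGB image in [0, 1] and follows the standard Itti–Koch channel definitions; the full model additionally clamps negative responses to zero.

```python
import numpy as np

def ik_channels(im):
    # broadly tuned color channels and intensity, Eqs. (2)-(6)
    r, g, b = im[..., 0], im[..., 1], im[..., 2]
    red = r - (g + b) / 2                             # Eq. (2)
    green = g - (r + b) / 2                           # Eq. (3)
    blue = b - (r + g) / 2                            # Eq. (4)
    yellow = (r + g) / 2 - np.abs(r - g) / 2 - b      # Eq. (5)
    intensity = (r + g + b) / 3                       # Eq. (6)
    return red, green, blue, yellow, intensity

# a single pure-red pixel as a sanity check
red, green, blue, yellow, intensity = ik_channels(np.array([[[1.0, 0.0, 0.0]]]))
```

For a pure-red pixel the red channel responds maximally while the yellow channel is zero, which is the intended tuning of the opponent channels.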
3.3 Post-processing

Using the morphological opening, it is possible to eliminate small objects from the I_thresh image while keeping the shape and size of larger ones. The structuring element may be made large enough to remove such noise, but it should also be kept as small as possible so that the real edge of the OD remains intact.
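The threshold that produces I_thresh comes from Otsu's method, which picks the value maximizing the between-class variance of the saliency histogram. A compact NumPy sketch on synthetic bimodal data (the test values below are illustrative, not from the STARE images):

```python
import numpy as np

def otsu_threshold(values, bins=256):
    # Otsu: choose the threshold that maximizes between-class variance
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                 # probability of class 0 up to each bin
    w1 = 1.0 - w0
    mu = np.cumsum(p * centers)       # cumulative first moment
    mu_t = mu[-1]                     # global mean
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros_like(w0)
    sigma_b[valid] = (mu_t * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[np.argmax(sigma_b)]

# synthetic bimodal "saliency" values: background near 0.2, OD region near 0.8
rng = np.random.RandomState(42)
sal = np.concatenate([rng.normal(0.2, 0.05, 1000), rng.normal(0.8, 0.05, 100)])
t = otsu_threshold(sal)
binary = sal > t
```

The resulting binary map is then cleaned with the opening of Eq. (11), which removes small spurious blobs before the OD region is extracted.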
I_out = I_thresh ∘ B    (11)

Table 1 VAODD result analysis

         | Disjunctive OD target (52 images) | Conjunctive OD target (27 images) | Total (79 images)
Pass     | 45                                | 14                                | 59
Fail     | 7                                 | 13                                | 20
Accuracy | 86.5%                             | 51.8%                             | 74.1%
4 Results

The visual attention-based optic disc detection (VAODD) technique is tested on 79 fundus images from the STARE dataset. Figure 4 shows the intermediate outcomes of the VAODD system, including the pre-processing result, the saliency map, the post-processed image, and the detected OD. Figure 5 shows instances of fundus retinal images on which the VAODD technique fails to detect the optic disc (OD). Table 1 presents the analysis of the results: of the 52 disjunctive and 27 conjunctive images, the system identifies the OD in 45 and 14, respectively. The VAODD system takes an average of 6.36 s for disjunctive images and 6.45 s for conjunctive images, as shown in Figs. 4 and 5. How the BU technique of the IK model works for OD detection is examined further. The most salient visual location can be selected from the maximum of the saliency map, and this is where the focus of attention (FOA) falls. As demonstrated in Fig. 6, the FOA for disjunctive images is the first one, while the FOA for conjunctive images varies when utilizing the IK model. The first and fourth FOA on a fundus retinal picture are displayed in Fig. 7.
5 Conclusion

In this study, the bottom-up (BU) saliency model is examined for its ability to locate the OD in fundus retinal pictures. When paired with mathematical morphology, the IK saliency model and Otsu's technique are effective tools for detecting the OD in retinal images. Images with the OD as a disjunctive type of target had a success rate of 86.5%, whereas images with the OD as a conjunctive type of target had a success rate of 51.8%. This also demonstrates that the BU technique alone is not sufficient for detecting a target. To address the limitations of this work, EGODD (eye gaze-based optic disc detection) is proposed; this system uses a combination of bottom-up and top-down methods to detect the optic disc.
Fig. 4 a Retinal image as input, b pre-processing results, c saliency map, d post-processing result, and e detected OD
Fig. 5 For both disjunctive and conjunctive instances, the suggested algorithm takes a long time
Fig. 6 Saliency map showing OD popping up at the first FOA (disjunctive case, left) and at a subsequent FOA (conjunctive case, right)
Visual Attention-Based Optic Disc Detection System …
Fig. 7 OD appearing as a pop-up at the fundus image's first FOA (left) and fourth FOA (right)
An Overview of Blue Eye Constitution and Applications Jadapalli Sreedhar, T. Anuradha, N. Mahesha, P. Bindu, M. Kathiravan, and Ibrahim Patel
Abstract Blue Eyes Technology grows every day with new developments, producing something new and valuable for human beings. Technology that can sense human emotions and sensations is being developed in order to make work easier for humans. For this technology to succeed, a computer must be able to recognize and respond to human emotions. In order to identify both physical and psychological actions, Blue Eyes technology employs a range of sensors and then extracts essential information from these processes. It is, therefore, possible to infer the user's mental, emotional, physical, or informational state. Keywords Blue eye · Sensors · Human emotions · Blue eye technology · Blue eye constitution
1 Introduction The ultimate goal of this research is to create computational computers with humanlike sensory capacities. Using camera and microphone, it produces a computational J. Sreedhar (B) EEE Department, Vignana Bharathi Institute of Technology, Hyderabad, India e-mail: [email protected] T. Anuradha Department of Electrical and Electronics Engineering, KCG College of Technology, Chennai, India N. Mahesha Department of Civil Engineering, New Horizon College of Engineering, Bangalore, India P. Bindu Department of Mathematics, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India M. Kathiravan Department of Computer Science and Engineering, Hindustan Institute of Technology and Science, Padur, Kelambaakkam, Chengalpattu, India I. Patel Department of ECE, B V Raju Institute of Technology, Narsapur, Medak, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_23
computer that also feels like a human and can identify the actions and feelings of a user. Aside from that, it speaks with a human-like voice. The term blue in blue eye technology refers to Bluetooth, which allows for wireless communication, and the word eye refers to eye movement, which provides a wealth of useful and intriguing information. This technology's primary goal is to give computers the ability to think like humans [1]. The ability to see the world from another person's perspective is a skill that we all possess, and it is this ability that allows computers to approach human intelligence and power. The goal of blue eye technology is to create computing machines with human-like perceptual and sensory abilities. It employs a non-intrusive sensing method, using up-to-date video cameras and microphones, to identify the user's actions. The machine can discern users' physical and emotional states based on what they are looking at and what they are trying to accomplish [2, 3]. For example, sitting in front of a computer for a lengthy period might lead to exhaustion or mental strain in the form of fatigue; a computer equipped with Blue Eyes Technology, which understands the user's emotions and interacts accordingly, can help overcome these constraints [4].
2 Technologies Used in Blue Eye Devices that can detect human activity have made blue eyes a popular topic in the computing literature. The primary goal of blue eyes is to create computers that feel. Affective computing is the term used to describe the process of creating computers that are able to recognize human emotions. The devices must be able to detect even the slightest shifts in our state of mind or behavior [5]. When a person is happy or angry, for example, they may click the mouse more quickly. The user's voice and interest are understood primarily through artificial intelligence speech recognition and a simple user interest tracker. Machines must now take people's emotional feelings into account. With the use of input devices such as the Emotion Mouse (which tracks the user's emotions), these inputs are taken into consideration [6]. Some of the different affective computing implementation strategies are discussed below.
2.1 Manual and Gaze Input Cascaded (MAGIC) Pointing It’s a new strategy for dealing with the “eye gaze” for the human–computer interface. An excellent pointing method for enhanced computer input has been gaze tracking.
However, there are several disadvantages to using eye-tracking technology alone. An alternative method, known as MAGIC (Manual and Gaze Input Cascaded) pointing, is considered in order to overcome these obstacles [7]. MAGIC pointing uses gaze tracking in conjunction with manual control to select and aim the cursor. By warping the cursor to the vicinity of the gaze position, MAGIC pointing reduces the amount of manual cursor motion required to select a target.
2.2 Artificial Intelligence Speech Recognition The spoken words are scanned and compared to words that have been stored in the brain. Internal storage of the user’s voice will be made available to the computer. Because of the wide range of pitch frequency and time gap, pattern matching is developed to discover the best fit.
2.3 Simple User Interest Tracker (SUITOR) This is a simple tool for keeping tabs on user preferences. Desktop computers benefit from it since it provides a wider range of information. Using this information, a computer screen’s scrolling ticker is filled with relevant information about the user’s current job.
2.4 Emotion Mouse Even the tiniest changes in a person’s emotional state can be detected by Blue Eyes Technology’s equipment. Depending on the person’s mood, he or she may strike the keyboard fiercely or softly, depending on the type of keyboard [8]. Simply by moving or touching the computer’s mouse or keyboard, the Blue Eyes Technology detects human emotional behavior, and the system begins to react accordingly. Emotion Mouse and other sophisticated devices are used to do this. Simply by touching the mouse, our Emotion Mouse is able to detect emotions. When a user interacts with a computer, the Emotion Mouse analyses and identifies their emotions, such as sadness, happiness, rage, excitement, and so on. The image (Fig. 1) displays a real mouse with the emotion mouse installed on it. Different infrared detectors and temperature-sensitive chips are included in the mouse’s sensors.
Fig. 1 Emotional mouse
3 Construction of Blue Eye 3.1 Software Required The software system consists of a connection manager, a data analysis module, and a visualization module. • Connection Manager: A software connection manager handles wireless connectivity between the mobile data acquisition units and the central system unit. It is responsible for creating Bluetooth connections, authenticating users, buffering inbound data, and providing alerts to the CSU hardware [9]. • Data Analysis Module: Analyzing the raw sensor data, it provides information on the operator's physiological status. Each working operator is under the watchful eye of a separate data analysis module, which uses many smaller analyzers to gather different kinds of data. Analyses of eye movements, pulse rate,
and custom analyzers are the most important for determining an operator's level of visual attention and pulse rate. • Visualization Module: Using the visualization module, supervisors can see each working operator's physiological condition, preview the video source, and record an audio file. The supervisor receives immediate notification of any incoming alarms [10]. When the visualization module is in offline mode, all of the recorded physiological parameters, along with alert videos and audio data, can be retrieved from the database and viewed.
3.2 Hardware Required Both the data gathering and central system units are hardware components. A.
Data Acquisition Unit
The Blue Eyes Technology uses a mobile component called the DAU. The physiological data from the sensors is collected by the DAU and sent to the CSU for processing and verification. The sensors and the Central System Unit (CSU) are connected via Bluetooth, which acts as a wireless interface. The operator is given a unique PIN and ID for authentication. Communication with the device is accomplished through a keyboard, beeper, and LCD display. A micro jack plug is used to transport the user's data [11]. B.
Jazz Multi-sensor
Eye movement sensors, such as the Jazz multi-sensor, are used to collect critical physiological data in the data collection system. Data on eye position, blood oxygenation, horizontal- and vertical-axis accelerations, and ambient light intensity can be retrieved from the device's raw digital data [12]. Direct infrared holographic transducers are used to monitor eye movement in the multi-sensor. C.
Central System Unit
The CSU is the second major hardware component of the Blue Eyes Technology and handles the wireless network connection. The main components of the CSU are a wireless Bluetooth device and a speech information transmission system. USB, parallel, and serial cables are used to connect the CSU to a computer. A mini jack plug is used to access audio data [10]. The serial and power ports of the personal computer are used to communicate with a program that holds the operator's personal ID. There are sensors in this technology that can assess a person's emotional state, which the supervisor can utilize to track the progress of the selected operator's work (Fig. 2).
Fig. 2 Blue Eye structure
4 Applications of Blue Eye • It allows users to work on other tasks at the same time as they utilize a speech recognition system. Using voice instructions, a user can remain focused on observation and manual tasks while still managing the machine. • It has been reported that some large stores are using Almaden's Blue Eyes software to construct surveillance systems that monitor and interpret client movements, according to IBM engineers in San Jose, California. Blue Eyes collects video data on eye movement and facial expression to explore ways for computers to predict consumers' wishes [7]. Your computer might, for example, identify comparable links and open them in a new window if your attention lingers on a web page's title. However, observing customers turns out to be the first practical use of this research. • It can be used in the security and control sector, where a human operator would otherwise be necessary at all times. • The vehicle sector could also benefit from this technology. An individual's emotional state can be determined just by their touching the mouse, as the computer is designed to be capable of doing so. Military operations are another key use for speech processing. • A good example is the ability to fire a weapon using only your voice. Pilots don't have to use their hands to communicate with computers if they have high-quality speech recognition technology. • Other examples include radiologists analyzing hundreds of X-rays, ultrasonograms, and CT scan results while concurrently speaking their findings to a speech recognition system linked to a word processor. Instead of writing the words, the radiologist can concentrate on the images [13]. • Computers could also be used to make airline and hotel reservations using voice recognition. To make a reservation, cancel a reservation, or inquire about the schedule, a user merely needs to state his demands.
5 Emotion Computing Using Blue Eye Technology The facial expression research of Paul Ekman has revealed a link between emotional state and physiological data. Ekman's Facial Action Coding System is described in a selection of works by Ekman and others on tracking facial behavior. In one of Ekman's investigations, subjects were fitted with devices that monitor numerous parameters, such as pulse, galvanic skin response (GSR), temperature, and somatic movement. Facial expressions corresponding to the six most common emotions were assigned to each participant in the study. Emotions such as joy and surprise are included in his list of six. By analyzing physiological data, Dryer (1993) was able to identify several emotional states. The measures include GSR, heart rate, skin temperature, and general somatic activity (GSA). There are two main types of data analysis: descriptive and predictive. The first step in determining the data's dimensionality is to run it through a multidimensional scaling (MDS) procedure.
6 Results Six different physiological evaluations were used to represent the six different emotions, with GSA, GSR, pulse, and skin temperature recorded during the five-minute baseline and test sessions. GSA data was sampled approximately three to four times a second, and a pulse reading was recorded for each detected beat. Individual physiological variance was taken into account when calculating the difference between the baseline and test results. Scores deviating by more than 1.5 standard deviations from the mean were deemed missing; a total of twelve scores were omitted as a result of this criterion. The Emotion Mouse concept is thus based on solid evidence: correlation models are used to link the physiological data to emotions. The correlation model is created through calibration. Calibration signals generated by users with known or measured emotions at the time of calibration are evaluated using statistical analysis of the association between attributes and emotions (Fig. 3; Table 1).
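The 1.5-standard-deviation screening step described above can be sketched as follows. This is an illustrative reconstruction only, using made-up baseline-to-test differences rather than the study's data:

```python
import statistics

def screen_scores(scores, k=1.5):
    """Drop scores deviating more than k standard deviations from the mean."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [s for s in scores if abs(s - mean) <= k * sd]

# Hypothetical baseline-to-test differences; the outlier 3.5 is removed.
diffs = [0.2, 0.3, 0.1, 0.25, 3.5, 0.15]
print(screen_scores(diffs))
```

In the study itself, such screened scores would then feed the calibration of the correlation model.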
7 Conclusion Today's world, which consists mostly of real-time systems, is expanding at an incredibly rapid rate. The Blue Eyes Technology provides a more powerful and user-friendly computing environment for users. We can communicate wirelessly thanks to Bluetooth, and we can also learn new things from our eyes' movements. As technology continues to advance, it is only a matter of time before everyone is familiar with and uses this technology in their daily lives; even our cell phones may be affected. In any case, this is merely a projection based on current technical trends. The market for blue eye technologies is expected to grow without limit in the future.
Fig. 3 Graph showing emotion scores with respect to the baseline index
Table 1 Emotion scores

Emotion      100%   50%    25%
Anger        6.5    6.5    6.5
Disgust      5.8    5.7    5.8
Fear         5      5      5
Happiness    6.5    6.2    5.8
Sorrow       5      5.2    5.3
Surprise     4.8    4.9    4.8
References 1. K. Dhinakaran, M. Nivetha, N. Duraimurugan, D.C.J.W. Wise, Cloud based smart healthcare management system using blue eyes technology, in 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC) (2020), pp. 409–414. https://doi.org/10.1109/ICESC48915.2020.9155878 2. M. Kumawat, G. Mathur, N.S. Saju, Blue eye technology. IRE J. 1(10) (2018). ISSN 2456-8880 3. B. Oyebola, O. Toluwani, Blue eyes technology in modern engineering: an artificial intelligence. Int. J. High. Educ. 45–65 (2018) 4. H.A. Patil, S.A. Laddha, N.M. Patwardhan, A study on blue eyes technology. Int. J. Innov. Res. Comput. Commun. Eng. 5(3) (2017) 5. M.R. Mizna, M. Bachani, S. Memon, Blue eyes technology, in Eighth International Conference on Digital Information Management (ICDIM 2013) (2013), pp. 294–298. https://doi.org/10.1109/ICDIM.2013.6693995 6. H. Sharma, G. Rathee, Blue eyes technology. Int. J. Comput. Sci. Manag. Res. (2013) 7. Psychologist World, Eye Reading Language (Body Language), July 2013, www.psychologistworld.com/bodylanguage/eyes.php 8. S. Madhumitha, Slide Share, Blue Eyes Technology, Mar 2013, www.slideshare.net/Colloquium/blue-eyes-technology 9. S. Chatterjee, H. Shi, A novel neuro fuzzy approach to human emotion determination, in 2010 International Conference on Digital Image Computing Techniques and Application (DICIA) 10. A. Aly, A. Tapus, Towards an online fuzzy modeling for human internal states detection, in 2012 12th International Conference on Control, Automation, Robotics and Vision (ICARCV 2012), Guangzhou, China, 5–7 Dec 2012
11. D. McDuff, R. El Kaliouby, T. Senechal, M. Amr, J. Cohn, R.W. Picard, Affectiva-MIT facial expression dataset (AMFED): naturalistic and spontaneous facial expressions collected in-the-wild, in 2013 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Portland, OR, USA, June 2013 12. R. Nagpal, P. Nagpal, S. Kaur, Hybrid technique for human face emotion detection. Int. J. Adv. Comput. Sci. Appl. 1(6) (2010) 13. F. Zhizhong, L. Lingqiao, X. Haiying, X. Jin, Human computer interaction research and realization based on leg movement analysis, in 2010 International Conference on Apperceiving Computing and Intelligence Analysis (ICACIA)
Enhancement of Smart Contact on Blockchain Security by Integrating Advanced Hashing Mechanism Bharat Kumar Aggarwal, Ankur Gupta, Deepak Goyal, Pankaj Gupta, Bijender Bansal, and Dheer Dhwaj Barak
Abstract The demand for blockchain is growing every day, and smart contracts are commonly employed in business applications. The hashing mechanism is a security method that ensures the blockchain's dependability. The purpose of the proposed article is to examine the function of hashing mechanisms in blockchain security. The current study took into account a number of previous studies in the field of hashing mechanisms and blockchain. Previous research has encountered challenges such as restricted scope, security, and performance. The suggested work is expected to create a more secure and higher-performing solution. Simulation work has been done to confirm the execution time of smart contracts under several hashing mechanisms. A comparative analysis of collisions is also presented to demonstrate the security of the proposed work. Keywords Hashing mechanism · Smart contracts · Secure hash algorithm (SHA)-256 · Message digest algorithm · Blockchain · Rivest–Shamir–Adleman (RSA) encryption
1 Introduction The need for blockchain technology is developing, and smart contracts see considerable application in the corporate world. The use of improved hashing techniques for smart contracts on the blockchain may be beneficial. Prior research in the realm of blockchain and hashing mechanisms has been included in the present study. A lack of security and efficacy has impeded earlier studies. Security and performance are both anticipated to be better with the proposed improvement. The proposed research focuses on presenting an advanced hashing mechanism for blockchain technology. The present research has also considered frequently used hashing techniques such as SHA256 and MD5. The objective of the work is to simulate the execution time comparison between previous hashing mechanisms and the proposed RSA-integrated B. K. Aggarwal · A. Gupta (B) · D. Goyal · P. Gupta · B. Bansal · D. D. Barak Department of Computer Science and Engineering, Vaish College of Engineering, Rohtak 124001, Haryana, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_24
MD5 hashing mechanisms. Moreover, the proposed mechanism is expected to provide a better transaction success rate and a lower failure rate compared to previous hashing mechanisms [1].
1.1 Blockchain New shared databases have been created by storing data in blocks that are connected via cryptography. A new block is created each time fresh data is received. Chaining each block to the next once it has been filled with data helps maintain a record of the sequence in which the blocks were formed. The core use case for blockchain is transactional data; however, other forms of data may be kept as well. No single person or organization has influence over blockchain technology, since it is employed in a decentralized manner. Data may be saved and transmitted using blockchain technology, but it can never be tampered with. A blockchain is used to create immutable ledgers, which are records that cannot be changed or destroyed. For this reason, blockchain is commonly termed a distributed ledger technology (DLT) [2]. Working of Blockchain If a new transaction is to be added to the chain, it will be verified by all of the network's members. The definition of "valid" in a blockchain system may differ from system to system. A majority of the parties must then agree to the transaction, after which all nodes in the network get a block containing a collection of permitted transactions. They then check whether the new block is legitimate. The hash of the previous block serves as part of the unique fingerprint for each subsequent block. Security and fault tolerance are built into the design of blockchains; as a consequence, decentralized consensus can be established using a blockchain. Since blockchains may be used to manage activities and store events and medical data, they can also be utilized to track the origin of food and the traceability of votes [3, 4].
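The chaining described above, where each block carries the hash of its predecessor as a fingerprint, can be sketched with a few lines of Python. This is a minimal illustration using the standard hashlib module, not a production blockchain; the block layout is an assumption for the example:

```python
import hashlib
import json

def make_block(data, prev_hash):
    """Create a block whose identity depends on its own data and on the
    hash of the previous block, chaining the two together."""
    body = {"data": data, "prev_hash": prev_hash}
    block = dict(body)
    block["hash"] = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    return block

genesis = make_block("genesis", "0" * 64)
second = make_block("tx: A pays B", genesis["hash"])

# Tampering with the first block changes its hash, so the link stored
# in the second block no longer matches.
tampered = make_block("genesis (edited)", "0" * 64)
print(second["prev_hash"] == genesis["hash"])   # True
print(second["prev_hash"] == tampered["hash"])  # False
```

This is why a blockchain's records are effectively immutable: editing any earlier block invalidates every link after it.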
1.2 Smart Contracts It is possible to activate a smart contract when a set of predetermined conditions are met. So that there is no third party or the long process involved in the execution of a contract, both parties will know precisely what is going to happen. The “if/when/then” instructions in the blockchain may be used to build a smart contract [5]. Role of Smart Contracts in Various Sectors Blockchain is now being used in areas other than digital money, such as health, the Internet of Things, and education. Research is proposing a comprehensive mapping analysis in this research to gather and evaluate important blockchain technology
Fig. 1 Mechanism of smart contract [7] (diagram: preset trigger conditions and preset response rules, condition 1: response 1 through condition n: response n, checked against external data, with state and value recorded across chained blocks)
research in the higher education sector. The research aims to assess the current status of blockchain technology, and research gaps and obstacles are also highlighted. It is critical to meet the changing needs of contemporary society while also providing for the individual. Using technology, the conventional system, which has been in existence for a long time, appears to have improved. How Smart Contracts Work Smart contracts are composed of a series of "if/when…then…" expressions written in code and stored on a distributed ledger. Computers in the network carry out the actions when the specified criteria are fulfilled and confirmed by other computers in the network. Actions such as distributing money to the right people, registering an automobile, providing alerts, or issuing a ticket may fall under this category. The blockchain is updated as soon as the transaction has been completed. Any number of criteria may be included in a smart contract to guarantee that the work is completed effectively for everyone concerned. The if/when/then rules must be agreed upon, all possible exceptions examined, and a framework for addressing disputes developed before the terms can be defined. Smart contracts may then be developed by a developer, but organizations that use blockchain for business increasingly provide web interfaces, templates, and other online tools to make the production of smart contracts more convenient [6] (Fig. 1).
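The "if/when…then…" structure described above can be sketched as condition/response pairs evaluated against the current state. This is only a toy model of the idea, with hypothetical parcel-delivery rules, not a real contract language:

```python
def run_contract(rules, state):
    """Fire the response of every rule whose condition holds for the
    current state; a toy 'if/when...then...' contract."""
    fired = []
    for condition, response in rules:
        if condition(state):
            fired.append(response(state))
    return fired

# Hypothetical rules for a parcel-delivery contract.
rules = [
    (lambda s: s["delivered"], lambda s: f"release {s['price']} to seller"),
    (lambda s: not s["delivered"] and s["days_late"] > 7,
     lambda s: "refund buyer"),
]
print(run_contract(rules, {"delivered": True, "price": 100, "days_late": 0}))
```

On a real blockchain, every node would evaluate the same rules and must agree on the fired responses before the result is recorded.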
1.3 Blockchain Hash Function Bitcoin's block values are calculated using a hash function. Using a hash function, you may create a new string with a predetermined length from any input string. In computing, a hash is a kind of identifier; the hash is the output of a hash algorithm. A one-way hashing method is used to create message digests from an input file or text string, and no key is necessary. Only the intended recipient can interpret an encrypted message, so those who should not have access to a file's data cannot read it.
The "hash" or "hash value" is a numerical representation of a particular input value that is generated by a hash function. A hash function is a processing unit that accepts input of any length and returns a fixed-length output, the hash value, as its result. Blockchain relies heavily on hashing. Hashing, a cryptographic method, may turn any data into a string of characters, and it is efficient since the length of the hash is fixed [8].
1.4 Hashing Techniques Hashing is the process of converting any length of input into a cryptographic fixed-length output using a mathematical approach (Bitcoin uses SHA-256, for example). A block's hash value may be computed using the following algorithms: MD5 The MD5 (Message Digest) algorithm creates a 128-bit (16-byte) hash value, typically represented in text format as a 32-digit hexadecimal number. MD5 has been utilized in cryptography applications and is commonly used to verify the integrity of data. SHA1 The SHA1 cryptographic hash function was developed by the NSA. SHA1 produces a 160-bit (20-byte) hash result. Hash functions like SHA1 have been used in a wide range of applications and protocols. In the long run, however, the SHA1 algorithm is not secure enough, and SHA1 is no longer recommended for use. SHA224 The SHA224 cryptographic hash algorithm was developed by the National Security Agency (NSA). SHA224 generates a 224-bit hash value, which is usually expressed as a 56-digit hexadecimal number. SHA256 The SHA256 cryptographic hash function was also developed by the NSA. Hexadecimal numbers are used to represent SHA256's 256-bit (32-byte) hash value. SHA256 is the hash function and mining algorithm used by Bitcoin (Table 1).
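The fixed output lengths quoted above (128, 160, 224, and 256 bits) can be checked directly with Python's standard hashlib module, which implements all four algorithms:

```python
import hashlib

# Each algorithm maps the same input to a digest of its fixed length;
# a hex digit encodes 4 bits, so length-in-bits = 4 * hex length.
message = b"blockchain"
for name in ("md5", "sha1", "sha224", "sha256"):
    digest = hashlib.new(name, message).hexdigest()
    print(name, len(digest) * 4, "bits:", digest[:16], "...")
```

Changing even one byte of `message` produces completely different digests, which is the property block hashes rely on.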
1.5 Encryption Algorithm Based on RSA The most widely used asymmetric encryption algorithm, RSA, was developed in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman (known collectively as "RSA"). Its strength comes from the difficulty of the "prime factorization" problem on which it depends. Simply
Table 1 Different types of hashing algorithms [9]

Keys for comparison                  MD5                                    SHA
Security                             Less secure than SHA                   More secure than MD5
Length of the message digest         128 bits                               160 bits
Attacks required to unearth          2^128 operations                       2^160 operations
the original message
Attacks to find two messages         2^64 operations                        2^80 operations
with the same MD
Speed                                Faster; requires only 64 iterations    Slower; 80 iterations are necessary
Resistance to collisions             Less collision-resistant; occasional   More resistant (126 bits for SHA-256,
                                     attacks have been reported             256 bits for SHA-512); no such attacks
                                                                            reported so far
multiplying two astronomically enormous random prime numbers yields an even greater number; the hard problem is recovering the original prime numbers from this astronomically large product. Even with today's supercomputers, this problem is practically unsolvable given the right key length and entropy. In 2010, breaking a 768-bit RSA key required more than 1500 years of computing time (distributed over hundreds of workstations), and the 2048-bit RSA keys in normal use today are significantly harder still.
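The prime-factorization idea above can be shown with textbook RSA using deliberately tiny primes. This is purely illustrative (the numbers 61 and 53 are the standard textbook example); real keys use moduli of 2048 bits or more, and real systems add padding:

```python
# Textbook RSA with tiny primes, for illustration only.
p, q = 61, 53
n = p * q                 # 3233, the public modulus
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent, coprime with phi
d = pow(e, -1, phi)       # private exponent (modular inverse): 2753

m = 65                    # message encoded as an integer < n
c = pow(m, e, n)          # encrypt with the public key
print(c, pow(c, d, n))    # 2790 65: decrypting recovers the message
```

An attacker who could factor `n` back into `p` and `q` could compute `d` and decrypt; the security rests entirely on that factorization being infeasible at real key sizes. (`pow(e, -1, phi)` needs Python 3.8+.)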
2 Literature Review In our investigation into how hashing techniques may be used to build smart contracts on the blockchain, we reviewed a variety of academic research publications on blockchain, smart contracts, and hashing algorithms. To keep on track with our research goal, we briefly introduce the papers that serve as the foundation for our work. Blockchain technology has been called a disruptive technology by Nofer et al. [1], who believe that the financial sector is the primary application of blockchain technology. Cryptography, distributed technology, and consensus accounting mechanisms were all discussed by Chen et al. [2]. Credit risk is one of the issues brought on by the rapid growth of Internet technology, according to Zhang et al. [3]. In their paper, Hassan et al. [5] proposed an insurance contract architecture based on smart contracts. There is evidence that this technology and its qualities may encourage
342
B. K. Aggarwal et al.
open research, according to a paper released in 2019 by Leible et al. [4]. Peters and Panayi [6] described in 2015 how the banking sector might be disrupted by international money transmission, automated bank ledgers, digital assets, and smart contracts built on blockchain technology. In 2016, Jesse [10] conducted a detailed mapping analysis to gather all relevant blockchain studies. In 2017, Igor [11] investigated the use of blockchain technology to store, retrieve, and disseminate data over a decentralized network, and Zibin [12] provided an in-depth look at blockchain technology. New implementations of the NIST-specified secure hash algorithms for the 2nd Generation Intel® Core™ CPUs were released by Gueron [13] in 2012. In 2013, Putri [8] employed the MD5 + salt technique to safeguard database users' login credentials. To make data more secure, Raju [14] sought in 2018 to replace MD5 with SHA-2, whose hashing techniques provide 256- and 384-bit hash values. To better understand how man-in-the-middle (MITM) attacks affect patient privacy, Purwanti [15] carried out a study in 2018.
3 Problem Statement

Several studies have done excellent work in the fields of hashing algorithms, blockchain, and smart contracts, but their scope is limited, and only limited work has addressed smart contract security. Thus, there is a need to inject an advanced hashing mechanism into the blockchain. It is essential that transactions be authenticated: in the case of an invalid transaction request, the consensus algorithm acts accordingly and cancels the claim. The use of blockchain and smart contracts allows this solution to overcome the trust and security difficulties that burden a traditional insurance policy.
4 Proposed Work

A number of notable studies exist in the realm of hashing mechanisms, blockchain, and smart contracts, but they have a restricted scope, and smart contract performance has received only limited attention.

Working of a Traditional Smart Contract: A traditional smart contract takes a long time because previous experience and the probability and success rate of earlier transactions are not considered. If certain requirements are satisfied, a smart contract is activated. Contracts may be automated to ensure that both sides know exactly what will happen, without the need for a third party or a lengthy procedure. A smart contract can be created using the "if/when/then" instructions contained in the blockchain.
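The "if/when/then" behaviour of a traditional smart contract can be sketched as a plain condition-action rule. This is a minimal, hypothetical model (names and structure are ours); real contracts run on a blockchain virtual machine rather than a local class:

```python
from dataclasses import dataclass

@dataclass
class EscrowContract:
    """Toy 'if/when/then' contract: release the deed once payment arrives."""
    price: int
    paid: int = 0
    deed_transferred: bool = False

    def deposit(self, amount: int) -> None:
        self.paid += amount
        self._maybe_execute()

    def _maybe_execute(self) -> None:
        # IF/WHEN the agreed condition is satisfied, THEN the action fires
        # automatically, with no third party needed.
        if self.paid >= self.price and not self.deed_transferred:
            self.deed_transferred = True

c = EscrowContract(price=100)
c.deposit(40)   # condition not met yet
c.deposit(60)   # condition met: the contract executes itself
assert c.deed_transferred
```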
Enhancement of Smart Contact on Blockchain Security …
343
Advanced MD5: The new MD5 proposed in this paper produces 64 bits. An evaluation of MD5's performance has been carried out here, and extra bits are appended to boost collision resistance, obtained from a 32-bit additive random number. Smart contract execution would use an advanced hashing mechanism that integrates RSA and MD5, providing better reliability and performance.

Working of the Proposed Smart Contract: In the proposed work, a random key is introduced to reduce the likelihood of collisions in the present MD5. In addition, RSA is utilized to enhance the security of MD5. The SHA-384 and SHA-512 hashes are much larger than those of SHA-256 and MD5, and as the hash size grows, so does the number of clock cycles. CPU clock cycles, storage space, and collision probability are all used to evaluate a hashing algorithm's performance: Performance of hashing algorithm = f (CPU clock cycles consumed, storage space, collision probability) (Fig. 2).
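One reading of the scheme in Fig. 2, and this is our hedged interpretation rather than code from the paper, is: take 32 bits of the MD5 digest, append a 32-bit random salt to raise collision resistance, and protect the 64-bit result with RSA. A minimal sketch (the function name and truncation choice are assumptions):

```python
import hashlib
import secrets

def advanced_md5(message: bytes, salt=None) -> bytes:
    """Hypothetical 'advanced MD5': 32-bit MD5 prefix + 32-bit random salt = 64 bits.

    The 64-bit tag would then be RSA-encrypted before being stored in the
    block; the RSA step is omitted here because it needs a key pair.
    """
    if salt is None:
        salt = secrets.randbits(32)              # 32-bit additive random number
    md5_32 = hashlib.md5(message).digest()[:4]   # first 32 bits of the MD5 digest
    return md5_32 + salt.to_bytes(4, "big")      # 64-bit combined tag

tag = advanced_md5(b"transfer deed from Bob to John")
assert len(tag) == 8  # 64 bits, matching Fig. 2
```

The salt makes the tag non-deterministic across runs, which is what reduces the chance that two different messages ever share a stored tag.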
Fig. 2 RSA integrated MD5 used in smart contract. [Figure: a 32-bit MD5 value and a 32-bit random number are combined through RSA into the 64-bit advanced MD5 that drives the smart contract. Illustrated flow: Bob wants to sell his house and John wants to buy it; the land deed is digitized, the smart contract matches buyer and seller and receives and distributes the assets, clearing and settlement are automated, the currency is digitized, and ownership is undisputed.]
Fig. 3 Working of the AI-based proposed smart contract. [Figure: Party A wants to execute a contract and Party B is interested in it; the advanced MD5 mechanism is added to the blockchain, the smart contract deed is digitized, the smart contract assures reliability, clearing and settlement are automated, and the contract is undisputed.]
A smart contract is triggered if specific conditions are met, after considering previous experience. Contracts are automated so that both parties know precisely what will happen, without the need for a third party or a long process (Fig. 3). Figure 3 shows how the proposed work considers the validity and reliability of the contract before execution on the basis of previous experience: if the buyer or seller defaulted in a previous transaction, such parties are filtered out during smart contract execution.
5 Results and Discussion

While creating a block, blockchain technology performs hashing. Several hashing algorithms are available, among them SHA-128, SHA-256, and MD5; SHA-256 has emerged as the most popular hashing algorithm in the world of blockchains. Many factors are taken into account when choosing a hashing algorithm, such as storage requirement, time complexity, and collision ratio. SHA-256 is the best approach for preventing collisions, and it also consumes the least amount of storage space. The suggested study on smart contracts uses an enhanced MD5 algorithm: MD5 is less secure than SHA-256 but more efficient in terms of storage space, and the aim is an improved MD5 that is quicker and more resistant to collisions than the standard MD5. The smart contract execution times of SHA-256, traditional MD5, and the suggested MD5 are compared (Table 2). Based on this table, Fig. 4 presents a comparison of the different hashing mechanisms. Table 3 presents a comparison of transaction failure after detection of a threat for the different hashing techniques SHA-256, MD5, and RSA integrated MD5, and Fig. 5 plots the failure rates from Table 3.
Table 2 Time comparison of various hashing mechanisms

Smart contract | SHA 256 | MD5 | RSA integrated MD5
1 | 575 | 535 | 154.64
2 | 540 | 545 | 120.71
3 | 552.5 | 545 | 98.39
4 | 557.5 | 550 | 92.86
5 | 545 | 542.5 | 90
6 | 547.36 | 548.76 | 98.29
7 | 550.33 | 550.62 | 103.56
8 | 557.60 | 555.25 | 108.65
9 | 567.02 | 564.62 | 112.99
10 | 570.76 | 568.23 | 115.14
11 | 575.09 | 576.25 | 125.09
12 | 577.86 | 583.71 | 130.39
13 | 581.15 | 593.38 | 130.66
14 | 586.66 | 598.30 | 134.26
15 | 591.89 | 601.09 | 134.74
Fig. 4 Time comparison of smart contract execution
Table 4 presents a comparison of the transaction success rate for the different hashing techniques SHA-256, MD5, and RSA integrated MD5; Fig. 6 plots the success rates from Table 4. Table 5 presents the time consumed during the execution of the different hashing techniques SHA-256, MD5, and RSA integrated MD5; Fig. 7 plots these times from Table 5.
Table 3 Comparison of transaction failure

Smart contract | SHA 256 (%) | MD5 (%) | RSA integrated MD5 (%)
1 | 92 | 99 | 87
2 | 50 | 57 | 49
3 | 41 | 43 | 33
4 | 76 | 81 | 70
5 | 24 | 32 | 22
6 | 98 | 100 | 98
7 | 42 | 45 | 34
8 | 16 | 22 | 10
9 | 62 | 65 | 59
10 | 82 | 90 | 74
11 | 92 | 96 | 89
12 | 32 | 37 | 24
13 | 41 | 46 | 36
14 | 27 | 37 | 19
15 | 58 | 59 | 57
Fig. 5 Comparison of transaction failure after detection of threat
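Averaging the per-contract failure rates in Table 3 summarises the gap between the three mechanisms. The values below are transcribed from the table; the means themselves are our own summary, not figures stated in the paper:

```python
from statistics import mean

# Failure rates (%) for the 15 smart contracts in Table 3
sha256  = [92, 50, 41, 76, 24, 98, 42, 16, 62, 82, 92, 32, 41, 27, 58]
md5     = [99, 57, 43, 81, 32, 100, 45, 22, 65, 90, 96, 37, 46, 37, 59]
rsa_md5 = [87, 49, 33, 70, 22, 98, 34, 10, 59, 74, 89, 24, 36, 19, 57]

for name, data in [("SHA 256", sha256), ("MD5", md5), ("RSA integrated MD5", rsa_md5)]:
    print(f"{name}: mean failure {mean(data):.1f}%")

# RSA integrated MD5 fails least often on every summary of this data set
assert mean(rsa_md5) < mean(sha256) < mean(md5)
```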
6 Conclusions

It has been concluded that the proposed work is a more secure and reliable approach than traditional work. The results show that RSA integrated MD5 provides a higher success rate and a lower failure rate than standard MD5 and SHA-256, and its performance is also higher. A smart contract's security has been
Table 4 Comparison of success rate of transaction

Smart contract | SHA 256 (%) | MD5 (%) | RSA integrated MD5 (%)
1 | 8 | 1 | 13
2 | 50 | 43 | 51
3 | 59 | 57 | 67
4 | 24 | 19 | 30
5 | 76 | 68 | 78
6 | 2 | 0 | 2
7 | 58 | 55 | 66
8 | 84 | 78 | 90
9 | 38 | 35 | 41
10 | 18 | 10 | 26
11 | 8 | 4 | 11
12 | 68 | 63 | 76
13 | 59 | 54 | 64
14 | 73 | 63 | 81
15 | 42 | 41 | 43
Fig. 6 Comparison of success rate of transaction considering different hashing techniques
increased by making use of an advanced hashing mechanism. The extra security has been used to remove invalid smart contracts. As a result of the suggested work, both security and reliability have been enhanced. Hashing is done by the blockchain technology during block creation; SHA-256 and MD5 are just two examples of hashing algorithms, and SHA-256 is employed extensively in blockchains. When selecting a hashing method, a number of criteria are taken into account, including storage space, time complexity, and collision ratio.
Table 5 Comparison of time consumption during execution of different hashing techniques

Smart contract | SHA 256 | MD5 | RSA integrated MD5
1 | 196.43939 | 205.95019 | 55.826479
2 | 201.54868 | 216.38711 | 59.271071
3 | 220.96657 | 263.40521 | 38.623334
4 | 220.80778 | 234.26493 | 41.236319
5 | 204.00119 | 245.58745 | 36.394729
6 | 223.24887 | 273.09077 | 41.498342
7 | 270.72091 | 251.10353 | 34.836661
8 | 208.30354 | 224.91529 | 54.050097
9 | 249.03755 | 257.04228 | 41.905234
10 | 271.40838 | 239.73005 | 42.803357
11 | 239.51108 | 228.7882 | 59.45131
12 | 236.38767 | 291.17711 | 45.556797
13 | 210.26131 | 286.17963 | 47.653667
14 | 231.33028 | 266.7778 | 57.426786
15 | 270.4546 | 227.27734 | 51.215905
Fig. 7 Comparison of time consumption during execution of different hashing techniques
The SHA-256 algorithm is the most collision-resistant and consumes the least amount of space. A modified MD5 algorithm is used in the proposed study: although MD5 is less secure than SHA-256, it takes up less storage space, and the goal of this study is a faster and more collision-resistant version of MD5. A system that is both secure and efficient is therefore required. In this study, the revised MD5 was simulated for storage capacity, collision probability,
and CPU clock cycles. As a consequence, the improved MD5 surpasses MD5 and SHA-256 in terms of collision resistance while using less storage and time.
7 Future Scope

Considering the demand for blockchain, the present work is the need of the day. The scope of artificial intelligence is wide, and the present research lays a significant foundation for putting intelligence into the execution of smart contracts. Such smart contracts could be used for identity protection and healthcare systems. Blockchain plays a vital role in cryptocurrency and has seen a number of other uses, such as healthcare; it is also commonly used in the cloud for distributed computing. In the future, the security method presented in this study might benefit such infrastructures.
References

1. M. Nofer, P. Gomber, O. Hinz, D. Schiereck, Blockchain. Bus. Inf. Syst. Eng. 59(3), 183–187 (2017). https://doi.org/10.1007/s12599-017-0467-3
2. Y. Chen, Y. Zhang, B. Zhou, Research on the risk of block chain technology in Internet finance supported by wireless network. EURASIP J. Wirel. Commun. Netw. 2020, 71 (2020). https://doi.org/10.1186/s13638-020-01685-6
3. Q. Zhang, X. Zhang, Research on the Application of Block Chain Technology in Internet Finance, vol. 885 (Springer International Publishing, 2019)
4. L. Stephan, S. Steffen, S. Moritz, G. Bela, A review on blockchain technology and blockchain projects fostering open science. Front. Blockchain (2019)
5. I.A. Hassan, R. Ahammed, M.M. Khan, N. Alsufyani, A. Alsufyani, Secured insurance framework using blockchain and smart contract (2021)
6. G.W. Peters, E. Panayi, Understanding modern banking ledgers through blockchain technologies: future of transaction processing and smart contracts on the internet of money (2015)
7. P. Eze, Eziokwu, A triplicate smart contract model using blockchain technology. Circ. Comput. Sci. DC CPS, 1–10 (2017). https://doi.org/10.22632/ccs-2017-cps-01
8. A.P. Ratna, P.D. Purnamasari, A. Shaugi, M. Salman, Analysis and comparison of MD5 and SHA-1 algorithm implementation in Simple-O authentication based security system, in 2013 International Conference on Qualitative Research (QiR 2013), in conjunction with ICCS 2013 (2013), pp. 99–104
9. S. Prabhav, Trusted execution environment and Linux: a survey. Int. J. Comput. Trends Technol. (2017)
10. J. Yli-Huumo, D. Ko, Where is current research on blockchain technology? A systematic review (2016)
11. I. Zikratov, A. Kuzmin, V. Akimenko, V. Niculichev, L. Yalansky, Ensuring data integrity using blockchain technology, in Proceedings of the 20th Conference of FRUCT Association (2017)
12. Z. Zheng, S. Xie, H. Dai, X. Chen, H. Wang, An overview of blockchain technology: architecture, consensus, and future trends (IEEE, 2017)
13. S. Gueron, Speeding up SHA-1, SHA-256 and SHA-512 on the 2nd generation Intel® Core™ processors, in Proceedings of the 9th International Conference on Information Technology: New Generations (ITNG 2012) (2012), pp. 824–826
14. R. Raju, S. Aravind Kumar, R. Manikandan, Avoiding data replication in cloud using SHA2, in 7th IEEE International Conference on Computation of Power, Energy, Information and Communication (ICCPEIC 2018) (2018), pp. 210–214
15. S. Purwanti, B. Nugraha, M. Alaydrus, Enhancing security on e-health private data using SHA512, in 2017 International Conference on Broadband and Wireless Sensors Powering (BCWSP 2017) (2018), pp. 1–4
Evaluation of Covid-19 Ontologies Through OntoMetrics and OOPS! Tools Narayan C. Debnath, Archana Patel, Debarshi Mazumder, Phuc Nguyen Manh, and Ngoc Ha Minh
Abstract Ontology provides a way to encode human intelligence so that machines can understand and make decisions by referring to this intelligence. For this reason, ontologies are used in every domain, especially in domains that relate to emergency situations. Covid-19 became a serious concern and emerged as the most significant emergency for the world. Many Covid-19 ontologies are available on the Web to analyse Covid-19 data semantically. However, a few questions arise from the work done so far: How many Covid-19 ontologies are available? Which is a good Covid-19 ontology in terms of richness? What is the pitfall rate of the available Covid-19 ontologies? This paper focuses on these questions by providing a comprehensive survey of the available Covid-19 ontologies. Through this paper, analysts and researchers can find a road map and an overview of existing research on Covid-19 ontologies. Keywords Ontology · Covid-19 · Ontology evaluation · Ontology Pitfall Scanner! · OntoMetrics
1 Introduction

Ontology is a semantic model that represents the reality of a domain in a machine-understandable manner. The basic building blocks of an ontology are classes, relationships, axioms, and instances [1]. The axioms (a statement is taken to be true, to act as

N. C. Debnath · A. Patel (B) · D. Mazumder · P. N. Manh · N. H. Minh
Department of Software Engineering, Eastern International University, Binh Duong, Vietnam
e-mail: [email protected]
N. C. Debnath e-mail: [email protected]
D. Mazumder e-mail: [email protected]
P. N. Manh e-mail: [email protected]
N. H. Minh e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_25
351
352
N. C. Debnath et al.
a premise for further reasoning) impose constraints or restrictions on the sets of classes and the relationships allowed among the classes. These axioms offer semantics because, by using them, machines can extract additional or hidden information based on the data explicitly provided. Nowadays, ontologies are used everywhere for the following reasons: (i) they simplify knowledge sharing among the entities of a system; (ii) they make it easier to reuse domain knowledge; and (iii) they provide a convenient way to manage and manipulate domain entities and their interrelationships. The W3C has standardized the web ontology language (OWL) for the description of facts in the resource description framework (RDF), which can be formatted with syntaxes like N-Triples, Turtle, RDF/XML, etc. OWL is designed to encode rich knowledge about entities and their relationships. OWL classes are built on top of, and add additional semantics to, RDFS classes. Whereas classes in most other languages only have heuristic definitions, OWL classes have a rigorous formal definition. Relations between individuals are described by OWL properties, of which there are three types: object properties, data properties, and annotation properties. One of the most powerful features of OWL is the ability to provide formal definitions of classes using the description logic (DL) language. These axioms are typically asserted on property values for the class. There are three kinds of axioms that can be asserted about OWL classes, namely quantifier restrictions, cardinality restrictions, and value restrictions. Ontologies are either developed by hand or by using ontology development tools. The Protégé editor provides an environment for creating and maintaining ontologies [2]. It is open source and has many plugins.
Several ontology repositories exist, namely OBO Foundry (contains biological science-related ontologies), BioPortal (a comprehensive repository of biomedical ontologies), AgroPortal (a vocabulary and ontology repository for agronomy and related domains), and OLS (provides single-point access to the latest versions of biomedical ontologies). Together, these repositories contain more than a thousand ontologies about a domain [3]. Users consult these repositories to determine the appropriate ontology as per their need. However, users sometimes find more than one ontology for the same domain, so it is crucial to examine which ontology best meets the user requirement. Ontology evaluation is a way of determining the relevance and importance of an ontology in a specified domain, and it is an important process for the development and maintenance of an ontology. Many Covid-19 ontologies have been developed to analyse Covid-19 data semantically. Now, the question is how to assess the quality of the available Covid-19 ontologies. The aim of this paper is to evaluate the available Covid-19 ontologies to find the suitable ontology as per the need of the user. The major contributions include evaluation of the richness and anomalies of Covid-19 ontologies, along with determination of their pitfall rate. The rest of the paper is organized as follows: Sect. 2 reviews the literature about Covid-19. Section 3 looks at the available ontology development methodologies and Covid-19 ontologies. Section 4 focuses on the evaluation of the Covid-19 ontologies to check their richness, anomalies, and pitfall rate. The last section concludes the paper.
Evaluation of Covid-19 Ontologies Through …
353
2 Literature

To deal with complex, highly connected big data, it is essential that the data be understandable to humans as well as machines. This is where the web ontology language (OWL) is utilized. OWL defines the semantics of data in terms of ontologies, which combine many different logical formalisms to represent data in an intuitive way. Many authors have proposed ontologies for analysing Covid-19 data semantically. González-Eras et al. [4] have proposed an ontology, namely the COVID-19 Pandemic ontology, by integrating existing Covid-19 ontologies. They have also used ontologies from other domains to cover the broader aspects of the pandemic. For the development of the COVID-19 Pandemic ontology, they followed ontological mining approaches, which include ontology alignment, ontology linking or mapping, and ontology merging or fusion or mixing. They evaluated the developed ontology using competency questions, ontological metrics, and the Protégé reasoner. Kachaoui et al. [5] have developed a methodology for the acquisition of knowledge from a data lake for the development of intelligent systems. The proposed methodology has four phases, namely the Acquisition layer (loading of data from different sources), Exploration layer (analyse and pre-process the data to make it clear), Semantic layer (prepare a new dataset in the form of an ontology), and Insight layer (create any type of insights). They also address three questions: how to detect persons infected with Covid-19, how to avoid and reduce disease propagation, and whether the government can use big data to prevent Covid-19. Fonou-Dombeu et al. [6] have developed an ontology called the COVID-19 ontology (COVIDonto) that extends existing ontologies and provides more information about the origin, symptoms, treatment, and spread of Covid-19.
The NeOn methodology is used for the development of this ontology, and data is collected from different sources to reflect the different aspects of Covid-19. They reused resources from biomedical ontologies; therefore, the developed ontology can be easily integrated with other ontologies. Sayeb et al. [7] have proposed an actor-centred approach to strengthen the ability of a healthcare information system (HIS) by providing a precise definition of desired acts, identifying the components, and measuring the performance of actors. They developed the C3HIS ontology using the Protégé tool; it contains two aspects, crises and actors, and they have shown its application in managing healthcare services in a HIS. Kouamé and Mcheick [8] have developed a COVID-19 ontology model called SuspectedCOPDcoviDOlogy, along with an alert system that identifies COPD patients with Covid-19. The SuspectedCOPDcoviDOlogy model contains five ontologies, namely the evaluation vital sign ontology (it identifies the key terms of the domain), questionnaire ontology (it contains questions that need to be answered by the person), symptom COVID-19 ontology (the answers on the questionnaire form are extracted by this ontology), alert ontology (it contains all alert messages), and service ontology (it provides services like sending alert messages or emails). They utilized the SWRL reasoning engine and an OWL/DL tool for generating alerts. When a Covid-19 result is positive, the system automatically generates a questionnaire
form and sends it to the staff or patient. Ahmad et al. [9] have provided a survey on the ontologies and tools that support the analytics of Covid-19. They followed the systematic mapping studies (SMS) method to collect and analyse the documents for the review. An ontology can be developed either from scratch or by modifying an existing ontology. In both cases, approaches for evaluating the quality and quantity of the ontology are necessary. Ontology evaluation provides the approaches and criteria to examine the quality and quantity of the ontology, which also identifies the best-fit ontology in the specified domain as per requirement. Raad and Cruz [10] have proposed the following criteria:
• Accuracy: It is determined by the definitions and descriptions of entities like classes, properties, and individuals. This criterion states that the ontology is correct.
• Clarity: It shows how effectively the meaning or definition of a term in the ontology is defined. The definition of a term should be independent of the context.
• Completeness: It states that the ontology covers complete information about a specified domain.
• Adaptability: It measures the adaptability of an ontology.
• Consistency: It shows that the ontology does not have any contradictions.
• Computational efficiency: It shows the flexibility of an ontology with the tools, specifically focusing on the speed of the reasoner that infers information from the ontology.
Poveda-Villalón et al. [11] have proposed a tool called Ontology Pitfall Scanner! (OOPS!). It is a Web-based tool that shows the pitfalls or anomalies of an ontology. OOPS! reports 41 types of pitfalls, ranging from P01 to P41. Basically, OOPS!
groups the pitfalls under three categories, namely minor pitfalls (these pitfalls are not serious, and there is no need to remove them), important pitfalls (not very serious pitfalls, but they should be removed), and critical pitfalls (these pitfalls hamper the quality of an ontology and must be removed before using it). The pitfalls are also classified by dimension and by evaluation criteria: the OOPS! tool has three types of dimension pitfalls, namely the structural, functional, and usability-profiling dimensions, and it determines the pitfalls under three ontology evaluation criteria, namely consistency, completeness, and conciseness. Lozano-Tello and Gómez-Pérez [12] have developed the OntoMetrics tool. It is a Web-based tool that calculates statistical information about an ontology; the current version is available at https://ontometrics.informatik.uni-rostock.de/ontologymetrics/. It has five types of metrics, namely base metrics, schema metrics, knowledge base metrics, class metrics, and graph metrics. As of now, many articles based on Covid-19 ontologies are available. However, a review of the current landscape shows that Covid-19 ontologies have not yet been evaluated in terms of richness, anomalies, and pitfall rate, which creates a problem for choosing suitable ontologies as per the need of the user. To overcome this problem, this paper evaluates the existing ontologies designed specifically for the Covid-19 context.
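As a flavour of what a pitfall scanner checks, the sketch below flags one simple OOPS!-style usability pitfall, elements with no human-readable label (cf. pitfall P08, missing annotations), on a toy ontology held in plain dictionaries. This is our own minimal data model for illustration, not the OOPS! API:

```python
# Toy ontology: each element maps to its annotations (hypothetical data model)
ontology = {
    "Covid19Patient": {"label": "COVID-19 patient"},
    "hasSymptom":     {"label": "has symptom"},
    "PCRTest":        {},   # no rdfs:label -> pitfall
}

def missing_annotation_pitfalls(onto: dict) -> list:
    """Return elements lacking a label, in the spirit of OOPS! pitfall P08."""
    return [name for name, ann in onto.items() if "label" not in ann]

flagged = missing_annotation_pitfalls(ontology)
print(flagged)  # -> ['PCRTest']
assert flagged == ["PCRTest"]
```

A real scanner runs dozens of such checks over the parsed OWL graph and classifies each hit as minor, important, or critical.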
3 Ontology at a Glance

This section focuses on ontology development methodologies and the available Covid-19 ontologies that are accessible and downloadable.
3.1 Ontology Development Methodologies

In the literature, various authors have developed ontologies for the semantic analysis of Covid-19 data. However, the major problem for ontology developers is to choose the right methodology that builds a correct, complete, and concise ontology as per requirement. An ontology development methodology describes the step-by-step process of ontology development. The most widely used methodologies are TOVE, the Enterprise Model Approach, METHONTOLOGY, and KBSI IDEF5 [13]. These methodologies have various steps for ontology development, and some steps are common among them. Figure 1 shows the relationships among these methodologies via (double-headed) arrows. The most important step of ontology development is to identify the purpose and fix the boundary or scope of the ontology. This can be achieved by writing competency questions and motivation scenarios. The next step is to collect and analyse information based on the defined scope and boundary of the ontology. For this purpose, ontology developers use different sources or repositories and conduct interviews with domain experts to collect the desired information. This step is also known as knowledge acquisition. Once the required information is available, it must be formalized. For this purpose, first recognize the classes and their properties; the formal competency questions can be utilized here to identify the entities as classes, properties, and instances. Then the ontology is encoded and constraints are imposed on the classes and their properties as required. Ontology evaluation is a vital step of ontology development methodologies. It shows completeness (the ontology must contain all the required information as per the domain need) and accuracy (the ontology must be free from
Fig. 1 Most popular and extensively used ontology development methodologies
anomalies and be able to infer the correct answers). The last step of an ontology development methodology is to document the ontology, which can become the base of other activities. Apart from these methodologies, three further ontology development methodologies, namely NeOn [14], YAMO [15], and SAMOD [16], are also available in the literature. The existing Covid-19 ontologies have been developed based on the above methodologies.
3.2 Available Covid-19 Ontologies

Many ontologies have been developed that contain Covid-19-related data. However, only a few are available that have been specifically designed for this pandemic. These ontologies are listed below:
1. An Ontology for Collection and Analysis of COVID-19 Data (CODO) [17]: The CODO ontology is a data model that publishes Covid-19 data on the Web as a knowledge graph. CODO aims to show the patient data and cases of Covid-19. The latest version of CODO was released in Sept 2020.
2. COKPME [18]: This ontology is used to analyse the precautionary measures that help in controlling the spread of Covid-19. The COKPME ontology is able to handle various competency questions. The latest version of COKPME was released in Sept 2021.
3. COVID-19 Surveillance Ontology (COVID-19) [19]: This ontology supports surveillance activities and is designed as an application ontology for the Covid-19 pandemic. The developed COVID-19 surveillance ontology ensures transparency and consistency. The latest version of COVID-19 was released in May 2020.
4. Long Covid Phenotype Ontology (LONGCOVID) [20]: It is the RCGP RSC Long Covid Phenotype ontology. The latest version of LONGCOVID was released in Oct 2021.
5. The COVID-19 Infectious Disease Ontology (IDO-COVID-19) [21]: It is an extension of two ontologies, namely IDO and VIDO, and provides integration and analysis of Covid-19 data. The latest version of IDO-COVID-19 was released in June 2020.
6. WHO COVID-19 Rapid Version CRF semantic data model (COVIDCRFRAPID) [22]: It is a semantic data model of the Covid-19 cases provided by the WHO. This ontology aims to provide semantic references to the questions and answers of the form. The latest version of COVIDCRFRAPID was released in June 2020.
7. COVID-19 Ontology [23]: It covers a wide spectrum of medical and epidemiological concepts linked to COVID-19. The latest version of the COVID-19 ontology was released in May 2020.
8. COVID-19OntologyInPatternMedicine (COVID-19-ONT-PM) [24]: It provides scientific findings on Covid-19 that help to control the outbreak of the pandemic. This ontology has the ability to support medical decisions in a systematic way. The latest version of COVID-19-ONT-PM was released in August 2021.
9. Coronavirus Infectious Disease Ontology (CIDO) [25]: It is a community-based ontology that imports Covid-19 pandemic-related concepts from the IDO ontology. CIDO is a more specific standard ontology than IDO; it encodes knowledge about the coronavirus disease and provides integration, sharing, and analysis of the information. The latest version of CIDO was released in August 2021.
4 Evaluation of Covid-19 Ontologies

Ontology evaluation provides approaches and criteria to examine the quality and quantity of an ontology, which also identifies the best-fit ontology in the specified domain as per requirement. The ontology evaluation aims to determine and check the following points:
• Which ontology is optimal for the user among the other available ontologies?
• Which ontology has richer attribute values compared to the other available ontologies?
• Which ontology has no anomalies or errors?
This section evaluates the richness, anomalies, and pitfall rate of the available Covid-19 ontologies. We use the OntoMetrics tool to examine the richness of an ontology and the OOPS! tool for anomaly detection and pitfall rate.
4.1 OntoMetrics Tool

OntoMetrics is a tool that evaluates an ontology quantitatively. It measures the quantity (number of ontological attributes) of the ontology based on five groups of metrics, namely base metrics, schema metrics, knowledge base metrics, class metrics, and graph or structure metrics. Each group evaluates a different aspect of the ontology. The base metrics show the quantity of the ontology elements and consist of simple counts like classes, axioms, data properties, object properties, instances, etc. The schema metrics describe the design of the ontology by indicating attribute richness, relationship richness, inheritance richness, etc. The knowledge base metrics measure the amount of data encoded inside the ontology; they explain the effectiveness of the ontology design by counting its instances. The class metrics measure the classes and relationships of the ontology. The graph or structure metrics explain the structure of the ontology by examining the cardinality, depth, breadth, total number of paths, etc. We evaluate all the Covid-19 ontologies with respect to all these metrics, which examine the richness of these ontologies. Table 1 depicts the values of these metrics.
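The schema metrics above have simple arithmetic definitions. We follow the common OntoQA-style formulas here; the OntoMetrics tool's exact formulas may differ in detail. For a toy ontology (the counts below are a hypothetical example, not one of the Covid-19 ontologies):

```python
# Toy ontology described by counts (hypothetical example)
classes = 4            # e.g. Disease, Symptom, Covid19, Fever
subclass_links = 2     # Covid19 -> Disease, Fever -> Symptom
other_relations = 2    # e.g. hasSymptom, treatedBy
attributes = 3         # data properties such as onsetDate, severity, name

attribute_richness = attributes / classes        # datatype properties per class
inheritance_richness = subclass_links / classes  # subclass links per class
# share of relations that are NOT inheritance links
relationship_richness = other_relations / (subclass_links + other_relations)

print(attribute_richness, inheritance_richness, relationship_richness)
# -> 0.75 0.5 0.5  (compare CODO in Table 1: 0.549 / 1.010 / 0.471)
```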
358
N. C. Debnath et al.
Table 1 (a) and (b) show the richness of the Covid-19 ontologies

(a)

| Metrics | CODO | COKPME | COVID-19 | Long COVID | IDO-COVID-19 |
|---|---|---|---|---|---|
| Base metrics | | | | | |
| Axioms | 2047 | 406 | 165 | 33 | 5018 |
| Logical axioms count | 917 | 161 | 32 | 21 | 1032 |
| Class count | 91 | 26 | 33 | 13 | 486 |
| Object properties | 73 | 19 | 0 | 0 | 43 |
| Data properties | 50 | 19 | 0 | 0 | 0 |
| Instances | 271 | 41 | 0 | 0 | 94 |
| Schema metrics | | | | | |
| Attribute richness | 0.549 | 0.730 | 0.0 | 0.0 | 0.0 |
| Inheritance richness | 1.010 | 1.115 | 0.969 | 1.307 | 1.242 |
| Relationship richness | 0.471 | 0.472 | 0.0 | 0.190 | 0.340 |
| Attribute class ratio | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Equivalence ratio | 0.098 | 0.269 | 0.0 | 0.0 | 0.234 |
| Axiom/class ratio | 22.49 | 15.615 | 5.0 | 2.538 | 10.325 |
| Inverse relations ratio | 0.211 | 0.142 | 0.0 | 0.0 | 0.395 |
| Class/relation ratio | 0.522 | 0.472 | 1.031 | 0.619 | 0.530 |
| Knowledge base metrics | | | | | |
| Average population | 2.978 | 1.576 | 0.0 | 0.0 | 0.193 |
| Class richness | 0.307 | 0.153 | 0.0 | 0.0 | 0.004 |
| Graph metrics | | | | | |
| Absolute root cardinality | 1 | 1 | 1 | 1 | 1 |
| Absolute leaf cardinality | 67 | 17 | 28 | 4 | 306 |
| Absolute sibling cardinality | 91 | 26 | 33 | 13 | 486 |
| Absolute depth | 327 | 84 | 93 | 210 | 3588 |
| Average depth | 3.59 | 3.230 | 2.818 | 6.176 | 7.382 |
| Maximal depth | 6 | 5 | 3 | 8 | 13 |
| Absolute breadth | 91 | 26 | 33 | 34 | 486 |
| Average breadth | 3.64 | 2.6 | 5.5 | 2.428 | 2.685 |
| Maximal breadth | 13 | 6 | 11 | 5 | 18 |
| Ratio of leaf fan-outness | 0.736 | 0.653 | 0.848 | 0.307 | 0.629 |
| Ratio of sibling fan-outness | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Tangledness | 0.021 | 0.076 | 0.0 | 0.307 | 0.213 |
| Total number of paths | 91 | 26 | 33 | 34 | 486 |
| Average number of paths | 15.16 | 5.2 | 11.0 | 4.25 | 37.384 |

(b)

| Metrics | COVIDCRF RAPID | COVID-19 | COVID-19-ONT-PM | CIDO |
|---|---|---|---|---|
| Base metrics | | | | |
| Axioms | 6684 | 41,121 | 1232 | 134,742 |
| Logical axioms count | 1699 | 2630 | 472 | 28,118 |
| Class count | 399 | 2271 | 365 | 8775 |
| Object properties | 6 | 9 | 14 | 363 |
| Data properties | 7 | 1 | 1 | 18 |
| Instances | 495 | 6 | 6 | 3646 |
| Schema metrics | | | | |
| Attribute richness | 0.017 | 4.4E-4 | 0.002 | – |
| Inheritance richness | 1.932 | 1.153 | 1.257 | – |
| Relationship richness | 0.007 | 0.007 | 0.029 | – |
| Attribute class ratio | 0.0 | 0.0 | 0.0 | – |
| Equivalence ratio | 0.0 | 0.003 | 0.0 | – |
| Axiom/class ratio | 16.751 | 18.107 | 3.375 | – |
| Inverse relations ratio | 0.0 | 0.0 | 0.0 | – |
| Class/relation ratio | 0.513 | 0.860 | 0.771 | – |
| Knowledge base metrics | | | | |
| Average population | 1.240 | 0.002 | 0.016 | – |
| Class richness | 0.057 | 0.0 | 0.002 | – |
| Graph metrics | | | | |
| Absolute root cardinality | 1 | 1 | 1 | – |
| Absolute leaf cardinality | 344 | 1556 | 285 | – |
| Absolute sibling cardinality | 399 | 2271 | 365 | – |
| Absolute depth | 2092 | 26,269 | 2184 | – |
| Average depth | 5.23 | 9.555 | 5.983 | – |
| Maximal depth | 8 | 17 | 9 | – |
| Absolute breadth | 400 | 2749 | 365 | – |
| Average breadth | 7.142 | 2.899 | 4.506 | – |
| Maximal breadth | 91 | 247 | 36 | – |
| Ratio of leaf fan-outness | 0.862 | 0.685 | 0.780 | – |
| Ratio of sibling fan-outness | 1.0 | 1.0 | 1.0 | – |
| Tangledness | 0.666 | 0.130 | 0.079 | – |
| Total number of paths | 400 | 2749 | 365 | – |
| Average number of paths | 50.0 | 161.705 | 40.555 | – |
for the Covid-19 ontologies. The highest number of classes (8775) and object properties (363) is found in the CIDO ontology; however, the highest number of data properties (50) lies with the CODO ontology among the existing Covid-19 ontologies. Hence, CIDO is rich in terms of classes and object properties, whereas CODO is rich in terms of data properties.
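Some of the reported values can be recomputed from the base metrics alone. As a small sketch (assuming the usual OntoMetrics definitions: attribute richness = data properties per class, axiom/class ratio = axioms per class, average population = instances per class), the CODO column of Table 1(a) is recovered as follows:

```python
# Base metric counts for CODO, taken from Table 1(a).
codo = {"axioms": 2047, "classes": 91, "data_properties": 50, "instances": 271}

# Derived metrics: each is a simple per-class ratio of a base metric.
attribute_richness = codo["data_properties"] / codo["classes"]  # 50/91
axiom_class_ratio = codo["axioms"] / codo["classes"]            # 2047/91
average_population = codo["instances"] / codo["classes"]        # 271/91
```

Rounding these ratios gives 0.549, 22.49, and 2.978, matching the CODO entries in Table 1(a).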
4.2 Ontology Pitfall Scanner! (OOPS!)

OOPS! is a criteria-based tool, available on the Web, that evaluates an ontology qualitatively. It reports the errors or anomalies of the ontology under three categories: minor pitfalls (not a serious concern; there is no need to remove them), important pitfalls (moderate pitfalls; it is good practice to remove them before using the ontology), and critical pitfalls (these must be removed from the ontology, otherwise they will produce wrong reasoning results). The OOPS! tool covers 41 different pitfall types, ranging from P01 to P41, together with their descriptions. Table 2 depicts the pitfalls found in the Covid-19 ontologies. The obtained pitfalls are described below.
• Minor pitfalls: P04: the ontology has unconnected elements; P07: concepts are merged in the same class; P08: annotation properties are missing; P13: the inverse of a property is not defined explicitly; P20: property annotations are not used properly; P21: the ontology uses a miscellaneous class; P22: the ontology uses different naming conventions; P32: the same label is assigned to several classes.
• Important pitfalls: P10: the ontology lacks disjointness axioms between classes or between properties; P11: domain and range of properties are missing; P24: the ontology uses recursive definitions; P25: a relationship is defined as the inverse of itself; P30: equivalent classes are not defined explicitly; P34: the ontology has an untyped class; P38: the owl:Ontology tag is not declared; P41: no license is declared.
• Critical pitfalls: P19: the ontology has multiple domains and ranges defined for properties.
The numbers (e.g. 1, 2, 4, ...) contained in Table 2 denote the total number of cases of the given pitfall; for example, the CODO ontology contains 1 case of pitfall P04. The sign × indicates that no case of that pitfall occurs in the respective ontology. The pitfalls describe the number of features that could create problems during reasoning.
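The severity categories listed above can be captured in a small lookup; a minimal sketch (the function and variable names are illustrative, not part of OOPS! itself):

```python
# Severity of each pitfall obtained for the Covid-19 ontologies, grouped
# as in the OOPS! catalogue (codes P01-P41).
PITFALL_SEVERITY = {
    "minor":     ["P04", "P07", "P08", "P13", "P20", "P21", "P22", "P32"],
    "important": ["P10", "P11", "P24", "P25", "P30", "P34", "P38", "P41"],
    "critical":  ["P19"],
}

def severity_of(code):
    """Return the severity class a pitfall code falls into."""
    for level, codes in PITFALL_SEVERITY.items():
        if code in codes:
            return level
    return "unknown"
```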
We have calculated the pitfall rate by using the following formula
Table 2 Obtained pitfalls of Covid-19 ontologies (for each of the nine ontologies: CODO, COKPME, COVID-19, LONGCOVID, IDO-COVID-19, COVIDCRF RAPID, COVID-19, COVID-19-ONT-PM, and CIDO, the table lists the number of cases of each obtained pitfall: minor pitfalls P04, P07, P08, P13, P20, P21, P22, P32; important pitfalls P10, P11, P24, P25, P30, P34, P38, P41; critical pitfall P19. The sign × marks pitfalls with no cases.)
Fig. 2 Pitfalls rate of Covid-19 ontologies (bar chart of pitfall rate over the nine ontologies; plotted rates: 0.472, 0.454, 0.147, 0.125, 0.109, 0.086, 0.042, 0.007, 0.024)
Pitfall Rate = (Σ_{i=1}^{n} P_i) / N

where P_i represents the total number of pitfall cases of the i-th pitfall type and N is the total number of tuples (the ontology size). A high value of the pitfall rate implies a more significant number of anomalies and vice versa; the pitfall rate lies between 0 and 1. Figure 2 shows the pitfall rates of the available Covid-19 ontologies. The COVID-19-ONT-PM ontology has a pitfall rate of 0.472, which is the highest among the Covid-19 ontologies. The COKPME ontology contains three cases of pitfall P19, which is a critical pitfall, so this pitfall must be removed before using the COKPME ontology.
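As a minimal sketch of this formula (the case counts and ontology size below are hypothetical, not taken from Table 2):

```python
def pitfall_rate(case_counts, ontology_size):
    """Pitfall rate = (sum of all pitfall cases) / N, per the definition above."""
    return sum(case_counts) / ontology_size

# Hypothetical ontology with pitfall cases P08=7, P11=3, P19=1 and 220 tuples:
rate = pitfall_rate([7, 3, 1], 220)
```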
5 Conclusion and Future Work

It is challenging to choose a suitable ontology when many ontologies are available in the same domain. Ontology evaluation approaches assess the quality of the available ontologies to identify the most feasible ontology as per the needs of the user. In this paper, we collected the ontologies that are specifically designed for the Covid-19 context and then evaluated them in terms of richness, anomalies, and pitfall rate. The OntoMetrics tool was used for the calculation of richness, and the OOPS! tool was utilized for the detection of anomalies. The evaluation results show that the available Covid-19 ontologies are rich in terms of attributes; however, they are not free from anomalies. Since anomalies alter reasoning results, the user must remove them before using an ontology in an application. In the future, an attempt should be made to evaluate ontologies using software engineering processes.
Acknowledgements This research is financially supported by Eastern International University, Binh Duong Province, Vietnam.
References 1. A. Patel, N.C. Debnath, A.K. Mishra, S. Jain, Covid19-IBO: A Covid-19 impact on Indian banking ontology along with an efficient schema matching approach. N. Gener. Comput. 39(3), 647–676 (2021) 2. M. Horridge, H. Knublauch, A. Rector, R. Stevens, C. Wroe, A practical guide to building OWL ontologies using the Protégé-OWL plugin and CO-ODE tools edition 1.0. Univ. Manchester (2004) 3. A. Patel, S. Jain, Ontology versioning framework for representing ontological concept as knowledge unit, in International Semantic Intelligence Conference, vol. 2786 (2021) 4. A. González-Eras, R. Dos Santos, J. Aguilar, A. Lopez, Ontological engineering for the definition of a COVID-19 pandemic ontology. Inform. Med. Unlocked 100816 (2021) 5. J. Kachaoui, J. Larioui, A. Belangour, Towards an ontology proposal model in data lake for real-time COVID-19 cases prevention (2020) 6. J.V. Fonou-Dombeu, T. Achary, E. Genders, S. Mahabeer, S.M. Pillay, COVIDonto: An ontology model for acquisition and sharing of COVID-19 data, in International Conference on Model and Data Engineering (Springer, Cham, 2021), pp. 227–240 7. Y. Sayeb, M. Jebri, H.B. Ghezala, Managing COVID-19 crisis using C3HIS ontology. Procedia Comput. Sci. 181, 1114–1121 (2021) 8. K.M. Kouamé, H. Mcheick, An ontological approach for early detection of suspected COVID19 among COPD patients. Appl. Syst. Innovation 4(1), 21 (2021) 9. A. Ahmad, M. Bandara, M. Fahmideh, H.A. Proper, G. Guizzardi, J. Soar, An overview of ontologies and tool support for COVID-19 analytics, in 2021 IEEE 25th International Enterprise Distributed Object Computing Workshop (EDOCW) (IEEE, 2021), pp. 1–8 10. J. Raad, C. Cruz, A survey on ontology evaluation methods, in Proceedings of the International Conference on Knowledge Engineering and Ontology Development, Part of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (2015). 11. M. Poveda-Villalón, A. Gómez-Pérez, M.C. 
Suárez-Figueroa, Oops!(ontology pitfall scanner!): An on-line tool for ontology evaluation. Int. J. Semant. Web Inf. Syst. (IJSWIS) 10(2), 7–34 (2014) 12. A. Lozano-Tello, A. Gómez-Pérez, Ontometric: A method to choose the appropriate ontology. J. Database Manag. (JDM) 15(2), 1–18 (2004) 13. R. Iqbal, M.A.A. Murad, A. Mustapha, N.M. Sharef, An analysis of ontology engineering methodologies: A literature review. Res. J. Appl. Sci. Eng. Technol. 6(16), 2993–3000 (2013) 14. M.C. Suárez-Figueroa, A. Gómez-Pérez, M. Fernandez-Lopez, The NeOn methodology framework: A scenario-based methodology for ontology development. Appl. Ontol. 10(2), 107–145 (2015) 15. B. Dutta, U. Chatterjee, D.P. Madalli, YAMO: yet another methodology for large-scale faceted ontology construction. J. Knowl. Manag. (2015) 16. S. Peroni, SAMOD: an agile methodology for the development of ontologies (2016) 17. CODO Ontology, https://bioportal.bioontology.org/ontologies/CODO 18. COKPME Ontology, https://bioportal.bioontology.org/ontologies/COKPME 19. COVID19 Ontology, https://bioportal.bioontology.org/ontologies/COVID19 20. LONGCOVID Ontology, https://bioportal.bioontology.org/ontologies/LONGCOVID 21. IDO-COVID-19 Ontology, https://bioportal.bioontology.org/ontologies/IDO-COVID-19
22. COVIDCRFRAPID Ontology, https://bioportal.bioontology.org/ontologies/COVIDCRFR APID 23. COVID-19 Ontology, https://bioportal.bioontology.org/ontologies/COVID-19 24. COVID-19-ONT-PM Ontology, https://bioportal.bioontology.org/ontologies/COVID-19ONT-PM 25. CIDO Ontology, https://bioportal.bioontology.org/ontologies/CIDO
Recognition of the Multioriented Text Based on Deep Learning K. Priyadarsini, Senthil Kumar Janahan, S. Thirumal, P. Bindu, T. Ajith Bosco Raj, and Sankararao Majji
Abstract The development and use of systems for analyzing visuals, such as photos and videos, using benchmark datasets is a difficult but necessary undertaking. A DNN and an STN are employed in this study to solve the challenge at hand. The study's network design consists of a localization network and a recognition network. The localization network generates a sampling grid and locates and localizes text sections. The text areas are then entered into the recognition network, which learns to recognize text, including low-resolution, curved, and multi-oriented text. The Street View House Numbers dataset and the 2015 International Conference on Document Analysis and Recognition (ICDAR) benchmark were used to gauge the system's performance for this study's findings. Using the STN-OCR model, we are able to outperform the literature. Keywords Spatial transformer networks · Deep neural networks · Recognition · STN-OCR · Multi-oriented text
K. Priyadarsini Department of Data Science and Business Systems, School of Computing, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chennai, India S. K. Janahan (B) Department of CSE, Lovely Professional University, Phagwara, Punjab, India e-mail: [email protected] S. Thirumal Department of Computer Science and Engineering, Vels Institute of Science Technology and Advanced Studies, Chennai, India P. Bindu Department of Mathematics, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India T. A. B. Raj Department of Electronics and Communication Engineering, PSN College of Engineering and Technology, Tirunelveli, Tamil Nadu, India S. Majji Department of Electronics and Communication Engineering, GRIET, Hyderabad, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_26
367
368
K. Priyadarsini et al.
1 Introduction

Increasing demand for numerous computer vision tasks has pushed the community to focus on reading text in the wild (from scene photos). Despite substantial research in the last few years, detecting text in uncontrolled contexts remains a difficult task [1, 2]. Even more challenging is recognizing text lines with arbitrary orientation, which must take into account a substantially greater number of hypotheses, significantly expanding the search space. In most cases, existing methods are able to recognize text that is horizontal or close to horizontal; when applied to multi-oriented text, the results of the recent ICDAR2015 competition for text detection show that there is still a considerable gap. Text has a fairly different appearance and shape compared to generic objects, since it can be handled as a sequence-like object of unlimited length [3–5]. This has led to the widespread use of scene text identification systems based on sliding windows and connected components. The ICDAR2013 and ICDAR2015 contests saw state-of-the-art performance from component-based approaches using Maximally Stable Extremal Regions (MSER) as the fundamental representation. An extremely resilient representation of character components was recently learned through the use of a convolutional neural network (CNN). To localize a word or a line of text, clustering algorithms or some heuristic approach is usually required; other work directly detects text lines in cluttered photos, taking advantage of their symmetry and self-similarity features [6]. Text detection appears to need the use of both character components and text regions.
2 Related Work

Natural picture text identification has piqued the curiosity of computer vision and document analysis professionals. Horizontal or near-horizontal text detection is the primary focus of most techniques. In an end-to-end text recognition system, the first step is to locate word boundaries [7]. Here, we take a look at some of the best examples of multi-oriented text detection. Lu et al. [8] were the first to look at real-world multi-oriented text detection, with a pipeline comparable to conventional detection pipelines that use connected component extraction and text line orientation estimation. Kang et al. [8] turned the text identification problem into a graph partitioning problem by treating each MSER component as a node in a network. Wei et al. [9] proposed a text detection approach based on exhaustive segmentation. Yin and others use multi-stage clustering methods in order to recognize multi-oriented text [10]. For multi-oriented text, an SWT-based end-to-end system was proposed by Yao and his colleagues. The ICDAR2015 text detection competition recently announced a challenging benchmark for multi-oriented text identification, and numerous academics have presented their results on it [11].
Recognition of the Multioriented Text Based on Deep …
369
3 Methodology

The STN-OCR model reads line by line and character by character, just as a person would. This human-like approach to text analysis is no longer employed by most text detection and recognition algorithms: an image is processed in its entirety, allowing those systems to retrieve all relevant information at once. In the human-based technique, textual sections are found and localized progressively in images and then recognized [12, 13]. Text detection and recognition are both part of the Deep Neural Network (DNN) model created in this regard. This section focuses on the text detection stage's attention mechanism and the complete STN-OCR approach (Fig. 1).

A. Text Detection with Spatial Transformers
Jaderberg et al. employed the Spatial Transformer, a learnable Deep Neural Network module that receives an input feature map I, spatially transforms it, and outputs a feature map O. This transformation is made up of three main components. The first part, the localization network, computes the function f_loc, which predicts the spatial transformation parameters θ. Based on the predicted parameters, the second part generates a sampling grid [14]. The sampling grid is then delivered as input to the learnable interpolation algorithm in the third part, which produces the transformed feature map O as output. We go through each part in this section. • Localization network: An input feature map with dimensions such as height and width is fed into the localization network, which produces the spatial transformation parameters as output. The localization network will locate and localize N letters, words, or lines of text. An affine transformation matrix is used to apply rotation, translation, skew, and zoom to the input feature map in order to achieve oriented text detection. The system thus learns the rotation, translation, and zoom of the text.
Fig. 1 Text detection and recognition using STN-OCR
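A minimal sketch of the spatial-transformer mechanics described in this section, using plain Python lists; in the real system the affine parameters θ are predicted by the localization network, whereas here an identity transform is hard-coded for illustration (function names are illustrative and assume H, W ≥ 2):

```python
def affine_grid(theta, H, W):
    """Apply a 2x3 affine matrix theta to a normalized HxW sampling grid."""
    grid = []
    for i in range(H):
        for j in range(W):
            # target coordinates normalized to [-1, 1]
            x = -1 + 2 * j / (W - 1)
            y = -1 + 2 * i / (H - 1)
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            grid.append((xs, ys))
    return grid

def bilinear_sample(img, xs, ys):
    """Bilinearly interpolate img at normalized coords (xs, ys) in [-1, 1]."""
    H, W = len(img), len(img[0])
    fx = (xs + 1) * (W - 1) / 2      # back to pixel coordinates
    fy = (ys + 1) * (H - 1) / 2
    x0, y0 = int(fx), int(fy)
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = fx - x0, fy - y0
    top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
    bot = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
    return top * (1 - wy) + bot * wy

# identity transform: zoom = 1, no rotation or translation
identity = [[1, 0, 0], [0, 1, 0]]
```

With the identity matrix, sampling a feature map over its own grid returns the map unchanged; a predicted θ would instead crop and warp a text region.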
STN-OCR uses a feed-forward CNN and an RNN to generate the N affine transformation matrices. The localization network uses the ResNet-50 CNN model. With this network structure, the system's performance is superior to that of other architectures such as VGGNet; it overcomes the vanishing-gradient problem and maintains a higher level of accuracy than alternative network structures. Batch Normalization was employed for the experiments, and an RNN was used for the rest of the study. In this case, the RNN is a bi-directional LSTM (BLSTM); its hidden states are used to predict the affine transformation matrices, with the BLSTM primarily responsible for generating those hidden states. • Localization Network Configuration: The localization network uses the ResNet (residual neural network) architecture. Images are fed to the network, which uses them to locate the corresponding texts. The first layer of the network performs a 3 × 3 convolution with 32 filters, and the second and third layers perform the same convolution with 48 filters each. Each convolution layer is followed by Batch Normalization and 2 × 2 average pooling with stride two, and ReLU is employed as the activation function in each layer. Two residual layers with 3 × 3 convolutions are then used, each followed by Batch Normalization. Finally, a BLSTM with 256 neurons is applied after the last residual layer. After this, a sampling grid with bounding boxes (BBoxes) for the textual portions is constructed; as seen in Fig. 2, BBoxes are generated only for the textual portions of the document.
Fig. 2 Localization network
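Ignoring padding details, the spatial footprint implied by the configuration above (three 2 × 2 average poolings with stride two) can be sketched as follows; the function name is illustrative:

```python
def feature_map_size(h, w, num_pools=3, pool_stride=2):
    """Track spatial size through repeated stride-2 pooling layers."""
    for _ in range(num_pools):
        h, w = h // pool_stride, w // pool_stride
    return h, w

# A 64 x 256 input reaches the residual/BLSTM stages at roughly 1/8 scale.
size = feature_map_size(64, 256)
```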
• Grid Generation: Using the feature map as input, the system creates N grids over the input feature map I, using grid G0 and the coordinates xw0 and yh0. This stage generates a total of N output grids, including the BBoxes of the textual parts located by the network. • Image sampling: After the N sampling grids are created by the grid generator in the second part, the values of feature map I are sampled at their corresponding coordinates on each of the N grids. These points do not necessarily coincide with the feature map grid, so bilinear sampling was employed to interpolate from the nearest neighbouring grid points. The grid generator and image sampler in action can be seen in Fig. 2. To choose image pixels at a certain point in an image, the image sampler uses the grids generated by the grid generator, and the vertices of the sampling grids are used to generate BBoxes automatically. Hence, the Spatial Transformer is formed by merging these three components (localization network, grid generation, and image sampling) and can be employed in any part of a Deep Neural Network. This system begins with a Spatial Transformer.

B. Text Recognition
The text detection stage returns N textual regions retrieved from the input image. The text recognition stage treats each of the N regions separately, with a CNN handling the processing of the N regions. Because ResNet has been shown to produce better outcomes in text recognition systems, a ResNet variant is also used in this CNN. Text recognition is required to produce strong gradients for text detection. Probability distributions over the label space are estimated at this stage and are predicted using a softmax classifier:

x^n = O^n    (1)

y_t^n = softmax(f_rec(x^n))    (2)

The output f_rec(x) is obtained after convolutional feature extraction. The recognition network's configuration is identical to that of the localization network except for the convolution filters: there are three convolutional layers with 32, 64, and 128 filters in this network.

C. Training Network
An image training set X and a text file for each individual image are used to train the network/model on ICDAR 2015. Each file contains the top-left, top-right, bottom-right, and bottom-left coordinates (x1, y1), (x2, y2), (x3, y3), and (x4, y4) of the text in the corresponding image. After learning localization and detecting possible text candidates in the first step, the model employs the labels to identify the specific piece of text.
By calculating the loss of the predicted text labels, error gradients are used to search for and locate text regions. Some pre-training activities are required because we discovered that the model fails to converge on images containing multi-line text. Optimizing the network during model training has a substantial influence. After pre-training the network with Stochastic Gradient Descent (SGD), the Adam optimizer is used in order to improve the network's performance on more difficult tasks. In the first step of text detection, the learning rate is kept constant for a longer period of time, leading to an improved ability to locate and identify textual sections; SGD is employed here and performs admirably in this context. The next stage, text recognition, involves learning to recognize the text sections predicted in the prior stage.
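As a hedged illustration of the two optimizers mentioned above, the single-parameter update rules can be sketched as follows (the hyperparameter values are the common defaults, not values from this study):

```python
import math

def sgd_step(theta, grad, lr=0.01):
    """Plain SGD update: move against the gradient by a fixed learning rate."""
    return theta - lr * grad

def adam_step(theta, grad, state, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """Adam update: bias-corrected running moments of the gradient."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    m_hat = state["m"] / (1 - b1 ** state["t"])
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return theta - lr * m_hat / (math.sqrt(v_hat) + eps)

state = {"t": 0, "m": 0.0, "v": 0.0}
```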
4 Results and Discussions

What can be accomplished by utilizing this study's framework is examined in this section. This investigation uses the ICDAR 2015 and SVHN benchmark datasets, discussed below. • ICDAR 2015 Dataset: The Robust Reading Competition makes use of the ICDAR 2015 dataset, which includes 1500 images and over 10,000 annotations. A total of 1000 photographs are utilized in the training process, and a further 500 images are used in the testing process. In addition, the photographs are annotated with text. The primary tasks of ICDAR 2015 include word localization and text recognition. Each image has a set of text bounding boxes (BBoxes) for localization, and each image's BBoxes are kept in its own file, one per line. To aid automatic word recognition, BBoxes are provided in addition to the word itself. • SVHN Dataset: With its low-resolution photos and little required data processing and formatting, the Street View House Numbers (SVHN) dataset is a good benchmark, comparable to the MNIST database. Because it was compiled from house numbers in Google Street View pictures, this collection comprises a wide range of photographs, including blurry, low-resolution images. It is available in two formats: a cropped-digits image format similar to MNIST and a full house-door image format with digit bounding boxes. The dataset is large: 73,257 digits for training and 26,032 for testing. Experiments on Datasets: ICDAR 2015 was the first dataset experimented with. The most difficult aspect of this dataset is the variety of photos, which include background noise and clutter as well as blurry and low-resolution images. The outcomes can be seen in Fig. 3. With Recall, Precision, and H-mean scores of 64.2%, 79.53%, and 72.86%, the STN-OCR approach exceeds all other methods. The comparison is shown side by side in Fig. 4.
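The three scores reported here can be computed from match counts in the usual ICDAR fashion, with the H-mean being the harmonic mean of precision and recall; a sketch with illustrative counts (not the competition's actual matching protocol):

```python
def detection_scores(true_pos, false_pos, false_neg):
    """Recall, precision, and their harmonic mean (H-mean / F1)."""
    recall = true_pos / (true_pos + false_neg)
    precision = true_pos / (true_pos + false_pos)
    h_mean = 2 * precision * recall / (precision + recall)
    return recall, precision, h_mean

# e.g. 80 correctly matched boxes, 20 spurious detections, 40 missed boxes
r, p, h = detection_scores(true_pos=80, false_pos=20, false_neg=40)
```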
Fig. 3 Findings from the incidental scene text category of the 2015 International Conference on Document Analysis and Recognition

Fig. 4 STN-OCR performance
Table 1 STN-OCR performance

| Method | Recall (%) | Precision (%) | H-mean (%) |
|---|---|---|---|
| STN-OCR | 64.2 | 79.53 | 72.86 |
| Baidu VIS | 62.12 | 70.28 | 65.89 |
| HoText_v1 | 62.26 | 67.25 | 66.78 |
| FOTS | 52.19 | 74.59 | 64.42 |
Table 2 Results on SVHN dataset

| Method | Accuracy (%) |
|---|---|
| MaxoutCNN | 95 |
| ST-CNN | 95.3 |
| STN-OCR | 97.8 |
According to the data in Table 1, the proposed strategy has produced superior outcomes compared to the alternatives. The network design is also evaluated on the SVHN dataset to demonstrate that this model can be used on real data. House numbers in SVHN also contain noise. Finding, locating, and recognizing SVHN house numbers on a sampling grid proved a successful way of testing this study's network architecture. While random weights were used to initialize the research model during training, weights from an already-trained network were used to initialize the localization network for optimum results; better outcomes are generally obtained when using this localization network stage. A comparison of text recognition performance on real house numbers from the SVHN dataset is shown in Table 2. The system's accuracy on SVHN reached 97.8%, an improvement over its ICDAR 2015 results. Even on photos that previous research has struggled with, this study's model works well. For testing purposes, an image with a colour backdrop is processed in 2–3 s using Google's K80 GPU and 12 GB of RAM.
5 Conclusion

In this study, a single DNN was utilized to perform text detection and recognition (STN-OCR) using recent benchmark datasets such as ICDAR 2015. The two most important parts of this system are its text detection and recognition components: detected text regions are fed into a text recognition network, whose output is used to recognize the text areas in images. As a result, we were able to better detect text from several orientations. According to the findings, our model outperforms current best practices by a wide margin on the SVHN and ICDAR 2015 tests. Only whole sentences and lines are handled by this model. In the future, this model will be applied to other regional or well-known languages (such as Urdu/Hindi), and the geometric design will be adjusted to directly detect curved texts.
References 1. A. Alshanqiti, A. Bajnaid, A. Rehman, S. Aljasir, A. Alsughayyir, S. Albouq, Intelligent parallel mixed method approach for characterising viral Youtube videos in Saudi Arabia. Int. J. Adv. Comput. Sci. Appl. (2020) 2. Y. Xu, Y. Wang, W. Zhou, Y. Wang, Z. Yang, X. Bai, Textfield: Learning a deep direction field for irregular scene text detection. IEEE Trans. Image Process. (2019)
3. S. Khan, D.-H. Lee, M.A. Khan, A.R. Gilal, G. Mujtaba, Efficient edge-based image interpolation method using neighboring slope information. IEEE Access 7, 133539–133548 (2019) 4. S.L. Xue, F. Zhan, Accurate scene text detection through border semantics awareness and bootstrapping, in Proceedings of the European Conference on Computer Vision (ECCV) (2018), pp. 355–372 5. A.R. Gilal, J. Jaafar, L.F. Capretz, M. Omar, S. Basri, I.A. Aziz, Finding an effective classification technique to develop a software team composition model. J. Softw. Evol. Process 30(1), 1–12 (2018) 6. A. Sain, A.K. Bhunia, P.P. Roy, U. Pal, Multi-oriented text detection and verification in video frames and scene images. Neurocomputing 275, 1531–1549 (2018) 7. C. Bartz, H. Yang, C. Meinel, See: Towards semi-supervised endto-end scene text recognition, in 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (2018) 8. M. Liao, B. Shi, X. Bai, X. Wang, W. Liu, Textboxes: A fast text detector with a single deep neural network, in Thirty-First AAAI Conference on Artificial Intelligence (2017) 9. Y. Wei, Z. Zhang, W. Shen, D. Zeng, M. Fang, S. Zhou, Text detection in scene images based on exhaustive segmentation. Sig. Process. Image Commun. 50, 1–8 (2017) 10. Y. Zhu, C. Yao, X. Bai, Scene text detection and recognition: Recent advances and future trends. Front. Comp. Sci. 10(1), 19–36 (2016) 11. B. Xiong, K. Grauman. Text detection in stores using a repetition prior, in Proceeding of the WACV (2016) 12. S. Qin, R. Manduchi, A fast and robust text spotter, in Proceeding of the WACV (2016) 13. Z. Zhang, W. Shen, C. Yao, X. Bai, Symmetry-based text line detection in natural scenes, in Proceeding of CVPR (2015) 14. I. Posner, P. Corke, P. Newman, Using text-spotting to query the world, in 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (2010), pp. 3181–3186
Fruit and Leaf Disease Detection Based on Image Processing and Machine Learning Algorithms S. Naresh Kumar, Sankararao Majji, Tulasi Radhika Patnala, C. B. Jagadeesh, K. Ezhilarasan, and S. John Pimo
Abstract Climate change and population growth have recently brought huge environmental and agricultural challenges. Many technologies have been used to monitor and improve agricultural productivity, though modern agriculture has greatly limited its use of technology due to low resolution, destructiveness, cost, and sensitivity concerns. In this work, we have concentrated on fruit and leaf disease detection based on image processing. FCM, K-means, and ABC algorithms are used for the segmentation of images. The adapted ML algorithms are applied to healthy and unhealthy datasets to calculate the accuracy and error rate. The ABC algorithm gives 83% accuracy on the fruit dataset and 95% on the leaf dataset; likewise, low error rates of 16% on the fruit dataset and 4% on the leaf dataset have been identified. Keywords Machine learning algorithms · Image processing · Accuracy · Error rate · Fruit and leaf disease detection
S. Naresh Kumar School of Computer Science and Artificial Intelligence, SR University, Warangal, Telangana, India S. Majji (B) Department of Electronics and Communication Engineering, GRIET, Hyderabad, India e-mail: [email protected] T. R. Patnala Department of Electronics and Communication Engineering, GITAM University, Hyderabad, India C. B. Jagadeesh New Horizon College of Engineering, Bangalore, India K. Ezhilarasan Department of ECE, CMR University, Bangalore, Karnataka, India S. J. Pimo St. Xavier’s Catholic College of Engineering, Chunkankadai, Nagercoil, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_27
377
1 Introduction

Image processing is the process of enhancing an image or extracting valuable information from it by performing various operations on it. When an image is fed into a signal processing algorithm, the output can be a picture or other features associated with that image. Image processing is a rapidly growing field [1, 2]. It is used to enhance, segment, extract features from, and classify images in order to improve their quality and adapt them for use in various applications. To improve an image, one can alter the brightness, change the colour tone, and remove noise. To break an image into smaller pieces, image segmentation can be used [3]; objects in digital photographs are often identified with this technique. Image segmentation can be done in a variety of ways, for example using threshold, colour, transform, or texture as a basis [4].

Feature extraction is a form of dimensionality reduction that reduces the number of pixels in an image by retaining only the most important and informative elements. Image matching and retrieval can be expedited by combining a reduced feature representation with a large image size in this strategy [5]. When photos are classified, they are assigned to one of a number of predefined groups; classification is divided into supervised and unsupervised approaches. The main objectives of digital image processing are improving pictorial information for human interpretation and processing image data for storage, transmission, and representation in autonomous machine perception [6].

Leaf photos are grouped under four categories: fruit crops, vegetables, cereal crops, and commercial crops. Crop diseases are used to further categorise these photos. Only illnesses that are prevalent in most crops are included in this dataset; fungal, viral, and bacterial diseased pictures are also present [7].
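As a toy illustration of the threshold-based segmentation mentioned above (the pixel values and the threshold are our own made-up example, not taken from the paper):

```python
# Toy threshold segmentation: pixels above the threshold become
# "foreground" (1), everything else becomes background (0).
def threshold_segment(image, threshold):
    """image: list of rows of grayscale values in [0, 255]."""
    return [[1 if px > threshold else 0 for px in row] for row in image]

# A 3x3 toy image: a bright diagonal on a dark background.
toy = [[200,  30,  10],
       [ 25, 210,  40],
       [ 15,  20, 220]]
mask = threshold_segment(toy, 128)
# mask -> [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

In practice the threshold would be chosen per image (e.g. by Otsu's method); a fixed value is used here only to keep the sketch self-contained.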
A total of 2912 photos are included in the dataset, which was gathered from the Web and universities, with 728 photos in each category. For each category, 364 photos were used in the training process and 364 in the testing phase. Figure 1 depicts a few representative images from the collection. To examine the efficiency of the presented technique, it is also tested against a fruit image gallery dataset of 609 fruit images [7] (Fig. 2).
2 Background

Feature extraction techniques provide a useful set of reduced feature vectors that summarise virtually all of the information contained in the original set of features; these reduced feature vectors are helpful for a variety of applications [8]. A machine learning (ML) classification model is used to assign class labels to the feature vectors in order to categorise them. The primary focus of this research is on image-based machine learning and pattern identification.
Fig. 1 Data set of different types of leaf disease
The advancement of pattern recognition systems has resulted in the creation of a large number of classification models [9]. The features depicted in Fig. 3 were obtained through pre-processing of the training dataset; during the pre-processing stage, noise reduction and picture verification are carried out. To construct a model that can represent multiple and full feature representations, features are extracted from their original forms. Unfortunately, optimising this procedure is quite difficult, yet it is necessary in order to build functionality for all applications [10].
3 Methodology

This technique first preprocesses the leaf image. Affected and unaffected parts of the image are separated from the rest of the picture using thresholding; once the non-disease zones have been cleared, only disease zones remain [11]. In the feature extraction stage, each disease region is represented by a different set of features (Fig. 4). A previously built classifier model is used to categorise the different disease regions [12]. Finally, the detection results are evaluated and analysed using the performance data collected during the process. The proposed study includes sections on pre-processing, segmentation, feature extraction, and classification, which are discussed in the following part.
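The feature extraction stage can be sketched minimally by summarising a segmented region with two simple descriptors, area and mean intensity. These particular features are an illustrative assumption on our part, not the paper's exact feature set:

```python
# Summarise the pixels selected by a binary mask with a tiny
# feature vector (region area and mean intensity).
def region_features(image, mask):
    """image and mask are lists of rows; mask holds 0/1 values."""
    pixels = [px for img_row, mask_row in zip(image, mask)
              for px, m in zip(img_row, mask_row) if m == 1]
    area = len(pixels)
    mean_intensity = sum(pixels) / area if area else 0.0
    return {"area": area, "mean_intensity": mean_intensity}

img  = [[200, 30], [25, 210]]
mask = [[1, 0], [0, 1]]
feats = region_features(img, mask)
# feats -> {"area": 2, "mean_intensity": 205.0}
```

In a full pipeline a vector like this (typically with colour and texture descriptors as well) would be handed to the classifier.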
Fig. 2 Data set of different types of fruit disease
Fig. 3 Block diagram for the test phase of the detection
Fig. 4 Flow chart of the classification process
For their general growth, transpiration, and nutritional function, plants’ leaves are completely reliant on water owing to their highly sensitive capacities [13]. As a result, the timely and exact measurement of resource inputs such as water may be extremely beneficial to a sophisticated agricultural system in many ways. Over the course of four days in the laboratory, the antioxidant properties of coffee, pea shoot, and spinach leaves were assessed. After pre-processing for feature extraction, the data is fed into our proposed machine learning algorithms, which then perform automatic classification [14]. The K-mean, FCM, and ABC algorithms were used to evaluate the machine learning methods.
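To illustrate how one of the segmentation algorithms named above operates, here is a minimal one-dimensional K-means sketch (our own toy example on made-up pixel intensities, not the authors' implementation):

```python
import statistics

# Minimal 1-D K-means: repeatedly assign each value to its nearest
# center, then move each center to the mean of its cluster.
def kmeans_1d(values, centers, iters=10):
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep an empty cluster's old center unchanged.
        centers = [statistics.mean(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

# Pixel intensities with an obvious dark/bright split:
centers = kmeans_1d([10, 12, 14, 200, 202, 204], centers=[0, 255])
# centers -> [12, 202]
```

FCM and ABC follow the same "group pixels by similarity" idea, with soft memberships and swarm-based search respectively.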
4 Results and Discussion

The classification system can be analysed with the help of the K-mean, FCM, and ABC algorithms. In this work, we focus on accuracy and error rate analysis. The ELM approach is used as the classifier in this experiment because it gives the best result. Among five performance metrics, this experiment takes only detection accuracy and error rate as the performance metrics. Table 1 shows the detection accuracy analysis of the K-means, FCM, and ABC methods; a graph can be plotted based on Table 1. The ABC method gives higher accuracy than the K-mean and FCM methods, as shown in Fig. 5. Table 2 presents the error rate analysis with the mentioned segmentation methods; the ABC method gives a lower error rate than the K-mean and FCM methods (Fig. 6).

Table 1 Segmentation method accuracy analysis

Type of test data set | K-Mean | FCM   | ABC
Fruit data set        | 0.753  | 0.762 | 0.832
Leaf data set         | 0.902  | 0.924 | 0.956
Fig. 5 Graph for accuracy analysis of the different segmentation methods
Table 2 Error rate analysis with mentioned segmentation methods

Type of test data set | K-Mean | FCM   | ABC
Fruit data set        | 0.247  | 0.238 | 0.168
Leaf data set         | 0.098  | 0.076 | 0.044
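The error rates in Table 2 are simply the complements of the accuracies in Table 1 (error rate = 1 − accuracy), which can be checked directly, e.g. for the ABC column:

```python
# ABC accuracies from Table 1 and ABC error rates from Table 2:
accuracy = {"fruit": 0.832, "leaf": 0.956}
error = {"fruit": 0.168, "leaf": 0.044}

# Verify error rate = 1 - accuracy for every dataset:
consistent = all(abs((1 - accuracy[k]) - error[k]) < 1e-9 for k in accuracy)
# consistent -> True
```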
Fig. 6 Graph for error analysis of the different segmentation methods
5 Conclusion

These days, most farmers face problems in the form of leaf and fruit disease. This paper works on the detection of leaf and fruit diseases based on image processing with the help of machine learning classification algorithms. Here, the K-mean, FCM, and ABC algorithms are used for the classification process, and the accuracy and error rate are calculated. Finally, all the methods used in the classification are compared. From the results, the ABC method gives the lowest error rate and the highest accuracy in the segmentation process.
References 1. R. Dhaya, Flawless identification of fusarium oxysporum in tomato plant leaves by machine learning algorithm. J. Innovative Image Proc. (JIIP) 2(04), 194–201 (2020) 2. A. Sungheetha, R. Sharma, Design an early detection and classification for diabetic retinopathy by deep feature extraction based convolution neural network. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(02), 81–94 (2021) 3. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019) 4. Y. Yuan, Z. Xu, G. Lu, SPEDCCNN: Spatial pyramid-oriented encoder-decoder cascade convolution neural network for crop disease leaf segmentation. IEEE Access 9, 14849–14866 (2021). https://doi.org/10.1109/ACCESS.2021.3052769 5. J.S. Manoharan, Study of variants of extreme learning machine (ELM) brands and its performance measure on classification algorithm. J. Soft Comput. Paradigm (JSCP) 3(02), 83–95 (2021) 6. T. Vijayakumar, Comparative study of capsule neural network in various applications. J. Artif. Intell. 1(01), 19–27 (2019) 7. Y. Zhang, C. Song, D. Zhang, Deep learning-based object detection improvement for tomato disease. IEEE Access 8, 56607–56614 (2020). https://doi.org/10.1109/ACCESS.2020.2982456 8. K.S. Patle, R. Saini, A. Kumar, S.G. Surya, V.S. Palaparthy, K.N. Salama, IoT enabled, leaf wetness sensor on the flexible substrates for in-situ plant disease management. IEEE Sens. J. 21(17), 19481–19491, 1 Sept, 2021, https://doi.org/10.1109/JSEN.2021.3089722
9. J. Sun, Y. Yang, X. He, X. Wu, Northern maize leaf blight detection under complex field environment based on deep learning. IEEE Access 8, 33679–33688 (2020). https://doi.org/10.1109/ACCESS.2020.2973658 10. Q. Zeng, X. Ma, B. Cheng, E. Zhou, W. Pang, GANs-based data augmentation for citrus disease severity detection using deep learning. IEEE Access 8, 172882–172891 (2020). https://doi.org/10.1109/ACCESS.2020.3025196 11. A. Khattak et al., Automatic detection of citrus fruit and leaves diseases using deep neural network model. IEEE Access 9, 112942–112954 (2021). https://doi.org/10.1109/ACCESS.2021.3096895 12. M. Kumar, A. Kumar, V.S. Palaparthy, Soil sensors-based prediction system for plant diseases using exploratory data analysis and machine learning. IEEE Sens. J. 21(16), 17455–17468, 15 Aug 2021. https://doi.org/10.1109/JSEN.2020.3046295 13. S. Janarthan, S. Thuseethan, S. Rajasegarar, Q. Lyu, Y. Zheng, J. Yearwood, Deep metric learning based citrus disease classification with sparse data. IEEE Access 8, 162588–162600 (2020). https://doi.org/10.1109/ACCESS.2020.3021487 14. G. Yang, G. Chen, Y. He, Z. Yan, Y. Guo, J. Ding, Self-supervised collaborative multi-network for fine-grained visual categorization of tomato diseases. IEEE Access 8, 211912–211923 (2020). https://doi.org/10.1109/ACCESS.2020.3039345
A Survey for Determining Patterns in the Severity of COVID Patients Using Machine Learning Algorithm Prachi Raol, Brijesh Vala, and Nitin Kumar Pandya
Abstract In the recent era of medical domain research, a decision support system is the need of the hour. Various technological analyses based on historical data and clinical results are used for prediction and classification in the medical domain. Support vector machine, Naive Bayes, and random forest are examples of machine learning algorithms that have been demonstrated to be effective in medical studies such as heart disease, Alzheimer’s disease, and brain tumor categorization. COVID infection is a worldwide issue, and the symptoms and diagnosis of COVID-affected patients vary by age group, region, and variant of the virus. There is large scope for a system that can help medical staff and government organizations arrange resources and provide better treatment to patients. Early prediction of the severity of a COVID patient helps in better treatment. As with other diseases, machine learning can also play an important role in COVID pattern prediction. There are numerous problems in this research project due to a shortage of raw data and the need to uncover patterns. In this paper, we have examined some of the research that has been done by different researchers. A prediction or forecasting model may be built that can well characterize and specify the severity of the current COVID-19 disease infection rate in clinical diagnosis and provide support for clinicians in scientific and rational medical and treatment-related decision making. We have tried to find their pros and cons and to find a better solution that can improve the overall result. Keywords Machine learning · Medical · Prediction · COVID · SVM · Naïve Bayes · Pattern mining · Patient care · Deep learning · Classification
P. Raol (B) · B. Vala · N. K. Pandya Parul Institute of Engineering and Technology, Vadodara, India e-mail: [email protected] B. Vala e-mail: [email protected] N. K. Pandya e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_28
385
1 Introduction

Health care and patient care are important sectors which offer value-based care to millions of people. Resource management and timely care are the main keys for the health sector [1]. In most cases apart from COVID, enough patient-care resources have been arranged in most countries, and treatment strategies are also smooth for most diseases due to various kinds of technical support. But COVID has tested the limits of healthcare services in most regions due to its variants, its different types of effects, and its spreading methods. Since the evolution of machine learning, there is hardly any domain where ML has not left its impact, and health care is no exception: machine learning technology and its algorithms have proven their efficiency in health care as well. Many machine learning-based models are useful for diagnosis systems, and supportive systems are used in decision making for various health conditions like brain tumor, heart diseases, etc. For COVID, some research has been performed by various researchers. In this survey paper, we have reviewed some of the research work that has been carried out on the use of machine learning algorithms for COVID. “Machine learning in health informatics can streamline recordkeeping, including electronic health records (EHRs). Using AI to improve EHR management can improve patient care, reduce healthcare and administrative costs, and optimize operations” [2]. Some of the algorithms used for this type of work are discussed below.

Classification using Machine Learning The classification algorithm is a technique used to identify the category of new observations or samples based on provided training data. In classification, a model learns from the input dataset samples (training data) or observations and then classifies a new sample into one of a number of classes or categories. The classes may be Yes or No, 0 or 1, spam or not spam, fake or not fake, etc.
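A minimal sketch of such a classifier, in the spirit described above: it learns one prototype per class from labelled training samples and assigns a new sample to the nearest one. The single made-up feature and the "mild"/"severe" labels are purely illustrative and not drawn from the surveyed papers:

```python
# Nearest-centroid classifier: learn one centroid per class from
# the training samples, then assign a new sample to the closest one.
def fit_centroids(samples, labels):
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def classify(x, centroids):
    return min(centroids, key=lambda y: abs(x - centroids[y]))

# Toy 1-D feature (e.g. a single lab value) with two classes:
centroids = fit_centroids([1.0, 1.2, 3.8, 4.0],
                          ["mild", "mild", "severe", "severe"])
label = classify(3.5, centroids)
# label -> "severe"
```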
Classes can be referred to as targets/labels or categories in technical terms (Fig. 1).

Prediction using Machine Learning Prediction in machine learning refers to the outcome of an algorithm after it has been trained on a historical dataset and applied to new testing data with the aim of forecasting. Examples of prediction in machine learning include weather prediction, customer purchase
Fig. 1 Classification example [3]
prediction, query prediction, etc. Mostly it works on probability criteria. Prediction has no general real-world meaning in machine learning; it is purely based on a mathematical model. “Linear regression is perhaps one of the most well-known and well-understood algorithms in statistics and machine learning” [4]. Linear regression is represented by an equation that defines a line that best matches the relationship between the input variables (x) and the output variables (y) by determining precise weightings for the input variables, known as coefficients (B). Other algorithms like SVM, decision tree, and XGBoost are also being used; average results for these algorithms are shown in the last section of this paper. The main challenges for COVID prediction are the lack of a dataset and insufficient information regarding COVID treatment and its different symptoms. Due to the lack of data, applying feature selection methods may not provide proper information. We have tried to carry out a study of the current methodologies.
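The single-variable linear regression just described can be written out directly, with the coefficients B0 (intercept) and B1 (slope) obtained by ordinary least squares:

```python
# Ordinary least squares for the single-variable case:
# y = B0 + B1 * x, with B1 = cov(x, y) / var(x).
def fit_line(xs, ys):
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
         / sum((x - mx) ** 2 for x in xs)
    b0 = my - b1 * mx
    return b0, b1

b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
# b0, b1 -> 1.0, 2.0  (the points lie exactly on y = 1 + 2x)
```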
2 Related Works

In a study [1], researchers used a machine learning algorithm to predict whether COVID-19 patients would require mechanical ventilation within 24 h of their initial hospital admission. They tested their model on 197 patients [1]. The following graph shows the early warning score of their system (Figs. 2 and 3). In their study, the researchers used three classification algorithms: logistic regression (LR), random forest (RF), and extreme gradient boosting (XGB). In this study, data of 287 COVID-19 patients at Saudi Arabia’s King Fahad University Hospital was used, and they tried to predict the deceased from the dataset. Tables 1 and 2 display the results found by their algorithms. Based on their results, they applied SMOTE analysis and used accuracy, sensitivity, specificity, and F-score as their research parameters. Table 1 shows the detailed results for the linear regression, random forest, and XGBoost algorithms. In their research, a lack of data is evident, and anomaly is the main issue, as the death ratio for COVID cases
Fig. 2 Early warning score used in model [1]
Fig. 3 Flow for COVID data classification [5]: Start → COVID-19 patients’ data → data cleaning (cleaned data) → data transformation (transformed data) → applying machine learning algorithms → output: classifying COVID-19 cases into positive or negative → End
is nearly 2%, so it is difficult to find a pattern for death cases. They tried to find it with various machine learning algorithms and obtained significant results. In their opinion, anomaly detection can be an important key for death prediction, and various anomaly detection algorithms and methods can improve the result. In another research work, “Predicting the COVID-19 infection with fourteen clinical features using machine learning classification algorithms” [5], the researchers also tried to classify a COVID dataset. Based on 14 clinical variables, their study produced six prediction models using six distinct classifiers for COVID-19 diagnosis (i.e., BayesNet (NB), logistic regression, the IBk algorithm,
Table 1 Literature review table

Title | Method/Parameters | Limitations
“Prediction of respiratory de-compensation in COVID-19 patients using machine learning: the READY trial” [1] | Linear regression; positive/negative score in percentage | More data is required for improving the result
“Machine learning-based model to predict the disease severity and outcome in COVID-19 patients” [13] | Logistic regression (LR), random forest (RF) and extreme gradient boosting (XGB); evaluation parameters: accuracy and F-score | Imbalanced data to be managed
“Predicting the COVID-19 infection with fourteen clinical features using machine learning classification algorithms” [5] | BayesNet, logistic, IBk, CR, PART and J48; evaluation parameters: accuracy and F-score | A better feature selection method can lead to a better result
“Clinical and inflammatory features-based machine learning model for fatal risk prediction of hospitalized COVID-19 patients: results from a retrospective cohort study” [6] | XGBoost; evaluation parameters: AUC and ROC | Most data is from Wuhan and for a single variant of COVID; more data can improve the result
“Severity detection for the Coronavirus disease 2019 (COVID-19) patients using a machine learning model based on the blood and urine tests” [7] | SVM; evaluation parameters: accuracy and F-score | The findings may not be applicable to people of other ethnicities because the study focused only on Chinese patients with COVID-19
“Supervised machine learning models for prediction of COVID-19 infection using epidemiology dataset” [8] | Decision tree; evaluation parameters: accuracy and F-score | According to their studies, the decision tree model has the maximum accuracy of 94.99%
“Prediction of COVID-19 cases using CNN with X-rays” [9] | CNN and GoogLeNet; evaluation parameter: accuracy | Their outcomes have 99% training accuracy and 98.5% testing accuracy
“Coronavirus disease (COVID-19) global prediction using hybrid artificial intelligence method of ANN trained with grey wolf optimizer” [10] | ANN-GWO; evaluation parameters: MAPE and correlation coefficient (CR) | ANN-GWO gave MAPE of 6.23, 13.15 and 11.4%
“Forecasting COVID-19 via registration slips of patients using ResNet-101 and performance analysis and comparison of prediction for COVID-19 using faster R-CNN, mask R-CNN, and Res-Net-50” [11] | R-CNN; accuracy has been used as evaluation parameter | The dataset was X-ray images; the best accuracy of 87% was achieved by a faster R-CNN
“COVID-19 prediction and detection using deep learning” [12] | LSTM; evaluation parameter: accuracy | They achieved average accuracy of 94.80% and 88.43%
Table 2 Algorithm accuracy

Source | Algorithm | Accuracy (%)
[5] | Multivariate random forest | 80
[5] | Decision tree | 82–86
[13] | Logistic regression without SMOTE | 87
[13] | Logistic regression with SMOTE | 86
[5] | CR | 84.21
[7] | SVM | 81
[8] | ANN | 89
CR, PART, and J48 algorithms). Their research looked back at 114 instances from the Taizhou hospital in Zhejiang Province, China. This system’s detailed steps are shown in Fig. 3. We have also reviewed the research “Clinical and inflammatory features-based machine learning model for fatal risk prediction of hospitalized COVID-19 patients: results from a retrospective cohort study” [6]. The researchers used 48 features in their study, including clinical and laboratory information. All features were screened using the LASSO approach, and the importance of each feature selected using LASSO was then ranked using a machine learning model based on multi-tree extreme gradient boosting. The simple-tree XGBoost model was then used to develop a death risk prediction model, whose performance was evaluated using AUC, prediction accuracy, precision, and F1-scores. In another research [7], the support vector machine (SVM) was applied. In the research paper “Supervised Machine Learning Models for Prediction of COVID-19 Infection Using Epidemiology Dataset” [8], supervised machine learning algorithms were employed for COVID-19 [8]. The authors employed a variety of techniques, including logistic regression, decision trees, support vector machines, Naive Bayes, and artificial neural networks. The dataset was tagged for positive and negative COVID-19 instances in Mexico by the researchers. They also used correlation coefficient analysis on a variety of features to determine the strength of each dependent and independent feature’s link. The testing-to-training data ratio was 20:80. Their findings show that, at 94.99%, their decision tree model is the most accurate. The highest sensitivity is 93.34% for the support vector machine (SVM) model, and the highest specificity is 94.30% for the Naive Bayes model.
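The evaluation parameters recurring in these studies (sensitivity, specificity, F-score) follow directly from confusion-matrix counts; the counts below are made-up illustrative numbers, not taken from any of the surveyed papers:

```python
# Sensitivity, specificity and F-score from confusion-matrix counts.
def metrics(tp, fp, tn, fn):
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f_score

# Illustrative counts for a rare-positive problem (like COVID deaths):
sens, spec, f1 = metrics(tp=8, fp=5, tn=85, fn=2)
# sens -> 0.8; spec and f1 come out near 0.944 and 0.696
```

Note how a rare positive class keeps specificity high while sensitivity and F-score stay modest, which is exactly why imbalanced data (and remedies like SMOTE) matter in these studies.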
In another research, a transfer learning model was used in the study “Prediction of COVID-19 Cases Using CNN with X-rays” [9]. For COVID-19 prediction, the authors used GoogLeNet, a CNN architecture also called InceptionV1, to classify chest X-ray images used as the dataset. Positively classified images indicate that COVID-19 is present in the X-ray. Their outcomes show 99% training accuracy and 98.5% testing accuracy. In the research work “Coronavirus Disease (COVID-19) Global Prediction Using Hybrid Artificial Intelligence Method of ANN Trained with Grey Wolf Optimizer” [10], the study seeks to use a grey wolf optimizer integrated with an artificial neural network for COVID-19 predictions. A global time-series dataset, covering January 22 to September 15, 2020, was used for training and testing. To assess the results, they employed mean absolute percentage error (MAPE) and correlation coefficient values; ANN-GWO gave MAPEs of 6.23%, 13.15%, and 11.4% in their study. In the research work “Forecasting COVID-19 via Registration Slips of Patients using ResNet-101 and Performance Analysis and Comparison of Prediction for COVID-19 using Faster R-CNN, Mask R-CNN, and ResNet-50” [11], the study has two main dimensions. In the first dimension, patients’ registration slips were used: ResNet-101 was applied to an indigenous dataset of 5003 patients’ registration slips with exact timing, and according to their model, the prediction accuracy in terms of time was 82%. X-ray data was used in the second dimension: 8009 chest X-rays were used, and three neural networks, faster R-CNN, ResNet-50, and Mask-CNN, were applied over the X-ray dataset. The best accuracy of 87% was achieved by the faster R-CNN.
In the study “COVID-19 Prediction and Detection Using Deep Learning”, the authors worked on a deep convolutional neural network-based artificial intelligence technique [12]. The main goal was to use real-world datasets to detect COVID-19 patients; their system examines chest X-ray images to identify them. As per their research and findings, X-rays can be obtained faster and at lower cost. For testing purposes, a total of 1000 X-ray scans of real-life patients were tested, and using that dataset, they confirmed that the ML system may be beneficial in recognizing COVID-19, with an F-measure in the 95–99% range. They used three forecasting approaches in the next phase: the Prophet algorithm, the autoregressive integrated moving average (ARIMA) model, and the long short-term memory (LSTM) neural network. In Jordan and Australia, they had average accuracies of 88.43% and 94.80%, respectively. Tables 1 and 2 explain the pros and cons found by our survey. As COVID research is still in its initial stage, there is much scope for improving the various results; even currently tested models may not be applicable to the next variants, as symptoms and features differ between variants and regions.
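Alongside the forecasting approaches above (Prophet, ARIMA, LSTM), the basic idea of forecasting the next value of a case-count series can be illustrated with a naive moving-average baseline. This baseline is our own illustration on made-up numbers, not one of the surveyed models:

```python
# Naive forecaster: predict the next value of a series as the mean
# of its last `window` observations.
def moving_average_forecast(series, window=3):
    recent = series[-window:]
    return sum(recent) / len(recent)

# Made-up daily case counts:
daily_cases = [120, 130, 150, 160, 170]
forecast = moving_average_forecast(daily_cases)
# forecast -> 160.0
```

A baseline like this is useful mainly as a yardstick: a learned model such as ARIMA or an LSTM should beat it before its extra complexity is justified.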
3 Conclusion

We have studied some of the research work done by various researchers on COVID data classification and prediction. We found that data collection is the main issue for this problem and that imbalanced data leads to inaccurate results. We are going to develop a system that manages the data properly and classifies and predicts the severity for the patient. In the future, we will gather COVID data from various regional sources and apply the classification algorithm that best suits this imbalanced data.
References 1. H. Burdick et al., Prediction of respiratory decompensation in COVID-19 patients using machine learning: the READY trial. Comput. Biol. Med. 124, 103949 (2020) 2. Harvard Business Review, “Using AI to Improve Electronic Health Records”. https://hbr.org/2018/12/using-ai-to-improve-electronic-health-records 3. University of Illinois Chicago, “Machine Learning in Healthcare: Examples, Tips & Resources for Implementing into Your Care Practice”. https://healthinformatics.uic.edu/blog/machine-learning-in-healthcare 4. Built In BETA, The top 10 machine learning algorithms every beginner should know. https://builtin.com/data-science/tour-top-10-algorithms-machine-learning-newbies 5. I. Arpaci et al., Predicting the COVID-19 infection with fourteen clinical features using machine learning classification algorithms. Multimedia Tools Appl. 80(8), 11943–11957 (2021) 6. X. Guan et al., Clinical and inflammatory features based machine learning model for fatal risk prediction of hospitalized COVID-19 patients: results from a retrospective cohort study. Ann. Med. 53(1), 257–266 (2021) 7. H. Yao et al., Severity detection for the coronavirus disease 2019 (COVID-19) patients using a machine learning model based on the blood and urine tests. Front. Cell Dev. Biol. 8, 683 (2020) 8. L.J. Muhammad et al., Supervised machine learning models for prediction of COVID-19 infection using epidemiology dataset. SN Comput. Sci. 2(1), 1–13 (2021) 9. D. Haritha, N. Swaroop, M. Mounika, Prediction of COVID-19 cases using CNN with X-rays, in 2020 5th International Conference on Computing, Communication and Security (ICCCS) (IEEE, 2020) 10. S. Ardabili et al., Coronavirus disease (COVID-19) global prediction using hybrid artificial intelligence method of ANN trained with Grey Wolf optimizer, in 2020 IEEE 3rd International Conference and Workshop in Óbuda on Electrical and Power Engineering (CANDO-EPE) (IEEE, 2020) 11. H. Tahir, A. Iftikhar, M.
Mumraiz, Forecasting COVID-19 via registration slips of patients using ResNet-101 and performance analysis and comparison of prediction for COVID-19 using Faster R-CNN, Mask R-CNN, and ResNet-50, in 2021 International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT) (IEEE, 2021) 12. M. Alazab et al., COVID-19 prediction and detection using deep learning. Int. J. Comput. Inf. Syst. Ind. Manage. Appl. 12, 168–181 (2020) 13. S.S. Aljameel et al., Machine learning-based model to predict the disease severity and outcome in COVID-19 Patients. Sci. Program. 2021 (2021)
Relation Extraction Between Entities on Textual News Data Saarthak Mehta, C. Sindhu, and C. Ajay
Abstract Relation extraction is one of the key components of information extraction. The primary step in relation extraction is entity recognition. Identifying the entities in a sentence is a highly complex task that requires multilayered models using computationally heavy algorithms, and the performance of these models has been highly dependent on natural language processing tools. Neural networks with different algorithms and activation functions, which are not dependent on these tools, are used instead. Begin, interior, outside tags are used as the tagging scheme for the dataset. Each word in a line or paragraph is tokenized, and then each word goes through a named entity recognition process; the tokenized words are tagged in the begin, interior, and outside tag format. An attempt to tackle the issue of relation extraction is made using conditional random fields to identify the entities and a gated recurrent unit for contextual data recognition, finally extracting the relations between these entities. The begin, interior, and outside tagging scheme is used to figure out the type of entity, which helps in classifying the entities better. Keywords Gated recurrent units · Conditional random fields · Begin input and outside tagging scheme · Named entity recognition · Relation extraction
S. Mehta · C. Sindhu (B) · C. Ajay, Department of Computing Technologies, SRM Institute of Science and Technology Kattankulathur, Kattankulathur 603203, India. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_29

1 Introduction
Natural language processing came up as a means for humans to automate day-to-day work with the help of a computer. It is concerned specifically with the ability of computers to understand human language semantically and lexically, and the quest to automate more natural language processing problems has continued since. It is in demand for bringing the level of understanding between computers and humans to a higher ground. Natural language processing has the ability to solve many issues and enable new innovations. These tasks include speech recognition, part-of-speech tagging, word sense disambiguation [1], sentiment
analysis [2, 17, 20] and natural language generation. To solve these issues, natural language processing needed tools for specific tasks; for this purpose, the Natural Language Toolkit (NLTK) library was introduced. The NLTK library can tokenize a sentence, perform stemming and lemmatization of every word, and provide further functionality. Today, natural language processing is going beyond the scope of rule-based systems; it is turning toward machine learning and deep learning to solve the issues posed by researchers. Its use cases include spam detection, machine translation, virtual agents, chatbots, text summarization, etc. Relation extraction [16] is one of the most important tasks in natural language processing, and one that researchers did not work on until recently. In this problem, there exist entities which have relations between each other, and the task is to extract these relations. An entity may be an organization, person, location or any other designation. The relation between two entities could signify, for example, that one person lives in or was born in a place. The goal is to correctly identify the entities as well as the relation which exists between them using deep learning, with high accuracy. Before this task, named entity recognition needs to take place: the tokenized words are individually fed into the algorithm, after which each word is given a tag indicating what it signifies, if the word belongs to one of the predefined groups in the library. The use cases of relation extraction lie in the fields of medical and chemical research as well as general research.
This could serve to identify symptoms and causes of diseases, or something completely different, such as finding out which individual was born in which city. The scope of relation extraction reaches not only science but also the fields of art and journalism. Previous works on relation extraction used NLP tools such as named entity recognition, dependency parsers and POS tagging. These tools would complete the task but proved inefficient because of the dependency on them and on external features made specifically for the task. Liu et al. [3] used a dependency-based neural network which extracts relations and improves performance with other external features, and thus requires those features. Kanjirangat and Rinaldi [4] likewise relied on the shortest dependency path, an external feature on which their models are based. Another issue is that multiple entities can have multiple relations, but a typical relation extraction system returns only two entities and one relation per sentence. To tackle this problem, the process of extracting multiple relations between entities was realized by [5, 6]. However, neither of these works tried to apply the GRU, a computationally efficient RNN. A GRU has an update gate and a reset gate, which provide functionality similar to the three gates of an LSTM; the GRU thus proves to be a cheaper option. In this paper, the task is to build a model based on gated recurrent units and conditional random fields to handle relation extraction and entity classification. The multi-head case, where multiple relations can exist between multiple entities, is also handled. Our model is more computationally efficient than that of [5]. The model is applied on the CONLL04 dataset,
and the goal is to extract relations and entities, obtaining the results without using external features such as a parser or part-of-speech tagging.
2 Literature Survey
The idea of adding semantic information to turn unstructured text into structured text is a popular one. In order to make an accurate annotation, the computer must be able to distinguish a piece of text with a semantic attribute of interest. As a result, extracting semantic relationships between elements in natural language text is a critical step in developing natural language understanding systems. The main focus of this paper is on strategies for identifying relationships between elements in unstructured text. In this section, the different methods currently in use, or used in the past, are listed and discussed (Table 1). Recurrent neural networks, gated recurrent units, conditional random fields and named entity recognition were the methods chiefly used. Other methods include sentence splitting, which plays a vital role for large textual documents since processing a large paragraph would require high computational capability, and lexical processing, in which each word of the sentence is broken down and the algorithm tries to recognize each word. The following is the list of related works:

Table 1 Papers with unexplored fields

Reference | Task | Unexplored fields
[7] | Extract lexical and sentence-level information using deep learning neural networks | A self-attention mechanism is something to add
[6] | Independent of external features; novel tagging scheme; exploits self-attention to capture long-range dependencies | Implementing GRUs was not explored
[8] | Implemented the shortest dependency path from the dependency parse tree | The BioWord2Vec word embeddings were not tried for performance
[5] | An LSTM [18] and CRF method to extract entities and relations | Applying other neural methods was unexplored
[9] | CNN with character embedding; a Bi-LSTM [15] embedding with tanh activation | Implementing GRUs was not explored
2.1 Recurrent Neural Networks
In a recurrent neural network, the input of the next step is based on the output from the last step. Usually, the inputs are independent of each other, but in some cases the previous words are needed and must be remembered; the RNN has a hidden layer to solve this problem. The most important component of an RNN is the hidden state, which remembers specific information about a sequence. Although a step forward in many directions, the RNN has multiple shortcomings: (i) exploding and vanishing gradient problems, (ii) RNNs are difficult to train, and (iii) when utilizing tanh or ReLU as the activation function, it cannot handle very long sequences.
2.2 Gated Recurrent Units
Gated recurrent units (GRUs) use an update gate and a reset gate, together with the current memory content and the final memory at the current step, to address the vanishing gradient problem of a plain recurrent neural network (RNN). Essentially, the gates are two vectors that determine what information should be passed to the output. They are unique in that they can be trained to retain knowledge from long ago without it being washed away by time, or to discard information that is unnecessary for the prediction [18]. Update gate: holds the information for the previous t−1 units and is multiplied by its own weight U(z); the results are added together, and a sigmoid activation function squashes the result between 0 and 1. It essentially eliminates the risk [19] of the vanishing gradient problem. Reset gate: used by the model to decide how much past information to forget; it uses the same form of formula as the update gate. Current memory content: collects the knowledge from the past that is relevant and combines it with the input of the current cycle. Final memory at current step: as the final stage of the cycle, the network calculates the vector that holds the information for the current unit and feeds it down the network; the update gate is needed for this.
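The gate computations above can be sketched as a minimal GRU cell. This is a pure-Python illustration with a scalar input and scalar hidden state for clarity; the weight values are made up, not taken from any trained model, and we use the h_t = (1−z)·h_{t−1} + z·h̃ convention (some references swap z and 1−z):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_cell(x, h_prev, W, U, b):
    """One GRU step with scalar input x and scalar hidden state h_prev.
    W, U, b hold scalar weights for the z (update), r (reset) and
    h (candidate) computations."""
    z = sigmoid(W['z'] * x + U['z'] * h_prev + b['z'])            # update gate
    r = sigmoid(W['r'] * x + U['r'] * h_prev + b['r'])            # reset gate
    h_cand = math.tanh(W['h'] * x + U['h'] * (r * h_prev) + b['h'])  # current memory content
    return (1.0 - z) * h_prev + z * h_cand                        # final memory at current step

# Illustrative weights; a real GRU layer learns these during training.
W = {'z': 0.5, 'r': 0.5, 'h': 1.0}
U = {'z': 0.5, 'r': 0.5, 'h': 1.0}
b = {'z': 0.0, 'r': 0.0, 'h': 0.0}

h = 0.0
for x in [1.0, -0.5, 0.2]:  # a toy input sequence
    h = gru_cell(x, h, W, U, b)
print(h)
```

Because the output is a convex combination of the previous state and a tanh-bounded candidate, the hidden state always stays in (−1, 1).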
2.3 Conditional Random Fields
A conditional random field (CRF) is a sequence modeling algorithm used to identify entities or patterns in text, such as part-of-speech (POS) tags. It is widely considered one of the strongest methods for entity recognition. Conditional random fields are a type
of discriminative model well suited to prediction tasks in which the current prediction is influenced by contextual information or the state of the neighbors. CRF overcomes the label bias issue found in the maximum entropy Markov model (MEMM) by using a global normalizer.
2.4 Named Entity Recognition
In any written composition, there are special terms that indicate specific entities that are more informative and have a unique context. These are referred to as named entities (Table 2): terms that describe real-world items and are frequently signified by proper names. A basic method may be to look for these among the noun phrases of text documents. The task is to recognize and segment named items, and to classify and sort them into several predetermined categories; spaCy is one of the most widely used libraries for this.
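The BIO tagging scheme that this paper pairs with named entity recognition can be illustrated with a small hand-rolled converter from labeled entity spans to per-token tags. The sentence and the entity spans below are made-up examples, not drawn from the CONLL04 data:

```python
def spans_to_bio(tokens, spans):
    """Convert (start, end, type) token spans to BIO tags.
    `end` is exclusive; tokens outside every span get 'O'."""
    tags = ['O'] * len(tokens)
    for start, end, etype in spans:
        tags[start] = 'B-' + etype           # begin tag on the first token
        for i in range(start + 1, end):
            tags[i] = 'I-' + etype           # inside tag on continuations
    return tags

tokens = ['John', 'Smith', 'lives', 'in', 'New', 'York']
spans = [(0, 2, 'PER'), (4, 6, 'LOC')]
print(spans_to_bio(tokens, spans))
# ['B-PER', 'I-PER', 'O', 'O', 'B-LOC', 'I-LOC']
```

The inverse mapping (tags back to spans) is equally mechanical, which is what makes BIO a convenient target for per-token sequence models such as the CRF layer used later in this paper.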
3 Methodology
In this paper, a model is provided which identifies the entities, that is, their types, and all the relations which can be inferred from them, and which also tries to solve the multi-head issue: the presence of multiple relations between multiple entities in a single sentence. The model architecture has different layers: a word embedding layer, the Bi-GRU layer, the CRF layer and finally a sigmoid layer. The dataset being used for the model is the CONLL04 dataset with the BIO tags present. The sentence is tokenized, and the tokenized words then go through a word embedding layer. The word embedding layer provides the mathematical reference on which the statistical model is built and through which the machine can interpret the sentence.

Table 2 Types of named entities [10]

Type | Description
Human | People, including fictional
Nation | Country
Object | Vehicles, items, etc.
Event | Sports or business events
Law | Law-related document
Language | Particular language
Date | Format in date to identify dates
Time | Hours, minutes
Percent | Percentage

The Bi-GRU layer is utilized and
provides a complex contextual reference to the words. The CRF and sigmoid layers are eventually used to output the prediction made for the entities and the relation between them. If there is no relation between the entities, an "N" tag is used to signify this (Figs. 1 and 2).

Fig. 1 Architecture diagram for relation extraction
Fig. 2 Layer-wise representation of the model
3.1 Dataset
CONLL04 dataset: this dataset provides the entity types person, organization, location and other. The relation types are kill, located in, work for, etc. It contains 910 training examples,
243 for validation and 288 for testing. Zhao et al. [11] use this dataset for experimental purposes as well. Implementation and hyperparameters: Python with the TensorFlow machine learning library is used. The ADAM optimizer with a learning rate of 10^-2 is the main optimizer applied in the network. The GRU size is about 64. The dataset is regularized by applying dropout to the hidden layers as well as the input embeddings. The character embedding size is set to 25.
3.2 Word Embedding
A sentence may contain any number of words, which may have different meanings. These words need a mathematical reference for the machine to be able to produce a statistical model; the word embedding layer gives that mathematical reference to every word in the sentence. The pretrained GloVe embeddings, widely used for contextual-level models, are utilized with a dimension of 200. Character embedding is also implemented: it takes every character of a word and assigns it a value. The purpose is to capture the power of prefixes and suffixes; for example, a word like "famous" can take a prefix such as "un" and change its meaning on a contextual level, so prefixes and suffixes become extremely useful. The character embeddings are put through a BiGRU, and the two states from the forward and backward passes are attached together. The resulting character embedding vector is appended to the word embedding to obtain the final word embedding vector.
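The final embedding construction is a simple concatenation, sketched below with toy dimensions; in the actual model the word vector is a 200-d GloVe embedding and the character vector is the learned char-BiGRU state:

```python
def final_embedding(word_vec, char_vec):
    """Append the character-level vector to the word-level vector,
    producing the per-token input to the Bi-GRU layer."""
    return word_vec + char_vec  # list concatenation

word_vec = [0.1, -0.3, 0.7]  # stand-in for a 200-d GloVe vector
char_vec = [0.05, 0.2]       # stand-in for the char-BiGRU output
emb = final_embedding(word_vec, char_vec)
print(len(emb))  # dimension of the combined vector
```

The combined dimension is simply the sum of the two component dimensions, so downstream layer sizes must account for both.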
3.3 Bi-Directional Gated Neural Networks
RNNs were introduced as a tool to tackle many NLP-related tasks with large amounts of sequential data, and the same is required here. A BiGRU layer is applied; Deepa et al. [12] use the same for relation extraction. This layer can encode information from two directions and thus provide more information at output time.
3.4 Conditional Random Fields
Apart from using the BiGRU layer, the task at hand requires us to identify entities using the BIO tagging scheme. The BIO tag signifies B as the beginning of the entity,
I as the inside tag, which suggests a continuation of the entity tag from before, and O as the outside tag, which suggests that the token is not an entity. Here, the CRF layer helps in calculating the most probable tag for each token. Following [5], the score for each token is calculated with the formula

s^{(l)}(h_i) = V^{(l)} f(U^{(l)} h_i + b^{(l)})    (1)
where the superscript (l) denotes the named entity recognition task, f(·) is an activation function, and V^{(l)} ∈ R^{p×g}, U^{(l)} ∈ R^{g×2d}, b^{(l)} ∈ R^g. The CRF layer is utilized to carry out the entity recognition work; Chen and Hu [13] apply it for the same purpose. For named entity recognition, the linear-chain CRF computes the score of a tag sequence as follows, where we assume the word sequence w has a sequence of scores s and predictions y (formula from [5]):

S(y_1^{(l)}, ..., y_n^{(l)}) = Σ_{i=0}^{n} s^{(l)}_{i, y_i^{(l)}} + Σ_{i=1}^{n-1} T_{y_i^{(l)}, y_{i+1}^{(l)}}    (2)
where s^{(l)}_{i, y_i^{(l)}} is the score of the tag predicted for token w_i, and T is a transition matrix in which each entry denotes the transition score from one tag to another. Following [5], the probability of the entity tag sequence over all possible tag sequences for the given input is

P(y_1^{(l)}, ..., y_n^{(l)} | w) = e^{S(y_1^{(l)}, ..., y_n^{(l)})} / Σ_{ỹ} e^{S(ỹ_1^{(l)}, ..., ỹ_n^{(l)})}    (3)

Cross-entropy loss is minimized. The entity tags are also used for classification [21] purposes, through which the model learns label embeddings.
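The linear-chain scoring of Eq. (2) can be made concrete with a tiny worked example: the score of a tag sequence is the sum of the chosen per-token emission scores plus the transition scores between consecutive tags. All numeric values below are illustrative, not learned:

```python
def sequence_score(emissions, transitions, tags):
    """Linear-chain CRF score as in Eq. (2): sum of emission scores
    for the chosen tags plus transition scores between consecutive tags."""
    score = sum(emissions[i][t] for i, t in enumerate(tags))
    score += sum(transitions[tags[i]][tags[i + 1]] for i in range(len(tags) - 1))
    return score

# Toy problem: 3 tokens, 2 tags (0 = 'O', 1 = 'B-PER'); values are made up.
emissions = [[0.2, 1.5], [1.0, 0.3], [0.9, 0.1]]   # token-by-tag scores
transitions = [[0.5, -0.2], [0.4, 0.1]]            # tag-to-tag scores
print(sequence_score(emissions, transitions, [1, 0, 0]))
```

Normalizing e^{S(·)} over all 2³ candidate tag sequences then yields the probability in Eq. (3); the full CRF computes that normalizer efficiently with the forward algorithm rather than by enumeration.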
3.5 Multi-Head Selection
The term multi-head is mentioned multiple times in this paper: each token in the sentence can have relations with multiple other entities. For each token, the vector of head entities and the relations with them, (y, c), is predicted. The aim is to find the most probable entities which have relations with the token w_x and the most probable relations they have, using the following score (from [5]):

s^{(r)}(z_y, z_x, r_k) = V^{(r)} f(U^{(r)} z_y + W^{(r)} z_x + b^{(r)})    (4)
The goal is to find the probability of head W_y and relation R_k when the word W_x is observed; this formulation is used in [5] as well. Cross-entropy loss is minimized during training.
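A minimal sketch of the multi-head scoring in Eq. (4): each (head, relation) candidate gets a score, and a sigmoid turns it into an independent probability, which is what allows one token to have several heads at once. The weights and hidden-state vectors below are illustrative, with one scalar weight per dimension instead of full matrices:

```python
import math

def relation_prob(z_y, z_x, U_row, W_row, b, v):
    """Sigmoid of an Eq. (4)-style score for one (head, relation)
    candidate, using per-dimension scalar weights for simplicity."""
    hidden = [math.tanh(U_row[i] * z_y[i] + W_row[i] * z_x[i] + b)
              for i in range(len(z_y))]
    score = sum(v[i] * hidden[i] for i in range(len(hidden)))
    return 1.0 / (1.0 + math.exp(-score))  # independent probability per candidate

z_x = [0.2, -0.1]  # hidden state of the current token
z_y = [0.5, 0.3]   # hidden state of a candidate head token
p = relation_prob(z_y, z_x, U_row=[1.0, 1.0], W_row=[1.0, 1.0], b=0.0, v=[1.0, 1.0])
print(0.0 < p < 1.0)
```

Because each candidate is scored independently (sigmoid, not softmax), thresholding the probabilities can select zero, one, or several heads per token.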
Table 3 Accuracy score

S. No. | Reference | Model | Accuracy
1 | [4] | Transformers | 0.52
2 | [14] | BiLSTM | 0.84
3 | [3] | DepNN | 0.83
4 | [7] | CNN | 0.82
5 | [8] | LSTM | 0.92
6 | [5] | LSTM with CRF | 0.80
7 | [6] | BiLSTM | 0.82
4 Results and Discussion
The proposed model uses a CRF and a sigmoid layer for finding entities and relations. Different evaluation types are applied here: strict, boundaries and relaxed. Strict requires an entity, and the relation between entities, to be identified exactly as labeled. Boundaries means that if a prediction is within the scope of being classified as a certain type of entity, it is accepted as a positive result. In relaxed, if one of the token types is correct for a multi-token entity, it is counted as a positive result. Evaluating the data helps in finding out how accurately the model reaches its target. The model is evaluated on the basis of precision, recall and the F1-score, the usual performance metrics on which such models are evaluated. The formulas for these metrics are as follows:

Precision = True Positives / (True Positives + False Positives)    (5)

Recall = True Positives / (True Positives + False Negatives)    (6)

F1 Score = 2 · (Precision × Recall) / (Precision + Recall)    (7)
Upon applying the BiGRU on the CONLL04 dataset, an accuracy score of 60% is achieved (Table 3).
5 Conclusion
In this paper, a model is presented to jointly extract relations and entities from textual news data. A CRF is applied to identify the entities, and a sigmoid layer is used to obtain the relations between them. This model overcomes previously used models which relied on external NLP tools for relation extraction.
References
1. E. Barba, L. Procopio, R. Navigli, ConSeC: word sense disambiguation as continuous sense comprehension, in Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2021). https://doi.org/10.18653/v1/2021.emnlp-main.112
2. S. Shakya, A self monitoring and analyzing system for solar power station using IoT and data mining algorithms. J. Soft Comput. Paradigm 3(2), 96–109 (2021)
3. Y. Liu, F. Wei, S. Li, H. Ji, M. Zhou, H. Wang, A dependency-based neural network for relation classification. ACL Anthology (2015). https://aclanthology.org/P15-2047.pdf
4. V. Kanjirangat, F. Rinaldi, Enhancing biomedical relation extraction with transformer models using shortest dependency path features and triplet information. J. Biomed. Inform. 122, 103893 (2021). https://doi.org/10.1016/j.jbi.2021.103893
5. G. Bekoulis, J. Deleu, T. Demeester, C. Develder, Joint entity recognition and relation extraction as a multi-head selection problem. Expert Syst. Appl. 114, 34–45 (2018). https://doi.org/10.1016/j.eswa.2018.07.032
6. J. Chen, J. Gu, Jointly extract entities and their relations from biomedical text. IEEE Access 7, 162818–162827 (2019). https://doi.org/10.1109/access.2019.2952154
7. D. Zeng, K. Liu, S. Lai, G. Zhou, J. Zhao, Relation classification via convolutional deep neural network. ACL Anthology (2014). https://aclanthology.org/C14-1220.pdf
8. D. Ningthoujam, S. Yadav, P. Bhattacharyya, A. Ekbal, Relation extraction between the clinical entities based on ... EZDI. https://www.ezdi.com/wp-content/uploads/2018/10/Relation_extraction_between_the_clinical_entities_based_on_the_shortest_dependency_path_based_LSTM.pdf
9. F. Li, M. Zhang, G. Fu, D. Ji, A neural joint model for entity and relation extraction from biomedical text. BMC Bioinform. https://doi.org/10.1186/s12859-017-1609-9
10. D. Sarkar, Named entity recognition: a practitioner's guide to NLP (2018). KDnuggets. Retrieved from https://www.kdnuggets.com/2018/08/named-entity-recognition-practitioners-guide-nlp-4.html
11. S. Zhao, M. Hu, Z. Cai, F. Liu, Modeling dense cross-modal interactions for joint entity-relation extraction, in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (2020). https://doi.org/10.24963/ijcai.2020/558
12. C.A. Deepa, P.C. Raj, A. Ramanujan, Improving relation extraction beyond sentence boundaries using attention, in International Conference on Computational Sciences-Modelling, Computing and Soft Computing (CSMCS 2020) (2020). https://doi.org/10.1063/5.0046136
13. T. Chen, Y. Hu, Entity relation extraction from electronic medical records based on improved annotation rules and BILSTM-CRF. Ann. Transl. Med. 9(18), 1415 (2021). https://doi.org/10.21037/atm-21-3828
14. S. Zhang, D. Zheng, X. Hu, M. Yang, Bidirectional long short-term memory networks for relation ... ACL Anthology (2015). https://aclanthology.org/Y15-1009.pdf
15. Z. Huang, W. Xu, K. Yu, Bidirectional LSTM-CRF models for sequence tagging. arXiv (2015). http://export.arxiv.org/pdf/1508.01991
16. S. Pawar, G.K. Palshikar, P. Bhattacharyya, Relation extraction: a survey. arXiv (2017). https://arxiv.org/abs/1712.05191
17. C. Sindhu, D.V. Vyas, K. Pradyoth, Sentiment analysis based product rating using textual reviews, in 2017 International Conference of Electronics, Communication and Aerospace Technology (ICECA), pp. 727–731 (2017). https://doi.org/10.1109/ICECA.2017.8212762
18. H.K. Andi, An accurate bitcoin price prediction using logistic regression with LSTM machine learning model. J. Soft Comput. Paradigm 3(3), 205–217 (2021)
19. S. Shakya, S. Smys, Big data analytics for improved risk management and customer segregation in banking applications. J. ISMAC 3(03), 235–249 (2021)
20. A. Sungheetha, R. Sharma, Transcapsule model for sentiment classification. J. Artif. Intell. 2(03), 163–169 (2020)
21. T. Vijayakumar, R. Vinothkanna, Capsule network on font style classification. J. Artif. Intell. 2(02), 64–76 (2020)
A Comprehensive Survey on Compact MIMO Antenna Systems for Ultra Wideband Applications V. Baranidharan, S. Subash, R. Harshni, V. Akalya, T. Susmitha, S. Shubhashree, and V. Titiksha
Abstract 5G and beyond-5G technologies play a crucial role in current high-speed communication systems, and Ultra Wideband (UWB) technology offers one of the most important perspectives in modern wireless systems. These communication systems give desirable features: support for high data rates, reduced cost, support for more users, etc. In this paper, we give a comprehensive survey of the different UWB MIMO antennas and designs proposed by various researchers in the recent literature. The paper elaborates the state of the art of research in 5G Ultra Wideband Multiple Input Multiple Output (MIMO) antennas with good isolation, together with their performance enhancement techniques. It gives a detailed review of 5G wideband antenna designs, structural differences, substrate choices, comparisons, and future breakthroughs. This will be useful for various wireless communication applications such as UWB, WLAN and Wi-Max, with isolation of about 15 dB over the larger dimensions. Keywords Ultra Wideband · High speed communication · Data rate · Enhancement techniques · Substrate · Antenna design
V. Baranidharan · S. Subash (B) · R. Harshni · V. Akalya · T. Susmitha · S. Shubhashree · V. Titiksha, Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_30

1 Introduction
Nowadays, there is a huge demand for low power, low cost and high data rates in Ultra Wideband communication systems. The Federal Communications Commission (FCC) allocates the band for UWB communication between 3.1 and 10.6 GHz, which has drawn researchers toward UWB antennas and their design. Wide impedance matching, a low profile, radiation stability and low cost are some of the difficulties of feasible UWB antenna design. Similar to other wireless systems, UWB systems are also affected by multipath fading. In order to
overcome this issue, Multiple Input Multiple Output (MIMO) technology is introduced in UWB systems. This technology provides diversity gain, multiplexing gain, link quality, and a further increase in capacity [1]. In the design stage of MIMO antennas for UWB systems, two major challenges are faced. The first is to miniaturize the antenna elements; the second is to enhance the isolation between them. To overcome these design issues, several methods have been introduced. One is UWB diversity antennas; the second involves decoupling structures for high isolation [1]; the third is a hybrid combination of both. A compact UWB MIMO antenna is explained here. For achieving a wide bandwidth in UWB applications, a staircase structure is formed on the radiating element. MIMO antenna design seeks to minimize mutual coupling between the elements; to suppress the coupling, a comb-line structure works as an electromagnetic band-gap. The proposed antenna has a size of 26 × 31 mm², more compact than many single-element UWB antennas. The observed bandwidth ranges from 2.8 to 11.9 GHz [1]. The paper is structured as follows: Sect. 2 explains the different UWB MIMO antenna types and various techniques to enhance their performance. Sect. 3 gives a detailed discussion of antennas recently designed by researchers, their findings, and future breakthroughs. The paper concludes with all the findings in Sect. 4.
2 UWB MIMO Antenna, Classifications, and Its Performance Enhancement Techniques
In recent years, a large number of 5G antennas have been designed by researchers worldwide. This section explains UWB MIMO, the antenna types, and their performance enhancement techniques.
2.1 UWB MIMO Antennas
Generally, wireless communication systems are affected by severe interference, radiation loss and heavy multipath fading. In order to overcome these issues, Multiple Input Multiple Output (MIMO) antennas are needed to achieve better transmission range without increasing signal power; such an effective design helps achieve better throughput and high efficiency [2]. Different wideband and multi-band antennas are designed to serve the different frequency bands efficiently. Even though a compact antenna helps achieve a higher transmission rate, proper isolation between the antennas is needed [3]. Different types of enhancement techniques are available to improve the various antenna designs and structures, so as to increase the gain, obtain better and more efficient isolation from other single antenna
element terminals, improve bandwidth, the Envelope Correlation Coefficient (ECC), and radiation efficiency [3].
2.2 Suitable Antennas for 5G Applications
Based on the available literature, some antennas are most suitable for 5G applications. They are:
Dipole antenna: two straight microstrip lines, each of length λ/4, where λ is the wavelength at the antenna's resonant frequency [4]. For this antenna structure, the feed is given between the two strip lines, as used in microwave integrated circuits. The entire length of the dipole is λ/2.
Monopole antenna: a monopole antenna generally has a length of λ/4 [5]. Monopole structures are modified into different shapes based on the application.
Magneto-electric dipole antenna: this structure consists of two dipoles, a planar electric dipole and a vertically shorted planar magnetic dipole. The signal feed is given at the bottom side of the substrate.
Loop antenna: loop antennas are designed in rectangular, circular, square or other shapes based on the application.
Antipodal Vivaldi antenna: this structure consists of two conductors placed on the two sides of the substrate, one side always mirroring the opposite side [6]. One conductor acts as the radiator and the other as the ground.
Fractal antenna: this type of antenna consists of similar structures repeated many times; mathematical rules are used to design fractal antennas [7], with different shapes chosen according to the application.
Inverted-F antenna: this structure consists of microstrip lines with bends; the feed point is given between the bent part and the straight portion of the microstrip lines [8], so the overall structure looks like an inverted F.
Planar inverted-F antenna: this structure consists of a ground plane and a patch antenna. The ground plane is always connected with a shorting pin, and the signal feed point is given at the bottom side of the substrate.
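The λ/4 and λ/2 element lengths quoted above follow directly from the resonant frequency. A quick free-space calculation is sketched below; note that it ignores the substrate's effective permittivity, which shortens a printed element, and the 6.85 GHz centre frequency is simply the midpoint of the 3.1-10.6 GHz UWB band:

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def free_space_wavelength_mm(freq_ghz):
    """Free-space wavelength in millimetres for a frequency in GHz."""
    return C / (freq_ghz * 1e9) * 1000.0

f = 6.85  # GHz, roughly the centre of the 3.1-10.6 GHz UWB band
lam = free_space_wavelength_mm(f)
print(round(lam / 4, 2), "mm quarter-wave,", round(lam / 2, 2), "mm half-wave")
```

On a real substrate, the element length scales down by roughly the square root of the effective permittivity, which is why printed UWB elements come out shorter than these free-space figures.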
2.3 Antenna Performance Enhancement Techniques
Some performance enhancement designs are used to improve bandwidth and efficiency, reduce mutual coupling, and reduce size. Some of these techniques are discussed here:
Choice of substrate: the permittivity and loss tangent values are very important when choosing the substrate for an effective antenna design [9]; a good choice increases the gain and reduces power loss.
Multi-element: using multiple antenna elements increases the gain of an antenna [9], and also increases the bandwidth and radiation efficiency; a single-element design cannot fulfil the requirements.
Corrugation: an effective radiator design and its shapes (sine, triangular, square and rectangular) help improve the bandwidth ratio.
Dielectric lens: a dielectric lens directs the electromagnetic radiation in one direction, which leads to an increase in gain and directivity [10].
Mutual coupling reduction: in a multi-element MIMO design, the performance of each single element affects the performance of the other elements [10]. To reduce this interference, different mutual coupling reduction techniques, based on the isolation or decoupling structures proposed by different researchers, are used in MIMO antennas.
3 UWB MIMO Antenna
In this section, we discuss various antenna structures recently proposed by different researchers.
3.1 Printed UWB MIMO Antenna
In this work, the authors proposed a new compact UWB MIMO antenna. It is printed on an FR4 substrate with a relative permittivity of 4.4 and a substrate thickness of 0.8 mm, and the dimensions of the antenna are optimized for a reduced size. The described MIMO antenna is made up of two L-shaped slot antenna elements, marked LS 1 and LS 2. To provide good isolation between the antenna elements, the two LSs are arranged perpendicular to each other. An L-shaped slot and a rectangular patch comprise the element antenna, which is fed by a 50 Ω microstrip line [11]. A T-shaped stub, with a vertical stub and a horizontal stub, is attached to the rectangular patch to increase the bandwidth for UWB applications. A tiny rectangular gap is carved on the left bottom of the ground plane to increase the isolation between antenna elements at low bands. Simulation with the electromagnetic (EM) simulation tool CST is used to obtain the appropriate geometrical parameters and correct numerical analysis (Fig. 1). To improve simulation precision, the SMA connector was incorporated in the simulated model. The MIMO antenna's final optimised parameters are verified. The plotted simulated ECC curves show that the measured ECCs are less than 0.04 in
A Comprehensive Survey on Compact MIMO Antenna Systems …
Fig. 1 Geometry of the proposed printed MIMO antenna. Source [11], p. 1
the frequency range 3.1–10.6 GHz, which is well below the limit needed to satisfy good diversity performance for the suggested MIMO antenna.
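The ECC values quoted here are typically computed from the measured S-parameters. A commonly used two-port approximation (a standard textbook expression, not necessarily the exact method used in [11]) is:

```latex
\rho_e \approx
\frac{\left| S_{11}^{*}S_{12} + S_{21}^{*}S_{22} \right|^{2}}
{\left(1 - |S_{11}|^{2} - |S_{21}|^{2}\right)\left(1 - |S_{22}|^{2} - |S_{12}|^{2}\right)}
```

Values below 0.5 are conventionally taken as adequate for diversity operation, so the reported 0.04 is well within the limit.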
3.2 High-Isolation Compact MIMO Antenna In this work, the authors proposed a two-element MIMO antenna fabricated on a 26 × 31 mm2 Rogers RO4003 substrate with a relative permittivity of 3.55, a thickness of 0.7874 mm, and a loss tangent of 0.0027 [11]. The UWB MIMO antenna is made up of two monopole elements. The radiating element is U-shaped, with identical staircase structures at its two bottom corners. When multiple antenna elements are installed in a narrow space, the mutual coupling between the UWB antenna elements can be very strong. As a result, designing UWB MIMO antennas with both low mutual coupling and a confined size is a challenge. The antenna elements are positioned in the H-plane [11]. The edge-to-edge gap between the elements is set to 8 mm, equal to 0.18 λ0, where λ0 is the free-space wavelength. Each antenna element is fed by a 50 Ω microstrip line of dimensions Wf × Lf mm2 [11]. The ground plane has dimensions Wg × Lg mm2, and a rectangular slot of dimensions Wfs × Lfs mm2, acting at high frequencies, is removed from the top corner of the ground plane. The last design stage added ground stubs of dimensions Ws × Ls mm2 connected by a metal strip, forming a comb-line structure on the antenna’s ground plane. Several ways of combining both techniques to reduce mutual coupling between elements while maintaining a confined size have been investigated (Fig. 2). It is demonstrated that the comb-line design can effectively expand the impedance bandwidth across the entire UWB band and improve isolation [11]. Based on measured and simulated antenna performance, the authors conclude that the MIMO antenna is a good choice for portable UWB applications.
V. Baranidharan et al.
Fig. 2 Geometry of proposed highly isolated MIMO antenna. Source [11], p. 2
3.3 Compact UWB Four-Element MIMO Antenna Using CPW Feed The authors proposed a UWB MIMO antenna configuration for high-frequency data communications. At first, the authors designed a single element. HFSS is used to simulate the antenna, and the antenna’s S-parameters are determined. The S-parameters indicate that a single antenna element is capable of working in the UWB [12]. However, above 17 GHz the antenna’s performance degrades. The performance of the antenna is boosted by adding parasitic branches to the feeder: the matching of the single-element antenna in the frequency region above 10.6 GHz clearly improves after the stub is added. The four-element modified MIMO antenna can then be further designed [13]. The antenna construction and size were chosen to suit the required operating frequency; all measurements are in millimetres. The estimated and simulated FRS were found to be 4 and 3.7 GHz, respectively [13]; as can be observed, the estimated value is very close to the simulated one. The radiating patch and ground plane were printed on the substrate’s surface. The current patch shape evolved from a circular monopole. The impedance matching of the antenna is heavily influenced by the size of the half-circle ground structure (Fig. 3). The semi-circular protrusion of radius R3 at the top of the patch, as well as the stub attached to the feed line, both help with antenna matching and ultra-wideband performance; the stub, in particular, plays a vital role in impedance matching. The small rectangular notches of dimensions W2 and L2 on the left and right sides of the semi-circular protrusion also aid impedance matching. The simulation results reveal that the antenna ports have excellent impedance matching across the 3–20 GHz band. In terms of isolation, the four ports are separated by more than 17 dB [13].
Fig. 3 Geometry of the proposed multiple input multiple output (MIMO) antenna with CPW feed. Source [13], p. 1
The measured S11 of the suggested design confirms the available bandwidth, although at high frequencies the UWB MIMO antenna system’s simulated results do not fully match the measured results.
3.4 8-Element Based UWB MIMO Antenna with Band-Notch and Reduced Mutual Coupling A small four-element ultra-wideband antenna with a slotted ground plane is described. The bandwidth is expanded by etching a modified slotted ground plane with a compact size of 0.33 × 0.33 × 0.01, and the isolation between its elements is 14 dB [14]. A wideband MIMO antenna shares many common parts and can be used as an LTE wireless access point and for Wi-Fi. The latter is composed of four feedline-printed microstrip antennas, each consisting of a simple radiating element and a round ground plane; the impedance bandwidth is raised from 1.8 GHz to 2.9 GHz [14] (Figs. 4 and 5). The suggested antenna structure is built on a low-cost FR4 substrate with a relative permittivity of 4.4, a thickness of 1.6 mm, and a loss tangent of 0.02. The electromagnetic simulator HFSS is used to estimate the S-parameters, gain, efficiency, and radiation. The ECC is evaluated from the patterns, whose surface waves travel over the ground plane, and the EBG structures are analysed with CST MWS.
Fig. 4 Geometry of designed single antenna element (front view). Source [14], p. 1
Fig. 5 Geometry of the single antenna element (back view). Source [14], p. 2
3.5 WLAN Band Rejected 8-Element Based Compact UWB MIMO Antenna For future fifth-generation (5G) terminal equipment, a compact UWB MIMO antenna with eight elements is able to supply high data rates. An inductor–capacitor (LC) stub at the ground plane is used to achieve band rejection from 4.85 GHz to 6.35 GHz [14]. The included stub also allows for bandwidth control and the rejection of any desired band. The orthogonal positioning of the printed monopoles provides polarisation diversity and excellent isolation. In the suggested antenna structure, the monopole pairs are located on opposite corners of a planar substrate of 50 mm2 [14] (Fig. 6).
Fig. 6 Geometry of 8-element compact UWB MIMO antenna. Source [15], p. 4
On the same board, four more perpendicularly placed monopole antennas are found, giving a total volume of only 50 × 50 × 25 mm3 [15]. The results were compared against prior designs. The target criteria were met with a reflection coefficient below −10 dB over the whole 2–12 GHz bandwidth [15] (excluding the rejected band), isolation greater than 17 dB, minimal strength variation, high signal rejection in the Wireless Local Area Network band, minimal envelope correlation, and a stable radiation pattern. The 3D arrangement of vertical and horizontal antennas provides the ability to reject WLAN bands. The suggested design’s planar configuration, i.e. 50 × 50 mm2, is small compared with other structures.
3.6 Comparison of UWB MIMO Wideband Antenna Structures In this section, we discuss the pros and cons of the recently proposed MIMO wideband antenna structures (Table 1).
4 Conclusion In this paper, different types of 5G UWB antennas are analysed critically, and comparisons are made based on their performance enhancement techniques. These UWB MIMO antennas are broadly categorised into multi-element wideband and single
Table 1 Comparison of UWB MIMO wideband antenna structures

Printed UWB MIMO antenna [10]
Pros: • Provides a high efficiency • The decoupled multi-element structure is comparatively compact, light in weight, small in size, and easy to design
Cons: • Structure is highly complex to fabricate

High-isolation compact MIMO antenna [10]
Pros: • Provides an optimised level of design, with good bandwidth and high efficiency • Antenna size is compact, with a low ECC value
Cons: • Requires external components for an effective design

Compact UWB MIMO antenna with CPW feed [11]
Pros: • Design is very easy to fabricate • More compact, with enhanced gain and bandwidth of operation
Cons: • Difficult design process; uses metamaterial-type unit cells

Band-notch and reduced mutual coupling based UWB MIMO antenna [12]
Pros: • High isolation is provided by using DRA and common ground structures • Channel capacity is improved by good orthogonal polarisation and diversity
Cons: • Provides low bandwidth, which remains a challenging issue

WLAN band rejected 8 × 8 MIMO antenna for UWB applications [13]
Pros: • The feeding network is designed with complicated dual polarisation • The slotted ground structure gives better efficiency and improves the ECC through polarisation diversity techniques
Cons: • Difficult to design; placing the slot or parasitic elements with optical elements is not easy
element antenna structures. The recently proposed antenna types are explained in detail, based on their structures and performance enhancement techniques. The antennas’ electrical and physical properties are analysed to enhance the overall performance. This paper also emphasises the upcoming breakthroughs in 5G smartphones and terminal devices, 5G IoT, mobile terminals, and base stations, and sheds light on 5G UWB antenna design: selecting the antenna substrate and suitable performance enhancement techniques to meet 5G application requirements.
References 1. V. Baranidharan, G. Sivaradje, K. Varadharajan, S. Vignesh, Clustered geographic–opportunistic routing protocol for underwater wireless sensor networks. J. Appl. Res. Technol. 18(2), 62–68 (2020) 2. B. Varadharajan, S. Gopalakrishnan, K. Varadharajan, K. Mani, S. Kutralingam, Energyefficient virtual infrastructure-based geo-nested routing protocol for wireless sensor networks. Turk. J. Electr. Eng. Comput. Sci. 29(2), 745–755 (2021) 3. C.R. Jetti, V.R. Nandanavanam, A very compact MIMO antenna with triple band-notch function for portable UWB systems. Progr. Electromagn. Res. C 82, 13–27 (2018) 4. Z.J. Tang, X.F. Wu, J. Zhan, Z.F. Xi, S.G. Hu, A novel miniaturized antenna with multiple band-notched characteristics for UWB communication applications. J. Electromagn. Waves Appl. 32(15), 1961–1972 (2018) 5. X. Zhao, S.P. Yeo, L.C. Ong, Planar UWB MIMO antenna with pattern diversity and isolation improvement for mobile platforms based on the theory of characteristic modes. IEEE Trans. Antennas Propag. 66(1), 420–425 (2018) 6. L. Wu, Y. Xia, X. Cao, Z. Xu, A miniaturized UWB MIMO antenna with quadruple bandnotched characteristics. Int. J. Microw. Wirel. Technol. 10, 948–955 (2018) 7. A. Chaabane, A. Babouri, Dual band notched UWB MIMO antenna for surfaces penetrating application. Adv. Electromagn. 8(3) (2019) 8. Z. Tang, X. Wu, J. Zhan, S. Hu, Z. Xi, Y. Liu, Compact UWB-MIMO antenna with high isolation and triple band-notched characteristics. IEEE Access (2019) 9. M. Irshad Khan, M. Irfan Khattak, S. Ur Rahman, A.B. Qazi, A.A. Telba, A. Sebak, Design and investigation of modern UWB-MIMO antenna with optimized isolation. MDPI (2019) 10. J. Ren, W. Hu, Y. Yin, R. Fan, Compact printed MIMO antenna for UWB applications. IEEE Antennas Wirel. Propag. Lett. 13, 1517–1520 (2014) 11. N. Malekpour, M.A. Honarvar, Design of high-isolation compact MIMO antenna for UWB application. Prog. Electromagn. Res. C 62, 119–129 (2016) 12. D.A. Sehrai, M. Abdullah, A. Altaf, S.H. 
Kiani, F. Muhammad, M. Tufail, M. Irfan, A. Glowacz, S. Rahman, A novel high gain wideband MIMO antenna for 5G millimeter wave applications. MDPI (2020) 13. S. Kumar, A.S. Dixit, R.R. Malekar, H.D. Raut, L.K. Shevada, Fifth generation antennas: a comprehensive review of design and performance enhancement techniques. IEEE Access (2020) 14. W. Yin, S. Chen, J. Chang, C. Li, S.K. Khamas, CPW fed compact UWB 4-element MIMO antenna with high isolation. Sensors 21(8), 2688 (2021) 15. M.I. Khan, M.I. Khattak, G. Witjaksono, Z.U. Barki, S. Ullah, I. Khan, B.M. Lee, Experimental investigation of a planar antenna with band rejection features for ultra-wide band (UWB) wireless networks. Int. J. Antennas Propag. (2019)
A Review of Blockchain Consensus Algorithm Manas Borse, Parth Shendkar, Yash Undre, Atharva Mahadik, and Rachana Yogesh Patil
Abstract The advent of Blockchain started when a mysterious person or organization with the alias Satoshi Nakamoto published a white paper named “Bitcoin: A Peer-to-Peer Electronic Cash System.” This paper introduced a digital currency with no middlemen and no central authority, which meant no transaction taxes, secure transactions, and a uniform currency. Thus started the expedition of Blockchain Technology. Blockchain Technology needs consensus algorithms to insert a valid block of data into the Blockchain and maintain its state. Due to the rapid developments in Blockchain Technology and its adaptation to a plethora of areas (games, digital art, medical records, etc.), a study of consensus algorithms is essential to help researchers and developers adopt a consensus algorithm according to their needs (proof of resource or majority voting). Keywords Consensus algorithm · Blockchain · Proof of work · Proof of stake · Proof of elapsed time · Byzantine fault tolerance
1 Introduction 1.1 Blockchain Technology Blockchain Technology is a decentralized, immutable, consensus-based, distributed ledger. It is a peer-to-peer network with its nodes spread all over the world, reaching an agreement about the form and validity of transactions. Once this consensus, as we call it, is reached, the transactions are stored in a block of data and linked to the previous block. This forms an unending immutable series of data blocks, hence Blockchain. There are diverse consensus algorithms in use today, and this paper discusses them. All the parties involved in the Blockchain network have to agree upon a valid form of the ledger so that the data block can be added to the blockchain. The protocol that M. Borse (B) · P. Shendkar · Y. Undre · A. Mahadik · R. Y. Patil Pimpri Chinchwad College of Engineering, Pune, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_31
Fig. 1 Taxonomy of consensus algorithms
allows these parties to come to a “consensus” is called a consensus algorithm. The classification of consensus algorithms is shown in Fig. 1.
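The “unending immutable series of data blocks” can be sketched minimally: each block commits to the hash of its predecessor, so tampering with any block invalidates every later link. The function names below are our own illustration, not code from the surveyed papers:

```python
import hashlib
import json

def make_block(prev_hash: str, transactions: list) -> dict:
    """Create a block whose hash commits to its payload and its predecessor."""
    body = {"prev_hash": prev_hash, "transactions": transactions}
    body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return body

def chain_is_valid(chain: list) -> bool:
    """Recompute every hash and check each block points at its predecessor."""
    for i, block in enumerate(chain):
        body = {"prev_hash": block["prev_hash"], "transactions": block["transactions"]}
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != block["hash"]:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

genesis = make_block("0" * 64, ["coinbase"])
chain = [genesis, make_block(genesis["hash"], ["alice->bob:5"])]
assert chain_is_valid(chain)
chain[0]["transactions"] = ["coinbase", "forged"]  # tamper with history
assert not chain_is_valid(chain)
```

The consensus algorithms surveyed below decide *which* candidate block is allowed to extend this chain.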
2 Related Work In this section, the existing blockchain consensus algorithms are studied and their advantages and disadvantages are analyzed.
2.1 Proof of Work Consensus Algorithm Bitcoin is still the most influential cryptocurrency system. Its Proof of Work consensus is a system that rewards the first individual who can solve a hard math problem, which limits the number of miners who can successfully solve it. The authors of [1] give a modification to Bitcoin that takes advantage of the missed computational effort of miners: it would allow miners to justify their work, and the network’s difficulty would then decrease accordingly. Due to the unique features of Blockchain, it has inspired a wide range of new social applications. However, its decentralized nature and its use of Proof of Work (PoW) technology can be very challenging to manage: the algorithm can only track the network hash rate quickly enough [2], but cannot set the target BPT on time. The authors of [2] introduce a linear-predictor-based difficulty control algorithm that takes into account the relationship between the hash rate and the difficulty, achieving better stability and flexibility in terms of BPT. Bitcoin has utilized Proof of Work consensus to prevent double-spending attacks [3]; however, it is not yet clear how this affects the decentralized network’s security. Gavrilovic and Ciric [4] propose a new protocol that takes into account the computing power of each server and regulates the difficulty of making a block on the basis of that evaluation.
The majority of email traffic is spam, an abuse used for the mass dissemination of unwanted messages. The solution in [5, 6] requires a specific quantity of work from the sender before the email message is accepted. The extended SMTP protocol method is evaluated through the use of Proof of Work; an assessment of the servers’ work and the influence of distributed spam on it is shown. The proposed solution helps minimize server load and reduce spam traffic. A Blockchain is an open ledger technology that enables people to verify transactions. One way to improve scale and transaction speed is to find a quicker Proof of Work algorithm. In [6], a parallel mining method is introduced, involving a selection process for a manager, a reward system, and a distribution of work; it was tested using a variety of test cases. Bitcoin, the world’s first digital money, is founded on the PoW (Proof of Work) consensus procedure, and its widespread adoption has raised the bar for Blockchain Technology. Feng and Luo [7] also cover the various concepts of Blockchain and the Proof of Work consensus algorithm along with a performance analysis. In [3], the authors propose a new protocol that allows miners to generate blocks with equal difficulty and rewards them with equal opportunity; it also evaluates the computing power of the nodes and, based on the evaluation, adjusts the reward for the user’s incentive (Table 1).
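The hash puzzle that all of these PoW variants build on can be illustrated with a toy miner: hash the block contents with increasing nonces until the digest falls below a difficulty target. This is a generic sketch with illustrative names, not code from any surveyed paper:

```python
import hashlib

def mine(block_data: str, difficulty_bits: int, max_nonce: int = 2**24):
    """Search for a nonce whose SHA-256 digest has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)  # digests below this value win
    for nonce in range(max_nonce):
        digest = hashlib.sha256(f"{block_data}:{nonce}".encode()).hexdigest()
        if int(digest, 16) < target:
            return nonce, digest
    raise RuntimeError("no valid nonce found within max_nonce")

nonce, digest = mine("block#1|prev=abc", difficulty_bits=16)
# Verification is cheap: anyone can redo a single hash to check the proof.
assert int(digest, 16) < 1 << (256 - 16)
```

Each extra difficulty bit doubles the expected work (roughly 2^bits hashes on average) while verification stays a single hash; this asymmetry is the knob the difficulty-control schemes in [2, 3] adjust.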
2.2 Proof of Stake Consensus Algorithm Over the years, Proof of Stake (PoS) consensus algorithms have been developed and added to systems so that a distributed ledger can be maintained throughout the system’s operation [10, 11]. The validators do not receive a reward on the blocks they validate; instead, they receive network fees as their reward, so they are still rewarded in a meaningful way. However, the technology used or developed in such systems is currently limited. The Proof of Stake consensus algorithm operates on transactions and on their validation by validators, so that a transaction, or node, can be added to the system [12]. The nodes are the ones who make the transactions in the system. All nodes who wish to be validators for the next block must stake a certain amount of money [13]. The node with the highest stake is elected as the validator [14]. The validator then verifies all the transactions taking place in the system; if a transaction being processed is authentic and proved, the validator holds the transaction as valid and adds, or publishes, it to the block [15]. Now, if the block added by the validator is verified, the transaction is valid, and the validator gets its stake back with a reward from the transaction
Table 1 Reference papers studied on Proof of Work

[1] Advantages: PoE utilizes computational power and increases mining power; miners who fail at computational tasks might still gain experience. Disadvantages: High energy consumption is advantageous to nodes that accumulate computational power.

[2] Advantages: The prediction-based difficulty management method provides significantly improved stability and flexibility on BPT, as it is based on the link between hash rate, difficulty, and BPT. Disadvantages: The authors prove that the prediction-based difficulty control technique cannot be implemented using smoothed BPT as a calculating method.

[3] Advantages: Ensures that all miners have an equal chance of success; in addition, based on the evaluation, the reward for the user’s incentive is modified. Disadvantages: A miner can easily attack the blockchain.

[4] Advantages: The suggested approach reduces spam traffic and server strain while having no negative impact on legitimate email users’ experience. Disadvantages: It takes more processing time.

[5] Advantages: In comparison to the present system, preliminary data suggest that Proof of Work scalability has improved by 34%. Disadvantages: There is a chance of being incorrectly evaluated by a reputation system.

[6] Advantages: Extensive testing shows that the method detects Sybil attacks with a high detection rate and a low communication and computation cost. Disadvantages: Due to the wireless communication characteristics, a solution is difficult to achieve.

[7] Advantages: CryptoNight, which uses scratchpads of various sizes, is the most equitable Proof of Work algorithm. Disadvantages: Each algorithm is affected differently by the size of the accessible memory region.

[3] Advantages: Ensures that all miners have an equal chance of being successful; furthermore, based on the evaluation, the reward for the user’s incentive is changed. Disadvantages: A potential mining pool exhibits symptoms of centralization.

[8] Advantages: Demonstrates that the mining industry has a propensity for centralization, which is incompatible with the basic concepts of cryptocurrencies. Disadvantages: The anticipated value influences the reward once it has been adjusted for the rate obtained.

[9] Advantages: The paper proposes a proof-of-useful-randomness (PoUR) method that can improve energy efficiency. Disadvantages: A single iteration in PoUR might take longer than a typical PoW iteration.
Table 2 Reference papers studied on Proof of Stake

[10] Advantages: Demonstrated the use of blockchain technology in the healthcare system as well as the system’s applications. Disadvantages: Complication in reading and studying the data.

[11] Advantages: Introduced Bazo, an improved network designed to meet the demands of Internet of Things networks. Disadvantages: Complex to implement in real-world scenarios.

[12] Advantages: The system measures a user’s particular skill and ranks him according to the skill used. Disadvantages: The risk of any user uploading fraudulent data is relatively high.

[13] Advantages: Offers resistance to security threats while recording the transactions of the data. Disadvantages: Cannot be used on multilevel platforms.

[14] Advantages: The system improves consensus by scaling votes according to the validator’s profile. Disadvantages: Profile reading takes a considerable amount of time.

[15] Advantages: Studied weighted voting in the distributed ledger and created an algorithm for better implementation of the distributed ledger. Disadvantages: Security risk is high.

[16] Advantages: Used picturable signatures to construct a blockchain that can protect from LRSL attacks. Disadvantages: The system is complex to design.
processed, i.e. the block verified. If the processed block is not verified, the validator loses the stake and gets a negative review for the transaction or for the addition of that block, which results in a negative ranking and holds it in a low position among the validators [16] (Table 2).
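The stake, election, reward, and slashing steps described above can be sketched as stake-weighted random selection (one common PoS variant; the deterministic highest-stake rule mentioned in [14] is the limiting case). All names and numbers below are illustrative, not taken from the surveyed papers:

```python
import random

def elect_validator(stakes: dict, rng: random.Random) -> str:
    """Pick a validator with probability proportional to its staked amount."""
    nodes = list(stakes)
    weights = [stakes[n] for n in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]

def settle(stakes: dict, validator: str, block_valid: bool,
           reward: float, penalty: float) -> None:
    """Reward a validator whose block is verified; slash its stake otherwise."""
    stakes[validator] += reward if block_valid else -min(penalty, stakes[validator])

rng = random.Random(42)
stakes = {"A": 60.0, "B": 30.0, "C": 10.0}
wins = {n: 0 for n in stakes}
for _ in range(10_000):
    wins[elect_validator(stakes, rng)] += 1
# Selection frequency tracks stake share: A ~60%, B ~30%, C ~10%.
assert wins["A"] > wins["B"] > wins["C"]

settle(stakes, "B", block_valid=False, reward=1.0, penalty=5.0)
assert stakes["B"] == 25.0  # the slashed stake mirrors the negative review in [16]
```

The slashing branch is what makes publishing an unverified block economically irrational for a validator.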
2.3 Proof of Elapsed Time The properties that make Blockchain innovation powerful, such as data integrity and resilience to malicious nodes, depend principally upon the choice of consensus protocol used to coordinate the network. The essential innovation of Blockchain technology is to use these algorithms in completely decentralized, open networks [17]. Consensus can be thought of as an entity whose sole purpose is to eliminate uncertainty by giving everyone a fair chance to have their requests appended to the Blockchain network [18]. PoET is a lottery-based consensus algorithm that gives every validator a fair chance to publish its data in the block [19].
In a way, conventional PoW can be likened to a lottery, with the likelihood of winning being proportional to the amount of computing effort involved. Any node can be the first to solve a problem and distribute a block, yet the probability of winning is directly related to the computational investment [20]. This randomness prevents any one party from deliberately influencing the block composition. Instead of spinning its computational wheels, PoET assigns each node a random wait time sampled from an exponential distribution, generated by code operating inside a “trusted execution environment” (TEE). All validators collect transactions into candidate blocks, and when a node’s wait period expires (and no other block has been published at this height), it publishes its candidate block and advertises it to the network. When more than one block is published at the same moment, the total wait time is used to resolve the resulting fork, with the chain having the shortest overall wait time being preferred [21, 22]. For several reasons, PoET stands out as a good algorithm to study: in terms of power and accessibility it is one of the best algorithms, with its most notable use in Hyperledger Sawtooth [23]. It is also devoid of digital money, making it a viable solution for industry use cases where trades are not always monetary. Finally, it is extremely parameterizable, in contrast to the Bitcoin protocol, which has remained mostly dormant since its inception and offers few tools for updating. The selection of the block interval, the amount of time that elapses between successive blocks appended to the chain, is significant among these lottery-style consensus parameters [24, 25] (Table 3).
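The wait-time lottery described above can be simulated outside a TEE as follows (illustrative only; in real PoET the exponential draw happens inside SGX precisely so that it cannot be forged):

```python
import random

def poet_round(node_ids, mean_wait: float, rng: random.Random):
    """Each node draws an exponential wait time; the shortest wait publishes."""
    waits = {n: rng.expovariate(1.0 / mean_wait) for n in node_ids}
    winner = min(waits, key=waits.get)
    return winner, waits

rng = random.Random(7)
nodes = ["n1", "n2", "n3", "n4"]
tally = {n: 0 for n in nodes}
for _ in range(4000):
    winner, _ = poet_round(nodes, mean_wait=10.0, rng=rng)
    tally[winner] += 1
# With identical wait distributions every node wins ~1/4 of the rounds:
# a fair lottery without the hash-grinding cost of PoW.
assert all(abs(t / 4000 - 0.25) < 0.05 for t in tally.values())
```

Fork resolution then simply compares the aggregate wait times of the competing chains and keeps the shorter one.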
2.4 Byzantine Fault Tolerance In this paper [26], research on PBFT is done to improve its system flexibility, its high communication overhead, and the bad behavior of the leader node. This is done by allowing nodes to join and leave simultaneously, by admitting valid nodes and limiting messages, and by electing a trusted leader node, respectively. Scalability is a major problem [27] of the BFT algorithm, and this paper aims to address the issue: the proposed solution is a multi-layer BFT algorithm that effectively decreases latency and improves scalability. The paper [28] suggests that a reputation-based model should be adopted to evaluate the consistency and validity of information provided by nodes; according to the paper, this model should make BFT more secure and reliable. In [29], transactions are classified into equal and unequal transactions, and unequal transactions are further categorized into common and troubled transactions. Only the troubled transactions are then subjected to the BFT algorithm, to improve scalability. A group of leader nodes is selected to improve consensus accuracy and to minimize network communication cost [30]. SBFT makes use of a mixture of four ingredients: the usage of collectors and threshold signatures to reduce communication, the usage of an optimistic fast path, reducing
Table 3 Reference papers studied on Proof of Elapsed Time

[17] Advantages: The execution supports dependable collection of any desired workload. REM may be seen as recreating the distribution of block-mining intervals associated with PoW, but does so with PoUW and hence eliminates wasted CPU effort. Disadvantages: The study shows that no single algorithm can be utilized to accomplish every data requirement, and this does not depend on the popularity of the algorithm.

[18] Advantages: DC-PoET was able to achieve a higher throughput. Disadvantages: There is still room to improve the protocol’s efficiency by creating the blockchain overlay network more smartly and dynamically.

[19] Advantages: Use of Blockchain in IoT strengthens innovation, with emphasis on recognition of customers; Blockchain conveniently stores transactions and device IDs without a central server. Disadvantages: The ledger must be kept on the nodes themselves and will inevitably grow in size, thus reducing efficiency.

[20] Advantages: Adopts a fairer means of reaching agreement on who gets to publish their block. Disadvantages: An infiltration of the Blockchain system can be expected if the attacker succeeds in generating blocks with the help of compromised nodes.

[21] Advantages: PoET reduces the computational workload although, like PoW, it is a lottery-based consensus. Disadvantages: The algorithm can still be broken by a corrupted SGX.

[22] Advantages: After the tests, the throughput was able to reach 2300 tx/s, another milestone as it beats the results acquired for Hyperledger Fabric. Disadvantages: Requirements for further proficiency were observed.

[23] Advantages: The TEE handles the confidentiality and integrity of data while dealing with the computation aspect. Disadvantages: An infiltration of the blockchain system can be expected if the attacker succeeds in generating blocks with the help of compromised nodes.

[24] Advantages: Blockchain avoids data loss in case of failure of a centralized database; it also provides a more secure means of attaining a transaction. Disadvantages: A proper mechanism needs to be implemented to make this process fool-proof.

[25] Advantages: Because this consensus is executed in the TEE provided by SGX devices, cheating with the work of the two above functions is supposed to be very difficult. Disadvantages: An infiltration of the blockchain system can be expected if the attacker succeeds in generating blocks with the help of compromised nodes.
client communication, and making use of redundant servers for the fast path [31]. The paper [32] aims to improve transaction throughput and decrease the time required for implementing requests in a private Blockchain: the tactics controlling block proposal verification are carried out simultaneously instead of running chronologically. Reference [33] evaluates distinct consensus algorithms based on particular parameters such as crash fault tolerance, verification speed, throughput (TPS), and scalability; in the case of semi-closed permissioned networks of more than one firm, PBFT is recommended. The aim of [34] is to analyse Hyperledger Fabric and understand the Hyperledger frameworks in order to decide whether their usefulness is justified in practice. The authors observe that Hyperledger Fabric is an enormous and dynamic project with a huge range of use cases (Table 4). Table 4 Reference papers studied on Byzantine Fault Tolerance
[26] Advantages: More secure, fault tolerant. Disadvantages: Nodes cannot join and exit freely.

[27] Advantages: Faster; can tolerate more nodes. Disadvantages: No improvements in security; no change in the leader node selection system.

[28] Advantages: Increases the average transactions per second by 15% and decreases latency by 10%. Disadvantages: The scaling problem is not solved.

[29] Advantages: Faster than PBFT; showed a 61.25% performance improvement compared with the BFT of Hyperledger Besu. Disadvantages: No change in the leader node selection system.

[30] Advantages: The process of consensus and validation between different organizations gains stability and speed; this can be applied in data management and value creation. Disadvantages: The scalability problem still remains.

[31] Advantages: In a large-scale deployment, this modification of PBFT can work effortlessly. Disadvantages: Election of the primary node is the same.

[32] Advantages: Throughput is increased and latency is decreased. Disadvantages: Election of the primary node is the same.

[33] Advantages: Verification speed and throughput (TPS) are better in PBFT. Disadvantages: Scalability needs improvement compared with other algorithms.

[34] Advantages: Hyperledger Fabric has been planned for corporate use right from the start; it is among the most active projects, and the community collected around the platform keeps growing. Disadvantages: It has a complex architecture and is not very network fault-tolerant.
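Common to all the PBFT variants surveyed above is the classical quorum bound: with n replicas, at most f = ⌊(n−1)/3⌋ Byzantine nodes can be tolerated (n ≥ 3f + 1), and commitment needs 2f + 1 matching votes. A minimal sketch of this bound (our own illustration, not any surveyed algorithm):

```python
def max_faulty(n: int) -> int:
    """Largest number of Byzantine replicas tolerable with n replicas (n >= 3f + 1)."""
    return (n - 1) // 3

def committed(votes: list, n: int) -> bool:
    """A value commits once a quorum of 2f + 1 replicas report the same value."""
    quorum = 2 * max_faulty(n) + 1
    return any(votes.count(v) >= quorum for v in set(votes))

# 4 replicas tolerate f = 1; three matching votes commit even if one node lies.
assert max_faulty(4) == 1
assert committed(["A", "A", "A", "B"], n=4)
# Two conflicting pairs cannot both reach the 3-vote quorum, so safety holds.
assert not committed(["A", "A", "B", "B"], n=4)
```

The scalability complaints in Table 4 follow directly from this structure: every extra tolerated fault adds three replicas, and each of them must exchange votes.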
3 Challenges/Findings 3.1 Proof of Work The Proof of Work algorithm in its current form has a few vulnerabilities: selfish mining attacks, eclipse attacks, 51% attacks, scalability limits, and electricity dependency and wastage. An attacker may use these techniques to break into the system, so there is considerable scope to improve the algorithm through modifications, or by using it in combination with other algorithms such as PoS (Proof of Stake). Traditional PoW accounts for about 90% of the total market in digital cryptocurrencies [35], as well as dominating Blockchain-based applications. In spite of its merits and wide use in practice, conventional PoW has also been shown to be extremely energy-expensive: an enormous chunk of this electricity consumption is due to the computational inefficiency of conventional PoW algorithms, and this serious energy waste is considered one of the greatest drawbacks of the traditional PoW algorithm. Eclipse attacks are a special kind of cyberattack in which an attacker creates a fake environment around one node, or client, which allows the attacker to manipulate the affected node into wrongful activity [36].
3.2 Proof of Stake

The Proof of Stake (PoS) consensus algorithm can be used to implement a variety of system procedures that deliver security and trust to the product it is used in. This is gathered and handed to the entity that will create the new block. Another strength is the impracticality of the 51% attack: in order to launch an attack, the attacker must hold 51% of the stake in the network's cryptographic system, which will cost the attacker a toll and will consequently be rather expensive. Decentralization is the primary benefit of existing PoS systems. The following property may be called an advantage by some and a disadvantage by others, and it will be settled in the coming years: a validator with a massive stake or more positive reviews gets first preference when validating a block over validators who are less experienced. On the positive side, the system will generate a verified block, and trust will be high if a validator with more positive reviews and experience has verified the block. On the negative side, it becomes harder for new validators to join the process, since the established validators obtain all the necessary means for validating blocks.
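The stake-proportional selection at the heart of PoS can be sketched as follows. The validator names, stake values, and round count are hypothetical, and real protocols derive the randomness from the chain itself rather than from a plain seed; the point is only that proposal frequency tracks stake, so an attacker must buy a majority stake.

```python
import random

def choose_validator(stakes, seed):
    """Select the next block proposer with probability proportional to stake."""
    rng = random.Random(seed)
    names = list(stakes)
    return rng.choices(names, weights=[stakes[n] for n in names], k=1)[0]

stakes = {"alice": 60.0, "bob": 25.0, "carol": 15.0}
wins = {name: 0 for name in stakes}
for round_no in range(10_000):
    wins[choose_validator(stakes, seed=round_no)] += 1
# alice proposes roughly 60% of blocks; overtaking her requires buying a
# majority stake, which is what makes the 51% attack costly here
print(wins["alice"] > wins["bob"] > wins["carol"])  # True
```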
424
M. Borse et al.
3.3 Proof of Elapsed Time

Since PoET (Proof of Elapsed Time) uses Intel SGX as a TEE, an attacker who can corrupt a single SGX-enabled node can win every consensus round and break the system completely. So there is considerable scope to improve this algorithm by combining it with other algorithms, such as PoUW (Proof of Useful Work) or PoS (Proof of Stake), or by modifying the existing algorithm. A stale block occurs whenever the shortest wait time and another wait time differ by less than the propagation delay of the network. This reduces network throughput significantly as the network size increases, other variables held constant.
3.4 Byzantine Fault Tolerance

BFT has some definite limitations:
1. Scalability.
2. Election of a trusted primary/leader node.
3. Fault tolerance.
Some modified algorithms solve its scalability issue, and some make the election of the leader node(s) more efficient, but no improvement to existing BFT completely removes its limitations. So there lies scope to find a modified BFT algorithm that improves on them. A multi-layered PBFT approach can be followed to improve scalability. Fault tolerance can be increased by reducing client communication from f + 1 replies to 1. A reputation-based approach can be followed to nominate a leader node, and excess servers can be maintained for better resilience and performance.
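The scalability limits above stem from the classical n >= 3f + 1 replica bound; a small sketch of the arithmetic (generic PBFT bounds, not any of the modified algorithms above):

```python
def max_faulty(n):
    """PBFT-style BFT tolerates f Byzantine replicas out of n iff n >= 3f + 1."""
    return (n - 1) // 3

def quorum_size(n):
    """A commit needs 2f + 1 matching replies: any two such quorums overlap
    in at least f + 1 replicas, hence in at least one honest replica."""
    return 2 * max_faulty(n) + 1

for n in (4, 7, 10, 100):
    print(f"n={n}: tolerates f={max_faulty(n)}, quorum={quorum_size(n)}")
```

Note that tolerating even one more fault requires three more replicas, and every replica participates in the all-to-all message exchange, which is why throughput degrades as the network grows.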
4 Conclusion

The blocks in the Blockchain are linked together using cryptographic hash values. Nodes in the blockchain system are protected from attack by the consensus protocol, which also maintains the integrity of the network. Blockchain consensus protocols have been the subject of many different proposals. Focusing on Blockchain, this study presents a comprehensive examination of prior forms of distributed ledgers. A more reliable, scalable, and cost-effective network can be created by leveraging permissioned, permissionless, and consortium Blockchain consensus protocols.
References

1. S. Masseport, B. Darties, R. Giroudeau, J. Lartigau, Proof of experience: empowering proof of work protocol with miner previous work, in 2020 2nd Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS) (IEEE, 2020), pp. 57–58
2. K. Zheng, S. Zhang, X. Ma, Difficulty prediction for proof-of-work based blockchains, in 2020 IEEE 21st International Workshop on Signal Processing Advances in Wireless Communications (SPAWC) (IEEE, 2020), pp. 1–5
3. R. Nakahara, H. Inaba, Proposal of fair proof-of-work system based on rating of user's computing power, in 2018 IEEE 7th Global Conference on Consumer Electronics (GCCE) (IEEE, 2018), pp. 746–748
4. N. Gavrilovic, V. Ciric, Design and evaluation of proof of work based anti-spam solution, in 2020 Zooming Innovation in Consumer Technologies Conference (ZINC) (IEEE, 2020), pp. 286–289
5. S.S. Hazari, Q.H. Mahmoud, A parallel proof of work to improve transaction speed and scalability in blockchain systems, in 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC) (IEEE, 2019), pp. 0916–0921
6. P.R. Yogesh, Backtracking tool root-tracker to identify true source of cyber crime. Proc. Comput. Sci. 171, 1120–1128 (2020)
7. Z. Feng, Q. Luo, Evaluating memory-hard proof-of-work algorithms on three processors. Proc. VLDB Endow. 13(6), 898–991 (2020)
8. H. Alsabah, A. Capponi, Pitfalls of bitcoin's proof-of-work: R&D arms race and mining centralization. Available at SSRN 3273982 (2020)
9. E.U.A. Seyitoglu, A.A. Yavuz, T. Hoang, Proof-of-useful-randomness: mitigating the energy waste in blockchain proof-of-work (2021)
10. S.R. Niya et al., Adaptation of proof-of-stake-based blockchains for IoT data streams, in 2019 IEEE International Conference on Blockchain and Cryptocurrency (ICBC) (IEEE, 2019)
11. G. Pattewar, N. Mahamuni, H. Nikam, O. Loka, R. Patil, Management of IoT devices security using blockchain: a review, in Sentimental Analysis and Deep Learning, pp. 735–743 (2022)
12. N. Nair, A.K. Dalal, A. Chhabra, N. Giri, Edu-coin: a proof of stake implementation of a decentralized skill validation application, in 2019 International Conference on Nascent Technologies in Engineering (ICNTE) (IEEE, 2019), pp. 1–4
13. D. Tosh et al., CloudPoS: a proof-of-stake consensus design for blockchain integrated cloud, in 2018 IEEE 11th International Conference on Cloud Computing (CLOUD) (IEEE, 2018)
14. S. Leonardos, D. Reijsbergen, G. Piliouras, Weighted voting on the blockchain: improving consensus in proof of stake protocols. Int. J. Network Manage. 30(5), e2093 (2020)
15. P. Gaži, A. Kiayias, A. Russell, Stake-bleeding attacks on proof-of-stake blockchains, in 2018 Crypto Valley Conference on Blockchain Technology (CVCBT) (IEEE, 2018)
16. S. Deore, R. Bachche, A. Bichave, R. Patil, Review on applications of blockchain for electronic health records systems, in International Conference on Image Processing and Capsule Networks (Springer, Cham, 2021), pp. 609–616
17. A. Pal, K. Kant, DC-PoET: proof-of-elapsed-time consensus with distributed coordination for blockchain networks, in 2021 IFIP Networking Conference (IFIP Networking) (IEEE, 2021), pp. 1–9
18. L. Chen, L. Xu, N. Shah, Z. Gao, Y. Lu, W. Shi, On security analysis of proof-of-elapsed-time (PoET), in International Symposium on Stabilization, Safety, and Security of Distributed Systems (Springer, Cham, 2018), pp. 282–297
19. R.Y. Patil, M.A. Ranjanikar, A new network forensic investigation process model, in Mobile Computing and Sustainable Informatics (Springer, Singapore, 2022), pp. 139–146
20. F. Zhang, I. Eyal, R. Escriva, A. Juels, R. Van Renesse, REM: resource-efficient mining for blockchains, in 26th USENIX Security Symposium (USENIX Security 17), pp. 1427–1444 (2017)
21. A. Corso, Performance analysis of proof-of-elapsed-time (PoET) consensus in the sawtooth blockchain framework. Doctoral dissertation, University of Oregon, 2019
22. B. Ampel, M. Patton, H. Chen, Performance modeling of hyperledger sawtooth blockchain, in 2019 IEEE International Conference on Intelligence and Security Informatics (ISI) (IEEE, 2019), pp. 59–61
23. V. Costan, S. Devadas, Intel SGX explained. IACR Cryptol. ePrint Arch. 2016(86), 1–118 (2016)
24. C. Saraf, S. Sabadra, Blockchain platforms: a compendium, in 2018 IEEE International Conference on Innovative Research and Development (ICIRD) (IEEE, 2018), pp. 1–6
25. G.T. Nguyen, K. Kim, A survey about consensus algorithms used in blockchain. J. Inf. Process. Syst. 14(1), 101–128 (2018)
26. X. Zheng, W. Feng, Research on practical byzantine fault tolerant consensus algorithm based on blockchain. J. Phys: Conf. Ser. 1802, 032022 (2021). https://doi.org/10.1088/1742-6596/1802/3/032022
27. W. Li, C. Feng, L. Zhang, H. Xu, B. Cao, M.A. Imran, A scalable multi-layer PBFT consensus for blockchain. IEEE Trans. Parallel Distrib. Syst. 32(5), 1146–1160 (2021). https://doi.org/10.1109/TPDS.2020.3042392
28. K. Lei, Q. Zhang, L. Xu, Z. Qi, Reputation-based byzantine fault-tolerance for consortium blockchain, pp. 604–611 (2018)
29. J. Seo, D. Ko, S. Kim, S. Park, A coordination technique for improving scalability of byzantine fault-tolerant consensus. Appl. Sci. 10(21), 7609 (2020)
30. Y.-A. Min, The modification of pBFT algorithm to increase network operations efficiency in private blockchains. Appl. Sci. 11, 6313 (2021). https://doi.org/10.3390/app11146313
31. G. Golan Gueta et al., SBFT: a scalable and decentralized trust infrastructure, in 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp. 568–580 (2019). https://doi.org/10.1109/DSN.2019.00063
32. H. Samy, A. Tammam, A. Fahmy, B. Hasan, Enhancing the performance of the blockchain consensus algorithm using multithreading technology. Ain Shams Eng. J. (2021)
33. D. Mingxiao, M. Xiaofeng, Z. Zhe, W. Xiangwei, C. Qijun, A review on consensus algorithm of blockchain, in 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2567–2572 (2017). https://doi.org/10.1109/SMC.2017.8123011
34. M. Krstić, L. Krstić, Hyperledger frameworks with a special focus on hyperledger fabric. Vojnotehnicki glasnik 68, 639–663 (2020). https://doi.org/10.5937/vojtehg68-26206
35. R.Y. Patil, S.R. Devane, Network forensic investigation protocol to identify true origin of cyber crime. J. King Saud Univ. Comput. Inf. Sci. (2019)
36. D. Sivaganesan, Performance estimation of sustainable smart farming with blockchain technology. IRO J. Sustain. Wirel. Syst. 3(2), 97–106 (2021)
A Short Systematic Survey on Precision Agriculture S. Sakthipriya and R. Naresh
Abstract Agriculture plays an important role in supporting all livelihoods besides contributing to the country's economy. The world's food security is in jeopardy due to a few major challenges, such as population growth and resource competition. Smart cultivation and precision agriculture advancements provide useful tools for resolving farming sustainability issues and addressing the ever-increasing complexity of agricultural production systems. This research presents a complete systematic review to synthesize the features and algorithms that have been used in yield prediction. Furthermore, it carefully appraises the algorithms and methods used in machine learning techniques and provides ideas for upcoming precision agriculture farming system research. The widely used operational environments are crop information, temperature, soil information, humidity, nutrients, climate, and irrigation management. Finally, the study recognizes the gaps in crop yield prediction and the architectures proposed for machine learning algorithms applying sensors for data classification. Future work mainly focuses on developing a better crop prediction model that monitors all atmospheric factors through sensor management. Keywords Crop yield · Machine learning · Precision agriculture · Prophecy · Crop information
1 Introduction

Cultivation is a vital part of the economy in several urban and developing countries. Innovations in agricultural technology, such as the use of artificial fertilizers and the automation of manual processes, have enhanced farm productivity, resulting in lower food prices and higher yields. Increased yields have enabled countries to sustain larger populations and, much more significantly, to make ethanol as an alternative to gasoline. Rising yields have resulted in lower-cost food as well as the ability to save lives. Agriculture's long-term viability is critical to ensure food stability and hunger

S. Sakthipriya · R. Naresh (B) Department of Computer Science and Engineering, SRM Institute of Science and Technology (SRMIST), Kattankulathur, Chennai, India e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_32
eradication for the world's rapidly expanding population. By 2050, it is projected that global food production will need to increase by 60–110% to feed 9–10 billion people [1]. Precision Farming (PF) and agriculture technology, known as digital farming, are emerging as a new systematic field that employs data-intensive methods to increase rural production while reducing environmental effects. Modern agricultural operations generate data from a variety of sensors that allow a better understanding of the operating environment (weather conditions, knowledge of different crops and soils) and of the method itself. Data from machines lead to quicker results with greater efficiency and accuracy.
1.1 Precision Agriculture

Precision agriculture (PA) is a collection of technologies that helps agriculture move into a computerized order and is designed to help farmers obtain high yields by using fertilizers, water, pesticides, and other resources in a more regulated manner. The Precision Agriculture System (PAS), whose phases are shown in Fig. 1, is a term for a collection of emerging information technologies used to manage commercial agriculture on a large scale. It assures higher yield and lower input expenses by tracking site-specific ecology, thereby improving crop management, monitoring soil conditions in real time, automating various sensors, eliminating waste, and lowering labour costs. The predictive model is developed using a variety of features, and the parameters of the models are estimated during the training process using historical data. A portion of the historical data is held out for performance assessment during testing and is not used for training. Depending on the problems and questions, Machine Learning (ML) models may be predictive or descriptive: predictive models are used to make predictions about the future, while descriptive models are used to find clear insights from the collected data.

Fig. 1 Phases of precision agriculture

A Short Systematic Review (SSR) analysis
Fig. 2 Overview of machine learning approaches (pipeline: data set and testing data, machine learning algorithms, classify/predict rule, predicted output)
opens up new insights and aids new researchers in gaining a better understanding of the current state of the field.
1.2 Machine Learning Approaches

The performance of a machine learning model on a given task is measured with a performance metric that improves with experience. A variety of mathematical and statistical methods are used to estimate the output of machine learning algorithms and models. Once the learning process has been completed and the model has been trained, it can be used to forecast, cluster, or classify new examples using the knowledge gained during training. Figure 2 depicts an illustration of the machine learning approach.
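The train-then-evaluate loop described above can be sketched in plain Python, with a deliberately simple nearest-centroid classifier standing in for a real model; the synthetic two-class data and the class names are invented for illustration.

```python
import random

def train_centroids(samples):
    """'Training': average the feature vectors per class label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lbl: [v / counts[lbl] for v in acc] for lbl, acc in sums.items()}

def predict(centroids, features):
    """Assign the class whose centroid is nearest (squared distance)."""
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(features, c))
    return min(centroids, key=lambda lbl: dist(centroids[lbl]))

def accuracy(centroids, samples):
    """The performance metric computed on held-out examples."""
    hits = sum(predict(centroids, f) == lbl for f, lbl in samples)
    return hits / len(samples)

rng = random.Random(0)
data = [([rng.gauss(0, 1), rng.gauss(0, 1)], "dry") for _ in range(200)]
data += [([rng.gauss(3, 1), rng.gauss(3, 1)], "wet") for _ in range(200)]
rng.shuffle(data)
train, test = data[:300], data[300:]   # hold out data never seen in training
model = train_centroids(train)
print(accuracy(model, test) > 0.9)     # True: the model generalizes
```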
1.3 Review of Machine Learning Approaches

Supervised and unsupervised learning are the two main classes of ML tasks. When example inputs and corresponding outputs are provided, the aim of supervised learning is to learn a general rule that maps inputs to outputs. In reinforcement learning, inputs are only partly available, and feedback on behaviour arrives within a dynamic environment; in some instances, target outputs are missing.
430
S. Sakthipriya and R. Naresh
1.4 Analysis of Learning Models

Dimensionality Reduction (DR) is used in both supervised and unsupervised learning analysis; it aims to provide a more compact dataset in a low-dimensional representation while preserving as much information from the original data as possible. To avoid the effects of dimensionality, a DR model is applied in regression. A few of the most popular DR algorithms are: (a) PCA (Principal Component Analysis), (b) PLSR (Partial Least Squares Regression), and (c) LDA (Linear Discriminant Analysis).
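A minimal PCA sketch for the two-dimensional case, using the closed-form eigendecomposition of the 2x2 covariance matrix; the data are illustrative, stretched along the diagonal so that most variance lies on one axis.

```python
import math
import random

def pca_2d(points):
    """Project 2-D points onto their principal axis (a 2 -> 1 reduction),
    keeping the direction of maximum variance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # covariance matrix [[a, b], [b, c]]
    a = sum((x - mx) ** 2 for x, _ in points) / n
    c = sum((y - my) ** 2 for _, y in points) / n
    b = sum((x - mx) * (y - my) for x, y in points) / n
    # larger eigenvalue of the symmetric 2x2 matrix and its eigenvector
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    vx, vy = ((lam - c, b) if abs(b) > 1e-12
              else ((1.0, 0.0) if a >= c else (0.0, 1.0)))
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    # scores: coordinate of each centred point along the principal axis
    return [(x - mx) * vx + (y - my) * vy for x, y in points], lam

rng = random.Random(1)
# points stretched along y = x, so the principal axis is the diagonal
pts = [(t + rng.gauss(0, 0.1), t + rng.gauss(0, 0.1))
       for t in [rng.uniform(-2, 2) for _ in range(300)]]
scores, top_var = pca_2d(pts)
print(top_var > 1.0)  # most of the variance lies along a single axis
```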
2 Related Work

In this study, the findings of the present SSR are analyzed in clusters. These clusters reflect the use of machine learning in various stages of the agriculture supply chain, which includes preproduction, production, processing, and distribution. Likewise, Machine Learning is used for the Flora test, production management systems, disease detection techniques, soil classification, RFID sub-soil sensor systems, weather prediction, and livestock management [2] throughout the production phase.
2.1 Flora Test in Data Acquisition

In Precision Agriculture (PA), the data acquisition scheme is focused on the interconnection of portable devices such as the Flora test. Crop yield forecasting is used to keep fertilizer costs to a minimum. It has been demonstrated that, by offloading information processing to the server, common methods running on the server allow smart sensor devices in the data acquisition scheme to be simplified. With the development and deployment of such systems, the range of applications for the portable Flora test family has greatly expanded. GPN is a well-known data mining approach, comparable to propositional rule learning in terms of classification decision trees. The ability of pyramidal networks to adjust their structure in response to incoming information is one of their most notable features.
2.2 Production Management System

Sensor data is being transformed into real-time, artificial-intelligence-enabled programs that afford rich analysis and recommendations for cultivator action and decision-making, by applying machine learning techniques for the management of
farm systems [3]. The reviewed papers that demonstrate how machine learning technology can benefit agriculture have been categorized and filtered. The works examined were divided into four categories:
1. Crop management: includes applications of yield prediction, species recognition, crop quality, weed detection, and disease detection [4],
2. Livestock management: contains livestock production and animal welfare applications [5],
3. Water management,
4. Soil management.
2.3 Disease Detection Techniques

Machine Learning (ML) and Deep Learning (DL) used in farming [6] have been characterized into three sections:
1. Yield prediction
2. Price prediction
3. Leaf disease detection.
The survey results, based on a comparison of neural network models, show that deep learning models have the highest accuracy and outperform conventional image processing, whereas Machine Learning techniques outperform various traditional methods in prediction [7]. Deep Learning has a hierarchical structure that allows it to perform well in classification and prediction models. A complex model has been used in this study to detect leaf disease; its complexity allows it to solve more complicated problems with greater precision and less model error. DL extracts features from raw data through a technique called feature extraction; convolutions, pooling, gate activations, fully connected layers, memory cells, and other building blocks are included in DL.
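The convolution and pooling building blocks mentioned above can be shown in miniature. This 1-D, pure-Python sketch is illustrative only: the "bump" signal and edge kernel are made up, and the operation is strictly cross-correlation, as in most DL frameworks.

```python
def conv1d(signal, kernel):
    """Valid 'convolution' (cross-correlation): slide the kernel along the
    signal and take dot products, producing a feature map."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool(xs, size):
    """Keep the strongest response in each window: the downsampling step
    that gives deep networks hierarchical, translation-tolerant features."""
    return [max(xs[i:i + size]) for i in range(0, len(xs), size)]

signal = [0, 0, 1, 2, 1, 0, 0, 0]     # a 'bump', e.g. a lesion edge profile
edge_kernel = [-1, 0, 1]              # responds to rising and falling edges
feature_map = conv1d(signal, edge_kernel)
print(feature_map)                    # [1, 2, 0, -2, -1, 0]
print(max_pool(feature_map, 3))       # [2, 0]
```

Stacking such layers (with nonlinearities between them) is what gives DL the hierarchical structure the survey refers to.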
2.4 Soil Classification

The soil information set is taken from real soil data, and multiple machine learning algorithms were assessed to identify soil type, using the Rapid Miner software's SVM, decision tree, Naive Bayes, and other algorithms. The SVM, with a linear kernel function, outperforms the other algorithms with the highest precision (83.52%). The soil dataset is based on real-world private soil data. Table 1 shows the different types of soil in the agro-zone districts of Tamil Nadu. Hand boring was used to collect the soil samples, which were delivered to the lab for chemical and physical analysis; the laboratory work was done by a third-party lab. There are ten attributes and twelve classes in total. The ten attributes are the Atterberg limits (PL, LL, IP), specific gravity (Gs), particle-size distribution (gravel, silt, sand, PSF clay), and percent (%) finer. The classes of 12
Table 1 Types of soil in agro zone districts of Tamil Nadu

- North eastern zone (Kancheepuram, Thiruvallur, Vellore, Cuddalore, Tiruvannamalai and Villupuram): 1. Red sandy, 2. Saline coastal alluvium, 3. Clay loam
- North western zone (Namakkal, Dharmapuri and Salem): 1. Calcareous black, 2. Non-calcareous red, 3. Non-calcareous brown
- Western zone (Erode, Coimbatore, Tiruppur, Theni, Karur, Namakkal, Dindigul (part)): 1. Red loamy, 2. Black
- Cauvery delta zone (Trichy, Perambalur, Thanjavur, Nagapattinam, Ariyalur, Thiruvarur, Pudukkottai (part) and Cuddalore (part)): 1. Alluvium (old delta), 2. Red loamy (new delta)
- Southern zone (Tirunelveli, Thoothukkudi, Sivagangai, Madurai, Ramanathapuram and Virudhunagar): 1. Coastal alluvium, 2. Red sandy soil, 3. Deep red soil, 4. Black
- High rainfall zone (Kanyakumari): 1. Saline coastal, 2. Alluvium, 3. Deep red loam
- Hilly zone (Dindigul (part) and Nilgiris): 1. Lateritic
are MH, CH, MH&OH, SC, CL, CL-ML, MH-OH, ML, ML-OL, SM, SC-SM, OH, and SC (Table 2).
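To make the soil-classification workflow concrete without reproducing the Rapid Miner/SVM setup, here is a toy two-class decision-stump learner on two Atterberg-style features. The sample values and the CH/ML split are hypothetical, chosen only so that the liquid limit separates the classes; the real study used ten attributes and twelve classes.

```python
def best_stump(rows):
    """Learn a one-split decision stump: the (feature, threshold) pair that
    best separates two soil classes on the training data."""
    best = None
    labels = sorted({y for _, y in rows})       # exactly two classes assumed
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for left in labels:
                right = labels[1] if left == labels[0] else labels[0]
                hits = sum((left if x[f] <= thr else right) == y
                           for x, y in rows)
                if best is None or hits > best[0]:
                    best = (hits, f, thr, left, right)
    return best[1:]

def classify(stump, x):
    f, thr, left, right = stump
    return left if x[f] <= thr else right

# hypothetical (liquid limit LL, plasticity index IP) samples; in the
# Unified system, CH soils have LL above ~50 and ML soils below
rows = [([60, 35], "CH"), ([70, 40], "CH"), ([55, 30], "CH"),
        ([30, 10], "ML"), ([40, 15], "ML"), ([25, 8], "ML")]
stump = best_stump(rows)
print(classify(stump, [65, 38]))  # CH
```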
2.5 Production Phase

Machine Learning is used in the production phase for a variety of tasks: weather prediction, crop prediction, and smart farming prediction.
2.5.1 Weather Prediction
Weather forecasting is crucial during the crop production process. Forecasts of rainfall, sunshine, humidity, temperature, and moisture guide the best use of water for crop irrigation, preparation, and scheduling. For weather forecasting, ML algorithms use both supervised and unsupervised methods [16].
2.5.2 Crop Prediction
Deploying machine learning is one of the sophisticated approaches to crop yield prediction (Fig. 3) for sowing suitable crops. The Naive Bayes Gaussian classifier, a supervised learning algorithm, uses the information collected about the crop seeds. Suitable parameters like humidity, moisture content, and temperature help
Table 2 Survey table for machine learning methods used in farming systems

1. [2017] Applying normal data acquisition frameworks with commonly applied techniques on the server to simplify smart sensor gadgets by moving information processing to the server [8]. Virtues: to impressively grow crops with the use of the compact gadget "Flora test" family. Algorithm: growing pyramidal network. Prediction attributes: climate, sensor. Harvest: plant state.

2. [2018] The high-level design of the RFID reader and the RFID sub-soil sensor hub is presented [9]. Virtues: the high-level design and RFID-based sub-soil sensing system support further advances in PA. Algorithm: RFID, RFIS. Prediction attributes: temperature. Harvest: sub-soil sensing.

3. [2018] WSN is a promising technology for controlling, real-time tracking, and analyzing many routing protocols, such as AOMDV, AODV, DSR, and Integrated MAC, as well as precision agriculture using WSN with the IMR routing protocol [10]. Virtues: a routing algorithm with integrated MAC for precision cultivation using WSN, to extend the network lifetime. Algorithm: Integrated MAC, AOMDV, AODV, DSR, and WSN routing protocols. Prediction attributes: temperature, humidity, soil pH. Harvest: fertilizers, water, pesticides.

4. [2017] An ML algorithm is applied to compare and classify various soil types, automating soil-type classification [11]. Virtues: automatically classifies soil types with a high degree of precision (>70%); attribute increase or class reduction improves SVM accuracy in Rapid Miner. Algorithm: neural network, SVM, decision tree, and Naïve Bayes. Prediction attributes: soil type. Harvest: classification and accuracy.

5. [2018] Management of farm systems is evolving into real-time, artificial-intelligence-enabled programs that use machine learning to provide rich farmer decision-making [12]. Virtues: deployment and decision-making at ML production levels, improving the quality of bio-products. Algorithm: artificial neural networks, support vector machines, ensemble learning. Prediction attributes: crop, water, soil, and livestock management. Harvest: crop quality, species recognition, disease detection, and weed detection.

6. [2018] To cultivate soil in the IoT using low-cost hardware such as sensors and actuators, as well as communication technologies, through the design of functionality in IoT paradigms [13]. Virtues: farmers can use decision trees to simplify the implementation and construction of current smart systems and new amenities. Algorithm: edge and fog computing paradigms. Prediction attributes: soil. Harvest: crop screening, predictive analysis, weather-condition forecasting.

7. [2019] Crop and weather types can be forecasted using useful datasets to assist cultivators by predicting the most profitable crops to grow [14]. Virtues: smart systems that provide real-time recommendations and long-term forecasts based on customer choices. Algorithm: Bayesian network, artificial neural network. Prediction attributes: types of soil, weather, pressure, and crop type. Harvest: crop yield and cost prediction.

8. [2020] Applying ML and DL is cheap, practical, and easy to set up; using neural network models to analyze different limitations in the dataset increases the accuracy of the dataset [1]. Virtues: ML and DL techniques have made it easier for farmers and government agencies to predict the future and increase crop productivity. Algorithm: sequential minimal optimization, REPTree, ANN multilayer perceptron, Naïve Bayes. Prediction attributes: rainfall, temperature. Harvest: crop yield and crop price.

9. [2020] Different AI techniques have been proposed as part of a paradigm for crop yield in Precision Agriculture, more specifically "crop recommender systems"; they include diverse procedures such as similarity models, KNN, neural networks, and ensemble models [15]. Virtues: external factors such as meteorological data and temperature are used to provide the recommendations, resulting in higher yields with less use of energy. Algorithm: random forest, decision table, logistic regression, Naive Bayes. Prediction attributes: fertilizers, topography, temperature, and rainfall. Harvest: predict the weather accurately.

10. [2020] The agriculture supply chain benefits from machine learning methods, which contribute to ASC sustainability and efficiency [4]. Virtues: analyzing data in ML applications from different technologies and various phases of the ASC. Algorithm: supervised, unsupervised, and reinforcement learning. Prediction attributes: water conservation, soil, plant health. Harvest: decision-making in the preproduction phase.
the crops achieve successful growth. An Android mobile application has been developed so that farmers can obtain the parameters and location automatically through the application to begin the crop prediction [17].
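A minimal Gaussian Naive Bayes sketch of the crop prediction step described above; the crops and the (humidity, moisture, temperature) readings are invented for illustration and are not from the cited application.

```python
import math
from collections import defaultdict

def fit_gnb(rows):
    """Estimate a class prior and per-feature mean/variance (Gaussian NB)."""
    by_class = defaultdict(list)
    for features, crop in rows:
        by_class[crop].append(features)
    model = {}
    for crop, feats in by_class.items():
        stats = []
        for col in zip(*feats):
            mu = sum(col) / len(col)
            var = sum((v - mu) ** 2 for v in col) / len(col) + 1e-6
            stats.append((mu, var))
        model[crop] = (len(feats) / len(rows), stats)
    return model

def predict_crop(model, features):
    """Pick the crop with the highest posterior log-probability."""
    def log_post(prior, stats):
        lp = math.log(prior)
        for x, (mu, var) in zip(features, stats):
            lp += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return lp
    return max(model, key=lambda crop: log_post(*model[crop]))

# hypothetical (humidity %, moisture %, temperature C) readings per crop
rows = [([80, 60, 27], "rice"), ([78, 55, 29], "rice"), ([82, 65, 26], "rice"),
        ([40, 20, 22], "millet"), ([35, 18, 24], "millet"), ([45, 25, 21], "millet")]
model = fit_gnb(rows)
print(predict_crop(model, [79, 58, 28]))  # rice
```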
3 Precision Agriculture System

Precision agriculture has a simple description: "Doing the right thing in the right place at the right time" [11]. The architecture in a PA scenario consists of three levels: cloud services, fog, and edge. Precision agriculture is made up of a series of technologies that include an information management system, sensors, and improved machinery
Fig. 3 Crop prediction flowchart (stages: data, data pre-processing, feature extraction, data splitting, Naive Bayes Gaussian classifier, crop yield)
that maximize production while considering the variability and complexity of farming systems. It is a crop yield management model that relies on observing and measuring inter- and intra-field crop variability to improve yield predictability.
3.1 Precision Agriculture Routing Protocols Using WSN

The parameters have to be monitored by a remote server so that proper action can be taken. Likewise, an actuator or an automated framework can be used to take suitable action based on the measured parameters over a period of time. The Wireless Sensor Network (WSN) is a promising technology for monitoring, controlling, and analyzing various routing protocols, such as Adhoc On-demand Multipath Distance Vector routing (AOMDV), Adhoc On-demand Distance Vector routing (AODV), Dynamic Source Routing (DSR), and the Integrated MAC and Routing protocol (IMR), for achieving precision agriculture using WSN. WSN nodes are capable of performing three tasks: communication, sensing, and computing, and each node can sense environmental conditions such as temperature, pH, humidity, and pressure. Every sensor node consists of four units: a processing unit, a transceiver, a sensing unit, and a battery source. Another routing protocol is designed for a static, hierarchical architecture of sensors with low requirements for sensor mobility. In such scenarios, the address wastage inherent in ZigBee's Cskip algorithm is avoided. Building a multi-level topology ensures
Fig. 4 SLR methodology in PLF (stages: identification, define review protocol, conducting, development, analysis, revision)
that routing is maintained at its most efficient level. An overview of the categories of the SLR domain in PLF is presented in Fig. 4.
3.2 IoT Computing Architecture in Precision Agriculture

Soil cultivation has opened up lucrative avenues in the Internet of Things through the use of low-cost hardware such as communication technology and sensors/actuators. These emerging opportunities include remote equipment and field tracking, predictive analytics, crop weather forecasting, warehousing, and smart logistics. On the other hand, farmers are specialists in agriculture but lack familiarity with Internet of Things applications (Fig. 5). Users of IoT applications must take part in the creation of applications, which will improve their integration and usability. Farmers and growers are looking at different industrial agricultural facilities to see whether they can design new functionalities based on Internet of Things paradigms. The user-centred design model is used to gain knowledge and experience in this sense of incorporating technology in farming
Fig. 5 Precision agriculture system
Table 3 General acronyms

LSTM: Long short-term memory
AO-DVR: Adhoc on demand-distance vector routing
PLF: Precision livestock farming
AO-MDVR: Adhoc on demand-multipath distance vector routing
NCDC: National cooperative development corporation
DSR: Dynamic source routing
RFID: Radio-frequency identification
PAS: Precision agriculture system
UCD: User-centred design
ASC: Agriculture supply chain
LFS: Livestock farming systems
SSR: Short systematic review
PAFS: Precision agriculture farming system
ANN: Artificial neural networks
NB: Naive Bayesian
BM: Bayesian model
CNN: Convolutional neural networks
MPR: Multivariate polynomial regression
applications [18]. IoT paradigm tools aid in decision-making. IoT architecture, smart processes, and operating rules [19] are executed on edge-fog computing paradigms in a distributed model. Precision farming is a collection of technologies that uses information systems, sensors, better machinery, and informed management to optimize production while accounting for the variability and uncertainty that exist in farming systems. It is a farming management principle that entails keeping track of, evaluating, and reacting to crop variability in and around fields. Table 3 lists the general acronyms of ML and AI terms used in precision agriculture systems.
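An edge-tier operating rule of the kind executed in such edge-fog deployments can be sketched as below; the sensor thresholds and action names are assumptions, not taken from the cited works. Acting locally on routine readings and forwarding only anomalies is the design choice that makes the edge tier save bandwidth and latency.

```python
def edge_decision(reading):
    """Edge node applies a local operating rule; only unusual readings are
    forwarded to the fog/cloud tier (thresholds are hypothetical)."""
    soil_moisture, temperature = reading
    if soil_moisture < 20:
        return ("actuate", "start-irrigation")   # act locally, low latency
    if temperature > 45 or soil_moisture > 90:
        return ("forward", "cloud-analytics")    # anomaly: escalate upward
    return ("drop", None)                        # normal reading: no traffic

print(edge_decision((15, 30)))  # ('actuate', 'start-irrigation')
print(edge_decision((50, 30)))  # ('drop', None)
```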
4 Comparison Accuracy

ML techniques have previously been applied successfully to different tasks in precision agriculture. In the last decade, Machine Learning techniques and sensing technologies have developed rapidly. These advances are expected to continue with more comprehensive and cost-effective datasets, combined with solutions based on more sophisticated algorithms that allow state estimation of the environment, decision-making, and better crop yield. This study mainly reveals that:
S. Sakthipriya and R. Naresh
Fig. 6 Comparison accuracy graph of ML algorithms (accuracy, testing data, and training data as data percentages per algorithm)
(a) Neural networks: more precise crop yield estimation, and identification of the importance of different vegetation indices.
(b) Support vector machine (least squares): regression analysis to compute nitrogen status.
(c) Fuzzy cognitive map: models and represents expert knowledge for yield prophecy and crop management.
The accuracy of the ML algorithms, with the data split 50%/50% into testing and training sets, is represented graphically in Fig. 6 and as a numerical chart in Fig. 7.
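The 50%/50% train/test evaluation described above can be sketched in a few lines. The sketch below uses invented synthetic data and a simple 1-nearest-neighbour classifier together with a majority-class baseline (not the models from the survey) only to show how split accuracies like those in Figs. 6 and 7 are computed.

```python
import random

random.seed(42)

def make_samples(n):
    """Two synthetic 'crop condition' classes separated along one feature."""
    data = []
    for _ in range(n):
        label = random.randint(0, 1)
        x = random.gauss(2.0 * label, 0.6)   # informative feature
        y = random.gauss(0.0, 0.5)           # uninformative feature
        data.append(((x, y), label))
    return data

def one_nn_predict(train, point):
    """Label of the closest training sample (1-nearest neighbour)."""
    px, py = point
    best = min(train, key=lambda s: (s[0][0] - px) ** 2 + (s[0][1] - py) ** 2)
    return best[1]

def accuracy(train, test, predict):
    hits = sum(1 for point, label in test if predict(train, point) == label)
    return 100.0 * hits / len(test)

samples = make_samples(200)
train, test = samples[:100], samples[100:]   # 50% training / 50% testing split

acc_1nn = accuracy(train, test, one_nn_predict)
majority = max((0, 1), key=lambda c: sum(1 for _, lbl in train if lbl == c))
acc_base = accuracy(train, test, lambda tr, p: majority)

print(f"1-NN accuracy: {acc_1nn:.2f}%, majority baseline: {acc_base:.2f}%")
```

The same split-and-score procedure applies regardless of the classifier; only the model behind `predict` changes.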
5 Conclusion

The conclusion of this short PAS analysis is that machine learning has substantial potential for use in the various PAS phases. The findings show that machine learning-driven technologies can help PAS improve overall productivity and address a variety of issues that farmers face, including crop yield knowledge, soil types (health), irrigation management, field, livestock (PLF), nutrients, and other information. The data collected by the PAS sensors from various sources are used to make predictions and classifications using various machine learning algorithms. This work reveals that, depending on the agricultural environment studied, the research goal, and the availability of data, the reviewed articles employ a variety of features. Every paper investigates crop yield prediction with machine learning, but they vary in features such as
Fig. 7 Comparison accuracy chart of ML algorithms (50% testing data, 50% training data for every algorithm):

Algorithm:    RF | Polynomial Regression | SVM   | Neural N/W | Naive Bayes | D-Tree
Accuracy (%): 97 | 88                    | 82.35 | 76.47      | 71.76       | 68
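The comparison in Fig. 7 can equally be read as a ranked table; the short sketch below only re-tabulates the chart's accuracy values (50/50 split) in sorted order.

```python
# Accuracy values taken directly from Fig. 7 (50% testing / 50% training).
accuracies = {
    "Random Forest": 97.0,
    "Polynomial Regression": 88.0,
    "SVM": 82.35,
    "Neural Network": 76.47,
    "Naive Bayes": 71.76,
    "Decision Tree": 68.0,
}

# Rank the algorithms from best to worst reported accuracy.
ranked = sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True)
for rank, (name, acc) in enumerate(ranked, start=1):
    print(f"{rank}. {name:<22} {acc:6.2f}%")
```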
atmospheric factors, size, geological location, and crop. According to studies, models with more features do not always have the best yield prediction results. Therefore, models with more features and models with fewer features should both be reviewed to find the best-performing model. Several algorithms have been used in various studies; however, a few machine learning models, such as random forest, polynomial regression, neural networks, SVM, decision tree, Naive Bayes, linear regression, and gradient boosting tree, are used frequently, with neural networks being the most widely used machine learning algorithm.
References

1. J. Astill, R.A. Dara, E.D.G. Fraser, B. Roberts, S. Sharif, Smart poultry management: smart sensors, big data, and the internet of things. Comput. Electron. Agric. 170, 105291 (2020)
2. C. Bahlo, P. Dahlhaus, H. Thompson, M. Trotter, The role of interoperable data standards in precision livestock farming in extensive livestock systems: a review. Comput. Electron. Agric. 156, 459–466 (2019)
3. M.P. Mcloughlin, R. Stewart, A.G. McElligott, Automated bioacoustics: methods in ecology and conservation and their potential for animal welfare monitoring. J. R. Soc. Interface 16(155), 20190225 (2019)
4. R. Dhaya, Flawless identification of fusarium oxysporum in tomato plant leaves by machine learning algorithm. J. Innov. Image Process. (JIIP) 2(04), 194–201 (2020)
5. R. Garcia, J. Aguilar, M. Toro, A. Pinto, P. Rodriguez, A systematic literature review on the use of machine learning in precision livestock farming. Comput. Electron. Agric. 179, 105826 (2020)
6. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019)
7. A. Sungheetha, R. Sharma, Design an early detection and classification for diabetic retinopathy by deep feature extraction based convolution neural network. J. Trends Comput. Sci. Smart Technol. (TCSST) 3(02), 81–94 (2021)
8. K.R. Suryawanshi, S.M. Redpath, Y.V. Bhatnagar, U. Ramakrishnan, V. Chaturvedi, S.C. Smout, C. Mishra, Impact of wild prey availability on livestock predation by snow leopards. Roy. Soc. Open Sci. 4(6), 170026 (2017)
9. W. Chuan, G. Danielle, R. Peter Green, Development of plough-able RFID sensor network systems for precision agriculture, School of Electrical and Electronic Engineering (IEEE, 2018), pp. 4799–2300
10. K.G. Liakos, P. Busato, D. Moshou, S. Pearson, D. Bochtis, Machine learning in agriculture: a review. MDPI Sensors 18, 2674 (2018)
11. S. Pudumalar, E. Ramanujam, R. HarineRajashree, C. Kavya, T. Kiruthika, J. Nisha, Crop recommendation system for precision agriculture, in 2016 8th International Conference on Advanced Computing (ICoAC) (IEEE, 2017), pp. 32–36
12. F.J. Ferrández-Pastor, J.M. García-Chamizo, M. Nieto-Hidalgo, J. Mora-Martínez, Precision agriculture design method using a distributed computing architecture on internet of things context. MDPI Sensors 18, 1731 (2018)
13. A. Chlingaryan, S. Sukkarieh, B. Whelan, Machine learning approaches for crop yield prediction and nitrogen status estimation in precision agriculture: a review. Comput. Electron. Agric. (2018)
14. S.R. Rajeswari, P. Khunteta, S. Kumar, A.R. Singh, V. Pandey, Smart farming prediction using machine learning. Int. J. Innov. Technol. Explor. Eng. 8(07) (2019)
15. P. Shine, J. Upton, P. Sefeedpari, M.D. Murphy, Energy consumption on dairy farms: a review of monitoring, prediction modelling, and analyses. Energies 13(5), 1288 (2020)
16. M. Benjamin, S. Yik, Precision livestock farming in swine welfare: a review for swine practitioners. Animals 9(4), 133 (2019)
17. J.S. Manoharan, Study of variants of extreme learning machine (ELM) brands and its performance measure on classification algorithm. J. Soft Comput. Paradigm (JSCP) 3(02), 83–95 (2021)
18. T. Vijayakumar, Comparative study of capsule neural network in various applications. J. Artif. Intell. 1(01), 19–27 (2019)
19. N. Gobalakrishnan, K. Pradeep, C.J. Raman, L. Javid Ali, M.P. Gopinath, A systematic review on image processing and machine learning techniques for detecting plant diseases, in 2020 International Conference on Communication and Signal Processing (ICCSP) (IEEE, 2020), pp. 0465–0468
Improving System Performance Using Distributed Time Slot Assignment and Scheduling Algorithm in Wireless Mesh Networks K. S. Mathad and M. M. Math
Abstract Wireless Mesh Network (WMN) is an emerging technology that provides users easy Internet access. In a WMN, the medium access control layer has to schedule data transmission in a dynamic environment, where the available channels change with space and time. Distributed scheduling in multi-radio, multi-channel WMNs is a challenging issue. To use the Internet in a WMN, the nodes must associate with one of the mesh routers. A router has many radio interfaces, which are assigned to partially overlapping channels; this improves the capacity of a WMN. The channel selection in the available time slot is scheduled efficiently so that the performance of the WMN increases. The proposed algorithm, Distributed Time based Schedule for Channel Assignment, is a distributed solution that performs the assigning of channels. The simulation results achieve a convincing 19 percent enhancement in system performance while using around 66 percent less time for scheduling.

Keywords Channel selection · Performance · Multi-radio · Capacity · Delay
1 Introduction

The number of mobile devices used to access the Internet, and of users using such devices, is growing rapidly. WMNs comprise mesh routers and mesh clients, as shown in Fig. 1. WMNs have static mesh routers that act as a wireless backbone and give multi-hop connection to the mesh clients through the Internet. In WMNs, the nodes establish and maintain connections in a mesh. WMNs provide advantages such as low installation cost, wide-area deployment, reliability, and self-management. As an integral component of broadband Internet applications today, a WMN is structured as a topological mesh of a large number of nodes with ad hoc combinations of smart two-way

K. S. Mathad (B) · M. M. Math, KLS Gogte Institute of Technology, Belagavi, Karnataka, India e-mail: [email protected]; M. M. Math e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_33
Fig. 1 Wireless mesh networks with mesh routers and mesh clients
transmission devices [1]. A mesh router provides many kinds of features depending on the purpose of communication. A few mesh routers act as gateways that provide communication between the wired and wireless networks. Some of the routers provide routing features that create a multi-hop WMN, and some behave like access points. The capacity of a WMN increases when mesh routers transmit and receive data concurrently over multiple radio interfaces. This also requires channel assignment, which improves the performance of the network. Throughput can be increased by assigning the interfaces to orthogonal channels so that interference does not occur when adjoining routers communicate. Different WMNs support different numbers of orthogonal channels; some have only 3, while others have 12. Multi-channel, multi-radio WMNs can be realized by appropriate channel allocation. Distributed algorithms can proficiently use the radio channel assets and reutilize the time slots more effectively. The exchange of too many data packets among neighbor nodes will lead to the non-convergence of the allocation time and the number of rounds when continuous time slot requests occur [2]. The proposed algorithm is Distributed Time based Schedule for Channel
Assignment (DTSC), which improves the performance for the user by allowing the user to access more data. The algorithm provides a schedule to assign channels in a multi-radio, multi-channel WMN. This paper is organized as follows. Section 2 provides the details of previous work carried out in the proposed research area. Section 3 summarizes the mathematical model formulated for the proposed work. Section 4 describes the performance analysis of the proposed algorithm. Finally, the conclusion is given in Sect. 5.
2 Related Work

The capability of a WMN decides how a node performs a task. More reliable and better-performing wireless network structures are required because of the quick development of mobile services and the growing number of mobile devices [3]. Early WMNs used single-radio mesh routers provided with a single channel; because of the single channel, the nodes had poor performance. Efficient channel assignment by the mesh routers increases the capability of WMNs. Nowadays, mesh routers are provided with multiple radio interfaces, and mesh routers furnished with multi-radio, multi-channel assignment improve the overall performance of wireless mesh networks. The IEEE 802.11 standard is followed in WMNs, and its versions provide different numbers of orthogonal channels. An efficient channel assignment algorithm is required to handle a larger number of radio interfaces with a smaller number of orthogonal channels. The capability of WMNs increases by assigning partially overlapped and orthogonal channels. To avoid collisions between separate transmissions, TDMA allows only non-interfering transmissions in every time slot. Gateways are also mesh routers, with the special functionality of bridging, and are used to communicate with the outside network [4]. Channel interference and connectivity between the networks are co-related in wireless mesh networks. A channel is shared by the mesh routers so as to minimize co-channel interference while maintaining connection with the backbone of the WMN. The use of the same channel by many mesh routers must be avoided to decrease co-channel interference, yet to stay connected with the backbone of the WMN, the mesh routers have to share a common channel. Scheduling the assignment of channels is therefore a difficult task, which has to consider both the usage of the shared channel and the connection with the backbone of the network.
For appropriate channel allotment schemes, the assumptions made about the traffic load decide the proper channel that should be assigned. The protocol interference model is employed to investigate the interference between transmitters [5]. A lot of work has been carried out on the assignment of channels in multi-radio, multi-channel WMNs. The schedule for assigning channels has been produced in different schemes, which differ in their approach to getting the solution: some are centrally coordinated and others are distributed. In the centrally coordinated approach, one of the nodes acts as a base station for assigning channels. In the distributed approach, the task of assigning channels is scheduled by the distributed nodes
Fig. 2 Varieties scheme of assigning channels
among the network. The nodes performing these tasks in the centralized and distributed approaches are mesh routers. In the distributed approach, each mesh router assigns itself channels on the appropriate radio interfaces. Based on the time slots for assigning channels, the schemes are differentiated as stationary (static), effective (dynamic), and varieties (hybrid) schemes. In the stationary scheme, a set of channels is assigned to a set of radio interfaces, and changes are not allowed later. By assigning suitable channels, the resulting performance of the network can be improved, which in turn reduces the network interference [6]. In the effective scheme, the assignment of channels to radio interfaces happens at runtime. The varieties scheme combines the approaches of the stationary and effective schemes; it is shown in Fig. 2. In the varieties scheme, the node that wants to transmit data informs all its neighbors which fixed channels are available for transmission; a node attaches an interface to a channel and transmits the packets. This scheme adapts to varying traffic easily. There are different ways of assigning channels: some are focused on association awareness, some aim at improving throughput, and others are associated with Quality of Service (QoS) parameters. This paper proposes the Distributed Time based Schedule for Channel Assignment algorithm, which focuses on allowing a node to access more data by assigning channels in the mentioned schedule, working with a multi-radio, multi-channel WMN.
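As an illustration of the stationary scheme described above, channel assignment under an interference graph can be sketched as greedy graph colouring: each router picks a channel unused by its interfering neighbours, and reuse is forced only when the channels run out. The topology, router names, and the three-channel set below are invented for the example; this is not an algorithm from the paper.

```python
CHANNELS = [1, 6, 11]          # e.g. three orthogonal 2.4 GHz channels

# Hypothetical interference graph: router -> routers within interference range
neighbours = {
    "MR1": ["MR2", "MR3"],
    "MR2": ["MR1", "MR3", "MR4"],
    "MR3": ["MR1", "MR2"],
    "MR4": ["MR2"],
}

def assign_channels(graph, channels):
    """Greedy colouring: take the first channel unused by any neighbour."""
    assignment = {}
    for router in sorted(graph):               # deterministic order
        taken = {assignment[n] for n in graph[router] if n in assignment}
        free = [c for c in channels if c not in taken]
        assignment[router] = free[0] if free else channels[0]  # reuse if forced
    return assignment

plan = assign_channels(neighbours, CHANNELS)
print(plan)
```

With this topology the greedy pass keeps every pair of interfering routers on distinct channels, while MR4, which only interferes with MR2, reuses channel 1.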
3 Effective Distributed Time-Based Schedule for Channel Assignment

This approach provides a schedule for assigning channels to access more data in multi-radio, multi-channel WMNs. Here, the mesh routers provide multi-radio interfaces that are used for accessing the medium, and the approach provides the schedule for possible routes. Reliability is often increased, but this affects network overhead, which rises linearly with the size of the network [7]. The interfaces connect the mesh routers and also schedule packet forwarding in the wireless mesh network. The interfaces also help in accessing the network, where one node can associate with one mesh router. Some of the mesh routers can help in connecting the WMN with the wired network. Even though any node in the network can act as the Trust Centre, by default it is the coordinator [8]. The WMN supports connectivity with
multiple hops, achieved by mesh routers providing multi-hop connections. One of the mesh routers may act as a gateway in the WMN. The communication is completed by transfer of data from the gateway to a mesh router and then to the corresponding node with which the mesh router is associated. Each node waits for a packet and receives it upon successful delivery. Mesh routers are involved in transmission and are assumed not to cause interference. Since it does not depend on the number of nodes, the average throughput given to all the users is the same. The assumption made here is that all packets are of the same size. Nodes with weak-quality links may take more time to receive the packets, while nodes with good-quality links receive the packets in the exact estimated time. The WMN comprises mesh routers and mesh clients. The collection of mesh routers is denoted by MR = {M1, M2, M3, ..., Mi}, and the collection of mesh clients is denoted by MC = {C1, C2, C3, ..., Cj}. A mesh client Cp communicates with a mesh router Mq using packets of size pl. The time rate of packet transmission can be represented by

T(pl) = Sr(Mq)/Nc    (1)

where Sr(Mq) is the instantaneous sending rate and Nc is the number of channels. The time rate gives the rate at which transmitted packets reach the mesh clients as the mesh routers send them. Mesh routers send packets to the mesh clients within their reach. Each mesh router tries to access the channel for packet transmission; it does not get complete access to the channel, but rather a span of time in which to access it. The schedule for channel access depends on the mesh routers involved in accessing the channel. The total throughput that a mesh client Cq obtains through channel usage and switching between different mesh routers is given by Eq. (2):

ThrTot(Cq) = Cap * T / T(pl)    (2)
where ThrTot is the total throughput achieved by the mesh client, Cap is the medium capacity, and T is the time taken for transmission. During node starvation, uniformity at the node level can be defined as the reduction of node starvation by reducing total interference, which in turn improves the performance of the network [9]. Each mesh router accounts for both the time span used for transmitting and receiving packets and its idle state. The channel switching is calculated as the total number of channels divided by the number of mesh routers. Since each mesh router transmits packets to the mesh clients in its range, the total throughput achieved is the same for all mesh clients, and the switching of channels by the mesh routers in multi-radio, multi-channel WMNs results in the same number of packets being sent by each mesh router to its mesh clients. The time taken for the transmission of packets of the same size with throughput ThrTot is given by Eq. (3):
T = (N * Siz)/ThrTot    (3)
where N is the count of the packets to be sent and Siz is the size of each packet. The end-to-end throughput from source to destination is computed by Eq. (4):

ThrSD(Cn) = T*W + (1 - W)*ThrTot    (4)
where W is the weight accounting for the delay involved in switching from one channel to another while sending the packets. W is calculated based on the number of mesh clients in the range of the mesh router and the number of channels supported on each radio. The selection of a link can be made by collecting information about routes with less interference among links, which has a great influence on the whole network's throughput [10].

ALG DTSC(MRi, MCj)
// input:  MRi is the collection of mesh routers in the WMN
//         MCj is the collection of mesh clients in the WMN
// output: assignment of channels using DTSC
Start
  Calc(ThrTot)
  T = (N * Siz) / ThrTot
  ThrSD(Cn) = T*W + (1 - W)*ThrTot
  If (ThrSD(Cn) <= ThrTot) && (T > Tthres)
    DTSC(MRi, MCj)
    Assign channel with Max{ThrSD(Cn)}
    Broadcast the same to all mesh routers
End
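Eqs. (1) to (4) and the DTSC selection rule can be sketched numerically. All sending rates, capacities, the window T, and the threshold Tthres below are invented values for illustration, not parameters from the paper's simulation.

```python
def t_rate(sr_mq, n_c):
    """Eq. (1): time rate of packet transmission, T(pl) = Sr(Mq)/Nc."""
    return sr_mq / n_c

def thr_tot(cap, t, t_pl):
    """Eq. (2): total throughput through channel usage and switching."""
    return cap * t / t_pl

def tx_time(n, siz, thrtot):
    """Eq. (3): time to send N packets of size Siz at throughput ThrTot."""
    return (n * siz) / thrtot

def thr_sd(t, w, thrtot):
    """Eq. (4): end-to-end throughput with switching weight W."""
    return t * w + (1 - w) * thrtot

# hypothetical per-channel (sending rate, medium capacity) candidates
candidates = {1: (4.0, 6.0), 6: (2.0, 5.0), 11: (8.0, 4.0)}
N_C, W, T_WINDOW = 2, 0.1, 1.0          # channels per radio, weight, window
N_PKT, SIZ, T_THRES = 200, 0.01, 0.2    # packet count, size, time threshold

best, best_thr = None, -1.0
for ch, (sr, cap) in candidates.items():
    t_pl = t_rate(sr, N_C)
    total = thr_tot(cap, T_WINDOW, t_pl)
    t = tx_time(N_PKT, SIZ, total)
    sd = thr_sd(t, W, total)
    # DTSC rule: act when end-to-end throughput lags ThrTot and the
    # transfer time exceeds the threshold; keep the channel with max ThrSD
    if sd <= total and t > T_THRES and sd > best_thr:
        best, best_thr = ch, sd

print("assigned channel:", best)
```

With these invented numbers, channel 6 satisfies both conditions and yields the largest end-to-end throughput, so it would be assigned and broadcast to the other mesh routers.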
4 Performance Evaluation

The proposed DTSC algorithm is assessed with the ns-3 simulator. ns-3 is an open-source discrete-event simulator that provides many attributes for simulating the proposed algorithm. All the entities of the wireless network can be modeled like the real-life devices of wireless mesh networks. The IEEE 802.11 standard, including the medium access control layer features, can be simulated in ns-3, and it models signals as they occur in real time. Usually, in the normal operation of the IEEE 802.11s standard, the mesh clients access data through a mesh router. When the throughput becomes lower than the baseline, the mesh router selects a channel from the remaining channels. The decision to select
Table 1 Simulation parameters

S. No | Network parameter            | Value
1     | Network size                 | 100 m × 100 m
2     | Number of mesh devices       | 1200
3     | Modulation scheme            | BPSK, QPSK and QAM
4     | Number of frequency channels | 2
5     | Number of time slots         | 8 µs
6     | Bandwidth                    | 1000–2000 kBps
7     | Mobility of devices          | 3 cycles per frame
8     | Coding rate                  | 0.75
9     | Message information size     | 40 bytes
from the remaining channels is made such that it gives more throughput to the mesh client. Attempts to join the multipath transmission method, to find the optimal two-route parallel transmission between the source node and the destination node, can enhance the timeliness and reliability of the protocol [11]. In this simulation, the alternative algorithm used for comparison is Acquired Data Tenacity (ADT). This algorithm is the normal operation of the 802.11 standard with the support of the medium access control layer; it is simulated to compare it with the proposed algorithm (Table 1). In this simulation, a distributed environment is considered, where the mesh clients and mesh routers are randomly distributed. The WMN comprises mesh routers, mesh clients, and the gateway node, which is joined to the wired network. Neighbors in a grid network interact directly, but over multiple hops, the paths to the farthest nodes are reached by the 802.11s routing protocol. The WMN also consists of hops, where each mesh router considers the number of hops as a measure of distance during packet transfer. Each mesh router receives and sends packets. The packet size is considered to be 1000 bytes, and the packet rate is 1 Mbps. The simulation is carried out for 300 s with an increasing number of mesh clients. Each mesh router is associated with a collection of mesh clients. As mesh clients are added, the throughput of the system increases, and the stated algorithm's performance is analyzed and compared with that of the ADT algorithm. The ADT algorithm selects the channel that gives more throughput to the mesh client. The throughput achieved by a mesh client is, by itself, insufficient to determine the data sent by the sender and received by the receiver.
It also depends on other factors, such as the interference created by other mesh clients' communication, the status of the channel, and the simultaneous usage of channels by a larger number of mesh clients. The performance of the proposed DTSC is more effective than that of the ADT algorithm: DTSC considers the performance from sender to receiver in addition to the existence of other mesh clients, and it also considers the median performance achieved by the entire WMN.
Fig. 3 System performance with DTSC and ADT (system aggregate throughput in kbps versus number of mesh clients)
The ADT algorithm considers mesh client satisfaction but ignores the distance between the mesh router and the mesh client; it also does not consider the capacity of the mesh router. In certain situations, a mesh client at a far distance may be associated with the mesh router, and a far client with less data to send may be overlooked in favor of the nearest mesh client that has more data to send. In the DTSC algorithm, the mesh router estimates the throughput achieved from the sender to the receiver in the entire WMN. It also considers the throughput achieved by all the mesh clients in the WMN, along with the consequences of the interference created by other mesh clients, the status of the channels, and the usage of the channels. It utilizes the channel that provides the best performance and the best sender-to-receiver throughput in the WMN. The results of the simulation are shown in Fig. 3. The proposed algorithm does not deal with the process of computing the route and does not include routing grade, so it is considered using the number of hops required, i.e., the contemplated number of transmissions. The simulation is carried out using different grades of routing. The proposed DTSC algorithm selects the channel based on the process involved in accessing the interface and the time spent in assigning channels. The simulation is carried out by increasing the data transfer between mesh clients and mesh routers, after which the performance of the WMN is observed. The number of clients and the time taken to assign channels for transferring packets between the desired source and destination are plotted. It is observed that the proposed DTSC algorithm takes less time in assigning channels than the ADT algorithm. The results of the simulation are shown in Fig. 4. The ADT algorithm considers only the throughput achieved by an individual mesh client.
Here, a mesh router that handles more data may get less performance than a mesh router involved in less data transfer. The simulation is carried out by distributing the mesh routers and mesh clients in two different ways. In the first way, the mesh clients are distributed at equal distances from the mesh routers, and the mesh routers are also placed at equal distances from each other. The
Fig. 4 Number of mesh clients versus time taken in seconds (DTSC vs. ADT)
packet sent to the gateway needs to be re-transmitted by an intermediate node to confirm throughput within a cluster. In this working environment, the packet size is 1000 bytes and the packet transmission rate is 1 Mbps. The proposed DTSC algorithm performs better than the ADT algorithm. In the second way, the mesh clients are distributed at unequal distances from the mesh routers, and the mesh routers are also at unequal distances from each other. The proposed DTSC algorithm performs better than the ADT algorithm in this setting as well. The results of the simulation are shown in Figs. 5 and 6. In the second scenario, since the mesh routers and mesh clients are distributed in a random manner, the ADT algorithm gives comparatively very low throughput. The ADT algorithm uses the channel that gives more throughput to the mesh client. But the mesh clients are
Fig. 5 System throughput with uniform distribution of mesh clients and mesh routers (overall system throughput in Mbps versus network capacity in Mbps, DTSC vs. ADT)
Fig. 6 System throughput with random distribution of mesh clients and mesh routers (overall system throughput in Mbps versus network capacity in Mbps, DTSC vs. ADT)
not uniformly distributed, and hence the channel selection results in less overall throughput. The DTSC algorithm considers not only the throughput achieved by a single mesh client, but also the interference created by mesh clients, the status of the channel, and the usage of channels by a larger number of clients. The overall throughput achieved from sender to receiver is substantially more than that of the ADT algorithm, since the proposed DTSC algorithm attends closely to interface access and the time involved in accessing the channel. When the load in the network is increased, the performance of the DTSC and ADT algorithms differs. The usage of aggregated links permits increasing the streaming bandwidth by balancing the load over multiple network links between source and destination; there must be adequate spacing in frequency, which depends on the selected bandwidth [12]. By increasing the bulk traffic, the throughput achieved in the two working scenarios is analyzed, and the results are shown in Figs. 5 and 6. In the uniform distribution, with bulkier traffic, the throughput is higher for the DTSC algorithm, but the difference in throughput is small. In the second scenario, with bulkier traffic, the throughput achieved by DTSC is much higher than that of the ADT algorithm. The nodes in the network interact together to complete the communication for all members, with the support of a base station, which improves performance [13]. Wireless Mesh Networks are thus well suited for applications where an ad hoc network infrastructure is to be set up quickly [14]. The cross-layer way of communication gives a confirmation of the packet arrival rate, which is required by upper-layer applications [15].
5 Conclusion

Wireless Mesh Network is a promising technology for providing wireless network connectivity to mesh clients. Appropriate channels are assigned to multiple radio
interfaces to increase the overall throughput of the network. The proposed algorithm assigns the channels to the interfaces provided by the mesh routers. It considers not only the throughput of a single mesh client, but also the interference created by mesh clients, the status of the channel, and the usage of channels by a greater number of clients. The overall throughput achieved from sender to receiver is 19 percent more than that of the existing ADT algorithm, and the time taken for scheduling by the proposed DTSC algorithm is 66 percent less than that of the existing algorithm. The performance shown by the DTSC algorithm is therefore comparatively satisfactory. In the future, the design of DTSC may be improved by considering mesh client mobility.
References

1. M. Kashef, A. Visvizi, O. Troisi, Smart city as a smart service system: human-computer interaction and smart city surveillance systems. Comput. Hum. Behav. 124, 312–325 (2021)
2. Y. Li, X. Zhang, T. Qiu, J. Zeng, P. Hu, A distributed TDMA scheduling algorithm based on exponential backoff rule and energy-topology factor in Internet of Things. IEEE Access 5, 20866–20879 (2017)
3. A. Paszkiewicz, M. Bolanowski, P. Zapała, Phase transitions in wireless MESH networks and their application in early detection of network coherence loss. MDPI Appl. Sci. 9(5508), 1–18 (2019)
4. J. Wang, W. Shi, K. Cui, F. Jin, Y. Li, Partially overlapped channel assignment for multi-channel multi-radio wireless mesh networks. Eurasip J. Wirel. Commun. Netw. 1–12 (2015)
5. Z. Askari, A. Avokh, EMSC: a joint multicast routing, scheduling, and call admission control in multi-radio multi-channel WMNs. Front. Comp. Sci. 2020, 1–16 (2020)
6. W. Hassan, T. Farag, Adaptive allocation algorithm for multi-radio multi-channel wireless mesh networks. MDPI Future Internet 127, 1–12 (2020)
7. M. Bano, A. Qayyum, R.N.B. Rais, S.S.A. Gilani, Soft-mesh: a robust routing architecture for hybrid SDN and wireless mesh networks. IEEE Access 9, 87715–87730
8. A. Cilfone, L. Davoli, L. Belli, G. Ferrari, Wireless mesh networking: an IoT-oriented perspective survey on relevant technologies. MDPI Future Internet 2019, 1–35 (2019)
9. F.A. Ghaleb, B.A.S. Al-Rimy, W. Boulila, F. Saeed, M. Kamat, F. Rohani, S.A. Razak, Fairness-oriented semichaotic genetic algorithm-based channel assignment technique for node starvation problem in wireless mesh networks. Hindawi Comput. Intell. Neurosci. 2021, Article ID 2977954, 1–19 (2021)
10. Z. Li, D. Zhang, H. Shi, Link interference and route length based dynamic channel allocation algorithm for multichannel wireless mesh networks. Hindawi Wirel. Commun. Mob. Comput. 2021, Article ID 8038588, 1–8 (2021)
11. Q.Q. Li, Y. Peng, A wireless mesh multipath routing protocol based on sorting ant colony algorithm, in 3rd International Conference on Mechatronics and Intelligent Robotics (ICMIR2019) (2019), pp. 570–575
12. M. Rethfeldt, T. Brockmann, B. Beichler, C. Haubelt, D. Timmermann, Adaptive multi-channel clustering in IEEE 802.11s wireless mesh networks. MDPI Sens. 21, 1–43 (2021)
13. H.H. Attar, A.A.A. Solyman, A. Alrosan, C. Chakraborty, M.R. Khosravi, Deterministic cooperative hybrid ring-mesh network coding for big data transmission over lossy channels in 5G networks. Eurasip J. Wirel. Commun. Netw. 159, 1–18 (2021)
14. M. Wzorek, C. Berger, P. Doherty, Router and gateway node placement in wireless mesh networks for emergency rescue scenarios. Auton. Intell. Syst. 1–14 (2021)
15. A. Fan, Z. Tang, W. Wu, T. Yu, L. Di, Bluetooth-based device-to-device routing protocol for self-organized mobile-phone mesh network. Eurasip J. Wirel. Commun. Netw. 161, 1–25 (2020)
Comparison of PRESENT and KLEIN Ciphers Using Block RAMs of FPGA S. Santhameena, Edwil Winston Fernandes, and Surabhi Puttaraju
Abstract In order to tackle the problem of data security in resource-constrained devices, many lightweight ciphers have been introduced. PRESENT and KLEIN are popular lightweight block ciphers of SPN architecture. The ciphers have been implemented on an FPGA, mainly because of its flexible nature and the presence of block RAMs, which can be used to store the intermediate states of the cipher, leaving more slices free for other applications that may be running in parallel. In this paper, the comparison results of PRESENT-80 and KLEIN-80, implemented using just slices/registers and using block RAMs, are presented. Keywords Lightweight · PRESENT · KLEIN · FPGA · Block RAMs
1 Introduction

With most communication happening digitally in recent years, there is an increasing need to secure the information that is sent across. Cryptography helps to achieve this necessary security. Some of the most widely used cryptographic algorithms are AES, DES, RSA, Blowfish, etc. These algorithms provide high security and are suitable for almost all devices, except lightweight devices such as Wireless Sensor Nodes (WSN), RFID (Radio Frequency Identification) tags, and IoT (Internet of Things) devices. These lightweight devices help to accomplish important tasks on a daily basis; RFID tags and WSNs support systems such as electronic payment, product identification, and tracking. Standard symmetric ciphers like AES and DES use complex techniques, which can be exhausting and power consuming for these tiny devices. Hence, there is a need to develop algorithms that are lightweight and optimized for energy and power. Many lightweight algorithms of block and stream type have been proposed, such as PRESENT, KLEIN, KATAN, CLEFIA, and HIGHT (block ciphers) and Lizard, Fruit, and Plantlet (stream ciphers). In all lightweight algorithms, ensuring low-power consumption while providing moderate to high security, acceptable throughput, and a minimal memory requirement is the biggest challenge.

S. Santhameena (B) · E. W. Fernandes · S. Puttaraju, Department of Electronics and Communication Engineering, PES University, Bangalore, India. e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_34

PRESENT is intended for operations where low-power consumption and high chip efficiency are desired. The International Organization for Standardization and the International Electrotechnical Commission included PRESENT in the new international standard for lightweight cryptographic methods. It is the best-known block cipher with the smallest area and power footprint for present-day IoT devices. However, many IoT devices, smart devices, and medical sensors today still use heavier algorithms such as AES, DES, Triple DES, RSA, and Blowfish. Lightweight cryptography is a vast research area that has advanced considerably, and standards such as PRESENT can be expected to be incorporated into devices in the days to come. The flexible nature of the FPGA has attracted implementations of various applications, and ciphers are one of them. In this paper, the results of implementing PRESENT-80 and KLEIN-80 on an Artix-7 xc7a200tfbg484-1 using two methods are presented, along with the corresponding simulation results and implementation reports. Both ciphers run for many rounds; after every round, data is generated that is needed for the next round. In the first method, this intermediate data is stored in registers, and the block RAMs of the FPGA are not used. The ciphers are coded just as described in the algorithm, with a 64-bit datapath. For example, after XORing the state and the key (each of 64 bits), the result is stored in a 64-bit register, and these 64 bits are passed to the next step, where substitution of all 64 bits happens at once. In the second method, a single-port block RAM is used to store the intermediate states, and a 16-bit datapath is used.
For example, in one cycle 16 bits of plaintext are read; in the next cycle, 16 bits of key are read; the two are XORed, and these 16 bits undergo substitution and are then stored back to the block RAM. The next set of 16 bits of plaintext and key is then read, operated on, and stored back, and so on. The design modules and the logic for reading from and writing to the RAM are coded in such a way that all operations, including substitution, permutation, XORing, shifting, and mixing, happen appropriately and produce the same end result as the first method. In the second method, the LUT count can be expected to be somewhat reduced, freeing LUTs for other applications. The highlight of this paper is the comparison of the two ciphers under both approaches.
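The equivalence of the two datapaths can be illustrated with a small software model. The sketch below (Python, an illustration rather than the authors' Verilog) computes the AddKey-plus-substitution step once over the full 64-bit state and once in four 16-bit slices, using PRESENT's 4-bit S-box; because XOR and nibble substitution act locally within each 16-bit word, both give the same result:

```python
# Illustrative model (not the authors' Verilog): the same AddKey + substitution
# step computed over the full 64-bit state and again in four 16-bit slices,
# as read from a 16-bit-wide block RAM. The 4-bit S-box is PRESENT's.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def sub_nibbles(word, width):
    """Apply the 4-bit S-box to every nibble of a `width`-bit word."""
    return sum(SBOX[(word >> i) & 0xF] << i for i in range(0, width, 4))

def addkey_sub_64(state, key):
    """Method 1: operate on the whole 64-bit state at once."""
    return sub_nibbles(state ^ key, 64)

def addkey_sub_16(state, key):
    """Method 2: process four 16-bit slices, one RAM word per cycle pair."""
    out = 0
    for i in range(4):
        s = (state >> (16 * i)) & 0xFFFF          # read 16 bits of state
        k = (key >> (16 * i)) & 0xFFFF            # read 16 bits of key
        out |= sub_nibbles(s ^ k, 16) << (16 * i)  # substitute, write back
    return out

# The plaintext is the one used later in the paper; the 64-bit key value here
# is only an illustration (AddKey uses 64 key bits).
state, key = 0xFEDC4A3973528101, 0x1234566890ABCDEF
assert addkey_sub_64(state, key) == addkey_sub_16(state, key)
```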
2 Related Work

Every day, a huge amount of data is generated and transferred, in the form of emails, messages, images, bank transactions, medical data, etc. All this data, shared over the Internet in digital form, has to be stored and transferred safely; hence, there is a need for cryptographic algorithms. Cryptography keeps information safe by turning data into a form that an unintended person cannot understand. The information to be sent, called "plaintext," is passed through an encryption algorithm to produce a "ciphertext," which when decrypted
at the destination gives back the original message sent. A brief overview of historical ciphers such as the Caesar cipher, simple substitution cipher, and transposition cipher, and of modern techniques such as stream ciphers, block ciphers, hash functions, digital signatures, and authentication, is provided in [1]. Securing data is so important that the main research thrust in the quantum domain concerns quantum cryptography and quantum key distribution. Even blockchain-based operations require encryption: for example, the authors of [2] propose a searchable encryption algorithm for sharing blockchain-based medical image data securely between health institutes using cloud applications. IoT devices are resource-constrained devices that help accomplish many important and intelligent tasks. In these devices, the algorithms are carefully designed so that they operate efficiently despite limited power, area, memory, and battery supply. Data exchange using IoT devices is growing rapidly, increasing the need for data security and lightweight ciphers. As pointed out in [3], lightweight block ciphers use fewer resources and mitigate encryption overhead by employing smaller block sizes, smaller key sizes, and simple round logic based on simple computations and simple key scheduling. It is also observed that when the block size is between 48 and 96 bits and the round count is 16 or less, the energy consumption is optimal; moreover, increasing the round count by 1 increases the energy per bit by about 3.4%. For these reasons, most lightweight block ciphers have few rounds, usually a data block size of 64 bits, and are kept computationally simple. PRESENT and KLEIN are two such ciphers, and quite a lot of implementations of both can be found in the literature. Since the creation of the AES algorithm, the need to create new ciphers has reduced.
But for lightweight devices, which include RFID tags and legacy sensor platforms, AES is not the best option. Hence, an ultra-lightweight block cipher, PRESENT, was introduced in [4]. The best way to make a cipher lightweight is to keep it simple. Most lightweight ciphers are of block type, because the art of block ciphers is better understood than that of stream ciphers, and they are usually encryption-only. The cipher utilizes 1570 GEs for a 64-bit datapath. PRESENT is the most popular lightweight block cipher. According to [5], the PRESENT-80 cipher consumes only 176 slices (on a Xilinx Spartan-3 board), as against KATAN80, which consumes 1054 slices; PRESENT-128 consumes 202 slices, which is far lower than LED-128 (966 slices), GRAIN-128 (1294 slices), and TEA-128 (1140 slices). A slice consumption of just 117 for PRESENT-128 on a Xilinx Spartan-3 board has been reported in [6], which makes it comparable in size to stream cipher implementations; a maximum delay of 8.78 ns and a throughput of 28.46 Mbps were reported for the same design. From these results, the ultra-lightweight nature of the PRESENT cipher is clearly evident. The KLEIN cipher is yet another lightweight block cipher, introduced in [7]. What makes KLEIN different is its superior performance on software and sensor platforms combined with small hardware dimensions. That paper focuses on sensor implementations because a software implementation has several advantages, such as flexibility and economy, while also offering better power and hardware efficiency in manufacture and maintenance. KLEIN-80 consumed 2202 GEs for
a datapath of 64 bits and 1530 GEs for a datapath of 8 bits, so the PRESENT cipher is better for hardware encryption. However, the software performance of KLEIN is better than that of PRESENT. The KLEIN-80 cipher consumes 52 RAM bytes and 3112 ROM bytes, with a processing speed of 2.62/3.4 ms per 16-byte message on the legacy low-resource TelosB sensor, whereas the PRESENT-80 cipher's software encryption consumes 288 RAM bytes and 6424 ROM bytes, with a processing speed of 6.6/11.1 ms per 16-byte message. Similar results can be observed on the IRIS platform. According to [8], the KLEIN cipher consumes low power (around 2.5 µW), and its energy per bit is less than 10 µJ. From the graphs presented there, it is clearly observed that for 0.18 µm technology and hardware implementation, KLEIN consumes more GEs, more energy (µJ), and more power (µW), and has lower hardware efficiency (throughput/complexity), than the PRESENT cipher. For software implementation, in both the 8-bit and 16-bit cases, the energy consumption of KLEIN is lower and its throughput (kbps) is greater than that of PRESENT. The 8-bit software efficiency (throughput/code size) of PRESENT is comparable to that of KLEIN, whereas the 16-bit software efficiency of KLEIN is higher. In the KLEIN cipher, there is a step called "MixNibbles," which is the most area- and power-consuming step. An attempt to modify the cipher by replacing the "MixNibbles" step with a 3-layer substitution, which maintains security while reducing execution time, has been proposed in [9]. The main attraction of FPGAs for any operation is their reconfigurability. As indicated in [10], most cryptographic steps involve simple bit-level operations that can be mapped directly onto the CLEs (Configurable Logic Elements) of the FPGA. Furthermore, the availability of dedicated memory blocks on the FPGA attracts cryptographic implementations, as they generate a lot of intermediate data to be stored.
If a cryptographic algorithm can be broken down into independent parts, multiple computation units can be instantiated on the same FPGA so that the tasks are completed concurrently; pipelining of tasks to increase throughput is also possible in an FPGA design. ASICs (Application-Specific Integrated Circuits) have the advantage of low-power consumption, which is necessary for lightweight applications, but upgrading systems when the need arises is tedious, and they involve high non-recurring engineering cost and a long time to market [6]. Almost all ciphers run for many rounds, generating a large amount of intermediate data that can be stored in the block RAMs present on the FPGA, thus leaving many slices free for other applications. This has been pointed out in [11], which describes two ways of implementing the PRESENT cipher using the RAM blocks present on the FPGA. In the first approach, the algorithm is implemented using slices for the S-boxes and RAM blocks for storing states. In the second approach, the S-boxes, along with the intermediate states, are stored in the block RAMs. For a 128-bit key size, the first approach consumes 83 slices and 1 BRAM and has a throughput of 6.03 kbps; the second approach consumes 85 slices and 1 BRAM and has a throughput of 5.13 kbps.
Fig. 1 Block diagram of PRESENT-80 cipher encryption as given in [4]
3 The PRESENT Cipher

The PRESENT cipher is the most popular ultra-lightweight symmetric block cipher, with a block size of 64 bits and an 80- or 128-bit key. It is of SPN architecture, i.e., it uses substitution for confusion and permutation for diffusion. It runs for 32 rounds, with each round consisting of 4 operations: addkey (XORs the state and the round key), sLayer (performs substitution in accordance with the S-boxes given in [4]), pLayer (performs permutation in accordance with the P-layer table given in [4]), and key generation. The 32nd round, however, performs only the addkey operation. To generate the next 80-bit key, the current key is rotated left by 61 bits, bits 79–76 are then substituted using the S-box, and bits 19–15 are XORed with the 5-bit round counter. The block diagram of the PRESENT-80 cipher is given in Fig. 1, where S represents substitution, P represents permutation, and the states are stored in a 64-bit register made of D flip-flops.
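The round structure and key schedule above can be expressed compactly in software. The following Python sketch is a behavioral model of PRESENT-80 encryption (useful for checking test vectors, not the authors' Verilog), using the S-box and bit permutation from [4]; the final assertion reproduces the published all-zero test vector from the PRESENT paper:

```python
# Behavioral model of PRESENT-80 encryption (bit 0 = LSB throughout).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def s_layer(state):
    # Substitute each of the 16 nibbles of the 64-bit state.
    return sum(SBOX[(state >> (4 * i)) & 0xF] << (4 * i) for i in range(16))

def p_layer(state):
    # Bit i of the state moves to position (16*i) mod 63; bit 63 stays put.
    out = 0
    for i in range(64):
        if (state >> i) & 1:
            out |= 1 << (63 if i == 63 else (16 * i) % 63)
    return out

def round_keys(key):
    # 80-bit key schedule: rotate left 61, S-box on the top nibble,
    # XOR the round counter into bits 19..15.
    keys = []
    for rc in range(1, 33):
        keys.append(key >> 16)  # round key = leftmost 64 bits
        key = ((key << 61) | (key >> 19)) & ((1 << 80) - 1)
        key = (SBOX[key >> 76] << 76) | (key & ((1 << 76) - 1))
        key ^= rc << 15
    return keys

def present80_encrypt(plaintext, key):
    ks = round_keys(key)
    state = plaintext
    for r in range(31):
        state = p_layer(s_layer(state ^ ks[r]))
    return state ^ ks[31]  # the 32nd round is addkey only

# Published test vector from the PRESENT paper (all-zero plaintext and key):
assert present80_encrypt(0, 0) == 0x5579C1387B228445
```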
4 The KLEIN Cipher

The KLEIN cipher is a lightweight symmetric block cipher with a 64-bit block size and a key of 64, 80, or 96 bits. It is also of SPN architecture and runs for 12, 16, or 20 rounds depending on the key size (64/80/96). Each round consists of 5 operations: AddKey (XORing of state and round key), SubNibbles (substitution in accordance with the S-boxes given in [7]), RotateNibbles (a two-byte left rotation of the current state), MixNibbles (a bricklayer permutation of the state), and key generation. The final ciphertext is obtained by XORing the state and the key. The next 80-bit key is generated by the following sequence of steps: divide the key into two equal tuples a and b; perform a 1-byte left rotation on each tuple to get a' and b'; swap and update the tuples such that a'' = b' and b'' = a' XOR b'; XOR the round counter with the third byte of tuple a''; and substitute the second and third bytes of tuple b'' using the KLEIN S-boxes. The block diagram of the KLEIN-80 encryption is given in Fig. 2. The figure has
Fig. 2 Block diagram of KLEIN-80 cipher encryption as given in [7]
been taken from [7], with a few modifications made to represent the KLEIN-80 case.
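The key-schedule steps above can be sketched in software. The following Python model follows the description given in the text; the byte ordering within the tuples (tuple a holding the first five bytes, bytes counted from 1) is an assumption, and the S-box values, taken as KLEIN's involutive 4-bit S-box, should be checked against [7] before any serious use:

```python
# Illustrative model of the KLEIN-80 key-schedule step described above.
# Byte indexing within the tuples is an assumption, not verified against [7].
KLEIN_SBOX = [0x7, 0x4, 0xA, 0x9, 0x1, 0xF, 0xB, 0x0,
              0xC, 0x3, 0x2, 0x6, 0x8, 0xE, 0xD, 0x5]

def sub_byte(x):
    # Apply the 4-bit S-box to both nibbles of a byte.
    return (KLEIN_SBOX[x >> 4] << 4) | KLEIN_SBOX[x & 0xF]

def next_key(key_bytes, round_counter):
    """One step of the KLEIN-80 key schedule on a 10-byte key."""
    a, b = key_bytes[:5], key_bytes[5:]       # split into two 5-byte tuples
    a = a[1:] + a[:1]                          # 1-byte left rotation -> a'
    b = b[1:] + b[:1]                          # 1-byte left rotation -> b'
    new_a = list(b)                            # a'' = b'
    new_b = [x ^ y for x, y in zip(a, b)]      # b'' = a' XOR b'
    new_a[2] ^= round_counter                  # XOR counter into 3rd byte of a''
    new_b[1] = sub_byte(new_b[1])              # S-box on 2nd and 3rd bytes of b''
    new_b[2] = sub_byte(new_b[2])
    return new_a + new_b

key = list(range(10))                          # example 10-byte (80-bit) key
assert len(next_key(key, 1)) == 10 and next_key(key, 1) != key
# The KLEIN S-box is an involution: applying it twice gives the identity.
assert all(KLEIN_SBOX[KLEIN_SBOX[x]] == x for x in range(16))
```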
5 Implementation

To observe the simulation results and obtain the utilization reports, the PRESENT and KLEIN ciphers were coded in Verilog and run on Vivado 2016.2, targeting the Artix-7 xc7a200tfbg484-1 board. Only encryption has been implemented in all methods. For the implementation of the PRESENT-80 cipher, a top-level module is executed, which runs a module called "present_round" 32 times. The "present_round" module performs AddKey, substitution, permutation, and key generation as explained in [4]. The KLEIN-80 cipher is coded in a similar fashion, except that "klein_round" runs for 16 rounds and performs the AddKey, SubNibbles, RotateNibbles, MixNibbles, and key generation operations. In both cases, the intermediate data and key states are stored in reg (register) data types of size 64 and 80 bits, respectively. For the RAM approach of PRESENT-80 and KLEIN-80, block RAMs of the FPGA are used to store the intermediate states. The block RAM used is single-port, in no-change mode, 16 bits wide and 1024 rows deep. Inspired by the method explained in [11], the data state and key state are stored as 16 × 4 and 16 × 5 matrices in the RAM. In the PRESENT-80 RAM approach, the initial plaintext and key (stored in a text file) are loaded to RAM addresses 0–3 and 8–12, respectively. After performing the round operations, the next data and key states are stored to addresses 4–7 and 13–17, respectively. In the next round, the data and key states are read from
addresses 4–7 and 13–17, and after performing the operations, the results are stored back to addresses 0–3 and 8–12. In this way, the memory requirement is kept minimal. The KLEIN-80 RAM approach is implemented in a similar fashion; the only difference is that the state obtained after AddKey and SubNibbles is stored to addresses 4–7, the state obtained after RotateNibbles is stored to 8–11, and the state obtained after MixNibbles is stored back to 0–3. So, in this case, the RAM memory requirement is greater than in the PRESENT-80 RAM case. In both RAM-based approaches, the design modules only perform the necessary operations; reading from and writing to the RAM is handled externally.
6 Results and Conclusion

6.1 Results

The board used for the project is the Artix-7, part xc7a200tfbg484-1, which has a maximum of 134,600 LUTs and 285 bonded IOBs. The Verilog code for PRESENT and KLEIN was run on Icarus Verilog and Vivado; the analysis and utilization reports were obtained from Vivado. The number of LUTs obtained for PRESENT-80 is 223, and the number of FFs used is 246. For the PRESENT-80 RAM approach, the number of LUTs used is 31, along with half a block RAM tile. It can clearly be observed that using block RAMs yields a lower LUT count. The simulation result (obtained from Vivado) for the PRESENT-80 approach (encryption only) is given in Fig. 3. The RTL schematic of the PRESENT-80 encryption is shown in Fig. 4. "rc" is the round count; "plaintext," "key," and "clk" (the clock) are the input ports, and "cipher" is the output port. "present_round" is the module that contains the AddKey, sLayer, pLayer, and key generation submodules. Three 2:1 MUXes ensure that the plaintext and key are supplied initially, and only one "present_round" module is instantiated and reused for all 31 rounds of the cipher. The simulation result (obtained from Icarus Verilog) for one round of the PRESENT-80 RAM approach encryption is shown in Fig. 5. The implementation has been coded in such a way that the LUT count remains unaffected by the number of rounds the cipher runs for.
Fig. 3 Simulation result of PRESENT-80 cipher encryption
Fig. 4 RTL schematic of the PRESENT-80 cipher encryption
Fig. 5 Simulation result of the PRESENT-80 RAM based approach encryption
This figure displays the contents of the RAM. In Fig. 5a, addresses 0–3 contain the plaintext and addresses 8–12 contain the key. This plaintext and key are passed through one round of PRESENT-80 encryption, and the resultant state is written to addresses 4–7, as shown in Fig. 5b. The key is passed through one round of key generation, and the resultant next-round key is written to addresses 13–17, as shown in Fig. 5b. In the next round, the state from addresses 4–7 and the key from addresses 13–17 in Fig. 5b are used. The RTL schematic of the PRESENT-80 RAM approach is shown in Fig. 6. "xilinx_single_port_ram_no_change" is the module that instantiates the block RAM. The "subround" and "generatekey" modules perform the cipher operations. These modules are not interconnected because reading from and writing to the RAM is performed from the testbench, as the timing of RAM accesses has to be carefully controlled; this is easier to achieve from the testbench for the purpose of simulation.
Fig. 6 RTL schematic of the PRESENT-80 RAM approach encryption
A similar approach is taken for KLEIN-80, implemented on the same board. The LUT count is 265, the bonded IOB count is 205, and the FF count is 229. For the RAM implementation of KLEIN-80, the LUT count is 81 and the block RAM count is 0.5. It can be observed from these results that, in general, the KLEIN cipher requires more LUTs than the PRESENT cipher in both approaches, and that the KLEIN-80 RAM approach utilizes fewer LUTs than the normal KLEIN implementation. The simulation results (obtained from Vivado) of the KLEIN-80 approach (encryption only) are given in Fig. 7. The RTL schematic of the KLEIN-80 cipher encryption is shown in Fig. 8. "rc" is the round count; "plaintext," "key," and "clk" (the clock) are the input ports, and
Fig. 7 Simulation result of KLEIN-80 cipher encryption
Fig. 8 RTL schematic of KLEIN-80 cipher encryption
"cipher" is the output port. "klein_round" is the module that contains the AddKey, SubNibbles, RotateNibbles, MixNibbles, and key generation submodules. The simulation result (obtained from Icarus Verilog) for one round of KLEIN-80 encryption is shown in Fig. 9. It can be observed from the figures that the amount of RAM used for KLEIN-80 is more than for PRESENT-80. This figure displays the contents of the RAM. In Fig. 9a, addresses 0–3 contain the plaintext and addresses 12–16 contain the key. This plaintext and key are passed through one round of KLEIN-80 encryption. The result of the AddKey and SubNibbles steps is written to addresses 4–7, as can be observed in Fig. 9b. This state
Fig. 9 Simulation result of the KLEIN-80 RAM approach encryption
is passed through the RotateNibbles step, and the result is stored to addresses 8–11. Finally, this state is read and passed through the MixNibbles step, and the resultant state of the first round is stored back to addresses 0–3, as can be observed in Fig. 9b. The next-round key generated is stored to addresses 17–21. The RTL schematic of the KLEIN-80 RAM approach is shown in Fig. 10. "xilinx_single_port_ram_no_change" is the module that instantiates the block RAM. The "addsub," "mixnibbles," and "genkey_klein" modules perform the AddKey and SubNibbles, MixNibbles, and key generation operations. A design module for the RotateNibbles step has not been included because of its mere shifting nature; from a RAM point of view, it can be performed simply by reading from one address and writing to another.
Fig. 10 RTL schematic of KLEIN-80 RAM approach encryption
Table 1 Comparison of the elements used by the algorithms

Element          PRESENT-80   KLEIN-80   PRESENT-80 RAM   KLEIN-80 RAM
Slice LUTs       223          265        31               81
Slice Registers  246          229        0                0
F7 Muxes         1            0          0                13
F8 Muxes         0            0          0                5
Bonded IOB       214          205        133              177
RAM tile         0            0          0.5              0.5
Fig. 11 The cipher output (8 bits) observed on FPGA with LEDs
The utilization results are summarized in Table 1. The number of LUTs consumed by KLEIN-80 is more than that of PRESENT-80 in both cases. However, the LUTs consumed by the RAM approach are fewer than those of the slices approach for both ciphers. A snapshot of the implementation of PRESENT-80 encryption on a Basys3 FPGA board is shown in Fig. 11. For simplicity, only the first 8 bits of the ciphertext are displayed using the output LEDs on the FPGA. The ciphertext obtained for the plaintext 64'hfedc4a3973528101 and key 80'h1234566890abcdef9876 is 64'h6e5a7491b9f2ceea, of which only "ea" is displayed (observed by reading the LEDs from right to left). The RTL schematic for this implementation is almost the same as that of Fig. 4, but the module is wrapped in another module that takes the 64-bit plaintext and 80-bit key (the input ports) and outputs only 8 bits of the cipher (output port) (Fig. 11).
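The displayed byte can be checked with a short computation. This sketch shows how the eight LEDs, read from right to left, map to the least-significant byte of the ciphertext reported above:

```python
# The 8 LEDs show the least-significant byte of the 64-bit ciphertext;
# reading the LEDs right-to-left gives the bits from LSB upward.
cipher = 0x6E5A7491B9F2CEEA
low_byte = cipher & 0xFF
assert low_byte == 0xEA                          # the "ea" shown on the LEDs

leds = [(low_byte >> i) & 1 for i in range(8)]   # leds[0] = rightmost LED
assert leds == [0, 1, 0, 1, 0, 1, 1, 1]          # 0xEA = 1110 1010 binary
```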
6.2 Conclusion

In this project, the implementation results, i.e., the simulation and utilization reports, of the PRESENT-80 and KLEIN-80 ciphers and their corresponding RAM approaches have been presented. The PRESENT-80 RAM approach consumed just 31 LUTs, as against the normal PRESENT-80 approach, which consumed 223 LUTs. Similarly, the
KLEIN-80 RAM implementation consumed about 81 LUTs, which is fewer than the normal cipher implementation (265). Therefore, the number of LUTs required for the operation is reduced in the RAM approach, and the LUTs saved can be used by other operations. It is observed that the KLEIN cipher utilizes more LUTs than the PRESENT cipher; hence, the PRESENT cipher establishes its supremacy among lightweight ciphers.

Acknowledgements We would like to sincerely thank our guide Prof. SanthaMeena S for all her guidance, assistance, and constant support.
References

1. A. Mohammed, N. Varol, A review paper on cryptography, in ISDFS 2019 (2019), pp. 1–6. https://doi.org/10.1109/ISDFS.2019.8757514
2. C.V. Joe, J.S. Raj, Deniable authentication encryption for privacy protection using blockchain. J. Artif. Intell. Capsule Netw. 3(3), 259–271 (2021)
3. B. Mohd, T. Hayajneh, Lightweight block ciphers for IoT: energy optimization and survivability techniques. IEEE Access 6 (2018). https://doi.org/10.1109/ACCESS.2018.2848586
4. A. Bogdanov, L. Knudsen, G. Leander, C. Paar, A. Poschmann, M. Robshaw, Y. Seurin, C. Vikkelsoe, PRESENT: an ultra-lightweight block cipher, in CHES 2007, LNCS 4727 (2007), pp. 450–466
5. V.G. Kiran, FPGA implementation of lightweight cryptographic algorithms—a survey. Int. J. Res. Appl. Sci. Eng. Technol. (IJRASET) 5, 819–824 (2016)
6. P. Yalla, J.-P. Kaps, Lightweight cryptography for FPGAs, in ReConFig'09—2009 International Conference on ReConFigurable Computing and FPGAs (2009), pp. 225–230. https://doi.org/10.1109/ReConFig.2009.54
7. Z. Gong, S. Nikova, Y.W. Law, KLEIN: a new family of lightweight block ciphers, LNCS 7055 (2011), pp. 1–18
8. G. Hatzivasilis, K. Fysarakis, I. Papaefstathiou, H. Manifavas, A review of lightweight block ciphers. J. Cryptogr. Eng. 8, 1–44 (2018). https://doi.org/10.1007/s13389-017-0160-y
9. S.R. Ghorashi, T. Zia, Y. Jiang, Optimisation of lightweight KLEIN encryption algorithm with 3 S-box, in 2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops) (2020), pp. 1–5. https://doi.org/10.1109/PerComWorkshops48775.2020.9156189
10. T. Güneysu, FPGAs in cryptography, in Encyclopedia of Cryptography and Security, ed. by H.C.A. van Tilborg, S. Jajodia (Springer, Boston, MA, 2011). https://doi.org/10.1007/978-1-4419-5906-5_31
11. E. Kavun, T. Yalcin, RAM-based ultra-lightweight FPGA implementation of PRESENT (2011), pp. 280–285
Remote Controlled Patrolling Robot Samarla Puneeth Vijay Krishna, Vritika Tuteja, Chatna Sai Hithesh, A. Rahul, and M. Ananda
Abstract A lot of manual effort is put into patrolling a society at night, yet crimes occur despite these efforts, and patrolling also puts the life of the patrol at risk. Although surveillance cameras are widely adopted, they fail to cover a large distance. The goal is to bring these crimes down to a minimum rate. This project proposes a completely self-governing, protected machine that works tirelessly and guards a huge region on its own, protecting the facility while reducing human effort. In the present work, an automated obstacle-detection robot based on the Raspbian operating system and a control algorithm, carrying a night-vision camera, is employed together with IoT for motion control; it is also useful for purposes such as image or video capture by the machine, with the captures sent over the Internet to the user using the ThingSpeak application. Keywords Patrol robot · Raspberry Pi 3 · Ultrasonic sensor · Sound sensors · Infrared sensor · Night-vision camera · OpenCV · Flask · Internet of Things (IoT) · Local area network (LAN) · ThingSpeak
1 Introduction

1.1 Objective

A lot of human effort goes into patrolling the streets, especially at night. This automated machine will reduce manual labor and work tirelessly throughout the night. The robot provides remotely accessible and monitored visual and audio streams to increase the efficiency of security while reducing human effort.

S. P. V. Krishna (B) · V. Tuteja · C. S. Hithesh · A. Rahul · M. Ananda, Department of Electronics and Communication (UGC), PES University Electronic City Campus (UGC), Bangalore, India. e-mail: [email protected]; M. Ananda e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_35
Safety has become a very big concern in society, not only in the corporate world but also in our day-to-day lives. There are stories of robbery, crimes, etc., that go unsolved for many years. The primary objective of this product is to provide remote surveillance to enhance security in an organization. It can be used to monitor inaccessible areas, thereby reducing the risk to human lives. This product not only ensures security but also addresses safety and the right to privacy by maintaining a highly secure data storage system.
1.2 Synopsis

Nowadays, robots play a significant part in everyday life, lessening human effort by automating tasks. This project proposes an automated security-guarding machine that uses a night-vision camera to protect a property. The robot proceeds autonomously, avoiding all obstacles in its path. The end user can constantly observe the robot's surroundings and position through a live video feed from the robot's night-vision camera. The machine is designed around a Raspberry Pi microcontroller, which controls its behavior. The robotic vehicle proceeds along specific points, halts at certain points, and moves on to the following position whenever any sound is heard; it is equipped with a night-vision camera, sound sensors, and ultrasonic detectors. The robot follows a pre-planned line as its path while guarding the place for a period of time. It also keeps watch at each place to detect any intrusion, employing a 360° revolving high-definition camera, and it can detect sound within the area. It then examines the area using its camera to determine whether any object is exposed. On detecting sound or an obstacle, the robot captures pictures and begins transferring them right away. Using the LAN (Local Area Network), a live feed is received from the robot and displayed to the user. In the current work, a Raspbian-OS-based robot with remote tracking and control through the Internet of Things is developed, which can save lives and reduce human error. Thus, the proposed self-governing, protected machine works tirelessly and guards a huge region on its own, protecting the facility while reducing human effort. This type of project can be utilized during the daytime as well.
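The stop-on-sound, avoid-obstacle behavior described above can be sketched as a simple decision routine. The following Python illustration is hypothetical (the thresholds and action names are assumptions, not the authors' firmware); it only shows the priority ordering of the sensor inputs:

```python
# Hypothetical patrol controller sketch: stop and capture on sound, steer
# around obstacles, otherwise follow the pre-planned line. Thresholds and
# action names are illustrative assumptions.
OBSTACLE_CM = 30    # ultrasonic distance below which we treat it as an obstacle
SOUND_LEVEL = 600   # sound-sensor reading above which we stop and investigate

def decide(distance_cm, sound_level, on_line):
    """Return the robot's next action from its current sensor readings."""
    if sound_level > SOUND_LEVEL:
        return "stop_and_capture"   # halt, rotate the camera, send images
    if distance_cm < OBSTACLE_CM:
        return "avoid_obstacle"     # steer around the obstruction
    if not on_line:
        return "search_line"        # regain the pre-planned patrol line
    return "follow_line"            # continue along the patrol route

assert decide(100, 700, True) == "stop_and_capture"  # sound has top priority
assert decide(20, 100, True) == "avoid_obstacle"
assert decide(100, 100, False) == "search_line"
assert decide(100, 100, True) == "follow_line"
```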
2 Related Work—Literature Survey 2.1 Overview By referring to many journal and IEEE conference papers and studying videos of several patrolling robots, a formulation that fits the problem statement was reached, and the problem solution was built on that basis.
Remote Controlled Patrolling Robot
An analysis of the various microprocessors that might form the centre of the project was needed. The literature survey forms the backbone of a project's paperwork: it defines the route and technique of the work, and it shaped the base of this project, being consulted whenever guidance was needed. The authors of [1] proposed a model that uses a planar reflection model together with an infrared camera to analyse the intensity distribution of salient pixels. Paper [2] presents a study of a mobile robot with GPS-based localization; GPS technology made it pivotal to consider the tracking of the robot. The approach in [3] measures relative transfer for mobile robotic systems from a sequence of camera images, using visual odometry with the vehicle's headlights to provide a better view of the path ahead. The authors of [4] presented a security robot for patrolling that uses a night-vision camera to secure a property; it is also equipped with sound detectors, follows a pre-decided path specified for the patrolling motion, and can send pictures directly to the control room for further action. The surveillance system in [5] is a spy robot for dense forest using an RF-based signal with a communication range of 8 km; human detection is achieved by image processing, comparing a captured photograph with a stored reference image, and the robot's motion is controlled based on sensor data. That idea could be developed further into a real product in the future with Wi-Fi-based image transfer.
Wide-area surveillance has been achieved using a night-vision camera fitted on a rover. When a sound is detected, the automated device [5, 6] follows the specified route to the detected location, captures the scene, and sends it to a police-station server using IoT; this is an automated, smart way of patrolling overnight to keep women safe. The design in [7] proposes a security patrolling robot that uses a night-vision camera for securing any premises. The robotic vehicle moves at particular intervals and is provided with a night-vision camera and sound detectors [5–7]. It uses a predefined line to follow its path while patrolling, stopping at particular points and moving to the following points when sounds are detected. The system uses IR-based path following for patrolling assigned areas, monitors each area to discover any intrusion using a 360° rotating HD camera, and has the capability to detect sound on the premises.
S. P. V. Krishna et al.
3 Methodology- Proposed Work See Figs. 1 and 2.
3.1 Design Components

3.1.1 Hardware Components
i. Raspberry Pi 3 Model B+
ii. Ultrasonic sensor
iii. IR sensor
iv. Sound sensor
v. DC motors
vi. L293D motor driver circuit
vii. Camera module: R-Pi Zero
viii. Robotic chassis

3.1.2 Software Components
i. Raspbian OS
ii. Flask
iii. ThingSpeak
iv. Python 3.7
Fig. 1 Block diagram of the system design
Fig. 2 Flow chart of the system design
3.2 Overview In this project, sound sensors are used to find the direction from which a sound is coming. The camera is connected to the Raspberry Pi, and the captured photo or video is sent over IoT. The Raspberry Pi is powered by a battery or a solar panel. A connection is established between the monitor and the Raspberry Pi using a Video Graphics Array (VGA) to High-Definition Multimedia Interface (HDMI) cable; the keyboard, mouse, and Raspberry Pi camera are then connected to the Raspberry Pi. When a sound is sensed by the sound sensor, the robot moves towards the direction of the sound, captures a photo or video, and sends it to the user using IoT technology. The infrared sensors both emit and detect infrared radiation: when an object comes close to the sensor, the infrared light from the LED reflects off the object and is detected by the receiver. The robot can be controlled in two ways, wired and wireless; wireless control has the additional advantage of reducing installation cost and increasing the flexibility of the robot.
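The sound-triggered behaviour described above can be sketched as a small decision function. The two-sensor (left/right) arrangement, the function name `decide_direction`, and the action labels are illustrative assumptions, not taken from the actual implementation; on the robot, the readings would come from GPIO inputs and the decision would trigger the camera capture and IoT upload.

```python
# Hypothetical sketch of the sound-localization step. Each digital sound
# sensor is assumed to read 1 when noise is detected and 0 otherwise.

def decide_direction(left_detected: int, right_detected: int) -> str:
    """Map the two sound-sensor readings to a movement decision."""
    if left_detected and right_detected:
        return "forward"      # sound roughly ahead: approach it
    if left_detected:
        return "turn_left"
    if right_detected:
        return "turn_right"
    return "patrol"           # no sound: continue the normal patrol route
```

On the robot, the chosen action would then be translated into motor commands, e.g. turning until the two sensors report similar levels before moving forward.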
3.3 Movement of Robot The robot moves with the help of two motors and one universal wheel, using an obstacle-avoidance strategy based on the ultrasonic sensor. If the distance between an object and the robot is less than 15 cm, the robot turns around or reverses; otherwise it keeps moving forward, so it keeps patrolling without interruption. Two geared motors, chosen to reduce power consumption, move the robot, and a motor driver module controls them. The robot also has an infrared sensor attached, so when an obstacle is detected it stops and moves in the opposite direction.
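The obstacle-avoidance rule above can be sketched as a small decision function. The 15 cm threshold follows the text; the function and action names are hypothetical, and on the robot the returned action would be translated into L293D motor-driver signals via the Pi's GPIO pins.

```python
# Minimal sketch of the obstacle-avoidance rule (hypothetical names; the
# actual code drives the L293D motor driver through GPIO).

THRESHOLD_CM = 15  # halt/turn distance used by the robot

def next_action(distance_cm: float) -> str:
    """Decide the next movement from the ultrasonic distance reading."""
    if distance_cm < THRESHOLD_CM:
        # Obstacle too close: reverse, then turn before resuming.
        return "reverse_and_turn"
    return "forward"
```

Calling this in a loop with fresh ultrasonic readings yields the continuous, disturbance-free movement described above.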
3.4 Designing the Live Stream The camera must first be enabled on the Raspberry Pi. OpenCV is used to run the live stream and Flask to serve it on a webpage; these and related libraries must be installed.
Step 1 Starting the Raspberry Pi stream. A copy of every captured image is made and displayed on the webpage.
Step 2 Launching the stream onto the browser. The web framework is Flask, which is linked to the Python code on the Raspberry Pi.
Step 3 Automatically starting the live stream.
It is a good idea to make the camera stream start automatically when the Pi boots, so there is no need to re-run the script each time; executing the code from the shell is then sufficient to view the live stream. The live stream is displayed on a webpage by entering the IP address of the Pi, so one can view it from anywhere and on multiple devices.
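At the wire level, the stream served by Flask is a multipart/x-mixed-replace sequence of JPEG images. The sketch below (with placeholder frame bytes; the project's actual code is not reproduced here) shows the per-frame framing that a Flask `Response` with mimetype `multipart/x-mixed-replace; boundary=frame` would send, which is what makes the browser refresh the image continuously.

```python
# Sketch of MJPEG framing. In the project, OpenCV supplies the JPEG frames
# and Flask wraps a generator like this one in a streaming Response.

def mjpeg_frames(jpeg_frames):
    """Yield each JPEG image wrapped as one part of a multipart stream."""
    for jpeg in jpeg_frames:
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpeg + b"\r\n")

# Placeholder frame bytes stand in for real camera captures.
chunks = list(mjpeg_frames([b"<jpeg-bytes-1>", b"<jpeg-bytes-2>"]))
```

On the Pi, the route would be something like `Response(mjpeg_frames(camera), mimetype="multipart/x-mixed-replace; boundary=frame")`, with the app run on host 0.0.0.0 so other devices on the network can connect.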
3.5 ThingSpeak Cloud Application in Raspberry Pi ThingSpeak is an open IoT cloud implementation that acts as a cloud server to store data. The Raspberry Pi reads its sensor values and sends them to ThingSpeak, where they can be monitored from anywhere in the world over the Internet. ThingSpeak provides statistical analysis of the data, which helps in better analysis of the situation encountered by the robot. This is useful for running the Pi for a long
time at some remote place, where its values need to be monitored for surveillance purposes. The robot reads data from the ultrasonic sensor, IR sensor, and sound sensor, giving the distance to any nearby object and whether noise is present. The first step is to set up an API (Application Programming Interface) key so that the cloud and the Pi can communicate. Once the ultrasonic, IR, and sound sensors give out their data (distance, noise, presence of an object), the values are sent to the cloud, where they are plotted systematically against time. The resulting graphs can be accessed from multiple devices at once and help the user track the patrolling robot. The data can be set public or private according to the requirements.
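A minimal sketch of pushing the readings to ThingSpeak is shown below. ThingSpeak accepts a plain HTTP GET on its `/update` endpoint with the channel's write API key; the key here is a placeholder, the field assignments are illustrative, and only the standard library is used.

```python
# Build a ThingSpeak update request for the robot's sensor readings.
from urllib.parse import urlencode
# from urllib.request import urlopen   # uncomment on the Pi to actually send

WRITE_API_KEY = "XXXXXXXXXXXXXXXX"  # placeholder: the channel's write key

def build_update_url(distance_cm, noise_flag):
    """field1 = ultrasonic distance (cm), field2 = sound sensor (0/1)."""
    query = urlencode({"api_key": WRITE_API_KEY,
                       "field1": distance_cm,
                       "field2": noise_flag})
    return "https://api.thingspeak.com/update?" + query

url = build_update_url(12.5, 1)
# On the robot: urlopen(url) — ThingSpeak then plots the values over time.
```

Each successful update returns the new entry ID, and the channel's charts (public or private) update automatically.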
4 Implementation and Result 4.1 Implementation 4.1.1
Hardware Implementation
The hardware used comprises one ultrasonic sensor, an IR sensor, a sound sensor, and the Raspberry Pi camera; in this project all the sensors are interfaced together. The ultrasonic sensor provides the distance between a facing object and the robot, which is then observed by the user. The sound sensor detects background noises from the environment, allowing the user to focus on any disturbance. The Raspberry Pi camera is a powerful tool that gives a live feed at all times to ensure the safety of the environment. All the above sensors are interfaced with each other to function simultaneously using Thonny Python, helping the user get a better description and understanding of the environment around the robot. The robot moves in a particular direction until it faces an obstacle. If the distance between the obstacle and the robot is less than 15 cm, it halts, moves backward, turns in a particular direction (either left or right), and continues in that direction; the path the robot takes is thus based on obstacle detection. The distance obtained from the ultrasonic sensor is given to the Raspberry Pi, which drives the L293D motor driver that controls the movement of the wheels (Fig. 3).
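The distance value from the ultrasonic sensor is derived from the echo round-trip time. The helper below is a hypothetical sketch of that conversion; on the Pi, the duration would be measured between the rising and falling edges on the sensor's echo pin.

```python
# Convert an HC-SR04-style echo pulse duration into a distance.
SPEED_OF_SOUND_CM_S = 34300  # approx. speed of sound in air at room temperature

def echo_to_distance_cm(echo_duration_s: float) -> float:
    """The pulse covers the path to the obstacle and back, hence the /2."""
    return (echo_duration_s * SPEED_OF_SOUND_CM_S) / 2
```

A 1 ms echo therefore corresponds to roughly 17 cm, comfortably outside the robot's 15 cm halt threshold.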
4.1.2
Browser Implementation
The live stream for the RPi Zero camera is directed towards a web browser. Using the flask technology, one can not only view the live stream in multiple browsers but also on multiple devices simultaneously. This allows multiple users to keep a watch
Fig. 3 The overall circuit connection of the project interfacing ultrasonic sensor, IR sensor and Sound sensor
on the environment and enhances security. OpenCV is used in the code to stream the live feed. A virtual environment is created on the Pi to ensure that the live stream is not affected by other programs running on the same platform. The camera captures images of the environment, which are represented as data frames to the Pi; these frames are then displayed continuously in JPEG format to produce a live stream in the web browser. The live feed is broadcast from the IP 0.0.0.0 onto the network of the given Pi's IP address, so the user can view the feed from anywhere. In addition, the user can view the live feed from the comfort of their phone using the camera option of a third-party app called Rasp Controller. Once the app is connected to the Pi, the user can not only view the feed but also turn the Pi on or off or make changes to the robot from their phone.
4.1.3
Sensor Readings Using ThingSpeak
Readings are taken from both the ultrasonic sensor and the sound sensor in this project: the distance and the sound, respectively. The ultrasonic sensor reading determines the distance between the obstacle and the robot; if the distance is less than 15 cm, it is displayed in graphical format on the cloud, where it can be monitored by the user. The sound sensor detects sound from the environment, and its readings are given as the numerical values 0 or 1: 0 when there is no sound and 1 when noise is detected. Both readings are plotted statistically and graphically on the cloud; the graphs are plotted using the MATLAB facilities the cloud provides (Fig. 4).
Fig. 4 Graphical data from the cloud
4.2 Result 4.2.1
Movement of the Robot
The movement of the robot was determined by the obstacle in front of it. The robot successfully responded by overcoming obstacles introduced spontaneously at random points along the path, and it reacted as expected in all the trials. The robot moves in a straight path if no obstacle is present in front of it; it stops, moves backward, and turns left or right if an obstacle is present at the specified distance. As observed, the IR sensor showing "forward" means no obstacle is present in front of the robot, the ultrasonic sensor shows the distance from the obstacle, and the sound sensor shows "noise" if any sound is present and "quiet" otherwise (Fig. 5).
Fig. 5 Working model
Fig. 6 Live stream
4.2.2
LiveStream
The video captured by the Pi camera is live-streamed together with the readings from the different sensors used in the robot. The Pi provides a live stream at all times so that the user can monitor and observe the environment (Fig. 6).
4.2.3
ThingSpeak Readings
To make the observations easier for the user, ThingSpeak cloud service is used to obtain a statistical and graphical visualization of the distances given out by the ultrasonic sensor. The user will also get an idea of the background that the robot is moving in by studying the values from the sound sensor that is getting plotted onto the ThingSpeak graph as well (Fig. 7).
5 Conclusion After several trials and tests, the team concluded that this robot can be used in a closed environment to add extra security and safety. The movement of the robot depends on object detection, which gives a clear idea of what lies in front of it. The robot was able to communicate with the Pi successfully using Wi-Fi, as this is a short-distance application; a connectivity range of up to 50 m was established.
Fig. 7 Graphs in ThingSpeak
With the supplied power, the robot is able to run for around 3 h. A 5 MP camera module has been used, so visibility is good in the daytime; a night-vision camera can be used for night operation. The Raspberry Pi serves as the core processing unit, which allowed the project to achieve low power consumption while using multiple sensors and a camera and driving the motors; no additional power source was required for the L293D motor driver, as the Pi was able to supply it. The Raspberry Pi's IP address was used to stream the camera to multiple PCs and mobile phones, giving accessibility for surveillance and remote monitoring. The robot moved on the principle of obstacle avoidance, so it can move freely without human interruption. ThingSpeak provided statistical analysis of the sensor values (graphical outputs) so that the user can get a better understanding of the surroundings; this information is helpful for remote monitoring along with the camera module.
6 Future Scope This project has immense scope for improvement. With these opportunities, it can be elevated to a greater standard of safety and security:
• Developing a training model for the robot to move along a particular path around the environment; using machine learning and an artificial neural network, the robot can learn from its errors, adapt to a new environment, and develop a path for itself.
• The more mistakes the robot makes, the more it learns and trains itself.
• Adding memory would also enhance the robot: it would store everything previously seen and allow more careful decisions in the future.
• Adding a fire-detection sensor or water sensor could enhance the security provided by the robot.
• In the case of fire detection, the user could be alerted immediately.
• The user can dictate the movement of the wheels based on speed. The robot could move fast or slow as per the user’s wishes.
References
1. C. Tang, Y. Ou, G. Jiang, Q. Xie, Y. Xu, Road detection at nighttime based on a planar reflection model (2013)
2. T. Saito, Y. Kuroda, Mobile robot localization by GPS and sequential appearance-based place recognition (2015–2017)
3. K. MacTavish, T.D. Barfoot, M. Paton, Night rider: visual odometry using headlights (2017)
4. Night-sight robot patrolling and monitoring system. IJRASET (2019)
5. Spy robot for surveillance using Raspberry Pi controller. IJESRT (2020)
6. Night-sight patrolling rover navigation system for women's safety using machine learning. Int. J. Psychosoc. Rehabil. (2019)
7. Night vision patrolling robot with sound sensors using computer vision technology. IJESRT (2020). IoT implementation using Raspberry Pi 3 and ThingSpeak Cloud. https://youtu.be/lhIiY1If8Os
Novel Modeling of Efficient Data Deduplication for Effective Redundancy Management in Cloud Environment G. Anil Kumar and C. P. Shantala
Abstract A lightweight and robust security mechanism for deduplicated data is an essential requirement from both the security and the storage-optimization viewpoints. However, existing schemes for securing deduplication on the cloud have largely focused on incorporating sophisticated encryption techniques, with relatively little focus on facilitating computationally efficient deduplication. Hence, a wide trade-off exists between robust security demands and efficient redundancy management of data over cloud storage. This problem is addressed in the proposed study, where a simplified solution is presented that emphasizes the deduplication principle itself and complements the use of further security schemes. According to the proposed scheme, the data is subjected to an artifact-removal operation followed by manual labeling and a unique clustering mechanism to group deduplicated data. The experimental analysis shows that the proposed scheme achieves better performance than the existing system with respect to computational resource efficiency. Keywords Data deduplication · Data redundancy · Cloud environment · Storage optimization · Labeling · Clustering
1 Introduction With the increasing usage of cloud computing and mobile networks, there is also an increase in data redundancy concerns resulting in unwise utilization of cloud storage [1]. The dynamic nature of data over an integrated cloud environment poses a prime impediment toward data redundancy [2]. This problem can be solved using data deduplication, which eliminates duplicated data [3]. At present, various data deduplication schemes are further classified concerning the location and size of data. G. Anil Kumar (B) · C. P. Shantala Department of Computer Science and Engineering, Channabasaveshwara Institute of Technology Gubbi, Tumkur, India e-mail: [email protected] Visvesvaraya Technological University, Belagavi, Karnataka, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_36
From a location perspective, deduplication is carried out on the server side and the client side, while from a size perspective it is carried out at file level, block level, or byte level; block-level deduplication is further classified into fixed and variable sizes. However, it is always challenging to secure the deduplicated data [4]. At present, various research works address securing the data deduplication process, yet the problem remains open. Existing schemes make use of (i) convergent encryption [5], (ii) file-popularity-based deduplication [6], (iii) file/block-based deduplication [7], (iv) proof-of-storage-based approaches [8], and (v) keyword-search-based schemes on encrypted data [9]. However, these schemes suffer from various practical problems, and there is always a trade-off between deduplication and security operations. Therefore, the proposed system develops a computationally cost-effective solution for data deduplication that supports strong and robust security requirements. The manuscript is organized as follows: Sect. 2 discusses existing approaches to secure deduplication, followed by the problem in Sect. 3. The methodology is briefed in Sect. 4, Sect. 5 illustrates the system design, Sect. 6 discusses result analysis, and Sect. 7 gives concluding remarks.
2 Related Work This section outlines existing schemes for secured data deduplication from recent studies, continuing the review presented in our prior work [10]. Jiang et al. [11] consider the security concerns of proof-of-ownership using both file- and block-level data deduplication together with sophisticated key management. Ni et al. [12] present security for crowd-sensed data, where pseudorandom operators provide privacy in a fog environment. Cui et al. [13] use attribute-based encryption for securing deduplicated data, and public-key encryption has also been reported to secure against link leakage, as seen in the work of Li et al. [14]. Stanek and Kencl [15] presented an encryption scheme based on data popularity over unsecured cloud storage units. In a similar category, Wu et al. [16] presented a public auditing scheme over encrypted files with deduplication of authentication tags. Xu et al. [17] studied an attribute-based encryption scheme addressing privacy issues using revocable ciphertext encryption. Server-assisted encryption has been reported for securing deduplicated data in a decentralized scheme by Shin et al. [18], while a similar strategy with client-side encryption is presented by Youn et al. [19]. Zhang et al. [20] used attribute-based ciphertext encryption to secure the authentication tag. Zheng et al. [21] used encryption with a functional deduplicated data structure to save bandwidth cost and resist attacks on the public cloud. A tree-based scheme with randomized tags is used to control time complexity on the client machine while performing deduplication in the work of Jiang et al. [22]. Bloom filters and signatures have also been used to secure deduplicated data, considering the
use case of the Internet of Things (IoT), as seen in the work of Mi et al. [23]. Shakya [24] presented a security framework for data migration. The use of blockchain for decentralized applications over cloud storage is discussed by Wang et al. [25]. Various other related schemes exist, e.g., identity-based encryption [26], attribute-based encryption [27], homomorphic encryption for protecting against brute-force attacks on deduplicated data [28], encryption over block and message [29], and elliptic-curve encryption (Shynu et al. [30]). Among these, the work presented by Xiong et al. [31] offers a potentially better deduplication scheme using a tree, dynamic count filters, and a re-encryption policy. Hence, various schemes have been suggested in the literature for securing the data deduplication process. However, a significant concern remains regarding the demand on computing resources and their optimization to support real-time implementation scenarios. The next section highlights the significant gaps identified from the review of the above literature.
3 Problem Description The research problems observed in the existing approaches and addressed in the proposed system are as follows: (i) existing approaches mainly use public-key encryption and other complex forms of encryption, which drain resources during operation; (ii) existing secure deduplication schemes focus mainly on incorporating encryption and much less on addressing the forms, structure, management, and clustering of the raw data; (iii) existing decentralized schemes consider the storage of data but not its generation, which makes them inapplicable to data from different domains generated in a decentralized fashion in the cloud; and (iv) none of the existing schemes offers evidence of simplified computational effectiveness while performing secure deduplication over a cloud environment. Therefore, the problem statement for the proposed work can be stated as "designing a lightweight mechanism for data deduplication is a challenging task that must support a maximum level of security without compromising network performance".
4 Proposed Methodology The proposed scheme is a continuation of the prior security model, which has emphasized data privacy and data integrity [31]. This model implements an encryption mechanism to secure the data storage over cloud. However, there is still a possibility of the presence of vulnerable data, which would further weaken the data integrity scheme. This is possible in the case of duplicated data, which could also unnecessarily saturate the cloud storage system. Hence, the indexing mechanism discussed in [31] will require more optimization in the proposed scheme, as shown in Fig. 1.
Fig. 1 Proposed scheme of data deduplication (pipeline: Data Storage → Preprocessing → Manual Labeling for 10 examples → BERT Clustering → Clustered data deduplication)
The prime aim of the proposed system is to ensure efficiency and optimality in the computational requirements of executing the algorithm. The scheme presented in Fig. 1 contributes to further data privacy using a novel data deduplication scheme. According to this scheme, the securely stored data is further assessed for artifacts, which are mitigated using a preprocessing operation. The indexing mechanism of the prior model is improved by inducing a manual labeling scheme. Additionally, a clustering scheme is introduced, which generates clustered deduplicated data. Adopting an experimental research methodology, the proposed system is implemented, and its outcome is benchmarked to exhibit its effectiveness. The following section elaborates its system design.
5 System Design This section discusses the system design implemented toward achieving the proposed deduplication scheme. Discussion of system design is based on strategies adopted for System Design, Assumptions and Dependencies, and Algorithm Design.
5.1 Strategies of System Design The prime strategy of the proposed model is to ensure the uniqueness of the data in distributed storage over cloud clusters. The development of the proposed system is carried out considering three prominent strategies of implementation. The primary implementation strategy will be to carry out document level deduplication scheme that can complement the proper indexing mechanism using encryption discussed in our prior model [31]. The idea is to further optimize the privacy performance overall by ensuring unique data retention over the storage. The secondary strategy of the proposed implementation will be to ensure the elimination of the artifacts present in textual data aggregated in data storage in a distributed fashion. The tertiary approach is to apply clustering operation to further confine the deduplicated data before storage in cloud distributed storage units. The advantage of adopting this strategy will be to introduce a balance between storage optimization as well as optimization of data security. The storage optimization is ensured using the proposed data deduplication
scheme, while data-security optimization is achieved by ensuring the retention of unique data. The justification is that the presence of duplicated data adds overhead to any encryption process; moreover, an encryption process is unlikely to distinguish unique from duplicated data for complex textual input.
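As an illustrative aside (not spelled out in the paper), document-level uniqueness of the kind targeted here is commonly realized by fingerprinting each normalized document with a cryptographic hash and storing a document only when its fingerprint is unseen. The helper names below are hypothetical.

```python
# Illustrative sketch: document-level uniqueness via SHA-256 fingerprints
# of the normalized text.
import hashlib

def fingerprint(text: str) -> str:
    # Normalize: collapse white space and lower-case, as in the preprocessing step.
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

store = {}

def put_if_unique(doc_id, text):
    """Store the document only if its fingerprint has not been seen."""
    fp = fingerprint(text)
    if fp in store:
        return False   # duplicate content: not stored again
    store[fp] = doc_id
    return True
```

Such a hash index gives exact-duplicate detection in constant time per document; the paper's scheme goes further by also flagging near-duplicates via vocabulary overlap and clustering.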
5.2 Assumptions and Dependencies The first assumption of the proposed system is that there is an auditing framework capable of assessing the genuineness of the presence of a node in the process of data transmission in the cloud. The second assumption is that the proposed model runs a specific block of operations responsible for data transformation, computation of secret tokens, and generation of secure parameters, followed by verification. The third assumption is that there is an underlying algorithm for authenticating data integrity, which takes input from the communicating node and generates a validation outcome. All these assumptions can legitimately be made, as the proposed deduplication model is developed on top of the prior model [31]; hence, the proposed model emphasizes mainly the novel deduplication operations. The major dependency of the proposed system is the presence of distributed data generated from multiple sources and containing duplicated content. The next dependency is the usage of textual data only, where the labeling is carried out manually, which is meant to further improve upon the indexing mechanism discussed in the prior model [31]. These dependencies are not hard constraints and can always be customized.
5.3 Algorithm Design The proposed system implements the mechanism of deduplication in a progressive mechanism where the idea is to ensure a better form of unique data retention before storage in cloud units. The algorithmic steps of the proposed system are as shown below:
Algorithm for Data Deduplication
Input: dmax (maximum number of documents)
Output: ddedup (deduplicated data)
Start
1. For i = 1 : dmax
2.   Perform data preprocessing to obtain refined data: dref ← f1(i)
3.   Construct a vocabulary matrix: vmat ← dref
4.   Check: rec = iden
5.     flag 'duplicate content'
6.   Otherwise:
     a. Check: vmat > rec
7.      generate req → user
8.      Check: (req = marked)
9.      then flag 'duplicate content'
10.     Otherwise: flag 'no duplicate content'
13. Check (rec > iter = 10)
14.    break()
15. Otherwise: apply BERT
16. ddedup = f2(rec, flag)
17. End
End
The algorithm takes as input a maximum number of documents dmax and, after processing, yields the deduplicated data ddedup. The algorithm considers all the documents containing textual data (Line-1) and performs the reading operation. A function f1(x) is constructed to achieve a dual operation, i.e., (i) elimination of white spaces and (ii) conversion to lower case (Line-2). The algorithm then constructs a vocabulary matrix vmat using all the refined data dref obtained from the prior step (Line-3). The next part of the process considers all the refined documents as records rec and performs a conditional check for the presence of identical records iden (Line-4). If identical records are present, the algorithm flags the text as duplicate content (Line-5). If identical records are not found (Line-6), the algorithm further checks whether the vocabulary has more than two identical words (Line-6), followed by generating a request req to the user to confirm the state of the duplicates (Line-7). The user either marks the document as duplicate (Line-9) or, in the absence of duplicates, marks it as non-duplicated content (Line-11). The non-duplicated contents are further checked across more than ten iterations (Line-13); if there are more than ten uniquely marked non-duplicated items, the operation is stopped (Line-14), otherwise the algorithm checks further documents. Finally, the algorithm applies the Bidirectional Encoder Representations from Transformers (BERT) clustering process using a function f2(x), which is implemented to address document-similarity issues in Natural Language Processing (NLP) and assists in obtaining duplicated data from documents across multiple domains.
The BERT clustering is trained with manually labeled data in such a way that, when similar inputs are given, the algorithm produces the deduplicated data ddedup
in the form of 0 and 1, representing the presence and absence of deduplicated data. The proposed algorithm therefore presents a simplified data deduplication scheme. The complete operation of the proposed manual labeling is shown in Fig. 2.
Fig. 2 Process flowchart of proposed system
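The behavior of f2, mapping a record and the existing corpus to a 0/1 duplicate label, can be approximated with cosine similarity over embeddings. In this sketch a toy bag-of-words vector stands in for an actual BERT sentence embedding, and the similarity threshold is an assumed parameter.

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for a BERT sentence embedding: a bag-of-words vector.
    # The actual scheme would use a pre-trained BERT encoder here.
    return Counter(text.lower().split())

def cosine(u, v):
    # Cosine similarity between two sparse word-count vectors
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def f2(record, corpus, threshold=0.9):
    """Return 1 if `record` duplicates any document in `corpus`, else 0."""
    e = embed(record)
    return int(any(cosine(e, embed(d)) >= threshold for d in corpus))
```

With real BERT embeddings, semantically similar but differently worded records would also score above the threshold, which is the point of the clustering step.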
G. A. Kumar and C. P. Shantala
6 Results Discussion

This section discusses the outcomes obtained from the implementation of the proposed study, covering the experimental parameters used for the study, the test environment considered for benchmarking, and the result analysis.
6.1 Experimental Parameters

Various parameters are involved in the proposed experimental study toward data deduplication. The prominent experimental parameters are:
• PreProcess: This parameter is responsible for performing preprocessing with respect to eliminating white spaces and converting to lower case.
• ReadData: This parameter is responsible for reading the data from the distributed source, resulting in the construction of a dictionary of records where the key is a unique record ID.
• Fields: These parameters are customized with respect to the fields of the dataset, i.e., site name, address, zip code, and phone number. The study considers the possibility of missing values in the last two fields.
• ActiveLearning: This parameter is specifically meant for obtaining relevance feedback using a supervised scheme in which data is forwarded to the user, who is expected to mark records as positive/negative/uncertain duplicates.
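As an illustration of the ReadData parameter, the following is a minimal sketch of building a record dictionary keyed by a unique ID over the stated fields, tolerating missing zip codes and phone numbers. The function name and exact CSV layout are assumptions for illustration.

```python
import csv
import io

FIELDS = ["site name", "address", "zip code", "phone number"]

def read_data(csv_text):
    """ReadData: build a dictionary of records keyed by a unique record ID.
    The last two fields (zip code, phone number) may be missing."""
    records = {}
    reader = csv.DictReader(io.StringIO(csv_text))
    for record_id, row in enumerate(reader):
        # Empty or absent values are normalized to None
        records[record_id] = {f: (row.get(f) or None) for f in FIELDS}
    return records
```

In practice the record ID would come from the dataset itself rather than the row index used here.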
6.2 Test Environment

The implementation of the proposed scheme is carried out in a Python environment using the publicly available Kaggle dataset [32]. The dataset consists of textual data based on inputs obtained from the survey carried out for the Chicago Early Learning Program. The complete operation is carried out on a normal 64-bit RTOS computing environment with a dataset of up to 10,000 rows maintained in CSV file format. It should be noted that this raw data is quite unstructured, and it is subjected to the proposed algorithm discussed in the prior section on algorithm design. For the purpose of benchmarking, the proposed system considers a study carried out by Xiong et al. [30], where a cryptographic version of deduplication has been discussed. The uniqueness of that implementation is that it outsources the task of data security to a management module that constructs a binary tree-based deduplication scheme connected with the keys and roles of the request. The study claims a faster dynamic updating mechanism without any events of data leakage. The comparative analysis is carried out using processing time and
compression ratio over a similar test environment for three schemes: (i) rule-based, (ii) binary tree deduplication, and (iii) the proposed system.
6.3 Result Analysis

The results obtained from the experiments are shown in Figs. 3 and 4. Figure 3 highlights the impact of increasing memory on processing time, where the outcome exhibits that the proposed system achieves a significantly reduced processing time compared to the existing approaches. A similar trend of better compression performance is exhibited in Fig. 4. Existing mechanisms of deduplication are either rule based or binary-tree based, which involve an extensive iterative process when subjected to the training operation.

Fig. 3 Benchmarked outcome of processing time
Fig. 4 Benchmarked outcome of compression ratio
The proposed system, in contrast, is completely progressive with far fewer iterations, except in the clustering operation. This potentially contributes to a highly reduced processing time even as the size of memory increases. A closer look into the deduplication process of the proposed system shows a unique usage of BERT clustering, which offers sequential processing in generating deduplicated data, where the supervised training and manual labeling scheme results in a better sequence encoding scheme. This potentially contributes toward better compression performance in contrast to the existing schemes.
7 Conclusion

This paper has presented a unique scheme for implementing data deduplication, which is meant to run in integration with our prior model, where encryption-based operation is performed for securing data. From the review of existing secure data deduplication schemes, it is noticed that emphasis is mainly placed on the encryption scheme, while much less emphasis is given to strengthening the data deduplication scheme. This results in a wide trade-off between a lower inclination toward data management and a higher inclination toward using complex encryption processes. This problem is addressed in the proposed study. The contributions as well as the novelty of the proposed scheme are as follows: (i) the proposed scheme performs an artifact removal operation over multiple sets of different data objects, resulting in quality and appropriate data; (ii) the proposed model offers a manual labeling operation which complements the indexing operation required by any encryption operation, thereby upgrading the security performance too; (iii) the proposed system uses a unique sequence encoding scheme via the BERT clustering process, which is capable of better grouping of duplicated data; and (iv) the benchmarked outcome of the proposed study is shown to offer reduced processing time and a better compression ratio with increasing memory. The advantage of using a BERT-based clustering operation is that it is pre-trained on a large corpus of unlabeled text and captures useful and separable information compared to other existing schemes. In the future, the scope of the proposed scheme can be extended with machine learning approaches toward improving its robustness and efficiency.
References

1. J. Maillo, I. Triguero, F. Herrera, Redundancy and complexity metrics for big data classification: towards smart data. IEEE Access 8, 87918–87928 (2020). https://doi.org/10.1109/ACCESS.2020.2991800
2. N.M. Kojić, D.S. Milićev, Equilibrium of redundancy in relational model for optimized data retrieval. IEEE Trans. Knowl. Data Eng. 32(9), 1707–1721 (2020). https://doi.org/10.1109/TKDE.2019.2911580
3. Y. Fu, N. Xiao, H. Jiang, G. Hu, W. Chen, Application-aware big data deduplication in cloud environment. IEEE Trans. Cloud Comput. 7(4), 921–934 (2019). https://doi.org/10.1109/TCC.2017.2710043
4. P. Prajapati, P. Shah, A review on secure data deduplication: cloud storage security issue. J. King Saud Univ. Comput. Inf. Sci. (2020)
5. J. Bai, J. Yu, X. Gao, Secure auditing and deduplication for encrypted cloud data supporting ownership modification. Soft Comput. 1–18 (2020)
6. J. Stanek, A. Sorniotti, E. Androulaki, L. Kencl, A secure data deduplication scheme for cloud storage, in International Conference on Financial Cryptography and Data Security (Springer, 2014), pp. 99–118
7. R. Chen, Y. Mu, G. Yang, F. Guo, BL-MLE: block-level message-locked encryption for secure large file deduplication. IEEE Trans. Inf. Forensics Secur. 10(12), 2643–2652 (2015). https://doi.org/10.1109/TIFS.2015.2470221
8. H. Gajera, M.L. Das, Fine-grained data deduplication and proof of storage scheme in public cloud storage, in 2021 International Conference on COMmunication Systems & NETworkS (COMSNETS) (2021), pp. 237–241. https://doi.org/10.1109/COMSNETS51098.2021.9352742
9. J. Li, X. Chen, F. Xhafa, L. Barolli, Secure deduplication storage systems supporting keyword search. J. Comput. Syst. Sci. 81, 1532–1541 (2015)
10. K.G. Anil, C.P. Shantala, An extensive research survey on data integrity and deduplication towards privacy in cloud storage. Int. J. Electr. Comput. Eng. (IJECE) 10(2), 2007–2018 (2020)
11. S. Jiang, T. Jiang, L. Wang, Secure and efficient cloud data deduplication with ownership management. IEEE Trans. Serv. Comput. 13(6), 1152–1165 (2020). https://doi.org/10.1109/TSC.2017.2771280
12. J. Ni, K. Zhang, Y. Yu, X. Lin, X.S. Shen, Providing task allocation and secure deduplication for mobile crowdsensing via fog computing. IEEE Trans. Dependable Secure Comput. 17(3), 581–594 (2020). https://doi.org/10.1109/TDSC.2018.2791432
13. H. Cui, R.H. Deng, Y. Li, G. Wu, Attribute-based storage supporting secure deduplication of encrypted data in cloud. IEEE Trans. Big Data 5(3), 330–342 (2019). https://doi.org/10.1109/TBDATA.2017.2656120
14. J. Li, Z. Su, D. Guo, K.-K.R. Choo, Y. Ji, H. Pu, Secure data deduplication protocol for edge-assisted mobile crowdsensing services. IEEE Trans. Veh. Technol. 70(1), 742–753 (2021). https://doi.org/10.1109/TVT.2020.3035588
15. J. Stanek, L. Kencl, Enhanced secure thresholded data deduplication scheme for cloud storage. IEEE Trans. Dependable Secure Comput. 15(4), 694–707 (2018). https://doi.org/10.1109/TDSC.2016.2603501
16. J. Wu, Y. Li, T. Wang, Y. Ding, CPDA: a confidentiality-preserving deduplication cloud storage with public cloud auditing. IEEE Access 7, 160482–160497 (2019). https://doi.org/10.1109/ACCESS.2019.2950750
17. R. Xu, J. Joshi, P. Krishnamurthy, An integrated privacy preserving attribute-based access control framework supporting secure deduplication. IEEE Trans. Dependable Secure Comput. 18(2), 706–721 (2021). https://doi.org/10.1109/TDSC.2019.2946073
18. Y. Shin, D. Koo, J. Yun, J. Hur, Decentralized server-aided encryption for secure deduplication in cloud storage. IEEE Trans. Serv. Comput. 13(6), 1021–1033 (2020). https://doi.org/10.1109/TSC.2017.2748594
19. T. Youn, K. Chang, K. Rhee, S.U. Shin, Efficient client-side deduplication of encrypted data with public auditing in cloud storage. IEEE Access 6, 26578–26587 (2018). https://doi.org/10.1109/ACCESS.2018.2836328
20. S. Zhang, H. Xian, Z. Li, L. Wang, SecDedup: secure encrypted data deduplication with dynamic ownership updating. IEEE Access 8, 186323–186334 (2020). https://doi.org/10.1109/ACCESS.2020.3023387
21. Y. Zheng, X. Yuan, X. Wang, J. Jiang, C. Wang, X. Gui, Toward encrypted cloud media center with secure deduplication. IEEE Trans. Multimedia 19(2), 251–265 (2017). https://doi.org/10.1109/TMM.2016.2612760
22. T. Jiang, X. Chen, Q. Wu, J. Ma, W. Susilo, W. Lou, Secure and efficient cloud data deduplication with randomized tag. IEEE Trans. Inf. Forensics Secur. 12(3), 532–543 (2017). https://doi.org/10.1109/TIFS.2016.2622013
23. B. Mi, Y. Li, H. Darong, T. Wei, Q. Zou, Secure data de-duplication based on threshold blind signature and bloom filter in Internet of Things. IEEE Access 8, 167113–167122 (2020). https://doi.org/10.1109/ACCESS.2020.3023750
24. S. Shakya, An efficient security framework for data migration in a cloud computing environment. J. Artif. Intell. 1(01), 45–53 (2019)
25. S. Wang, Y. Wang, Y. Zhang, Blockchain-based fair payment protocol for deduplication cloud storage system. IEEE Access 7, 127652–127668 (2019). https://doi.org/10.1109/ACCESS.2019.2939492
26. L. Liu, Y. Zhang, X. Li, KeyD: secure key-deduplication with identity-based broadcast encryption. IEEE Trans. Cloud Comput. 9(2), 670–681 (2021). https://doi.org/10.1109/TCC.2018.2869333
27. H. Ma, Y. Xie, J. Wang, G. Tian, Z. Liu, Revocable attribute-based encryption scheme with efficient deduplication for ehealth systems. IEEE Access 7, 89205–89217 (2019). https://doi.org/10.1109/ACCESS.2019.2926627
28. Z. Pooranian, M. Shojafar, S. Garg, R. Taheri, R. Tafazolli, LEVER: secure deduplicated cloud storage with encrypted two-party interactions in cyber-physical systems. IEEE Trans. Industr. Inf. 17(8), 5759–5768 (2021). https://doi.org/10.1109/TII.2020.3021013
29. Y. Zhao, S.S.M. Chow, Updatable block-level message-locked encryption. IEEE Trans. Dependable Secure Comput. 18(4), 1620–1631 (2021). https://doi.org/10.1109/TDSC.2019.2922403
30. P.G. Shynu, R.K. Nadesh, V.G. Menon, A secure data deduplication system for integrated cloud-edge networks. J. Cloud Comput. Article Number-61 (2020)
31. J. Xiong, Y. Zhang, S. Tang, X. Liu, Z. Yao, Secure encrypted data with authorized deduplication in cloud. IEEE Access 7, 75090–75104 (2019). https://doi.org/10.1109/ACCESS.2019.2920998
32. G. Anil Kumar, C.P. Shantala, Framework towards higher data privacy by novel data integrity scheme. Cognitive Inf. Soft Comput. 381–389 (2021)
A Survey on Patients Privacy Protection with Steganography and Visual Encryption

Hussein K. Alzubaidy, Dhiah Al-Shammary, and Mohammed Hamzah Abed
Abstract In this survey, thirty models for steganography and visual encryption methods providing patient privacy protection have been discussed. Image steganography techniques are classified into two domains: the Spatial Domain and the Frequency Domain. Spatial hiding models are divided into three groups: Least Significant Bit (LSB), Most Significant Bit (MSB), and other spatial hiding, while frequency models are divided into two groups: Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT). Moreover, some models have been used to encrypt medical images. Technically, PSNR (Peak Signal-to-Noise Ratio) has been computed and compared to investigate and evaluate the models' performance. As a result of this survey analysis, frequency hiding models based on DCT have shown the best average PSNR, 70.66 dB, compared to other approaches.

Keywords Steganography · Patient privacy · Medical images · Visual encryption
1 Introduction

The rapid growth of the Internet and information technology has led to the exposure of patient data. It has become easy to attack and tamper with medical images [1]. To avoid data loss and mishandling, several steganography techniques are used for privacy protection. Steganography can be classified into two domains: Spatial and Frequency. In the spatial domain, secret messages are directly hidden in the cover image; pixel values are manipulated to achieve the desired improvement. The frequency domain, on the other hand, is based on converting pixels from the time domain to the frequency domain. Cryptography is the process of encrypting messages in a way that attackers cannot understand the content. One of the traditional methods for cryptography is the Advanced Encryption Standard (AES), which is a symmetric cipher. Furthermore, visual encryption is another form of cryptography dedicated to visual images to be encrypted or scrambled.

H. K. Alzubaidy · D. Al-Shammary · M. H. Abed (B) College of Computer Science and Information Technology, Al-Qadisiyah University, Al-Dewaniyah, Iraq. e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_37
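Since PSNR is the yardstick used throughout this survey to compare hiding models, a minimal computation sketch may be helpful. Images are represented here as flat sequences of 8-bit pixel values; the function name is illustrative.

```python
import math

def psnr(original, stego, max_val=255):
    """Peak Signal-to-Noise Ratio in dB between two equal-size images,
    given as flat sequences of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(original, stego)) / len(original)
    if mse == 0:
        return float("inf")  # identical images: distortion-free hiding
    return 10 * math.log10(max_val ** 2 / mse)
```

Higher PSNR means less visible distortion in the stego-image, which is why values near or above 50 dB are repeatedly cited as strong results in the surveyed papers.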
1.1 Motivation

Lately, cases of privacy violation of personal data have been growing rapidly and tremendously with the increased usage of the Internet [2]. Moreover, given sudden attacks on organizations and the modifications that can occur to secret information, the main challenge is ensuring security for Electronic Patient Records (EPRs) and preserving data integrity against unauthorized access in telemedicine applications and healthcare services. Concealing confidential information is important for securing communication. Therefore, it is necessary to discover effective and powerful methods to hide such information through steganography techniques that conceal the content of data during transmission from the eyes of intruders.
1.2 Survey Strategy and Evaluation

This survey presents thirty publications on steganography techniques, described and classified into six groups. Three groups represent spatial hiding models for medical images: Least Significant Bit (LSB), Most Significant Bit (MSB), and other spatial hiding models. Furthermore, two groups cover frequency hiding models: Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT). Finally, visual encryption models are discussed in the last group. To evaluate the success of the surveyed models, the best PSNR results are collected and analyzed in tables and charts. Frequency hiding models have achieved the highest average PSNR compared to other approaches.
1.3 Paper Organization

The rest of this paper is organized as follows: Sect. 2 describes spatial hiding models for medical images, which include three groups: Least Significant Bit (LSB), Most Significant Bit (MSB), and other spatial hiding. Frequency hiding models for medical images are presented in Sect. 3, which includes two groups: Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT). Visual encryption models for medical images are illustrated in Sect. 4. Analysis and evaluation are presented in Sect. 5. Finally, conclusions are presented in Sect. 6.
2 Spatial Hiding Models

In spatial hiding, secret messages are directly hidden in the cover image; pixel values are manipulated to achieve the desired improvement. This section discusses Least Significant Bit (LSB), Most Significant Bit (MSB), and other spatial hiding models.
2.1 Least Significant Bit (LSB)

The Least Significant Bit (LSB) is the most common steganography method, based on manipulating the least significant bits of image pixels. Siddiqui et al. [3] have discussed improving the data masking ability for medical stego-images and patient records. Image Region Decomposition (IRD) has been proposed to divide MRI images into three regions based on intensity. To evaluate the proposed solution, PSNR, SSIM, and MSE values have been computed and compared with other techniques. The authors have used ten medical images for evaluation; furthermore, another five standard images, such as Lena and cameraman, have been used for comparison with other models. The best-achieved result in this research is 49.27 PSNR. Technically, only a few medical image samples have been evaluated, which does not prove the consistency of the proposed method. Lee et al. [4] have discussed concealing sensitive information in medical records without increasing data size. This research has proposed a data hiding method using a magic cube, with the Least Significant Bit (LSB) method as a partial hiding technique; the differences are then computed and contrasted with the coordinate position in the 3D magic cube. To evaluate the proposed solution, PSNR, MSE, and ER have been computed and compared. This paper has used nine standard images, such as Lena and Baboon, in addition to a few medical images, for evaluation. The best-achieved result in this research is an embedding capacity of 2.25 with an average PSNR of 44. Few image samples have been evaluated, and the model's performance has not been fully investigated. Unauthorized disclosure and hiding of medical images for electronic patient records (EPR) have been addressed by Ulutas et al. [5]. This paper has proposed a (k, n) Shamir's secret sharing scheme to hide EPR and prevent unauthorized access to medical information.
Technically, PSNR, MSE, and NOC (number of characters) have been computed and compared. Eleven medical images have been used for evaluation. The best-achieved result in this research is 46.3702 PSNR. Although high performance has been achieved by the proposed method, the reconstructed image contains removed regions and might be susceptible to damage during transmission. Karakus et al. [2] have discussed increasing the amount of data to be hidden while reducing the deterioration of medical image quality in the stego-image. A new optimization method has been proposed based on optimum pixel similarity. Technically, PSNR, MSE, RMSE, SSIM, and UQI have been computed and compared. This paper has used twenty images each for MR, CT, and OT in the evaluation process. The best-achieved result in this research is 66.5374 PSNR for 1000 characters of data hidden in 512 × 512 images. On the other hand, 10,000 characters of data cannot be hidden in 256 × 256 images without compression techniques; therefore, the optimization process is hard. Another study, by Bhardwaj et al. [6], has discussed provisioning savings for telemedicine applications during the communication process and exchange of patient information. A dual-image separable block has been proposed based on a reversible data hiding algorithm. Technically, PSNR, MSE, embedding capacity, and BER have been computed and compared. This paper has used twenty general images, such as Lena and Boat, together with medical images, for evaluation. The best-achieved result in this research is 54.10 PSNR. The proposed method has achieved potential PSNR; however, it is still weak in resisting various attacks on images, and its efficiency in processing time has not been proven. Mansour et al. [7] have addressed sharing electronic patient information in telemedicine systems and improving the security of patient services over the cloud. RDH (Reversible Data Hiding) has been proposed based on Lagrange's interpolation polynomial, secret sharing, and bit substitution. To evaluate the proposed solution, PSNR, payload, entropy, standard deviation, and cross-correlation have been computed and compared. This paper has used sixteen general and medical images for evaluation. The best-achieved result in this research is 52.38 PSNR. Several image samples have been tested, and the proposed method suffers from high distortion and lack of capacity. Hiding electronic patient records (EPR) in medical images using LSB (Least Significant Bit) is proposed by Loan et al. [8].
Reversible data hiding schemes have been proposed based on the Pixel Repetition Method (PRM) and edge detection. Technically, PSNR, MSE, SSIM, and BER have been computed and compared. This paper has used twenty general images and ten medical images for evaluation. The best-achieved result in this research is 58.832 PSNR. On the other hand, the few image samples used do not confirm high efficiency. Hiding medical images for patients to preserve the integrity and security of medical information has been discussed by Kim et al. [9]. A steganographic data hiding scheme (HDH) has been proposed, combining Hamming code and LSB with an optimal pixel adjustment process algorithm. Technically, PSNR and MSE have been computed and compared. This paper has used ten medical images for evaluation. The best-achieved result in this research is 52.44 PSNR. Although the proposed method has achieved potential PSNR, ten image samples are still too few to prove its performance. Wang et al. [10] have addressed avoiding malicious attack and tampering during data transmission of EHR (Electronic Health Records) to secure patient privacy. A high-capacity spatial-domain data hiding scheme has been proposed to hide information using four LSBs (Least Significant Bits). Technically, PSNR, weighted PSNR (wPSNR), MSE, and SSIM have been computed and compared. This paper has used ten medical images for evaluation. The best-achieved result in this research is 66.08 PSNR. On the other hand, while image quality deterioration is not obvious, the average loss of PSNR is 0.45; therefore, the proposed method is not guaranteed to improve concealment capacity. Another research, by Gao et al. [11], has discussed new methods for hiding secret data and medical images losslessly and with low distortion. Reversible Data Hiding with an Automatic Contrast Enhancement algorithm (RDHACEM) has been proposed for hiding medical images. This method separates medical images into a region of interest (ROI) and non-ROI. To evaluate the proposed solution, PSNR, SSIM, RCE (Relative Contrast Error), Relative Mean Brightness Error (RMBE), and Mean Opinion Score (MOS) have been computed and compared. This paper has used six medical images, with twelve marked images for each, and four algorithms (RHCRDH, ACERDH, RDHMBP, and RDHACEM) for evaluation. The best-achieved result in this research is 26.5136 PSNR for the Brain-1 image. Although several images have been tested, image robustness and data security protections are not clarified. Abdel Raouf et al. [12] have addressed protecting secret messages from unauthorized access using the steganography technique. A new data hiding model has been proposed based on the human visual system (HVS) and Least Significant Bits (LSB). Technically, PSNR, MSE, and SSIM have been computed and compared. This paper has used six images as the dataset for evaluation. The best-achieved result in this research is an average PSNR of 84.48. Finally, only a few images have been tested, which is not enough for an accurate evaluation.
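The basic LSB substitution underlying the schemes above can be sketched as follows; pixel values are 8-bit integers and the message is a list of bits. Function names are illustrative.

```python
def embed_lsb(pixels, bits):
    """Hide a bit string in the least significant bits of pixel values."""
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # replace the LSB with a message bit
    return stego

def extract_lsb(pixels, n_bits):
    """Recover n_bits message bits from the pixel LSBs."""
    return [p & 1 for p in pixels[:n_bits]]
```

Because each pixel changes by at most 1 gray level, the distortion is nearly invisible, which is why LSB methods dominate the high-PSNR results reported in this subsection.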
2.2 Most Significant Bit (MSB)

Most Significant Bit hiding methods are considered sensitive, as they may cause a high error in comparison with the original image. Yang et al. [13] have discussed the weakness of traditional LSB and how to increase secret information protection. Most Significant Bit (MSB) hiding has been proposed to increase hiding capacity and reduce stego-image distortion, thereby improving quality and reducing the bit error rate (BER) when distortion occurs. Moreover, they have applied BOW (Bag of Words), a model that extracts visible words (VW) to represent confidential information. To evaluate the proposed solution, PSNR, MSE, SSIM, and QI have been computed and compared. The proposed model has achieved potential PSNR; however, it is not clear how it resists distortion. The best-achieved result in this research, for hiding 2601 bits of secret information, is a PSNR near to infinity. Although high performance has been achieved in hiding information, the method is still prone to bit errors and sudden attacks. Abd El-Latif et al. [14] have discussed concealing quantum secret images inside a cover image. Two new and efficient information hiding approaches have been proposed, based on MSQb and LSQb. To evaluate the proposed solution, PSNR and MSE have been computed and compared. This paper has used six cover and secret images and nine carrier and watermark images for evaluation. The best-achieved result in this research is 48.1897 PSNR for carrier and watermark images and 46.7901 PSNR for cover and secret images. On the other hand, although several images have been tested, the proposed method has not been thoroughly investigated on known data.
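The sensitivity of MSB hiding noted above follows directly from bit significance: flipping the LSB of an 8-bit pixel changes it by 1 gray level, while flipping the MSB changes it by 128. A minimal illustration:

```python
def flip_bit(pixel, bit_index):
    """Flip one bit of an 8-bit pixel value (bit 0 = LSB, bit 7 = MSB)."""
    return pixel ^ (1 << bit_index)

# Changing the LSB alters a pixel by at most 1 gray level, while
# changing the MSB alters it by 128 — hence the higher distortion
# risk that MSB-based hiding schemes must compensate for.
```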
2.3 Other Spatial Hiding Models

Other spatial hiding methods are depicted in this section. Li et al. [15] have addressed tracing illegal disclosure of medical images in a multicast communication environment and defining a set of requirements, in addition to the challenges of enhancing patient privacy. This research has proposed multicast fingerprinting based on IA-W watermarking; moreover, invertible watermarking is proposed for integrity protection. To evaluate the proposed solution, PSNR, QI, and MSME have been computed and compared. This paper has used thirty-one images with five modalities for evaluation. The best-achieved result in this research is 86.34 PSNR with a compression ratio of 10:1. Although the best performance has been achieved, tracing remains hard because of the number of comparisons. Kawa et al. [16] have discussed protecting electronic patient records (EPR) from unauthorized access in medical images. This research has proposed Optimal Pixel Repetition (OPR) to secure EPR inside a medical image; moreover, a reversible framework is proposed in three phases. To evaluate the proposed solution, PSNR, NCC, SSIM, BER, and NAE have been computed and compared. Nine images obtained from several databases have been used for evaluation. The best-achieved result in this research is an average PSNR of 42. The proposed model requires significant extraction time when hiding a large amount of data. Maximizing and improving hiding capacity for medical images has been discussed by Al-Qershi et al. [17]. Two reversible data hiding schemes have been proposed based on Difference Expansion (DE); furthermore, the proposed method combines the scheme of Chiang with those of Tian and Alattar. To evaluate the proposed solution, PSNR and SSIM have been computed and compared. This paper has used sixteen DICOM images for evaluation.
The best-achieved result in this research is 95.59 PSNR for the Chiang method and 82.34 PSNR for the (Tian and Chiang) method with an embedding capacity of 0.1. On the other hand, a small number of image samples have been tested; moreover, there is no clear security metric, as most metrics concern image quality. Parah et al. [18] have discussed providing a high degree of security for EHR in the healthcare system by hiding images for smart city applications. This paper has proposed a high-capacity, secure, and computationally efficient Electronic Health Record (EHR) hiding technique for medical images in Internet of Things (IoT) based healthcare. Moreover, the proposed technique has used the Pixel Repetition Method (PRM) and modular arithmetic. Technically, PSNR, MSE, NAE, and NCC have been computed and compared. This paper has used twelve images, including six general images and six medical images, in addition to twenty UCID images, for evaluation. The best-achieved result in this research is 45.42 PSNR. On the other hand, the proposed technique has not shown clear efficiency.
3 Frequency Hiding Models

Based on manipulating an orthogonal transform of the image instead of the image itself, this section addresses the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT).
3.1 Discrete Cosine Transform (DCT)

The Discrete Cosine Transform is a commonly used mathematical transformation that has been applied in several areas, including frequency-based hiding approaches. Ayub et al. [19] have discussed hiding data in a highly secure position. This paper has proposed a novel technique to hide secret information along edges inside images; several filters are used to detect edges, and data is hidden via DCT. To evaluate the proposed solution, PSNR, MSE, and SNR have been computed and compared. Ten BMP and JPEG images have been used for evaluation. The proposed model has resulted in a best PSNR of 84.9 with a Canny filter for BMP images; furthermore, the best-achieved PSNR for JPEG images is 99.8, also with a Canny filter. Although the proposed model has achieved potential PSNR, the reduction in resultant image size might be noticed by intruders. Liao et al. [20] have discussed how to conceal medical JPEG images and safeguard patients' medical information. A new medical JPEG image steganographic scheme has been proposed that relies on the dependencies of inter-block coefficients. To evaluate the proposed solution, nzAC, CR, CEC, and ECR have been computed and compared. This paper has used twelve images for evaluation. The best-achieved result in this research is an incremental testing error ratio of less than 1%. The destruction of inter-block dependencies among DCT coefficients causes a potential weakness in performance. Hiding a medical image inside another medical image to protect electronic patient records from leakage is discussed by Elhadad et al. [21]. This paper has proposed a steganography technique based on DICOM medical images; the proposed method includes three main parts: preprocessing, data embedding based on DCT, and an extraction process.
To evaluate the proposed solution, PSNR, MSE, SSIM, UQI, and the correlation coefficient (R) were computed and compared, using images from twenty patients. The best-achieved result is a PSNR of 85.39 dB. Several image samples were evaluated; however, the file-size compression ratio remains low, varying between 0.4770 and 0.4988.
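The PSNR and MSE figures quoted throughout this survey follow the standard definitions; the following is a minimal sketch for 8-bit images (the function names are ours, not from the cited papers):

```python
import numpy as np

def mse(cover: np.ndarray, stego: np.ndarray) -> float:
    """Mean squared error between cover and stego images."""
    diff = cover.astype(np.float64) - stego.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(cover: np.ndarray, stego: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means less visible distortion."""
    m = mse(cover, stego)
    if m == 0:
        # identical images: this is the "PSNR near to infinity" case of Table 2
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / m)
```

For example, flipping a single pixel of an 8x8 image by one gray level gives an MSE of 1/64 and a PSNR of roughly 66 dB, which is the order of magnitude of the best spatial-domain results surveyed here.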
498
H. K. Alzubaidy et al.
Arunkumar et al. [22] have addressed securing secret medical images and sensitive healthcare information against unauthorized use during transmission. A robust image steganographic method is proposed that combines the Redundant Integer Wavelet Transform (RIWT), the Discrete Cosine Transform (DCT), Singular Value Decomposition (SVD), and the logistic chaotic map. Technically, PSNR, IF, NAE, MSE, MSSIM, and NCC were computed and compared, using eight images for evaluation and four for comparison. The best-achieved result is a PSNR of 50.18 dB, and the average PSNR over the four selected images (Lena, Airplane, Pepper, and Baboon) is 49.77 dB, compared against other methods. However, only a few image samples were used, which is not enough for thorough testing.

Another hiding method, proposed by Yang et al. [23], addresses enhancing hiding capacity while maintaining quality for JPEG images. RDHS (an adaptive real-time reversible data hiding scheme) is proposed, based on DCT coefficients rearranged in a zigzag manner. To evaluate the solution, PSNR and Embedding Capacity (EC) were computed and compared, using six stego-images such as Zelda and Baboon. The best-achieved result is a PSNR of 47.27 dB. On the other hand, the method requires one extra pass before the data is embedded or decoded, so it is harder to implement and needs more execution time. Moreover, too few images were tested to fully evaluate its performance.
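As an illustration of DCT-domain hiding in general (not the specific algorithms of [19] or [23]), the sketch below hides one bit in a mid-frequency coefficient of an 8x8 block by parity quantisation; the coefficient position and quantisation step q are arbitrary choices of ours:

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II matrix: C @ x applies a 1-D DCT to a length-n vector."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def embed_bit_in_block(block: np.ndarray, bit: int, pos=(4, 3), q: float = 8.0) -> np.ndarray:
    """Hide one bit in a mid-frequency DCT coefficient of an 8x8 block by
    quantising the coefficient to an even (bit 0) or odd (bit 1) multiple of q."""
    C = dct_matrix(8)
    coeffs = C @ block @ C.T          # forward 2-D DCT
    level = np.round(coeffs[pos] / q)
    if int(level) % 2 != bit:
        level += 1
    coeffs[pos] = level * q
    return C.T @ coeffs @ C           # inverse 2-D DCT

def extract_bit(block: np.ndarray, pos=(4, 3), q: float = 8.0) -> int:
    """Recover the hidden bit from the coefficient's parity."""
    C = dct_matrix(8)
    coeffs = C @ block @ C.T
    return int(np.round(coeffs[pos] / q)) % 2
```

Because only one mid-frequency coefficient is perturbed, the spatial distortion stays small, which is why DCT-domain schemes can reach the high PSNR values reported above.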
3.2 Discrete Wavelet Transform (DWT)

The Discrete Wavelet Transform is another common mathematical transform that is widely used in steganography models. Subramaniyam et al. [24] have discussed secure data communication and hiding secret messages from hackers; their method combines and separates images based on frequency segments. Technically, PSNR and MSE were computed and compared, using three JPEG images as both cover and secret images in the evaluation process. The best-achieved result is a PSNR of 53.771 dB. However, very few image samples were tested.

Pandey et al. [25] have discussed the safety of sensitive medical data during transmission and reception. A bitmask-oriented genetic algorithm (BMOGA) is proposed to reduce the redundancy of medical test data during transmission, with steganography as an additional hiding technique. PSNR, SC, SSIM, MSE, BER, and correlation were computed and compared, using five medical images. The best-achieved result is a PSNR of 74.69 dB. However, the processing time is not reported, which makes it significantly harder to decide whether the model can be applied in the real world.

A Survey on Patients Privacy Protection …

499

Another study, by Borraa et al. [26], discusses providing security and integrity for color radiological images during transmission. A hybrid, high-capacity image hiding technique is proposed based on the Fast Discrete Curvelet Transform (FDCuT) and Compressive Sensing (CS) for hiding color secret images. Technically, PSNR, MSE, SSIM, and MS-SSIM were computed and compared, using five images such as a knee MRI and a liver ultrasound. The best-achieved result is a PSNR of 70.27 dB; a few medical images, however, are not enough to establish efficiency.
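DWT-based schemes such as those above embed data in wavelet sub-bands. A minimal single-level 2-D Haar transform pair (our own sketch, not the transform used by any specific cited paper) shows where those sub-bands come from:

```python
import numpy as np

def haar_dwt2(img: np.ndarray):
    """One level of the 2-D Haar DWT: returns the (LL, LH, HL, HH) sub-bands."""
    a = img.astype(np.float64)
    # filter along rows: averages (lo) and differences (hi)
    lo = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # filter along columns
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    h, w = ll.shape
    lo = np.zeros((2 * h, w))
    hi = np.zeros((2 * h, w))
    lo[0::2, :] = ll + lh
    lo[1::2, :] = ll - lh
    hi[0::2, :] = hl + hh
    hi[1::2, :] = hl - hh
    out = np.zeros((2 * h, 2 * w))
    out[:, 0::2] = lo + hi
    out[:, 1::2] = lo - hi
    return out
```

A hiding scheme would typically perturb the detail sub-bands (for example HH, where changes are least visible) and then reconstruct the stego-image with `haar_idwt2`.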
4 Visual Encryption Models

Visual encryption is a well-known technique for protecting patient privacy by visually encrypting patients' visual records. Kaur et al. [27] have addressed encryption and decryption for concealing secret images. IHED (Image Hiding Encryption and Decryption) is proposed to encrypt and decrypt images; the encryption process operates on the Mid Frequency (MF) components, which are specified by the Mid Search African Buffalo Model (MSABM). Technically, PSNR, MSE, and SSIM were computed and compared, using five MRI images such as heart and liver scans. The best-achieved result is a PSNR of 71.6677 dB. Only a few images were evaluated, and the proposed model is inefficient against different types of attacks.

Elhoseny et al. [28] have addressed securing and hiding confidential patient data in electronic healthcare systems. A hybrid encryption model is proposed that encrypts secret patient data using two algorithms, RSA and AES, combined with a 2D-DWT-1L or 2D-DWT-2L steganography technique. The encoded data is hidden inside a cover image; the embedded data is then extracted and decoded to retrieve the original. Technically, PSNR, MSE, SSIM, BER, SC, and correlation were computed and compared, using five color and five grayscale images. The best-achieved results are a PSNR of 57.44 dB for color images and 56.09 dB for grayscale images. On the other hand, the processing time is not computed, which makes it significantly harder to assess the model's efficiency in a real scenario.

Mechanisms to analyze and enhance multilevel techniques against security threats have been discussed by Panwar et al. [29].
The aim is to ensure greater integrity and security, and sufficient hiding capacity, for confidential patient information embedded in images, since the impact of potential attacks and noise on communication channels is clear. A hybrid encryption model is proposed based on Quantum Chaos (QC) and RSA encryption; the encrypted data is then embedded into an image using an improved BPCS technique to obtain the stego-image. To evaluate the solution, PSNR, MSE, BER, SSIM, SC, UIQI, MAE, JI, BC, IC, and CC were computed and compared, using five images. The best-achieved result is a PSNR of 71.38 dB in the absence of attacks. The proposed method achieves a high level of security; however, testing on only five images does not prove high efficiency.
Prasanalakshmi et al. [30] have discussed securing patients' medical data during transmission in the IoT. HECC (Hyperelliptic Curve Cryptography) is proposed, based on the AES algorithm, Blowfish hybrid cryptography, and the Koblitz method. Technically, PSNR, MAE, SSIM, SC, and correlation were computed and compared, using eight sample images. The best-achieved result is a PSNR of 70 dB. Many modern methods were compared and tested; however, the processing time is not clarified.

Another study, by Shastri et al. [1], addresses securing secret private data against leakage and unauthorized access using encryption and steganography techniques. A dual-image RDH algorithm is proposed based on the Centre Folding Strategy (CFS) and the Shiftable Pixel Coordinate Selection Strategy. To evaluate the solution, PSNR, MSE, SSIM, and embedding rate were computed and compared, using six images such as Lena and Barbara. The best-achieved result is a PSNR of 50.66 dB. A low embedding level can cause overflow or underflow problems in some bits, so the image quality depends on the number of pixels. In addition, the execution time of the proposed method is the slowest among the compared techniques.
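Several of the schemes above ([22], [29]) rely on chaotic maps as part of their encryption stage. The toy sketch below visually encrypts an image by XOR with a logistic-map keystream; it illustrates only the chaotic-keystream idea and is none of the cited algorithms, nor is it cryptographically secure on its own (the cited papers pair such maps with RSA/AES):

```python
import numpy as np

def logistic_keystream(n: int, x0: float = 0.7, r: float = 3.99) -> np.ndarray:
    """Byte keystream from the logistic map x -> r*x*(1-x); (x0, r) act as the key."""
    x = x0
    out = np.empty(n, dtype=np.uint8)
    for i in range(n):
        x = r * x * (1.0 - x)
        out[i] = int(x * 256) % 256
    return out

def xor_encrypt(img: np.ndarray, x0: float = 0.7, r: float = 3.99) -> np.ndarray:
    """Visually encrypt an 8-bit image by XOR with the chaotic keystream.
    Applying it twice with the same key restores the original image."""
    ks = logistic_keystream(img.size, x0, r).reshape(img.shape)
    return img ^ ks
```

Because XOR is its own inverse, the same function with the same (x0, r) key both encrypts and decrypts, which is the property these schemes exploit before the embedding step.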
5 Analysis and Evaluation

This paper has presented thirty models: twenty-five steganography techniques and five visual encryption models. The analysis relies on the PSNR results extracted from all papers in order to evaluate each model's efficiency. Tables 1, 2 and 3 show the best PSNR results for spatial hiding models for medical images, distributed into three groups: Least Significant Bit (LSB), Most Significant Bit (MSB), and other spatial hiding models. Furthermore, Table 4 shows the best PSNR results for frequency hiding models, distributed into two groups: Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT). Table 5 gives the best PSNR results for the visual encryption models. The results in Tables 1, 2, 3, 4 and 5 are also represented visually in the bar chart in Fig. 1.
Table 1 Best PSNR for spatial hiding models with Least Significant Bit (LSB) for medical images

Category: Least Significant Bit (LSB)
- Image region decomposition (IRD) based on intensity [3]: 49.27 dB
- Data hiding method using a magic cube and the least significant bit (LSB) [4]: 44 dB
- A (k, n) Shamir's secret sharing scheme [5]: 46.3702 dB
- New optimization method based on optimum pixel similarity [2]: 66.5374 dB
- Dual-image separable block-based reversible data hiding algorithm [6]: 54.10 dB
- RDH (Reversible Data Hiding) based on Lagrange's interpolation polynomial, secret sharing, and bit substitution [7]: 52.38 dB
- Reversible data hiding scheme based on the Pixel Repetition Method (PRM) and edge detection [8]: 58.832 dB
- Steganographic data hiding scheme (HDH) with the OPAP (optimal pixel adjustment process) algorithm [9]: 52.44 dB
- High-steganography-capacity spatial-domain data hiding scheme [10]: 66.08 dB
- Reversible data hiding with automatic contrast enhancement (RDHACEM) [11]: 26.5136 dB
- Two new and efficient information hiding approaches based on MSQb and LSQb [14]: 46.7901 dB
- New data hiding model based on the human visual system (HVS) [12]: 84.48 dB

Table 2 Best PSNR for spatial hiding models with Most Significant Bit (MSB) for medical images

Category: Most Significant Bit (MSB)
- Most significant bit (MSB) model [13]: PSNR near to infinity
- Two new and efficient information hiding approaches based on MSQb and LSQb [14]: 48.1897 dB

Table 3 Best PSNR for other spatial hiding models for medical images

Category: Other spatial hiding models
- Multicast fingerprinting based on IA-W watermarking [15]: 86.34 dB
- Optimal pixel repetition (OPR) and the reversible framework [16]: 42 dB
- Two reversible data hiding schemes based on difference expansion (DE) [17]: 95.59 dB
- High-capacity, secure, and computationally efficient Electronic Health Record (EHR) hiding with the pixel repetition method (PRM) and modular arithmetic [18]: 45.42 dB

Table 4 Best PSNR for frequency hiding models with DCT and DWT for medical images

Category: Discrete Cosine Transform (DCT)
- Novel technique to hide secret information along edges inside images [19]: 99.8538 dB
- New medical JPEG image steganographic scheme based on inter-block coefficient dependencies [20]: not reported
- Steganography technique based on DICOM medical images [21]: 85.39 dB
- Robust image steganographic method combining the Redundant Integer Wavelet Transform (RIWT), the Discrete Cosine Transform (DCT), Singular Value Decomposition (SVD), and the logistic chaotic map [22]: 50.18 dB
- RDHS (adaptive real-time reversible data hiding scheme) based on DCT coefficients [23]: 47.27 dB

Category: Discrete Wavelet Transform (DWT)
- Discrete Wavelet Transform (DWT) method [24]: 53.771 dB
- BMOGA (bitmask-oriented genetic algorithm) [25]: 74.69 dB
- Hybrid high-capacity image hiding technique based on the Fast Discrete Curvelet Transform (FDCuT) and Compressive Sensing (CS) [26]: 70.27 dB

Table 5 Best PSNR for visual encryption models for medical images

Category: Visual encryption models
- IHED (Image Hiding Encryption and Decryption) [27]: 71.6677 dB
- Hybrid encryption model using the RSA and AES algorithms [28]: 57.44 dB
- Hybrid encryption model based on Quantum Chaos (QC) and RSA encryption [29]: 71.38 dB
- HECC (Hyperelliptic Curve Cryptography) based on the AES algorithm, Blowfish hybrid cryptography, and the Koblitz method [30]: 70 dB
- Dual-image RDH algorithm based on the Centre Folding Strategy (CFS) and the Shiftable Pixel Coordinate Selection Strategy [1]: 50.66 dB

Fig. 1 PSNR values for different methods and techniques of steganography

6 Conclusion

In conclusion, this paper has presented thirty models, divided into spatial-domain and frequency-domain approaches for hiding medical images, together with visual encryption models for encrypting them. The spatial hiding models are divided into Least Significant Bit (LSB), Most Significant Bit (MSB), and other spatial hiding models; the frequency hiding models are divided into Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT) approaches. Technically, the PSNR of each included model is extracted from the literature to evaluate its efficiency, and the average PSNR is calculated for each group and compared. The frequency hiding models performed best, with an average PSNR of 70.66 dB for the DCT group, while the averages for the spatial hiding models are 53.9827 dB (LSB), 48.1897 dB (MSB), and 67.3375 dB (other spatial models); the average PSNR of the visual encryption models is 64.22954 dB.
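The group averages quoted above can be reproduced directly from the best-PSNR values collected in Tables 1, 2, 3 and 5 (the DCT group is omitted here because [20] reports no PSNR value):

```python
import numpy as np

# Best PSNR values (dB) from the survey's tables; reference numbers in comments.
lsb = [49.27, 44, 46.3702, 66.5374, 54.10, 52.38, 58.832, 52.44, 66.08,
       26.5136, 46.7901, 84.48]                 # Table 1, refs [2]-[12], [14]
msb = [48.1897]                                 # Table 2, ref [14] ([13] reports PSNR near infinity)
other_spatial = [86.34, 42, 95.59, 45.42]       # Table 3, refs [15]-[18]
visual = [71.6677, 57.44, 71.38, 70, 50.66]     # Table 5, refs [1], [27]-[30]

for name, vals in [("LSB", lsb), ("MSB", msb),
                   ("Other spatial", other_spatial), ("Visual encryption", visual)]:
    print(f"{name}: {np.mean(vals):.4f} dB")
```

The computed means match the group averages stated in the conclusion (53.9827, 48.1897, 67.3375, and 64.22954 dB, respectively).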
References

1. S. Shastri, V. Thanikaiselvan, Dual image reversible data hiding using trinary assignment and centre folding strategy with low distortion. J. Vis. Commun. Image Represent. (2019)
2. S. Karakus, E. Avci, A new image steganography method with optimum pixel similarity for data hiding in medical images. J. Med. Hypotheses (2020)
3. G.F. Siddiqui, M.Z. Iqbal, K. Saleem, Z. Saeed, A. Ahmed, I.A. Hameed, M.F. Khan, A dynamic three-bit image steganography algorithm for medical and e-healthcare systems. IEEE Access (2020)
4. C.F. Lee, J.J. Shen, S. Agrawal, Y.X. Wang, Y.H. Lee, Data hiding method based on 3D magic cube. IEEE Access (2020)
5. M. Ulutas, G. Ulutas, V.V. Nabiyev, Medical image security and EPR hiding using Shamir's secret sharing scheme. J. Syst. Softw. (2010)
6. R. Bhardwaj, An improved separable reversible patient data hiding algorithm for E-healthcare. Multimedia Tools Appl. (2020)
7. R.F. Mansour, S.A. Parah, Reversible data hiding for electronic patient information security for telemedicine applications. Arab. J. Sci. Eng. (2021)
8. N.A. Loan, S.A. Parah, J.A. Sheikh, J.A. Akhoon, G.M. Bhat, Hiding electronic patient record (EPR) in medical images: a high capacity and computationally efficient technique for e-healthcare applications. J. Biomed. Inf. (2017)
9. C. Kim, D. Shin, B.G. Kim, Secure medical images based on data hiding using a hybrid scheme with the Hamming code, LSB, and OPAP. J. Real-Time Image Process. (2017)
10. D. Wang, D. Chen, B. Ma, L. Xu, J. Zhang, A high capacity spatial domain data hiding scheme for medical images. J. Sign. Process. Syst. (2016)
11. G. Gao, S. Tong, Z. Xia, B. Wu, L. Xu, Z. Zhao, Reversible data hiding with automatic contrast enhancement for medical images. J. Sign. Process. (2020)
12. A. AbdelRaouf, A new data hiding approach for image steganography based on visual color sensitivity. Multimedia Tools Appl. (2020)
13. L. Yang, H. Deng, X. Dang, A novel coverless information hiding method based on the most significant bit of the cover image. IEEE Access (2020)
14. A.A. Abd El-Latif, B. Abd-El-Atty, M.S. Hossain, M.A. Rahman, A. Alamri, B.B. Gupta, Efficient quantum information hiding for remote medical image sharing. IEEE Access (2018)
15. M. Li, R. Poovendran, S. Narayanan, Protecting patient privacy against unauthorized release of medical images in a group communication environment. Comput. Med. Imag. Graph. (2005)
16. J.A. Kaw, N.A. Loan, S.A. Parah, K. Muhammad, J.A. Sheikh, G.M. Bhat, A reversible and secure patient information hiding system for IoT driven e-health (2018)
17. O. Qershi, B.E. Khoo, High capacity data hiding schemes for medical images based on difference expansion. J. Syst. Softw. (2010)
18. S. Parah, J. Sheikh, J. Akhoon, N. Loan, Electronic health record hiding in images for smart city applications: a computationally efficient and reversible information hiding technique for secure communication. J. Future Gener. Comput. Syst. (2018)
19. N. Ayub, A. Selwal, An improved image steganography technique using edge based data hiding in DCT domain. J. Interdisc. Math. (2020)
20. X. Liao, J. Yin, S. Guo, X. Li, A.K. Sangaiah, Medical JPEG image steganography based on preserving inter-block dependencies. Comput. Electr. Eng. (2017)
21. A. Elhadad, A. Ghareeb, S. Abbas, A blind and high-capacity data hiding of DICOM medical images based on fuzzification concepts. Alexandria Eng. J. (2021)
22. S. Arunkumar, V. Subramaniyaswamy, V. Varadarajan, N. Chilamkurti, R. Logesh, SVD-based robust image steganographic scheme using RIWT and DCT for secure transmission of medical images. J. Measure. (2019)
23. C.N. Yang, C. Kim, Y.H. Lo, Adaptive real-time reversible data hiding for JPEG images. J. Real-Time Image Process. (2015)
24. G. Murugan, R. Uthandipalayam Subramaniyam, Performance analysis of image steganography using wavelet transform for safe and secured transaction. Multimedia Tools Appl. (2020)
25. H.M. Pandey, Secure medical data transmission using a fusion of bit mask oriented genetic algorithm, encryption and steganography. Future Gener. Comput. Syst. (2020)
26. S. Borraa, R. Thanki, N. Dey, K. Borisagar, Secure transmission and integrity verification of color radiological images using fast discrete curvelet transform and compressive sensing. Smart Health (2018)
27. S. Kaur, S. Bansa, R.K. Bansa, Image steganography for securing secret data using hybrid hiding model. Multimedia Tools Appl. (2020)
28. M. Elhoseny, G. Ramírez-González, O.M. Abu-Elnasr, S.A. Shawkat, N. Arunkumar, A. Farouk, Secure medical data transmission model for IoT-based healthcare systems. IEEE Access (2018)
29. P. Panwar, S. Dhall, S. Gupta, A multilevel secure information communication model for healthcare systems. Multimedia Tools Appl. (2020)
30. B. Prasanalakshmi, K. Murugan, K. Srinivasan, S. Shridevi, S. Shamsudheen, Y.-C. Hu, Improved authentication and computation of medical data transmission in the secure IoT using hyperelliptic curve cryptography. J. Supercomput. (2021)
Survey on Channel Estimation Schemes for mmWave Massive MIMO Systems – Future Directions and Challenges

V. Baranidharan, B. Moulieshwaran, V. Karthick, M. K. Munavvar Hasan, R. Deepak, and A. Venkatesh

Abstract In recent years, fifth-generation (5G) and beyond cellular networks have been developed and have an enormous number of applications across different domains. 5G technology requires a large number of antennas and must support high data rates, very low latency, and high connection density. To implement 5G communication systems effectively and meet these requirements, millimeter-wave (mmWave) massive Multiple-Input Multiple-Output (MIMO) is one of the most promising technologies. However, the characteristics of millimeter-wave propagation make effective signal transmission and reception a challenging task. In this paper, we present a comprehensive, state-of-the-art review of the various channel estimation schemes for mmWave massive MIMO systems, and we compare the existing solutions in the literature in terms of their benefits and shortcomings. The paper also points researchers toward open research directions and challenges in channel estimation for massive MIMO systems.

Keywords Massive multi input multi output · Channel estimation · Latency · Data rates · Connection density · mmWave
V. Baranidharan · B. Moulieshwaran · V. Karthick (B) · M. K. M. Hasan · R. Deepak · A. Venkatesh
Department of Electronics and Communication Engineering, Bannari Amman Institute of Technology, Sathy, India
e-mail: [email protected]
V. Baranidharan
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_38

1 Introduction

Recent developments in areas such as augmented reality, 3D video transmission, the UWB band, industrial IoT, and smart cities generate an enormous amount of mobile traffic, so the resulting capacity demand will need to be addressed in the near future. The 5G era promises to support more data users, reduce latency, increase connection density, and reduce cost and power consumption. 5G will also meet the demand created by connecting a much larger number of users. 5G bands are based on millimeter-wave (mmWave) technology, with spectrum in the 30-300 GHz range [1]. Substantial attenuation at these signal frequencies, and severe path loss due to atmospheric absorption, rain, and fog, create problems in signal reception. Unfortunately, such high-frequency signals propagate over only a very short range in indoor applications. To overcome these issues, high-gain, highly directional, massive antenna arrays are needed to support large numbers of users in 5G and beyond communication systems. High directivity is achieved by narrowing the steering beams at the transmitters and receivers, which helps steer the beam vectors toward the receivers. Fully digital transceiver architectures can be used directly in mmWave systems, but at a cost in power consumption; these architectural issues are addressed by designing purely analog, purely digital, hybrid analog-digital, and few-bit analog-to-digital converter architectures. Researchers continue to propose new algorithms and solutions to increase the channel capacity [2].

mmWave is a key enabling technology for 5G communication systems and beyond. mmWave links are highly suitable for wireless backhauling and short-range communications. Ultra-Dense Networks are highly capable of mitigating the adjacent-channel and co-channel interference produced in communication systems by using highly directional antennas [3]. Remaining challenges arise from the propagation characteristics of the mmWave frequency bands: very high power consumption and severe channel and hardware impairments.

The rest of this paper is organized as follows: Sect. 2 introduces mmWave MIMO and its channel characteristics; Sect. 3 gives a detailed explanation of the various existing architectures for channel estimation; Sect. 4 concludes the paper.
2 mmWave MIMO Systems and Their Characteristics

To support high-speed communication, a huge spectrum of frequency bands is needed, and the mmWave frequency bands have therefore been widely adopted for 5G and beyond systems. The propagation characteristics of mmWave technologies differ markedly from those of the traditional sub-6 GHz bands [4]. This section discusses those propagation characteristics and the associated challenges.
2.1 Propagation Characteristics

There is a major difference in propagation characteristics between the sub-6 GHz and mmWave frequency bands. The free-space path loss (PLFS) is proportional to d^n / λ², where d is the distance between the BS and the MT, n is the path-loss exponent (generally n = 2 for indoor and cellular applications, with the value chosen according to the application and environment), and λ is the wavelength of the transmitted signal. The severe path loss during mmWave signal propagation is compensated by strong antenna directionality and gain. High-frequency mmWave signals are further attenuated by oxygen and water-vapour absorption, rain scattering, and other scattering effects [5]. This atmospheric attenuation depends on the operating frequency: oxygen absorption is severe in the 60-120 GHz range, water-vapour absorption is very high around 180 GHz, and rain attenuation is greater than at sub-6 GHz frequencies. Severe penetration loss also occurs due to static and dynamic blockage effects, a further factor affecting mmWave propagation. Static and dynamic blockage arises from obstacles such as human bodies, glass, walls, and doors; in practice, such signals can penetrate at most about two walls and four doors, so serving indoor users from outdoor base stations (BSs) is not realistic because of the severe indoor losses. mmWave data transmission therefore remains vulnerable to static and dynamic blockage, atmospheric attenuation, the operating frequency range, and changes in system design and architecture [6].
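The d^n / λ² relation can be made concrete with a short path-loss calculation (a standard Friis-type free-space sketch, not taken from [5] or [6]); it shows the roughly 20 dB of extra loss incurred when moving from 2.8 GHz to 28 GHz at the same distance:

```python
import math

def free_space_path_loss_db(d_m: float, f_hz: float, n: float = 2.0) -> float:
    """Path loss in dB for distance d and frequency f, generalising the
    d**n / lambda**2 relation in the text (n = 2 is the free-space case)."""
    lam = 3e8 / f_hz  # wavelength in metres
    return 10.0 * math.log10((4.0 * math.pi) ** 2 * d_m ** n / lam ** 2)

# At 100 m, a 28 GHz link loses about 20 dB more than a 2.8 GHz link,
# which is why mmWave systems need high-gain directional antennas.
print(free_space_path_loss_db(100, 28e9))
print(free_space_path_loss_db(100, 2.8e9))
```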
2.2 mmWave Technology

mmWave communication is one of the most promising technologies for 5G communication systems. Advances in digital signal processing, such as massive MIMO, help in developing 5G systems through effective circuit design and integration [7]. The technology uses a large number of antennas at the Base Stations (BS) and a large number of mobile terminals (MT), which greatly increases the channel capacity of the cellular networks used in 5G and beyond communication systems. The high directivity of such 5G antenna systems gives better efficiency than conventional systems; to achieve this efficiency, antenna arrays are used. The large antenna counts are designed in such a way that CMOS-type circuits can be widely used to increase the level of circuit integration.
Fig. 1 General architecture of the mmWave communication systems. Source [4], p. 6
mmWave communication systems employ a large number of transmitting antennas at both the transmitter and receiver ends; researchers have used 32, 64, 128 or 264 antennas at the BSs and MTs. For an integrated mmWave system, power consumption and cost constrain how many antennas can be integrated. For effective signal processing, baseband signal precoding is needed at the transmitter, with combining at the receiver; dedicated RF chains are then connected to the transmitting and receiving antennas, with each RF chain connected to one antenna and, for better efficiency, including an Analog-to-Digital Converter (ADC). Figure 1 depicts the generalized block diagram of the massive MIMO architecture. Assuming no loss, a BS is equipped with M RF chains connected to antennas, and the MT has N RF chains at the transmitter side [8]. In general, the antennas communicate through the MT's RF chains, where NT is the number of transmitting antennas at the mobile terminals. Figure 1 considers only point-to-point communication, but the architecture extends to multiple users and multiple antennas. In combination, the baseband signal processing and the analog circuits of the various hybrid architectures are used to direct the antenna beams at the transmitter and receiver ends [9].
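The dimensions involved in the hybrid architecture of Fig. 1 can be sketched as follows (the sizes are illustrative choices of ours; the analog stages are modelled as unit-modulus phase shifters):

```python
import numpy as np

rng = np.random.default_rng(0)

Nt, Nr = 64, 16   # transmit / receive antennas (illustrative sizes)
Mt, Mr = 4, 4     # RF chains at the transmitter / receiver
Nd = 2            # data streams

# Narrowband channel matrix (Rayleigh entries, for illustration only).
H = (rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))) / np.sqrt(2)

# Analog precoder/combiner: phase shifters only, so entries have unit modulus.
F_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (Nt, Mt)))
W_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (Nr, Mr)))

# Baseband (digital) precoder/combiner.
F_bb = rng.standard_normal((Mt, Nd)) + 1j * rng.standard_normal((Mt, Nd))
W_bb = rng.standard_normal((Mr, Nd)) + 1j * rng.standard_normal((Mr, Nd))

s = rng.standard_normal((Nd, 1)) + 1j * rng.standard_normal((Nd, 1))
y = W_bb.conj().T @ W_rf.conj().T @ H @ F_rf @ F_bb @ s  # received symbol vector
print(y.shape)  # (2, 1)
```

The point of the sketch is the dimensionality: Nd streams are carried over Nt antennas using only Mt RF chains, which is exactly the cost saving that motivates the hybrid design.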
3 Existing Architectures for Channel Estimation in Massive MIMO Systems

In recent years, authors have investigated and developed solutions to different channel estimation problems for mmWave MIMO systems. These systems are equipped with large numbers of antennas at the Base Stations (BSs) and Mobile Terminals (MTs), often with lens antenna arrays, for effective data transmission and reception. This section discusses some of the recently proposed architectures.
Fig. 2 Diagram of the CP decomposition of a third-order tensor. Source [10], p. 3
3.1 Tensor-Based Channel Estimation Scheme

In this work, the authors introduce a tensor-theoretic channel estimation scheme with a spatial smoothing method, designed for mmWave massive MIMO systems. Because the channel must be modeled with dual-wideband effects, the authors propose a hybrid analog-digital architecture and a dedicated SCPD algorithm for the dual-wideband effects in mmWave massive MIMO systems. The proposed DWE-SCPD algorithm increases efficiency and reduces computational complexity even in the presence of significant dual-wideband effects. A tensor stores data along multiple dimensions: a vector is one-dimensional, and the fibers and slices of a tensor are vectors and matrices, respectively. The proposed hybrid analog-digital architecture based on tensor theory with Nd data streams is designed to facilitate efficient hardware implementation [10] (Fig. 2).

In the simulated results, the NMSE versus transmission bandwidth shows that this method outperforms the existing architectures; as fs increases, it achieves better estimation performance than the existing schemes, owing to the observed phase rarefaction effects. The work shows that the critical dual-wideband effects need to be eradicated in mmWave massive MIMO scenarios and derives a new spatial-frequency channel model for such systems. Two dedicated algorithms are proposed through particular tensor processing procedures, covering the cases with negligible and with significant dual-wideband effects. The uniqueness of the structured compressive-sensing-based decomposition is guaranteed under relaxed system parameter constraints. The results show that the proposed channel estimation strategy outperforms the existing schemes in terms of estimation accuracy and computational complexity.
3.2 Beam Squint Based Channel Estimation Scheme In this work, the authors proposed a new channel estimation scheme for mmWave massive MIMO OFDM schemes using beam squint techniques (Fig. 3). Here, the authors derived the channel model of wideband mmWave massive MIMO with different consideration of a new and different spatial wideband effects,
510
V. Baranidharan et al.
Fig. 3 Hybrid precoding architecture for the Uplink and downlinks for mmWave Massive MIMO systems. Source [11], p. 5
which will be the extracted over the frequency domain based on the beam squint effect. This model is always irrelevant to the architecture behind the array and related to the array manifold. So, full-digital and hybrid analog or digital precoding systems share the same channel model. The state of channel information and the user locations are new to the Base Station and to neglect the inter-user interference the orthogonal training have to be applied and pilot contamination at the Base Station [11]. Here, the AoA, and for all users complex gain values are always estimated over the channels and it is called as a phase initial parameter extraction. Here, the authors introduce an algorithm parameter extraction, which will suffer by the successive multi-user channel estimation for the uplink and downlinks. The evaluated numerical results show the beam squint effect and it is verified the proposed new approaches in practical mmWave massive MIMO system configurations. The Base Station is always equipped with a Uniform Linear Array whose spacing between the antennas is always equal to half of the carrier wavelength for the downlink over the entire systems. The
Survey on Channel Estimation Schemes for mmWave …
uplink carrier frequency is f_c = 26 GHz and the downlink carrier frequency is f_Dc = 28 GHz. First, the proposed method focusing on initial parameter extraction is compared with the conventional off-grid compressive sensing method, which takes frequency selectivity into consideration but ignores beam squint.
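The beam squint (spatial wideband) effect itself is easy to reproduce numerically: for a ULA spaced at half the carrier wavelength, the steering-vector phases scale with f/f_c, so a beam matched at the carrier loses gain at the band edges. A small sketch under these standard ULA assumptions (the array size and frequency offset are illustrative):

```python
import numpy as np

def ula_response(n_ant, theta, f, f_c):
    """ULA steering vector with spacing d = lambda_c / 2; the phase across the
    array scales with f / f_c, which is the beam squint effect."""
    k = np.arange(n_ant)
    return np.exp(-1j * np.pi * (f / f_c) * k * np.sin(theta)) / np.sqrt(n_ant)

f_c = 26e9                                           # uplink carrier of the setup above
theta = np.deg2rad(30)
a_fc = ula_response(64, theta, f_c, f_c)
a_edge = ula_response(64, theta, f_c + 400e6, f_c)   # band-edge subcarrier

# Correlation < 1: a combiner matched at f_c loses gain at f_c + 400 MHz.
gain = abs(np.vdot(a_fc, a_edge))
```

The loss grows with the array size and with the fractional bandwidth, which is why the effect is pronounced in wideband massive MIMO and negligible in narrowband small arrays.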
3.3 Frequency Domain Compressive Channel Estimation Scheme

In this work, the authors formulate a new compressed sensing problem in the frequency domain to estimate the vectorized sparse channel vector for massive MIMO systems, and they propose two new algorithms to solve the formulated optimization problem (Fig. 4). The first technique has very good performance because it uses the common support among the different sub-channels, whereas the second algorithm uses information from a smaller number of subcarriers and hence has a lower computational complexity [12]. The main experimental results are obtained with the two suggested methods, SW-OMP and SS-SW-OMP + thresholding, together with comparisons against other channel estimation algorithms. To estimate the normalized mean squared error (NMSE) as a function of the SNR and of the number of training frames M, the authors use Monte Carlo simulations and average the findings over several trials. The simulated results show the NMSE versus SNR for the channel estimation algorithms over an SNR range of −15 to 10 dB. The performance of the DGMP algorithm is the worst, as it was designed to evaluate near-LoS mmWave MIMO channels; its estimation error for NLoS channels is large because it estimates only a single path. The estimation variance in the weakest paths is lower in the high-SNR regime than in the low-SNR regime, and the paths are better separated at higher SNR because the threshold is set by the noise variance [13]. As the value of M increases, the estimation error decreases accordingly. The OMP algorithm's average performance is poor in all cases because it does not exploit the common support property shared by the different sub-channels.
Fig. 4 Compressive Channel Estimation Scheme for mmWave MIMO systems. Source [12], p. 4
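The greedy recovery that SW-OMP builds on can be sketched generically. The following is plain OMP with an illustrative random dictionary, not the authors' exact frequency-domain algorithm:

```python
import numpy as np

def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedily pick the dictionary column most
    correlated with the residual, then re-fit the picked set by least squares."""
    residual, support = y.copy(), []
    for _ in range(sparsity):
        support.append(int(np.argmax(np.abs(A.conj().T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 128))
A /= np.linalg.norm(A, axis=0)          # unit-norm dictionary columns
x = np.zeros(128)
x[[5, 40, 97]] = [1.0, -0.8, 0.6]       # 3-sparse "channel" vector
x_hat = omp(A, A @ x, sparsity=3)       # recover x from 60 measurements
```

SW-OMP additionally shares the support estimate across subcarriers, which is what gives it its advantage over running per-subcarrier OMP as above.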
3.4 Hybrid Beamforming Based Channel Estimation Schemes via Random Spatial Sampling

In this work, the authors propose a new hybrid beamforming (HBF) based channel estimation scheme using random spatial sampling techniques. In general, massive MIMO systems operating over the large bandwidths of millimeter Wave realize communication through hybrid analog/digital precoding systems. Because of the short coherence time inherent to these systems, hardware design with HBF architectures is a challenging task in mmWave communication. Particularly for short beam training intervals, the available angular information such as AoA/AoD gives better channel recovery accuracy, and the assumed framework is widely used to exploit the low-rank property of mmWave channels. For wireless communication, the mmWave frequency band provides large bandwidths. For efficient CSI estimation, sparsity is jointly exploited with the low-rank properties of the wireless channel, and a CSI estimation technique is introduced that imposes the channel's sparsity in the time and frequency domains [14]. For the considered wideband MIMO channel, the sounding procedure is described and the currently proposed analog combining architectures are discussed (Fig. 5). For different values of NR, an mmWave point-to-point NR × NT MIMO system is considered and the performance of the proposed analog combining architectures with HBF reception is examined. For T = 40, Np = 6, NT = 16 and SNR = 15 dB, the ASE is plotted as a function of the number of RF chains MR. The figure shows that the power consumption of the assumed design does not depend on MR, and the power consumption of the assumed analog BF is less than that of the conventional ZC case.
Fig. 5 Proposed combining architecture for channel estimation for massive MIMO systems using HBF reception. Source [14], p. 4
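The random spatial sampling idea can be illustrated by drawing the analog combining phases at random: every entry is a unit-modulus phase shift, which is what phase-shifter-only analog hardware supports. A minimal sketch with illustrative dimensions, not the paper's exact configuration:

```python
import numpy as np

def random_phase_combiner(n_rx, m_rf, rng):
    """Analog combining matrix with i.i.d. uniform random phases; every entry
    has the same modulus, as required by phase-shifter networks."""
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_rx, m_rf))
    return np.exp(1j * phases) / np.sqrt(n_rx)

rng = np.random.default_rng(1)
W = random_phase_combiner(n_rx=32, m_rf=4, rng=rng)

# Each snapshot y on 32 antennas is compressed to 4 RF-chain outputs: z = W^H y.
y = rng.standard_normal(32) + 1j * rng.standard_normal(32)
z = W.conj().T @ y
```

The random projections act as the spatial sampling operator from which the sparse/low-rank channel is then recovered.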
Fig. 6 Proposed TLS-OMP based channel estimation method
Fig. 7 Frame structure at the receiver end of massive MIMO systems. Source [15], p. 3
3.5 Time-Domain Based Channel Estimation with Hybrid Architecture

A time-domain (TD) based channel estimation scheme is developed by the authors. They propose a new signal model based TD OMP (T-OMP) estimator and show how it can be applied to millimeter wave systems with FD pilots (Figs. 6 and 7). They also develop the TD LS/OMP (T-LS/OMP) estimator, which is a two-step channel estimation method. In general, increasing η enhances the performance of T-OMP because it exploits finer grids in the discretized delay space, while T-LS/OMP remains the same, being independent of η. T-LS/OMP uses the Least Squares method to evaluate the effective channel, which includes beamforming, pulse shaping and array responses, and then OMP to estimate the channel parameters. When the pilot matrix is unitary and the pulse shaping matrix is the identity matrix, the proposed T-LS/OMP performs as well as the original single-step TD estimator, the T-OMP, for different practical cases [15].
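The LS first step of such a two-step estimator can be sketched as follows: with a known pilot matrix P, the effective channel follows from the normal equations. This is a generic noiseless sketch with illustrative dimensions, not the authors' exact T-LS/OMP:

```python
import numpy as np

rng = np.random.default_rng(2)
n_rx, n_tx, n_pilot = 4, 8, 16

H_eff = rng.standard_normal((n_rx, n_tx)) + 1j * rng.standard_normal((n_rx, n_tx))
P = rng.standard_normal((n_tx, n_pilot)) + 1j * rng.standard_normal((n_tx, n_pilot))
Y = H_eff @ P                      # noiseless received pilot block

# Least-squares estimate: H_hat = Y P^H (P P^H)^(-1)
H_hat = Y @ P.conj().T @ np.linalg.inv(P @ P.conj().T)
```

In the two-step method, the second (OMP) step would then extract the sparse path parameters from H_hat.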
3.6 Comparison of Existing Channel Estimation Schemes

In this section, the various channel estimation schemes discussed above and their characteristics are summarized. The comparison of these methods is tabulated in Table 1.
Table 1 Comparison of the existing state-of-the-art channel estimation schemes

Tensor based channel estimation scheme [10]
  Pros: • High data rate • Supports a larger number of antennas
  Cons: • Does not give full diversity at a high spectral efficiency

Beam squint based channel estimation scheme [11]
  Pros: • Trade-off between the system performance and its complexity • Gives a faster convergence rate than the existing algorithms
  Cons: • The proposed PCA matrices are symmetric in nature and positive definite over the channels • Noise effects are not considered in the proposed system

Frequency domain compressive channel estimation scheme [12]
  Pros: • If the M/N value is large, this method easily obtains near-optimal performance • Computation complexity is quite high when compared with other methods
  Cons: • In order to find suitable optimal weights, this algorithm needs additional calculations • It cannot converge when the N value is large

Hybrid beamforming based channel estimation schemes via random spatial sampling [14]
  Pros: • Parallelism degrees are comparatively very high • Convergence rate is also very high and faster at the last stage of the iteration process
  Cons: • Even though the convergence rate is high, the initial value calculations are highly complicated

Time-domain based channel estimation with hybrid architecture [15]
  Pros: • Effects of all types of noise are considered in this algorithm • For all propagation mechanism cases, the near-optimal performance is quite good
  Cons: • Further enhancement of the diversity gain is not possible • When the ratio M/N goes to one, the system will automatically reduce its computation complexity
4 Conclusion

5G mmWave massive MIMO systems have improved drastically to achieve better spectral efficiency and integrated services for the end terminals. This paper surveyed recently proposed channel estimation schemes and algorithms pertaining to massive MIMO systems. These methods were analyzed under different transmitter design scenarios. We have critically analyzed the pros and cons of the different channel estimation schemes based on different performance metrics.
References

1. X. Gao, L. Dai, S. Zhou, A.M. Sayeed, L. Hanzo, Wideband beamspace channel estimation for millimeter-wave MIMO systems relying on lens antenna arrays. IEEE Trans. Sign. Process. 67(18), 4809–4824 (2019)
2. J. Brady, N. Behdad, A. Sayeed, Beamspace MIMO for millimeter-wave communications: system architecture, modeling, analysis, and measurements. IEEE Trans. Antennas Propag. 61(7), 3814–3827 (2013)
3. V. Baranidharan, G. Sivaradje, K. Varadharajan, S. Vignesh, Clustered geographic-opportunistic routing protocol for underwater wireless sensor networks. J. Appl. Res. Technol. 18(2), 62–68 (2020)
4. K. Hassan, M. Masarra, M. Zwingelstein, I. Dayoub, Channel estimation techniques for millimeter-wave communication systems: achievements and challenges. IEEE Open J. Commun. Soc. 1, 1336–1363 (2020)
5. F. Talebi, T. Pratt, Channel sounding and parameter estimation for a wideband correlation-based MIMO model. IEEE Trans. Veh. Technol. 65(2), 499–508 (2016)
6. A. Brighente, M. Cerutti, M. Nicoli, S. Tomasin, U. Spagnolini, Estimation of wideband dynamic mmWave and THz channels for 5G systems and beyond. IEEE J. Sel. Areas Commun. 38(9), 2026–2040 (2020)
7. K. Venugopal, A. Alkhateeb, N. González-Prelcic, R.W. Heath, Channel estimation for hybrid architecture-based wideband millimeter wave systems. IEEE J. Sel. Areas Commun. 35(9), 1996–2009 (2017)
8. Z. Zhou, J. Fang, L. Yang, H. Li, Z. Chen, R.S. Blum, Low-rank tensor decomposition-aided channel estimation for millimeter wave MIMO-OFDM systems. IEEE J. Sel. Areas Commun. 35(7), 1524–1538 (2017)
9. B. Varadharajan, S. Gopalakrishnan, K. Varadharajan, K. Mani, S. Kutralingam, Energy-efficient virtual infrastructure-based geo-nested routing protocol for wireless sensor networks. Turk. J. Electr. Eng. Comput. Sci. 29(2), 745–755 (2021)
10. Y. Lin, S. Jin, M. Matthaiou, X. You, Tensor-based channel estimation for millimeter wave MIMO-OFDM with dual-wideband effects. IEEE Trans. Commun. 68(7), 4218–4232 (2020)
11. B. Wang, M. Jian, F. Gao, G.Y. Li, H. Lin, Beam squint and channel estimation for wideband mmWave massive MIMO-OFDM systems. IEEE Trans. Sign. Process. 67(23), 5893–5908 (2019)
12. J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal, R.W. Heath, Frequency-domain compressive channel estimation for frequency-selective hybrid millimeter wave MIMO systems. IEEE Trans. Wirel. Commun. 17(5), 2946–2960 (2018)
13. P. Amadori, C. Masouros, Low RF-complexity millimeter-wave beamspace-MIMO systems by beam selection. IEEE Trans. Commun. 63(6), 2212–2222 (2015)
14. E. Vlachos, G.C. Alexandropoulos, J. Thompson, Wideband MIMO channel estimation for hybrid beamforming millimeter wave systems via random spatial sampling. IEEE J. Sel. Top. Sign. Process. 13(5), 1136–1150 (2019)
15. H. Kim, G.-T. Gil, Y.H. Lee, Two-step approach to time-domain channel estimation for wideband millimeter wave systems with hybrid architecture. IEEE Trans. Commun. 67(7), 5139–5152 (2019)
Fake News Detection: Fact or Cap

C. Sindhu, Sachin Singh, and Govind Kumar
Abstract Fake news has existed in our society for a long time, but with the introduction of social media, the Internet and mobile phones the spread of fake news has severely increased. Social media sites are being used to effectively spread misinformation and hoaxes around the world, which not only causes people to change their thinking but also manipulates their opinions and decisions. In today's world it has become nearly impossible to detect whether a given news item is fake or real. With the arrival of the novel coronavirus (COVID-19) pandemic, the propagation of fake news is now greater than ever. There is therefore a need for something which can classify whether a given news item is real or not. In this article we aim to develop a model which, using some algorithms, determines whether a given news item is fake or not. Machine Learning is a form of Artificial Intelligence which applies various strategies to data and algorithms to imitate the way humans learn. Previous data is used as input by machine learning algorithms to predict new output values. Fake news has the ability to hurt both individuals and society if it is widely spread, for instance through riots, violence and hatred against a community. Understanding the truth of new information and its message can have a positive impact on society when combined with news detection. We created four machine learning prediction models with an accuracy of above 90% which predict whether a given news item is fact or capped (fake). Keywords Social media data input · Fake news detection · COVID-19 fake news · Pandemic fake news · Twitter data · Classifier · Logistic regression · Random forest · Gradient boosting · Decision tree
C. Sindhu (B) · S. Singh · G. Kumar
Department of Computing Technologies, School of Computing, SRM Institute of Science and Technology, Kattankulathur 603203, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_39

1 Introduction

The extensive use of cellular phones and social platforms allows us to simply share news and information with our friends and family and to learn about the trending news and events happening around the world. A large amount of data is being
generated from online social platforms like Twitter, Instagram and Facebook; however, a tremendous amount of this data and information is fake and misleading, spreading rumors and disinformation in society. For example, in recent times the world has suffered a lot due to the COVID-19 crisis. A lot of fake news has been spread about COVID-19 and its vaccine, which created an atmosphere of fear among the people, and the main source of this fake news was social media platforms. Another example is the riots caused in Ukraine in 2014 due to fake news, in which many citizens lost their lives. It is not manually possible for us to check whether such large amounts of data are true or not. We have technologies like Artificial Intelligence (AI), Machine Learning (ML) and supervised learning which are applied in various models to detect fake news [1]. A key factor is finding false stories in the text on social media. As we spend more time connecting online, more individuals use social media to find and consume news than conventional news organizations [2]. The reasons for this change in consumption habits lie in the nature of those forums: firstly, reading news on social media is frequently more timely and less expensive than traditional journalism, such as newspapers or television, and secondly, sharing and discussing news with close ones and like-minded readers on social media is much easier. In 2016, for example, 62% of individuals in the United States got news on social media, up from only 49% in 2012 [2]. It was also discovered that social media currently exceeds television as a key news source. Despite the advantages of social media, the quality of social media stories is lower than that of traditional news organizations.
Fake news [3], or news with purposefully false information, is widely distributed online for a number of purposes, including financial and political benefit, since it is less expensive to produce news online and faster and simpler to spread it through social media. Fake news has been linked to over a million tweets [4]. Fake news has the ability to hurt both individuals and society if it is widely disseminated. First, fake news has the ability to upset the news ecosystem's authenticity balance; for example, during the 2016 presidential election in the United States, the most popular false news was shared on Facebook even more than the most widely recognized genuine mainstream news. Second, fake news is designed to induce consumers to accept biased or incorrect information. Propagandists use false stories to spread ideas or political influence; for example, according to some reports, Russia has set up fake accounts and public bots to spread misleading information. Third, false stories change the way people interpret and respond to real stories. For example, some false stories are simply intended to distort and confuse people, making it difficult for them to tell the truth.
2 Related Work

Several machine learning and deep learning methods have been used by various researchers in the past [5]. Traditional machine learning algorithms have
been very successful in identifying fake stories in the past. In [6], the authors used pre-trained language models with a variety of training techniques and designed two models for fake news detection: the first a bidirectional LSTM text-RNN model, and the other text-Transformers. In [7] the authors use an Auxiliary Task Based Binary Sub-Classification approach to train their model. Some also treat the task as NLI (Natural Language Inference) and train a number of the strongest NLI models as well as BERT [8], and some have approached this as a text classification problem [9]. The language model can be improved via self-supervised learning from a big corpus: it learns general knowledge and is then fine-tuned on specific activities to transfer it to downstream jobs. Elmo extracts word vectors using context information through a bidirectional LSTM [10]. By modifying the transformer [11], GPT [12] improves context-sensitive embedding. To improve word embeddings, the bidirectional language model BERT [13] uses cloze and next-sentence prediction in self-supervised learning. The work in [2] eliminates next-sentence prediction from self-training and performs more comprehensive training, resulting in the superior language model Roberta. In [14], the pre-trained language model was effectively improved through Ernie's knowledge masking. In Ernie 2.0, the work in [15] suggested continuous multi-task pre-training as well as a variety of pre-training tasks. Several studies using computational approaches to detect false news have been published in the last several years. Ceron et al. [16] advocated using Bag of Words (BoW) and BERT embedding to recognize bogus news, whereas [17] recommended using topic models. The work in [18] explicitly leveraged the reliability of publishers and users for early fake news detection [19].
During the COVID-19 pandemic it is therefore necessary to construct a reliable automated COVID-19 fake news detection tool; unfortunately, the above-mentioned studies seldom consider COVID-19 fake news detection or overlook ensemble approaches of pre-trained language models.
2.1 Pre-trained Linguistic Models

In natural language processing, pre-training and then fine-tuning has become a new paradigm. The language model may acquire broad knowledge through self-supervised learning from a huge corpus and then transfer it to downstream tasks by fine-tuning on specific tasks. Elmo extracts word vectors using context information through a bidirectional LSTM. By modifying the transformer, GPT improves context-sensitive embedding. To improve word embeddings, the bidirectional language model BERT [7] uses cloze and next-sentence prediction in self-supervised learning. Liu et al. eliminate next-sentence prediction from self-training and undertake more comprehensive training, resulting in the superior language model Roberta. Sun et al. improved the pre-trained language model through Ernie's knowledge masking. In Ernie 2.0, Sun
et al. introduced continuous multi-task pre-training as well as multiple pre-training activities.
2.2 Clickbait Detection

The word "clickbait" is used in online media to characterize eye-catching teaser headlines. Clickbait titles create a "curiosity gap," making it more likely for readers to click the destination link to satisfy their curiosity. Existing clickbait detection methods use various linguistic elements collected from teaser messages, linked web pages and tweet meta-information. There are several varieties of clickbait, and some of them are closely connected with false claims. The motivation behind clickbait is usually to increase click-through rates and thereby advertising revenue. As a result, the body material of clickbait pieces is frequently disorganized and poorly reasoned. Researchers have utilized this mismatch to spot inconsistencies between headlines and news content in an attempt to identify fake news articles. Even though clickbait headlines may not be present in every piece of false news, they can serve as an essential indicator, and many features [20] can be used to help detect fake news.
2.3 Fake News Categorization

Several studies using computational approaches to detect fake news have been published in the last several years. Ceron et al. [16] recommended using Bag of Words (BoW) and BERT embedding to distinguish bogus news, while Hamid et al. [17] proposed using topic models. For early false news identification, Yuan et al. [18] deliberately exploited the credibility of publishers and users. However, during the COVID-19 pandemic a reliable automated detection tool for COVID-19 fake news is required, and the above-mentioned research rarely looks into how to detect COVID-19 fake news or ignores ensemble techniques.
3 Methodology

During training, each input is converted into numerical form using Keras and stored in an array. This is a type of text classification problem where the news has to be classified into two classes, "Real" or "Fake"; faking can also be called capping. Our model consists of six steps: (1) Input Data Collection, (2) Data Pre-Processing, (3) Feature Extraction, (4) Splitting the Dataset, (5) Machine Learning Model Training, and (6) Verification (Fig. 1).
Fig. 1 A general architecture diagram for fake news classification
3.1 Data Collection

The dataset consists of data collected from various sources such as social media and other websites. The real news was collected from authentic places such as verified websites, whereas the fake news was compiled from tweets, postings and stories that have been shown to be fake. The data collection consists of 23,500 rows and 4 columns; the dataset has 60% real news and 40% fake news. We chose this dataset because it consists of more than 41,000 records from many social media platforms like Twitter and Facebook.
3.2 Data Pre-processing

In data pre-processing we first check whether our dataset contains null or missing data and, if it does, we drop those rows. We also remove unnecessary data from the dataset which we might not need. The next step is to clean up the data that remains. It is important to remember that we are teaching the computer to differentiate between fake and authentic news. The data comes in the form of text, but machines understand numbers, so we have to convert it into a numerical form, and we have to make sure we only translate those texts that are needed for understanding. The first step in data cleaning is to remove unwanted symbols in the database. These could be web addresses or other referring symbol(s), such
as at (@) or hashtags. After we've removed those, we move on to the following step: removing the other symbols, namely the punctuation. When we think about it, punctuation has no discernible impact on our ability to comprehend the truth of a particular piece of news. When there is a lot of punctuation, for example an excess of exclamation marks, it is conceivable that the news isn't true; however, those are unusual circumstances that would necessitate a rule-based approach. As a result, we eliminate the punctuation marks for our fake news detection task.
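The cleaning steps described above — dropping web addresses, @-mentions and hashtags and then stripping punctuation — can be sketched with regular expressions. This is a minimal illustration, not the authors' exact code (here the whole hashtag word is dropped along with its symbol):

```python
import re
import string

def clean_text(text):
    """Lowercase, remove URLs, @mentions and #hashtags, then strip punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # web addresses
    text = re.sub(r"[@#]\w+", " ", text)                 # mentions and hashtags
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())                        # normalize whitespace

cleaned = clean_text("BREAKING!!! Read more at http://example.com #fake @user")
# cleaned == "breaking read more at"
```

The cleaned strings are then tokenized before feature extraction.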
3.3 Feature Extraction

The following step is critical: converting tokens into numbers, also known as feature extraction. In our application, we use the TF-IDF technique to extract and generate features for our machine learning algorithm. TF-IDF is an acronym for "term frequency–inverse document frequency." As the name implies, we glean information about the dataset by looking at the frequency of terms within each document as well as the frequency of terms across the whole dataset, or collection of documents. TF-IDF is determined by combining the TF and IDF values, both of which are simple ratios:

TF = frequency of a word in a document / total number of words in the document.
IDF = log of (total number of documents / number of documents containing the word).

While computing the term frequency, every single term is given equal weightage. TF-IDF is then calculated as the product of TF and IDF; a higher TF-IDF score indicates a more important word.

3.4 Splitting the Dataset

This can be achieved by using the sklearn package for pre-processing and importing the train–test split function. In this step we combine all the columns together, take the label column as a whole new dataset and then split the dataset into training and testing data using the train–test split function.

3.5 Machine Learning Model Training and Verification

We import different types of classification models, feed each model the training data, and after training we use the accuracy matrix and the test data to test the accuracy of each model. We repeat the same process for each of the four classification models and obtain their accuracy scores.

Logistic regression: Logistic regression is a very common classification algorithm. It is a statistical analysis approach for predicting a data value based on previous observations of a data set, and it is used where there is more than one explanatory variable. The approach is quite similar to multiple linear regression, except that the response variable is binomial. The outcome is the influence of each variable on the odds ratio of the observed event of interest.
With the exception that the response variable is binomial, the approach is quite similar to multiple linear regression. The influence of each variable on the perceived profit event’s odds rate is the outcome. h θ (x) = g θ ∧ T X
(1)
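Equation (1) is the logistic hypothesis: the sigmoid g applied to the linear score θᵀx. Numerically (θ and x below are illustrative, not fitted values):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x) of Eq. (1): probability of the positive class."""
    return sigmoid(theta @ x)

theta = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 2.0, 0.75])       # a TF-IDF-style feature vector (illustrative)
p = hypothesis(theta, x)             # theta @ x == 0.0, so p == 0.5 exactly
```

Training then adjusts θ so that p is pushed toward 1 for fake samples and 0 for real ones.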
Decision tree: The Decision Tree is a common prediction method with applications in many different fields. Typically, decision trees are constructed using
an algorithm that determines multiple ways to split a set of data depending on specific conditions. It is one of the most popular and useful supervised learning algorithms; Decision Trees are a non-parametric supervised method which can be used for classification and regression applications (Fig. 2). Random Forest: A random forest is a classification method that uses a number of decision trees to separate data. Instead of individual trees, bagging and randomization are used to produce uncorrelated trees whose committee prediction is more accurate than that of any single tree. Because each tree works with just a small subset of features in this model, random forests can easily work with hundreds of features and train faster than decision trees (Fig. 3). Gradient boosting: Gradient boosting is one of the most powerful algorithms in machine learning. Bias error and variance error are the two types of errors in machine learning algorithms. Gradient boosting is one of the most advanced methods used to reduce the bias error of the model. The gradient boosting method can be used to predict not only continuous target variables (as a regressor) but also categorical target variables (as a classifier). The Mean Squared Error (MSE) is the cost function when it is used as a regressor, while log loss is the cost function when it is used as a classifier (Fig. 4).
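The split criterion that a decision tree (and each tree of a random forest) typically optimizes can be illustrated with the Gini impurity. The paper does not state which criterion its trees use, so this is only a sketch of the common default:

```python
def gini(labels):
    """Gini impurity 1 - sum(p_c^2) over classes; 0 means the node is pure."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

node = ["fake", "fake", "real", "real"]           # maximally mixed parent node
left, right = ["fake", "fake"], ["real", "real"]  # candidate split

parent_impurity = gini(node)                           # 0.5
split_impurity = 0.5 * gini(left) + 0.5 * gini(right)  # 0.0, so keep this split
```

The tree-building algorithm greedily chooses, at each node, the condition whose weighted child impurity drops the most below the parent's.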
Fig. 2 Example of decision tree
Fig. 3 Random forest algorithm
Fig. 4 Illustration of boosting
4 Results and Discussion

We took a dataset consisting of fake and real news from social media platforms and split it into two parts, one for testing and the other for training. The testing part consists of 20% of the dataset and the training part of 80%. To compare and evaluate the results we calculate the recall (R), F1 score (F), accuracy (A) and precision (P).
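The 80/20 split can be sketched in pure Python (the paper uses sklearn's train–test split function; the seed here is illustrative):

```python
import random

def split_80_20(rows, seed=42):
    """Shuffle a copy of the rows and split them 80% train / 20% test."""
    rows = rows[:]                      # do not mutate the caller's list
    random.Random(seed).shuffle(rows)
    cut = int(0.8 * len(rows))
    return rows[:cut], rows[cut:]

train, test = split_80_20(list(range(1000)))
# len(train) == 800 and len(test) == 200; together they cover every row once.
```

Shuffling before the cut matters: without it, a dataset sorted by label would put almost all of one class in the training set.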
Table 1 Accuracy of the different classifiers

Classifier              Accuracy (%)
Logistic regression     97
Random forest           96
Gradient boosting       94
Decision tree           96

P = True Positive / (True Positive + False Positive)
Precision can be defined as the ratio of positive samples classified correctly to the total samples classified as positive, correctly or incorrectly. It gives information about the exactness of the model.

R = True Positive / (True Positive + False Negative)
Recall is defined as the ratio of correctly predicted positive samples to the total number of actual positive samples; it gives information about the completeness of the model.

F1 = 2 × (Precision · Recall) / (Precision + Recall)
The F1 score is the weighted average of precision and recall; it is quite handy when the class distribution is uneven. Our proposed fake news detector model is trained on 25,000 samples and tested on 18,000 articles and headlines. Model training has been performed on a machine with an ASUS Intel i7 7th-gen 2.4 GHz CPU, 16 GB of DDR4 Random Access Memory (RAM) and a 1 TB HDD (Hard Disk Drive); it took 3 h to train the models. We employed a variety of classifier techniques, including logistic regression, random forest, decision tree and gradient boosting. The best model produces an accuracy of 97%. The accuracy of each classifier is shown in Table 1.
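The three formulas above follow directly from confusion-matrix counts; a small sketch with illustrative counts (not the paper's actual confusion matrix):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
# p == 0.9, r == 0.75, f1 ≈ 0.818
```

Because F1 is the harmonic mean of precision and recall, it is pulled toward the lower of the two, which is why it is preferred over plain accuracy on uneven class distributions.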
5 Conclusion

As fake news and misinformation spread through online social media, it is more important than ever to obtain a thorough grasp of the differences between false and real news pieces in order to effectively spot and filter fake news. To successfully combat fake news, this study examines hundreds of very popular false and true news items from a variety of perspectives, including the domains and reputations of news producers as well as the key words of each news item and their word embeddings. The reputations and domain attributes of news sources change dramatically
between fake and truthful news, according to our findings. In this paper, we evaluated four machine learning models on a fake news dataset in terms of F1-score, recall, precision and accuracy. Our model achieved an accuracy of up to 97%. We tested several pre-trained language models in our initial attempt. As our model is limited to linguistic news only and cannot identify pictorial representations of news, we aim to develop a model which can predict whether given news is fake or real for pictorial representations too. Our method is new and better because we have used the latest dataset, based on COVID-19; moreover, our model has been trained on a larger set of data, so the accuracy is better. We have also used different classification algorithms and chosen the best of them.
References

1. M.D. Vicario et al., Polarization and fake news: early warning of potential misinformation targets. ACM Trans. Web (TWEB) 13(2), 1–22 (2019)
2. P. Qi et al., Exploiting multi-domain visual information for fake news detection, in 2019 IEEE International Conference on Data Mining (ICDM) (IEEE, 2019)
3. T. Mladenova, I. Valova, Analysis of the KNN classifier distance metrics for Bulgarian fake news detection, in 2021 3rd International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA) (IEEE, 2021)
4. M. Tripathi, Sentiment analysis of Nepali COVID-19 tweets using NB, SVM and LSTM. J. Artif. Intell. 3(03), 151–168 (2021)
5. D. Viji, N. Asawa, T. Burreja, Fake reviews of customer detection using machine learning models. Int. J. Adv. Sci. Technol. 29(06), 86–94 (2020)
6. X. Li et al., Exploring text-transformers in AAAI 2021 shared task: COVID-19 fake news detection in English. arXiv preprint arXiv:2101.02359 (2021)
7. O. Kamal, A. Kumar, T. Vaidhya, Hostility detection in Hindi leveraging pre-trained language models. arXiv preprint arXiv:2101.05494 (2021)
8. K.-C. Yang, T. Niven, H.-Y. Kao, Fake news detection as natural language inference. arXiv preprint arXiv:1907.07347 (2019)
9. S.D. Das, A. Basak, S. Dutta, A heuristic-driven ensemble framework for COVID-19 fake news detection. arXiv preprint arXiv:2101.03545 (2021)
10. S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
11. A. Vaswani et al., Attention is all you need, in Advances in Neural Information Processing Systems (2017)
12. A. Radford et al., Improving language understanding by generative pre-training (2018)
13. J. Devlin et al., Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
14. Y. Sun et al., Ernie: enhanced representation through knowledge integration. arXiv preprint arXiv:1904.09223 (2019)
15. Y. Sun et al., Ernie 2.0: a continual pre-training framework for language understanding. Proc. AAAI Conf. Artif. Intell. 34(05) (2020)
16. W. Ceron, M.-F. de-Lima-Santos, M.G. Quiles, Fake news agenda in the era of COVID-19: identifying trends through fact-checking content. Online Soc. Networks Media 21, 100116 (2021)
17. A. Hamid et al., Fake news detection in social media using graph neural networks and NLP techniques: a COVID-19 use-case. arXiv preprint arXiv:2012.07517 (2020)
Fake News Detection: Fact or Cap
527
18. C. Yuan et al., Early detection of fake news by utilizing the credibility of news, publishers, and users based on weakly supervised learning. arXiv preprint arXiv:2012.04233 (2020) 19. J.I.-Z. Chen, K.-L. Lai, Deep convolution neural network model for credit-card fraud detection and alert. J. Artif. Intell. 3(02), 101–112 (2021) 20. C. Sindhu, G. Vadivu, Fine grained sentiment polarity classification using augmented knowledge sequence-attention mechanism. J. Microprocess. Microsyst. 81 (2021). https://doi.org/ 10.1016/j.micpro.2020.103365
The Future of Hiring Through Artificial Intelligence by Human Resource Managers in India Ankita Arora, Vaibhav Aggarwal, and Adesh Doifode
Abstract There has been an explosion of Artificial Intelligence (AI) in recent times, sweeping through various industries and development tools. These tools are increasingly also being used in Human Resource Management for hiring, training, employee engagement, promotions, decision-making, improving the speed and accuracy of work, and removing daily hurdles with high-end technology. Machine learning algorithms, chatbots, and robots help organizations achieve their goals. This study aims to understand the intensity of the increase in AI usage by human resource managers across organizations of different sizes. A questionnaire built on a five-point Likert scale was used to survey the participating HR managers in the Indian context. The findings of this study are multi-fold. First, the results indicate that although AI will take away lower-level jobs via automation, it is difficult for top-level hiring to be done largely via AI. Second, our results also suggest that jobs requiring more human touch cannot easily be replaced by automation. Finally, promotion decisions based only on automation can result in errors, and hence it is also important to have human interaction before arriving at an outcome. Keywords Artificial intelligence · Machine learning · Algorithms · Human resource management
1 Introduction Around 1950, AI had a limited scope, as technology was not very advanced. Growing needs and developments in business systems required the appropriate use of this inevitable technology, and so the importance of data analytics and data mining A. Arora Apeejay School of Management, Delhi, India V. Aggarwal (B) O.P. Jindal Global University, Sonipat, India e-mail: [email protected] A. Doifode Institute for Future Education, Entrepreneurship and Leadership, Pune, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_40
529
530
A. Arora et al.
started flourishing. The success of an organization is based on its costs, the quality of the services it delivers, innovation, continuous improvement, the techniques it uses, and its interaction with the internal and external environment. Predicting uncertain and difficult issues requires human skill, experience, and knowledge, which consumes a lot of time. Hence, organizations look for effective tools suitable for analyzing these variables with appropriate technology. The introduction and growing significance of AI have encouraged wide accessibility of information and data, and greater accuracy in predicting human behavior, trends, and knowledge analysis in HR and other fields as well [1]. Artificial Neural Networks returned in the form of Deep Learning, with which Google developed the program AlphaGo, and both form the basis for almost all AI applications. Through empirical analysis, AI will become an integral part of firms' decision-making and will perform thinking tasks [2]. The open question concerns the credibility of AI: whether talent attraction and retention of existing human assets can proceed without bias or prejudice, and how AI algorithms should be trained and tested to avoid errors and bias while acquiring talent for the organization. The rising use of AI will result in less need for professionals, experts, and employees to carry out tasks; image recognition tools are already outperforming professionals in various fields. Regulation might be a way to preserve human employment in jobs that do not require automation, for example by mandating a specified level of spending and limiting the use of AI. Privacy is a major concern in AI, which can enable disruptions such as money laundering, manipulation of economic trade, virtual detection, and face recognition under automatic surveillance cameras. As AI expands its horizon, it is important to allow evolution in a fast-moving world, but regulation is challenging and needs attention amid such extraordinary developments.
AI does not know what is biased and what is not; it learns and perpetuates existing biases. Recruitment is the process of fitting the right candidate to the right job, and Human Resources is the backbone of any organization, as it has to bring in the personnel that fit the organization best. From screening to selection to recruitment, it involves hunting down the right candidate with the skill set for the job vacancy. Recruitment is of two types: internal and external. Internal recruitment includes promotions and transfers, while external recruitment works through referrals, advertisements, social media, websites, and other sources. The HR manager has the foremost role in recruiting the right job fit, as they have to attract and retain talent who can carry on the legacy of the organization, preserving its goodwill and accomplishing its goals. AI is designed to work like humans and perform the routine tasks of recruitment with speed, knowledge, and accuracy, delivering faster results so that HR managers can focus on other important strategies [3]. AI, with its speed, accuracy in delivering results, and predictions using big data analysis, is an advantage for any organization. AI technology completes time-consuming tasks very quickly and offers chatbots for aiding users. Man and machine work together, letting programs, innovation, and creativity go hand in hand, which will help in maintaining data ethics, data security, and privacy [4]. This integration will increase productivity and scale up the business. It is a huge advantage for HR to onboard talent into the organization through AI. Algorithms have advanced over time with machine learning and deep neural networks. Language processing,
The Future of Hiring Through Artificial Intelligence by Human …
531
machine translation, and chatbots have made communication flawless. AI is an opportunity but also a threat to humanity at the same time, because the chances of machines replacing humans are high, and AI's creations might become uncontrollable by humans. Research on building AI technology within a regulatory framework and policies needs to be monitored, and Human Resources should take care of the deployment of manpower alongside AI developments [5]. Cowgill has shown in his studies that machine learning is biased when there is noisiness in human behavior [6]. If human decisions are considerably noisy, then machines can improve on human biases. Bias and noise are inversely related: the more noise in human decision-making, the less machine bias, and vice versa. This is because humans tend not to reveal certain behavior patterns for fear of being exposed. The algorithm hires all types of candidates, whether inexperienced, belonging to a minority community, elderly, women, etc.; even where human recruiters were unlikely to select them, AI would have hired them due to noise and inconsistency. As Artificial Intelligence drives the major evolution in the upcoming fourth industrial revolution, it is important to gain insight into how it will be adopted in Indian industries and how businesses will restructure in the process of complementing AI with humans, as both are incomplete without each other; this will be a turning point for the industrial sector. Shakya and Smys have argued that adoption of AI can benefit different business segments [7]. In this study, we have focused on the growing role of AI in India. A survey-based questionnaire using the Likert scale was developed and sent to human resource managers across India using a convenience sampling method. The respondents' ages were between 20 and 60 years, from companies like dailyobjects.com, HCL Tech, Info Edge India, KPMG, etc. The findings of the study are discussed in Sect. 4.
2 Literature Review This section reviews the domain of Artificial Intelligence, focusing on how AI can be harnessed for attaining higher outcomes. The use of Artificial Intelligence is met with both anticipation and apprehension. AI should foster safety, freedom from bias, privacy and security, accountability, and sociability beneficial to people, and should be in uniformity with these principles. It should not cause harm or injury, subject people to surveillance, or contravene international law and human rights. A legal and regulatory structure should be invoked when the rules are violated, to protect human rights and free expression. Justice and temperance, faith and hope, prudence, and courage are key reflections demonstrating good character when decision-makers ethically utilize the immense potential of AI [8]. AI performs repetitive tasks and has advanced algorithms to analyze the performance of employees by creating relationships between attributes and job performance, which can produce biased patterns of judging criteria. Decision-making should be based on reasonable measures of performance and digital HR tools. Tools of database
management for data analytics, such as Tableau, and data lakes are underused in HR. Larger datasets in machine learning yield better predictive accuracy [9]. AI connects with the right candidates by showing them targeted pop-ups, messages, and emails about job opportunities. AI pools expand the reach to both active and passive candidates by collecting otherwise inaccessible data and information about them. It has the ability to push the reach-richness frontier, a potential yet to be fully exploited. It generates opportunities for past candidates by re-examining previously rejected applicants against current positions. AI is efficient, effective, and faster than humans in evaluating applications, and AI-enabled assessment and screening have made the selection process smoother. The growth of AI is inevitable with the upcoming Digital Recruiting 3.0, of which it is a significant tool [10]. Lawler and Elliot stated that, as a decision aid, an expert system (i.e., AI) is more efficient in problem-solving because it encapsulates the essence of the series of patterns followed to identify a solution to a problem [11]. Computer-based decisions involve less complexity and less uncertainty affecting the quality and accuracy of the context; they reduce the time required to solve a problem, help managers make better decisions, and build satisfaction and user experience, along with improving the quality of choices made, compared to paper-and-pencil tasks. Brougham and Haar's studies show that, with the evolution of technology in the fourth industrial revolution, jobs could be displaced and robots will affect employees' futures [12]. Future career plans and opportunities will be challenging, with implications for employees' mental health, job insecurity, and well-being; there will be lower organizational commitment and career satisfaction, with adverse effects on turnover intentions.
The machine has to be trained, with the right programming embedded, to interpret results in the true sense. While machine learning has outstanding ability to predict results accurately, human expertise has contributed to building the technology and training the neural networks that deliver those accurate results [13]. Machine learning tries to maximize the accuracy of an algorithm's predictions by training the AI system, allowing the algorithm to extract correlations from data with minimal supervision. Findings have highlighted that ML algorithms can deploy discrimination in facial recognition technology, and that data can omit or reflect bias; it is imperative that scrutiny and transparency in the use of such technology be mandated under a legal framework. There is a risk in recruitment, as bias can be built into the data that an algorithm will then pick up and use for hiring. Protection from discrimination should cover occupation, goods, and services when algorithms are used [14]. The rapid advancement of computer technology leads to intelligence on par with humans, but humans have an emotional side, i.e., intuition, which computers lack no matter how powerful they are in delivering results. Humans are rational and have cognition, whereas machines are superficial when dealing with situations; AI can surpass human intelligence but not humans themselves. Both humans and computers have experienced growing complexity over time and will have to reach a balance amid growing diversity. Humans and machines might complement each other through their interaction, and through independence and freedom in work processes. Martinsons described the evolution of IT in HRM and its automation of employee record-keeping; the purpose of IT applications is to increase efficiency and focus on operational tasks rather than
replicating manual work [15]. Datasets can be extracted in a timely, accurate manner with all the relevant information. A knowledge-based system (KBS) is application software that encodes expertise and is useful for making decisions and recommendations on a problem, and AI technology is being used to achieve higher productivity and profits. Rao and Hill noted that, even as technological advancements are at the forefront, Artificial Intelligence in Talent Acquisition (TA) is trending but adoption is still low, with only 10% of HR practitioners making use of high tech in recruiting [16]. The limitations of using AI include poor programming, bias in the TA process (since it replicates human behavior), and a scarcity of HR expertise in dealing with AI, as it demands years of experience and knowledge. Information can be remotely accessed by bots, and they are programmed in a manner that defines their capacity for modification and molding, which could be a threat to human skills. There will not be a complete replacement of labor by bots, as both will be utilized in their own spheres of tasks [17]. Geetha and Bhanu argued that data analysis for decision-making, which is none other than Artificial Intelligence, is already used in recruitment processes [3]. AI, with its speed and accuracy, hires the topmost talent for the organization; it is a powerful machine that acts like a human, with knowledge, reasoning, and automation for predicting and solving problems, and it is used for screening candidates by interacting with them digitally through chatbots. Butner and Ho argued that companies need to redesign their structures and train their employees, as automation is transforming the work scenario [18]. Machine intelligence can predict outcomes, make autonomous decisions, and behave like a human. It is being used in various industries such as health, banking, retail, insurance, and telecommunications. Automation eases work processes without human intervention and generates faster results.
Before implementing AI, the organization must reorganize the skill sets of employees so that humans and AI complement each other. AI software can mitigate risk, and employees can focus on more creative work and further learning, ending mundane activity and proving to be of higher value. Bessen found that computerized work in organizations is associated with unequal wages, as it demands skills [19]. Low-skilled workers might have to leave their jobs, while high-skilled workers will be retained. But automation of work need not cost one's occupation; rather, it requires workers to learn new sets of skills, as automation will speed up tasks and save time. AI should be ethical and should not take away people's jobs; regulatory frameworks and guidelines should be put in place to protect livelihoods and employment [20]. The types of skills required have to be matched to offset job inequality, and automation and labor have to be balanced so that they complement each other [21, 22]. AI uses human cognition to perform functions and achieves results with accuracy. A case study method with candidate-background data analysis showed that AI applied to human resource allocation can improve work efficiency and better implement intelligent resource-allocation decisions [23]. AI will drive inner motivation, as it can identify which employees are likely to quit and which should be promoted; HR will have to oversee the transparency and bias of AI [24]. The use of algorithms for prediction is pivotal to evaluating performance and hiring the best candidates; data mining with model classification can identify and promote existing employees, reveal future trends, and organize relevant information [25].
Fig. 1 Conceptual model describing the schematic of the proposed idea: Human Resource Management, comprising Employee Recruitment and Employee Retention, supported by Artificial Intelligence
A framework for AI technologies in human resources practices to study human traits, including emotional skills, was discussed by Vijh et al. [26].
3 Research Methodology The study is based on a literature review and provides insight into Artificial Intelligence, machine learning, digitalization of technology, and chatbots, and into how they will impact recruitment in HRM. Secondary sources, in the form of journals and reports from various websites, were used to support the findings. The present study was conducted using a convenience sampling method. The questionnaire was developed on a five-point Likert scale ranging from strongly agree (5) to strongly disagree (1) and was circulated among various HR professionals. After the questionnaires were properly filled in and checked, the analysis was done (Fig. 1).
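As an illustrative sketch (not the authors' analysis code), the per-statement response shares of the kind reported in Table 1 can be tabulated from raw five-point Likert responses as follows; the sample responses are invented for demonstration:

```python
from collections import Counter

# Map Likert codes (1..5) to the labels used in Table 1.
LABELS = {1: "SD", 2: "D", 3: "N", 4: "A", 5: "SA"}

def likert_shares(responses):
    """Return the percentage of respondents choosing each Likert level."""
    counts = Counter(responses)
    n = len(responses)
    return {LABELS[level]: round(100 * counts.get(level, 0) / n, 1)
            for level in LABELS}

# Hypothetical responses from 21 HR managers to one survey statement.
answers = [4] * 8 + [3] * 8 + [2] * 5
shares = likert_shares(answers)
```

With 21 hypothetical responses, this reproduces the 0.0/23.8/38.1/38.1/0.0 pattern seen in the first row of Table 1.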
4 Findings and Discussions With the growing use of technology, there are various concerns related to the privacy and security of data that need to be addressed and safeguarded. Ethical governance is very important for trust in robotics and Artificial Intelligence, which otherwise create fears of disparity and of mass loss of employment. Ethical governance at the individual and institutional levels, grounded in values, should be responsible for the functioning of programs. The policy framework, sustainability, innovation, and technology should be addressed collectively so as to maintain transparency without loopholes. The organization should adhere to compliance requirements from time to time, and scrutiny of documentation should be done accordingly. It is the responsibility of the organization to respect the rights and dignity of employees and keep their information secure and private. AI should be trained in
such a manner that it responds to morally challenging situations without any bias. These practices need to come into force, governing AI technology with utmost transparency by following a professional code of conduct [27]. AI should be able to take rational decisions, be held accountable, remain secure, and keep data confidential. There should be no technical glitches, and no bias based on race, gender, etc. It should not be a threat to humanity and should neither omit data nor speculate for gain by exaggerating information [28]. Although technology is advancing and AI has a lot of potential across various sectors of Indian industry, it has not yet been fully implemented, and the government is conducting wide research to explore that potential. AI enhances efficacy and has the potential to predict the future. There are several security challenges: AI holds public data that should not be accessible to just anyone, yet various companies are accessing it, which is a drawback; AI also requires a high-speed network. Regulations are needed to protect privacy and security, and to make sure that ML applications are not biased, as developers may insert bias through the data. The government should support and fund research and development and should partner with companies on application-based technology (Table 1). The impact of AI will have serious implications for job opportunities for people employed in low-skilled jobs in India, and IT professionals are likely to be hit by automation. HRM has to identify and differentiate tasks between robots and humans so that chatbots can focus on guidance and instructions and can reply quickly to queries and day-to-day work.
The government should ensure adequate infrastructure for Artificial Intelligence, as it requires storage space in the cloud and uses public data for processing outcomes; if it relies on third-party applications abroad, cloud information may be leaked, so certain protections and applications have to be installed to protect the privacy and security of citizens. HRD should collectively help in promoting AI and its role across sectors for future infrastructure development [29]. Connectivity will also have to be addressed, as AI requires huge amounts of data to perform. It should be the priority of HRM to retain the workforce by providing opportunities and training through formal and informal means of education, realigning resources to secure employees in the organization, and maintaining manpower able to deal with the changing dynamics of the environment [5]. We conducted an analysis of future hiring through Artificial Intelligence in Human Resource Management via a questionnaire administered to various companies, such as CSC e Governance services India Limited, Axa xI, CSIR IGIB, dailyobjects.com, HCL Tech, Info Edge India Limited, KPMG, KVS, NTT, NeGD, Tartan Sense, WTW, and Royal Bank of Scotland, with varied age groups. The respondents were aged from 20 to 60 years. The survey showed that 44% of respondents came from companies with an average employee strength of 1000 to 4999, 38% from companies with 10,000 employees and above, and the remaining approximately 18% from other size ranges. The responses from the survey included mixed reactions by respondents (Figs. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12).
Table 1 Summary of the survey findings

Findings | SD (%) | D (%) | N (%) | A (%) | SA (%)
Startups and mid-sized companies will use AI to hire a small proportion of candidates? | 0.0 | 23.8 | 38.1 | 38.1 | 0.0
AI will replace employees from jobs in coming future? | 14.3 | 38.1 | 28.6 | 14.3 | 4.8
Indian companies will be transparent and unbiased in recruitment process through AI? | 4.8 | 0.0 | 42.8 | 47.6 | 4.8
AI will be able to recruit top level management in the organization? | 9.5 | 57.1 | 23.8 | 9.5 | 0.0
AI will be able to handle tasks that require more personal touch? | 9.5 | 52.4 | 14.3 | 23.8 | 0.0
AI will be able to recruit multi purposed talent skill in the organization? | 0.0 | 14.3 | 33.3 | 52.4 | 0.0
Candidates will be outspoken while interview with AI robot and will have more chances of getting selected? | 4.8 | 23.8 | 38.1 | 33.3 | 0.0
Artificial intelligence will ignore the internal factors of human emotions and will lead to wrong decision-making regarding promotion and attrition? | 0.0 | 4.8 | 38.1 | 57.1 | 0.0
The percentage of total job be filled by AI is about? | 38.1 | 19.0 | 23.8 | 9.5 | 9.5
As AI will focus on routine work, human resource will easily figure out the creative tasks for the employees? | 0.0 | 0.0 | 28.6 | 66.7 | 4.8
The cost of outsourcing and dependency on third party will reduce through introduction of AI in organizations? | 0.0 | 9.5 | 19.0 | 66.7 | 4.8

Note SD strongly disagree, D disagree, N neutral, A agree, SA strongly agree
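As a small illustrative computation (not from the paper), the combined disagreement figures quoted in the discussion can be derived from the SD and D columns of Table 1; the row keys below are labels invented for this sketch:

```python
# Two rows of Table 1 as (SD, D, N, A, SA) percentage tuples
# (values taken from the paper; dictionary keys are invented labels).
rows = {
    "top_level_recruiting": (9.5, 57.1, 23.8, 9.5, 0.0),
    "personal_touch_tasks": (9.5, 52.4, 14.3, 23.8, 0.0),
}

def disagree_share(row):
    """Combined 'strongly disagree' + 'disagree' percentage."""
    sd, d, _, _, _ = row
    return sd + d

top_level = disagree_share(rows["top_level_recruiting"])
personal = disagree_share(rows["personal_touch_tasks"])
```

This yields 66.6 and 61.9, which the paper quotes as approximately 67% and 63%, presumably rounded from the raw respondent counts.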
38.1% of the respondents agreed that startups and mid-sized companies will use AI to hire a small proportion of candidates.
Fig. 2 Startups and mid-sized companies will use AI to hire a small proportion of candidates?
38.1% of the respondents disagreed that AI will replace employees from jobs in coming future.
Fig. 3 AI will replace employees from jobs in coming future?
42.8% of respondents were neutral on whether Indian companies will be transparent and unbiased in the recruitment process through AI.
Fig. 4 Indian companies will be transparent and unbiased in recruitment process through AI?
67% of respondents disagreed that AI will be able to recruit top-level management in the organization.
Fig. 5 AI will be able to recruit top level management in the organization?
5 Conclusion The world we are living in is changing drastically. Artificial Intelligence will be tomorrow's future, as it is rolling out with speed. It is an empowering technology, and the onus is on human resources to keep technology at par with humans. AI is playing a major role in efficiently solving problems in HRM
63% of respondents disagreed that AI will be able to handle tasks that require more personal touch.
Fig. 6 AI will be able to handle tasks that require more personal touch?
52.4% of respondents agreed that AI will be able to recruit multi purposed talent skill in the organization.
Fig. 7 AI will be able to recruit multi purposed talent skill in the organization?
38.1% of respondents were neutral on whether candidates will be outspoken in an interview with an AI robot and thereby have more chances of getting selected.
Fig. 8 Candidates will be outspoken while interview with AI robot and will have more chances of getting selected?
57.1% of respondents agreed that Artificial Intelligence will ignore the internal factors of human emotions and will lead to wrong decision-making regarding promotion and attrition.
Fig. 9 Artificial intelligence will ignore the internal factors of human emotions and will lead to wrong decision-making regarding promotion and attrition?
According to the largest share of respondents, only about 0%-15% of total jobs will be filled by AI.
Fig. 10 The percentage of total job be filled by AI is about?
66.7% of respondents agreed that, as AI will focus on routine work, human resources will easily figure out the creative tasks for the employees.
Fig. 11 As AI will focus on routine work, human resource will easily figure out the creative tasks for the employees?
[30, 31]. Hiring the right candidate for the job fit will be a competitive advantage over others, as AI will result in faster and better hiring. Overall, the present study highlights the current state of AI technology being used in HRM for hiring, and its growing significance will have repercussions for employees. AI can be
66.7% of respondents agreed that the cost of outsourcing and dependency on third parties will reduce with the introduction of AI in organizations.
Fig. 12 The cost of outsourcing and dependency on third party will reduce through introduction of AI in organizations?
trained through a learning management system so that it can understand the behavior of the learner, resulting in better hiring. Further studies should be continued in this field, as policies, security, and data authenticity have to be accurate and confidential, and the government should implement guidelines and measures to protect information and work actively toward this end. There are a few limitations of this study. First, the study mainly focused on Indian companies, while it can be further expanded across countries. Second, a lack of funding prevented a physical survey. Third, a relatively small sample size was used for this research. Fourth, mainly large-sized companies were targeted for data collection. Finally, there is scope for further analysis of the survey results using exploratory factor analysis.
References 1. Y. Pan, F. Froese, N. Liu, H. Yunyang, M. Ye, The adoption of artificial intelligence in employee recruitment: the influence of contextual factors. Int. J. Human Resour. Manage. 1–23 (2021) 2. N. Kshetri, Evolving uses of artificial intelligence in human resource management in emerging economies in the global South: some preliminary evidence. Manag. Res. Rev. 44(7), 970–990 (2021) 3. R. Geetha, S.R.D. Bhanu, Recruitment through artificial intelligence: a conceptual study. Int. J. Mech. Eng. Technol. 9(7), 63–70 (2018) 4. S. Jha, Data privacy and security issues in HR analytics: challenges and the road ahead, in Expert Clouds and Applications, pp. 199–206 (2022) 5. S.K. Srivastava, Artificial intelligence: way forward for India. J. Inf. Syst. Technol. Manage. 15, 1–23 (2018) 6. B. Cowgill, Bias and Productivity in Humans and Algorithms: Theory and Evidence From Resume Screening (Columbia University, Columbia Business School, 2018), p. 29 7. S. Shakya, S. Smys, Big data analytics for improved risk management and customer segregation in banking applications. J. ISMAC 3(03), 235–249 (2021) 8. M.J. Neubert, G.D. Montañez, Virtue as a framework for the design and use of artificial intelligence. Bus. Horiz. 63(2), 195–204 (2020) 9. P. Tambe, P. Cappelli, V. Yakubovich, Artificial intelligence in human resources management: challenges and a path forward. Calif. Manage. Rev. 61(4), 15–42 (2019)
10. J.S. Black, P. van Esch, AI-enabled recruiting: what is it and how should a manager use it? Bus. Horiz. 63(2), 215–226 (2020) 11. J.J. Lawler, R. Elliot, Artificial intelligence in HRM: an experimental study of an expert system. J. Manag. 22(1), 85–111 (1996) 12. D. Brougham, J. Haar, Smart technology, artificial intelligence, robotics, and algorithms (STARA): employees’ perceptions of our future workplace. J. Manag. Organ. 24(2), 239–257 (2018) 13. M. Roccetti, G. Delnevo, L. Casini, P. Salomoni, A cautionary tale for machine learning design: why we still need human-assisted big data analysis. Mobile Networks Appl. 25(3) (2020) 14. R. Allen, D. Masters, Artificial intelligence: the right to protection from discrimination caused by algorithms, machine learning and automated decision-making. ERA Forum 20(4), 585–598 (2020) 15. M.G. Martinsons, Human resource management applications of knowledge-based systems. Int. J. Inf. Manage. 17(1), 35–53 (1997) 16. R. Rao, B. Hill, How is the role of AI in talent acquisition evolving? 16 (2019) 17. O. Gomes, Growth in the age of automation: foundations of a theoretical framework: Gomes. Metroeconomica 70(1), 77–97 (2019) 18. K. Butner, G. Ho, How the human-machine interchange will transform business operations. Strategy Leadersh. 47(2), 25–33 (2019) 19. J.E. Bessen, How computer automation affects occupations: technology, jobs, and skills. SSRN Electron. J. (2015) 20. A. Bundy, Preparing for the future of artificial intelligence. AI Soc. 32(2), 285–287 (2017) 21. D. Acemoglu, P. Restrepo, The race between machine and man: implications of technology for growth, factor shares and employment. SSRN Electron. J. (2017) 22. D. Acemoglu, P. Restrepo, Artificial Intelligence, Automation and Work. w24196. National Bureau of Economic Research, Cambridge, MA 23. H. Ma, J. 
Wang, Application of artificial intelligence in intelligent decision-making of human resource allocation, in International Conference on Machine Learning and Big Data Analytics for IoT Security and Privacy, vol. 1282, pp. 201–207 (2020) 24. T. Rana, The future of HR in the presence of AI: a conceptual study. SSRN Electron. J. (2018) 25. H. Jantan, A.R. Hamdan, Z.A. Othman, Human talent prediction in HRM using C4.5 classification algorithm 2(8), 10 (2010) 26. G. Vijh, R. Sharma, S. Agrawal, The heartfelt and thoughtful rulers of the world: AI implementation in HR, in International Conference on Futuristic Trends in Networks and Computing Technologies, vol. 1395, pp. 276–287 (2020) 27. A.F.T. Winfield, M. Jirotka, Ethical governance is essential to building trust in robotics and artificial intelligence systems. Philos. Trans. R. Soc. A: Math. Phys. Eng. Sci. 376(2133), 20180085 (2018) 28. T. Hagendorff, The ethics of AI ethics: an evaluation of guidelines. Mind. Mach. 30(1), 99–120 (2020) 29. T. Dhanabalan, A. Sathish, Transforming Indian industries through artificial intelligence and robotics in industry 4.0. 12 (2018) 30. S. Verma, S.K. Jha, Application of artificial intelligence in HR processes. Rev. Manage. 10(1/2), 4–10 (2020) 31. S.K. Jha, S. Jha, M.K. Gupta, Leveraging artificial intelligence for effective recruitment and selection processes, in International Conference on Communication, Computing and Electronics Systems (Springer, Singapore, 2020), pp. 287–293
A Computer Vision Model for Detection of Water Pollutants Using Deep Learning Frameworks Anaya Bodas, Shubhankar Hardikar, Rujuta Sarlashkar, Atharva Joglekar, and Neeta Shirsat
Abstract Water pollution has increased to a large extent worldwide. Existing detection technologies require massive human input and resources and take considerable time to reach conclusions; hence the need for an automated pollutant detection system. In this work, deep learning is used to detect floating macro-waste from images of the water surface. The model presented in this paper detects and classifies the different pollutants and harmful waste items floating in water bodies. We present a deep learning-based model that allows high-precision detection in aquatic environments. OpenCV and TensorFlow have been used for the implementation of the YOLOv3 and YOLOv4 models. The model supports our aim of curbing water pollution by detecting the relevant pollutants and drawing a bounding box around each one to specify its location. We therefore aim to achieve the necessary benchmarks for an effective and efficient water pollutant detection system by applying transfer learning and making use of pre-trained models. Keywords Deep learning · Computer vision · Water pollution · Automated water pollutant detection · YOLO
1 Introduction Aquatic environments worldwide are facing the serious issue of water pollution, which poses a threat to marine life and humankind. The quantity of these pollutants is steadily increasing. The majority of them are harmful waste items that have adverse effects on marine ecosystems [1], consisting of wastes like plastics, glass, metal, etc. Floating macro-plastics are the root cause of pollution in aquatic environments. Plastic quantities in the ocean have experienced steady growth A. Bodas (B) · S. Hardikar · R. Sarlashkar · A. Joglekar · N. Shirsat Department of Information Technology, PVG's College of Engineering and Technology, Pune, India e-mail: [email protected] N. Shirsat e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_41
from being just 5.25 trillion pieces in 2015 [2] to 25 trillion macro- and 51 trillion microplastic pieces in 2020 [3]. These numbers are proof that water pollution detection mechanisms are underprepared for the severity of this issue. Water pollutants such as plastic have long life spans and can be found intact even after years of being submerged in water and exposed to extreme, varying weather conditions. Such waste needs to be collected manually and treated to reduce its harmful nature. Rivers and canals flowing through cities are also affected by trash dumping. The major contributors to water pollution in cities are used plastics, paper wastes, and glass bottles, as well as metal canisters. All of these pollutants contaminate freshwater channels and cause blockages in sewers [4], leading to man-made urban problems. Several kinds of waste are found floating on water surfaces; these can entangle and suffocate marine animals, and their toxic content can harm humans who come in contact with them. Waste in water is subject to transformation in size and shape, which makes it difficult to classify. The existing vision-based approaches for trash detection are mainly focused on detection of trash dumped on solid ground and rarely on water surfaces [4]. Our problem focuses on water surface trash detection. The detection task for surface water pollutants is comparatively difficult, as the trash can be disfigured, partially submerged, decomposed into smaller pieces, or clumped together with other objects that obscure its shape [4]. Most of the waste responsible for water pollution can be categorized as organic, non-degradable, recyclable, hazardous, or chemical waste. It can also be categorized as submerged, semi-submerged, or floating waste. The focus here is directed toward floating waste that can be detected from a distance.
The conceptual illustration for the model focuses on categorizing these pollutants into 4 classes: plastic, paper, metal, and glass. Detection of plastic bottles, plastic containers, semi-submerged plastic bottles, plastic bags, plastic cups, thermocol, metal cans, metal containers, glass bottles, glass cups, paper cups, discarded paper boxes, newspapers, paper containers, etc. is handled. The model is thus trained on a dataset which predominantly comprises images of floating pollutants. Deep learning algorithms for automatically detecting and classifying objects, such as YOLO, Faster RCNN, SSD, and R-FCN, perform well on benchmark datasets such as MS-COCO [5] and Pascal-VOC [6]. RetinaNet [1], modified YOLOv3 [7], Faster RCNN [8], SSD, and R-FCN [9] have been employed for detection in aquatic environments. The use of RetinaNet [1] and the two-stage Faster RCNN [8] for detection of floating water pollutants has achieved significant results. However, we propose the use of the latest versions of YOLO to take advantage of their increasing accuracy and speed. YOLO balances speed and accuracy and is therefore the better choice of single-stage detector. This work includes the development of a model which can successfully classify water pollutants using deep learning frameworks. Pre-trained models provide computational ease by avoiding random initialization and reusing features that are already learned [10]. A customized dataset is developed to improve training results for the model
as well. The images of trash floating on water surfaces are manually annotated and converted to a YOLO-compatible format. The paper further describes related work on the use of deep learning models for detection in aquatic environments. Architectural details, followed by experimental details and results, are specified in the next sections. Finally, conclusions and future work are stated, along with references.
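The annotation conversion mentioned above can be sketched in a few lines. The snippet below is an illustrative (not the authors') conversion from pixel-coordinate bounding boxes to the normalized "class x_center y_center width height" lines that YOLO-format .txt annotation files use; the box and image sizes in the example are hypothetical:

```python
# Convert a pixel-coordinate bounding box to a YOLO-format annotation line.
# YOLO .txt files store: class_id x_center y_center width height,
# all normalized to [0, 1] relative to the image dimensions.

CLASSES = ["plastic", "paper", "metal", "glass"]  # the paper's four classes

def to_yolo_line(class_name, x1, y1, x2, y2, img_w, img_h):
    cid = CLASSES.index(class_name)
    x_center = (x1 + x2) / 2.0 / img_w
    y_center = (y1 + y2) / 2.0 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{cid} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Example: a plastic item at (100, 200)-(300, 400) in an 800x600 image.
print(to_yolo_line("plastic", 100, 200, 300, 400, 800, 600))
# -> 0 0.250000 0.500000 0.250000 0.333333
```

One such line is written per annotated object into a .txt file named after the image.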
2 Related Work Earlier work on object detection for waste segregation from water bodies has been implemented with Faster RCNN, RetinaNet, and some YOLO-based models. These algorithms are predominantly based on convolutional neural networks (CNNs), a type of deep learning algorithm. However, none of these models achieved high accuracy, and their average precision remained poor. Most of the recent work on waste detection is built around deep learning-based object detectors, including SSD, YOLO, and Faster RCNN. These well-known object detectors are designed for general applications, especially urban scenarios such as surveillance and self-driving cars. The proposed problem focuses on water surface trash detection. YOLO algorithms are single-stage detectors with the fastest detection speed [11]. YOLOv3 predicts an objectness score for each bounding box using logistic regression [12]; at 320 × 320, YOLOv3 runs in 22 ms at 28.2 mAP. YOLOv4's contribution is an object detector with fast operating speed in production systems, optimized for parallel computation rather than for a low theoretical computation-volume indicator (BFLOP) [13]. Its authors verified a large number of features and selected those that improve the accuracy of both the classifier and the detector. A model for detection of water surface garbage based on YOLOv3 has been implemented [7]: re-clustering of the anchor boxes and reducing the detection scales of YOLOv3 from 3 to 2 were carried out for higher accuracy and real-time performance. That model achieved a mAP of 91.43% in just 18.47 ms, allowing real-time application in a robot that was further used to collect the detected waste. In accordance with these results, YOLOv3 and YOLOv4 are used here to support detection of water pollutants. A comparison of state-of-the-art object detectors such as Faster R-CNN, R-FCN, and SSD using pre-trained models has also been reported [9].
The final conclusion there was that an algae monitoring system based on the R-FCN model would be highly robust, accurate, and fast enough to enable effective, real-time algae monitoring; the accuracy of Faster R-CNN was 72%, of R-FCN 82%, and of SSD 50%. The future scope noted was to use a larger dataset, enabling the neural network to learn more intricate features and detect algae more consistently. A lightweight CNN [14] and a two-layered novel CNN [15] provide insight into the use of CNNs, with alterations to suit speedy crowd counting and detection of buildings using remote sensing, respectively. CNNs have also been combined with the ML algorithm SVM [16] to
reduce training and detection time without much loss: the CNN is used for feature extraction and the SVM for accurate classification. Panwar et al. [1] proposed a model named AquaVision, based on RetinaNet50 as the backbone and an FPN for regional proposal regression; it achieved a mean average precision of 0.814, although the limited number of images in its dataset is a limitation. Zhang et al. [8] used a scale-aware network based on Faster R-CNN that contains two modules and integrates low-level features with high-level features to improve detection accuracy; this model achieved a mean average precision of 83.7% at a frame rate of 11 FPS. Our proposed model improves on these works in the following areas. The custom dataset used for this project consists of about 1500 pictures: some from open-source datasets, some from internet images and videos, and most captured from nearby rivers using a professional camera. The proposed model makes use of the single-stage detector YOLO, in its YOLOv3 or YOLOv4 versions. YOLO is used as an object detector for detecting multiple classes of waste, whereas [1] implements image classification for the given classes.
3 Architecture Details The architectural diagram (Fig. 1) depicts the overall outline of the software system and the relationships, constraints, and boundaries between its components. The architecture of this model is segregated into 3 major components:
1. Dataset Construction
2. Model Training
3. Water Pollutant Detection.
The model training module is responsible for carrying out the actual training process. It takes the dataset as input and processes it for training with a suitable model. Parameters are set according to training requirements. The network initialization step changes default settings to fulfill the custom requirements. Actual model training is then started and kept under constant scrutiny for loss convergence; the average loss is checked at specified intervals to ensure reduction in loss and better training results. Once the model is fully trained as per expectations, it can be used for inference. The networks used for training are YOLOv3 and YOLOv4. The water pollutant detection module carries out inference on images fed to it as input. Taking into account the weights and the configuration of the model, it predicts the presence of an object in the input image and draws a bounding box around the detected object if the prediction score is above a certain threshold. The output of this module is the input image with a bounding box drawn around the detected object, which will belong to one of four categories: plastic, paper, metal, or glass. The
Fig. 1 Computer vision water pollutant detection model architecture
image to be tested is uploaded on the frontend of the proposed system, and the corresponding output is displayed there.
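The thresholding step of the detection module can be sketched as follows. This is an illustrative decoding of raw YOLO-style output rows (a center-format box plus an objectness score and per-class scores), not the paper's exact implementation; in practice the rows would come from the trained network's forward pass, e.g. via OpenCV's DNN module, and would then also go through non-max suppression:

```python
CLASSES = ["plastic", "paper", "metal", "glass"]

def filter_detections(raw_rows, img_w, img_h, conf_threshold=0.5):
    """Decode raw YOLO-style output rows and keep only confident predictions.

    Each row: [cx, cy, w, h, objectness, s_plastic, s_paper, s_metal, s_glass],
    box values normalized to [0, 1]. Returns (class, score, (x1, y1, x2, y2)).
    """
    detections = []
    for row in raw_rows:
        cx, cy, w, h, objectness = row[:5]
        class_scores = row[5:]
        best = max(range(len(class_scores)), key=lambda i: class_scores[i])
        score = objectness * class_scores[best]
        if score < conf_threshold:
            continue  # discard low-confidence predictions
        # Convert the center-format box to corner pixel coordinates.
        x1 = round((cx - w / 2) * img_w)
        y1 = round((cy - h / 2) * img_h)
        x2 = round((cx + w / 2) * img_w)
        y2 = round((cy + h / 2) * img_h)
        detections.append((CLASSES[best], score, (x1, y1, x2, y2)))
    return detections

rows = [
    [0.5, 0.5, 0.2, 0.4, 0.9, 0.95, 0.01, 0.02, 0.02],  # confident plastic box
    [0.3, 0.3, 0.1, 0.1, 0.4, 0.50, 0.20, 0.20, 0.10],  # below the threshold
]
print(filter_detections(rows, 800, 600))
```

The retained boxes are then drawn on the input image before it is returned to the frontend.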
3.1 Dataset Development Preparation of a dataset is the first and foremost step in the application of machine learning algorithms. As the images of water pollutants in publicly available datasets are very limited, a new dataset had to be developed. Every image in the dataset is annotated to generate files containing the coordinates of the bounding boxes, thus indicating the location of the water pollutants in the image. The custom dataset includes images of floating waste dumped in water bodies, classified into 4 categories, namely plastic, paper, metal, and glass. The dataset was prepared using personal camera devices to capture the water pollutants
Fig. 2 Sample images from custom dataset
floating on the surface of the water body. Around 900 images were captured by this method, and the remaining 600 images were picked up from online sources such as the AquaTrash and TrashNet datasets, as well as from well-known research websites. A total of 1500 images are included in the custom dataset, which is used for both training and testing. The annotation of these images is done using an online tool. Subsequently, the images are split into training and testing sets such that about 10% of the images are segregated into the test set and the remainder into the training set. Thus, the training set contains around 1350 images and the test set contains 160 images (Fig. 2).
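The roughly 90/10 split described above can be sketched as follows. Darknet-style YOLO training consumes plain text lists of image paths (commonly train.txt and valid.txt); the file names below are illustrative, not the authors' actual paths:

```python
import random

def split_dataset(image_paths, test_fraction=0.10, seed=42):
    """Shuffle image paths and split them ~90/10 into training and test lists,
    as darknet-style YOLO training expects (train.txt / valid.txt)."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)     # reproducible shuffle
    n_test = int(len(paths) * test_fraction)
    return paths[n_test:], paths[:n_test]  # (train, test)

images = [f"data/obj/img_{i:04d}.jpg" for i in range(1500)]
train, test = split_dataset(images)
print(len(train), len(test))  # 1350 150
```

Fixing the shuffle seed keeps the split reproducible across training runs.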
3.2 Detection Algorithms Deep learning requires a large dataset for training, and the number of images available in our custom dataset is on par with the basic requirement. This problem can be dealt with by using the concept of transfer learning, which employs a methodology in which the model is pre-trained on a sufficiently large dataset and the final layers are replaced to suit custom requirements. This means that the Convolutional Neural Network (CNN) is trained on a large annotated dataset from a different domain and can be applied for feature detection in a completely new domain [9]. The lower layers of the CNN are capable of detecting general features such as edges, blobs, etc., from an image, while the higher layers capture domain-specific features [9]. This ensures that a pre-trained network can learn from a comparatively smaller dataset to recognize a completely new class of objects, as only the final layers of the network need to be trained [9]. Deep learning models are categorized into single-stage and two-stage detectors, each with its own advantages and shortcomings. The YOLO algorithms are in the single-stage category, which implies that the models use a single network to detect objects: the region-proposal stage is skipped, and detection over a dense sampling of potential locations is carried out. YOLO stands for "you only look once", which is what the detector does during object detection (it takes only a single look at the image). Versions v3 and v4 are the versions under consideration.
YOLOv3: Makes modifications to the YOLOv2 [11] version. The YOLOv3 [12] model implements Darknet-53 for feature extraction in the backbone part, while the head part carries out the actual bounding-box prediction, non-max suppression, and classification of the image. YOLOv4: YOLOv4 [13] is a further modified version of YOLOv3 [12]. It consists of backbone, neck, head, and prediction regions, and introduces a bag of freebies, a bag of specials, and the Mish activation as improvements to the existing model. Detection in the head takes place in the same way as in the earlier version.
3.3 Custom Dataset Training
Step 1: Gathering and labeling a custom dataset. Dataset preparation includes images taken from various angles and locations to improve dataset quality. Online tools are used to manually annotate the images.
Step 2: Uploading the custom dataset. The annotations are converted into a YOLO-compatible format and stored as txt files; the dataset is then uploaded to a cloud-based store.
Step 3: Configuring files for training. The label map file, the data file, and the custom CFG files are updated to suit the custom requirements and uploaded to the cloud.
Step 4: Downloading pre-trained weights. The pre-trained weights for the convolutional layers are downloaded to enable the use of transfer learning.
Step 5: Training. After the file-configuration and initialization steps, actual training of the model is started and kept under constant evaluation for loss convergence. The performance of the model is checked regularly at certain intervals.
Step 6: Testing on unseen images. Once training is completed to a satisfactory level, testing is done on unseen images given as input to the model. The output specifies the class prediction and the bounding-box coordinates; the prediction image is also displayed, with the bounding box drawn around the detected object.
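For Step 3, the CFG edits for a four-class darknet-style YOLOv3/YOLOv4 configuration typically amount to the fragment sketched below. This is the commonly documented recipe for custom darknet training, shown here as an assumption rather than the authors' exact file: in every convolutional layer immediately preceding a [yolo] layer, filters is set to (classes + 5) × 3, which for 4 classes gives 27.

```
[convolutional]
size=1
stride=1
pad=1
filters=27        # (4 classes + 5) * 3 anchor boxes per scale
activation=linear

[yolo]
classes=4         # plastic, paper, metal, glass
```

The same filters/classes pair is repeated at each of the network's detection scales.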
4 Experimental Details and Results Since the detection objective is to detect water pollutants at a faster speed with high accuracy, the evaluation of the models can be carried out using basic parameters.
These make use of True Positives (TP), False Positives (FP), and False Negatives (FN). Precision and Recall: Used to analyze the performance of the system in detecting whether the image contains water pollutants or not.

precision = TP / (TP + FP)

recall = TP / (TP + FN)
Mean Average Precision (mAP): Used to evaluate how accurately the model can locate a pollutant in a water body from the image. It is evaluated by taking the mean of the average precisions over all classes. mAP-50 is the mAP calculated at an IoU (Intersection over Union) threshold of 0.5 [17]; the IoU threshold determines whether a prediction is to be considered a True Positive or not. Frames Per Second (FPS): An evaluation metric used to analyze how fast the water pollutant detection model is. It determines the rate at which a video can be processed and the desired output published. Inference Time (in ms): The inference time is the amount of time the model takes to process an input image and detect the water pollutants in it. Table 1 compares the performance of the YOLOv3 and YOLOv4 models with other object detection models on the MS-COCO [5] dataset. It is evident that YOLOv4 and YOLOv3 provide better results.

Table 1 Comparison of other object detection models with YOLOv3 and YOLOv4 [18, 19]

Method            | mAP-50 (%) | Time (ms) | Frame rate (FPS)
YOLOv4-608        | 65.7       | 16        | 62
YOLOv4-416        | 62.8       | 12        | 83
YOLOv3-608        | 57.9       | 51        | 20
YOLOv3-416        | 55.3       | 29        | 35
YOLOv3-320        | 51.5       | 22        | 45
RetinaNet-50–100  | 50.9       | 73        | 14
RetinaNet-101–500 | 53.1       | 90        | 11
RetinaNet-101–800 | 57.5       | 198       | 5
SSD321            | 45.4       | 61        | 16
SSD513            | 50.4       | 125       | 8
FPN-FRCNN         | 59.1       | 172       | 6
R-FCN             | 51.9       | 85        | 12
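The metrics defined above can be computed directly from detection counts and box overlaps; a minimal sketch with illustrative values (not results from the paper):

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union else 0.0

def precision_recall(tp, fp, fn):
    """precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# At mAP-50, a prediction counts as a True Positive when IoU >= 0.5.
print(iou((0, 0, 100, 100), (50, 0, 150, 100)))  # two half-overlapping boxes
print(precision_recall(tp=8, fp=2, fn=2))        # (0.8, 0.8)
```

Averaging the per-class average precisions at this threshold yields the mAP-50 figure reported in Table 1.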
4.1 Training Environment The YOLO models make use of the TensorFlow and OpenCV libraries for the training process. TensorFlow provides numerous tools, libraries, and community resources to help develop ML applications. OpenCV is an open-source computer vision library that provides a common infrastructure for computer vision applications [20]. The test set consists of 160 images and the training set of around 1350 images.
4.2 Hardware Specifications We used professional cameras (a Canon DSLR and a Nikon D5600) to capture photos and collect the dataset, for better image quality. We used high-end GPU servers with 32 GB of memory for training our model; GPUs can perform multiple simultaneous computations, which facilitates parallel execution of the training process. The objects shown in Fig. 3 have been successfully classified; Table 2 can be referred to for the prediction results, which list the class name, box coordinates, and detection scores.

Fig. 3 Detected objects in test images (Images 1–4; source images from https://tinyurl.com/2p8f7b4h and https://tinyurl.com/dvkzzs9c)
Table 2 Prediction results of detected objects in test images

        | X1  | Y1  | X2  | Y2  | Class   | Score
Image 1 | 263 | 138 | 702 | 312 | Plastic | 0.77
Image 2 | 504 | 31  | 690 | 547 | Metal   | 0.64
Image 3 | 41  | 52  | 95  | 165 | Plastic | 0.93
Image 3 | 12  | 168 | 96  | 203 | Plastic | 0.68
Image 3 | 154 | 101 | 233 | 162 | Plastic | 0.54
Image 3 | 239 | 67  | 347 | 110 | Plastic | 0.98
Image 4 | 47  | 176 | 668 | 409 | Glass   | 0.70
5 Conclusion and Future Work The objective of the proposed work was to detect and locate waste in an image and categorize it according to the specified classes. For this, the pre-trained computer vision models YOLOv3 and YOLOv4 are used; accuracy and mean average precision can be used to evaluate and analyze these models further. This system will significantly help local authorities and civilians by providing an automated system for detecting floating waste pollutants and monitoring the pollution in water bodies. Future work will focus on implementing this system in real time on live video feeds, which will assist NGOs and larger organizations. Improvements can also be made by customizing the algorithms for better performance with variations in the dataset.
References
1. H. Panwar, P.K. Gupta, M.K. Siddiqui, R. Morales-Menendez, P. Bhardwaj, S. Sharma, I.H. Sarker, AquaVision: automating the detection of waste in water bodies using deep transfer learning. Case Stud. Chem. Environ. Eng. (2020). https://doi.org/10.1016/j.cscee.2020.100026
2. L. Parker, Ocean trash: 5.25 trillion pieces and counting, but big questions remain. Natl. Geogr. 11 (2015)
3. Condor Ferries, 100 plastic in the ocean statistics and facts (2020–2021). Retrieved from https://www.condorferries.co.uk/plastic-in-the-ocean-statistics
4. M. Tharani, A.W. Amin, M. Maaz, M. Taj, Attention neural network for trash detection on water channels. arXiv:2007.04639v1 [cs.CV] 9 Jul 2020
5. T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C.L. Zitnick, Microsoft COCO: common objects in context, in European Conference on Computer Vision (Springer, 2014), pp. 740–755
6. M. Everingham, L. Van Gool, C.K. Williams, J. Winn, A. Zisserman, The pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 88(2), 303–338 (2010)
7. X. Li, M. Tian, S. Kong, L. Wu, J. Yu, A modified YOLOv3 detection method for vision-based water surface garbage capture robot. Int. J. Adv. Rob. Syst. 17(3), 172988142093271 (2020). https://doi.org/10.1177/1729881420932715
8. L. Zhang, Y. Zhang, Z. Zhang, J. Shen, H. Wang, Real-time water surface object detection based on improved faster R-CNN. Sensors 19(16), 3523 (2019). https://doi.org/10.3390/s19163523
9. A. Samantaray, B. Yang, J. Eric Dietz, B.-C. Min, Algae detection using computer vision and deep learning. arXiv:1811.10847v1 [cs.CV] 27 Nov 2018
10. F. Cunha, Transfer learning with Yolo V3, darknet, and google colab (2021). Medium. Retrieved January 15, 2022, from https://medium.com/@cunhafh/transfer-learning-with-yolov3-darknet-and-google-colab-7f9a6f9c2afc
11. J. Redmon, A. Farhadi, YOLO9000: better, faster, stronger. arXiv:1612.08242v1 [cs.CV] 25 Dec 2016
12. J. Redmon, A. Farhadi, YOLOv3: an incremental improvement. arXiv:1804.02767v1 [cs.CV] 8 Apr 2018
13. A. Bochkovskiy, C.-Y. Wang, H.-Y. Mark Liao, YOLOv4: optimal speed and accuracy of object detection. arXiv:2004.10934v1 [cs.CV] 23 Apr 2020
14. B. Vivekanandam, Speedy image crowd counting by light weight convolutional neural network. J. Innov. Image Process. 3(3), 208–222 (2021)
15. P. Karuppusamy, Building detection using two-layered novel convolutional neural networks. J. Soft Comput. Paradigm (JSCP) 3(01), 29–37 (2021)
16. R. Sharma, A. Sungheetha, An efficient dimension reduction based fusion of CNN and SVM model for detection of abnormal incident in video surveillance. J. Soft Comput. Paradigm (JSCP) 3(02), 55–69 (2021)
17. K.E. Koech, On object detection metrics with worked example (2021). Medium. Retrieved January 15, 2022, from https://towardsdatascience.com/on-object-detection-metrics-with-worked-example-216f173ed31e
18. Papers with Code, COCO benchmark (real-time object detection). Retrieved January 15, 2022, from https://paperswithcode.com/sota/real-time-object-detection-on-coco
19. J. Redmon, YOLO: real-time object detection. Retrieved January 15, 2022, from https://pjreddie.com/darknet/yolo/
20. OpenCV (2019). Retrieved from https://developer.nvidia.com/opencv
Medical IoT Data Analytics for Post-COVID Patient Monitoring Salka Rahman, Suraiya Parveen, and Shabir Ahmad Sofi
Abstract COVID-19 has changed the scenario of patient care in most hospitals and healthcare centers throughout the world. Since the pandemic spreads through contact with COVID-infected persons, the most vulnerable community is that of doctors and healthcare workers. To contain the spread of the virus, COVID patients need to be separated from patients with other diseases. As such, remote patient monitoring becomes essential for taking care of patients under such conditions. Because of this, medical data is growing exponentially and needs to be analyzed continuously to improve health care. Data analytics improves the performance of healthcare organizations through proper decision making and accurate, timely information. Medical data can be obtained from the sensors or medical equipment installed in hospitals, or collected from the fast-growing population of Internet of Things (IoT) devices. Visualizing this data and correlating it with patient monitoring for better treatment and health care is an essential part of the process. This paper discusses various Machine Learning (ML) approaches for remote patient monitoring using medical IoT data for Post-COVID patient care. Keywords Medical IoT data · ML approaches · Remote monitoring · Post-COVID care
S. Rahman (B) · S. Parveen Department of Computer Science and Engineering, Jamia Hamdard University, New Delhi, Delhi 110062, India e-mail: [email protected] S. Parveen e-mail: [email protected] S. A. Sofi Department of Information Technology, National Institute of Technology Srinagar, Jammu and Kashmir 19006, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_42
1 Introduction Noncommunicable chronic diseases are the major cause of death worldwide. These diseases include liver infection, heart disease, cancer, anemia, diabetes, etc. According to WHO statistics, around 41 million people die from them every year, approximately 71% of deaths worldwide [1]. There are chances of misdiagnosis and human error by doctors; to avoid such critical situations, IoT and ML techniques are applied together, providing a solution through accurate diagnosis of these chronic diseases. Thus, it is evident that these diseases are a major issue that needs attention. Machine learning combined with IoT helps doctors, patients, and medical staff by providing the patient with better and improved treatment. Various IoT sensors are applied to the human body to continuously monitor the patient's condition and produce data that is stored in the cloud, on servers, or in databases, where machine learning and deep learning algorithms are applied to give accurate predictions. In this way, doctors can treat patients efficiently. Medical data is expanding every day, and processing and storing this huge volume of data is a complex task; healthcare analytics is one approach that can be incorporated for the analysis of medical data. Healthcare data analytics is the process of analyzing data for decision making and improving the performance of healthcare organizations. These approaches can be utilized for real-world healthcare applications such as disease prediction, discovery of new drugs, diagnosis, etc. Considering the current pandemic situation, coronavirus is a type of infectious virus which emerged in December 2019 and caused a global pandemic. According to the WHO, COVID-19 is a communicable disease caused by the SARS-CoV-2 virus. Most infected people experience mild to moderate respiratory illness and recover without any medical treatment, but some need proper medical treatment as well.
Any age group can get infected and may become seriously ill or even die. Vulnerability is higher among elderly people and people with other ailments. It is now evident that patients who recover from COVID-19 can suffer serious long-term Post-COVID symptoms such as lung damage, heart-related issues, kidney issues, brain malfunction, etc. [2–4]. Post-COVID patient monitoring is therefore an important issue, which can be taken care of by monitoring the patient on a regular basis. Thus, there is a need for a Post-COVID-19 monitoring system that integrates medical data, IoT medical data, and ML approaches to analyze a patient's condition on a real-time basis. Data collection is the method by which data relevant to a given problem is gathered. It is a major step in research and in industrial projects, and a number of methods are available for different purposes, such as surveys, polls, experimental observations, reports, etc. These methods provide efficient ways to gather data. Medical IoT is another data-collection approach, used frequently for real-time analysis. It uses sensors to monitor a patient's health status and stores the collected information in a cloud infrastructure or in other servers or databases, which ultimately helps doctors keep track of the patient's
condition remotely, simply by using devices such as smartphones, laptops, and tablets. IoT devices and sensors are shaping the way we live our lives today. The IoT is a giant network of connected devices that gather and share data. Sensors embedded in these physical devices provide the data, which is then stored and pre-processed for valuable information extraction as per the requirement.
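As a concrete illustration of this pipeline, a single sensor reading can be represented as a small record and screened against simple vital-sign thresholds before being forwarded for storage and ML analysis. The field names and limits below are illustrative assumptions for the sketch, not values from the paper and not clinical guidance:

```python
# Hypothetical schema for one remote-monitoring sample; the field names
# and alert thresholds are illustrative, not clinical advice.
NORMAL_RANGES = {
    "heart_rate_bpm": (60, 100),
    "spo2_percent": (94, 100),
    "temp_celsius": (36.1, 37.8),
}

def screen_reading(reading):
    """Return the list of vitals falling outside their normal range."""
    alerts = []
    for vital, (low, high) in NORMAL_RANGES.items():
        value = reading.get(vital)
        if value is not None and not (low <= value <= high):
            alerts.append(vital)
    return alerts

reading = {"patient_id": "P-001", "heart_rate_bpm": 112,
           "spo2_percent": 91, "temp_celsius": 36.9}
print(screen_reading(reading))  # ['heart_rate_bpm', 'spo2_percent']
```

In a deployed system, such records would be pushed from the device to cloud storage, with out-of-range readings flagged to the care team and the accumulated history fed to the ML models discussed below.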
2 Literature Review Chronic diseases cause millions of deaths worldwide. This section discusses ML approaches and medical IoT sensors for the classification of these diseases. Researchers are continuously developing new, advanced, integrated technologies for better classification. Khanday and Sofi [3] successfully show promising results for better prognosis of COVID-19, helping researchers and the medical community in multiple directions. Healthcare data analytics is the process of analyzing data for improved quality, decision making, and performance of healthcare organizations. Different IoT sensors are applied to keep track of the patient's condition. As discussed by Pandey [5], the mental stress of a person can be determined from heart rate sensors by using ML along with medical IoT sensors. Ensemble learning is one of the leading ML techniques; it combines multiple machine learning models to give accurate predictions. Ensemble techniques incorporate a number of classifiers to create a strong classifier with more predictive power, higher accuracy, and lower error rates than single machine learning approaches such as Support Vector Machine (SVM), K-Nearest Neighbour (KNN), or Decision Tree (DT). For the diagnosis of heart-related diseases, an improved ensemble learning approach applied to a heart dataset gave accurate results. Livieris et al. [6] propose a semi-supervised ensemble learning algorithm for the classification of lung abnormalities from X-rays, based on a new weighted voting scheme. Mienye et al. [7] proposed a randomized decision tree ensemble model for the prediction of heart disease risk. Nahar et al.
[8] present a comparative study of various ensemble machine learning methods in which the LogitBoost algorithm gave the most accurate results, about 71.53%, for the prediction of liver diseases. Lakshmanarao et al. [9] proposed a model in which heart disease prediction is done with the help of feature selection incorporated with ensemble techniques; to balance the data set, various sampling techniques were applied before modeling. In this way, heart-related disease is classified and diagnosed. Zaman et al. [10] proposed a system where wearable IoT and logistic regression are combined for the classification of cardiovascular diseases: different sensors applied on the human body collect data such as Blood Pressure (BP), ECG and pulse rate, and a logistic regression ML algorithm is applied for the classification. Tékouabou et al. [11] show that early glaucoma prediction can be done by combining
558
S. Rahman et al.
pre-processing techniques and an ensemble classifier with selection strategies, so that early diagnosis and prediction can be obtained. Roh et al. [12] present a survey on data collection methods for new machine learning applications where data is not available for training; three new data collection methods are introduced: data acquisition, data labeling and data improvement. In this way, improved data collection methods are obtained. Another very impressive work, by Sarmah [13], discusses an IoT-based patient monitoring and heart disease prediction system that uses a modified neural network technique in deep learning to monitor patients and predict the status of the heart. Firouzi et al. [14] summarize their contribution on tackling the current COVID-19 situation by using techniques such as IoT, AI, robotics and blockchain.
3 IoT Based Medical Data Analytics

Medical data is growing exponentially and is very complex to store, process and analyze with traditional methods; thus, advanced machine learning techniques are used to analyze medical data efficiently. Health care data analytics is the process of analyzing data for improved quality, decision making and performance of health care organizations. In this way, the data is converted into precise, uniform and timely information. The medical data collected on an electronic health recorder needs proper EDA (Exploratory Data Analysis) [15]. Data analytics is used to get insights out of healthcare data so that these patterns can be used in the future for treatments, discovering new diseases, research purposes etc. There are basically three major types of healthcare analytics: descriptive, predictive and prescriptive. Descriptive analytics mainly focuses on existing data, on what has already happened. Predictive analytics focuses on what will happen next. Finally, prescriptive analytics helps create the future by suggesting what is best to avoid and what to incorporate. Refer Fig. 1. Exploratory data analysis is one of the methods to analyze the data. It is a set of techniques used to understand the data in a much better way before
Fig. 1 Classification chart for healthcare analytics: descriptive, predictive and prescriptive data analytics
Medical IoT Data Analytics for Post-COVID Patient Monitoring
Fig. 2 Components of EDA process: (1) understanding the data, (2) cleaning the data, (3) analyzing relationships
developing the model. The main purpose of EDA is to extract patterns, insights and relations from the data using statistical, graphical and visualization techniques [16, 17]. There are three main components of the EDA process: (I) Understanding the variables—you need to acknowledge what the insights imply while applying various functions on the data set. The very first step is to import libraries like pandas, numpy, matplotlib etc., load the data sets you are working on, then inspect their shape and apply various functions to get insights, which helps to build a better understanding of the variables. (II) Cleaning the data sets—removing redundant variables, removing outliers, fixing missing values, eliminating null rows and columns and applying feature selection. (III) Analyzing the relationships—visualizing the data set using techniques like a correlation matrix and then representing it with a graphical method, for example a scatter plot or histogram; in this way, the relationships between the variables become more comprehensible. Refer Fig. 2. Therefore, EDA techniques are used to process raw data into meaningful insights, which allows data scientists and analysts to comprehend the data in a much better way before making any assumptions for their projects. Thus, EDA is a crucial step in the efficient development of any model.
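The three EDA components described above can be sketched in a few lines of plain Python. This is a minimal illustration on a toy data set with hypothetical field names (`pulse`, `bp_sys`); real work would typically use pandas, as the text notes.

```python
from statistics import mean, pstdev

# Toy patient records standing in for data pulled from an electronic
# health recorder (field names are hypothetical).
records = [
    {"pulse": 72, "bp_sys": 120},
    {"pulse": 88, "bp_sys": 135},
    {"pulse": None, "bp_sys": 128},   # missing value to be cleaned
    {"pulse": 95, "bp_sys": 142},
]

def summarize(rows, field):
    """Step (I): descriptive statistics to understand one variable."""
    vals = [r[field] for r in rows]
    return {"n": len(vals), "mean": mean(vals), "std": pstdev(vals)}

def clean(rows, field):
    """Step (II): drop rows with missing values for the given field."""
    return [r for r in rows if r[field] is not None]

def correlation(rows, f1, f2):
    """Step (III): Pearson correlation to analyze a relationship."""
    xs = [r[f1] for r in rows]
    ys = [r[f2] for r in rows]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

cleaned = clean(records, "pulse")
```

On this toy data, pulse and systolic BP are strongly correlated, which is exactly the kind of relationship step (III) is meant to surface before modeling.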
In medical healthcare, IoT devices play an important role in analyzing the patient's condition. Consider an example where a patient's pulse rate is to be monitored: wearable IoT devices such as fitness bands can be used to evaluate the patient's condition on a regular basis in real time. The collected data is stored on cloud infrastructure and pre-processed there. The doctors can then easily monitor and track the patient's progress by simply using their smartphones. The basic IoT architecture comprises three layers, as shown in Fig. 3. The first layer, the perception layer, acts as a physical layer where IoT sensors and embedded devices/systems collect a huge amount of data. At the network layer, the collected data from the perception layer is distributed and stored. Finally, at the application layer, different users are connected to the IoT system; it acts as an interface through which the users can access the data.
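The three-layer architecture just described can be sketched as three small classes. This is a simulation only: the sensor reading is randomized and the "cloud" is an in-memory list, standing in for real hardware and storage.

```python
import random

class PerceptionLayer:
    """Physical layer: sensors produce raw readings (simulated here)."""
    def read_pulse(self):
        return random.randint(60, 100)  # beats per minute

class NetworkLayer:
    """Transport/storage layer: stands in for a cloud datastore."""
    def __init__(self):
        self.store = []
    def upload(self, reading):
        self.store.append(reading)

class ApplicationLayer:
    """User-facing layer: what a doctor's smartphone app would query."""
    def __init__(self, network):
        self.network = network
    def latest(self):
        return self.network.store[-1]

sensors = PerceptionLayer()
cloud = NetworkLayer()
app = ApplicationLayer(cloud)
cloud.upload(sensors.read_pulse())
```

The application layer never touches the sensor directly; it only queries the network layer, mirroring the interface role described above.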
Fig. 3 Three-layer IoT architecture: application, network, perception
4 Data Collection

Data collection is the method of gathering related data from various resources in order to measure, collect and analyze accurate insights for research by using standard techniques. Data collection is broadly divided into two types: primary and secondary. Primary data collection is the process of collecting original data by a researcher for a specific purpose. It is further divided into two categories: quantitative and qualitative research methods. The quantitative research method deals with numbers and requires mathematical calculation; methods such as correlation and regression, mean, median and mode can be used, and typical tools for quantitative data collection are face-to-face interviews, online interviews, mails, phone calls etc. The qualitative research method is based on non-numeric elements like feelings or emotions, for example open-ended questionnaires; qualitative data collection tools are surveys, polls, group discussions, web survey chats etc. [18]. Secondary data collection is the process of gathering data that has been collected by someone else; typical sources are reports, online surveys and interviews. It is important to decide which data collection tools should be used for the research, such as case studies, surveys, polls, observational experiments, computer-assisted systems, sensors etc., because different tools suit different research scenarios, as demonstrated in Fig. 4. Another widely used data collection method nowadays is online forums/websites, which contain a huge number of data sets that are easily accessible; you just need to download them and start pre-processing. Kaggle, the UCI repository and GitHub are some of the most used websites for data sets (data collection). Medical IoT is another very important data collection method used frequently for real-time analysis and better prediction.
It uses sensors to monitor the patient's health status and stores the collected information in public cloud infrastructure, which helps the doctors keep track of patients at any point in time by simply using devices like smartphones, laptops, PCs and tablets remotely. New machine learning applications sometimes don't have enough data available to train the model; in these situations traditional methods don't work, so a new accurate and scalable data collection technique is needed. Roh et al. [12] propose
Fig. 4 Traditional data collection methods: primary (quantitative, qualitative) and secondary (reports, e.g. online financial/sales reports)
three new data collection methods for the same: data acquisition, data labeling and data improvement.
(I) Data acquisition—searches for and shares all the related data available; augmentation of the discovered data is then done by applying latent semantics and entity integration; finally, data can also be generated, either manually or automatically. Refer Fig. 5.
(II) Data labeling—once the data is gathered, it needs to be labeled; this is done by applying semi-supervised learning methods when the data set already has some existing labels, while for data sets with no labels, fact extraction, active learning and crowdsourcing methods can be applied, as demonstrated in Fig. 6.
(III) Data improvement—finally, improvement is applied either to the model or to the data set by re-labeling, transfer learning and data cleaning. Once the data has been improved, it can go for exploratory data analysis. Refer Fig. 7.
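The semi-supervised route mentioned for data labeling can be illustrated with a tiny self-training loop: a couple of seed labels propagate to unlabeled points via a 1-nearest-neighbour rule. The 1-D "pulse rate" values and labels are invented for illustration; real semi-supervised methods are far richer.

```python
def nearest_label(point, labeled):
    """Label a point with the label of its nearest labeled neighbour (1-NN)."""
    return min(labeled, key=lambda p: abs(p[0] - point))[1]

def self_train(labeled, unlabeled):
    """Semi-supervised self-training: label each unlabeled point using the
    current labeled pool, then add it to the pool so later points can
    inherit from it."""
    pool = list(labeled)
    for point in sorted(unlabeled):
        pool.append((point, nearest_label(point, pool)))
    return pool

# Two seed labels, four unlabeled readings (all values hypothetical).
seeds = [(60, "normal"), (110, "elevated")]
result = dict(self_train(seeds, [65, 70, 100, 105]))
```

Points near 60 inherit "normal" and points near 110 inherit "elevated", which is the intuition behind using a few existing labels to label the rest.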
Fig. 5 Data acquisition [12]: discovery (searching, sharing) → augmentation (latent semantics, entity integration) → generation (automated, manual)
Fig. 6 Data labeling [12]: no labels (fact extraction, active learning, crowd sourcing); some labels (semi-supervised learning methods)
Fig. 7 Improvement [12]: improve data (re-labeling, data cleaning); improve model (transfer learning)
5 ML Approaches for Medical IoT

Machine learning is a technique by which we can extract patterns from unfiltered data by using various algorithms or methods. The main objective of machine learning is to allow computer systems to learn on their own from experience, without human intervention. The key point is that computers are very strong at calculation, so machine learning provides a way to turn them into intelligent predictors with high accuracy: a model is trained and tested on a data set and then evaluated on its predictions for new data points. Refer Fig. 8. Machine learning algorithms are used for solving a good number of real-world problems, one of which is the diagnosis of diseases. Conventional machine learning methods like SVM, KNN, logistic regression etc. were used to identify people at risk, but gave results that were not optimized and needed more improvement. Thus,
Fig. 8 Basic ML process: data set (as an input) → modelling (ML algorithm) → prediction (as an output)
ensemble learning works as a solution for the prediction and prevention of disease, performing better than conventional machine learning methods. Ensemble learning means combining multiple machine learning models for better prediction on a given problem. Ensemble techniques integrate a number of classifiers to create a strong classifier that has more predictive power, higher accuracy and lower error rates than other machine learning approaches like SVM, KNN, DC etc. When the randomized decision tree ensemble model is compared with other machine learning models like k-nearest neighbour, logistic regression, support vector machine, gradient boosting, linear discriminant analysis, classification and regression trees (CART) and random forest for heart disease risk, the other models show lower accuracy when tested. All the machine learning models, along with the randomized decision tree ensemble model, are trained on the Framingham and Cleveland data sets. The results signify that the other machine learning models are not as efficient and accurate as the randomized decision tree ensemble model, which has an accuracy of 93% for the Framingham and 91% for the Cleveland data set [7]. Another example is the comparison of ensemble approaches like bagging and boosting with other machine learning models such as k-nearest neighbour, naive Bayes, decision tree, support vector machine and neural networks for protein structural class prediction. When evaluated individually, the decision tree on its own had a mean error rate of 25.25%, whereas bagging and boosting had mean error rates of about 18.55% and 15.50% respectively. This clearly signifies that ensemble techniques have lower error rates than other machine learning approaches [19]. IoT along with machine learning is one of the paramount technology integrations nowadays, especially in the field of health care.
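The core idea behind ensembles, that combining several weak classifiers by majority vote cancels their individual errors, can be shown with three deliberately crude threshold rules. The features and thresholds are invented for illustration, not clinical values.

```python
from collections import Counter

# Three weak 'classifiers' for a toy heart-risk task
# (hypothetical thresholds on pulse, systolic BP and cholesterol).
def clf_pulse(x):       return "risk" if x["pulse"] > 90 else "ok"
def clf_bp(x):          return "risk" if x["bp_sys"] > 140 else "ok"
def clf_cholesterol(x): return "risk" if x["chol"] > 240 else "ok"

def vote_ensemble(classifiers, x):
    """Hard-voting ensemble: the majority label wins, so a single
    wrong base classifier is outvoted by the other two."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

patient = {"pulse": 95, "bp_sys": 150, "chol": 200}
label = vote_ensemble([clf_pulse, clf_bp, clf_cholesterol], patient)
```

Here two of the three rules flag the patient, so the ensemble outputs "risk" even though the cholesterol rule alone would have said "ok"; bagging and boosting refine this same voting idea by also varying the training data and weights.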
By using this technology, various noncommunicable chronic diseases such as cancer, anemia, diabetes, cardiac diseases, kidney failure etc. can be monitored before any casualty, because most people avoid going for regular check-ups, which eventually makes the situation worse and sometimes even fatal. In order to avoid such critical situations, IoT combined with machine learning algorithms is used for the prediction of these diseases. Different IoT sensors (wearable and mobile) are applied on the human body; they keep track of the patient and provide data to the cloud, servers or databases, where machine learning algorithms give accurate predictions. Zaman et al. [10] propose a system in which wearable IoT sensors and logistic regression are integrated for the classification of cardiovascular diseases. In their model, different sensors applied on the human body collect data such as BP, ECG, pulse rate etc. and store it over the cloud, servers or databases, on which ML algorithms are applied for the classification.
6 Post-COVID-19 Patient Monitoring System

The coronavirus pandemic has been one of the major threats to human life throughout the world since 2019. A person who gets infected with COVID-19 and thereafter recovers, with or without medical treatment, is prone to many ailments if
precautions are not taken. The vulnerability is greater for elderly people and people with other ailments. Post-COVID patient monitoring is one of the important issues, and it can be taken care of if the patient is monitored on a regular basis. The Post-COVID patient monitoring system includes medical data collection or data acquisition using sensors, for example medical IoT devices. The number of patients suffering from Post-COVID ailments has increased to such an extent that doctors may not be able to cater to the patient load. Therefore, there is a need to automate the patient monitoring system using medical data, IoT medical data and analysis of data through ML approaches, so that the doctors are intimated on a real-time basis whenever a patient needs care.
7 Some of the Existing Models

Some of the available models used for classification and prediction purposes are SVM, KNN and so on. SVM stands for support vector machine; it is a supervised learning model that helps in classifying different data points and is used for both classification and regression problems. KNN stands for k-nearest neighbour; it calculates the distance to each point and makes a prediction accordingly. It is one of the simplest and most widely implemented models for supervised learning. But when it comes to accuracy, these types of models sometimes fail to provide accurate predictions. To improve on this, ensemble learning can be incorporated: different ML algorithms are combined to form a stronger classifier for more accurate prediction. The proposed model uses ensemble learning as the classifier, as it works considerably better than the other models. Refer Fig. 9.
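The KNN idea mentioned above, predict by distance to known points, fits in a few lines for 1-D data. The training values and labels are hypothetical pulse readings used purely to illustrate the mechanism.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """k-nearest neighbour on 1-D points: take the k training points
    closest to the query and return the majority label among them."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy training set (hypothetical pulse readings with labels).
train = [(60, "ok"), (65, "ok"), (70, "ok"), (100, "risk"), (110, "risk")]
```

A query of 72 sits among the "ok" cluster, while 104 sits among the "risk" cluster, so the majority vote of the three nearest neighbours decides each case.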
Fig. 9 Existing models: SVM (works for both classification and regression problems); KNN (widely used model for supervised learning problems)
8 Proposed System

Many doctors and data scientists are working hand in hand to gain a better understanding of the current pandemic situation. It is evident that patients who recover from COVID-19 can also suffer long-term Post-COVID-19 symptoms that fall into two categories: minor and major. Minor symptoms are the ones that usually fade away in a couple of weeks, such as cold, fever, body ache and cough, and are not that dangerous. Major symptoms, on the other hand, affect different organs in the human body, such as the heart, liver, kidney, lungs and brain. Therefore, there is a need to monitor Post-COVID-19 patients using different IoT devices and sensors that keep track of the patient's condition and provide the required data, stored on the cloud or other databases, so that analysis can be made using various ML algorithms like logistic regression, ensemble learning, SVM, KNN, DC etc.; once the affected organ/disease has been classified, the doctors can treat the patient. The proposed model comprises five major stages, each with its own functionality. It follows the waterfall model mechanism, where all the stages work in sequence, i.e., the output of one stage is taken as the input of the next. Firstly, the monitoring of patients is done using IoT sensors and medical devices, which provide the required data. Secondly, the data provided by the IoT devices is stored on the cloud or a database. Thirdly, the collected data is pre-processed using exploratory data analysis, which helps to understand the collected data, clean it, apply feature selection and find relationships between the variables.
Fourthly, machine learning algorithms such as ensemble learning, SVM, DC, KNN and so on are applied to the pre-processed data for the classification of the problem, and a comparative analysis is done to select the best classifier, the one that gives the most accurate predictions. Finally, at the last stage, the output is made available to the doctors for treatment purposes, as shown in Fig. 10. Considering heart diagnosis for Post-COVID-19 patients as an example, various sensors would be applied on the patient's body: wearable IoT devices such as fitness bands or the Qardio Core, and mobile IoT sensors such as a smart thermometer or glucometer. These devices can be used to monitor BP, pulse rate, ECG, oxygen level, cholesterol level, temperature and sugar level in the human body. All the recorded data is gathered and stored over public cloud infrastructure so that it can be analyzed. Once the data is collected, ML algorithms are applied to predict whether a person suffers from major symptoms or not and what their condition is. In this way, the doctors will be able to carry out treatment much more efficiently, with fewer chances of misdiagnosis or human error. Thus, by using the proposed model, improved predictions can be made that will help healthcare professionals in treatment.
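The five waterfall stages can be sketched as a chain of functions, each consuming the previous stage's output. Every function here is a hypothetical stand-in (the sensor data, the SpO2 threshold and the field names are invented), but the data flow matches the stage order described above.

```python
# Minimal sketch of the five-stage waterfall pipeline (all stand-ins).
def collect():                      # stage 1: IoT sensors / medical devices
    return [{"pulse": 95, "spo2": 93}, {"pulse": 70, "spo2": None}]

def store(readings):                # stage 2: cloud / database
    return list(readings)           # pretend persistence

def preprocess(readings):           # stage 3: EDA-style cleaning
    return [r for r in readings if None not in r.values()]

def classify(readings):             # stage 4: ML classification (toy rule)
    return ["major" if r["spo2"] < 94 else "minor" for r in readings]

def notify_doctor(labels):          # stage 5: output for treatment
    return [f"patient {i}: {lab} symptoms" for i, lab in enumerate(labels)]

# Waterfall: the output of one stage is the input of the next.
report = notify_doctor(classify(preprocess(store(collect()))))
```

Because each stage only depends on the previous one's output, any stage (say, the toy rule in stage 4) can be swapped for a real classifier without touching the rest of the pipeline.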
Fig. 10 Layout of the proposed model: (1) data collection (IoT sensors, medical equipment) → (2) collected data stored (cloud, databases, servers) → (3) pre-processing (cleaning, eliminating null values, feature selection, balancing data sets, understanding relationships) → (4) ML modelling (SVM, KNN, DC, ensemble learning etc.; prediction, classification) → (5) treatment by doctors
9 Evaluation

Evaluation of the system is a very crucial step, as it helps researchers uncover hidden obstacles and provides clarity about their work. For evaluation, the confusion matrix is widely used. It helps to determine the performance of the model on a set of test data for which the true values are already known: it lays out, in a simple way, how many predictions were correctly classified and how many were not. It is represented in matrix format, where the x-axis is the prediction made and the y-axis is the actual class label, which brings clarity when evaluating the model on the basis of its classification/prediction. Refer Fig. 11.
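Building the matrix described above is straightforward: count, for each actual class, how the model labeled it. The labels and results below are a toy example, not output from the proposed system.

```python
def confusion_matrix(actual, predicted, labels=("positive", "negative")):
    """Rows are actual classes, columns are predictions."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def accuracy(matrix):
    """Correct predictions (the diagonal) over all predictions."""
    correct = sum(matrix[lab][lab] for lab in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

# Toy disease-classification results (hypothetical labels).
actual    = ["positive", "positive", "negative", "negative", "positive"]
predicted = ["positive", "negative", "negative", "negative", "positive"]
cm = confusion_matrix(actual, predicted)
```

The off-diagonal cell `cm["positive"]["negative"]` is a missed positive case (a false negative), which in a medical setting is usually the costliest error and exactly what the matrix makes visible.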
10 Conclusion

Post-COVID patient care demands prompt and efficient medical attention for patients who are suffering from many morbidities in addition to COVID-19. Elderly patients need more attention than others, especially if infected by COVID-19. The medical data should be explored quickly, and timely action should be taken by analyzing the data in real time. One approach is to use artificial intelligence in medical IoT devices and equipment so that any eventuality can be avoided. The
Fig. 11 Illustration of confusion matrix (x-axis: prediction made; y-axis: actual class)
use of ML approaches as discussed in this paper will help in remote patient monitoring and will lessen the load on medical staff. Visualizing and correlating the data is an essential part of the proposed system and will be an addition to the existing gadgets for Post-COVID patient care.
References
1. https://www.who.int/news-room/fact-sheets/detail/noncommunicable-diseases
2. https://www.who.int/health-tpoics/coronavirus#tab+tab_1
3. N.Y. Khanday, S.A. Sofi, Deep insight: convolutional neural network and its applications for COVID-19 prognosis. Biomed. Signal Process. Control 69, 102814 (2021)
4. J. Thavorn, C. Gowanit, V. Muangsin, N. Muangsin, Collaboration network and trends of global coronavirus disease research: a scientometric analysis. IEEE Access 9, 45001–45016 (2021)
5. P.S. Pandey, Machine learning and IoT for prediction and detection of stress, in 2017 17th International Conference on Computational Science and Its Applications (ICCSA), pp. 1–5 (2017)
6. I.E. Livieris, A. Kanavos, V. Tampakas, P. Pintelas, A weighted voting ensemble self-labeled algorithm for the detection of lung abnormalities from X-rays. Algorithms 12(3), 64 (2019)
7. I.D. Mienye, Y. Sun, Z. Wang, An improved ensemble learning approach for the prediction of heart disease risk. Inform. Med. Unlocked 20, 100402 (2020)
8. N. Nahar, F. Ara, M.A.I. Neloy, V. Barua, M.S. Hossain, K. Andersson, A comparative analysis of ensemble methods for liver disease prediction, in 2019 2nd International Conference on Innovation in Engineering and Technology (ICIET), pp. 1–6 (2019)
9. A. Lakshmanarao, A. Srisaila, T.S.R. Kiran, Heart disease prediction using feature selection and ensemble technique, in Proceedings of the Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV 2021), pp. 994–998 (2021)
10. M.I.U. Zaman, S. Tabassum, M.S. Ullah, A. Rahaman, S. Nahar, A.K.M. Muzahidul Islam, Towards IoT and ML driven cardiac status prediction system, in 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), pp. 1–6 (2019)
11. S.C.K. Tékouabou, E.A.A. Alaoui, I. Chabbar, H. Toulni, W. Cherif, H. Silka, Optimizing the early glaucoma detection from visual fields by combining pre-processing techniques and ensemble classifier with selection strategies. Expert Syst. Appl. 115975 (2021)
12. Y. Roh, G. Heo, S.E. Whang, A survey on data collection for machine learning: a big data—AI integration perspective. IEEE Trans. Knowl. Data Eng. 33(4), 1328–1347 (2021)
13. S.S. Sarmah, An efficient IoT-based patient monitoring and heart disease prediction system using deep learning modified neural network. IEEE Access 8, 135784–135797 (2020)
14. F. Firouzi, B. Farahani, M. Daneshmand, K. Grise, J. Song, R. Saracco, L.L. Wang, Harnessing the power of smart and connected health to tackle COVID-19: IoT, AI, robotics, and blockchain for a better world. IEEE Internet Things J. 8(16), 12826–12846 (2021)
15. W. Raghupathi, V. Raghupathi, Big data analytics in healthcare: promise and potential. Health Inf. Sci. Syst. 2, 3 (2014)
16. C. Felix, A.V. Pandey, E. Bertini, TextTile: an interactive visualization tool for seamless exploratory analysis of structured data and unstructured text. IEEE Trans. Visual Comput. Graph. 23(1), 161–170 (2017)
17. J.T. Behrens, Principles and procedures of exploratory data analysis. Psychol. Methods 2(2), 131–160 (1997)
18. J.S. Rabianski, Primary and secondary data: concepts, concerns, errors, and issues. Appraisal J. 71(1), 43–55 (2003)
19. V.G. Bittencourt, M.C.C. Abreu, M.C.P. de Souto, A.M. Canuto, An empirical comparison of individual machine learning techniques and ensemble approaches in protein structural class prediction, in 2005 IEEE International Joint Conference on Neural Networks, pp. 527–531 (2005)
SNAP—A Secured E Exchange Platform Neeta Shirsat, Pradnya Kulkarni, Shubham Balkawade, and Akash Kasbe
Abstract E-commerce is booming at a breakneck pace in this Internet age, leaving brick-and-mortar enterprises in the dust. People in the developed world, as well as those in the developing world, use e-commerce Web sites every day. Even so, e-commerce growth in developing countries is still limited. This paper outlines a secured E exchange platform for African countries to improve the economy and promote local business. The platform is useful in that it makes it easier to buy and sell items online. The E exchange platform SNAP is an interactive, mobile-friendly e-commerce solution providing solutions to various types of users: B2B, B2C and C2C. SNAP is the first online platform offering auctions, bidding and separate perks for professional sellers. The platform comprises multiple modules: a seller module, buyer module, chat functionality, user management, SMS integration, cloud infrastructure and front-end UX/UI. SMS integration provides secured registration, auction updates, sales alerts and other necessary notifications. SNAP, a secured E exchange platform, will promote local businesses and will help improve the economy and the per capita income from local artifacts.

Keywords Cloud storage · Data analysis · E-commerce · Online marketplace · Security · Security of transaction integration · Web application
1 Introduction

SNAP, a secured e-commerce platform, is a Ghana countrywide platform that facilitates consumer-to-consumer, business-to-consumer and business-to-business sales through its Web platform. SNAP is being developed to combine features of eBay [1], Amazon [2] and OLX [3]. In addition to eBay's functionality, this platform

N. Shirsat · P. Kulkarni · S. Balkawade (B) · A. Kasbe
Department of Information Technology, PVG's College of Engineering and Technology, Pune, India
e-mail: [email protected]
N. Shirsat
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_43
will also have some additional features, such as a different identity for professional sellers, chat functionality, location selection and customization of categories. SNAP will offer online money transfer to enable fast and secure money transfer across the globe. This secured E exchange platform will utilize cloud computing technology to achieve scalability, security and big data storage. The secured E exchange platform focuses on e-commerce to promote individual users and businesses in the online marketplace. SNAP, the E exchange secured platform, is an online auction site allowing person-to-person deals/transactions where the users are the buyers and sellers and SNAP is only the medium. SNAP solutions are built to be safe, smart and convenient for customers in African countries. The e-commerce platform compares new discoveries to earlier findings and theories on purchasing and selling behavior to stay thoroughly updated with market trends and to secure the connection between businesses and companies and their customers and clients. We help people buy/sell absolutely any and every available item in the market. SNAP is an idea to make a positive contribution to our world by helping make limited resources more efficient and easily accessible, without the need to search for them tediously. In a competitive world where time is of the essence, secured online shopping should be a walk in the park for customers. The most important aspect is the respected entrepreneurs and businessmen, as well as the customers who buy from them. With the fluidity of being able to buy/sell absolutely any and every available item in the market, SNAP is constantly striving to update its products and their categories to make this an amazing experience for the users of the platform. This is an initiative to try to build a global community in the cut-throat competition of the virtual business world.
SNAP comprises features such as secured authentication, adding products in a specified category, searching for products by various filters relevant to that category, auctions, bidding, promoting new advertisements and many more. SNAP will generate revenue through its promotional advertisement feature, in which a user pays a fee for a particular period to showcase their banner/advertisement in a sellable area of the Web platform. Apart from promotional advertisements, users will not have to pay any additional charges for registration or for adding their products to the platform. Figure 1 shows an overview of the multi-vendor marketplace structure followed by SNAP. The objective is to develop a user-friendly e-commerce platform where any kind of product can be sold/purchased, with bidding on products also allowed. The following are several important objectives of the proposed work:
• To avail state-of-the-art technologies for developing the E exchange platform, to provide a flawless system for businesses and customers.
• To enable buying and selling of new as well as used products.
• To allow service providers to register on the platform and receive enquiries.
• To access the platform on mobile devices.
• To provide security for the database on the cloud, a professional UX/UI and a user-friendly solution pertaining to the demographic.
• Secure transactions—the familiarity, regularity, convenience and speed of performing online transactions may mislead individuals into a false feeling of
Fig. 1 Multi-vendor marketplace structure [4]
security. As a result, we must take additional efforts to protect our clients' personal and financial data from data breaches; for that we use PayPal as our payment gateway, which has the following features:
– Compliance with PCI DSS
– SSL certificate
– Use of tokenization and encryption
– Transaction verification.
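The person-to-person auction flow that SNAP centres on can be sketched in a minimal, purely illustrative form. The class and field names below are hypothetical and do not reflect SNAP's actual schema; a real implementation would also handle persistence, authentication and payment.

```python
class Auction:
    """Minimal sketch of a person-to-person auction: SNAP is only the
    medium, so the object just tracks the current highest bid."""
    def __init__(self, item, start_price):
        self.item = item
        self.highest_bid = start_price
        self.highest_bidder = None

    def bid(self, user, amount):
        """Accept a bid only if it beats the current highest bid."""
        if amount <= self.highest_bid:
            return False
        self.highest_bid, self.highest_bidder = amount, user
        return True

auction = Auction("handmade basket", start_price=10)
auction.bid("ama", 12)
auction.bid("kofi", 11)   # rejected: does not beat the standing bid of 12
```

Rejecting bids that do not beat the standing price is the basic rule that shill bidding (discussed in the literature survey) tries to exploit by artificially raising that price.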
This paper is organized as follows: Sect. 1 is the introduction, Sect. 2 is the literature survey, Sect. 3 describes the related work, Sect. 4 gives the proposed system implementation, Sect. 5 describes the expected results and third-party API details, and concluding remarks are given in Sect. 6.
2 Literature Survey

Online auctions have gained a lot of traction. eBay [1], one of the most popular online auction sites, reported that its number of active users worldwide climbed from 27.7 million in 2002 to 41.2 million in 2003, with around 292 million listings. eBay was also one of the top five sites in Germany and the UK in February 2004, according to Nielsen/NetRatings, the worldwide benchmark for Internet audience measurement and research. The Aberdeen Group found that auctions accounted for 94% of net market transactions, whereas catalogue sales accounted for just 6%. The majority of auctions are open to the general public: whatever you are looking for, you will be able to find it. Given the virtual market’s fast growth, there are no de facto norms for the bidding rules and procedures that regulate the online auction sector. Although online auctions have been around for a long time, two big issues remain: trustworthiness and security. Regarding the first issue, many auction sites [5, 6] refer to themselves as “meeting places for buyers and sellers” when it comes to trustworthy
N. Shirsat et al.
transactions. They merely enable vendors to list items for trade and do not check that the items are in fact available or that the descriptions are truthful. To identify the traders—buyers and sellers—they require only an email address. Following the auction, the seller is responsible for dealing directly with the buyer about payment and delivery. The auction houses are not in any way responsible for the sale. As a result, auction fraud is becoming an increasingly difficult problem in the virtual market. The following are the most typical kinds of auction fraud:
• Failure to deliver: Customers pay for a product that is never delivered.
• Misrepresentation: The items received do not match the description provided.
• Shill bidding: A seller or an associate places a fictitious bid in order to inflate prices. The Federal Trade Commission (FTC) has received a number of complaints about this practice.
This motivates the provision of a security system with fine-grained access control that grants legitimate users access to resources while safeguarding critical information from hackers and unauthorized users (i.e., all other users).
Currently, various e-commerce platforms are popular and in daily use across the globe. Many of these platforms allow buyers and sellers to meet in a single place and charge a small fee for the transactions that happen between them. Both Amazon and eBay are long-standing, big participants in this industry: online shopping services with a huge variety of products, through which users explore items for sale or auction via each company’s online storefront. The e-commerce movement is larger than it has ever been, and it is only getting bigger.
While both Amazon and eBay have evolved to meet today’s consumer needs, they are fundamentally different: they have different business models, pricing, seller services and customer services. Another popular e-marketplace, in Nigeria, is Jiji [7]. Jiji is an African Internet marketplace that connects buyers and sellers. The online marketplace Jiji.ng [8] allows users to post free advertisements and find buyers quickly; Jiji’s mission is to connect buyers and sellers swiftly and easily online. Searching for employment through online job listings is a recently added service on their portal. Table 1 summarizes how the existing systems work and some of their drawbacks.
3 Proposed System

The Web application “Secured E exchange platform” provides a solution to the problems faced by Ghanaian citizens. This Web portal emphasizes developing secure and mobile-friendly options through which users can interact, making it easier for the
Table 1 Literature review of systems

Paper | Work description | Problems found
[9] | Evaluating e-commerce Web sites’ usability and accessibility | A number of e-commerce Web sites do not follow WCAG requirements and breach fundamental usability principles
[10] | An emotional-intelligence-based entropy management system that helps reduce organizational and operational inefficiencies in real time; anonymous reflection at the individual level is aggregated by team to create a real-time mirror of the team and organizational dynamics the team may be experiencing | Sustainability of a complex adaptive system
[11] | With the advancement in e-commerce, research in this field has increased; the recommendation system uses several filtering approaches to give the user customized recommendations | The information obtained needs to be filtered more, and the traditional system does not provide collaborative filters
[12] | The project’s observing and assessment strategy should identify areas where examiners lack the competence to investigate auditing systems, allowing methodical auditing to be piloted in such areas | No particular problem found; the paper describes important precautions/measures that should be taken before starting any project
[13] | The writers work on five variables that benefit from big data analysis: improved customer management, increased profit, competitive advantage, company volume and others | The work focuses mainly on banking
users to buy and sell products on the platform. A user can register through a mobile phone and can also log in via Facebook and Gmail. In the application, the user can select a location (region) and search for products in a specified category. The platform has the following system modules:
1. Super admin: The super admin keeps track of all the actions taking place on the online platform, as well as of the users and categories in the database. The platform’s functionality is managed by the super admin.
2. Professional seller: Professional sellers have more privileges than normal users; for example, they can add quantities for their products. This type of user is also given an identification such as verified seller/gold category.
3. User/buyer/seller: The user is this online portal’s core element and can register, place orders and connect with service providers. The portal helps users reduce the time spent looking for items/service providers near a specified area or at their convenience.
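The three modules above amount to a small role–permission mapping. A minimal sketch follows; the role and action names are illustrative, not the platform's actual schema:

```python
# Illustrative permission table derived from the module descriptions above.
ROLE_PERMISSIONS = {
    "super_admin": {"manage_users", "manage_categories", "view_all_orders"},
    "professional_seller": {"add_product", "set_quantity", "receive_enquiries"},
    "user": {"register", "place_order", "contact_provider"},
}

def can(role: str, action: str) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return action in ROLE_PERMISSIONS.get(role, set())

assert can("professional_seller", "set_quantity")
assert not can("user", "manage_categories")
```

Centralizing the mapping in one table makes it easy for the super admin module to adjust privileges without touching controller code.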
The functional capabilities that make SNAP suitable for public procurement are specified as follows: B2B, C2C and B2C trade; search for categories by name, category or locality code; creation of purchase requisitions; bidding for products; generation of purchase orders with an optional approver workflow; receiving goods into the system; customization of “policies”; buyer/seller data management; language selection; dynamic advertising for promoting businesses/offers; and many more. Although Amazon and eBay are big players in the e-commerce market, some functionalities are still missing in African countries. African citizens are, on average, less tech-savvy than American or European users, so the platform must be suited to their needs and offer detailed categories. Also, the sale/auction of pre-owned goods is common there and will help increase business.
4 System Architecture

The proposed system is mainly divided into three tiers, namely the front view/client side, the controller classes (business logic, middle layer) and the storage (database and file storage). Figure 2 shows the proposed system architecture of the E exchange secured platform. As per the architecture, the system comprises several components: a webserver, a warehouse and database, and controller classes such as product, user, order, admin, vendor and category. It also utilizes external components such as a payment gateway and SMS/email integration. Initially, the user (buyer) registers or logs in on a mobile or computer device, after which he or she may explore different categories and products. When a customer places an order through the e-commerce system, the order data must be captured and entered into the company software. The order is confirmed to the customer once it has been validated and processed in the business software. For processing, a seller account is required: the seller must manually validate the order against the pick-list business rules (stock availability, item location, etc.) and then create and print a pick list from the information gathered. Once an order has been processed at the warehouse, it is ready to ship; it is then sent to shipping, where it will be fulfilled by a courier, with the company’s regulations defining which options are available. An employee selects the shipment route, acquires package information such as weight, size, destination and fees, prints the shipping labels and contacts the courier for delivery. Aside from that, the system allows users to bid on items: users may make an offer for a certain good, and there is also a chat option. The system has an advertisement module where sellers can advertise their goods or give various offers to buyers/users to attract more business.
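The fulfillment flow described above can be sketched as a simple state machine. The states and transitions below are an illustrative reading of the text, not the platform's actual implementation:

```python
from enum import Enum, auto

class OrderState(Enum):
    PLACED = auto()      # customer enters the order
    VALIDATED = auto()   # seller validates against stock/pick-list rules
    PICKED = auto()      # pick list created and items gathered
    PACKED = auto()      # package info collected, labels printed
    SHIPPED = auto()     # handed to the courier

# Each state advances to exactly one successor, mirroring the flow above.
TRANSITIONS = {
    OrderState.PLACED: OrderState.VALIDATED,
    OrderState.VALIDATED: OrderState.PICKED,
    OrderState.PICKED: OrderState.PACKED,
    OrderState.PACKED: OrderState.SHIPPED,
}

def advance(state: OrderState) -> OrderState:
    """Move an order one step forward, refusing moves past the terminal state."""
    if state not in TRANSITIONS:
        raise ValueError(f"order already in terminal state {state.name}")
    return TRANSITIONS[state]

state = OrderState.PLACED
while state is not OrderState.SHIPPED:
    state = advance(state)
assert state is OrderState.SHIPPED
```

Encoding the flow this way keeps the controller classes honest: an order cannot be shipped before it has been validated, picked and packed.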
The following are several real-time features and benefits that we have considered during the development of SNAP.
Fig. 2 Architecture of E exchange secured platform
• Personalization—Personalization of content entails providing the user with content customized to his or her individual requirements.
• Improved user experience—Because personalized content answers the user’s concerns, the user’s engagement with the shop is enhanced.
• Better selling—It stands to reason that a happy customer is more likely to purchase, so sales increase at a rapid pace.
• Increasing brand recognition—Because of the real-time functionality, the customer memorizes the business quicker and better, resulting in increased brand awareness.
• Up-to-date and fresh information—The content is always up to date and fresh, which users adore.
• Promising user base—All of the aforementioned activities build the user database, and each user represents a prospective client.
• Following trends—While this may seem a small element, the ability to keep up with the times is a vital success component in every business.
• Real-time chat support—Our system offers one-on-one (seller–buyer) real-time chat support.
As a result of incorporating real-time features into our programmes, e-store owners become nearly friends with their consumers: friends who know what these customers want and are ready to deliver it to each of them.
5 Expected Results and Discussion

5.1 Experimental Setup

The software requirements are presented in a hierarchy with stepwise deployment and the important software packages. The proposed work requires PHP, Apache, the Laravel framework and Composer for development. Because we are proposing extra security, a firewall is also included along with AWS.
5.2 Third-Party Services

API/cloud services: Currently, this secured E exchange platform utilizes PayPal and Vonage as third-party APIs to extend its applicability.
PayPal [14]: PayPal supports international transactions and multiple currencies and provides a secured payment gateway. All transactions will be secured through PayPal, which will motivate online transactions and increase fluidity across the Ghanaian population. The motive behind integrating PayPal is that it should support
transactions from one country to another without any additional steps, to improve trust and user experience. Vonage [15]: Vonage is a cheap and reliable platform for SMS integration. Vonage is an American publicly held business cloud communications provider that offers telecommunications services based on voice over Internet Protocol. It is popular among e-commerce developers because of its global performance. Amazon EC2 Cloud [2]: Instances are virtual servers on Amazon’s Elastic Compute Cloud (EC2) that run applications on the Amazon Web Services (AWS) architecture. Users may choose machine images from AWS’s offerings, the user community or the AWS Marketplace. With Amazon EC2, users can deploy as many or as few virtual servers as needed, configure security and networking and manage storage. Amazon EC2 allows users to scale up or down in response to changes in demand or popularity surges, decreasing the need to anticipate traffic.
5.3 Expected Results

Table 2 shows the comparison with similar systems and the expected outcome of the proposed e-commerce platform SNAP. The expected system will be more user friendly and mobile friendly, and it is expected to register 20,000 users in the initial year. It is expected to add more and more locations and to serve more filtered options.
Performance: The platform must be engaging, with minimal delays, so the system’s responses to actions are not significantly delayed. When opening windows/forms, popping error messages and saving settings or sessions, the delay is less than 2 s; there are no delays when opening databases, sorting questions and evaluating; and opening, computing, sorting and posting complete in less than 2 s for > 95% of the files. Furthermore, the delay when connecting to the server depends on the distance between the two systems and their configuration, so there is a good probability that a successful connection will be formed in less than 20 s, enabling good communication. The proposed system will suit the needs of Ghanaians.
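The stated budgets (under 2 s for UI actions, under 20 s for a server connection) can be checked with a small timing harness. A sketch follows, with the limits taken from the text and the operation names chosen for illustration:

```python
import time

# Latency budgets from the performance requirements above, in seconds.
LIMITS = {"ui_action": 2.0, "server_connect": 20.0}

def within_limit(kind: str, func, *args):
    """Run func(*args), returning its result and whether it met the budget."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed <= LIMITS[kind]

# A fast stand-in for "opening a form / sorting and posting a file":
result, ok = within_limit("ui_action", sorted, range(10000, 0, -1))
assert ok and result[0] == 1
```

A harness like this can be dropped into integration tests so that regressions against the 2 s and 20 s budgets are caught before deployment.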
6 Conclusion and Future Scope

Because it understands local needs, the SNAP secured exchange platform can become a major player in the online market. It offers various filters and purchasing options, such as daily deals, bidding, make-an-offer and auctions, to help users find their preferred way of shopping. Its mobile-friendly approach will attract all types of users and make their buying/selling easier. This is expected to improve the economy through a B2B, B2C and C2C online marketplace. It will be a simpler version of Amazon,
Table 2 Comparison of the present work (SNAP) with Amazon [16] and eBay [16]

Features | eBay | Amazon | SNAP
Advanced search option by keyword | Yes | No | Yes
Daily deals option on home screen | Yes | Yes | Yes
Customer service contact link on header | Yes | Yes | Yes
Banners | Single static banner | Static images with category options | Different banners with taglines
Language selection | Yes, multiple languages can be selected | Yes, a language can be selected | No
Sign in and register options | Separate options on home screen | Register button under the sign-in option | Register button under the sign-in option
Sign-in options | Facebook, Google, Apple and eBay account | Amazon account compulsory | Email, Facebook, Google and OTP sign-in options
Search options | Products appear with many filters on the same screen | Products appear with many filters on the same screen | Filters are available
Product display option | Product name, photo, price | Product name, photo, price with details | Detailed name and price with ratings
Specific product page | Full product information, images and data | Full product information, images and product brochure | Detailed information along with ratings
Product sharing option | Yes, enabled with social media links | Yes, available with social links | All options available
Product images | Multiple images can be browsed with an arrow | Every image must be clicked | Every image must be clicked
Image zoom option | Yes | Yes | Yes
Items sold option on product screen | Yes | No | Yes
Top seller option | Yes | Yes | Yes
Items promoted | No | Yes, by Amazon itself | Yes
Product listing | Pagination available | Pagination available | Pagination available
Product appearance after search | Single, one by one | Four products in one row | Four products in one row
Bidding for product | No, price fixed by seller | No, price fixed by seller | Yes, users can bid for a product
Make an offer for product | No | No | Yes
Individual buyer/seller chat option | No, chat with customer care only | No, chat with customer care only | Yes, one-to-one chat option is available
eBay and OLX, and it will achieve the selling and buying process in minimal steps. The platform’s future scope includes more specific pages, a Web site landing page for professional sellers, a mobile application and wide use across the continent. In the future, various Google APIs will be integrated to add location entry and geolocation applications. Server needs will be fulfilled by Amazon cloud services, which provide a wide selection of instance types optimized for different use cases. The proposed idea will also be developed as a mobile app using hybrid technology and React Native, so that the audience can directly capture product photographs and upload them to the platform. The work is expected to expand to other countries of the African continent.
References

1. https://www.ebay.com
2. https://aws.amazon.com/
3. https://www.olx.in/
4. https://www.codica.com/blog/how-to-build-a-multi-vendor-marketplace-website/
5. T. Ow, B. Spaid, C. Wood, S. Ba, Trust and experience in online auctions. J. Organ. Comput. Electron. Commer. 28, 294–314 (2018). https://doi.org/10.1080/10919392.2018.1517478
6. M. Saibharath, K. Padmavasani, M. Packiyaraj, An effective online auction system. IIOAB J. 7, 137–143 (2016)
7. https://paganresearch.io/details/jijing
8. https://jiji.ng/
9. S. Hamid, N.Z. Bawany, K. Zahoor, Assessing ecommerce websites: usability and accessibility study, in 2020 International Conference on Advanced Computer Science and Information Systems (ICACSIS), 17–18 Oct 2020 (IEEE). ISSN: 2330-4588
10. P. Malik, H. Delaney, An emotional intelligence based entropy-management system to accelerate the development of an e-commerce retail organization as a complex adaptive system, in 2020 IEEE International Systems Conference (SysCon), 24 Aug 2020
11. J.G. Pereira, S. Tiwari, S. Ajoy, A survey on filtering techniques for recommendation system, in 2020 IEEE International Symposium on Sustainable Energy, Signal Processing and Cyber Security (iSSSC), 25 Feb 2021
12. A. Yadav, The substance of auditing in project system. J. Inf. Technol. Digit. World 3, 1–11 (2021). https://doi.org/10.36548/jitdw.2021.1.001
13. S. Shakya, S. Smys, Big data analytics for improved risk management and customer segregation in banking applications. J. ISMAC 3(3) (2021)
14. https://www.paypal.com/in/home
15. https://www.vonage.com/
16. https://www.investopedia.com/articles/investing/061215/how-are-ebay-and-amazon-different.asp
An AES-Based Efficient and Valid QR Code for Message Sharing Framework for Steganography

Abhinav Agarwal and Sandeep Malik
Abstract Steganography is one of the methods that may be used to obscure information to facilitate secure information exchange. Generally speaking, it may be defined as a study of communication that focuses mostly on methods of hiding a message via the use of various media such as images, audio and so on. If the message is adequately concealed, it will be difficult for attackers and eavesdroppers to extract it. In this study, we investigate the use of QR codes for message exchange in steganography, together with the AES approach. The performance of the end findings is assessed by comparing image file quality before and after data has been hidden in these files. For maximum difficulty in detecting that data is hidden in these files via the AES encryption and decryption processes, the quality of the original and stego image files must be similar or very close to one another. The robustness of the proposed approach is calculated by PSNR and the error term under a steganographic attack.

Keywords QR code · Steganography · Encryption · Decryption · AES · Payload · Message sharing
1 Introduction

At present, the Internet and communication apps provide several benefits, including real-time data and information sharing and online conferencing [1]. The rapid growth of online applications in a variety of fields, including finance, government and social media, emphasizes the critical importance of secure data transmission channels over the Internet [2]. As a result, these apps are vital for the development of enterprises in a variety of fields. However, data and information security are among

A. Agarwal (B)
Department of Computer Science, Singhania University, Jhunjhunu, India
e-mail: [email protected]

S. Malik
Department of Computer Science, Oriental University, Indore, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_44
A. Agarwal and S. Malik
the most significant difficulties confronting enterprises that must send sensitive or private data via the Internet. According to [3], the number of hackers and online data thieves has risen significantly in recent years. Hackers are mostly interested in stealing sensitive data such as credit card numbers and organization secrets. Thus, corporations are always concerned about the security of data transmission methods. Communication is a fundamental need of every expanding business in today’s society, and everyone desires the confidentiality and security of their communication data. While we utilize several channels in our everyday lives, such as the Internet or telephone, to transmit and share information, they are not completely secure. Two ways may be employed to distribute information covertly: cryptography and steganography. Cryptography is the process of encrypting the communication with the use of an encryption key that is known only to the sender and receiver. Without access to the encryption key, the communication cannot be read by anyone else. However, the transmission of encrypted messages may readily draw the attention of an attacker, and encrypted communication may therefore be intercepted, attacked or brute-force deciphered. Steganography methods have been created to circumvent these inadequacies of cryptographic systems. Steganography is the art and science of transmitting data in a manner such that an observer does not know it is there. To put it another way, steganography hides the existence of data in such a manner that its existence cannot be recognized. Embedding is the term used in steganography to refer to the technique of concealing information content inside any kind of multimedia content such as audio, an image or a video. Both strategies may be used in conjunction to increase the anonymity of data transmission; for example, an image or video may be hidden inside another image or video using the steganography method.
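The combined use of both strategies, encrypting first and then hiding, can be sketched as follows. The Python standard library has no AES, so this sketch substitutes a SHA-256 counter-mode keystream for AES-CTR purely to show the pipeline shape; a real system (including the one in this paper) would use a vetted AES implementation:

```python
import hashlib
from itertools import count

def keystream(key: bytes, n: int) -> bytes:
    """Counter-mode keystream from SHA-256 (a stand-in for AES-CTR here)."""
    out = bytearray()
    for ctr in count():
        out.extend(hashlib.sha256(key + ctr.to_bytes(8, "big")).digest())
        if len(out) >= n:
            return bytes(out[:n])

def encrypt(key: bytes, message: bytes) -> bytes:
    """XOR the message with the keystream; the same call also decrypts."""
    return bytes(m ^ k for m, k in zip(message, keystream(key, len(message))))

key = b"shared secret key"
cipher = encrypt(key, b"meet at dawn")
assert encrypt(key, cipher) == b"meet at dawn"  # stream ciphers are symmetric
assert cipher != b"meet at dawn"
```

The ciphertext produced here is what would then be embedded in a cover image, so that even a detected payload remains unreadable without the key.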
The practice of steganography is described by two Greek words: steganos, which means “protected or shielded,” and graphein, which means “writing.” Steganography is the storing of secret information inside computer files. With digital steganography, a message can be hidden by embedding steganographic code in an electronic carrier such as a document, an image file, software or a protocol. Because of their large size, media files are ideal for steganographic transmission. For example, the sender may start with a benign image file and then adjust every 100th pixel’s color so that it matches the color assigned to a letter of the alphabet, a change so minute that it would go unnoticed unless someone was paying close attention. Depending on the kind of cover object, a variety of appropriate steganographic methods are used to provide security, as shown in Fig. 1.
• Image steganography: This is steganography in which the cover object is an image. Usually, pixel intensities are employed to conceal information in this approach.
• Audio steganography: When audio is used to conceal information, this is referred to as audio steganography. Because of the widespread use of voice over IP (VoIP), audio has evolved into a significant carrier medium. Audio steganography makes use
Fig. 1 Digital medium to achieve steganography [6]
of digital audio codecs and formats such as MIDI, WAVE, MPEG and AVI to conceal data in the audio signal.
• Video steganography: This is the concealment of any type of file or information inside a digital video file. Video (a composite of images) is utilized to hide sensitive information from view. Generally speaking, discrete cosine transform (DCT) coefficients are slightly altered (e.g., 8.667 to 9) to conceal data in each of the video’s frames in a way that is not apparent to the human eye. Video formats such as H.264, MP4 and AVI are employed.
• Network steganography: When a cover object such as a network protocol (TCP, UDP, ICMP or IP) is used as a carrier, this is referred to as network protocol steganography. The unused header bits of TCP/IP fields may be exploited for steganography via covert channels in the OSI network layer model [4].
• Text steganography: Here, a general approach such as the number of tabs, capital letters and white spaces, similar to Morse code, is utilized to hide information [5].
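The image branch above, with information concealed in pixel intensities, can be sketched in a few lines of least-significant-bit (LSB) embedding over raw 8-bit pixel values. This is a toy illustration of the principle, not a hardened scheme:

```python
def embed_lsb(pixels: bytearray, message: bytes) -> bytearray:
    """Hide `message` in the least significant bits of `pixels`.

    A 16-bit big-endian length header is stored first so the extractor
    knows how many bytes to recover."""
    payload = len(message).to_bytes(2, "big") + message
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for this payload")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the LSB
    return stego

def extract_lsb(pixels: bytearray) -> bytes:
    """Recover the hidden message by reading LSBs back, MSB-first per byte."""
    def read_bytes(start_bit: int, n: int) -> bytes:
        out = bytearray()
        for b in range(n):
            value = 0
            for i in range(8):
                value = (value << 1) | (pixels[start_bit + b * 8 + i] & 1)
            out.append(value)
        return bytes(out)
    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length)

cover = bytearray(range(256)) * 4  # stand-in for raw grayscale pixel data
stego = embed_lsb(cover, b"secret")
assert extract_lsb(stego) == b"secret"
# Each pixel changes by at most one intensity level:
assert max(abs(a - b) for a, b in zip(cover, stego)) <= 1
```

Because each cover pixel changes by at most one intensity level, the stego image is visually indistinguishable from the original, which is exactly the property the quality comparisons later in the paper measure.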
2 Related Work

One work presents an image steganographic technique for embedding an encrypted secret message into picture data through a quick response (QR) code. The QR code is embedded in the discrete wavelet transform (DWT) domain, with the embedding process further safeguarded by the advanced encryption standard (AES) encryption method. Additionally, the encryption breaks the QR code’s regular structure, which makes the process more secure. The purpose of that work is to build an image steganographic approach that has a high degree of security while also having a high level of non-perceptibility. By compressing the QR code before
embedding, the method’s security and capacity were increased. The efficiency of the proposed technique was quantified using the peak signal-to-noise ratio (PSNR), and the findings were contrasted with those obtained using various steganographic tools [7]. Another article presents a comprehensive analysis of asymmetric encryption techniques using a variety of parameters. The primary purpose was to ensure information-sharing security through a strategic combination of two security technologies, namely steganography and cryptography. RSA was shown to be the top performer in flexibility, security and encryption efficiency throughout this analysis; while the other methods were equally capable, the majority of them require a trade-off between memory consumption and encryption efficiency [8]. In another study, the major aim was to minimize the amount of physical storage space necessary on several storage media, as well as the time it takes to transfer data over the Internet, while guaranteeing that the data is securely encrypted and concealed from attackers. Two strategies are implemented: one that results in data loss (lossy) and one that does not (lossless). The suggested work presents a hybrid data compression approach that increases the amount of input data that can be encrypted with the Rivest–Shamir–Adleman (RSA) encryption method, hence boosting security, and that can be used to build lossy and lossless compression-based steganography techniques. This approach may be utilized to reduce the quantity of data communicated, allowing for faster transmission over a slow Internet connection, or to save space on various storage devices. Huffman coding is used to minimize the size of the plain text, while the cover picture is compressed using DWT, a lossy compression approach that lowers its size.
After that, the encrypted data is inserted into the compressed cover picture using the least significant bit (LSB) algorithm. The following criteria were used to evaluate the system: percentage savings, compression duration, bit rate per pixel, PSNR, compression ratio, compression speed, mean squared error and the structural similarity index (SSI). When contrasted with other systems that utilize a similar technique, this system demonstrates superior performance [9]. Another letter describes a novel two-description image coding technique based on steganography. To be more precise, the authors suggest that each description be formed by embedding (hiding) coarsely coded parts inside finely coded sections using an LSB steganographic technique. Thus, if the embedding process is correctly planned, the bit budget for the coarsely coded half of each description may be reduced, while the reconstruction of the finely coded section suffers little degradation; experimental findings verify the method’s efficacy [10]. Visual cryptography (VC) is gaining traction in security applications. While visual cryptography is intriguing, hackers may readily get access to the visual components. VC may be coupled with steganography to strengthen the security of concealed communications, which is the primary focus of one work. Additionally, a chaotic function is used there to create the visual components (shares) to improve visual cryptography. In other words, the sender transmits a single share, and the receiver creates a related share, which is then combined with the received share to disclose
a hidden message. The proposed system’s histogram uniformity, key sensitivity, correlation coefficient and keyspace are all evaluated and found to be adequate; the comparison findings illustrate the proposed techniques’ superiority over advanced procedures in terms of these assessment criteria [11]. By increasing the detection error of commonly used optimum detectors within the statistical model in use, other authors produce a novel Gaussian embedding model. Using this approach, which they also applied to cost-based steganography, a universal embedding technique that surpasses current cost- and statistical-model-based tactics has been developed. Because a continuous hidden message is assumed, this approach and its solution apply to every embedding case. Following that, a closed-form detection error is obtained for an image steganography model and extended to batch steganography. They create AdaBIM, an Adaptive Batch size Image Merging steganographer, and show theoretically that it surpasses the advanced batch steganography technique, in addition to establishing practically that it is better [12]. Another study examines several steganography approaches, the artifacts made by steganography and the forensic investigation tools used to identify and extract steganography from mobile devices. On the two major mobile device platforms, Android and Apple, a variety of steganography methods are employed to produce various artifacts. Additionally, forensic investigative technologies are used to discover and, if necessary, uncover hidden data; finally, a set of policies and procedures for mobile forensic investigation is produced [13]. A further article covers many of the researchers’ proposed strategies. The following three key image steganography elements must be taken into consideration when using JPEG steganography: embedding capacity, resilience and undetectability.
However, because of its detectability, the first recommended algorithm, Jsteg, was deemed inadequate for the task at hand. Statistical restoration algorithms were developed as a result of this work, and they focused on restoring statistical parameters to maintain undetectability. Adaptive and robust steganography have been proposed as ways to improve the embedding capacity and resilience of stego pictures. Adaptive steganography proposes a variety of strategies for identifying places in a picture where data may be buried without being noticed and subsequently embedding the data there. Robust steganography takes into account the fact that jpeg pictures are susceptible not just to steganalysis but also to jpeg recompression: when recompressing or cropping jpeg images, hidden data may be lost. Natural steganography is gaining popularity as well, although more work needs to be done in this field [14]. For the aim of this research, we will extend the statistical framework for steganography problems, in which cover and stego signals are represented by independently distributed Gaussian random variables. Following that, a unique Gaussian embedding model is developed that minimizes the detection error of three ideal hypothesis-testing detectors simultaneously. Given that the suggested Gaussian embedding model is capable of dealing with both pixel embedding costs and residual variances, it may be applied to all existing image steganography approaches, resulting in a universal embedding methodology that substantially increases their security. In
addition, the closed-form detection error as a function of the payload is estimated in the image steganography model and then applied to batch steganography. Due to the availability of the closed-form detection error, they were able to study the relationship between batch size and steganography security. As a consequence, they offer a novel batching approach, AdaBIM, which is demonstrated to surpass the state-of-the-art both theoretically and experimentally [15]. Data security is also addressed in fields other than image steganography, such as digital watermarking techniques like LW-CNN [16], cloud security [17, 18] and authentication systems [19, 20] (Tables 1 and 2).
3 Proposed System

The proposed method is a sophisticated combination of several steganographic techniques. This effort focused on strengthening security requirements through the usage of QR codes. We use the LSB substitution approach for the steganography and then the AES method for the encryption of the message. AES supports three alternative key lengths: 128 bits, 192 bits and 256 bits. AES always operates on 128-bit data blocks; the number of processing rounds varies with the key size (10, 12 or 14 rounds for 128-, 192- and 256-bit keys, respectively). As an iterative algorithm, the term “round” refers to a single complete iteration of the method. The total number of rounds Nr is determined by the length of the key Nk. The 128-bit data block is split into 16 bytes; these bytes are mapped to a 4 × 4 array known as the state, and it is on this state that all of the AES operations are conducted. The proposed system is divided into two parts: encryption and decryption. Figure 2 illustrates the encryption flowchart for the QR code. Figure 3 depicts the decryption flowchart. The encryption in the QR code proceeds as follows:
• Input plain text as the secret message.
• Encrypt it using the AES technique with a 256-bit key.
• Generate a QR code from the encoded message.
• Embed the QR code into a cover image using the LSB algorithm.
• Obtain the stego image.
• Any QR code software may be applied to produce QR codes. The message is self-contained and is not associated with the payload. It might be a benign message, like a food label, or a message intended to deceive the adversary.
• The encryption technique’s goal is to randomize the payload such that the pixel values are divided 50/50 between zeros and ones. Consequently, the payload appears to be random noise rather than crucial information. This objective may be accomplished using any encryption technique (symmetric or asymmetric). However, symmetric encryption is preferable due to its space-saving properties. To send a secret key to authorized recipients, a key exchange protocol might
Table 1 Comparative related work

| Author | Problem | Method | Strength | Weakness | Outcome |
|---|---|---|---|---|---|
| Hussain et al. (2021) | Information hiding | Least significant bit (LSB) and pixel value differencing (PVD) | Possibility of avoiding the issue of the error block | To retrieve the secret bits, only three LSB extractions would be used | Increases embedding capacity by 6–7% while maintaining appropriate visual quality [21] |
| Karthikeyan et al. (2021) | Higher level of redundancy in digital images | LSB algorithm | They maximize the image quality | The image’s original quality should be kept to a minimum level | LSB technique was used to provide an additional layer of access control for obtaining secret messages contained inside the green component of cover pictures [22] |
| Alajmi et al. (2020) | Protecting the payload from being intercepted by an attacker | Quick response (QR) code | Takes benefit of this by producing a message that provides the attacker with false information | The container should only be used to conceal the payload | Tests demonstrate that the codes are almost indistinguishable from standard QR codes [23] |
| Pandey (2020) | Security concern | Bitmask-oriented genetic algorithm (BMOGA) | Lessens the amount of data that is duplicated in medical testing | Only provides security | According to the test results, the suggested approach is capable of encrypting and transmitting safe data [24] |
| Liu and Chen (2020) | Low-security level | LSB steganography | Enhances the performance of picture reconstruction or minimizes the number of measurements | Only extends the GI into the steganography region | The outcome of our experiments shows that the optical ghost cryptography and steganography technique is completely effective [25] |
| Sahu and Swain (2020) | Less embedding efficiency | RIH methods | Stronger robustness against various stego-attacks | Loss of a single bit of information is not acceptable | The potential antisteganalysis ability of the suggested approach is found to be superior [26] |
be utilized. When asymmetric encryption is utilized, a public key is used to encrypt data [35].
• A variety of data scrambling algorithms may be used. When scrambling QR codes, it is preferable to employ algorithms that operate on square matrices, such as the baker map technique. We should note that scrambling methods employ symmetric keys, meaning that the key used to scramble the payload is identical to the key used to unscramble it. Two secret keys are therefore necessary when utilizing symmetric encryption: one for the encryption algorithm and another for the scrambling algorithm. We may, on the other hand, use the same secret key for both by using the message as a nonce (salt), which does not need to be kept secret at all [36].

Decryption in the QR code proceeds as follows:
• Input the stego image.
• Extract the QR code image using reverse LSB.
• Decode the QR code and obtain the encrypted text.
• Decrypt using the AES 256-bit key.
• Obtain the plain text.

The produced QR codes may deceive the attacker if the following circumstances are satisfied:
(1) The created QR codes are completely interchangeable with standard QR codes. That is, created QR codes may be read accurately whether they are sent electronically or printed on paper and in magazines.
(2) QR codes created by this method are resistant to steganalysis attacks.
(3) Produced QR codes do not look suspicious to the naked eye.

The first criterion is verifiable using a smartphone equipped with a camera and QR-code reader software: all of the QR codes contained inside the document should be read correctly. Data retrieval from AES-256 ciphertext is difficult because, using brute-force approaches, AES-256 is almost indecipherable. With today’s processing capability, a 56-bit DES key can be cracked in a matter of hours, whereas an AES key would take billions of years; hackers would be foolish to even try this type of attack. Breaking any cipher is only a matter of how long it takes, and the tricky part about encryption is the key and its security: 256-bit encryption is simply harder to break than 128-bit.
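The encrypt, embed, extract, decrypt pipeline above can be sketched end to end. This is a minimal illustration under stated assumptions, not the paper’s implementation: a keyed XOR stream built from SHA-256 stands in for AES-256 (a real system would use an actual AES library), a flat list of greyscale pixel bytes stands in for the cover image, and the QR-generation step is omitted. All function names are ours.

```python
import hashlib

def toy_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Keyed XOR keystream built from SHA-256 -- a stand-in for AES-256.
    NOT secure; for illustration of the pipeline only."""
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))

toy_decrypt = toy_encrypt  # an XOR stream cipher is its own inverse

def lsb_embed(pixels, payload: bytes):
    """Write the payload bit stream into the least significant bit of each pixel."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    assert len(bits) <= len(pixels), "cover image too small for the payload"
    stego = pixels[:]
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # clear the LSB, then set the payload bit
    return stego

def lsb_extract(pixels, n_bytes: int) -> bytes:
    """Read n_bytes of payload back out of the pixel LSBs (MSB-first)."""
    out = bytearray()
    for b in range(n_bytes):
        byte = 0
        for i in range(8):
            byte = (byte << 1) | (pixels[b * 8 + i] & 1)
        out.append(byte)
    return bytes(out)

# Round trip: encrypt -> embed -> extract -> decrypt
key = b"shared-secret-key"
cipher = toy_encrypt(b"meet at dawn", key)
stego = lsb_embed(list(range(256)), cipher)   # 16 x 16 dummy greyscale cover
assert toy_decrypt(lsb_extract(stego, len(cipher)), key) == b"meet at dawn"
```

Because only the LSB plane changes, each stego pixel differs from its cover pixel by at most 1, which is what keeps the embedding visually imperceptible.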
Table 2 Comparison of several ways for hiding data

| Algorithm | Robustness | Complexity | Conclusion |
|---|---|---|---|
| LSB [27] | Low | Low | Because LSB is easily manipulated, it does not provide a high level of security. Watermark bits may easily take the place of the LSB |
| Grey-level modification [28] | Moderate | Low | GLM is well-suited to massive networks and systems. GLM can be utilized to secure data and servers, as well as to protect data without the involvement of third-party trustees |
| Pixel value differencing (PVD) [29] | Low | High | In terms of watermark picture quality and imperceptibility, PVD outperforms LSB |
| Quantization index modulation (QIM) [30] | High versus AWGN (the message is embedded with inserted Gaussian white noise) | Low | When it comes to additive noise corruption, QIM approaches outperform spread spectrum (SS) |
| Multiple base notational system (MBNS) [31] | Low | – | PVD is inferior to MBNS. When it comes to PSNR and invisibility, it is a no-brainer |
| Prediction-based steganography [32] | Low | Low | According to Wu et al., prediction-based steganography is ideal for enhancing payload capacity, as it leads to roughly 99.85% embedding capacity |
| Discrete cosine transformation (DCT) [33] | High, although cropping has an influence | Low | DCT is the most robust lossy compression technique, although it is not resistant to image cropping |
| Discrete wavelet transformation (DWT) [34] | High | Moderate, yet capable of reaching a high level | DWT is an approach that is both efficient and versatile |
Fig. 2 Flowchart of encryption
3.1 AES Algorithm

The National Institute of Standards and Technology (NIST) announced the AES algorithm, one of the block cipher encryption methods, in 2000. The primary goal of this effort was to replace the DES algorithm once it was discovered to have several vulnerabilities. Experts in encryption and data security from across the globe convened at NIST to introduce a new block cipher algorithm that can encrypt and
Fig. 3 Flowchart of decryption
decode data with a powerful and sophisticated structure. Multiple organizations from throughout the world submitted their algorithms for evaluation, and NIST selected five finalist techniques for assessment. After applying several criteria and security aspects, they chose the algorithm submitted by two Belgian cryptographers, Vincent Rijmen and Joan Daemen. The AES technique was originally known as the Rijndael algorithm; however, that name never became a widely used moniker for this technology, and it is instead recognized around the world as the AES technique [37].
3.2 Encryption Process

Encryption is a widely used method that plays a critical role in protecting data from unauthorized access. The AES algorithm encrypts data using a specific structure to give the highest level of security: it applies a number of rounds, with each round consisting of four sub-processes. The processes below are utilized to encrypt a 128-bit block. During the initial step of each round, the SubBytes transformation takes place: a nonlinear S-box is used to substitute each byte of the state with another byte [38]. Shannon’s principles of diffusion and confusion for cryptographic algorithm design have been shown to play a critical role in obtaining much greater levels of security. ShiftRows is the step that follows SubBytes in its effect on the state. The objective of this phase is to cyclically shift the bytes of each row of the state to the left; row number 0 is retained as-is, with no permutation performed on it. Row 1 is shifted to the left by one byte, row 2 by two bytes and row 3 by three bytes [39].
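The ShiftRows step described above fits in a few lines. A minimal sketch (the 4 × 4 state is represented as a list of four rows; this is the standard AES ShiftRows, not code from the paper):

```python
def shift_rows(state):
    """AES ShiftRows: rotate row r of the 4 x 4 byte state left by r positions.
    Row 0 is left untouched; rows 1, 2 and 3 move left by 1, 2 and 3 bytes."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
assert shift_rows(state)[0] == [0, 1, 2, 3]      # row 0 untouched
assert shift_rows(state)[1] == [5, 6, 7, 4]      # row 1 rotated left by one
assert shift_rows(state)[3] == [15, 12, 13, 14]  # row 3 rotated left by three
```

Together with SubBytes, this row rotation is what spreads (diffuses) each input byte across the state over successive rounds.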
4 Results and Discussion

This section discusses the evaluated results achieved by the proposed approach. The performance and robustness of the system against attack are measured through three parameters, namely PSNR, root mean squared error (RMSE) and mean squared error (MSE).

4.1 Parameters Evaluation

PSNR: the ratio between the maximum possible signal power and the power of the corrupting noise that degrades the representation of the picture. The higher the value, the better the image quality.
MSE: referred to as a “figure of merit,” it reflects the degree to which two pictures are similar to or different from one another. The lower the MSE value of a picture, the higher the quality of the image and the less distortion from the original image.
RMSE: the difference between the source image and the segmented image is measured using the RMSE. The lower the value of RMSE, the better the segmentation performance.
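The three metrics can be computed directly. A minimal sketch for flat greyscale pixel sequences (function names are ours, not from the paper):

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized flat pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def rmse(a, b):
    """Root mean squared error."""
    return math.sqrt(mse(a, b))

def psnr(a, b, max_val=255):
    """Peak signal-to-noise ratio in dB (max_val = 255 for 8-bit images)."""
    m = mse(a, b)
    return float("inf") if m == 0 else 10 * math.log10(max_val ** 2 / m)
```

For 8-bit images an MSE of 0.50 gives 10·log10(255²/0.50) ≈ 51.14 dB and an RMSE of √0.50 ≈ 0.707, consistent with the values reported in Table 3.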
4.2 Outcomes

This section displays and discusses the steganographic results after applying the secure AES encryption algorithm. Figure 4 depicts the picture that is applied as a cover image to conceal a secret message; it was chosen at random for input. This picture was obtained from a Web-based source and has a resolution of 2400 × 1600 pixels. This cover picture is the original image, and there is no secret information hidden inside it. Figure 5 depicts the QR code. The number of characters a QR code can encode is affected not only by the data type but also by the QR code version; different sorts of data likewise affect the maximum QR code storage capacity. There are two types of QR codes: electronic and printed. The study in this work is relevant to both. For example, if you scan an electronic QR code, you can read its message, and if you scan a printed QR code, you can extract its payload.
Fig. 4 Input as cover image
Fig. 5 Secret text embedding QR code
Fig. 6 Stego image
Table 3 Result of AES-LSB-based approach

| Image | Type | PSNR (dB) | MSE | RMSE |
|---|---|---|---|---|
| – | .jpg | 51.138 | 0.50 | 0.707 |
Steganalysis applies to electronic as well as printed QR codes, both of which can be read properly. Furthermore, we generate QR codes with various message-to-payload ratios in order to evaluate them. Figure 6 shows the stego picture after embedding the QR code into the cover image. This stego picture is the result of the embedding process. The hidden information might be found either in the pixel values or in the optimally determined coefficients. The payload is not accessible in this QR code, regardless of the size of the message hidden behind the cover message, and the message is kept secret. Table 3 shows the performance parameters for the proposed AES-LSB-based approach, which is a combination of steganography and cryptography. From the findings in Table 3, we may conclude that the jpg format is the best format for supporting this combination, meeting both the requirement of minimal RMSE and MSE and the criterion of maximal PSNR.
4.3 Robustness Against Attacks In this part, we examine the resilience of the proposed system in the face of two steganographic attacks: the regular-singular (RS) steganalysis as well as the pixel
difference histogram (PDH). The RS steganalysis method works by determining how smooth the transitions among the image (container) pixels are. Whenever this analysis is applied to LSB embedding, the difference between the regular group set and the singular group set is very near to zero. Following that, we explain why the suggested system cannot be detected through RS analysis: QR codes are binary pictures, and the payload is embedded in the QR code using the xor operation, which is equivalent to addition modulo 2. An investigation of PDH is carried out to assess the robustness of the proposed system. The PDH curve plots the difference between two consecutive pixels against the frequency with which these difference values are encountered in the data set. The values of the pixels in the QR code are either 1 or 0; as a result, the difference value may be any of −1, 0 or 1. We acquire the PDH curve between adjacent pixels of the standard QR code as well as the PDH curve between adjacent pixels of the produced QR code and compare the two. The closer the two curves are together, the more difficult it is to recognize the payload in the QR code that has been formed. For AES keys shorter than the requisite number of bits, the key must be enlarged using the zero-padding approach to bring it to the correct length before it is used. Conversely, if the needed key length is greater than the number of supplied bits, as in the case of AES-256, the key expansion technique is employed to increase the length of the key. Here, AES-256 was selected to perform encryption because it has a key length of 256 bits, supports the greatest bit size and is almost impenetrable by brute force given current computing power, making it the strongest and most secure encryption standard accessible. Hence, it can be said that this combination is more robust in providing data security through its key formulation.
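The PDH described here is simple to compute. A sketch for a flattened pixel sequence (for a binary QR image the only possible differences are −1, 0 and +1; the function name is ours):

```python
from collections import Counter

def pdh(pixels):
    """Pixel-difference histogram: count the differences between every pair of
    consecutive pixels. For a binary (0/1) QR image the only possible
    differences are -1, 0 and +1."""
    return Counter(b - a for a, b in zip(pixels, pixels[1:]))

qr_row = [0, 0, 1, 1, 0, 1]
assert pdh(qr_row) == Counter({0: 2, 1: 2, -1: 1})
```

Comparing the PDH of a standard QR code with that of a payload-carrying one then amounts to comparing two three-bin histograms; the closer they are, the harder detection becomes.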
5 Conclusion

This study’s results suggest a steganographic system that makes use of QR codes as containers to hide content: the payload is incorporated into a QR code rather than embedded directly within an image. The generated QR code can be read by any QR reader, since it contains a valid message. The usage of the Internet and other networks is rising rapidly, and a vast volume of digital data is transmitted between users daily. Some of this information is sensitive and must be safeguarded from unauthorized access. The use of encryption technologies is essential in safeguarding original data from illegal access and modification. Various algorithms are available for encrypting data; the AES technique is an efficient one that is extensively supported and implemented in both software and hardware, and it is capable of dealing with a variety of key sizes, up to 256 bits, operating as a 128-bit block cipher. From the results, it has been found that the PSNR is good for data security and the error rate is minimized. An analysis of the AES algorithm’s performance at encrypting
data under various settings is presented in this work, which highlights some of the algorithm’s most important aspects. In the future, AES may be reworked to handle real-time data from a wide range of sources (e.g., e-commerce sites, news sites and blogs). Furthermore, its effectiveness may be assessed in a variety of other application fields, such as the assessment of a company’s reputation and the study of competitor products.
6 Future Scope

After reading all of the aforementioned articles, it is obvious that there are many ways to improve the security of QR codes, because these codes can conceal a wealth of information that is difficult to decipher. In the future, we may devise a scheme that incorporates both cryptographic and steganographic methods. We may also use color coding and compression methods to improve the storage capacity of these codes. The approach can also be extended to cloud applications where secure message sharing is needed. Users must have security for their data while storing it on the cloud; as a result, we need models that will improve the security of personal information. Image steganography is a method of safeguarding data from unwanted access: it enables users to hide sensitive information under a cover image.
References
1. L. Guo, B. Yan, Y. Shen, Study on secure system architecture of IoT. Inf. Secur. Commun. Privacy (2010)
2. S. Zmudzinski, B. Munir, M. Steinebach, Digital audio authentication by robust feature embedding. IS&T/SPIE Electron. Imaging 83030I-1–83030I-7 (2012)
3. T. Kothmayr, C. Schmitt, W. Hu, M. Brünig, G. Carle, DTLS based security and two-way authentication for the Internet of Things. Ad Hoc Netw. 11(8), 2710–2723 (2013)
4. T. Handel, M. Sandford, Hiding data in the OSI network model, in Proceedings of the 1st International Workshop on Information Hiding (1996)
5. N. Johnson, S. Jajodia, Exploring steganography: seeing the unseen. IEEE Comput. 26–34 (1998)
6. I. Shafi, M. Noman, M. Gohar, A. Ahmad, S. Din, S. Ahmad, An adaptive hybrid fuzzy-wavelet approach for image steganography using bit reduction and pixel adjustment. Soft Comput. 22 (2018)
7. V. Hajduk, M. Broda, O. Kováč, D. Levický, Image steganography with using QR code and cryptography, in 2016 26th International Conference Radioelektronika (RADIOELEKTRONIKA) (2016), pp. 350–353
8. A. Mendhe, D.K. Gupta, K.P. Sharma, Secure QR-code based message sharing system using cryptography and steganography, in 2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC) (2018), pp. 188–191
9. O.F.A. Wahab, A.A.M. Khalaf, A.I. Hussein, H.F.A. Hamed, Hiding data using efficient combination of RSA cryptography, and compression steganography techniques. IEEE Access 9, 31805–31815 (2021)
10. Z. Zhang, C. Zhu, Y. Zhao, Two-description image coding with steganography. IEEE Signal Process. Lett. 15, 887–890 (2008)
11. M. Mostaghim, R. Boostani, CVC: chaotic visual cryptography to enhance steganography, in 2014 11th International ISC Conference on Information Security and Cryptology (2014), pp. 44–48
12. M. Sharifzadeh, M. Aloraini, D. Schonfeld, Adaptive batch size image merging steganography and quantized Gaussian image steganography. IEEE Trans. Inf. Forensics Secur. 15, 867–879 (2020)
13. C. Burrows, P.B. Zadeh, A mobile forensic investigation into steganography, in 2016 International Conference on Cyber Security and Protection of Digital Services (Cyber Security) (2016), pp. 1–2
14. D. Watni, S. Chawla, A comparative evaluation of jpeg steganography, in 2019 5th International Conference on Signal Processing, Computing and Control (ISPCC) (2019), pp. 36–40
15. M. Sharifzadeh, M. Aloraini, D. Schonfeld, Adaptive batch size image merging steganography and quantized Gaussian image steganography. IEEE Trans. Inf. Forensics Secur. 1 (2019)
16. R. Dhaya, Light weight CNN based robust image watermarking scheme for security. J. Inf. Technol. Digit. World 03(02), 118–132 (2021). https://doi.org/10.36548/jitdw.2021.2.005
17. S.S. Kirubakaran, Study of security mechanisms to create a secure cloud in a virtual environment with the support of cloud service providers. J. Trends Comput. Sci. Smart Technol. (TCSST) 02(03), 148–154 (2020). https://doi.org/10.36548/jtcsst.2020.3.004
18. J.S. Manoharan, A novel user layer cloud security model based on chaotic Arnold transformation using fingerprint biometric traits. J. Innov. Image Process. (JIIP) 03(01), 36–51 (2021). https://doi.org/10.36548/jiip.2021.1.004
19. C.V. Joe, J.S. Raj, Deniable authentication encryption for privacy protection using blockchain. J. Artif. Intell. Capsule Netw. 03(03), 259–271 (2021). https://doi.org/10.36548/jaicn.2021.3.008
20. M. Satheesh, M. Deepika, Implementation of multifactor authentication using optimistic fair exchange. J. Ubiquitous Comput. Commun. Technol. (UCCT) 02(02), 70–78 (2020). https://doi.org/10.36548/jucct.2020.2.002
21. M. Hussain, Q. Riaz, S. Saleem, A. Ghafoor, K.-H. Jung, Enhanced adaptive data hiding method using LSB and pixel value differencing. Multimed. Tools Appl. 80(13), 20381–20401 (2021). https://doi.org/10.1007/s11042-021-10652-2
22. N. Karthikeyan, K. Kousalya, N. Jayapandian, G. Mahalakshmi, Assessment of composite materials on encrypted secret message in image steganography using RSA algorithm. Mater. Today Proc. (2021). https://doi.org/10.1016/j.matpr.2021.04.260
23. M. Alajmi, I. Elashry, H.S. El-Sayed, A. Farag, S. Osama, Steganography of encrypted messages inside valid QR codes. IEEE Access 8(1), 27861–27873 (2020). https://doi.org/10.1109/ACCESS.2020.2971984
24. H.M. Pandey, Secure medical data transmission using a fusion of bit mask oriented genetic algorithm, encryption and steganography. Future Gener. Comput. Syst. (2020). https://doi.org/10.1016/j.future.2020.04.034
25. H. Liu, W. Chen, Optical ghost cryptography and steganography. Opt. Lasers Eng. 130(2), 106094 (2020). https://doi.org/10.1016/j.optlaseng.2020.106094
26. A.K. Sahu, G. Swain, Reversible image steganography using dual-layer LSB matching. Sens. Imaging 21(1), 1 (2020). https://doi.org/10.1007/s11220-019-0262-y
27. N.F. Johnson, S. Jajodia, Exploring steganography: seeing the unseen. Computer 31(2) (1998)
28. V.M. Potdar, E. Chang, Grey level modification steganography for secret communication, in Proceeding of 2nd IEEE International Conference on Industrial Informatics (2004), pp. 223–228
29. D.C. Wu, W.H. Tsai, A steganographic method for images by pixel-value differencing. Pattern Recogn. Lett. 24(9–10), 1613–1626 (2003)
30. B. Chen, G.W. Wornell, Quantization index modulation: a class of provably good methods for digital watermarking and information embedding. IEEE Trans. Inf. Theory 47(4), 1423–1443 (2001)
31. X. Zhang, S. Wang, Steganography using multiple-base notational system and human vision sensitivity. IEEE Signal Process. Lett. 12(1), 67–70 (2005)
32. S. Singh, V.K. Attri, State-of-the-art review on steganographic techniques. Int. J. Signal Process. Image Process. Pattern Recogn. 8(7), 161–170 (2015)
33. O.S. Faragallah, Efficient video watermarking based on singular value decomposition in the discrete wavelet transform domain. AEU-Int. J. Electron. Commun. 67(3), 189–196 (2013)
34. R.K. Nayak, P.A. Saxena, D.M. Manoria, A survey & applications of various watermarking & encryption techniques. Int. J. Sci. Eng. Res. 6(5), 1532–1537 (2015)
35. R. Bhanot, R. Hans, A review and comparative analysis of various encryption algorithms. Int. J. Secur. Appl. 9(4), 289–306 (2015)
36. S. Gueron, S. Johnson, J. Walker, SHA-512/256, in Proceedings of 8th International Conference on Information Technology—New Generations (2011), pp. 354–358
37. A. Berent, Advanced encryption standard by example (2013). Document available at URL http://www.networkdls.com/Articles/AESbyExample
38. R. Jain, R. Jejurkar, S. Chopade, S. Vaidya, M. Sanap, AES algorithm using 512 bit key implementation for secure communication. Int. J. Innov. Res. Comput. Commun. Eng. 2(3) (2014)
39. N. Selmane, S. Guilley, J.L. Danger, Practical setup time violation attacks on AES, in Seventh European Dependable Computing Conference, 2008. EDCC 2008 (IEEE, 2008), pp. 91–96
Categorization of Cardiac Arrhythmia from ECG Waveform by Using Super Vector Regression Method S. T. Sanamdikar, N. M. Karajanagi, K. H. Kowdiki, and S. B. Kamble
Abstract Various heart disorders affect a large number of people around the world today. As a result, understanding the properties of the ECG waveform is crucial to identifying a variety of heart conditions. The ECG is an investigation that determines the strength of the electrical impulses of the heart. PQRST waves are the collection of waves that make up a cardiac cycle in an ECG waveform. In the attribute extraction of ECG waveforms, the magnitudes and time periods of the PQRST impulses are estimated for the learning of ECG waveforms. The values of the PQRST segment’s amplitudes and time intervals can be utilized to determine the proper operation of the human heart. In recent years, the bulk of methodologies and studies for analysing the ECG waveform have been developed. In the bulk of these systems, fuzzy logic methods, artificial neural networks, genetic algorithms, support vector machines, the wavelet transform and other waveform-examining techniques are used. SVM, ANN, neural mode decomposition and support vector regression approaches are compared in this work; the ISVR approach outperforms the other methods. Each of the methods and strategies outlined above, however, has its own set of advantages and disadvantages. In this article, the wavelet transform Db4 is utilized to extract various properties from an ECG waveform. The proposed system is designed with MATLAB software. The verification of arrhythmia is presented in this study utilising the manually annotated MIT-BIH dataset, which was used to validate the produced results. Keywords Cardiac arrhythmia · QRS complex · Median filter · Electrocardiograph · Wavelet transform Db4
S. T. Sanamdikar (B) Instrumentation Department, PDEA’s College of Engineering, Manjari, Pune, India e-mail: [email protected] N. M. Karajanagi · K. H. Kowdiki Instrumentation Department, Government College of Engineering and Research, Awasari Khurd, Pune, India S. B. Kamble Electronics Department, PDEA’s College of Engineering, Manjari, Pune, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_45
1 Introduction

The electrocardiogram (ECG) is a diagnostic technique that measures the electrical activity of the heart using skin electrodes. The shape and rate of a human heartbeat show its cardiac health. It is a noninvasive technology in which a waveform recorded on the human body’s surface is utilized to diagnose cardiac problems. Any abnormalities in heart rate or rhythm, as well as changes in the morphological pattern, which may be detected by ECG waveform analysis, are symptoms of cardiac arrhythmia. The P-QRS-T wave’s amplitude and duration reveal important details regarding the nature of heart illness. The movement of Na+ and K+ ions in the blood causes the electrical wave. The ECG waveform provides the following information about a human heart:
• The position of the heart and the size of its chambers.
• The origin and spread of impulses.
• Disturbances in conduction and cardiac rhythm.
• The extent and location of myocardial ischemia.
• Electrolyte concentration changes.
• The effects of drugs on the heart.
1.1 Electrocardiogram (ECG) Electrodes

The ECG itself does not show whether the heart is contracting or circulating blood. A conventional 12-lead ECG comprises three bipolar limb leads, three augmented unipolar limb leads and six unipolar chest (precordial) leads. A lead is a set of electrodes (+ve and −ve) attached to an ECG recorder and placed at certain anatomical locations on the body. Bipolar leads monitor the potential difference between two points (+ve and −ve poles). Unipolar leads use a single probing electrode to record the electrical potential at a specific location. Because they only require two electrodes to provide a view, the bipolar leads are referred to as leads I, II and III; one of the electrodes is the positive electrode, whereas the other is the negative electrode, as shown in Table 1. For the augmented leads, the electrode connected to the +ve terminal is the exploring electrode, while the other two limb electrodes connected to the −ve terminal form the indifferent (reference) electrode. The Wilson leads (V1–V6) are unipolar chest leads whose electrodes are placed on the left side of the thorax in an approximately horizontal plane; their indifferent electrode is constructed by joining the three conventional limb electrodes [1]. When used in combination with the unipolar limb leads in the frontal plane, they provide a three-dimensional picture of the integral vector.
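The frontal-plane leads in Table 1 are simple linear combinations of the three limb electrode potentials. A sketch using the textbook Einthoven and Goldberger definitions (general background, not code from this paper; the function name is ours):

```python
def limb_leads(ra, la, ll):
    """Frontal-plane ECG leads from the right-arm (ra), left-arm (la) and
    left-leg (ll) electrode potentials."""
    return {
        "I": la - ra,                # bipolar: left arm minus right arm
        "II": ll - ra,               # bipolar: left leg minus right arm
        "III": ll - la,              # bipolar: left leg minus left arm
        "aVR": ra - (la + ll) / 2,   # augmented unipolar leads reference the
        "aVL": la - (ra + ll) / 2,   # average of the other two electrodes
        "aVF": ll - (ra + la) / 2,
    }

leads = limb_leads(0.2, 0.5, 1.1)
# Einthoven's law (II = I + III) and the zero-sum of the augmented leads
# follow directly from these definitions.
assert abs(leads["II"] - (leads["I"] + leads["III"])) < 1e-9
assert abs(leads["aVR"] + leads["aVL"] + leads["aVF"]) < 1e-9
```

These identities are why only two of the three bipolar leads carry independent information.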
Categorization of Cardiac Arrhythmia from ECG Waveform … Table 1 In ECG monitoring, these are the most commonly used electrodes
Standard (limb) electrodes                                Electrodes for the chest
Bipolar        Unipolar (augmented)                       Unipolar
Lead I         aVF                                        V1
Lead II        aVL                                        V2
Lead III       aVR                                        V3, V4, V5, V6
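The relationship between the limb-electrode potentials and the six frontal-plane leads listed in Table 1 can be sketched numerically. The helper function and the sample potentials below are our own illustrative assumptions, using the standard Einthoven (bipolar) and Goldberger (augmented unipolar) definitions:

```python
# Sketch (not from the paper): deriving the six limb leads from the three
# limb-electrode potentials (RA, LA, LL), per Einthoven and Goldberger.
def limb_leads(ra, la, ll):
    """Return the bipolar (I, II, III) and augmented unipolar (aVR, aVL, aVF) leads."""
    return {
        "I":   la - ra,             # bipolar: LA (+) vs RA (-)
        "II":  ll - ra,             # bipolar: LL (+) vs RA (-)
        "III": ll - la,             # bipolar: LL (+) vs LA (-)
        "aVR": ra - (la + ll) / 2,  # unipolar: RA vs average of the other two
        "aVL": la - (ra + ll) / 2,
        "aVF": ll - (ra + la) / 2,
    }

leads = limb_leads(ra=-0.2, la=0.3, ll=0.5)  # illustrative potentials in mV
# Einthoven's law: II = I + III holds for any electrode potentials
assert abs(leads["II"] - (leads["I"] + leads["III"])) < 1e-9
# Goldberger: the three augmented leads sum to zero
assert abs(leads["aVR"] + leads["aVL"] + leads["aVF"]) < 1e-9
```

This illustrates why only two of the three bipolar leads are independent: the third is always recoverable from the other two.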
The precordial (chest) electrode positions are: V1, right sternal edge, 4th intercostal space; V2, left sternal edge, 4th intercostal space; V3, midway between V2 and V4 [2]; V4, 5th intercostal space in the midclavicular line; V5, anterior axillary line at the level of V4; V6, midaxillary line at the level of V4, as shown in Table 2. In general, the captured ECG waveform is frequently contaminated by various forms of noise and artefacts that occur inside the ECG waveform's frequency spectrum, altering the waveform's properties [3]. As a result, extracting usable information from the waveform is difficult. The following significant noise sources cause ECG waveform corruption. Power-line interference consists of 60 Hz (in the United States) or 50 Hz (in India) pickup due to improper grounding. It occurs as an impulse or spike at the 60 Hz/50 Hz fundamental, with additional spikes at integral multiples of the fundamental frequency [4]. Its harmonic amplitudes can reach up to 50% of the peak-to-peak ECG waveform amplitude. A 60 Hz notch filter can be used to reduce power-line interference [5]. Coughing or breathing with considerable chest movement, or, in the case of limb-lead ECG recording, moving an arm or leg, can generate baseline drift in

Table 2 Wave amplitude and frequency, as well as intervals and segments, are all part of the ECG waveform
S. No.   Features       Amplitude (mV)   Duration (ms)
1        RR-interval    –                (0.4–1.2) s
2        ST-interval    –                320
3        T-wave         0.1–0.3          120–160
4        ST-segment     –                100–120
5        QRS complex    1                80–120
6        PR-interval    –                120–200
7        PR-segment     –                50–120
8        P wave         0.1–0.2          60–80
S. T. Sanamdikar et al.
Fig. 1 ECG waveform [7]
chest-lead ECG readings [6]. Temperature and sensitivity swings in the instrumentation amplifiers can also induce baseline drift. Its frequency content lies below about 0.5 Hz, so a high-pass filter with a cut-off frequency of 0.5 Hz is employed to reduce baseline drift (Fig. 1). Common triggers of arrhythmia include excessive alcohol abuse, drug abuse by diabetics, coffee consumption, heart disease, hypertension, hyperthyroidism, emotional stress, smoking, some dietary supplements and heart scarring, to name a few [8]. Arrhythmias are classified as: 1. Tachycardia, 2. Bradycardia, 3. Premature atrial contraction, 4. Normal sinus rhythm, 5. Atrial flutter, 6. Atrial fibrillation, 7. Premature ventricular contraction, 8. First degree AV block, 9. Ventricular tachycardia, 10. Ventricular fibrillation, etc.
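The two noise-removal steps just described, a notch filter at the mains frequency plus a 0.5 Hz high-pass for baseline drift, can be sketched as follows. The 360 Hz sampling rate (as in MIT-BIH), the 50 Hz mains frequency and the filter orders are our own illustrative assumptions, and the zero-phase `filtfilt` application avoids distorting the waveform's timing:

```python
# Illustrative sketch: mains-notch plus baseline-drift high-pass filtering.
import numpy as np
from scipy.signal import iirnotch, butter, filtfilt

fs = 360.0  # assumed MIT-BIH sampling frequency (Hz)

def denoise_ecg(x, mains_hz=50.0):
    b_n, a_n = iirnotch(w0=mains_hz, Q=30.0, fs=fs)    # narrow notch at mains frequency
    x = filtfilt(b_n, a_n, x)                          # zero-phase: no waveform delay
    b_h, a_h = butter(2, 0.5, btype="highpass", fs=fs) # 0.5 Hz cut-off removes drift
    return filtfilt(b_h, a_h, x)

# Toy demonstration: a low-frequency component corrupted by 50 Hz mains
# interference and a 0.2 Hz baseline drift.
t = np.arange(0, 2, 1 / fs)
clean = np.sin(2 * np.pi * 1.2 * t)
noisy = clean + 0.5 * np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 0.2 * t)
out = denoise_ecg(noisy)   # residual error vs. `clean` is greatly reduced
```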
2 Review of the Existing Literature Willams et al. [1] carried out measurements that were examined separately by a group of cardiologists from the American Heart Association. The results of an analysis of a series of proposals aimed at standardizing quantitative ECG measurement are presented; these AHA recommendations have gained worldwide acceptance. Zhao and Zhan [2] proposed a novel feature extraction method for accurate heart rhythm identification. The suggested classification method is made up of three parts: data pre-processing, feature extraction and ECG signal categorization. The feature vector of ECG data is created by combining two different feature extraction algorithms. The coefficients of the wavelet transform are extracted as features of each ECG segment. At the same time, autoregressive (AR) modelling is used to capture
the temporal features of ECG signals. Finally, distinct ECG heart rhythms are classified using a support vector machine (SVM) with a Gaussian kernel. The accuracy of the proposed approach was determined by computer simulations, which yielded a 99.68% overall accuracy. The moving-average-based filtering approach reported by Chouhan and Mehta [3] produces smooth, spike-free ECG output suitable for slope feature extraction. The first step is to extract the slope feature from the filtered and drift-corrected ECG signal by processing and transforming it so that the derived feature signal is greatly boosted in the QRS region and suppressed in the non-QRS region. The proposed method has a detection rate of 98.56% and a positive predictive value of 99.18%. Saxena et al. developed a modified mixed wavelet transform technique in [4]. The approach was created to evaluate multi-lead ECG readings in order to diagnose heart problems. For QRS detection, a quadratic spline wavelet (QSWT) was utilized, while for P and T detection, the Daubechies six-coefficient (DU6) wavelet was used. For the identification of various heart disorders, a process based on ECG data and a point-scoring system was developed. When both diagnostic criteria yielded the same results, the consistency and reliability of the detected and measured parameters were confirmed. A comparison of several ECG signal feature extraction approaches is shown in Table 1. Ramli and Ahmad [5] applied correlation analysis for abnormal ECG signal feature extraction. Their research examined a technique for extracting essential features from ECG signals in a 12-lead system. They picked lead II as the basis for their entire study since it has representative properties for detecting prevalent cardiac problems. Cross-correlation analysis was used as the analysis technique; it determines how similar two signals are and extracts the information contained in them.
Their tests showed that the proposed technique could efficiently identify elements that differentiate between the numerous types of heart disorders studied as well as a normal heart signal. Laurence et al. [6] provide a continuous wavelet transform-based technique for studying the non-stationary intensity and phase delay of respiratory sinus arrhythmia (RSA). The RSA is the cyclic variation of instantaneous heart rate at the frequency of breathing. Paced breathing, postural alterations, low respiratory frequencies and quick changes have all been observed in studies of cardio-respiratory interaction during sleep. Karhk and Ozba [9] used an artificial neural network to assess ECG signals in the time domain and determine related arrhythmias, achieving a 95% success rate for arrhythmia recognition. Chiu et al. [8] developed an efficient arrhythmia recognition algorithm based on the correlation coefficient in ECG signals for QRS complex detection; the correlation coefficient and the RR-interval were used to quantify arrhythmia similarity. Gradl et al. [10] investigated the Pan–Tompkins method for QRS detection, template generation and adaptation, feature extraction and beat classification. The MIT-BIH arrhythmia and MIT-BIH supraventricular arrhythmia databases were used to validate the algorithm, which correctly detected more than 98% of all QRS complexes. The overall sensitivity and specificity for aberrant beat detection were 89.5% and
80.6%, respectively. Wavelet transform and linear discriminant analysis were used to extract input features by Lee et al. [11]. The accuracy of the proposed approach in detecting arrhythmias was 97.52, 96.43, 98.59 and 97.88% for NSR, SVR, PVC and VF, respectively. The wavelet transform and hidden Markov models were employed by Gomes et al. [12]; experimental results on real data from the MIT-BIH arrhythmia database show that the approach outperforms standard linear segmentation. Tavakoli et al. [13] describe a novel ECG signal analysis approach based on non-uniform sampling. It is demonstrated that a newly developed method called finite rate of innovation (FRI) can better assess left and right bundle branch block arrhythmias as well as normal signals, and that spline modelling can be used to better study different types of arrhythmia. As a result, a multi-stage technique for diagnosing and compressing ECG signals is provided, which is both faster and more accurate. A wireless, wearable ECG sensor system with a handheld RF-receiver device and an arrhythmia algorithm was studied by Fensli et al. [14]. The continuous wavelet transform (CWT) was investigated by Daqrouq and Abu-Isbeih [15] for evaluating ECG signals and extracting desirable parameters such as arrhythmia; this approach distinguishes between normocardia, bradycardia and tachycardia with a clear threshold. The Pan–Tompkins algorithm, implemented for the identification of the QRS complex on normal and arrhythmia databases together with the discrete WT, has been researched by Vijaya et al. [16]. Cardiac arrhythmia is among the most common causes of mortality; an algorithm for R-peak and QRS complex detection using the WT has been developed and assessed using ECG feature extraction. Saheb and Meharzad [17] investigated the construction of a heart diagnosis tool requiring few complex computations.
The accuracy of the designed classifier is 98%, attained for three different classes: RBBB, LBBB and normal heart rhythm. The sequential probability ratio test was modified by Chen et al. [18]; using this method, they reduced the total error rate by 5% compared with the previous result. Feature extraction with the Pan–Tompkins algorithm and a hierarchical classification system was studied by Leutheuser et al. [19]. Early detection of arrhythmic beats in the ECG signal may help identify patients who are at risk of sudden death due to coronary heart disease. A demonstrative and quantitative technique for ECG analysis has also been proposed, with three aspects: data simplification using multiscale PCA, and fault recognition and localization using linear PCA. The investigated data is represented as a multivariate matrix, and the ECG wave attributes (wave amplitudes and segment measurements) are extracted from the variables of this matrix. The developed methodology allows for the detection of arrhythmias and irregular heartbeats, and its results agree with the expert reference material. To improve precision, another approach used a combination of convolutional neural networks (CNNs) and a stack of long short-term memory (LSTM) units, together with pooling, dropout and normalization algorithms. Because the network produced a classification at every eighteenth input sample, the last prediction was chosen as the output.
The Physionet Challenge 2017 training dataset, which contains 8528 single-lead ECG recordings lasting from 9 s to just over 60 s, was used to cross-validate the results.
3 Methodology Baseline wander (baseline drift) is the phenomenon in which a waveform's base axis (x-axis), instead of being straight, appears to 'wander' or move up and down [10]. The entire waveform shifts away from its typical baseline as a result. Baseline wander in an ECG waveform is produced by patient movement and respiration and by faulty sensors and electrodes. Figure 2 depicts a typical ECG waveform influenced by baseline wander [11]. The baseline wander's frequency content lies in the band below 0.5 Hz. Because low-amplitude feature separation can be erroneous, spurious waveforms can result, making ECG analysis and interpretation more difficult [12]. Because power-line interference completely superimposes on low-frequency ECG waves such as the P and T waves, it must be removed from ECG waveforms. Figure 3 depicts a typical ECG waveform influenced by power-line interference. Muscle noise is a significant issue in many ECG applications, particularly in recordings taken during exercise, because low-amplitude waveforms can become entirely covered [13]. Stretching the skin around the electrode alters the impedance of the skin and is the most common cause of electrode motion artefacts. Motion artefacts have
Fig. 2 Baseline wandering ECG waveform (drift)
Fig. 3 Interference from power lines (50/60 Hz) affects ECG
waveform properties similar to baseline wander, but they are more difficult to suppress since their spectral content overlaps the PQRST complex significantly. They are most common in the 1–10 Hz range [17]. Figure 4 shows an ECG waveform influenced by electrode motion artefact. The wavelet transform technique allows time-scale analysis to be exploited. Multiresolution analysis (MRA), along with consecutive waveform selection, is a powerful technique for capturing minute waveform details. Due to its resemblance to the ECG waveform, the Daubechies wavelet family is demonstrated to be the optimal choice for the wavelet coefficients. An orthogonal wavelet is fully defined by two filters: a scaling filter and a wavelet filter [6]. Convolving X(n) with the wavelet function G(n), where G(n) is the high-pass filter, yields the detail components (D), while convolving with the scaling function yields the approximation components. This process creates twice as much data as the input: for N input samples, there are N approximation coefficients and N detail coefficients (D) (Fig. 5).
Fig. 4 Electrode motion artefacts have an impact on the ECG
Fig. 5 Wavelet decomposition tree [17]
To fix this, simply discard every second coefficient, downsampling the filter output by two. The input waveform can be reconstructed by upsampling the coefficients and executing the reverse procedure. Figure 6 depicts the entire architecture for decomposition and reconstruction. The input waveform is broken down into multiple lower-resolution components as the decomposition procedure is repeated [20]; this structure is called a wavelet decomposition tree, and a three-level decomposition is shown in Fig. 6. Sinusoidal functions are perfectly localized in the frequency domain, yet in the spatial coordinate system they are global
Fig. 6 Wavelet decomposition and reconstruction
(position or time). As a result, representing a time- or spatially-localized function in a Fourier basis is problematic [8]. A wavelet basis, on the other hand, is approximately localized in both the frequency and spatial domains. The admissible functions are severely restricted as a result, and in the Daubechies formulation these functions are obtained through a recursion process. The wavelet transform is named after its originator, the mathematician Ingrid Daubechies [21]. The Daubechies D4 transform has four wavelet and four scaling function coefficients.
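A minimal sketch of the decomposition just described: the signal is convolved with the Daubechies D4 scaling (low-pass) and wavelet (high-pass) filters and downsampled by two, then reconstructed by the reverse procedure. The circular boundary handling and the helper names are our own simplifying assumptions:

```python
import numpy as np

# Daubechies D4 scaling (low-pass) filter; the wavelet (high-pass) filter is
# its alternating-sign counterpart (quadrature mirror relation).
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))  # scaling filter
g = np.array([h[3], -h[2], h[1], -h[0]])                            # wavelet filter

def dwt_step(x):
    """One DWT level with circular boundaries: filter, then keep every
    second output sample (downsample by 2)."""
    n = len(x)
    idx = (np.arange(4)[None, :] + 2 * np.arange(n // 2)[:, None]) % n
    approx = (x[idx] * h).sum(axis=1)   # low-pass branch (A)
    detail = (x[idx] * g).sum(axis=1)   # high-pass branch (D)
    return approx, detail

def idwt_step(approx, detail):
    """Inverse step: upsample by 2 and filter with the synthesis filters."""
    n = 2 * len(approx)
    x = np.zeros(n)
    idx = (np.arange(4)[None, :] + 2 * np.arange(n // 2)[:, None]) % n
    for k in range(n // 2):
        x[idx[k]] += approx[k] * h + detail[k] * g
    return x

x = np.sin(np.linspace(0, 4 * np.pi, 64)) + 0.1 * np.random.default_rng(0).normal(size=64)
a, d = dwt_step(x)
assert np.allclose(idwt_step(a, d), x)  # orthonormal filters: perfect reconstruction
```

Repeating `dwt_step` on the approximation coefficients yields the multi-level decomposition tree of Fig. 6.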
3.1 Regression with Support Vectors

Support vector regression preserves all of the useful properties of support vector machines. It tries to find a curve based on the ECG characteristics. Rather than using the curve as a decision boundary in a classification task, support vector regression (SVR) looks for a match between a vector and the curve's position [22]; after all, this is a regression scenario. Support vectors are involved in determining the best match between the ECG features and the actual function that they reflect [7]. When the distance between the support vectors and the regressed curve is maximized, the curve appears to be closest to the actual one (because there is always some noise present in the statistical values of ECG features) [7]. It follows that any vectors that are not support vectors should be discarded, for the simple reason that they are statistical outliers. The end result is a regressed function. Assume a training dataset \{(x_1, y_1), \ldots, (x_l, y_l)\}, where x_i denotes the training data and y_i the targets. The ISVR finds y = \langle \omega, \varphi(x) \rangle_H + b, where \omega and \varphi(x) denote vectors in a reproducing kernel Hilbert space H. Therefore, we minimize the following risk:

R = \frac{1}{2}\|\omega\|^2 + c \sum_{i=1}^{l} L(y_i, x_i, f),   (1)

where L(y_i, x_i, f) denotes the \epsilon-insensitive loss function given by

L(y_i, x_i, f) = |y - f(x)|_{\epsilon} = \max(0, |y - f(x)| - \epsilon).   (2)

The corresponding primal problem is

\min_{\omega, b, \xi, \tilde{\xi}} \; \frac{1}{2}\|\omega\|^2 + c \sum_{i=1}^{l} \left( \xi_i^2 + \tilde{\xi}_i^2 \right)   (3)

subject to f(x_i) - y_i \le \epsilon + \xi_i, \; y_i - f(x_i) \le \epsilon + \tilde{\xi}_i, \; \xi_i, \tilde{\xi}_i \ge 0, \; i = 1, \ldots, l.

This problem can be solved by Lagrangian theory, giving

\omega = \sum_{i=1}^{l} (\tilde{\alpha}_i - \alpha_i) \, \varphi(x_i),   (4)

where \alpha_i, \tilde{\alpha}_i, i = 1, \ldots, l denote the Lagrangian multipliers associated with the constraints above, and the dual solution is given by

\max_{\alpha_i, \tilde{\alpha}_i} \; \sum_{i=1}^{l} y_i (\tilde{\alpha}_i - \alpha_i) - \epsilon \sum_{i=1}^{l} (\tilde{\alpha}_i + \alpha_i) - \frac{1}{2} \sum_{i,j=1}^{l} (\tilde{\alpha}_i - \alpha_i)(\tilde{\alpha}_j - \alpha_j) \left[ K(x_i, x_j) + \frac{1}{c}\,\zeta_{i,j} \right]   (5)

subject to \sum_{i=1}^{l} (\tilde{\alpha}_i - \alpha_i) = 0, \; 0 \le \alpha_i, \; 0 \le \tilde{\alpha}_i, \; i = 1, \ldots, l,

where K(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle_H and \zeta_{i,j} represents the Kronecker symbol. This problem can be solved using standard decomposition methods for the ISVR.
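As a concrete illustration of the \epsilon-insensitive formulation in Eqs. (1)–(5), the following minimal sketch (not the authors' ISVR; the data and parameter choices are our own assumptions) fits a support vector regressor with a Gaussian (RBF) kernel using scikit-learn:

```python
# Sketch: epsilon-SVR with an RBF kernel; C plays the role of c in Eq. (1),
# and `epsilon` is the width of the insensitive tube in Eq. (2).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))             # stand-in for ECG feature vectors
y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)

model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X, y)

# Points strictly inside the epsilon-tube incur zero loss and are not
# support vectors; only the rest determine the regressed curve.
print(len(model.support_), "support vectors out of", len(X), "samples")
```

Only the support vectors enter the expansion of \omega in Eq. (4), which is what keeps the regressed function compact.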
4 Results The proposed method is validated using the MIT-BIH dataset. We used 15 different classes to interpret the data. After denoising, we extracted features from the ECG for those classes. Baseline wander, power-line noise and muscle noise all exist in the ECG waveform. The proposed solution uses a zero-phase low-pass filter to reduce noise. A Daubechies 4 wavelet with a four-level decomposition is used for feature extraction, and higher-order statistics are then calculated from those coefficients. In addition, the RR-interval is determined, and both time- and frequency-domain features are retrieved. The whole distribution of ECG waveforms is captured by extracting a total of 20 higher-order statistical features. For feature-level fusion, all features are combined and concatenated [23]. The ISVR training is then based on these fused features. ISVR is a modified version of SVM with kernel adjustments. Support vector
regression outperforms standard SVM in terms of accuracy. The performance of the developed wavelet-transform extraction of different ECG waveform characteristics for cardiac arrhythmia detection is covered below. The ECG waveforms were obtained from the MIT/BIH dataset [22] for research purposes. The database of ECG waveform fragments is described below.

(1) ECG waveforms were acquired from 48 patients, 19 females (ages 23–89) and 29 males (ages 23–89).
(2) The ECG waveforms covered 15 classes, including normal sinus rhythm, pacemaker rhythm and various types of cardiac dysfunction (for each of which at least 10 waveform fragments were collected).
(3) All ECG waveforms were captured at a sampling frequency of 360 Hz with a gain of 200 adu/mV.
(4) A total of 1000 ten-second (3600-sample) ECG waveform fragments were chosen at random for the investigation.
(5) The data comes in Excel format, which we convert to MATLAB.
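The feature recipe described above (wavelet subband statistics including higher-order moments, plus RR-intervals, fused by concatenation) can be sketched as follows. The function names, the toy spike-train "ECG" and the threshold-based R-peak detector are our own illustrative assumptions, not the authors' implementation:

```python
# Illustrative sketch of higher-order-statistics + RR-interval features.
import numpy as np
from scipy.stats import skew, kurtosis
from scipy.signal import find_peaks

fs = 360  # assumed MIT-BIH sampling rate (Hz)

def higher_order_stats(subband):
    """Per-subband statistical descriptors, including higher-order moments."""
    return [subband.mean(), subband.std(), skew(subband), kurtosis(subband)]

def extract_features(ecg, subbands):
    stats = [s for band in subbands for s in higher_order_stats(band)]
    r_peaks, _ = find_peaks(ecg, height=0.5, distance=int(0.3 * fs))
    rr = np.diff(r_peaks) / fs                  # RR-intervals in seconds
    hr_feats = [rr.mean(), rr.std()] if len(rr) else [0.0, 0.0]
    return np.array(stats + hr_feats)           # feature-level fusion by concatenation

# Toy 10 s "ECG": unit spikes every 0.8 s (75 bpm); in practice the subbands
# would be the Daubechies-4 decomposition coefficients of a real fragment.
ecg = np.zeros(10 * fs)
ecg[np.arange(0, 10 * fs, int(0.8 * fs))] = 1.0
feats = extract_features(ecg, subbands=[ecg])   # feats[-2] is the mean RR (0.8 s)
```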
The suggested ECG classifier is trained and evaluated using arrhythmia data from the MIT/BIH database. The following mathematical parameters were estimated for the MIT-BIH data. The accuracy is determined using 240 examples acquired from the MIT/BIH dataset: 16 samples from each of the 15 classes. Total positive samples (P) for each class: 16. Total negative samples (N): the remaining 224 samples from the other classes. True positives (TP): samples of the current class that were correctly retrieved (out of 16). True negatives (TN): samples of the remaining classes that were correctly retrieved (out of 224). The accuracy can be calculated by the equation Accuracy = [TN + TP]/[P + N]
(6)
The accuracy obtained for each class is shown in Table 3 and Fig. 7. Each column reflects the accuracy achieved using the proposed or an existing classifier, and each row represents a class. The presented method is more accurate than the other conventional methods. Similarly, the other parameters TPR, FPR, FNR and PPV are calculated [24–27].
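The one-vs-rest evaluation described above (P = 16 positives and N = 224 negatives per class, with accuracy as in Eq. (6)) can be sketched as follows; the synthetic label vectors are our own illustration:

```python
# Sketch: per-class accuracy, TPR, FPR and PPV for a 15-class, 240-sample setup.
import numpy as np

def per_class_metrics(y_true, y_pred, cls):
    pos = (y_true == cls)
    tp = np.sum(pos & (y_pred == cls))        # current class, correctly retrieved
    tn = np.sum(~pos & (y_pred != cls))       # remaining classes, correctly retrieved
    p, n = pos.sum(), (~pos).sum()            # P = 16, N = 224 here
    accuracy = (tp + tn) / (p + n)            # Eq. (6)
    tpr = tp / p                              # sensitivity
    fpr = (n - tn) / n                        # fall-out
    ppv = tp / max(np.sum(y_pred == cls), 1)  # precision (guard against 0 predictions)
    return accuracy, tpr, fpr, ppv

rng = np.random.default_rng(1)
y_true = np.repeat(np.arange(15), 16)         # 240 samples, 16 per class
y_pred = y_true.copy()
y_pred[rng.choice(240, size=4, replace=False)] = rng.integers(0, 15, size=4)  # a few errors
acc, tpr, fpr, ppv = per_class_metrics(y_true, y_pred, cls=0)
```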
5 Conclusion This study proposes and analyses a system for ECG processing and analysis with two main applications. A fifteen-class ECG classification system can tell the difference between healthy (normal) and unhealthy (abnormal) ECG waveforms. Characteristics
Table 3 Comparison of different methods for accuracy

S. No.   SVR        SVM        ANN        NMD
1        0.995833   0.9875     0.979167   0.979167
2        0.9875     0.983333   0.979167   0.970833
3        0.9875     0.983333   0.983333   0.983333
4        0.9875     0.9875     0.9875     0.983333
5        0.991667   0.9875     0.983333   0.979167
6        0.991667   0.991667   0.983333   0.983333
7        0.991667   0.983333   0.983333   0.979167
8        0.995833   0.995833   0.991667   0.9875
9        0.991667   0.983333   0.979167   0.970833
10       0.991667   0.983333   0.975      0.975
11       0.995833   0.995833   0.991667   0.991667
12       0.995833   0.991667   0.9875     0.979167
13       0.9875     0.9875     0.9875     0.983333
14       0.995833   0.991667   0.983333   0.975
15       0.991667   0.991667   0.9875     0.983333
Fig. 7 Accuracy comparison of the proposed method: per-class MIT-BIH accuracy of Acc_SVR against Acc_SVM, Acc_ANN and Acc_NMD for the 15 classes
from the local and MIT-BIH datasets are used to implement the recommended ECG auto-classification system design. In this research, the proposed ECG classification algorithms correctly categorized the normal and the abnormal ECG waveforms 99% of the time, outperforming the current methods. For categorization, the proposed architecture employs both temporal and frequency
domain characteristics. The classification task becomes simpler than with standard morphological features thanks to the use of higher-order statistics. Even with minimal training data, the proposed approach performed well. ISVR classification models achieve a modelling accuracy of up to 98%, although faithfully reproducing complete ECG waveforms remains difficult.
References 1. J.I. Willams, CSE Working Party, Recommendations for measurement standards in quantitative ECG. Eur. Heart J. 815–825 (1985) 2. Q. Zhao, L. Zhan, ECG feature extraction and classification using wavelet transform and support vector machines, in International Conference on Neural Networks and Brain, ICNN&B’05, vol. 2 (2005), pp. 1089–1092 3. V.S. Chouhan, S.S. Mehta, Total removal of baseline drift from ECG signal, in Proceedings of International Conference on Computing: Theory and Applications, ICTTA–07 (2007), pp. 512– 515 4. S.C. Saxena, V. Kumar, S.T. Hamde, Feature extraction from ECG signals using wavelet transforms for disease diagnostic. Int. J. Syst. Sci. 33(13), 1073–1085 (2002) 5. A.B. Ramli, P.A. Ahmad, Correlation analysis for abnormal ECG signal features extraction, in 4th National Conference on Telecommunication Technology, 2003. NCTT 2003 Proceedings (2003), pp. 232–237 6. A. Balachandran, M. Ganesan, E.P. Sumesh, Daubechies algorithm for highly accurate ECG feature extraction, in 2014 International Conference on Green Computing Communication and Electrical Engineering (ICGCCEE) (2014) 7. https://en.wikipedia.org/wiki/Electrocardiography 8. C.-C. Chiu, T.-H. Lin, B.-Y. Liau, Using correlation coefficient in ECG waveform for arrhythmia detection. Biomed. Eng. Appl. Basis Commun. 17(4), 147–152 (2005) 9. B. Karhk, Y. Ozba, A new approach for arrhythmia classification, in 18th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Amsterdam 10. S. Gradl, P. Kugler, C. Lohmüller, B. Eskofier, Real-time ECG monitoring and arrhythmia detection using Android-based mobile devices, in 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 11. J. Lee, K.L. Park, M.H. Song, K.J. 
Lee, Arrhythmia classification with reduced features by linear discriminate analysis, in Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China, 1–4 Sept 2005 12. P.R. Gomes, P.O. Soares, J.H. Correia, C.S. Lima, Cardiac arrhythmia classification using wavelets and hidden Markov models—a comparative approach, in 31st Annual International Conference of the IEEE EMBS, Minneapolis, MN, 2–6 Sept 2009 13. V. Tavakoli, N. Sahba, N. Hajebi, A fast and accurate method for arrhythmia detection, in 31st Annual International Conference of the IEEE EMBS, Minneapolis, MN, 2–6 Sept 2009, pp. 1897–1900 14. R. Fensli, E. Gunnarson, T. Gundersen, A wearable ECG-recording system for continuous arrhythmia monitoring in a wireless tele-home-care, in IEEE Symposium on Computer-Based Medical Systems, Dublin, 23–24 June 2005 15. K. Daqrouq, I.N. Abu-Isbeih, Arrhythmia detection using wavelet transform, in The International Conference on “Computer as a Tool”, Warsaw, 9–12 Sept 16. V. Vijaya, K. Kishanrao, V. Rama, Arrhythmia detection through ECG feature extraction using wavelet analysis. Eur. J. Sci. Res. 66(3), 441–448 (2011). ISSN 1450-216X 17. A.R. Saheb, Y. Meharzad, An automatic diagnostic machine for ECG arrhythmia classification based on wavelet transform and neural network. Int. J. Circuits Syst. Signal Process. 5(3), 255–262 (2011)
18. S.-W. Chen, P.M. Clarkson, Q. Fan, A robust sequential detection algorithm for cardiac arrhythmia classification. IEEE Trans. Biomed. Eng. 43(11) (1996) 19. H. Leutheuser et al., Automatic ECG arrhythmia detection in real time using Android-based mobile devices, in Conference on Mobile and Information Tech in Medicine MobileMed 14, 20–21 Nov 2014 20. Y. Jung, W.J. Tompkins, Detecting and classifying life-threatening ECG ventricular arrhythmias using wavelet decomposition, in Proceedings of the 25th Annual International Conference of the IEEE EMBS, Cancun, Mexico 17–21 Sept 2003, pp. 2390–2393 21. S.T. Sanamdikar, S.T. Hamde, V.G. Asutkar, Machine vision approach for arrhythmia classification using super vector regression. Int. J. Signal Process. 1–8 (2019). https://doi.org/10. 5281/Zenodo.2637567 22. S.T. Sanamdikar, S.T. Hamde, V.G. Asutkar, Extraction of different features of ECG signal for detection of cardiac arrhythmias by using wavelet transformation. IEEE Explore 1–6 (2018). https://doi.org/10.1109/ICECDS.2017.8389881 23. S.T. Sanamdikar, M.P. Borawake, Analysis of several characteristics of ECG signal for cardiac arrhythmia detection. Vidyabharati Int. Interdisc. Res. J. 13(1), 1–12 (2021). ISSN 2319-4979 24. C.-L. Chang, K.-P. Lin, T.-H. Tao, T. Kao, W.H. Chang, Validation of automated arrhythmia detection for Holter ECG, in Proceedings of the 20th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 20, no. I (1998), pp. 101–103 25. S.T. Sanamdikar, S.T. Hamde, V.G. Asutkar, Classification and analysis of ECG signal based on incremental support vector regression on IOT platform. Biomed. Signal Process. Control 1–9 (2020). https://doi.org/10.1016/j.bspc.2020.102324 26. S.T. Sanamdikar, S.T. Hamde, V.G. Asutkar, Analysis and classification of cardiac arrhythmia based on general sparsed neural network of ECG signal. SN Appl. Sci. 2(7), 1–9 (2020). https:// doi.org/10.1007/s42452-020-3058-8 27. S.T. 
Sanamdikar, S.T. Hamde, V.G. Asutkar, Arrhythmia classification using KPCA & super vector regression. Int. J. Emerg. Technol. 1–10 (2020). ISSN No. (Online): 2249-3255
Performance Analysis of Classification Algorithm Using Stacking and Ensemble Techniques Praveen M. Dhulavvagol, S. G. Totad, Ashwin Shirodkar, Amulya Hiremath, Apoorva Bansode, and J. Divya
Abstract Classification is very important in day-to-day life as well as in the corporate world. Classification is mainly used to sort relevant data into different sets of classes, and mainly involves steps like sorting and storing the data so that it can be used for prediction and decision-making. Existing classification algorithms like logistic regression, Naïve Bayes, decision tree, support vector machine, and K-nearest neighbours have limitations such as sensitivity to the dataset, high training time, and limited performance. In order to overcome these limitations, in this paper we propose the hybridization of algorithms using an ensemble technique. Hybridization enhances the accuracy and precision and also overcomes the limitations of the existing classification algorithms. In this study, we have analysed different classification algorithms on a medical dataset and performed a comparative analysis of the existing algorithms against the proposed hybridization method. The results show that the proposed method achieves higher accuracy, precision and F1-score in classifying the text data compared to the existing methods. Keywords Logistic regression · Naïve Bayes · Decision tree · Support vector machine
1 Introduction Classification has a wide range of applications such as speech recognition, handwriting recognition, document classification, biometric identification, traffic prediction, email spam and malware filtering, and many more. The classification algorithms considered for this study are support vector machine (SVM), Naïve Bayes, and logistic regression. In the literature, logistic regression is most often used P. M. Dhulavvagol (B) · S. G. Totad · A. Shirodkar · A. Hiremath · A. Bansode · J. Divya School of Computer Science and Engineering, KLE Technological University, Hubballi, India e-mail: [email protected] S. G. Totad e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_46
P. M. Dhulavvagol et al.
when the prediction is categorical, and it is also less inclined to overfitting. Support vector machines, on the other hand, draw a hyperplane that clearly separates the classes, while Naïve Bayes works well with small datasets and is simple to implement [1]. But these algorithms perform slowly on larger datasets and show poor performance for overlapping classes; at times they may be inadequate for predicting continuous values [2]. Hence, the proposed system involves hybridization of the existing algorithms through an ensemble technique (stacking) to enhance the performance.
2 Literature Survey A literature survey was carried out to understand the need for data classification and to review the existing classification algorithms with their advantages and limitations [1]. Classification is important for classifying a given dataset and inferring knowledge that helps in decision-making. Classification techniques are used in a wide range of applications such as speech recognition and handwritten text classification. The different classification techniques implemented are Naive Bayes, logistic regression, decision tree, KNN, random forest, support vector machine, and ANN. When the score is evaluated on the diabetes dataset for a basic model, logistic regression and ANN perform best with accuracies of 69.9% and 69.5%, respectively [2]. After hyperparameter tuning and optimization, random forest and decision tree achieve the best accuracies among all the algorithms, 71.9% and 71.2%, respectively. On the hypertension dataset, logistic regression and ANN perform best for the basic model, with accuracies of 61.7% and 60.6%, respectively, while random forest and logistic regression perform best, with 61.9% and 61.8%, after the hyperparameters are tuned. With the community-acquired pneumonia dataset, random forest and ANN achieve good accuracies of 77.3% and 78.6%, respectively, for the basic model. We conclude that random forest optimized with an effective global optimization algorithm gives the best performance on the three datasets for binary classification [3, 4]. The different performance parameters considered for classification techniques are discussed [2] using the breast cancer and titanic datasets, since the cancer dataset had higher correlation among the columns. Various parameters were used to compare the performance of the algorithms. Firstly, we understand that logistic regression has higher speed compared to the other algorithms and random forest is the slowest of all.
Naïve Bayes performs excellently on big data, and both logistic regression and Naïve Bayes consume less memory, whereas decision tree and random forest require large memory space [5]. Decision tree has the highest transparency; logistic regression, Naïve Bayes, and KNN are medium; random forest has the lowest of all. Hence, on comparing the performance of all the algorithms, we find that logistic regression and Naïve Bayes have the upper hand [6]. The support vector machine (SVM) algorithm is used for data classification and function approximation owing to its generalization capability, and it has found
Performance Analysis of Classification Algorithm …
617
success in many applications [5]. SVM minimizes the generalization error and can be constructed by plotting a set of planes which separate the two classes in the case of a binary dataset. SVM works really well in high-dimensional spaces and is relatively memory efficient. From this paper, we conclude that SVM can also be used for the study along with Naïve Bayes and logistic regression [7]. The behaviour of classification techniques was also analysed on an unstructured social media dataset, in which comments/messages made by people in online social Web blogs were classified as legal or illegal. Here, the supervised machine learning algorithms Naive Bayes and support vector machine are integrated to produce a hybrid algorithm [6]. The hybrid algorithm increases the efficiency and accuracy of the results and reduces the complexity, computational cost, and individual model variance. The results show that the hybrid model gave 96.76% accuracy against the 61.45% accuracy of Naive Bayes and the 69.21% accuracy of the support vector machine model [8]. Classification techniques were also used for prediction [6, 9]; the stacking methodology is used to integrate different classification algorithms, and performance was analysed with each of the stacking techniques. It is useful to stack individual algorithms to obtain a much more powerful learning model. Although individual predictions can be highly correlated, the stacking algorithm overcomes this issue while remaining fast and easy to interpret. Some algorithms are placed at the base level, and another algorithm is placed at the meta layer, i.e. as the final estimator; this results in an ensembled algorithm. Hence, through this paper, we understand the working of stacking algorithms and their efficiency compared with individual algorithms [7]. The literature survey outlines the existing classification algorithms and their limitations, along with the research gaps and research questions on classifying text data.
Existing algorithms have lower accuracy and precision scores; to enhance these, stacking and ensemble techniques are proposed.
3 Methodology Figure 1 describes the proposed system architecture, which mainly consists of modules for data pre-processing, feature selection, the classification algorithms SVM, NB, and LR, and the stacking classifiers SVM-NB-LR, LR-NB-SVM, and SVM-LR-NB.
3.1 Data Pre-processing Data pre-processing is a technique to transform raw data into useful data; it improves the performance of the algorithm. The steps carried out to preprocess the data are data cleaning, data transformation, and data reduction. Data cleaning is the process of removing missing and null values, noisy data,
618
P. M. Dhulavvagol et al.
Fig. 1 Proposed system architecture
outliers, correlated variables, inconsistent data, and unwanted or redundant data from the dataset. Data transformation is the process of converting data from one format to another; raw data may be in a format that is not suitable for the model, so this technique converts it into the format the model needs. Data reduction is a process in which the amount of data is reduced. Processing large datasets can be time-consuming, so data reduction is performed to avoid this; it saves cost and time and increases storage efficiency.
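As an illustration of the transformation step, the following sketch (not part of the original system; the function name is ours) rescales a raw numeric column into the [0, 1] range, one common way of bringing data into the format the model needs:

```python
def min_max_scale(values):
    """Transform a raw numeric column into the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant column: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, min_max_scale([10, 20, 30]) yields [0.0, 0.5, 1.0].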
3.2 Feature Selection Feature selection is a process used in data reduction; it is also called variable selection or attribute selection. In feature selection, the features that are of utmost importance are used to develop the model. Feature selection reduces model complexity, avoids overfitting, and gives higher accuracy. The purpose of splitting the data is to avoid overfitting. Here, we have split the data into two portions, train data and test data, with 70% and 30%, respectively. The training data is used to develop a predictive model so that the model learns and performs well on unseen data.
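The paper does not fix a particular selection criterion; as one illustrative example (helper name ours, not from the paper), a simple variance filter can drop near-constant attributes before modelling:

```python
from statistics import pvariance

def select_features(columns, threshold=0.0):
    """Keep only the attributes whose variance exceeds the threshold.

    `columns` maps a feature name to its list of numeric values;
    near-constant features carry little information for the model.
    """
    return {name: vals for name, vals in columns.items()
            if pvariance(vals) > threshold}
```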
3.3 Support Vector Machine (SVM) Support vector machine is a supervised learning algorithm. It is used for classification and regression problems, but mostly for classification. The SVM algorithm creates a hyperplane, a decision line or boundary drawn to clearly separate the data points into different classes; it can be a single line or multiple lines. The margin is the distance between the hyperplane and the support vectors, and SVM maximizes this margin. The hyperplane with the maximum margin is called the optimal hyperplane.
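The margin maximized by SVM can be made concrete: for a hyperplane w·x + b = 0 and labelled points, the geometric margin is the smallest signed distance of any point to the plane (a minimal sketch under our own naming, not the paper's implementation):

```python
import math

def geometric_margin(w, b, points):
    """Smallest signed distance of any labelled point to the plane w.x + b = 0.

    Each point is (x, y) with label y in {-1, +1}; SVM selects the
    w, b that maximize this quantity over the training data.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in points)
```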
3.4 Logistic Regression (LR) Logistic regression is a supervised learning classifier. It is used for predicting a categorical dependent variable from a given set of independent variables. Logistic regression gives a probabilistic output between 0 and 1, using the sigmoid function to predict probability values.
3.5 Naive Bayes (NB) Naive Bayes is a supervised learning algorithm. It can be used for binary and multi-class classification problems. It is based on Bayes' theorem and uses conditional probability to classify an object. Formula for conditional probability:

P(A|B) = [P(B|A) · P(A)] / P(B)    (1)
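Equation (1) can be computed directly; the sketch below (helper name ours) evaluates the posterior from the three input probabilities:

```python
def posterior(p_b_given_a, p_a, p_b):
    """Eq. (1): P(A|B) = [P(B|A) * P(A)] / P(B)."""
    return p_b_given_a * p_a / p_b
```

For instance, with P(B|A) = 0.9, P(A) = 0.2 and P(B) = 0.3, the posterior P(A|B) is 0.6.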
3.6 Stacking Algorithms Stacking is an ensemble learning technique that combines different algorithms in a hierarchical architecture. The individual algorithms are called base learners, and their output is fed as input to the next layer, i.e. the meta learner. The meta learner reduces the weaknesses and improves the performance of each individual algorithm. Stacking is performed with three different combinations, i.e. SVM-NB-LR, SVM-LR-NB, and LR-NB-SVM.
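The two-level flow can be sketched as follows; in the paper the base and meta learners are trained SVM/NB/LR models, while here plain callables stand in for them (illustrative only):

```python
def stack_predict(base_models, meta_model, x):
    """Two-level stacking: base learners' outputs feed the meta learner.

    `base_models` is a list of callables x -> prediction, and
    `meta_model` maps the list of base predictions to the final class.
    """
    level0 = [model(x) for model in base_models]   # base level (level 0)
    return meta_model(level0)                      # meta level (level 1)
```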
4 Implementation The implementation is carried out in two phases. The first phase involves implementing the individual algorithms (SVM, NB, and LR). The second phase consists of hybridizing these individual algorithms using the ensemble technique, i.e. stacking. Two medical datasets, breast cancer and Pima diabetes, are considered for the implementation.
4.1 Data Pre-processing
Algorithm: Data Pre-processing Input: Raw text dataset. Output: Pre-processed dataset Step 1—Count the number of rows and columns in the dataset for better understanding, and note the type of each attribute. Step 2—Check for null values in the dataset. If found, eliminate them or replace them with the mean value. Step 3—Check for attribute pairs with more than 95% correlation between them. If found, such attributes have to be eliminated.
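Steps 2 and 3 can be sketched as below (helper names ours): missing entries are replaced by the column mean, and Pearson correlation is the quantity checked against the 95% cut-off:

```python
def impute_with_mean(column):
    """Step 2: replace missing (None) entries with the column mean."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def pearson(xs, ys):
    """Step 3: correlation coefficient compared against the 0.95 cut-off."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```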
4.2 Feature Selection Algorithm: Feature Selection Input: Pre-processed dataset Output: Variable-selected/reduced dataset Step 1—Check for attributes which are unnecessary and have lesser influence on the outcome. These attributes are dropped so as to build a more efficient model and also to reduce the computational cost.
4.3 Data Partitioning
Algorithm: Data Partitioning Input: Reduced dataset Output: Train and test dataset Step 1—Divide the dataset into train and test with a 3:1 ratio.
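A minimal sketch of the split (shuffle the rows beforehand in practice; function name ours):

```python
def partition(rows, train_fraction=0.75):
    """Split rows into train and test sets, 3:1 by default."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]
```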
4.4 Support Vector Machine
Algorithm: Support Vector Machine Input: Train dataset Output: SVM trained model. Step 1—The main objective of the SVM algorithm is to plot a hyperplane in N-dimensional space which classifies the data points. Step 2—SVM is able to ignore outliers and choose the hyperplane that maximizes the margin. Step 3—The training dataset is trained using the SVM algorithm. Step 4—The prediction model is then tested using the test dataset. Step 5—Parameters like accuracy, precision, recall, F1-score, and ROC curve are used to measure the efficiency of the model.
4.5 Naive Bayes
Algorithm: Naive Bayes Input: Train dataset Output: NB trained model. Step 1—Naïve Bayes is a probabilistic classifier; the class having maximum probability is considered the most suitable class. Step 2—The training dataset is trained using the NB algorithm. Step 3—The prediction model is then tested on the test dataset. Step 4—The model is measured on parameters like accuracy, precision, recall, F1-score, and ROC curve.
4.6 Logistic Regression
Algorithm: Logistic Regression Input: Train dataset Output: LR trained model. Step 1—Logistic regression is a go-to method for binary classification problems. It fits a sigmoid curve in which the output values are mapped between 0 and 1 rather than to a numeric value. Step 2—Train the dataset using the LR algorithm. Step 3—The prediction model is tested on the test dataset. Step 4—Parameters like accuracy, precision, recall, F1-score, and ROC curve are used to measure the efficiency of the model.
5 Proposed Hybrid Algorithm The proposed hybrid algorithm integrates the individual classifiers through the stacking ensemble technique. Algorithm: Ensembled Algorithm Input: Train dataset Output: SVM-NB-LR trained model. Step 1—The individual algorithms are hybridized using the ensemble technique. Step 2—Any two of the algorithms are used at the base level, and the other is used as the meta layer (final estimator) for training the dataset. Step 3—The prediction model is tested on the test dataset. Step 4—Parameters like accuracy, precision, recall, F1-score, and ROC curve are used to measure the efficiency of the model.
Base Level (Level 0): In this combination, we combine the features of the support vector machine and logistic regression algorithms at level 0. Meta Level (Level 1): The output of the combined support vector machine and logistic regression algorithms is fed as input to the Naive Bayes algorithm at level 1. Base Level (Level 0): In this combination, we combine the features of the support vector machine and Naive Bayes algorithms at level 0.
Meta Level (Level 1): The output of the combined support vector machine and Naive Bayes algorithms is fed as input to the logistic regression algorithm at level 1. Base Level (Level 0): In this combination, we combine the features of the logistic regression and Naive Bayes algorithms at level 0. Meta Level (Level 1): The output of the combined logistic regression and Naive Bayes algorithms is fed as input to the support vector machine algorithm at level 1. Figure 1 explains the complete methodology of the system, whereas Figs. 2, 3, and 4 explain the proposed hybrid techniques. The different stacking combinations of the classification algorithms were implemented and tested.
Fig. 2 Stacking algorithm of SVM-LR-NB
Fig. 3 Stacking algorithm of SVM-NB-LR
Fig. 4 Stacking algorithm of LR-NB-SVM
6 Results and Discussions The experiment was conducted on a Windows machine with 8 GB RAM and 164 GB of storage. The medical datasets considered for experimentation are the breast cancer dataset and the Pima diabetes dataset, each of 1 GB size. The performance metrics considered for the comparative study are as follows. Accuracy: Accuracy is the number of test cases classified correctly divided by the total number of test cases.

Accuracy = (TP + TN)/(TP + FP + FN + TN)    (2)
Precision: Precision is used to identify the correctness of classification.

Precision = (TP)/(TP + FP)    (3)
Recall: Recall is the number of positive cases classified correctly out of the total number of positive cases.

Recall = (TP)/(TP + FN)    (4)
F1-score: It is the harmonic mean of precision and recall.

F1-score = 2 ∗ (Recall ∗ Precision)/(Recall + Precision)    (5)
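Equations (2)-(5) can be evaluated together from the four confusion-matrix counts (helper name ours):

```python
def classification_metrics(tp, fp, fn, tn):
    """Eqs. (2)-(5): accuracy, precision, recall and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * (recall * precision) / (recall + precision)
    return accuracy, precision, recall, f1
```

For example, with tp = 8, fp = 2, fn = 2, tn = 8, all four metrics equal 0.8.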
Table 1 Performance analysis of existing classification algorithms

             Breast cancer dataset (in %)            Pima diabetes dataset (in %)
Algorithm    Accuracy  Precision  Recall  F1 score   Accuracy  Precision  Recall  F1 score
SVM          97.6      97.2       99.0    98.1       74.8      70.3       53.5    60.81
NB           91.2      95.0       90.6    92.8       78.7      73.97      64.2    68.7
LR           97.6      96.3       100     98.1       76.6      73.4       55.9    63.5
6.1 Performance Analysis of Existing Classification Algorithms Table 1 shows the performance analysis of the existing classification algorithms in terms of precision, recall, F1-score, and accuracy on the breast cancer and Pima diabetes datasets. SVM, NB, and LR are used to classify the cancer dataset into malignant and benign. SVM and LR give the highest accuracy of 97.66%. SVM gives the highest precision of 97.24%. LR gives the highest recall of 100% compared with SVM and NB, and the highest F1-score of 98.16%. On the Pima diabetes dataset, NB performs best on all parameters, with 78.78% accuracy, 73.97% precision, 64.28% recall, and 68.78% F1-score.
6.2 Performance Analysis of Proposed Stacking Technique Table 2 shows the performance of the stacking algorithms on the breast cancer and Pima diabetes datasets, measured using accuracy, precision, recall, and F1-score. The results indicate that the proposed stacking technique is more efficient than the existing classification techniques in terms of accuracy, precision, and F1-score. The stacking technique achieves 97.66% accuracy when either SVM or LR is used as the meta layer. With 97.24% precision, the SVM meta layer has again performed well. LR as the meta layer has performed well with 100% recall and a 98.16% F1-score. On the Pima diabetes dataset, the highest accuracy

Table 2 Performance analysis of proposed stacking techniques

             Breast cancer dataset (in %)            Pima diabetes dataset (in %)
Stacking     Accuracy  Precision  Recall  F1 score   Accuracy  Precision  Recall  F1 score
LR-NB-SVM    97.66     97.24      99.06   98.14      77.92     70.37      67.85   69.09
SVM-LR-NB    91.22     95.09      90.65   92.82      76.62     70.83      60.71   65.38
SVM-NB-LR    97.66     96.39      100     98.16      76.19     73.01      54.76   62.58
of 77.92% is obtained when SVM is used as the meta layer. With 73.01% precision, the LR meta layer has performed well. The highest recall of 67.85% and F1-score of 69.09% are obtained with SVM as the meta layer.

Fig. 5 Existing classification algorithm's ROC curve for breast cancer dataset
6.3 Comparative Study Analysis of Existing and Proposed Stacking Technique Breast Cancer Dataset Figure 5 describes the ROC and AUC of the existing classification models SVM, NB, and LR on the breast cancer dataset. Gaussian NB shows about a 0.2 variation in ROC compared with LR and SVM. Figure 6 describes the ROC and AUC of the stacking classifiers with logistic regression as the meta layer on the cancer dataset. Pima Diabetes Dataset Figure 7 describes the ROC and AUC of the existing classification models SVM, NB, and LR on the diabetes dataset. SVC shows about a 0.1 variation in ROC compared with logistic regression and Gaussian NB. Figure 8 describes the ROC and AUC of 0.84 for the stacking classifiers with Naïve Bayes as the meta layer on the diabetes dataset.
Fig. 6 Proposed stacking classification ROC curve for breast cancer dataset
Fig. 7 Existing classification algorithm's ROC curve for diabetes dataset
7 Conclusion Classification algorithms are one of the important aspects of data analysis and prediction, with applications ranging from agriculture to national defence. Various studies have been carried out extensively on classification techniques covering a wide range of applications. The experiment was conducted on two medical datasets, breast cancer and diabetes. The existing classification algorithms,
Fig. 8 Proposed stacking classification ROC curve for diabetes dataset
i.e. support vector machine, Naïve Bayes, and logistic regression, are implemented on the datasets to understand their limitations in terms of precision, accuracy, and F1-score. To overcome these limitations, a stacking ensemble technique is proposed. The results indicate that the stacking ensemble technique gives better accuracy, precision, and F1-score than the existing classification algorithms. The performance of the algorithms is measured through parameters like accuracy, precision, recall, F1-score, and ROC curve. The comparative study analysis shows that the stacking technique with LR and NB at the base layer and SVM as the meta layer has performed better than the other combinations.
References 1. H. De Oliveira, M. Prodel, V. Augusto, Binary classification on French hospital data: benchmark of 7 machine learning algorithms, in SMC 2018, Miyazaki, 9 Oct 2018 2. V. Bahel, S. Pillai, M. Malhotra, A comparative study on various binary classification algorithms and their improved variant for optimal performance, Jan 2020 3. Y.I.A. Rejani, S.T. Selvi, Early detection of breast cancer using SVM classifier technique (2009) 4. D.C. Asogwa, S.O. Anigbogu, I.E. Onyenwe, F.A. Sani, Text classification using hybrid machine learning algorithms on big data (2019) 5. C.F. Kurz, W. Maier, C. Rink, A greedy stacking algorithm for model ensembling and domain weighting (2020) 6. P.M. Dhulavvagol, S.G. Totad, A.S. Meti, V. Shashidhara, An adaptive and dynamic dimensionality reduction method for efficient retrieval of videos, in Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2018, ed. by K. Santosh, R. Hegadi. Communications in Computer and Information Science, vol. 1035 (Springer, Singapore, 2019) 7. C. Zuo, A.I. Karakas, R. Banerjee, A hybrid recognition system for check-worthy claims using heuristics and supervised learning
8. P.M. Dhulavvagol, S.G. Totad, S. Sourabh, Performance analysis of job scheduling algorithms on Hadoop multi-cluster environment, in Emerging Research in Electronics, Computer Science and Technology, ed. by V. Sridhar, M. Padma, K. Rao. Lecture Notes in Electrical Engineering, vol. 545 (Springer, Singapore, 2019). https://doi.org/10.1007/978-981-13-5802-9_42 9. K.M. Mendez, S.N. Reinke, D.I. Broadhurst, A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification (2019)
Conceptual Study of Prevalent Methods for Cyber-Attack Prediction S. P. Sharmila and Narendra S. Chaudhari
Abstract An autonomous, proactive intrusion prediction system (IPrS), which monitors the network at every instant and for every event to predict an upcoming attack, is needed for the security of systems. Intrusion detection systems (IDS) and intrusion prevention systems (IPS) exist for detecting and preventing intrusions; however, analysis of the integrated functionality of the various components in such systems has not been fully reported in the literature. In this paper, we review various techniques for attack prediction as reported by researchers for IDS and IPS. We identify the limitations of these existing techniques for attack prediction. We propose a model to predict an attack before it occurs by integrating the functionalities of various components in IDS and IPS, namely a security database, attack graphs, the hidden Markov model, a proactive IDS and an automated IRS. Keywords Attack prediction · Attack graph · Deep learning · Hidden Markov model
1 Introduction A system that monitors the flow of data in a network to detect malicious behaviour in a system under protection is termed an intrusion detection system (IDS). Responses are generated from an IDS when a critical security incident occurs. An IDS is meant for reporting attacks that have happened but cannot prevent them. To identify the attacker's plans, an IDS needs to be extended to take preventive measures. Responses from an IDS can be analysed to take relevant preventive measures against an upcoming risky attack [1, 2]. S. P. Sharmila (B) Department of Information Science and Engineering, Siddaganga Institute of Technology, Tumakuru, India e-mail: [email protected] N. S. Chaudhari Department of Computer Science and Engineering, Indian Institute of Technology, Indore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_47
Some attacks that have an adverse effect on the network topology are brute force, botnets, DoS, DDoS, SQL injection, infiltration, Heartbleed, zero-day attacks and phishing [3]. Other malicious attacks such as shellcode, backdoors, fuzzers and reconnaissance attacks, which observe network traffic and topology in order to intrude, have emerged as evolving attacks. Traditional IDS and IPS are passive defence mechanisms; they do not have the ability to predict attacks, and they introduce a delay in providing defensive actions in response to attacks. Study of cyber-attack prediction methods has recently acquired high significance. It is difficult to find a single approach to solve issues related to cyber-attacks, as most approaches depend on task-specific algorithms [4] and also need a method for representational learning. Extracting intrinsic features of a dataset and using them to generalize to test cases is an essential part of representational learning. To observe network traffic in attack scenarios, we need to extract intrinsic features; representational learning applies automated learning to generate the concepts.
2 Background Work 2.1 Identifying the Attack Scenario With the intention of achieving a specific goal, namely unauthorized access to a secured system, an adversary implements a set of malicious activities; these malicious activities can form a multistep or combined attack scenario. An attack A contains various attack phases ap1, ap2, ap3, … Each attack phase is a composition of several logically related attack events ae1, ae2, ae3, … An intrusion detection system (IDS) detects an intrusion and generates a huge amount of alert data. Analysing this alert data in order to recognize the attack, extract features and isolate the information I related to attack events is the initial step in rebuilding the security system. The extracted information I can be used to review and assess the state S of the network and predict attacks over it. In this paper, we propose a novel method to find a learning function F: {ae1, ae2, …, aen} which accepts a variable-length input sequence and outputs a prediction of the target event aetarget for a given system. There are seven major attack scenarios, namely worm, phishing, botnets, DDoS, Web attack, escalating remote privileges and ransomware attacks. For the administrator to respond proactively to an attack and to protect the network from the threat, it is essential to analyse the alert data.
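As a minimal illustration of such a function F (a first-order approximation only; the event names are hypothetical), the most likely next event can be estimated from transition counts over historical sequences:

```python
from collections import Counter, defaultdict

def train_predictor(sequences):
    """Count event-to-event transitions over historical attack sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, last_event):
    """F: given the latest observed event, return the most likely next event."""
    followers = counts.get(last_event)
    return followers.most_common(1)[0][0] if followers else None
```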
2.2 Prediction of Attacks The main approaches for predicting attacks in a network are given below:
1. Using a statistical approach: this includes time series approaches, autoregression, linear regression and least squares regression.
2. Using an algorithmic approach: this includes data mining, machine learning and probabilistic models.
Predicting an attack in real time is a complex problem. The challenges we come across include recognizing attacks in noisy data and keeping track of attacks across different machines and over time.
2.3 Various Techniques for Attack Prediction This section includes a state-of-the-art survey of various techniques proposed by several researchers; they can be broadly identified under four categories.
1. Learning and modelling based on historical data of the attack.
2. Attack prediction based on analysis of attack graphs [5] and application of collaborative filtering.
3. Cyber-attack prediction with deep learning approaches.
4. Estimating the security state assessment and prediction of the next attack [6], via:
   i. intrusion alert analysis and processing [7]
   ii. attack probability
   iii. fusing multi-source heterogeneous data
   iv. prediction with HMM [8].
Attack graph analysis, deep learning approaches and the hidden Markov model, and their contributions to cyber-attack prediction, are presented in Sects. 3, 4 and 5, respectively.
3 Attack Graph Analysis Ammann et al. [9] introduced an algorithm, polynomial in the size of the network, that generates a graph containing the worst-case scenarios. Its complexity is XVn² for maximal access and n³ for minimal access, so XVn² + n³ altogether for X exploits, V vulnerabilities and n hosts. This approach performs well, as it is scalable to practical networks, but it cannot guarantee returning all relevant paths because it focuses only on optimal attack paths; whenever repair of the network is concerned, a network analyst has to make suboptimal choices. Ammann et al. presented an intuitive model that demonstrated the practicality of the approach. Ou et al. [10] implemented the Multi-host, Multi-stage Vulnerability Analysis Language (MulVAL), a practical, logic-based security analyzer for a network. This vulnerability analysis tool models the configurations of the network along with the interaction of software bugs in the network. All relevant
information is encapsulated within the system, but a bug-reporting community provides the information related to the bugs. A logic-based approach is proposed in [11]; it uses logical inference to derive concluding facts from the initial evidence, based on certain deduction rules, to construct the attack graph; as the attack graph grows, the performance of the approach degrades. Ingols et al. [12] introduce the notion of reachability by group to moderate the intricacy in a rational way. To find whether network hosts are reachable or unreachable in the prerequisite graphs, they employ breadth-first search (BFS). Xie et al. [13] generate the attack graph using a bidirectional search method. They demonstrate the limits of applying depth of search for identifying possible attacks. They extend this work to meet the necessary requirements and show that the prerequisite graphs can be refined by timely updating of information about firewalls, client-side attacks and intrusion detection, as reported in [14]. Many researchers have introduced various methods for attack graph generation and further analysis [15]. Attack graph generation and analysis is the strategy employed by most cyber-attack prevention technologies to identify all probable paths that attackers can exploit to gain unlawful access to any surveillance system. For generating attack graphs, topological vulnerability analysis (TVA) is another tool available in addition to MulVAL [15, 16]. Analysing attack susceptibility on a network by studying its topological conditions is the basis of TVA. A search algorithm identifies the attack paths that exploit various vulnerabilities; exploitation of these vulnerabilities is captured by the preconditions and postconditions of the exploit dependency graph. Thus, study based on the topology of the network plays a major role in predicting any attack. Artificial intelligence has a major role in cyber security.
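The reachability idea of Ingols et al. can be sketched as a plain BFS over an exploit-dependency graph (a simplified illustration, not their implementation; host names are hypothetical):

```python
from collections import deque

def reachable_hosts(edges, start):
    """Breadth-first search over an exploit-dependency graph.

    `edges` maps a host to the hosts an attacker can reach from it via
    some exploit; returns every host reachable from `start`.
    """
    seen, queue = {start}, deque([start])
    while queue:
        host = queue.popleft()
        for nxt in edges.get(host, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```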
One such approach, proposed by Ghosh [17], is based on artificial intelligence combined with other customized algorithms, which they name Planner, and produces an attack graph in polynomial time. This approach has an advantage from the performance point of view. Poolsappasit et al. [18] showed the use of a Bayesian method for attack graph generation in dynamic security risk management. Techniques like virtual shared memory abstraction, a multi-agent system and hypergraph partitioning, amalgamated with depth-first search (DFS), are employed in [19] to improve the overall performance of the system; their approach uses a distributed attack graph generation algorithm. With the use of agents beyond a certain graph size, it is shown that performance is also upgraded by this approach. The use of a probabilistic model is proposed in [20], which considers dynamic network features to compute risk probability and measure security risk. Another similar approach [21] uses a dynamic generation algorithm to compute the full attack graph by returning the leading K paths. Collaborative filtering is employed in various recommendation systems. Attack graph analysis can be combined with such an approach for forecasting attacks within a risk management system [22]. A graph containing different attack paths can be synthesized after analysing the network; then multi-level collaborative filtering is applied to detect the next susceptible attack on that network. This would
also forecast the attacker's next activity in gaining access to whatever resource he is looking for. This approach has a flaw: because it employs a recommendation technique, the process of discovering an attack path is slow, and searching the entire length of the path becomes slower as the network grows dynamically in real time.
4 Attack Prediction with Deep Learning Approach As the information processing architecture of deep learning is nonlinear, it can easily be adapted to learn from network traffic, so that network packets can be classified as benign or malicious. All these approaches combine task-specific algorithms with representational learning. Significant models relevant to our work are shown in Table 1. They analyse the raw data and extract intrinsic features of network traffic, thereby generating a representation of the attack scenario.

Table 1 Significant deep learning methods for attack prediction

   Title                                                    Methodology                                      Efficiency
1. Tiresias model [23]                                      RNN                                              Successful in predicting an adversary's next event
2. Attack detection in mobile cloud [24]                    RBM in greedy layer-wise learning algorithm      Accuracy of 97.11%
3. Malware prediction [25]                                  RNN to classify benign and malicious             Accuracy of 96.01%
4. Detecting port scan attempts [26]                        SVM                                              Detected port-scanning attempts on a host machine; accuracy of 97.80%
5. Scale hybrid IDS AlertNet model [27]                     Distributed and parallel ML algorithms           Successful in handling all events in a large network
6. Filter-based feature engineering for wireless IDS [28]   Feed-forward deep neural network (DNN)           Accuracy of 99.69%
7. IDS for IoT [29]                                         Genetic algorithm [30] and deep belief network   Detection rate of 99% on the NSL-KDD dataset

Shen et al. proposed the Tiresias model [23]. By collecting more than 3 billion security events from an intrusion prevention system (IPS), they could forecast the succeeding events on a machine by observing the history of events with a recurrent neural network (RNN) model. Though the model predicts the next event and is stable over time, Tiresias, being a statistical RNN model, found it pretty difficult to predict rare intrusion attempts, and fake actions of an
adversary can influence its decisions before attacking the victim. However, an RNN with long-term memory is successful in predicting an adversary's next event, and hence it is useful in attack prediction. The usefulness of a nonlinear transformation before the training process was demonstrated by Nguyen et al. [24]. They apply a restricted Boltzmann machine (RBM) in a greedy layer-wise learning algorithm. Their approach detects and isolates cyber-attacks in mobile clouds with an accuracy of 97.11%; however, fine-tuning was needed to detect attacks, and hence labelled data was used for training the weights. Recurrent neural networks (RNNs) were used on a snapshot of behavioural data by Rhode et al. [25] to classify the benign or malicious nature of executable code, achieving an accuracy of 96.01%. An IDS modelled with SVM on the CICIDS 2017 dataset, which successfully detected port-scanning attempts on a host machine, was implemented by Aksu and Aydin [26]. The deep learning model achieved an accuracy of 97.80%, whereas the SVM model reported 69.79%. Another deep learning approach improved accuracy with SVM on the NSL-KDD and KDD99 datasets through dimensionality reduction and feature learning; besides enhancing prediction accuracy, this model also reduced the time consumed for training and testing. A sparse autoencoder is used for unsupervised pretraining, and an SVM is then deployed on the transformed feature space for attack detection. Deep autoencoder-decoders found versatile applications from then on. To detect attacks on a 5G IoT network, an autoencoded deep and dense neural network (DNN) algorithm was applied by Rezvy et al. [27]. Unsupervised pretraining with autoencoders reduces the data from a high-dimensional to a low-dimensional representation before applying the DNN; hence they call it a two-step approach.
Though this model achieves better performance, it deteriorates for a large variety of attacks, and performance estimation becomes too tedious for continuously evolving attacks. Monitoring the network traffic also becomes intricate; anomalies present in the network traffic can be tracked with the scale-hybrid-IDS-AlertNet model of Vinayakumar et al. [28]. As distributed and parallel machine learning algorithms with various optimization techniques were employed in the model, it can handle all events even in a large network. Even in wireless networks it is feasible to address similar anomalies and deal with such security issues: Kasongo and Sun [29] developed an IDS using a feed-forward deep neural network (DNN) and evaluated it with the NSL-KDD dataset. Whenever we deal with network attack prediction, training the model on an existing dataset is not sufficient; the model must have the cognitive capability to adapt to continuously evolving attacks that are feasible in the future. To detect attacks on an IoT network, Zhang et al. [31] developed an adaptive model that amalgamates a genetic algorithm (GA) [30] and a deep belief network (DBN). The DBN can be applied effectively to classify attacks on the optimal network structure generated by the GA. As the GA is powerful in executing the iterations needed to generate the optimal structure of the network, it contributes to improving the classification accuracy.
Conceptual Study of Prevalent Methods for Cyber-Attack Prediction
5 Hidden Markov Model Approach

Yu and Frincke [7] proposed a hidden colored Petri-net model whose objective is to predict the next goal of the intruder. From a concrete attack scenario, they derive IDS alert names, which form the base for the action set. Though mathematical modelling was used, this model has some performance issues due to the significant increase in pre- and post-conditions as actions are added into the model. Fava et al. [32] proposed a training model with an offline and real suffix tree. As it was an offline model, it is possible to use diversified and distinctive symbols with a limited suffix tree to analyse the results of the trained model. However, the issue with such a system is that it faces a continuous sequence of new attack traces indefinitely, and the system will run out of memory or other resources in real time. Another model was proposed by Du et al. [33] to observe future targets of multistage attacks by studying the attack history. With this approach, it is easy to evaluate the proficiency and competence of the attacker based on the history of his attacks and the visible vulnerabilities of his objectives. Prediction estimates can be computed based on the suffix tree, as it generates inference rules from a variable-length Markov model (VLMM). The IDS reports alert information that includes an XML alert tag, and the suffix tree is constructed from the symbols obtained from this alert information. Another significant research effort is based on a finite state machine (FSM) in 2013 [34], in an intrusion response system for multistep attacks. Being a detection component of the system, the FSM raises an alert after every state change in the state machine without predicting the growth or advancement of the next attack. The observed hindrance is weight assignment for states whenever there is a multistep attack scenario.
Markov processes are employed to generate responses automatically for intrusions in the automated intrusion response system of Zonouz et al. [35]. They decide the best response with a two-player Stackelberg stochastic game, modelled as a partially observable competitive Markov decision process (POMDP). This system cannot predict the responses proactively for an attack. Katkar et al. [6] implemented a signature-based IDS using a Naive Bayes classifier to detect DDoS attacks, but this model lacks the capability of predicting the attack once the DDoS multistep attack is in progress. Holgado et al. [2] proposed an approach to predict multistep attacks with a modified hidden Markov model, using the alerts generated by an IDS. They simulated a DDoS attack with the DARPA LLDDoS dataset in a virtual network and achieved a 95% probability of success for DDoS attack prediction.
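As background for these HMM-based predictors, the core computation they share, updating state beliefs from a stream of IDS alerts, can be sketched with the standard forward recursion. This is a generic illustration only, not any cited system's implementation; the states, alert symbols and probabilities below are invented:

```python
def forward(obs, states, T, E, pi):
    """Forward recursion: alpha[t][s] = P(o_1..o_t, state_t = s).

    T[s][s2]: transition probability s -> s2
    E[s][o]:  probability of emitting alert symbol o in state s
    pi[s]:    initial state distribution
    """
    alpha = [{s: pi[s] * E[s][obs[0]] for s in states}]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append({
            s: E[s][o] * sum(prev[s2] * T[s2][s] for s2 in states)
            for s in states
        })
    return alpha

# Toy two-state attack model: "recon" then "exploit", emitting
# IDS alert symbols "scan" or "payload" (all numbers invented).
states = ["recon", "exploit"]
pi = {"recon": 0.9, "exploit": 0.1}
T = {"recon":   {"recon": 0.6, "exploit": 0.4},
     "exploit": {"recon": 0.1, "exploit": 0.9}}
E = {"recon":   {"scan": 0.8, "payload": 0.2},
     "exploit": {"scan": 0.3, "payload": 0.7}}

alpha = forward(["scan", "scan", "payload"], states, T, E, pi)
last = alpha[-1]
total = sum(last.values())
posterior = {s: last[s] / total for s in states}
# After two scans and a payload alert, the belief shifts toward "exploit".
```

Normalizing the final column of alpha gives the posterior over attack stages, which is what a proactive predictor would act on.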
6 Proposed Model

Based on the strength of attack graphs, hidden Markov models and deep learning approaches as revealed in the literature survey, we propose a model to detect an attack
and generate an attack graph for analysis. By employing deep learning approaches and the modified hidden Markov model, our intrusion prediction system has the capability to proactively respond to an attack and to keep gathering prediction details even while the attack is in progress. An HMM is a discrete first-order model with two stochastic processes: a hidden process representing the state variable and an observation process representing the alert information. It can be formally defined by (1), and the modified HMM is shown in (2). Table 2 describes the existing and modified terms used.

H = {Observations, States, T_mat, P_mat, D_vector}   (1)

H_modified = {Observations, States, T_mat, P_mat, D_vector, H_mean}   (2)
The proposed model, given in Fig. 1, consists of the following components.

1. Security database: It includes various multistep attacks attempted on a particular IP address and other attack history.
2. Attack graph generation module: It generates the attack graph for analysing network connectivity and future vulnerabilities. The graph is updated with the trained data every time.
3. Modified HMM model: Expectation maximization is applied in HMM training, which is the core of our prediction module.
4. Proactive IDS: Detects attack events proactively, even in online mode while an attack is in progress.
5. IRS: A cognitive, fully automated intrusion response system that generates responses and takes preferable actions in response to the attack.
Table 2 Existing and modified parameters in HMM

Parameter    | Existing or modified | Description
Observations | Existing             | Possible observations based on the IDS alerts and report
States       | Existing             | Set of HMM states corresponding to similar attack steps
T_mat        | Existing             | N × N transition probability matrix describing the transitions between states of set S, according to alerts from IDSs
P_mat        | Existing             | Observation probability matrix, which is composed of probability vectors for each state
D_vector     | Existing             | Initial distribution vector which indicates the probability of the initial state of the intrusion
H_mean       | Modified             | Vector of the mean number of alerts for state i in the HMM, for each multistep attack
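To make the parameter roles in Table 2 concrete, a minimal container for the modified HMM of Eq. (2) could look like the following. This is a sketch only; the field names, shapes and example values are our own, not the authors' implementation:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ModifiedHMM:
    # Parameters from Table 2; lists of lists stand in for matrices.
    observations: List[str]       # possible IDS alert symbols
    states: List[str]             # attack-step states
    t_mat: List[List[float]]      # N x N transition probability matrix
    p_mat: List[List[float]]      # observation probability vectors per state
    d_vector: List[float]         # initial state distribution
    h_mean: List[float] = field(default_factory=list)  # mean alert counts per state (the modified term)

    def is_modified(self) -> bool:
        # Eq. (2) extends the classical five-tuple of Eq. (1) with H_mean
        return len(self.h_mean) > 0

hmm = ModifiedHMM(
    observations=["scan", "payload"],
    states=["recon", "exploit"],
    t_mat=[[0.6, 0.4], [0.1, 0.9]],
    p_mat=[[0.8, 0.2], [0.3, 0.7]],
    d_vector=[0.9, 0.1],
    h_mean=[3.2, 1.5],
)
```

Dropping `h_mean` recovers the classical five-parameter HMM of Eq. (1).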
Fig. 1 Proposed model
7 Conclusion and Future Work

In this paper, some methods for predicting attacks in a network are reviewed, identifying the benefits of the modified hidden Markov model for an autonomous intrusion prediction system. With the focus on predicting attacks before they occur, we proposed a model that includes the following components: security database, attack graphs, hidden Markov model, proactive IDS and IRS. We gave a literature review in which the strengths of each such component have been successfully demonstrated by various researchers earlier. Our model integrates these components for a synergetic effect to the benefit of intrusion prediction.
References

1. C.W. Geib, R.P. Goldman, Plan recognition in intrusion detection systems, in Proceedings DARPA Information Survivability Conference and Exposition II. DISCEX'01, vol. 1 (IEEE, 2001)
2. A. Sathesh, Enhanced soft computing approaches for intrusion detection schemes in social media networks. J. Soft Comput. Paradigm (JSCP) 1(02), 69–79 (2019)
3. S.P. Sharmila, Balaji Rao Katika, Classification model for phishing e-mails with a data mining approach, presented at the International Conference on Advances in Engineering Science and Management (ICAESM 2021), Wainganga College of Engineering and Management, Nagpur, 25–26 Mar 2021, published in IJERCSE 8(4), 55 (April 2021)
4. S.P. Sharmila, Harsha P. Moger, An operative application of distributed ledger technology for banking domain. Int. J. Comput. Sci. Mobile Comput. 10(7), 68 (July 2021). https://doi.org/10.47760/ijcsmc.2021.v10i07.010
5. O. Sheyner et al., Automated generation and analysis of attack graphs, in Proceedings 2002 IEEE Symposium on Security and Privacy (IEEE, 2002)
6. S.A. Zonouz et al., RRE: a game-theoretic intrusion response and recovery engine. IEEE Trans. Parallel Distrib. Syst. 25(2), 395–406 (2013)
7. D. Yu, D. Frincke, Improving the quality of alerts and predicting intruder's next goal with hidden colored petri-net. Comput. Netw. 51(3), 632–654 (2007)
8. P. Holgado, V.A. Villagrá, L. Vazquez, Real-time multistep attack prediction based on hidden Markov models. IEEE Trans. Dependable Secure Comput. 17(1), 134–147 (2017)
9. P. Ammann et al., A host-based approach to network attack chaining analysis, in 21st Annual Computer Security Applications Conference (ACSAC'05) (IEEE, 2005). https://doi.org/10.1109/CSAC.2005.6
10. X. Ou, S. Govindavajhala, A.W. Appel, MulVAL: a logic-based network security analyser, in Proceedings of the 14th Conference on USENIX Security Symposium, vol. 14 (2005), p. 8
11. X. Ou, S. Govindavajhala, A.W. Appel, MulVAL: a logic-based network security analyser, in USENIX Security Symposium, vol. 8 (2005)
12. K. Ingols, R. Lippmann, K. Piwowarski, Practical attack graph generation for network defense, in 2006 22nd Annual Computer Security Applications Conference (ACSAC'06) (IEEE, 2006). https://doi.org/10.1109/ACSAC.2006.39
13. A. Xie et al., A probability-based approach to attack graphs generation, in 2009 Second International Symposium on Electronic Commerce and Security, vol. 2 (IEEE, 2009). https://doi.org/10.1109/ISECS.2009.113
14. K. Ingols et al., Modeling modern network attacks and countermeasures using attack graphs, in 2009 Annual Computer Security Applications Conference (IEEE, 2009). https://doi.org/10.1109/ACSAC.2009.21
15. X. Ou, A. Singhal, Attack graph techniques, in Quantitative Security Risk Assessment of Enterprise Networks (Springer, New York, NY, 2012), pp. 5–8. https://doi.org/10.1007/978-1-4614-1860-3
16. S. Jajodia, S. Noel, B. O'Berry, Topological analysis of network attack vulnerability, in Managing Cyber Threats (Springer, Boston, MA, 2005), pp. 247–266
17. N. Ghosh, S.K. Ghosh, A planner-based approach to generate and analyze minimal attack graph. Appl. Intell. 36(2), 369–390 (2012). https://doi.org/10.1007/s10489-010-0266-8
18. N. Poolsappasit, R. Dewri, I. Ray, Dynamic security risk management using Bayesian attack graphs. IEEE Trans. Dependable Secure Comput. 9(1), 61–74 (2011). https://doi.org/10.1109/TDSC.2011.34
19. K. Kaynar, F. Sivrikaya, Distributed attack graph generation. IEEE Trans. Dependable Secure Comput. 13(5), 519–532 (2015). https://doi.org/10.1109/TDSC.2015.2423682
20. H.M.J. Almohri et al., Security optimization of dynamic networks with probabilistic graph modeling and linear programming. IEEE Trans. Dependable Secure Comput. 13(4), 474–487 (2015). https://doi.org/10.1109/TDSC.2015.2411264
21. K. Bi, D. Han, J. Wang, K maximum probability attack paths dynamic generation algorithm. Comput. Sci. Inf. Syst. 13(2), 677–689 (2016). https://doi.org/10.2298/CSIS160227022B
22. N. Polatidis et al., From product recommendation to cyber-attack prediction: generating attack graphs and predicting future attacks. Evol. Syst. 11(3), 479–490 (2020). https://doi.org/10.1007/s12530-018-9234-z
23. Y. Shen et al., Tiresias: predicting security events through deep learning, in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (2018). https://doi.org/10.1145/3243734.3243811
24. K.K. Nguyen et al., Cyberattack detection in mobile cloud computing: a deep learning approach, in 2018 IEEE Wireless Communications and Networking Conference (WCNC) (IEEE, 2018)
25. M. Rhode, P. Burnap, K. Jones, Early-stage malware prediction using recurrent neural networks. Comput. Secur. 77, 578–594 (2018)
26. D. Aksu, M.A. Aydin, Detecting port scan attempts with comparative analysis of deep learning and support vector machine algorithms, in 2018 International Congress on Big Data, Deep Learning and Fighting Cyber Terrorism (IBIGDELFT) (IEEE, 2018)
27. S. Rezvy et al., An efficient deep learning model for intrusion classification and prediction in 5G and IoT networks, in 2019 53rd Annual Conference on Information Sciences and Systems (CISS) (IEEE, 2019)
28. R. Vinayakumar et al., Deep learning approach for intelligent intrusion detection system. IEEE Access 7, 41525–41550 (2019)
29. S.M. Kasongo, Y. Sun, A deep learning method with filter based feature engineering for wireless intrusion detection system. IEEE Access 7, 38597–38607 (2019)
30. B. Vivekanandam, Design an adaptive hybrid approach for genetic algorithm to detect effective malware detection in android division. J. Ubiquitous Comput. Commun. Technol. 3(2), 135–149 (2021)
31. Y. Zhang, P. Li, X. Wang, Intrusion detection for IoT based on improved genetic algorithm and deep belief network. IEEE Access 7, 31711–31722 (2019)
32. D.S. Fava, S.R. Byers, S.J. Yang, Projecting cyberattacks through variable-length Markov models. IEEE Trans. Inf. Forensics Secur. 3(3), 359–369 (2008)
33. H. Du et al., Toward ensemble characterization and projection of multistage cyber attacks, in 2010 Proceedings of 19th International Conference on Computer Communications and Networks (IEEE, 2010)
34. A. Shameli-Sendi et al., A retroactive-burst framework for automated intrusion response system. J. Comput. Netw. Commun. 2013 (2013)
35. S. Fayyad, C. Meinel, Attack scenario prediction methodology, in 2013 10th International Conference on Information Technology: New Generations (IEEE, 2013)
Performance Analysis of Type-2 Diabetes Mellitus Prediction Using Machine Learning Algorithms: A Survey B. Shamreen Ahamed, Meenakshi Sumeet Arya, and V. Auxilia Osvin Nancy
Abstract Diabetes Mellitus (DM) is an autoimmune disease that affects many people's everyday life across various age groups. Type-1 and Type-2 are the two main categories of diabetes. Many algorithms and techniques such as DT (Decision Tree), GB (Gradient Boost), LR (Logistic Regression), XGB Classifier, Extra Trees Classifier, LGBM, etc., have been proposed to predict diabetes. Prediction can be achieved by framing a predictive model for the onset of the disease, but a breakthrough in predicting it with higher accuracy is yet to be achieved. This survey aims at studying, analyzing and drawing inferences from the various predictive models designed and proposed by researchers, their merits, their shortcomings and the ways in which the prediction accuracy can be improved in future.

Keywords Diabetes mellitus · Machine learning · Diagnosis · Classifier algorithms · Prediction technique · Accuracy
B. Shamreen Ahamed (B) · M. S. Arya · V. Auxilia Osvin Nancy
SRM Institute of Science and Technology, Vadapalani Campus, No.1, Jawaharlal Nehru Road, Vadapalani, TN, India
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_48

1 Introduction

One of the most commonly found diseases is Diabetes Mellitus [1]. The most important factor for the onset of the disease is the ignorance of people regarding its causes and effects. Another major factor is the treatment given, as it involves the prediction mechanism used by the medical team. An effective prediction mechanism needs to be created to categorize the problem in a channelized format and to streamline the arising symptoms of a person in the initial stage itself. With the technical advancements in the health care sector, Data Analytics plays an important role in the field of Health Care. Data Analytics involves the conversion of raw data into the pre-defined information that is required during the study. This is the first step of prediction. This prediction procedure is implemented using ML algorithms and the
most accurate algorithm can be chosen depending on the dataset taken, and the accuracy of the prediction can be obtained. For the prediction of diabetic disease, datasets need to be collected from various patient sources and validated into static form for data prediction procedures. Each algorithm has a specified efficiency and advantage that can be used to carry out the research [2]. The datasets can include attributes such as Age, Glucose, BMI, Blood Pressure, etc. [1] and can be obtained from the following collection samples: urine test, oral glucose tolerance test (OGTT) and glycated hemoglobin test (A1C). Based on these tests, the medication procedures are started. However, the prediction procedures are not 100% accurate, as it is an autoimmune disease. In order to achieve good accuracy, machine learning algorithms can be evaluated and tested on the different patient detail datasets and the accuracy can be predicted [3]. This paper is segmented in the following way: Sect. 2 presents a systematic and detailed analysis of techniques proposed by many researchers for predictive analysis on well-known datasets pertaining to diabetes. A description of the related works on the various machine learning algorithms is given, followed by the algorithms and the corresponding techniques needed to predict diabetic disease in health care. Finally, Sect. 3 presents the final observations and outcomes of the study.
2 Implementation Techniques

Diabetic disease can be predicted using different technologies, as discussed in the sections below. However, a suitable method needs to be chosen and implemented to obtain the highest accuracy as the result. The term implementation in this context means the way in which the dataset can be given to the predictive model for obtaining the results. The accuracy percentage obtained by the model is calculated from the various attributes taken, and the algorithm producing the highest percentage value can be taken as the suitable algorithm [4]. The techniques and methods that can be used for prediction are given in the following sections. Section 2.1 describes the traditional methods of implementing the model, including the concepts of Big Data. Section 2.2 describes the various machine learning classifiers that can be used to solve the problem, such as SVM (Support Vector Machine), NB (Naive Bayes), RF (Random Forest), KNN (K-Nearest Neighbor) and LR (Logistic Regression). Section 2.3 describes the decision tree algorithms, including LGBM (Light Gradient Boosting Machine), GBM (Gradient Boosting Machine), J48 and XGB (XGBoost). Section 2.4 describes concepts of deep learning, mainly involving artificial neural networks for prediction. Section 2.5 describes the performance of the classifiers in hybrid form, i.e., combinations of different machine learning algorithms.
2.1 Traditional Methods

One of the traditional methods used for disease prediction is Big Data. Big Data techniques have been evolving over the years. Their main characteristic is holding a large amount of data from which the needed information has to be retrieved [5]. The data collection is very large, and therefore data management is a little more complex compared to the other technologies. The Big Data concept can be divided into the following categories: unstructured data, structured data and semi-structured data. Saravana Kumar et al. [6] have used a predictive methodology for analyzing diabetic data in Big Data using the Hadoop/MapReduce technique. The warehoused dataset was used with the MapReduce algorithm and sent to the Hadoop system. By using the Big Data analytics concept in Hadoop, a systematic solution for diabetic disease prediction was proposed. Olga et al. [7] have used the concept of PTHrP (Parathyroid hormone-related Protein) for Type-1 diabetic disease on real-time data involving 304 men and 558 women. The concepts of "Gray reflected Binary Code (Java), Cluster Analysis (iPython), Graph Analysis (iPython, Gephi), 3D Visualization (Java) and the Uran Supercomputer are used." The ADV (Advanced Data Visualization) method for patient inflow can reveal the groups of patients to be diagnosed. Aishwarya et al. [8] have used the concept of Big Data analytics along with K-means clustering. The data used is real-time data consisting of 800 records and 10 attributes. The data is initially retrieved from semi-structured, structured and unstructured contexts. It is then fed to the built predictive model. Machine learning algorithms are used with the clustering technique along with pipelining; the pipeline chosen is the AdaBoost classifier. The maximum accuracy is obtained by using the Logistic Regression technique, with 96% accuracy. By pipelining with the AdaBoost classifier, the accuracy is further improved to 98.8%.
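The Hadoop/MapReduce pattern referenced above can be illustrated with a toy, single-process sketch. This is a generic illustration of the map/reduce idea, not the authors' pipeline; the field names and numbers are invented:

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit (age_group, glucose) key-value pairs from raw patient records
    for rec in records:
        group = "under_40" if rec["age"] < 40 else "40_plus"
        yield group, rec["glucose"]

def reduce_phase(pairs):
    # Shuffle/sort by key, then reduce: average glucose per age group
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}

records = [
    {"age": 25, "glucose": 95}, {"age": 52, "glucose": 140},
    {"age": 33, "glucose": 105}, {"age": 61, "glucose": 160},
]
averages = reduce_phase(map_phase(records))
# averages -> {"under_40": 100.0, "40_plus": 150.0}
```

In an actual Hadoop deployment, the map and reduce phases would run distributed over many nodes; the logic per record stays the same.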
2.2 Machine Learning Classifiers

There are many ML classifiers that can be used for prediction and classification. Some of them are SVM, NB, RF, KNN and LR. Support Vector Machine is a supervised learning algorithm that can be used for regression and classification problems. It is used to generate the decision boundary, or best line, for dividing the n-dimensional space into many classes that are different from one another, in order to place each data point in the right category for future purposes. The hyperplane is known as the best decision boundary. To create the hyperplane, SVM selects vectors and extreme points. This introduces the concept of support vectors, which further gives rise to the algorithm called Support Vector Machine [9].
SVM uses the Lagrangian formulation mentioned below for classifying the testing samples:

d(X^T) = Σ_{i=1}^{l} y_i α_i X_i X^T + b_0   (1)
where the class label of the support vector is given as y_i, X_i X^T is the test tuple, and α_i, b_0 are numeric parameters [10]. The Naive Bayes ML algorithm is based on classification that divides the data using conditional probability values. Naive Bayes can be used for detecting the behavior of the different patients involved, and it can be combined with Logistic Regression for classifying the patients into different groups. It is an algorithm that works swiftly for all classification problems. It is good for predictions involving real-time data, multi-class problems, recommendation systems, text-based classification and sentiment analysis, and it is easy to implement for large datasets [11]. The Bayesian formula for calculating the Naive Bayes algorithm is as follows:

P(A|B) = P(B|A)P(A) / P(B)   (2)
where P(A|B) = posterior probability, P(B|A) = likelihood probability, P(A) = class prior probability and P(B) = predictor prior probability [12]. Random Forest is an estimator that fits a number of decision tree classifiers on sub-samples of the dataset; it improves the predictive accuracy and controls over-fitting [13]. The mean squared error for regression problems can be calculated as

MSE = (1/N) Σ_{i=1}^{N} (f_i − y_i)^2   (3)
where N is the number of data points, f_i is the value returned by the model and y_i is the actual value for data point i. The Gini value and entropy of classification models are calculated as follows [14]:

Gini = 1 − Σ_{i=1}^{C} (p_i)^2,   Entropy = Σ_{i=1}^{C} −p_i · log2(p_i)   (4)
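The impurity measures in (4) can be computed directly; a minimal Python sketch (toy probabilities only):

```python
import math

def gini(probs):
    # Gini = 1 - sum_i p_i^2
    return 1 - sum(p * p for p in probs)

def entropy(probs):
    # Entropy = -sum_i p_i * log2(p_i); 0*log(0) is taken as 0
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A perfectly mixed two-class node vs. a pure node
mixed = [0.5, 0.5]   # gini -> 0.5, entropy -> 1.0 (maximal impurity)
pure = [1.0]         # gini -> 0.0, entropy -> 0.0 (no impurity)
```

Both measures reach zero on a pure node, which is why either can serve as the split criterion in tree ensembles such as Random Forest.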
Logistic Regression is a statistical model used to identify the probability of a certain class or occurrence of an event. It is used for estimating
the parameters of a logistic model. The cost function, called the log loss or binary cross-entropy function, is given by

−(1/N) Σ_{i=1}^{N} [y_i ln(p_i) + (1 − y_i) ln(1 − p_i)]   (5)

where p_i is the probability that Y belongs to class 1. It is given by

p_i = P(y_i = 1) = e^Z / (1 + e^Z),   Z = β_0 + β_1 X_1 + β_2 X_2 + ⋯ + β_m X_m   (6)
where X_1, X_2, …, X_m are the features [15]. Yaxin et al. [16] have developed a diabetes classification-prediction model using SVM with an improved chaotic differential evolution (ICDE) algorithm. The data used is compared and analyzed with other SVM algorithms based on the Genetic Algorithm (GA), Particle Swarm Optimization (PSO) and Differential Evolution (DE). In the comparison, the improved SVM model, namely ICDE-RBF-SVM, reaches 94% and gives the highest accuracy of all the algorithms used. The concept of backward propagation is used. Jayanthi et al. [17] have done a survey on machine learning algorithms including KNN, NLP, NN and NB. The highest accuracy among these algorithms was obtained by SVM along with a clustering concept; the accuracy was 98.82%. The dataset used was the PIMA Indian Dataset, implemented using a hybrid concept. Priyanka et al. [18] have developed a model for determining Diabetes Mellitus disease. The ML algorithms used for the study include SVM, NB, DT and ANN. The dataset was collected globally and contains 9 attributes and 768 instances. The accuracies obtained from the mentioned algorithms are: Decision Tree 82%, Naive Bayes 77%, SVM 83% and ANN 82%. Therefore, it was concluded that SVM produced the highest accuracy. Narendra et al. [19] have performed an analysis for Diabetes Mellitus disease on the PIMA Indian Dataset. The Support Vector Machine is used with 4 kernels, namely Linear, Polynomial, Sigmoid and RBF. The accuracies of the SVM kernels are: Linear 77%, Polynomial 80%, RBF 82% and Sigmoid 69%. The RBF kernel produced the highest accuracy of 82% in comparison with the other kernels used on the dataset. Sisodia et al. [20] have proposed a comparative study for Diabetes Mellitus prediction using ML algorithms, namely the SVM, NB and DT algorithms. The dataset used for the study is the PIMA Indian Dataset taken from the UCI Repository.
The results are verified using the Receiver Operating Characteristic (ROC). The Naive Bayes algorithm produces the highest accuracy of 76.30%. Phattharat et al. [21] have proposed classification algorithms, including ANN, SVM, LR and NB, using a dataset collected from 12 hospitals in Thailand containing 22,094 records. The classifiers used include CHAID and Rapid
Miner Studio 7.0. The highest accuracy is predicted to be given by the Naive Bayes algorithm, with a percentage above 80% depending on the dataset used. Sadri et al. [22] have developed a performance comparison for the diagnosis of Diabetes Mellitus disease using classification algorithms on the dataset taken from the UCI Repository. The tool used is WEKA. The algorithms used for comparison are Naive Bayes, J48 and RBF Network. The highest accuracy is obtained by NB, with an accuracy percentage of 76.95%. Perveen et al. [23] have proposed a study mechanism for analyzing prediction on diabetes. The algorithms used are the J48 classifier, Naive Bayes and Decision Tree. K-Medoids down-sampling is done along with the Naive Bayes classifier. The accuracy obtained using Naive Bayes is 79%. Esmaily et al. [24] have compared the Decision Tree and Random Forest algorithms in machine learning concepts. The implementation was performed on 9528 people with a prevalence rate of 14%; the study was called the MASHAD study program. Of the algorithms used, the accuracy given by Decision Tree is 64.9% and by Random Forest is 71.1%. Sajal et al. [25] have applied 6 ML algorithms: SVM, KNN, GB, DT, RF and LR. The dataset used is the PIMA Indian Dataset. The models are compared on 5 performance metrics: accuracy, precision, recall, F1 score and area under curve (AUC). Random Forest has the highest accuracy at 84%, with recall at 76% and precision and AUC score at 83%. Gangil et al. [26] have used the concept of optimal feature selection in order to predict the diabetic disease in its initial stage. The dataset used in this paper is a cancer sample dataset that consists of 5000 patient entries. The authors used different algorithms which gave the following accuracies: the DT algorithm, with 98.20% accuracy, is considered to have the highest accuracy when compared to the other algorithms used in the paper.
The other algorithms used include RF with an accuracy of 98%, SVM with 77.73% accuracy and NB techniques with 73.48% accuracy. Juyoung et al. [27] have used the concept of a genetic algorithm on clinical data. The dataset has been collected from hospital patients. The Logistic Regression algorithm is implemented along with other algorithms, including SVM, C4.5 and KNN, on the collected data. The highest accuracy was obtained by Logistic Regression, with an accuracy percentage of 90.4%. Thanga Prasad et al. [28] discussed a pre-processing approach for both statistical and categorical information. They used CART, a Genetic Algorithm and SOFM to impute categorical values, and the Hadoop Distributed File System to handle large volumes of data. The predictive analysis is done by dividing it into stages: normal, initial stage and final stage. In the normal stage, the minimum value is 100 and the curing level is normal. In the initial stage, the minimum value is 101 and the maximum value is 125, and the curing level is "possible to cure". In the final stage, the minimum value is 126 and the maximum value is greater than 126, and the curing level is "impossible to cure".
Faranak et al. [29] have proposed a comparative study using 4 machine learning algorithms: KNN, SVM, LR and ANN. The best performance was considered based on AUC, sensitivity, specificity, ROC curve and range. The performance based on the mean AUC was: KNN 91%, SVM 95%, LR 95% and ANN 93%. Logistic Regression and Support Vector Machine were considered to have the better area under the curve. Zheng et al. [30] have used the concept of feature engineering and a semi-automated framework using ML algorithms to obtain a good percentage of accuracy. A framework has been developed for Type-2 Diabetes Mellitus (T2DM) that uses Electronic Health Records (EHR) for data. 300 samples were taken from the EHR between the years 2012 and 2014, consisting of 161 cases, 60 controls and 79 unconfirmed cases. ML algorithms such as NB, KNN, DT, SVM and Logistic Regression are used in the framework. An accuracy of 98% has been obtained on the collected data. Raghad et al. [31] have compared the analysis of several classification algorithms: K-Nearest Neighbor, Radial Basis Function Support Vector Machine (RBF SVM), Linear SVM, Sigmoid SVM, Linear Regression, LDA, CART and Naive Bayes. The dataset used is the PIMA Indian Dataset, and a 10-fold cross-validation test is conducted. The kernels used are RBF and Sigmoid. The accuracies obtained from the different algorithms are: K-Nearest Neighbor 73%, RBF SVM 61%, Linear SVM 77%, Sigmoid SVM 65%, Linear Regression 77%, LDA 77%, CART 70% and Naive Bayes 76%. Fikirte et al. [32] have developed a model using the PIMA Indian Dataset taken from the UCI Repository. The implementation is done using the R programming language in R-Studio. The algorithms used are SVM, J48, Naive Bayes and backward propagation. The highest accuracy is produced by back-propagation, with an accuracy percentage of 83.11%.
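The classifier formulas (1), (2) and (6) surveyed above can be sketched as small Python functions. All numbers below are invented toy values, not results from any of the surveyed studies:

```python
import math

def svm_decision(x, support_vectors, labels, alphas, b0):
    # Eq. (1): d(X^T) = sum_i y_i * alpha_i * (X_i . X^T) + b_0
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    return sum(y * a * dot(sv, x)
               for sv, y, a in zip(support_vectors, labels, alphas)) + b0

def bayes_posterior(likelihood, prior, evidence):
    # Eq. (2): P(A|B) = P(B|A) * P(A) / P(B)
    return likelihood * prior / evidence

def logistic_prob(x, betas):
    # Eq. (6): p = e^Z / (1 + e^Z), with Z = beta_0 + sum_m beta_m * x_m
    z = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    return math.exp(z) / (1 + math.exp(z))

# Toy 2-D examples (invented support vectors, priors and coefficients)
score = svm_decision((2.0, 0.5), [(1.0, 1.0), (-1.0, -1.0)],
                     [+1, -1], [0.5, 0.5], b0=0.0)   # sign gives the class
post = bayes_posterior(likelihood=0.8, prior=0.1, evidence=0.2)
prob = logistic_prob((1.0, 2.0), betas=(0.0, 1.0, -0.5))
```

The sign of `score`, a threshold on `post`, and a threshold on `prob` are the respective decision rules these classifiers apply to a patient record.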
2.3 Decision Tree Algorithm

The Decision Tree algorithms discussed below include LGBM, J48 and XGB. The decision tree entropy is generated as follows: a node k is taken and J class labels are identified, with j ranging from 1 to J. It is given mathematically as

$$\text{Entropy}(k) = -\sum_{j=1}^{J} p(j \mid k)\,\log_2 p(j \mid k) \quad (7)$$
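Equation (7) can be evaluated directly from the class labels present at a node; a minimal sketch (the function name is illustrative):

```python
from math import log2

def node_entropy(labels):
    """Entropy(k) = -sum_j p(j|k) * log2 p(j|k) over the class labels at node k."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * log2(p) for p in probs)

print(node_entropy(["pos", "neg", "pos", "neg"]))  # 1.0 (maximally impure binary node)
```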
LGBM uses two concepts: GBDT (Gradient Boosting Decision Tree) and GOSS (Gradient-based One-Side Sampling). It grows the tree leaf-wise, splitting the leaf that provides the best fit, whereas other boosting algorithms grow the tree depth-wise. Its accuracy compares favorably with the other existing boosting algorithms [33].
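As a rough illustration of GOSS, the sampling scheme LightGBM layers on top of GBDT, the sketch below keeps the top-a fraction of instances by gradient magnitude, randomly samples a b fraction of the remainder, and re-weights the sampled small-gradient instances by (1 − a)/b to compensate. The function and parameter names are illustrative, not LightGBM's API:

```python
import random

def goss_sample(gradients, a=0.2, b=0.1, seed=0):
    """Gradient-based One-Side Sampling: returns (index, weight) pairs."""
    rng = random.Random(seed)
    n = len(gradients)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_k = int(a * n)
    rand_k = int(b * n)
    kept = [(i, 1.0) for i in order[:top_k]]        # large-gradient instances, weight 1
    sampled = rng.sample(order[top_k:], rand_k)     # random subset of the rest
    amplify = (1.0 - a) / b                         # compensate the down-sampling
    kept += [(i, amplify) for i in sampled]
    return kept

grads = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02, 0.3, -0.08, 0.6, 0.03]
sample = goss_sample(grads, a=0.2, b=0.2)
print(len(sample))  # 4 instances kept out of 10 (2 top-gradient + 2 sampled)
```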
650
B. S. Ahamed et al.
The J48 Algorithm is a Decision Tree algorithm implemented in the WEKA tool as an extension of the ID3 (Iterative Dichotomizer 3) algorithm. Its additions over ID3 include derivation of rules, decision tree pruning, handling of missing values and continuous attribute value ranges, which improve the accuracy obtained on the data [34]. XGBoost is used for supervised regression models and is built around an objective function and base learners. It uses the concept of ensemble learning: individual models are trained and their results combined into a single prediction [35]. The objective function of XGBoost is given as:

$$\text{obj}(\theta) = \sum_{i=1}^{n} l\bigl(y_i - y_i^{p}\bigr) + \sum_{j=1}^{J} \delta(f_j) \quad (8)$$
where $f_j$ denotes the prediction from the jth tree and $y_i^{p}$ the prediction for instance i. The MSE (mean squared error) is given as [36]:

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - y_i^{p}\bigr)^{2} \quad (9)$$
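Both quantities can be computed in a few lines; below is a direct implementation of Eq. (9) and a toy version of the objective in Eq. (8) with squared-error loss and a placeholder per-tree penalty (the helper names and the penalty term are illustrative, not XGBoost's internals):

```python
def mse(y_true, y_pred):
    """Eq. (9): mean squared error over n instances."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n

def toy_objective(y_true, y_pred, tree_complexities):
    """Eq. (8): training loss plus a per-tree regularization penalty."""
    loss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    penalty = sum(tree_complexities)   # stands in for delta(f_j)
    return loss + penalty

print(mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.5]))  # approx. 0.1667
```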
Guolin et al. [37] proposed a study using the LGBM algorithm, built on the GBDT machine learning algorithm with two novel techniques, GOSS and EFB, evaluated on public datasets. LightGBM speeds up the training process of conventional GBDT by over 20 times while achieving almost the same accuracy, with better computation speed and memory consumption as the main advantages. The accuracy is calculated using

$$\theta\bigl((1-\gamma)n\bigr)^{-2/3} \quad (10)$$
Omodunbi et al. [38] developed a hybrid predictive model combining the Light Gradient Boosting Machine (LGBM) algorithm and K-Nearest Neighbor (KNN), on data from the UCI Repository. A soft voting classifier has been used; it can be represented as

$$\hat{y} = \arg\max_i \sum_{j=1}^{m} W_j P_{ij} \quad (11)$$

where m is the number of classifiers, $W_j$ the weight of classifier j and $P_{ij}$ its predicted probability for class i.
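The weighted soft vote of Eq. (11) can be sketched in a few lines: each classifier j contributes its class-probability row with weight W_j, and the class with the largest weighted sum wins (a minimal illustration, not the implementation used in [38]):

```python
def soft_vote(prob_rows, weights):
    """prob_rows[j][i] = classifier j's probability for class i; returns winning class index."""
    n_classes = len(prob_rows[0])
    scores = [sum(w * row[i] for w, row in zip(weights, prob_rows))
              for i in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)

# Two classifiers, two classes (e.g. non-diabetic / diabetic):
probs = [[0.6, 0.4],   # classifier 1
         [0.3, 0.7]]   # classifier 2
print(soft_vote(probs, weights=[1.0, 2.0]))  # 1 (weighted scores: class 0 = 1.2, class 1 = 1.8)
```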
Shafi et al. [39] proposed a model for prediction by examining the features of Type-2 Diabetes Mellitus, using WEKA (version 3.6.10). The data, collected between 2009 and 2011 from a database in Tabriz, Iran, comprised 22,399 records after data processing and preparation. The accuracy obtained using the algorithm was 97%.
Shamreen et al. [40] developed a predictive model for the Diabetes Mellitus disease on the PIMA Indian Dataset, using the concepts of GOSS (Gradient-based One-Side Sampling) and GBDT (Gradient Boosting Decision Tree). The machine learning algorithms used were LR, GB Classifier, XGB Classifier, DT, Extra-Trees Classifier, RF and LGBM (Light Gradient Boosting Machine). The highest accuracy, 95.20%, was obtained by the LGBM algorithm; the others were Random Forest 94.80%, Extra-Trees Classifier 94.60%, Decision Trees 94.40%, Gradient Boosting Classifier 94.10%, XGB Classifier 83.30% and Logistic Regression 75.20%. Saravanathan et al. [41] used a dataset of 545 patients for diabetes prediction, comparing and evaluating machine learning algorithms such as CART, SVM and KNN. The accuracy of the J48 algorithm was 67.16%, CART 62.29%, SVM 65.05% and KNN 53.39%; J48 predicted with the highest accuracy among the algorithms compared. Gaganjot et al. [42] proposed a modified J48 classifier on the PIMA Indian Dataset, using the WEKA tool as an API from MATLAB to generate J48 classifiers. The other algorithms and classifiers used were RandomTree, ADTree, Random Forest, J48 and Naive Bayes; the accuracies obtained were RandomTree 68%, ADTree 72%, Random Forest 74%, J48 Classifier 74%, Naive Bayes 76% and Modified J48 Classifier 99%. Posonia et al. [43] developed a classification method using the J48 decision tree algorithm on the PIMA Indian Dataset, with data pre-processing to remove noisy data; using the WEKA tool, an accuracy of 91.2% was obtained. Pei et al. [44] proposed a prediction system using the J48 Decision Tree algorithm.
The dataset is taken from a retrospective cross-sectional study of 10,436 people from the Chinese population, with 14 input variables and 2 output variables. The accuracy obtained from the classifier is 90.3%. Tianqi et al. [45] introduced the tree boosting system XGBoost, proposing a novel sparsity-aware algorithm and a weighted quantile sketch for tree learning procedures. The major tree boosting systems compared are XGBoost, pGBRT, Spark MLLib, H2O, R GBM and scikit-learn; among these, XGBoost supports all of the following approaches: exact greedy, approximate local, approximate global, out-of-core, sparsity-aware and parallel learning. Anju et al. [46] proposed a system for detection of the Diabetes Mellitus disease using KNN, SVM, RF and XGB algorithms, on a dataset collected in real time from 217 people. The highest accuracy was obtained by XGBoost: 99.79% without normalization, 99.65% with min-max normalization and 98.62% with z-score normalization.
2.4 Deep Learning

In this section, Artificial Neural Network (ANN) concepts are used to implement the model. An ANN is a computational neural network inspired by the biological brain: interconnected neurons, represented as nodes, are arranged in layers of the network, analogous to the working of the brain [47]. The sigmoid activation function is calculated as:

$$\sigma(x) = \frac{1}{1 + e^{-x}} \quad (12)$$

The neural network equation is given as

$$Z = \text{Bias} + w_1 x_1 + w_2 x_2 + \cdots + w_n x_n \quad (13)$$

which denotes the graphical representation of the ANN, where $w_i$ represents the weights, $x_i$ the independent input variables and $w_0$ the bias or intercept [26]. Jahani et al. [48] used a manually collected dataset, applying a neural network approach with a memetic algorithm to classify and diagnose the onset and progression of diabetes; the accuracy obtained was 93.2%. Lakhwani et al. [49] introduced a three-layered ANN model on the PIMA Indian Dataset, designing the artificial neural network with a neuron-design tool. A Quasi-Newton method was used as the training algorithm with logistic activation of neurons; an initial training loss of 1.10079 was measured, and after 151 epochs the closing value was 0.471762, indicating a realistic approach to neural network modeling. Jobeda et al. [50] used different machine learning algorithms and neural networks such as KNN, LR, DT, SVM, RF and NB on the PIMA Indian Dataset, developing a predictive model using LR and SVM for the diabetes disease. The accuracies were LR 78.85%, NB 78.28%, RF 77.34% and ANN 88.57%, whereas the other algorithms were below 70%.
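Equations (12) and (13) together describe a single artificial neuron; a minimal sketch (function names are illustrative):

```python
from math import exp

def sigmoid(x):
    """Eq. (12): logistic activation."""
    return 1.0 / (1.0 + exp(-x))

def neuron(inputs, weights, bias):
    """Eq. (13): z = bias + sum_i w_i * x_i, passed through the sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(z)

print(sigmoid(0.0))                            # 0.5
print(neuron([1.0, 2.0], [0.5, -0.25], 0.0))   # z = 0.5 - 0.5 = 0, so sigmoid(0) = 0.5
```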
2.5 Hybrid Models

In this section, several ML algorithms are combined into hybrid models to perform the prediction mechanism.
Hang Lai et al. [51] developed a prediction mechanism combining Logistic Regression and Gradient Boosting Machine (GBM) techniques on data collected from Canadian patients, evaluated by the area under the ROC curve (AROC). The tool used was R, with tenfold cross-validation; sensitivity was also calculated. In GBM, the Bernoulli loss function and tree-based learners were used. The AROC for GBM was 84.7% and for Logistic Regression 84%. Han Wu et al. [14] proposed a model using the PIMA Indian Dataset along with two other datasets, one provided by Dr. Schorling and one from an online questionnaire. The algorithms used were the K-Means and Logistic Regression algorithms, with the WEKA tool providing a visualized interface for the experimental results. The obtained accuracy of 95.42% is 3.04 percentage points higher than that of other researchers. Wenqian et al. [52] proposed a hybrid model combining K-means and DT on the PIMA Indian Dataset, using WEKA with tenfold cross-validation; the proposed model gives an accuracy of 90.04%. Patil et al. [53] created a hybrid model for determining the Type-2 Diabetes Mellitus disease using the K-means clustering and C4.5 algorithms on the PIMA Indian Dataset from the UCI Repository, validated with k-fold cross-validation; sensitivity and specificity were the performance measures used. The accuracy obtained using the hybrid model was 92.38%. Komal et al. [54] developed a hybrid model for determining the Diabetes Mellitus disease on the PIMA Indian Dataset.
Many classifiers such as Naive Bayes, Logistic Regression, Decision Tree, Random Forest, Neural Network, KNN, XGBoost, Voting Classifier, AdaBoost and SVM were used. The accuracy obtained was 75.32% for the Decision Tree, 77.48% for the XGBoost classifier and 75.75% for the Voting Classifier, while the other algorithms were below 60%. By combining a Stacking classifier with AdaBoost, an accuracy of 80% was obtained.
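The cluster-then-classify pattern shared by several of these hybrids (e.g. K-means followed by a classifier, as in [52, 53]) can be illustrated with a toy one-dimensional sketch: cluster the instances, then keep only those whose cluster assignment agrees with their class label before training the downstream classifier. Everything here (the 1-D k-means, the agreement filter, the data) is a simplified stand-in for the published pipelines:

```python
def kmeans_1d(xs, k=2, iters=20):
    """Toy 1-D k-means; returns final cluster centers."""
    centers = [min(xs), max(xs)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            groups[nearest].append(x)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def filter_by_cluster(xs, labels, centers):
    """Keep only instances whose cluster agrees with their class label."""
    kept = []
    for x, y in zip(xs, labels):
        cluster = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
        if cluster == y:
            kept.append((x, y))
    return kept

xs = [1.0, 1.2, 0.9, 5.0, 5.2, 4.8, 1.1, 5.1]
labels = [0, 0, 0, 1, 1, 1, 1, 0]   # the last two disagree with their cluster
centers = sorted(kmeans_1d(xs))
clean = filter_by_cluster(xs, labels, centers)
print(len(clean))  # 6 (the two label/cluster disagreements are dropped)
```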
3 Conclusion and Future Scope

We can conclude from the above studies that many ML algorithms can be used to predict Diabetes Mellitus. The datasets used with each algorithm vary from one another depending on the researcher and the analysis performed. Many predictive models were developed and feature extraction processes carried out, involving concepts such as CNN, ANN, NB techniques and SVM, combined with a prediction model.
The research gaps identified are as follows:

1. A framework needs to be developed to streamline the given data into the system and obtain an accuracy higher than the existing system.
2. A predictive model needs to be built using a machine learning algorithm to obtain better results from the various attributes taken in the dataset.
3. In the PIMA Indian Dataset, the Glucose values are taken only for pre-diabetic patients. This may not predict the presence of diabetes in a person.
4. Only selected attributes are taken for detecting Diabetes Mellitus Disease.

To overcome these research gaps:

1. Channel the attributes of the dataset in such a way that a framework can be built and utilized effectively.
2. Advance the existing algorithm and incorporate improvisation techniques into the data model for further study and analysis.
3. Build a predictive model that gives better accuracy.
4. Include other attributes such as BMI, Age and Pregnancy to find the probability that a person will be affected by the Diabetes Mellitus disease in the future.
References 1. A. Adam Mohammad, Predicting diabetes using gradient boosting is a machine learning technique. Int. J. Sci. Res. (IJSR) (2019). ISSN: 2319-7064 SJIF:7.583 2. V.L.W. Goetsch, J. Deborah, Diabetes Mellitus: Handbook of Health and Rehabilitation Psychology (Springer, US, 1995). https://doi.org/10.1007/978-1-4899-1028-8_25 3. I.M. Rabinowitch, Diabetes mellitus. Am. J. Digest. Dis. 1573–2568. https://doi.org/10.1007/ BF03001237 4. https://www.journals.elsevier.com/diabetes-research-and-clinical-practice?sf8158831=1 5. R. Biswas, S. Pal, N.H.H. Cuong, A. Chakrabarty, V.K. Solanki, M.K. Hoang, Z.(Joan) Lu, P.K. Pattnaik, A novel IoT-based approach towards diabetes prediction using Big Data, in Intelligent Computing in Engineering, vol. 163 (Springer Singapore, 2020). https://doi.org/10.1007/978981-15-2780-7_20 6. S. Kumar, N.M. Eswari, T. Sampath, S. Lavanya, Predictive methodology for diabetic data analysis in Big Data. Procedia Comput. Sci. 50, 203–208. ISSN 1877–0509. https://doi.org/ 10.1016/j.procs.2015.04.069(2015) 7. O. Kolesnichenko, E. Marochkina, R. Komarov, M. Lev, M. Andrey, D. Soldatov, L. Minushkina, M. Chernoskutov, V. Averbukh, I. Mikhaylov, A. Martynov, V. Pulit. S. Amelkin, I. Grigorevsk, Y. Kolesnichenko, Big data analytics of inpatients flow with diabetes mellitus type 1, in IEEE: 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence) (IEEE, 2019). 978-1-5386-5933-5/19/$31.00 c 8. A. Mujumdar, V. Vaidehi, Diabetes prediction using machine learning algorithms. Procedia Comput. Sci. 165, 292–299. ISSN 1877-0509 (2019) 9. Y. Hou, X. Ding, R. Hou, Support vector machine classification prediction model based on improved chaotic differential evolution algorithm, in 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC- FSKD) (2017)
10. https://www.google.co.in/books/edition/Data_Mining_Concepts_and_Techniques/pQws07 tdpjoC?hl=en&gbpv=1&printsec=frontcover 11. A. Minyechil, J. Rahul, M. Preeti, Analysis and prediction of diabetes mellitus using machine learning algorithm. Int. J. Pure Appl. Math. 118(9), 871–878 (2018) 12. https://www.saedsayad.com/naive_bayesian.htm 13. S.S. Mehta, N.S. Lingayat, SVM-based algorithm for recognition of QRS complexes in electrocardiogram. IRBM 29(5), 310–317. ISSN 1959-0318 (2008). https://doi.org/10.1016/j.rbm ret.2008.03.006 14. H. Wu, S. Yang, Z. Huang, J. He, X. Wang, Type 2 diabetes mellitus prediction model based on data mining, in Informatics in Medicine Unlocked, vol. 10. ISSN 2352–9148 (2018), pp. 100– 107 15. https://towardsdatascience.com/logistic-regression-detailed-overview-46c4da4303bc 16. F. Woldemichael, S. Menaria, Prediction of diabetes using data mining techniques, 2018/05/01, pp. 414–418 (2018). https://doi.org/10.1109/ICOEI.2018.8553959 17. N. Jayanthi, V.B. Babu, S. N. Rao, Survey on clinical prediction models for diabetes prediction. J. Big Data 4, Article number: 26 (2017) 18. S. Priyanka, K. Jaya Malini, Diabetes prediction using different machine learning approaches, in Proceedings of the Third International Conference on Computing Methodologies and Communication (ICCMC) (2019). IEEE Xplore Part Number: CFP19K25-ART; ISBN: 978-1-5386-7808-4 19. N. Mohan, V. Jain, Performance analysis of support vector machine in diabetes prediction, in Fourth International Conference on Electronics, Communication and Aerospace Technology (ICECA-2020), IEEE Xplore Part Number: CFP20J88-ART; ISBN: 978-1-7281-6387-1(2020) 20. D. Sisodia, D.S. Sisodia, Prediction of diabetes using classification algorithms. Procedia Comput. Sci. 132, 1578–1585 (2018). ISSN 1877-0509 21. P. Songthung, K. 
Sripanidkulchai, Improving type 2 diabetes mellitus risk prediction using classification, in 13th International Joint Conference on Computer Science and Software Engineering (JCSSE) (2016), pp. 1–6. https://doi.org/10.1109/JCSSE.2016.7748866 22. S. Sadri, M. Amanj, H. Ramin, P. Zahra, C. Kamal, Comparison of data mining algorithms in the diagnosis of type II diabetes. Int. J. Comput. Sci. Appl. (IJCSA) 5 (2015) 23. S. Perveen, M. Shahbaz, K. Keshavjee, A. Guergachi, Metabolic syndrome and development of diabetes mellitus: predictive modeling based on machine learning techniques. IEEE Access 7, 1365–1375 (2019) 24. H. Esmaily, M. Tayefi, H. Doosti, M. Ghayour-Mobarhan, H. Nezami, A. Amirabadizadeh, A comparison between Decision Tree and Random Forest in determining the risk factors associated with type 2 diabetes. J. Res. Health Sci. 18(2), e00412 (2018). PMID: 29784893 25. R. Katarya, S. Jain, Comparison of different machine learning models for diabetes detection, in IEEE International Conference On Advances And Developments In Electrical And Electronics Engineering (ICADEE) (2020) 26. N. Sneha, T. Gangil, Analysis of diabetes mellitus for early prediction using optimal features selection. J. Big Data 1 (2019). https://doi.org/10.1186/s40537-019-0175-6 27. J. Lee, B. Keam, E. Jung Jang, M. Sun Park, J.Y. Lee, K. Dan Bi, L. Chang-Hoon, K. Tak, O. Bermseok, H.J. Park, K.-B. Kwack, C. Chu, H.-L. Kim, Development of a predictive model for type 2 diabetes mellitus using genetic and clinical data. Osong Publ. Health Res. Perspect. 2(2), 75–82 (2011). ISSN 2210-9099, https://doi.org/10.1016/j.phrp.2011.07.005 28. S.T. Prasad, S. Sangavi, A. Deepa, F. Sairabanu, R. Ragasudha, Diabetic data analysis in big data with predictive method, in International Conference on Algorithms, Methodology, Models and Applications in Emerging Technologies (ICAMMAET), Chennai, India (2017), pp. 1–4. https:// doi.org/10.1109/ICAMMAET.2017.8186738 29. F. Kazerouni, A. Bayani, F. Asadi, L. 
Saeidi, N. Parvizi, Z. Mansoori, Type2 diabetes mellitus prediction using data mining algorithms based on the long-noncoding RNAs expression: a comparison of four data mining approaches. BMC Bioinf 1471–2105 (2020). https://doi.org/ 10.1186/s12859-020-03719-8
30. N. Yuvaraj, K.R. Sri Preethaa, Diabetes prediction in healthcare systems using machine learning algorithms on Hadoop cluster. Clust. Comput. 22, 1–9 (2017) 31. R. Sehly, M. Mezher, Comparative analysis of classification models for pima dataset, in International Conference on Computing and Information Technology (ICCIT-1441) (2020), pp. 1–5. https://doi.org/10.1109/ICCIT-144147971.2020.9213821(2020) 32. B.S. Ahamed, M.S. Arya, LGBM classifier based technique for predicting type-2 diabetes. Eur. J. Mol. Clin. Med. 8(3), 454–467 (2021) 33. N. Pradhan, G. Rani, V. Singh Dhaka, R. Chandra Poonia Diabetes prediction using artificial neural network, in Deep Learning Techniques for Biomedical and Health Informatics (Academic Press 2020), pp. 327–339. ISBN 9780128190616, https://doi.org/10.1016/B978-012-819061-6.00014-8 34. S. Karun, A. Raj, G. Attigeri, Comparative analysis of prediction algorithms for diabetes, in Advances in Computer Communication and Computational Sciences—Proceedings of IC4S 2017. Advances in Intelligent Systems and Computing; vol. 759, ed. by S.K. Bhatia, S. Tiwari, M.C. Trivedi, K.K. Mishra (Springer, 2019), pp. 177–187. https://doi.org/10.1007/978-98113-0341-8_16 35. M.S. Islam, M.K. Qaraqe, H.T. Abbas, M. Erraguntla, M. Abdul-Ghani, The prediction of diabetes development: a machine learning framework, in 2020 IEEE 5th Middle East and Africa Conference on Biomedical Engineering, MECBME 2020 [09292043] (Middle East Conference on Biomedical Engineering, MECBME, vol. 2020-October) (IEEE Computer Society, 2020) 36. https://towardsdatascience.com/xgboost-mathematics-explained-58262530904a 37. G. Ke, M. Qi, T. Finley, T. Wang, W. Chen, M. Weidong, Q. Ye, L. Tie-Yan, LightGBM: a highly effificient gradient boosting decision tree, in NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, Dec 2017 (2017), pp. 3149–3157 38. B. 
Omodunbi, Development of a diabetes melitus detection and prediction model using light gradient boosting machine and K-nearest neighbor. UNIOSUN J. Eng. Environ. Sci. 3 (2021). https://doi.org/10.36108/ujees/1202.30.0160(2021) 39. S. Habibi, M. Ahmadi, S. Alizadeh, Type 2 diabetes mellitus screening and risk factors using decision tree: results of data mining. Glob. J. Health Sci. 7(5) (2015). ISSN 1916-9736 E-ISSN 1916-9744 40. B.S. Ahamed, M.S. Arya, Prediction of Type-2 diabetes using the LGBM classifier methods and techniques. Turkish J. Comput. Math. Educ. 12(12), 223–231 (2021) 41. K. Saravananathan, T. Velmurugan, Analyzing diabetic data using classification algorithms in data mining. Indian J. Sci. Technol. 9 (2016). https://doi.org/10.17485/ijst/2016/v9i43/93874 42. G. Kaur, A. Chhabra, Improved J48 classification algorithm for the prediction of diabetes. Int. J. Comput. Appl. 98, 13–17 (2014). https://doi.org/10.5120/17314-7433 43. A.M. Posonia, S. Vigneshwari, D.J. Rani, Machine learning based diabetes prediction using decision tree J48, in 3rd International Conference on Intelligent Sustainable Systems (ICISS) (2020), 498–502. https://doi.org/10.1109/ICISS49785.2020.9316001(2020) 44. D. Pei, T. Yang, C. Zhang, Estimation of diabetes in a high-risk adult Chinese population using J48 decision tree model. Diabetes Metab. Syndr. Obes. 13, 4621–4630 (2020). https://doi.org/ 10.2147/DMSO.S279329(2020) 45. C. Tianqi, G. Carlos, XGBoost: a scalable tree boosting system, KDD ’16, in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug 2016, pp. 785–794. https://doi.org/10.1145/2939672.2939785(2016) 46. A. Prabha, J. Yadav, A. Rani, V. Singh, Design of intelligent diabetes mellitus detection system using hybrid feature selection based XGBoost classifier. Comput. Biol. Med. 136, 104664 (2021). ISSN 0010-4825 47. S. Kumari, D. Kumar, M. 
Mittal, An ensemble approach for classification and prediction of diabetes mellitus using soft voting classifier. Int. J. Cogni. Comput. Eng. 2, 40–46 (2021). ISSN 2666-3074. https://doi.org/10.1016/j.ijcce.2021.01.001(2021) 48. M. Jahani, M. Mahdavi, Comparison of predictive models for the early diagnosis of diabetes. Healthc. Inform. Res. 22(2), 95–100 (2016). https://doi.org/10.4258/hir.2016.22.2.95. Epub 2016 Apr 30. PMID: 27200219; PMCID: PMC4871851
49. K. Lakhwani, A novel approach of sensitive data classification using convolution neural network and logistic regression (2019) 50. J.J. Khanam, S.Y. Foo, A comparison of machine learning algorithms for diabetes prediction. ICT Express (2021). ISSN 2405-9595. https://doi.org/10.1016/j.icte.2021.02.004 51. H. Lai, H. Huang, K. Keshavjee, A. Guergachi, X. Gao, Predictive models for diabetes mellitus using machine learning techniques. BMC Endocr. Disord. 19(1), 101 (2019). https://doi.org/ 10.1186/s12902-019-0436-6. PMID: 31615566; PMCID: PMC6794897 52. W. Chen, S. Chen, J.H. Zhang, T. Wu, A hybrid prediction model for type 2 diabetes using K-means and decision tree, in 8th IEEE International Conference on Software Engineering and Service Science (ICSESS), pp. 386–390 (2017). https://doi.org/10.1109/ICSESS.2017.834 2938 53. B.M. Patil, R.C. Joshi, D. Toshniwal, Hybrid prediction model for Type-2 diabetic patients. Exp. Syst. Appl. 37(12), 8102–8108 (2010). ISSN 0957-4174. https://doi.org/10.1016/j.eswa. 2010.05.078(2010) 54. M.S. Patil, Komal, S.D. Sawarkar, S. Narwane, Designing a model to detect diabetes using machine learning. Int. J. Eng. Res. Technol. (IJERT) 08(11) (2019)
Dynamic Updating of Signatures for Improving the Performance of IDS

Asma Shaikh and Preeti Gupta

Abstract A signature-based intrusion detection system has the limitation that it detects only well-known attacks; it will not detect zero-day attacks. Due to the pattern-matching operation, existing signature-based IDSs have large execution time and memory utilization overheads. As a result, an efficient system must be designed that reduces overhead and detects zero-day attacks. Many systems carry out the matching by comparing each input event to all rules. First, we define the proposed intrusion detection system using a decision tree, evaluated on real network traces; it achieved an accuracy of 96.96%. If the signature-based system cannot detect an attack, control passes to the anomaly-based module, which uses a deep learning algorithm to detect anomalies in the packets the signature-based system missed due to its fixed rules, generates signatures from the anomalies detected, and dynamically updates them into the signature ruleset, improving the overall accuracy to 98.98%. The next time the same kind of attack occurs, it can therefore be detected easily.

Keywords Network security · Intrusion detection system · Machine learning · Decision tree
1 Introduction

An intrusion detection system (IDS) is a computer system or network monitoring tool designed to identify unauthorized or abnormal activity in an organization's computer system or network. A network-based IDS, as the name suggests, consists of a network of sensors or hosts dispersed throughout the network; it can track network traffic, collect it locally and then report the information to a management console. A misuse-based IDS is used to identify specified known attacks.

A. Shaikh (B) · P. Gupta Amity University Maharashtra, Mumbai, Maharashtra, India e-mail: [email protected] A. Shaikh Marathwada Mitra Mandal College of Engineering, Pune, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_49
Each known attack has a signature that is stored, and incoming data is matched against these signatures [1]. A misuse- or signature-based IDS operates as follows: when an individual sends data into the network, the data first passes through the server, which inspects it and, if any dangerous content is detected, discards it rather than transmitting it onward. When data reaches the server, the server uses a tool to compare the network packet against the signature database stored on the server. A signature-based method is used by most intrusion detection systems [2]: events are matched against preset signatures that describe harmful activity. The matching method is the most resource-intensive activity of an IDS. Several systems perform the comparison between each input event and all rules, which is far from ideal; no general solution to this problem has yet been offered, despite occasional ad hoc optimizations. This paper suggests a way to improve the matching process by utilizing machine learning algorithms: from a set of signatures (each a condition the input data must meet for activation), an algorithm produces a decision tree that discovers hazardous events with as few repeated comparisons as possible. This notion is applied to a network-based intrusion detection system (IDS). Misuse detectors examine system actions and match these activities with known attacks for which the system already has definitions or signatures [1, 2].
2 Background of IDS

Machine learning clustering is applied to enhance matching by Kruegel and Toth [2]. From a set of signatures (each of which constrains the input data to meet specific requirements for activation), a decision tree is constructed to identify harmful incidents with as few repeated comparisons as feasible. Intrusion detection systems (IDSs) either have administrators manually update their databases or rely on websites that feed them new attack signatures. Al Yousef et al. [3] provide a model for automating the attack lists with a filtering device functioning as a second IDS engine; their study shows that the suggested approach improves the overall accuracy of the IDS. New attack signatures are derived from similarities, and an IDS blacklist of IP addresses automates updating of the IDS database without human intervention. In a data-parallel approach developed by Patel et al. [4], Snort raises the detection rate and reduces the time required for analysing packets; the system is scalable both horizontally and vertically, so hosts can be added or removed as necessary. Kruegel et al. [5] developed a decision tree that reduces the number of unnecessary comparisons; by introducing changes to the decision-tree clustering technique, the experiment results show that the quality of the output decision tree is considerably enhanced. Holm analysed signature-based NIDS (Snort) attacks using 356 crafted Snort rules; the findings indicate that Snort can catch zero-day vulnerabilities (a mean of 17% detection), although the detection rate for known attacks is usually greater (a mean of 54% detection).
The Ma et al. research team [6] observes that, to protect themselves from security breaches, companies may deploy several S-IDSs, each running separately under the assumption that it can monitor all packets of a given flow to identify signs of intrusion. However, developments like Multi-Path TCP's use of several paths to speed up network performance inevitably complicate this: malicious payloads may be spread across numerous channels to evade signature-based network intrusion detection systems that identify them. With multiple monitors set up, none has a comprehensive picture of the network activity, which prevents the intrusion signature from being found. Yassin et al. [7] present a Signature-Based Anomaly Detection Scheme (SADS) to better scrutinize and identify network packet headers. Using mining classifiers such as Naive Bayes and Random Forest minimizes false alarms, and signatures created from the detection findings speed up future prediction processing. Shiri [8] takes an entirely different approach to enhancing signature-based intrusion detection: according to the research, a combination of two signature-based network intrusion detection systems, each running Snort with its own assigned rules and packets, reduces the time required to identify malicious traffic. The anomaly-based approach enables the detection of previously unknown attacks but may suffer from false positives; signature-based IDS is helpful for detecting already-known attacks and has a lower false-positive rate. Some systems combine these approaches, but none of them has adapted deep learning with updating of signatures into the ruleset.
3 Proposed Framework A signature method for most IDSs is used in which each input event matches default signatures of hazardous behaviour. The most resource-intensive IDS task is the matching procedure. Each input event is eventually compared to every rule by many systems. Even the ideal isn’t close. Despite the use of ad hoc optimizations, no general solution has yet been developed [9]. This study offers a technique to enhance matching with machine learning algorithms. An algorithm creates a decision tree with several signatures to detect hazardous events with as few repeat comparisons as feasible, activating with the input data. This design was supported by a networkbased intrusion detection system (IDS) shown in Fig. 1. In this system architecture, all packets are received from internet, then sniff packets are given to IDS engine1, i.e. signature-based IDS then if the attack is not detected then packet is passed to Anomaly-based IDS. Here, if Anomalies detected then prepare signature from anomalies and update the signature into dataset. Figure 2 gives the proposed system architecture where the first module captures the packet from the network, then checks the match the signature from the packet
662
A. Shaikh and P. Gupta
Fig. 1 System architecture
Fig. 2 Working of system
against the ruleset using the Decision Tree classifier in module 2, and then passes it to the anomaly-based IDS module. If an anomaly is detected, an alert is generated, a signature is generated from the anomaly, and the packet is sent to module 4, which updates the signature ruleset.

Algorithm

Module 1: Capture Packets: capture packets from the network using the Scapy library.
Module 2: Signature-based Intrusion Detection module:
Dynamic Updating of Signatures for Improving …
Fig. 3 Signature-based IDS module
Module 1: Capture Packets

from threading import Thread
from scapy.all import sniff

class CapturePackets(Thread):

    def inPacket(self, pkt):
        # process each sniffed packet (hand it to the IDS engine)
        pass

    def stopfilter(self, x):
        # return True to stop sniffing
        return False

    def run(self):
        print("Sniffing started.")
        sniff(prn=self.inPacket, filter="", store=0,
              stop_filter=self.stopfilter)
The signature-based IDS system uses the extracted information and checks the data using the decision tree built from the signature ruleset, as shown in Fig. 3. A decision tree can detect potentially risky network behaviours by analysing specific characteristics, and it can be used to assess the security of multiple intrusion detection systems over an enormous collection of intrusion occurrences. Attack signatures and other monitoring operations can continue while trends and patterns evolve. Decision trees are superior to other classification approaches because they accommodate many rules, are readily understandable, and are feasible in real-time applications. Most systems compare each input event sequentially with all rules until the first match is detected. Ad hoc optimizations are, by definition, domain-specific and not optimally suited for all rulesets under the above-stated approaches; it is therefore vital to achieve a general solution that covers every case. A decision tree is presented to enhance the matching process; it is created and adapted directly from the installed intrusion detection signatures. Once a basic decision tree is utilized, all rules can be identified speedily with the lowest number of comparisons. The decision tree is beneficial for creating subgroups: the ruleset is split, and the root node of the tree represents all rules. The subsets formed by the children of the root are the subsets obtained by dividing the ruleset by the first property, and each branch in the tree has a subgroup attached to it.
Fig. 4 Anomaly-based IDS module
If a node has several rules, the rules are separated, and the feature used for the division is given a single name. This feature annotates the arrow that leads between parent and child node, and all of the rules on that child node share the value of the feature given in parentheses. Each leaf node in the tree holds only a few distinct rules and no distinguishing features; rules are indistinguishable if all their features are the same in the clustering process. This is followed by an example in which four rules and three characteristics are considered. A rule defines the properties of network packets coming from a given source address to a specified destination address and port; the source and destination addresses are IPv4 addresses, and the destination port is an integer.

#) Source Address --> Destination Address: Destination Port
(1) 172.16.0.1 --> 172.16.0.2: 23
(2) 172.16.0.1 --> 172.16.0.3: 23
(3) 172.16.0.1 --> 172.16.0.3: 25
(4) 172.16.0.4 --> 172.16.0.5: 80

A possible decision tree for this example is given in Fig. 4. The rules were split according to the three attributes, starting from the source address, and arranged in the tree from left to right. The detection procedure begins at the root of the tree after the IDS has received an input data item. Once a node is reached, the next feature is examined; this characteristic plays the fundamental role of deciding which child node is to be selected. Thus, when the ruleset is divided by the appropriate feature, exactly one child node must be identified.

Module 2: Signature based IDS

from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
import pydotplus

dtree = DecisionTreeClassifier(criterion='entropy')
dtree = dtree.fit(X_train, y_train)
data = tree.export_graphviz(dtree, out_file=None, feature_names=features)
graph = pydotplus.graph_from_dot_data(data)
graph.write_png('mydecision.png')
y_pred = dtree.predict(X_test)
All rules relevant to a node are listed in its leaf. Additional feature checks may still be required even after the tree has been traversed: any feature that was not used for partitioning on the route to a leaf must be re-evaluated there. Figure 1 shows the first rule in the left-hand leaf node. The source and destination addresses are applied at the root node and the next level, but the destination port is not applied on this path. The rule-determination procedure will ultimately reach the leaf node holding Rule 1 while processing an input packet transmitted from 172.16.0.1 to 172.16.0.2; however, the packet might have been sent to another port, so the destination port must be double-checked at the leaf. If a feature does not distinguish the rules at a node, it is simply not added to the tree: 'partitioning' on it would lead to only one child node. If the detection technique cannot discover a successor node with a matching value, there is no matching rule, which helps to exit the matching procedure by returning as fast as feasible. Otherwise, every step in the decision tree from root to leaf must be completed. Developing the tree entails applying a feature selection to each group of rules (i.e. partitioning the corresponding rules). A feature is not used if it does not discriminate among the rules, since that would yield a single node with the same rules as the parent. This second condition ensures that every node on the path from the root down skips any partitioning features previously utilized; a split that ends in a single child node behaves exactly like the parent. When prior partitions exist, the child nodes do not all share the same rules, even if the rules reference the same feature.
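As an illustration of the clustering just described, here is a minimal, hypothetical sketch (not the authors' implementation) that builds such a tree over the four example rules and matches packets against it, re-checking unpartitioned features at the leaf:

```python
# Minimal sketch of the rule-clustering decision tree described above.
# Rules are dicts of feature -> value; the tree partitions the rule set
# on one feature per level, and matching follows at most one child per node.

def build_tree(rules, features):
    """Recursively partition `rules` by the first feature that splits them."""
    for i, feat in enumerate(features):
        groups = {}
        for rule in rules:
            groups.setdefault(rule.get(feat), []).append(rule)
        if len(groups) > 1:  # this feature actually partitions the rules
            return {
                "feature": feat,
                "children": {val: build_tree(rs, features[i + 1:])
                             for val, rs in groups.items()},
            }
    return {"rules": rules}  # leaf: remaining rules are verified one by one

def match(tree, packet):
    """Descend the tree; at a leaf, verify any features not yet partitioned."""
    while "feature" in tree:
        child = tree["children"].get(packet.get(tree["feature"]))
        if child is None:
            return None  # no matching rule: exit as fast as feasible
        tree = child
    for rule in tree["rules"]:
        if all(packet.get(k) == v for k, v in rule.items()):
            return rule
    return None

# The four example rules from the text (source -> destination: port).
rules = [
    {"src": "172.16.0.1", "dst": "172.16.0.2", "dport": 23},
    {"src": "172.16.0.1", "dst": "172.16.0.3", "dport": 23},
    {"src": "172.16.0.1", "dst": "172.16.0.3", "dport": 25},
    {"src": "172.16.0.4", "dst": "172.16.0.5", "dport": 80},
]
tree = build_tree(rules, ["src", "dst", "dport"])
```

Note how a packet to 172.16.0.2 reaches the leaf holding Rule 1 without the destination port ever having been partitioned, so the leaf check re-verifies the port, exactly as the text describes.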
In this particular case, the choice of feature impacts the structure, complexity, and depth of the decision tree. The number of decisions along each route from the root to a leaf should be reduced to a minimum, because each one requires a separate comparison of the input. The perfect tree would have exactly one decision between the root and each leaf. If a rule matches, the system generates an alarm; otherwise, the packet goes to the anomaly-based IDS.

Module 3: Anomaly-based Detection Module: a CNN ResNet50 algorithm checks the anomaly level. If an anomaly is detected, an alert is generated and the packet is sent to the signature generation module. Steps for the anomaly-based IDS: data is received from the signature-based module; the packet data is then preprocessed using data encoding, data normalization, and dataset balancing. The next step is the detection module, which uses the ResNet50 model to detect an anomaly, generate an alarm, and send the packet data to the signature generation module. All the steps are shown in Fig. 4.
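The preprocessing steps named above (encoding and normalization) can be sketched with scikit-learn; the records and feature names here are illustrative, not the paper's actual pipeline:

```python
# Sketch of the preprocessing steps (encoding, min-max normalization).
# The records and feature names are illustrative assumptions.
import numpy as np
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

records = [
    {"protocol_type": "tcp", "src_bytes": 491, "dst_bytes": 0},
    {"protocol_type": "udp", "src_bytes": 146, "dst_bytes": 105},
    {"protocol_type": "tcp", "src_bytes": 232, "dst_bytes": 8153},
]

# 1) Encode the categorical feature as integers.
proto = LabelEncoder().fit_transform([r["protocol_type"] for r in records])

# 2) Scale the numeric features to [0, 1] (min-max normalization).
numeric = np.array([[r["src_bytes"], r["dst_bytes"]] for r in records], float)
numeric = MinMaxScaler().fit_transform(numeric)

# Combine into the feature matrix handed to the detection module.
X = np.column_stack([proto, numeric])
```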
ResNet50, a 50-layer Convolutional Neural Network, is used in the model. This is an improvement over traditional machine learning approaches, which are insufficient for lowering the false alarm rate. The suggested IDS paradigm categorizes all network packet traffic into normal and attack categories to identify network intrusions. The proposed model is trained and validated using the Network Security Laboratory Knowledge Discovery in Databases (NSL-KDD) dataset. First, the ROC curve was used to assess and validate the model's overall accuracy, recall, precision, and F1 score.

Module 3: Anomaly-based Intrusion Detection module

while (new packet P1) {
    F = extract_features(P1);        // F is the feature set
    data_encoding(F);
    normalize_features(F);
    attack = detect_attack_using_resnet(F);
    // detect_attack_using_resnet:
    //   create_sequential_model(); Conv1D(); relu_activation_function();
    //   batch_normalization(); max_pooling(); add_dropout();
    //   returns attack_name
    if (attack) {
        generate_alarm();
        signature_generation_module();
    } else {
        continue;
    }
}
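The layer operations listed in the pseudocode can be illustrated with a plain NumPy sketch on a single channel; the actual model is a 50-layer ResNet in TensorFlow, so this is only a toy forward pass with illustrative values:

```python
# NumPy sketch of the operations named in the pseudocode:
# Conv1D -> batch normalization -> ReLU -> max pooling.
import numpy as np

def conv1d(x, kernel):
    """Valid 1-D convolution with stride 1."""
    n = len(x) - len(kernel) + 1
    return np.array([np.dot(x[i:i + len(kernel)], kernel) for i in range(n)])

def batch_norm(x, eps=1e-5):
    """Normalize to zero mean and unit variance."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

# Illustrative feature vector (not real packet features).
features = np.array([0.2, 0.9, 0.1, 0.4, 0.7, 0.3, 0.8, 0.5])
out = max_pool(relu(batch_norm(conv1d(features, np.array([1.0, -1.0, 0.5])))))
```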
NSL-KDD dataset: This is a well-known dataset that has been established as a benchmark for intrusion detection systems [1], as shown in Table 1. Intrinsic features are those that are gathered from the packet's header, without access to the payload; they contain basic information about the packet. Features 1–9 make up the intrinsic features. Table 1 Dataset information
Dataset    No. of features    No. of records    No. of attack types
NSL-KDD    41                 148517            4
Table 2 Attack information in NSL KDD

Attack type               Attack names
Remote to Local (R2L)     snmpguess, snmpgetattack, xsnoop, ftp_write, phf, guess_passwd, imap, named, multihop, sendmail, xlock, spy, warezclient, worm, warezmaster
Denial of Service (DoS)   snmpguess, snmpgetattack, satan, apache2, udpstorm, back, land, udpstorm, pod, processtable, teardrop
Probe                     mscan, saint, ipsweep, portsweep
User to Root (U2R)        perl, xterm, httptunnel, loadmodule, buffer_overflow, rootkit, ps, sqlattack
Because packets are broken into a number of sub-pieces rather than sent as one, content features carry information about the primary packets; accessing the packet payload is required for this type of feature. Features 10–22 make up the content features. Time-based features scan traffic over two-second periods and carry information such as counts and rates rather than details of the traffic input data, for example how many connection attempts are made to the same host; features 23–31 are time-based. Host-based features are similar to time-based features, except that they scan over a sequence of connections rather than a two-second window (how many requests are made to the same host over x connections). These characteristics are primarily designed to capture attack data that spans more than a two-second window. The host-based features range from 32 to 41. The attack classes present in the NSL-KDD dataset are grouped into the four categories shown in Table 2 [5, 7].

Module 4: Signature generation and ruleset update module: generate a signature from the packet.

Module 4: Generate Signature

while (new packet P1) {
    s1 = generate_signature(P1);    // s1 is the generated signature
    add_signature_into_ruleset(s1);
}
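Module 4 can be sketched as follows; the rule template and field names are illustrative assumptions, not the paper's exact signature format:

```python
# Hypothetical sketch of Module 4: turn extracted packet features into a
# Snort-style rule string and append it to the ruleset file.
def generate_signature(pkt, sid):
    """Build a Snort-style rule text from a packet's feature dict."""
    return ('alert {proto} {src} any -> {dst} {dport} '
            '(msg:"Auto-generated from anomaly"; sid:{sid}; rev:1;)').format(
                proto=pkt["proto"], src=pkt["src"], dst=pkt["dst"],
                dport=pkt["dport"], sid=sid)

def add_signature_into_ruleset(rule, path="local.rules"):
    # Append, so the existing signatures in the ruleset are preserved.
    with open(path, "a") as f:
        f.write(rule + "\n")

# Fields taken from the sniffing example later in the text.
pkt = {"proto": "tcp", "src": "49.44.198.218",
       "dst": "192.168.43.141", "dport": 53390}
rule = generate_signature(pkt, sid=1000001)
```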
4 Experiment Environment and Result

The intrusion detection system was developed using Google's TensorFlow, which provides an option to visualize the network design. The experiments were implemented in the following environment:
System Specifications: Intel® Core™ i5-5200U CPU @ 2.20 GHz; RAM: 8 GB; Operating System: Ubuntu; Programming Language: Python; Python Libraries: TensorFlow, NumPy, Scikit-learn, Pandas.

Table 3 shows a sample of the signature-based IDS ruleset prepared from the SNORT ruleset. Rule features are mapped to NSL-KDD dataset features to create a dataset for testing the combined signature-based and anomaly-based IDS, as shown in Table 4.

Experiment #2: Sniffing Result Example:
[IP HEADER]
Version: 4
IHL: 20 bytes
ToS: 40
Total Length: 52
Identification: 4983
Flags: DF
Fragment Offset: 0
TTL: 56
Protocol: 6
Header Checksum: 19177
Source: 49.44.198.218
Destination: 192.168.43.141
[TCP HEADER]
Source Port: 80
Destination Port: 53390
Sequence Number: 1756565665
Acknowledgement Number: 912356495
Data Offset: 8
Flags: A
Window Size: 260
Checksum: 51126
Options: (timestamp (1392071508))

Experiment #4: Result of the AIDS module: Figures 5 and 6 indicate the accuracy of attack detection; the x-axis represents the number of epoch cycles, while the y-axis indicates the accuracy. These graphs demonstrate that as the number of epochs increases, so does the accuracy of LeeNet5 and ResNet50, respectively. The figures also show the training and validation loss, with the x-axis indicating the number of epoch cycles and the y-axis indicating loss; the training and validation loss decreases as the number of epochs rises.
Table 3 Sample of the signature-based IDS ruleset prepared from the Snort ruleset

Rule 1: $HOME_NET tcp 2401 -> $EXTERNAL_NET any; flow: to_client,established; content: "E protocol error|3A| Root request missing", fast_pattern; classtype: misc-attack; msg: INDICATOR-COMPROM
Rule 2: $HOME_NET tcp 666 -> $EXTERNAL_NET any; flow: to_client,established; content: "FTP Port open"; classtype: misc-activity; msg: MALWARE-BACKDOOR BackConstruction2.1 Server FTP Open Reply
Rule 3: $HOME_NET tcp 513 -> $EXTERNAL_NET any; flow: to_client,established; content: "|01|rlogind|3A| Permission denied.", fast_pattern, nocase; classtype: unsuccessful-user; msg: PROTOCOL-SERVICES rlogin login failure
Rule 4: $HOME_NET tcp 513 -> $EXTERNAL_NET any; flow: to_client,established; content: "login incorrect", fast_pattern, nocase; classtype: unsuccessful-user; msg: PROTOCOL-SERVICES rlogin login failure
Table 4 Mapping of rules with NSL-KDD dataset features

Attribute name  Rule keyword                   Short description
ip_proto        Protocol                       IP protocol type
src_ip          Source IP address              Source IP address
dst_ip          Destination IP address         Destination IP address
Service         Destination port no            Service type, e.g. http
icmp_type       Itype                          ICMP message type
src_bytes       DSize                          Packet payload size
Flags           Flags                          TCP flags
Land            Sameip                         Source and destination addresses are the same
Src_count       threshold: track by_src_count  No. of connections to the same destination
Dst_count       threshold: track by_dst_count  No. of connections to the same source
Fig. 5 LeeNet5 validation accuracy and loss with 100 epochs
Fig. 6 ResNet50 validation accuracy and loss with 100 epochs
Table 5 Results after dynamic updation of the signature-based ruleset

Sr. No.  Evaluation parameter  Signature-based IDS  After dynamic updation of signatures
1        Accuracy              96.64                98
2        Precision             96                   98
3        Recall                95                   97
4        F1 Score              95                   97
Experiment #4: Final Result

Table 5 indicates that dynamic updation of signatures increases the accuracy of the signature-based IDS. Figure 7 shows the signature-based IDS system with performance improved by dynamic updation of signatures. A ROC curve plots TPR versus FPR at various classification thresholds. Figure 8a, b depict the ROC curves before dynamic updation of the signature-
Fig. 7 Signature-based IDS system

Fig. 8 ROC curve: a) before dynamic updation; b) after dynamic updation
Table 6 Comparison of results with other methods

Parameter  Linear regression  Naive Bayes  K-Nearest Neighbors  Adaboost  Random Forest  Support vector machine  Proposed model result  Final result after signature updation
Accuracy   61.2               29.5         73.1                 62.1      75.3           70.2                    96.96                  98.98
Precision  50.9               28.5         72.0                 65.1      81.4           68.9                    96                     98
Recall     61.2               29.5         73.1                 62.1      75.3           70.2                    95                     96
F1 Score   53.0               18.4         68.4                 59.4      71.5           65.6                    95                     97
Fig. 9 Comparison of system
based ruleset and after dynamic updation, respectively. Table 6 and Fig. 9 compare the results with other methods, indicating that our system gives better results than the other methods shown in Fig. 9.
5 Conclusion

Most IDSs use a signature method in which each input event is matched against default signatures of hazardous behaviour. The matching procedure is the most resource-intensive IDS task: many systems ultimately compare each input event to every rule. The given IDS is built primarily on a decision tree for the signature-based
IDS, which is especially beneficial in cybersecurity research, giving an accuracy of 96.96%; when dynamic updation is added to the signature-based IDS, the results improve. Our suggested IDS is also tested on real-time traffic. A further significant contribution is the design and implementation of an intelligent system that learns from previous intrusion attempts and prepares rules. Finally, dynamic updation of signatures in the ruleset improves the IDS system's accuracy to 98.98%.
References

1. H. Almutairi, N.T. Abdelmajeed, Innovative signature-based intrusion detection system: parallel processing and minimized database, in 2017 International Conference on the Frontiers and Advances in Data Science (FADS), Xi'an, China (2017), pp. 114–119. https://doi.org/10.1109/FADS.2017.8253208
2. C. Kruegel, T. Toth, Using decision trees to improve signature-based intrusion detection, in Recent Advances in Intrusion Detection. RAID 2003. Lecture Notes in Computer Science, vol. 2820, ed. by G. Vigna, C. Kruegel, E. Jonsson (Springer, Berlin, 2003). https://doi.org/10.1007/978-3-540-45248-5_10
3. M.Y. Al Yousef, N.T. Abdelmajeed, Dynamically detecting security threats and updating a signature-based intrusion detection system's database. Procedia Comput. Sci. 159, 1507–1516 (2019). ISSN 1877-0509. https://doi.org/10.1016/j.procs.2019.09.321
4. P.M. Patel, P.H. Rajput, P.H. Patel, A parallelism technique to improve signature based intrusion detection system. Asian J. Converg. Technol. (AJCT) 4(II) (2018)
5. C. Kruegel, T. Toth, Automatic Rule Clustering for Improved Signature-Based Intrusion Detection. Technical Report, Distributed Systems Group, Technical University Vienna, Austria
6. J. Ma, F. Le, A. Russo, J. Lobo, Detecting distributed signature-based intrusion: the case of multi-path routing attacks, in 2015 IEEE Conference on Computer Communications (INFOCOM), Hong Kong, China (2015), pp. 558–566. https://doi.org/10.1109/INFOCOM.2015.7218423
7. W. Yassin, N.I. Udzir, A. Abdullah, M.T. Abdullah, H. Zulzalil, Z. Muda, Signature-based anomaly intrusion detection using integrated data mining classifiers, in 2014 International Symposium on Biometrics and Security Technologies (ISBAST), Kuala Lumpur, Malaysia (2014), pp. 232–237. https://doi.org/10.1109/ISBAST.2014.7013127
8. F.I. Shiri, B. Shanmugam, N.B. Idris, A parallel technique for improving the performance of signature-based network intrusion detection system, in 2011 IEEE 3rd International Conference on Communication Software and Networks, Xi'an, China (2011), pp. 692–696. https://doi.org/10.1109/ICCSN.2011.6014986
9. R. Kumar, D. Sharma, Signature-anomaly based intrusion detection algorithm, in 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore (2018), pp. 836–841
10. P. Do, H.-S. Kang, S.-R. Kim, Improved signature based intrusion detection using clustering rule for decision tree, in Proceedings of the 2013 Research in Adaptive and Convergent Systems (RACS '13), Association for Computing Machinery, New York (2013), pp. 347–348. https://doi.org/10.1145/2513228.2513284
11. Q. Cheng, C. Wu, H. Zhou, D. Kong, D. Zhang, J. Xing, W. Ruan, Machine learning based malicious payload identification in software-defined networking. J. Netw. Comput. Appl. 192, 103186 (2021). ISSN 1084-8045. https://doi.org/10.1016/j.jnca.2021.103186
12. A.A. Shaikh, Attacks on cloud computing and its countermeasures, in 2016 International Conference on Signal Processing, Communication, Power and Embedded System (SCOPES) (2016), pp. 748–752. https://doi.org/10.1109/SCOPES.2016.7955539
M-mode Carotid Artery Image Classification and Risk Analysis Based on Machine Learning and Deep Learning Techniques P. Lakshmi Prabha, A. K. Jayanthy, and Kumar Janardanan
Abstract In India, Cardiovascular Disease (CVD) is one of the major causes of death; the age-standardized CVD death rate in India is 272 per 100,000 individuals, based on the Global Burden of Disease Study. In the proposed work, the risk of heart failure is predicted based on arterial stiffness indices derived from carotid artery images. This study was carried out in the South Indian population in and around Chennai. Carotid artery images were taken using an ultrasound machine for 165 subjects (55 normal, 55 CVD, and 55 diabetic). From the carotid artery images, arterial stiffness indices such as stiffness index, elastic modulus, distensibility, and compliance are measured for predicting the risk of heart failure. Biochemical parameters such as HDL, LDL, total cholesterol, fasting blood sugar, and postprandial blood sugar are also recorded for the 165 subjects. The Framingham Risk Score (FRS) is estimated based on the biochemical parameters, systolic and diastolic blood pressure, and smoking; FRS is a 10-year risk score for analyzing the risk of heart failure. All these parameters are analyzed using biostatistical, machine learning, and deep learning methods to determine the risk of heart failure. Results show that Pearson's correlation gives significant results with p < 0.01 for the biochemical parameters and FRS score. The performance was also determined by comparing various machine learning classification techniques such as Naïve Bayes, K-Nearest Neighbor, Random Forest, Support Vector Machine (SVM), and multilayer perceptron; multilayer perceptron and SVM produce good results in four different models. Further analysis is performed by extracting the features of M-mode carotid artery images using transfer learning techniques such as ResNet50 and InceptionV3. The results show that the highest accuracies of 99% and 98% are achieved P. L. Prabha · A. K.
Jayanthy (B) Department of Biomedical Engineering, SRM Institute of Science and Technology, Kattankulathur Campus, Chennai, Tamil Nadu, India e-mail: [email protected] P. L. Prabha e-mail: [email protected] K. Janardanan Department of General Medicine, SRM Medical College Hospital and Research Centre, Kattankulathur Campus, Chennai, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_50
P. L. Prabha et al.
using ResNet50 and InceptionV3, respectively, for normal and CVD subjects. Relative risk analyses are also performed using SPSS software for stiffness index and elastic modulus with FRS, and the results are tabulated.

Keywords Cardiovascular disease · Arterial stiffness · Elastic modulus · Stiffness index · Distensibility · Compliance
1 Introduction

Arterial Stiffness (AS) refers to the rigidity of an arterial wall. The potential significance of arterial stiffening is the early development of cardiovascular disease in adults. The arterial wall consists of elastin and collagen fibers; vascular smooth muscle tone, elastic and collagen fibers, and the transmural distending pressure all play a role in determining arterial stiffness [1]. Artery stiffening is caused by the gradual breakage and damage of elastin fibers in the arterial wall, as well as the increase of stiffer collagen fibers [2]. AS is caused by biological aging of the blood vessel wall, arteriosclerosis [3], hypertension [4, 5], dyslipidemia [6], obesity [7], and smoking. Elevated arterial stiffness is interconnected with a greater risk of cardiovascular events such as myocardial infarction and stroke [8, 9]. Research has shown that higher arterial stiffness is linked to an increased risk of atherosclerotic events [10, 11] and to the possibility of heart failure [12]. There are various methods for determining arterial stiffness, such as the SphygmoCor system, the Complior system, the QKd system, and ultrasound recording systems [3–15]. The ultrasound recording system is noninvasive and the most commonly used method for direct visualization of the blood vessels. For examining carotid arteries using ultrasound, linear phased-array probes are used with a fundamental frequency of minimum 7 MHz and maximum 15 MHz [15, 16]. Arterial stiffness is a major risk factor interrelated with cardiovascular disease as well as hypertension. Systolic Blood Pressure (SBP) rises linearly with age in patients with hypertension and patients with type II diabetes, while the Diastolic Blood Pressure (DBP) falls curvilinearly as early as the age of 45, all pointing to the development of increased arterial stiffness.
In those with hypertension and diabetes, stiffness is a major, independent, and significant risk factor [17, 18]. Blood Pressure (BP) is an active CVD risk factor that acts on the arterial wall and is associated with a variety of CVDs, including stroke and ischemic heart disease [19]. The findings suggest that dyslipidemia, i.e. abnormal High Density Lipoprotein (HDL) and Low Density Lipoprotein (LDL), is a strong and independent risk factor for CVD [20]. If dyslipidemia is untreated, it can lead to Coronary Artery Disease (CAD) and Peripheral Artery Disease (PAD) [11, 20, 21]. The arteries of individuals with chronic kidney disease are usually stiffer than those of the general population, leading to the high cardiovascular mortality rate in renal patients [22]. The Ambulatory Arterial Stiffness Indicator (AASI) is one of the new index methods that can monitor arterial stiffness with ease under ambulatory
condition [22, 23]. In chronic inflammatory illnesses, arterial stiffness is raised owing to the presence of atherosclerosis and is also linked to disease duration, dyslipidemia, and the inflammatory mediator C-reactive protein, whose production increases. Cardiovascular risk is predicted based on the Framingham Risk Score (FRS) chart, which calculates the risk percentage based on age and gender. The main parameters taken into consideration for calculating the risk are total cholesterol, lipid profile, blood pressure, diabetes, and smoking. The chart predicts Coronary Heart Disease (CHD) separately for male and female populations based on age. The FRS study shows the low risk of European populations [25, 26], and the findings reveal that the FRS overestimates the risk of coronary events in South European Mediterranean countries [27]. FRS separates subjects into risk categories based on age, gender, cholesterol, blood pressure, and smoking: an estimated low-risk group (5–10% risk in 10 years), an intermediate-risk group (10–20%), and a high-risk group (greater than 20%). Subjects from 30 to 74 years old were taken for the Framingham study; subjects older than 75 years were excluded owing to other possible risk factors. Therefore, with FRS, CVD risk can be predicted as early as possible [28].
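The FRS risk bands described above can be sketched as a small helper (a hypothetical function, not from the paper):

```python
# Hypothetical helper mapping a 10-year FRS percentage to the risk bands
# named in the text: low (5-10%), intermediate (10-20%), high (>20%).
def frs_category(risk_percent):
    if risk_percent > 20:
        return "high"
    if risk_percent >= 10:
        return "intermediate"
    return "low"
```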
2 Materials and Methods

In the proposed method, the arterial stiffness indices of 165 subjects (55 normal, 55 CVD, and 55 diabetic) are calculated from the carotid artery image obtained from the M-mode ultrasound image. FRS was calculated for all 165 subjects, and carotid artery imaging was also obtained. The analysis was then performed using biostatistics, Machine Learning (ML), and Deep Learning (DL) to classify the risk of heart failure.
Fig. 1 B-Mode carotid artery ultrasound image
An M-mode carotid artery ultrasound image, obtained using a Philips Affinity 30 ultrasound machine, is shown in Fig. 1. The systolic and diastolic diameters are measured from the carotid artery ultrasound image, corresponding to the subject's ECG wave. From the measured systolic and diastolic diameters and the systolic and diastolic pressures, the arterial stiffness indices (elastic modulus, stiffness index, arterial distensibility, and arterial compliance) are estimated from the M-mode ultrasound image.
2.1 Elastic Modulus

Elastic modulus is used to determine arterial wall stiffness. The elastic modulus (Ep) is the pressure step required for a theoretical 100% stretch of the blood vessel from its resting diameter, as described in Eq. 1:

    Elastic modulus (Ep) = (ΔP × Dd) / ΔD                    (1)

where
    ΔD = Ds − Dd (differential diameter)
    ΔP = Ps − Pd (differential pressure)
    Ds = maximum systolic diameter
    Dd = minimum diastolic diameter

Elastin and collagen fibers make up the majority of the carotid artery, which provides a large elastic response [29].
2.2 Stiffness Index

Stiffness index is calculated from the ratio of systolic to diastolic pressure with respect to the ratio of the differential diameter and the diameter [1]. It is determined by the logarithm of the ratio of maximum to minimum pressure over the fractional diameter change [15], as described in Eq. 2:

    Stiffness index (β) = ln(Ps / Pd) / (ΔD / Ds)            (2)
where Ps = systolic pressure and Pd = diastolic pressure. The ambulatory stiffness index has been calculated in younger and older populations and found to be less than 0.50 for younger subjects and greater than 0.70 for older subjects [22].
2.3 Arterial Distensibility

The distensibility of an artery segment has been measured using a high-resolution ultrasound machine [30]. Distensibility is the relative change in diameter for a given pressure change, as shown in Eq. 3:

    Distensibility = ΔD / (ΔP × Dd)                          (3)

where ΔD is the differential diameter (systolic diameter minus diastolic diameter) and ΔP is the differential pressure (systolic pressure minus diastolic pressure).
2.4 Arterial Compliance

Compliance is the absolute change in diameter for a given pressure step, as shown in Eq. 4:

    Compliance = ΔD / ΔP                                     (4)
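Equations 1–4 can be transcribed directly into a short function; the sketch below is illustrative, and the example diameter and pressure values are not patient data:

```python
# Direct transcription of Eqs. 1-4 above; the input values used in the
# example call are illustrative assumptions, not measurements from the study.
import math

def stiffness_indices(Ds, Dd, Ps, Pd):
    """Compute the four indices from systolic/diastolic diameter and pressure."""
    dD = Ds - Dd                           # differential diameter
    dP = Ps - Pd                           # differential pressure
    Ep = dP * Dd / dD                      # elastic modulus (Eq. 1)
    beta = math.log(Ps / Pd) / (dD / Ds)   # stiffness index (Eq. 2)
    dist = dD / (dP * Dd)                  # arterial distensibility (Eq. 3)
    comp = dD / dP                         # arterial compliance (Eq. 4)
    return Ep, beta, dist, comp

# Illustrative values: diameters in mm, pressures in mmHg.
Ep, beta, dist, comp = stiffness_indices(Ds=7.0, Dd=6.0, Ps=120.0, Pd=80.0)
```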
The proposed methodology for analyzing the risk of heart failure is shown in Fig. 2. A total of 165 subjects (55 normal, 55 diabetic, and 55 CVD) are taken for the study. For all 165 subjects, measurement variables and other biochemical analyses are performed after obtaining the consent form and questionnaire with proper ethical approval. FRS was then calculated based on the measurement and biochemical variables, and an M-mode carotid ultrasound image was obtained for all 165 subjects. The analysis was performed using biostatistics, machine learning, and deep learning on the measured variables to classify the risk of CVD. Machine learning was performed using various classification techniques, achieving a maximum accuracy of 79%. Further, the 165 M-mode carotid artery images were augmented to a dataset of 2200, and the deep learning architectures [31–33] InceptionV3 and ResNet50 [34], shown in Figs. 3 and 4 respectively, were applied. The InceptionV3 has a convolution layer with 192 filters, a maxpool layer, an InceptionV3 layer with 1024 filters, and an average pool layer. The ResNet50 has a convolution layer with 2048 filters, a maxpool layer, and an average pool layer before the flatten and softmax layers. Features are extracted before the softmax
Fig. 2 Proposed methodology for analyzing the risk of heart failure
Fig. 3 InceptionV3 architecture
layer in both InceptionV3 and ResNet50. The extracted features from ResNet50 and InceptionV3 are trained using a deep learning CNN architecture; both InceptionV3 and ResNet50 produced high accuracies of 99% and 98%, respectively. The dataset is split 80% for training and 20% for validation. The entire dataset is fed into the feature extraction algorithm, and the extracted features are given to the classification neural network to classify normal and abnormal carotid artery
Fig. 4 ResNet50 architecture
images; the results are reported only for the 20% validation data used to classify the risk of heart failure.
3 Results and Discussion

Biostatistical analyses, namely Student's t-test and Pearson correlation, are performed on the arterial stiffness indices, FRS, and biochemical parameters. Table 1 shows the Pearson correlation for 55 CVD and 55 normal subjects. Highly correlated parameters are shown with p < 0.01 (considered highly significant, marked with a double star) and slightly correlated parameters with p < 0.05 (considered slightly significant, marked with a single star). FRS is highly significantly correlated with stiffness index, elastic modulus, FBS, PPBS, and HbA1C (double star). The Machine Learning (ML) classification techniques used are Naïve Bayes, KNN, Random Forest, SVM, and multilayer perceptron, to detect the best-fit classifier for predicting the risk of heart failure between normal and abnormal subjects. The features extracted for the ML classifiers [35] are elastic modulus, stiffness index, distensibility, compliance, the biochemical parameters, and the FRS score. Table 2 shows the performance of the ML classifiers, with a maximum accuracy of 79%. To improve the accuracy, the carotid artery images are augmented and features are extracted using the ResNet50 and InceptionV3 deep learning architectures [31]; the extracted features are then trained using a deep neural network to achieve greater accuracy. Tables 3 and 5 show the deep learning confusion matrices of ResNet50 and InceptionV3, respectively, for normal and diabetic subjects on the 20% validation dataset. The ResNet50 result in Table 3 shows that 213 subjects are predicted as true positive (subjects who have a risk of heart failure), 216 subjects as true negative (subjects who have no risk of heart failure), 11 subjects as false positive (subjects with no risk of heart failure predicted as at risk), and no subjects as false negative (subjects at risk predicted as
Table 1 Pearson correlation of arterial stiffness indices parameters with biochemical parameters
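The Pearson r and its t-based significance used throughout this section follow the standard formulas; a minimal pure-Python sketch (the sample values below are hypothetical, not the study's data):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))

# Hypothetical paired readings (illustrative only)
si = [1.0, 2.0, 3.0, 4.0, 5.0]
frs = [1.0, 2.0, 3.0, 5.0, 4.0]
r = pearson_r(si, frs)           # r = 0.9 for this sample
print(round(r, 3), round(t_statistic(r, len(si)), 3))
```

For 55 + 55 subjects, the resulting t statistic would be compared against a t-distribution with n − 2 degrees of freedom to obtain the p < 0.01 and p < 0.05 thresholds quoted above.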
682 P. L. Prabha et al.
M-mode Carotid Artery Image Classification and Risk Analysis …
Table 2 Comparison of various ML classification techniques

Classification technique | Sensitivity (%) | Specificity (%) | Precision (%) | Accuracy (%)
Naive Bayes              | 69 | 75 | 90 | 70
Logistic regression      | 75 | 80 | 90 | 73
SVM                      | 72 | 77 | 90 | 79
KNN                      | 78 | 81 | 90 | 79
Random forest            | 78 | 81 | 90 | 79
Multilayer perceptron    | 78 | 81 | 90 | 79
Table 3 Deep learning confusion matrix of ResNet50 for normal and diabetic subjects (20% validation data)

Predicted class | Observed: abnormal (true) | Observed: normal (false)
Abnormal class  | 213 (true positive)       | 11 (false positive)
Normal class    | 0 (false negative)        | 216 (true negative)
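The sensitivity, specificity, and accuracy reported in the performance tables follow directly from these confusion-matrix counts; a small sketch using the Table 3 entries (the reported 97% corresponds to 429/440 ≈ 97.5%):

```python
def metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # recall of the abnormal class
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy

# Counts from Table 3 (ResNet50, 20% validation split)
sens, spec, acc = metrics(tp=213, tn=216, fp=11, fn=0)
print(f"sensitivity={sens:.1%}  specificity={spec:.1%}  accuracy={acc:.1%}")
```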
no risk of heart failure). Table 4 shows the performance of ResNet50 on the validation data: sensitivity 100%, specificity 95%, and accuracy 97%.

For InceptionV3 (Table 5), 212 subjects are true positives (subjects at risk of heart failure), 223 are true negatives (subjects not at risk), four are false positives (subjects not at risk but predicted as at risk), and one is a false negative (a subject at risk predicted as not at risk). Table 6 shows the performance of InceptionV3 on the validation data: sensitivity 99%, specificity 98%, and accuracy 98%. Figures 5 and 6 show the validation and accuracy plots of ResNet50 and InceptionV3 for normal and diabetic subjects.

Table 7 shows the deep neural network confusion matrix of ResNet50 for normal and CVD subjects. For ResNet50 (Table 7), 225 subjects are predicted as

Table 4 Performance of the ResNet50 classifier for normal and diabetic subjects

Measure     | Value (%)
Sensitivity | 100
Specificity | 95
Accuracy    | 97

Table 5 Deep learning confusion matrix of InceptionV3 for normal and diabetic subjects (20% validation data)

Predicted class | Observed: abnormal (true) | Observed: normal (false)
Abnormal class  | 212 (true positive)       | 4 (false positive)
Normal class    | 1 (false negative)        | 223 (true negative)
Table 6 Performance of the InceptionV3 classifier for normal and diabetic subjects

Measure     | Value (%)
Sensitivity | 99
Specificity | 98
Accuracy    | 98
Fig. 5 Validation and accuracy plot of ResNet50 for normal and diabetic subjects
Fig. 6 Validation and accuracy plot of InceptionV3 for normal and diabetic subjects
Table 7 Deep neural network confusion matrix of ResNet50 for normal and CVD subjects (20% validation data)

Predicted class | Observed: abnormal (true) | Observed: normal (false)
Abnormal class  | 225 (true positive)       | 3 (false positive)
Normal class    | 2 (false negative)        | 210 (true negative)
true positives (subjects at risk of heart failure), 210 subjects are true negatives (subjects not at risk), 3 are false positives (subjects not at risk but predicted as at risk), and 2 are false negatives (subjects at risk predicted as not at risk). Table 8 shows the performance of the ResNet50 classifier for normal and CVD subjects on the validation data: sensitivity 99%, specificity 98%, and accuracy 99%. For InceptionV3 (Table 9), 222 subjects are true positives (subjects at risk of heart failure), 213 are true negatives (subjects not at risk), 2 are false positives (subjects not at risk but predicted as at risk), and 3 are predicted as false
Table 8 Performance of the ResNet50 classifier for normal and CVD subjects (validation data)

Measure     | Value (%)
Sensitivity | 99
Specificity | 98
Accuracy    | 99

Table 9 Deep neural network confusion matrix of InceptionV3 for normal and CVD subjects (20% validation data)

Predicted class | Observed: abnormal (true) | Observed: normal (false)
Abnormal class  | 222 (true positive)       | 2 (false positive)
Normal class    | 3 (false negative)        | 213 (true negative)
negative (subjects at risk predicted as not at risk). Table 10 shows the performance of the InceptionV3 classifier for normal and CVD subjects on the validation data: sensitivity 98%, specificity 99%, and accuracy 98%. Figure 7 shows the validation and accuracy plot of ResNet50 for normal and CVD subjects, and Fig. 8 shows the corresponding plot for InceptionV3.

Risk analysis was performed with the SPSS statistical software on the 165 datasets. Low and high risk were defined by a cut-off value using the recode-into-different-variables option under Transform, and cross-tab analysis was then performed to identify the relative risk. Table 11 shows the risk analysis of the stiffness index with FRS. Low risk of FRS with high stiffness index is 7.0% and high risk with

Table 10 Performance of the InceptionV3 classifier for normal and CVD subjects (validation data)

Measure     | Value (%)
Sensitivity | 98
Specificity | 99
Accuracy    | 98
Fig. 7 Validation and accuracy plot of ResNet50 for normal and CVD subjects
Fig. 8 Validation and accuracy plot of InceptionV3 for normal and CVD subjects
Table 11 Risk analysis of FRS with stiffness index

Framingham risk score (FRS) |              | Stiffness index: low risk | Stiffness index: high risk | Total
Low risk                    | Count        | 53    | 4     | 57
                            | % within FRS | 93.0% | 7.0%  | 100.0%
High risk                   | Count        | 64    | 44    | 108
                            | % within FRS | 59.3% | 40.7% | 100.0%
Total                       | Count        | 117   | 48    | 165
                            | % within FRS | 70.9% | 29.1% | 100.0%
high stiffness index is 40.7%. Table 12 gives the relative risk estimate and the odds ratio for FRS, which is 9.1. The relative risk analysis shows that a person with a low-risk FRS has 0.17 times the chance of a high stiffness index compared with a person with a high-risk FRS; hence, a person with a low-risk FRS is less likely to have a high stiffness index. Conversely, a person with a high-risk FRS has 3.7 times the chance of a high stiffness index relative to a person with low risk.

Table 13 shows the risk analysis of the elastic modulus with FRS: 45.6% of low-FRS-risk subjects have a high elastic modulus, versus 82.4% of high-FRS-risk subjects. Table 14 gives the relative risk and odds ratio for FRS, which is 5.5. The relative risk analysis shows that a person with a low-risk FRS has 0.55 times the chance

Table 12 Relative risk estimate and odds ratio for FRS and stiffness index
Risk analysis                           | Risk value | 95% CI (lower) | 95% CI (upper)
Odds ratio for FRS (low risk/high risk) | 9.109      | 3.074          | 26.995
For cohort stiffness index = low risk   | 1.569      | 1.321          | 1.863
For cohort stiffness index = high risk  | 0.172      | 0.065          | 0.455
No. of valid cases                      | 165        |                |
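The odds ratio and the cohort relative risks in Table 12 can be recomputed directly from the Table 11 counts; a sketch (the function name is ours, not SPSS's):

```python
def two_by_two_risk(a, b, c, d):
    """Risk estimates for a 2x2 table.
    a, b: low-risk-FRS subjects with low / high stiffness index;
    c, d: high-risk-FRS subjects with low / high stiffness index."""
    odds_ratio = (a * d) / (b * c)
    rr_low = (a / (a + b)) / (c / (c + d))    # cohort: stiffness index = low risk
    rr_high = (b / (a + b)) / (d / (c + d))   # cohort: stiffness index = high risk
    return odds_ratio, rr_low, rr_high

# Counts from Table 11
odds, rr_lo, rr_hi = two_by_two_risk(53, 4, 64, 44)
print(round(odds, 3), round(rr_lo, 3), round(rr_hi, 3))  # 9.109 1.569 0.172
```

The three printed values reproduce the risk-value column of Table 12 exactly.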
Table 13 Risk analysis of FRS with elastic modulus

Framingham risk score (FRS) |              | Elastic modulus: low risk | Elastic modulus: high risk | Total
Low risk                    | Count        | 31    | 26    | 57
                            | % within FRS | 54.4% | 45.6% | 100.0%
High risk                   | Count        | 19    | 89    | 108
                            | % within FRS | 17.6% | 82.4% | 100.0%
Total                       | Count        | 50    | 115   | 165
                            | % within FRS | 30.3% | 69.7% | 100.0%
Table 14 Relative risk and odds ratio for elastic modulus and FRS

Risk analysis                           | Risk value | 95% CI (lower) | 95% CI (upper)
Odds ratio for FRS (low risk/high risk) | 5.585      | 2.721          | 11.463
For cohort EP = low risk                | 3.091      | 1.928          | 4.958
For cohort EP = high risk               | 0.554      | 0.411          | 0.745
No. of valid cases                      | 165        |                |
of getting a high elastic modulus compared with a person with a high-risk FRS; hence, a person with a low-risk FRS is less likely to have a high elastic modulus.

A limitation of this study is that it requires a large number of datasets with different classes of subjects, such as diabetic, cardiovascular, renal-disease, and smoking subjects. With larger datasets, multiclass classifiers can be used across these groups to predict the risk of heart failure, and fully automated carotid artery segmentation can be performed using deep learning techniques.
4 Conclusion

Arterial stiffness indices such as stiffness index, elastic modulus, distensibility, and compliance were measured from carotid artery ultrasound images, and FRS was calculated from the biochemical parameters. Pearson's correlation between the FRS value and the other parameters shows significant results with p < 0.01. Machine learning achieved a maximum accuracy of 79%, obtained with the KNN, random forest, and multilayer perceptron classifiers. Deep learning with ResNet50 and InceptionV3 on 2200 datasets achieved higher accuracies of 97% and 98%, respectively, for normal and diabetic subjects. Similarly, the transfer learning approach with ResNet50 and InceptionV3 for normal and CVD subjects produced 99% and 98% accuracy, respectively. Risk analysis was performed using SPSS software for the stiffness index and elastic
modulus with FRS; the results show that a subject with a low-risk FRS has 0.17 times the chance of a high stiffness index and 0.55 times the chance of a high elastic modulus, compared with a person with a high-risk FRS.
References

1. I.S. Mackenzie, I.B. Wilkinson, J.R. Cockcroft, Assessment of arterial stiffness in clinical practice. QJM Mon. J. Assoc. Physicians 95(2), 67–74 (2002). https://doi.org/10.1093/qjmed/95.2.67
2. A.N. Lyle, U. Raaz, Killing me unsoftly: causes and mechanisms of arterial stiffness. Arterioscler. Thromb. Vasc. Biol. 37(2), e1–e11 (2017). https://doi.org/10.1161/ATVBAHA.116.308563
3. D.A. Duprez, J.N. Cohn, Arterial stiffness as a risk factor for coronary atherosclerosis. Curr. Atheroscler. Rep. 9(2), 139–144 (2007). https://doi.org/10.1007/s11883-007-0010-y
4. D.L. Cohen, R.R. Townsend, Update on pathophysiology and treatment of hypertension in the elderly. Curr. Hypertens. Rep. 13(5), 330–337 (2011). https://doi.org/10.1007/s11906-011-0215-x
5. G. Radchenko, I. Zhyvylo, E. Titov, Y.U. Sirenko, Evaluation of arterial stiffness in new diagnosed idiopathic pulmonary arterial hypertension patients. Available: https://academic.oup.com/eurheartj/article/41/Supplement_2/ehaa946.2245/6005675
6. M. Enomoto et al., LDL-C/HDL-C ratio predicts carotid intima-media thickness progression better than HDL-C or LDL-C alone. J. Lipids 2011, 1–6 (2011). https://doi.org/10.1155/2011/549137
7. M.E. Safar, S. Czernichow, J. Blacher, Obesity, arterial stiffness, and cardiovascular risk. J. Am. Soc. Nephrol. 17(Suppl. 2), 109–111 (2006). https://doi.org/10.1681/ASN.2005121321
8. F.U.S. Mattace-Raso et al., Arterial stiffness and risk of coronary heart disease and stroke: the Rotterdam Study. Circulation 113(5), 657–663 (2006). https://doi.org/10.1161/CIRCULATIONAHA.105.555235
9. D.H. O'Leary, J.F. Polak, R.A. Kronmal, T.A. Manolio, G.L. Burke, S.K. Wolfson, Carotid-artery intima and media thickness as a risk factor for myocardial infarction and stroke in older adults. N. Engl. J. Med. 340(1), 14–22 (1999). https://doi.org/10.1056/nejm199901073400103
10. N.M. Van Popele et al., Association between arterial stiffness and atherosclerosis: the Rotterdam study (2001). Available: http://ahajournals.org
11. B.G. Nordestgaard, J. Zacho, Lipids, atherosclerosis and CVD risk: is CRP an innocent bystander? Nutrit. Metabol. Cardiovasc. Dis. 19(8), 521–524 (2009). https://doi.org/10.1016/j.numecd.2009.07.005
12. A. Pandey et al., Arterial stiffness and risk of overall heart failure, heart failure with preserved ejection fraction, and heart failure with reduced ejection fraction: the Health ABC study (Health, Aging, and Body Composition). Hypertension 69, 267–274 (2017). https://doi.org/10.1161/HYPERTENSIONAHA.116.08327
13. M. Butlin, A. Qasem, Large artery stiffness assessment using SphygmoCor technology. Pulse 4(4), 180–192 (2016). https://doi.org/10.1159/000452448
14. B.M. Pannier, A.P. Avolio, A. Hoeks, G. Mancia, K. Takazawa, Methods and devices for measuring arterial compliance in humans (2000). Available: https://academic.oup.com/ajh/article/15/8/743/144071
15. J.A. Chirinos, Arterial stiffness: basic concepts and measurement techniques. https://doi.org/10.1007/s12265-012-9359-6
16. K.H. Park, M.K. Kim, H.S. Kim, W.J. Park, G.Y. Cho, Y.J. Choi, Clinical significance of Framingham risk score, flow-mediated dilation and pulse wave velocity in patients with stable angina. Circ. J. 75(5), 1177–1183 (2011). https://doi.org/10.1253/circj.CJ-10-0811
17. H. Smulyan, A. Lieber, M.E. Safar, Hypertension, diabetes type II, and their association: role of arterial stiffness. Am. J. Hypertens. 29(1), 5–13 (2016). https://doi.org/10.1093/ajh/hpv107
18. J. Hashimoto, M.F. O'Rourke, Is arterial stiffness better than blood pressure in predicting cardiovascular risk? Curr. Cardiovasc. Risk Rep. 2(2), 133–140 (2008). https://doi.org/10.1007/s12170-008-0025-0
19. M.E. Safar, B.I. Levy, H. Struijker-Boudier, Current perspectives on arterial stiffness and pulse pressure in hypertension and cardiovascular diseases. Circulation 107(22), 2864–2869 (2003). https://doi.org/10.1161/01.CIR.0000069826.36125.B4
20. D. Orozco-Beltran et al., Lipid profile, cardiovascular disease and mortality in a Mediterranean high-risk population: the ESCARVAL-RISK study. https://doi.org/10.1371/journal.pone.0186196
21. R.H. MacKey, P. Greenland, D.C. Goff, D. Lloyd-Jones, C.T. Sibley, S. Mora, High-density lipoprotein cholesterol and particle concentrations, carotid atherosclerosis, and coronary events: MESA (Multi-Ethnic Study of Atherosclerosis). J. Am. Coll. Cardiol. 60(6), 508–516 (2012). https://doi.org/10.1016/j.jacc.2012.03.060
22. Y. Li et al., Ambulatory arterial stiffness index derived from 24-hour ambulatory blood pressure monitoring (2006). https://doi.org/10.1161/01.HYP.0000200695.34024.4c
23. E. Dolan et al., Ambulatory arterial stiffness index as a predictor of cardiovascular mortality in the Dublin Outcome Study (2006). https://doi.org/10.1161/01.HYP.0000200699.74641.c5
24. M.J. Roman et al., Arterial stiffness in chronic inflammatory diseases (2005). https://doi.org/10.1161/01.HYP.0000168055.89955.db
25. R.M. Conroy et al., Estimation of ten-year risk of fatal cardiovascular disease in Europe: the SCORE project. Eur. Heart J. 24(11), 987–1003 (2003). https://doi.org/10.1016/S0195-668X(03)00114-3
26. T. Yingchoncharoen, T. Limpijankit, S. Jongjirasiri, J. Laothamatas, S. Yamwong, P. Sritara, Arterial stiffness contributes to coronary artery disease risk prediction beyond the traditional risk score (RAMA-EGAT score). Heart Asia 4(1), 77–82 (2014). https://doi.org/10.1136/heartasia-2011-010079
27. J. Marrugat et al., An adaptation of the Framingham coronary heart disease risk function to European Mediterranean areas. J. Epidemiol. Community Health 57(8), 634–638 (2003). https://doi.org/10.1136/jech.57.8.634
28. K.M. Anderson, P.W.F. Wilson, P.M. Odell, W.B. Kannel, An updated coronary risk profile. A statement for health professionals. Circulation 83(1), 356–362 (1991). https://doi.org/10.1161/01.CIR.83.1.356
29. T. Khamdaeng, J. Luo, J. Vappou, P. Terdtoon, E.E. Konofagou, Arterial stiffness identification of the human carotid artery using the stress-strain relationship in vivo. Ultrasonics 52(3), 402–411 (2012). https://doi.org/10.1016/j.ultras.2011.09.006
30. G. Baltgaile, Arterial wall dynamics. Perspect. Med. 1(1–12), 146–151 (2012). https://doi.org/10.1016/j.permed.2012.02.049
31. D. Alblas, C. Brune, J.M. Wolterink, Deep learning-based carotid artery vessel wall segmentation in black-blood MRI using anatomical priors. arXiv preprint arXiv:2112.01137 (2021)
32. S. Savaş, N. Topaloğlu, Ö. Kazcı, P.N. Koşar, Classification of carotid artery intima media thickness ultrasound images with deep learning. J. Med. Syst. 43(8) (2019). https://doi.org/10.1007/s10916-019-1406-2
33. L. Saba et al., Ultrasound-based internal carotid artery plaque characterization using deep learning paradigm on a supercomputer: a cardiovascular disease/stroke risk assessment system. Int. J. Cardiovasc. Imaging 37, 1511–1528 (2021). https://doi.org/10.1007/s10554-020-02124-9
34. M. Raj, I. Gogul, M. Deepan Raj, V. Sathiesh Kumar, V. Vaidehi, S. Sibi Chakkaravarthy, Analyzing ConvNets depth for deep face recognition, vol. 703 (Springer, Singapore, 2018)
35. M.K. Abd-Ellah, A.A.M. Khalaf, R.R. Gharieb, D.A. Hassanin, Automatic diagnosis of common carotid artery disease using different machine learning techniques. J. Ambient Intell. Humaniz. Comput. (2021). https://doi.org/10.1007/s12652-021-03295-6
Analyzing a Chess Engine Based on Alpha–Beta Pruning, Enhanced with Iterative Deepening

Aayush Parashar, Aayush Kumar Jha, and Manoj Kumar
Abstract Chess is a two-player strategy board game played on a chessboard, a checkered board of 64 squares arranged in an 8 × 8 grid. Current technological advancements have transformed not only the way we play chess but also the way we study chess techniques and analyze the game. Predicting the next move in a chess game is not an easy task, as the branching factor of chess is about 35 on average, so new algorithms must be researched to build an effective game engine. In this paper, we analyze why the traditional algorithms fail to work for a chess engine and create one such engine, which evaluates the game tree using the traditional alpha–beta algorithm enhanced with iterative deepening search.

Keywords Chess engine · Alpha–beta pruning · Min–max algorithm · IDDFS
1 Introduction

Game trees are the traditional method for predicting the next best move in a game engine, where one player's moves are generated by an automated engine. While game trees work well for some simple games, they become increasingly complicated as the branching factor of a game increases. Various algorithms, such as the minimax algorithm and alpha–beta pruning, are used to evaluate the next optimal move, but they fail when applied naively to a chess engine. Chess is a very complicated game: at any point in time there are many possible options for the next move, so even creating the game tree becomes a very heavy task, let alone evaluating it to predict the next move. In this paper, we analyze why traditional algorithms fail for a chess engine and the strategy that can be used to enhance the existing traditional methods of game tree evaluation.
A. Parashar (B) · A. K. Jha · M. Kumar Delhi Technological University, Shahbad Daulatpur, Delhi, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_51
A. Parashar et al.
1.1 Background

The first thought that comes to mind when asked about a board game is chess, one of the oldest board games known to mankind. No one knows how and when it started, but in 1770 the first successful chess automaton was created; it worked with the help of a human hidden inside who did all the thinking, which demonstrates how enthusiastic people already were about this 32-piece board game and its complexity. The first electromechanical chess device, designed in 1890, played KRK endgames (endgames with a king and rook against a king). Computer chess has a long history: since the 1940s, chess has appeared wherever automation and computer science are discussed, and much work has gone into developing theory and creating chess algorithms; for instance, Claude Shannon's paper "Programming a Computer for Playing Chess" is considered the first step toward today's computer chess theory. Automation has achieved various milestones, such as Deep Blue defeating grandmaster Garry Kasparov (1997) and Hydra defeating Michael Adams (2005).

Drosophila (the fruit fly) is one of the species that has brought the most insight to biology, especially genetics, and many researchers believe chess is the Drosophila of AI. This is not perfectly true: according to Donald Michie and Stephen Coles there is no comparison, since Drosophila has given a lot to biology whereas chess has not done as much for AI. This observation reflects practice, in that chess computers rely on brute force rather than AI. Research on chess engines mainly focuses on two fields:

1. The quest for search efficiency, which has resulted in many optimizations to the search algorithms of chess computers and hence better performance. This field focuses on pruning branches from the game tree for efficient search. New ideas, where the search is guided by chess knowledge, are also being investigated, making the research more AI-oriented again.
2. Representation of chess knowledge in chess computers. To decide the best move, the automaton evaluates which positions should be pursued and which not. Tuning by hand, genetic algorithms, and similar methods were used for the evaluation function in Deep Thought and Deep Blue.
1.2 Research Work

Research on artificial intelligence algorithms has been going on for a long time. New algorithms are explored and researched to achieve the best result with minimum time and space complexity, often a trade-off between the two. Numerous studies have sought to enhance the alpha–beta algorithm; the following are some of these enhancements:

1. Transpositions. The search space in an alpha–beta algorithm is generally designed as a tree; however, it can also be considered a directed acyclic graph. Tables, called transposition tables, can be used to store which nodes have been visited, which can save the algorithm a lot of unnecessary moves [1].
2. Best move first. If we go through the best move first at every node, we may save a lot of time and complexity. Thus, various dynamic and static schemes are designed so that all the possible moves from the current node can be ordered from best to worst [1].
3. Variable depth. While pruning the search tree, it is not necessary to go to the maximum depth under every node. Some moves have potential that should be explored further, while others are not worth spending time on. Thus, algorithms are designed that can make this decision and choose the appropriate search depth for the current subtree [1].
4. Minimal windows. The search window of the alpha–beta tree lies between the lower bound (α) and the upper bound (β). If the window is narrowed, the likelihood of a cutoff increases [1].
A paper published at the Chinese Control and Decision Conference [2] explored the features of artificial intelligence in game models and developed a checkers game using alpha–beta pruning. It proposed an algorithm based on alpha–beta pruning and further optimized it with an iterative deepening algorithm, implementing the whole model in a checkers game; by adjusting the search depth dynamically, it handled the problem caused by changes in the checkers position. Another unique feature of that paper was a kind of zoning evaluation: the board was divided into regions that were assessed separately, and the pawn differences were quantified through experiments conducted to obtain a reasonable scaling factor between the king and an ordinary piece. The paper "Alpha–Beta Pruning in Minimax Algorithm: An Optimized Approach for a Connect-4 Game" (2018) [3] presented an enhancement of alpha–beta pruning with the min–max algorithm for a Connect-4 game. In that system, a user can log in and challenge various levels of AI with an increasing degree of difficulty, and the AI's moves are optimized using the minimax algorithm with alpha–beta pruning.
2 Game Tree of the System

Chess is a zero-sum, deterministic, finite, complete-information game. Zero-sum means that at any point in the game the sum of the players' scores is zero: the main purpose of each player is to win, and when one wins the other is bound to lose; in the case of a draw the sum remains zero. For instance, if one player's score is +36, the other's must be −36. Deterministic means there are only three possible outcomes: win, lose, or draw. Finite means the game does end and is not never-ending; various rules guarantee this, for example, if the same position occurs three times the result is a draw. Even
after a long sequence of moves with no capture or pawn move (the fifty-move rule), the game is a draw. Complete information means there is no hidden information or uncertainty, unlike in a card game such as poker, and the game is played sequentially: both players have the same information.

In a two-player game such as chess, the requirements of alpha–beta search can be described using a tree. Each node of the tree is a position; the root node is where the game starts, and its child nodes are the possible next positions. After every possibility, further moves are generated and added to the tree, so the whole game can be visualized in the game tree. A move in chess is when black and white each play one turn; a single turn by either player is called a ply, although sometimes a ply is also called a move, as chess terminology is inconsistent.

As shown in Fig. 1, many branches come out of every node of the game tree, reflecting the very high branching factor of chess (about 35 on average). So many moves are available at any point of a chess game that the traditional alpha–beta algorithm becomes hard to apply directly. At the leaf nodes of the game tree, the evaluation function assesses who is winning and scores each move accordingly. As the figure shows, multiple move orders can lead to the same position, which suggests that caching positions already evaluated by the engine may save computation.
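The effect of that branching factor compounds per ply, which is why exhaustive tree construction quickly becomes impractical; a quick illustration of the approximate node counts:

```python
BRANCHING_FACTOR = 35  # average number of legal moves in a chess position

# Approximate leaf count of a full game tree after a given number of plies
for plies in (1, 2, 4, 6):
    print(plies, BRANCHING_FACTOR ** plies)
```

Already at six plies (three full moves) the tree has on the order of 35^6 ≈ 1.8 billion leaves.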
Fig. 1 Game tree of a chess engine
3 Proposed Work

The development of an intelligent model for a chess game that can make valid and optimized moves on a chessboard requires the application of various concepts, and many factors must be considered so that the predicted next move is the most appropriate one. A chess program can be divided into the following three modules:

1. Move generator: a module responsible for generating the next move. It takes into account the current positions of all the chess pieces and evaluates the next best move based on certain conditions specified at the start of the game. It recursively generates the next legal moves and forms a game tree.
2. Evaluation function: for any heuristic algorithm, the most important part is how we evaluate whether the current move is appropriate. For that purpose, we use an evaluation function. In chess, a position can be evaluated on various factors, such as position, material, and attack. Logically, the best possible position is one in which the player is winning, so it should return a positive value.
3. Search algorithms: after evaluating the best move, we need to find a path to that node, which is done using search algorithms. Chess is a zero-sum game, so all the nodes on the path to the best node (non-winning or non-final leaf nodes) are assigned the maximum value relative to their neighbors.

These are the major modules considered while building a chess engine.
3.1 Move Generator

For a chess game, the branching factor is approximately 35 at any point in time. Each side has two rooks, two bishops, two knights, one queen, one king, and eight pawns on the 8 × 8 chessboard. A pawn can move one or two squares forward, or one square diagonally when capturing. A knight can make up to eight moves on an empty chessboard. A bishop moves diagonally, with up to 13 moves available from the center of the board; a rook moves vertically or horizontally across the board; the king can take one step in any of the eight directions; and the queen moves diagonally as well as vertically and horizontally. While on an empty chessboard with only one piece it would be easy to enumerate all the possible moves, in an actual chess game it gets increasingly complicated: first a piece must be selected from those available, and then the best move for that piece.

As discussed earlier, drawing the whole game tree for chess and recursively going through every branch before choosing the best option is not practical, as it would cost far too much time and space. To address this, the min–max (minimax) algorithm can be used: a recursive, backtracking algorithm that provides an optimal move for the player under the assumption that the opponent also plays optimally. However, the traditional min–max algorithm alone will not work in this case.
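The plain min–max recursion can be sketched abstractly; `evaluate` and `children` are hypothetical placeholders standing in for the engine's evaluation function and move generator:

```python
def minimax(node, depth, maximizing, evaluate, children):
    """Plain minimax: visits every branch down to the given depth."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    values = (minimax(k, depth - 1, not maximizing, evaluate, children)
              for k in kids)
    return max(values) if maximizing else min(values)

# Toy two-ply tree: root -> two moves, each with two replies (leaf scores)
tree = [[3, 5], [2, 9]]
children = lambda n: n if isinstance(n, list) else []
evaluate = lambda n: n if isinstance(n, int) else 0
print(minimax(tree, 2, True, evaluate, children))  # best guaranteed score: 3
```

On this toy tree the maximizer can guarantee a score of 3: whichever first move it makes, the minimizer replies with the smaller leaf, and 3 is the best of those worst cases.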
696
A. Parashar et al.
Alpha–beta pruning is an enhanced version of the minimax algorithm in which the correct minimax decision can be computed without going through all the branches, a technique called pruning. Pruning some branches reduces the load on the engine, although on its own it is still not the most appropriate choice. In addition, some conditions must be checked while making a move; failure of these conditions is not acceptable, and such a branch can be pruned immediately [4]. These conditions can be described at the start of the game: for example, a move that leaves the king in danger should not be made by the chess engine [5–10], so that branch can be pruned. If a state is reached in which no move by the engine keeps the king out of danger, it is a checkmate condition and the game terminates at that moment [11, 12].
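The pruning step itself can be sketched as a minimal alpha–beta recursion; `evaluate` and `children` are again hypothetical placeholders for the evaluation function and move generator:

```python
def alphabeta(node, depth, alpha, beta, maximizing, evaluate, children):
    """Minimax with alpha-beta cutoffs: skips branches that cannot
    change the decision at the root."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = float("-inf")
        for k in kids:
            value = max(value, alphabeta(k, depth - 1, alpha, beta, False,
                                         evaluate, children))
            alpha = max(alpha, value)
            if alpha >= beta:   # beta cutoff: the minimizer avoids this line
                break
        return value
    value = float("inf")
    for k in kids:
        value = min(value, alphabeta(k, depth - 1, alpha, beta, True,
                                     evaluate, children))
        beta = min(beta, value)
        if alpha >= beta:       # alpha cutoff: the maximizer avoids this line
            break
    return value

# Toy two-ply tree of leaf scores
tree = [[3, 5], [2, 9]]
children = lambda n: n if isinstance(n, list) else []
evaluate = lambda n: n if isinstance(n, int) else 0
print(alphabeta(tree, 2, float("-inf"), float("inf"), True, evaluate, children))  # 3
```

The result matches plain minimax by construction; on larger trees, ordering the best moves first triggers many cutoffs early, which is why move ordering matters so much in practice.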
3.2 Evaluation Functions and Piece Values

Chess pieces carry a certain value; even though many people don't know about it, it is an essential feature of the game as well as of automation. Piece values help determine which piece can be traded for another in an exchange, and they help the program evaluate the position and go for the best possible move. Each piece has its own weaknesses and strengths, and the piece value gives a total point-based representation of them. The king has no value, because the main motive of the game is to checkmate the king, so saving the king is the utmost priority in the game [13]. Values are of great significance even while strategizing: for instance, trading two rooks for a queen is mostly unfavorable, because two rooks are worth 10 points while the queen is worth 9, though sometimes it is all right depending on the game (Table 1).

Engine evaluation has a direct connection with piece values. The evaluations are numerical, with + being favorable for White and − for Black. For instance, if the evaluation is +1 then White is ahead by the value of a pawn, and if it is −5 then Black is ahead by the value of a rook. Sometimes the evaluation is a decimal, such as −2.5, meaning Black is ahead by two and a half pawns.

Table 1 Chess pieces with their material values

Chess piece | Value
Pawn        | 1
Knight      | 3
Bishop      | 3
Rook        | 5
Queen       | 9
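The point values in Table 1 translate directly into a material term of the evaluation function; a sketch (the piece-list representation is illustrative, a real engine would scan the board):

```python
# Material values from Table 1 (the king has no material value)
PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_score(white_pieces, black_pieces):
    """Point-based material balance: positive favours White, negative Black."""
    white = sum(PIECE_VALUES[p] for p in white_pieces)
    black = sum(PIECE_VALUES[p] for p in black_pieces)
    return white - black

# Two rooks (10) against a queen (9): the rook side is up one point
print(material_score(["rook", "rook"], ["queen"]))  # 1
```

This reproduces the two-rooks-versus-queen comparison in the text; a full evaluation function would add positional and attacking terms on top of this balance.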
3.3 Search Algorithm

A game tree is essentially a directed acyclic graph, and breadth-first search (BFS) and depth-first search (DFS) are the two most common ways of traversing a graph. For a game tree, depth-first search is generally used; however, in the case of a chess engine, plain DFS would not work. Another algorithm, iterative deepening depth-first search (IDDFS), combines the space efficiency of DFS with the fast shallow search of BFS. Caching is the process of storing useful data in a temporary storage location so that it can be accessed more quickly; as depicted earlier, different move orders often lead to the same stage in the game.

Alpha–beta pruning is a method of pruning branches that are irrelevant and do not contribute to the final result. Every node is assigned two parameters, alpha (α) and beta (β): alpha represents the minimum score that the maximizing player is assured of, and beta represents the maximum score that the minimizing player is assured of. Thus, any branch with α > β can be pruned.

The chess game begins with White making the first move. If the human player is White, the engine first waits for the player to move; if the player is Black, the engine moves first. A game tree is constructed using the move generator module, and the engine conducts an iterative deepening depth-first search on it, with a default depth or one taken as input from the player at the beginning of the game. The tree is evaluated and branches are pruned using alpha–beta pruning; the evaluated value of each node is stored in a cache register, and the engine takes the path that returns the highest evaluated value (Fig. 2). The player then makes the next move. From this point onward, the engine generates the game tree and, before evaluating a path, checks the cache register to see if it
Fig. 2 Snapshot of a white queen making a move
698
A. Parashar et al.
stores the evaluated value of this node. If the node is already evaluated, the Engine returns that value for the node; otherwise, it repeats the same procedure [14].
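The search loop described above can be sketched as follows. The game interface (`key`, `children`, `is_terminal`, `evaluate`) is hypothetical; any move generator exposing these hooks would fit. This is a simplified illustration of the combination of IDDFS, Alpha–Beta cutoffs, and the cache register, not the authors' implementation.

```python
# Sketch of the described search: iterative deepening over a game tree,
# alpha-beta pruning, and a cache ("cache register") that returns values
# for previously evaluated positions. All interface names are illustrative.

def alphabeta(node, depth, alpha, beta, maximizing, game, cache):
    key = (game.key(node), depth, maximizing)
    if key in cache:                      # reuse work when different move
        return cache[key]                 # orders reach the same position
    if depth == 0 or game.is_terminal(node):
        return game.evaluate(node)
    if maximizing:
        best = float("-inf")
        for child in game.children(node):
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, game, cache))
            alpha = max(alpha, best)
            if alpha >= beta:             # branch cannot affect the result
                break                     # -> prune (alpha-beta cutoff)
    else:
        best = float("inf")
        for child in game.children(node):
            best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                       True, game, cache))
            beta = min(beta, best)
            if alpha >= beta:
                break
    cache[key] = best
    return best

def iddfs_best_move(game, root, max_depth):
    """Iterative deepening: search depth 1, 2, ..., max_depth, keeping the cache."""
    cache, best_move = {}, None
    for depth in range(1, max_depth + 1):
        best_move = max(game.children(root),
                        key=lambda c: alphabeta(c, depth - 1, float("-inf"),
                                                float("inf"), False, game, cache))
    return best_move
```

Because the cache persists across deepening iterations, positions evaluated at shallow depths are reused when the search is repeated one ply deeper.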
4 Findings The results of the algorithm are presented in Table 2. With the plain Minimax implementation, the engine soon ran out of memory: once the search depth reaches five, no branch is pruned, all possible moves are evaluated, and memory is exhausted. This clearly shows the problem with the traditional Minimax algorithm and proves it unsuitable for a chess engine. For the Alpha–Beta algorithm, the results show that the time taken by the engine to predict the next move grows as depth increases. Figure 3 plots the Alpha–Beta results with depth on the X-axis and time taken on the Y-axis; the plot shows that the time taken increases exponentially with depth. Thus, while the Alpha–Beta algorithm certainly provides a great improvement over traditional Minimax, it is not ideal in itself. Another important observation is that using Iterative Deepening Depth-First Search here was a clear success: it allowed us to quantify how time taken grows with depth. Had traditional DFS been used, the results might have been very different. It would therefore be interesting to experiment with other heuristic/meta-heuristic algorithms such as Particle Swarm Optimization and Best-First Search [5, 15, 16]. Table 2 Time taken for prediction of the next move at different search depths with (a) the Minimax algorithm and (b) the Alpha–Beta algorithm
Search depth   Time for next-move prediction (seconds)
               Minimax algorithm      Alpha–Beta algorithm
2              0.201                  0.039
3              1.696                  0.173
4              13.848                 0.546
5              Ran out of memory      2.531
6              Ran out of memory      11.452
7              Ran out of memory      44.922
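The exponential-growth claim can be checked directly from the Table 2 values: the ratio between successive Alpha–Beta timings approximates the effective branching factor per additional ply.

```python
# Quick check of the exponential-growth claim using the Alpha-Beta column
# of Table 2: ratios of successive timings estimate the effective
# branching factor per extra ply of search depth.
depths  = [2, 3, 4, 5, 6, 7]
seconds = [0.039, 0.173, 0.546, 2.531, 11.452, 44.922]

ratios = [t2 / t1 for t1, t2 in zip(seconds, seconds[1:])]
print([round(r, 1) for r in ratios])  # roughly 3-5x per extra ply
```

The roughly constant multiplicative factor between depths is what the plot in Fig. 3 shows as an exponential curve.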
Fig. 3 The plot of the Table 2 values with time taken as Y-axis and depth as X-axis for the Alpha– Beta algorithm and the Min–Max algorithm
5 Future and Scalability The research can be extended to implement the concepts of Deep Learning and Neural Networks. A famously used feature in Deep Learning that optimizes decision-making is Reinforcement Learning. It does so with previously defined strategies. The overall aim of using these features is to form an exhaustive set of good strategies so that the algorithm can predict the most appropriate and optimal moves, and to do so with as little time as possible. Another way the Artificial Intelligence Model can be enhanced is by using Sibling Prediction Pruning. The SPP uses the maximum positional difference (MPD) between siblings’ properties. For systems with insufficient or low memory, SPP can be used instead of the Alpha–Beta Algorithm. It could provide more optimized solutions than Alpha–Beta for nodes with high branching factors.
6 Conclusion The paper aimed at building a chess game using Alpha–Beta Pruning and the Min–Max algorithm, enhanced with iterative deepening. The proof of concept on Alpha–Beta Pruning is a success. There is no information loss during this forward pruning method: only branches that do not affect the final solution are pruned. This is an important feature, as the major downside of other forward pruning mechanisms is information loss. Various evaluation functions can be used to quantify the value of each node. For the algorithm to work properly, it is important to accurately measure the position difference between siblings.
The results of the project show why traditional algorithms for game tree evaluation fail for a chess engine. While the Minimax algorithm and Alpha–Beta Pruning work for simpler games, they are not ideal for a chess engine. IDDFS over a game tree certainly improves the searching algorithm, but not by a huge factor. The results of this project cannot be generalized because of their dependence on the evaluation function, but they are intended as a proof of concept and show that it is worthwhile to investigate Alpha–Beta Pruning in a broader context.
References 1. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. 1(02), 73–82 (2019) 2. Z. Zhao, S. Wu, J. Liang, F. Lv, C. Yu, The game method of checkers based on alpha-beta search strategy with iterative deepening, in The 26th Chinese Control And Decision Conference (2014 CCDC) 3. R. Nasa, R. Didwania, S. Maji, V. Kumar, Alpha-Beta Pruning in Minimax algorithm “An optimized approach for a connect-4 game”. Int. Res. J. Eng. Technol. (IRJET) (2018) 4. W. Guangyao, L. Hedan, Study on the algorithm based on the important region of the board, in 2018 Chinese Control And Decision Conference (CCDC) 5. A. Primanita, R. Effendi, W. Hidayat, Comparison of A∗ and Iterative Deepening A∗ algorithms for non-player character in Role Playing Game, in 2017 International Conference On Electrical Engineering And Computer Science (ICECOS) (2017) 6. A. Plaat, J. Schaeffer, W. Pijls, A. de Bruin, A new paradigm for minimax search. Technical Report Tr 94–18, Department Of Computing Science, The University Of Alberta, Edmonton, Alberta, Canada 7. T. Vijayakumar, Comparative study of capsule neural network in various applications. J. Artif. Intell. 1(01), 19–27 (2019) 8. M. Tripathi, Analysis of convolutional neural network based image classification techniques. J. Innov. Image Processing (JIIP) 3(02), 100–117 (2021) 9. I.J. Jacob, P. Ebby Darney, Design of deep learning algorithm for IoT application by image based recognition. J. ISMAC 3(03), 276–290 (2021) 10. G. Ranganathan, An economical robotic arm playing chess using visual servoing. J. Innov. Image Process. (JIIP) 2(03), 141–146 (2020) 11. A.R. Mendes et al., Implementation of the automatic and interactive chess board. IOSR J. Electr. Electron. Eng. 9(6) (2014) 12. M. Tim Jones, AI and Games, in Artificial Intelligence: A Systems Approach (Hingham:Infinity Science Press, 2009), pp. 101–106 13. C. Matuszek, B. Mayton, R. Aimi, M.P. Deisenroth, L. Bo, R. 
Chu, et al., Gambit: an autonomous chess-playing robotic system, in Proceedings of International Conference on Robotics and Automation (2011), pp. 4291–4297 14. T. Cour, R. Lauranson, M. Vachette, autonomous chess-playing robot, Feb 2016 15. H.M. Luqman, M. Zaffar, Chess brain and autonomous chess playing robotic system, in 2016 International Conference on Autonomous Robot Systems and Competitions (ICARSC) (2016) 16. S. Ozan, S. Gümüstekin, A case study on logging visual activities: chess game, in Artificial Intelligence and Neural Networks (Springer, Berlin, 2006), pp. 1–10
Intelligent Miniature Autonomous Vehicle Dileep Reddy Bolla, Siddharth Singh, and H. Sarojadevi
Abstract Nowadays, one of the major causes of death in the Indian subcontinent is road accidents, with data suggesting over 151 thousand fatalities. One major cause of accidents is human error, which is why there is a need for an autonomous system that can drive the vehicle without human involvement. In this research work, we have conducted an extensive study of various deep learning algorithms whose models can be implemented in real time to reduce accident fatalities due to human error. To facilitate and secure the implementation, we have used a simulator provided by Udacity to model the automobile. The data needed to train the model is recorded in the simulator and then imported into the research work. Keywords Autonomous car · Deep learning · Computer vision · Neural networks · Lane detection · Convolutional neural networks · Edge detection · Linear regression
1 Introduction The very first four wheeler car was introduced to the market by Royal Enfield in the year 1893. Since then, the number of people buying and driving cars has increased exponentially, but experimentation with autonomous cars did not begin until the early 1920s. Encouraging trials were conducted in the 1950s, and work has continued since then. The first fully autonomous cars appeared in the 1980s, with the Navlab and ALV projects from Carnegie Mellon University in 1984 and the Eureka Prometheus Project from Mercedes-Benz and the Bundeswehr D. R. Bolla (B) · S. Singh · H. Sarojadevi Department of CSE, Nitte Meenakshi Institute of Technology, Yelahanka, Bangalore, India e-mail: [email protected] S. Singh e-mail: [email protected] H. Sarojadevi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_52
701
702
D. R. Bolla et al.
University Munich in 1987. Mercedes-Benz, General Motors, Continental Automotive Systems, Autoliv Inc., Bosch, Nissan, Toyota, Audi, Volvo, Vislab from the University of Parma, Oxford University, and Google are just a few of the major companies and research organizations that have developed working autonomous vehicles since then. Human error is highly inevitable, and people tend to find ways to delegate their work to autonomous systems, so research and development of self-driving cars will continue to rise. At the same time, road accident fatalities have also increased over the years, which justifies the need for self-driving cars. We have used machine learning and the power of convolutional neural networks to provide one more solution to this problem. The images generated by the Udacity simulator were used to extract features. The open-source simulator provided by Udacity lets us collect data and test our deep learning models. The data is collected in the form of images from the simulator, which has three cameras installed on the simulated vehicle: left, right, and center. The trained model predicts a steering angle between −1 and 1, where a negative angle represents a left turn and a positive angle represents a right turn. The home screen of the simulator is shown in Fig. 1. It offers two modes: training mode and autonomous mode. The data is recorded in training mode, and the model is tested in autonomous mode. Figure 2 shows the simulator and the model, connected via socket-IO and a Flask server. In autonomous mode, the simulator sends images to the server, which passes them to the model; the model predicts the steering angle and returns it through the server to the simulator, which steers the car autonomously in the desired direction.
Fig. 1 Start screen for autonomous car simulator by Udacity
Fig. 2 Data flow diagram between the simulator and the model connected via flask server
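The autonomous-mode round trip in Fig. 2 can be sketched as follows. The real system exchanges images over socket-IO and a Flask server; here the transport is abstracted away and the model is any callable returning a raw steering prediction. All function names are illustrative, not the authors' actual API.

```python
# Sketch of one autonomous-mode step from Fig. 2 (transport abstracted away;
# names are illustrative). The steering convention follows the text:
# -1 is a full left turn, +1 a full right turn.

def clamp_steering(angle):
    """Clamp a raw prediction into the valid steering range [-1, 1]."""
    return max(-1.0, min(1.0, angle))

def autonomous_step(image, preprocess, model):
    """One telemetry round-trip: image in, steering command out."""
    features = preprocess(image)
    return clamp_steering(model(features))

# Toy example: a 'model' that predicts a hard left beyond the valid range.
print(autonomous_step([0], lambda x: x, lambda x: -1.7))  # -1.0 (clamped)
```

In the actual pipeline, `preprocess` would be the image pipeline of Sect. 3.2 and `model` the trained CNN of Sect. 4.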
2 Literature Survey Ian Timmis et al. released a study in which they proposed utilizing a deep neural network to teach cars to navigate without the use of any additional software. The researchers created a tagged dataset for the Autonomous Campus TranspORt (ACTor) electric car by combining real-world photographs taken during a training drive with the corresponding steering angles. The authors used a variety of deep learning methodologies, including convolutional neural networks (CNNs) and transfer learning, to train the model to automatically detect essential characteristics of the input and produce an expected conclusion. The steering problem was solved using high-level characteristics and an Inception network pre-trained on the ImageNet dataset, with a linear regression layer added as the network's top node. The trained model is linked to the vehicle's software on the Robot Operating System (ROS) to interpret visual input and produce a corresponding steering angle in real time. The model in this research has an average inaccuracy of 15.2 degrees [1]. Wuttichai et al. began by analyzing several machine learning methods for self-driving cars on high-performance devices such as the Nvidia Jetson Nano. The platform is an RC-car running three algorithms: SVM, ANN-MLP, and CNN-LSTM. The car is driven at three different speeds and in three different situations, with and without obstructions. Based on accuracy rate, the authors discovered that the CNN-LSTM algorithm has the highest efficiency, both with and without obstacles. The accuracy rate of all algorithms declines rapidly as more obstacles and higher-speed stages are added [2].
Model A and Model B were proposed by Sangita et al. in their work. Using machine learning and convolutional neural networks (CNNs), they extracted features from the photos generated and gathered by the Udacity simulator. Despite being from the same CNN family, the architectures behave significantly differently. In the simulation, Model "A" performed admirably, with an accuracy of 96.83%; Model "B" achieved 76.67% accuracy. Based on these results, we may conclude that a CNN architecture is useful in predicting the steering angle from the track. Model "A" completed the course without assistance [3]. In contrast, Qiumei et al. [4] created a new activation function called FELU to augment the ELU function and deal with its sluggish computation speed. By using a rapid exponential function on the negative side, it effectively solves problems like neuron death and output latency, and significantly reduces network and processing time. The experimental findings show that the novel activation function has the fastest operation time and greatest ranking accuracy across the three datasets. Furthermore, the new activation function enhances results in a variety of network configurations (such as VGG, ResNet, etc.). As a result, deep convolutional neural networks can benefit from the FELU provided in this paper; in the future, the authors intend to improve real-time performance further [4]. According to a paper by James et al., robust DAA is a requirement for fully autonomous BVLOS UAS operation. This will not be possible until autonomous UAS can successfully detect and avoid small uncooperative flying objects to prevent collisions. LiDAR has the potential to be a DAA sensor, but further research is needed to determine its ability to tactically sense and identify the minor air-collision hazards posed by uncooperative small planes, helicopters, birds, and recreational UAS.
Their findings confirmed the difficulty of recognizing small uncooperative objects moving through the field of view (e.g., birds and small aircraft) [5]. The paper by Sara et al. describes the majority of MOT algorithms at various levels in depth. Object detection is divided into two parts, traditional algorithms and deep learning-based techniques, with one-stage and two-stage detectors. Using deep learning, a model was proposed for extracting features that have a significant impact on performance. The authors have summarized the bulk of target tracking techniques. Multi-object tracking (MOT) and detection are difficult tasks. The study looks at a variety of MOT phases, with a particular focus on deep learning [6]. The paper by Sushmitha et al. demonstrates how to track cars using a variety of modules, including frame conversion, pre-processing, motion segmentation, feature extraction, and object tracking. Videos are first fed into the system, where they are converted into a series of frames. The data is then pre-processed by converting each RGB image to grayscale. The next phase is background subtraction, which removes the unnecessary background. The authors employed blob analysis to clearly identify the object while reducing extraneous noise. Subsequent video frames are then used to track the moving target object [7].
The authors of the research, Nicholas et al., looked at a project that aimed to provide LTU students hands-on experience with autonomous vehicles. The effort has two primary goals: it serves as a basis for student research projects, and it serves as a mode of transportation for students and visitors to and from various campus amenities. With the exception of drive-by-wire technology, all vehicle software and hardware were written and installed entirely by university students. For many students, the ACTor has proven to be a dependable research tool for their assignments. Students could utilize the vehicle to compete in software and robotics competitions and to construct an end-to-end deep learning system for lane assist that used high-dynamic-range photography for machine vision. Other projects include software for local navigation and pathfinding, as well as using deep learning to recognize pedestrians and signs. The authors discovered that the offered framework significantly reduces development time while retaining project dependability, modularity, and reusability compared to previous robotics software projects. New students engaged in research projects stimulate changes and extensions, so both hardware and software are still under active development [8]. The paper by Albawi et al. discussed the important issues related to the Convolutional Neural Network (CNN) and explained the effect of each parameter on network performance. Their research found that the most important layer of the CNN architecture is the convolution layer, which consumes most of the time in the network. The network was trial-run with different numbers of layers, and it was observed that as the number of layers increases, the time required to train and test the network also increases. In conclusion, the authors' research shows that CNNs can be used for many applications.
Therefore, they can also be used for autonomous vehicles [9]. The paper by Li et al. proposed a lane filtering system using a four-point representation of the lane. The authors created a fixed-dimension state vector and used a Kalman filter, then evaluated the quality of the filtered result to fuse it with the detection result into a final result. Through experimentation, the authors found that the filtered lanes were steadier, reflected in a reduced standard deviation. Their research was limited to the temporal domain, and there is further scope for research in the spatial domain [10]. Alongside the significant research cited above, we have also drawn inspiration from the work proposed in [11–18], which was investigated and studied during the course of this research.
3 Methodology The different stages of building the model are explained in this section.
Fig. 3 Images collected from the left, center, and right cameras of the car
Fig. 4 Information extracted from the csv file
3.1 Data Collection Using the simulator's training mode, we drive the car on the predefined tracks and record the run. The model is expected to replicate the recorded driving behavior. The recorded data is in the form of images; a total of 18,792 images with different steering angles were taken for training the model. Each image has a dimension of 320 × 160, i.e., a width of 320 pixels and a height of 160 pixels, as shown in Fig. 3. A CSV file is automatically generated which contains the path of each image and the steering angle at the capture point, as shown in Fig. 4.
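Reading the generated log can be sketched with the standard library alone. The column layout below (three camera paths followed by steering and telemetry) is an assumption about the simulator's CSV; adjust it to the actual file.

```python
# Illustrative reader for the simulator's driving log. The column layout
# (center, left, right image paths, then steering and telemetry) is an
# assumption -- check it against the CSV the simulator actually writes.
import csv, io

SAMPLE_LOG = """center.jpg,left.jpg,right.jpg,0.0,0.9,0.0,30.1
center2.jpg,left2.jpg,right2.jpg,-0.25,0.8,0.0,29.4
"""

def load_log(fp):
    rows = []
    for center, left, right, steering, *rest in csv.reader(fp):
        rows.append({"center": center, "left": left, "right": right,
                     "steering": float(steering)})
    return rows

records = load_log(io.StringIO(SAMPLE_LOG))
print(len(records), records[1]["steering"])  # 2 -0.25
```

Each record pairs the three camera frames of Fig. 3 with the steering angle shown in Fig. 4.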
3.2 Pre-processing of the Data Figure 5 is a histogram of the collected data; the data is biased toward steering angle 0, since the car was mostly driven straight. The data must also contain other angles between −1 and 1 to handle situations where the car has skewed across the track, as in Fig. 5. So we have to balance the data to eliminate the bias.
Fig. 5 Count of images according to steering angles
The data is balanced by removing 2590 random images, most of which had a steering angle of 0, leaving 400 images with a steering angle of 0 in the data set, as in Fig. 6. After removing images to balance the dataset, we are left with 1463 records; to avoid underfitting, we use the following image augmentation techniques.
Fig. 6 Showing count of images after balancing the data set
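The balancing step described above can be sketched as capping the number of samples per steering-angle bin. The cap of 400 mirrors the text; the bin count and the binning scheme are assumptions made for the example.

```python
# Balancing sketch: cap the number of samples per steering-angle bin, as the
# text does for angle 0 (keeping 400 of the straight-driving frames). The
# bin count and mapping are assumptions; the 400 cap mirrors the text.
import random

def balance(samples, max_per_bin=400, n_bins=25, seed=0):
    """samples: list of (image_path, steering_angle). Returns a capped copy."""
    rng = random.Random(seed)
    bins = {}
    for s in samples:
        b = int((s[1] + 1.0) / 2.0 * n_bins)      # map [-1, 1] -> bin index
        bins.setdefault(b, []).append(s)
    kept = []
    for group in bins.values():
        rng.shuffle(group)                         # drop *random* extras
        kept.extend(group[:max_per_bin])
    return kept

# 2990 straight frames capped at 400, plus 100 left-turn frames kept intact.
data = [("img%d.jpg" % i, 0.0) for i in range(2990)] + \
       [("l%d.jpg" % i, -0.5) for i in range(100)]
print(len(balance(data)))  # 500
```

Dropping a random subset rather than the first N frames keeps the retained straight-driving samples spread across the whole recording.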
Fig. 7 Showing zooming of the image for avoiding underfitting
Fig. 8 Showing panning of the image for image augmentation
3.2.1 Zooming
See Fig. 7.
3.2.2 Panning
See Fig. 8.
3.2.3 Brightness Altering
See Fig. 9.
Fig. 9 Altering the brightness of the image for augmentation
Fig. 10 Flipping the image to augment
Fig. 11 Showing Images before and after pre-processing
3.2.4 Flipping the Image
Figures 7, 8, 9, 10 and 11 above display the various augmentations performed to avoid underfitting. With the help of the OpenCV library, each image is pre-processed before being fed to the model. The image is converted from RGB to YUV, as YUV requires less bandwidth than RGB. The smoothness of the image is increased with a Gaussian blur, and the region of interest is cropped, making the width and height of the new pre-processed image 200 and 66 pixels, respectively, which makes processing considerably faster.
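The pipeline just described can be approximated with NumPy alone (the paper uses OpenCV). The crop bounds, the box blur standing in for the Gaussian blur, and the nearest-neighbour resize are simplifications made so the sketch stays self-contained.

```python
# NumPy-only approximation of the described pre-processing (the paper uses
# OpenCV): crop the region of interest, convert RGB -> YUV, lightly blur,
# and resize to 200 x 66 (width x height). Crop bounds and the resize
# method are assumptions, not the authors' exact parameters.
import numpy as np

RGB_TO_YUV = np.array([[0.299, 0.587, 0.114],
                       [-0.147, -0.289, 0.436],
                       [0.615, -0.515, -0.100]])

def preprocess(img, crop_top=60, crop_bottom=25, out_w=200, out_h=66):
    img = img[crop_top:img.shape[0] - crop_bottom]      # keep the road region
    img = img @ RGB_TO_YUV.T                            # RGB -> YUV
    # 1-pixel vertical box blur as a stand-in for cv2.GaussianBlur
    img = (img + np.roll(img, 1, 0) + np.roll(img, -1, 0)) / 3.0
    ys = np.linspace(0, img.shape[0] - 1, out_h).astype(int)
    xs = np.linspace(0, img.shape[1] - 1, out_w).astype(int)
    return img[np.ix_(ys, xs)]                          # nearest-neighbour resize

frame = np.random.rand(160, 320, 3)                     # simulator frame size
print(preprocess(frame).shape)  # (66, 200, 3)
```

Cropping away the sky and hood before resizing is what lets the network train on the 66 × 200 road region rather than the full 160 × 320 frame.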
4 Results and Discussion of Model Training and Testing We employ a CNN design with 12 layers added one after the other in sequential order. TensorFlow's Keras library was used to generate the model. The layers are stacked one on top of the other, with each layer having one input and one output, so the model is selected as a sequential category from the Keras library. There are five convolutional layers, one dropout and one flatten layer, and four dense layers, for a total of 12 layers. We chose the Exponential Linear Unit (ELU) as the activation function for all layers because the model must also predict negative outputs (left steering angles). The kernel size for the first three convolutional layers is 5, with strides of 2. The photographs are fed into the
first layer, which is the input layer, from various angles such as left, center, and right. The second layer is a Convolutional2D layer with 24 filters. The third layer is also a Convolutional2D layer, with 36 filters. The fourth is a Convolutional2D layer with a total of 48 filters. The fifth and sixth layers are Convolutional2D layers with 64 filters each. To avoid overfitting, a dropout layer with a factor of 0.5 is added as the seventh layer. The eighth layer, Flatten, flattens the entire input. The next four layers are fully connected dense layers, with the steering angle as the output of the last layer. The ninth layer has 100 units, followed by the tenth and eleventh layers, with 50 and 10 units, respectively. The predicted steering angle is the single unit of the last, twelfth layer. Figure 12 depicts the layers and their output shapes. We used 1173 images for training and 293 images for testing, visualized in Fig. 13; the matplotlib library in Python is used to plot these graphs. The model is trained for 12 epochs with 300 steps_per_epoch and validation_steps of 200, with the training and validation loss decreasing exponentially after every epoch, as in Fig. 14.
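The layer dimensions just listed can be verified with the standard "valid" convolution formula, without loading any framework. This sketch assumes no padding and the kernel sizes/strides stated in the text; the resulting shapes should match those printed in Fig. 12.

```python
# Pure-Python check of the layer stack described above: five convolutions
# followed by dropout, flatten, and four dense layers. Output shapes use
# the standard "valid" convolution formula
#   out = (in - kernel) // stride + 1
# applied to the 66 x 200 x 3 pre-processed input (no padding assumed).

def conv_out(h, w, k, s):
    return ((h - k) // s + 1, (w - k) // s + 1)

h, w = 66, 200                      # pre-processed image: height x width
convs = [(24, 5, 2), (36, 5, 2), (48, 5, 2), (64, 3, 1), (64, 3, 1)]
for filters, k, s in convs:
    h, w = conv_out(h, w, k, s)
    print(f"conv {filters:2d} filters -> {h} x {w} x {filters}")

flat = h * w * 64                   # dropout / flatten change no dimensions
print("flatten ->", flat)           # feeds dense layers 100 -> 50 -> 10 -> 1
```

Running this gives a final convolutional feature map of 1 × 18 × 64, i.e., 1152 flattened units feeding the dense stack that ends in the single steering-angle output.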
Fig. 12 Showing different layers and their output shapes in our model architecture
Fig. 13 Showing count of images and steering angles in training and testing data
Fig. 14 Showing training and validation loss per epoch for the model
5 Conclusion With this research work carried out in the Udacity environment, we are able to achieve appropriate accuracy in designing and developing the autonomous system. Using deep learning algorithms with the ELU activation, a real-time simulation environment was created, tested, and validated using the Udacity model. In this work, we have demonstrated future autonomous cars with better efficiency and a cleaner and safer driving experience. Since autonomous vehicles and self-driving cars will be commercialized in the near future, we need to be ready with innovative ideas and solutions for designing effective and cooperative strategies to manage traffic conditions. Apart from these design aspects,
the ethical and legal issues can be analyzed for further improvements, preparing society for the innovations coming up with the autonomous environment. Acknowledgements We thank the management, Principal, Deans, HoDs, and the Research & Development cell of the Department of CSE, Nitte Meenakshi Institute of Technology (NMIT), Bangalore, for their extended support in carrying out the research work.
References 1. I. Timmis, N. Paul, C.-J. Chung, Teaching vehicles to steer themselves with deep learning, in 2021 IEEE International Conference on Electro Information Technology (EIT) (2021), pp. 419– 421 2. W. Vijitkunsawat, P. Chantngarm, Comparison of machine learning algorithms on self-driving car navigation using Nvidia Jetson Nano, in 17th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) (2020), pp. 201–204 3. S. Lade, P. Shrivastav, S. Waghmare, S. Hon, S. Waghmode, S. Teli, Simulation of self driving car using deep learning, in 2021 International Conference on Emerging Smart Computing and Informatics (ESCI) (2021), pp. 175–180 4. Z. Qiumei, T. Dan, W. Fenghua, Improved convolutional neural network based on fast exponentially linear unit activation function. IEEE Access 7, 151359–151367 (2019). https://doi. org/10.1109/ACCESS.2019.2948112 5. J. Riordan, M. Manduhu, J. Black, A. Dow, G. Dooly, S. Matalonga, LiDAR simulation for performance evaluation of UAS detect and avoid, in 2021 International Conference on Unmanned Aircraft Systems (ICUAS) (2021), pp. 1355–1363 6. S. Bouraya, A. Belanger, Approaches to video real-time multi-object tracking and object detection: a survey, in 2021 12th International Symposium on Image and Signal Processing and Analysis (ISPA) (2021), pp. 145–151 7. S. Sushmitha, N. Satheesh, V. Kanchana, Multiple car detection, recognition and tracking in traffic, in 2020 International Conference for Emerging Technology (INCET) (2020), pp. 1–5. https://doi.org/10.1109/INCET49848.2020.9154107 8. N. Paul, M. Pleune, C. Chung, B. Warrick, S. Bleicher, C. Faulkner, ACTor: a practical modular and adaptable autonomous vehicle research platform, in 2018 IEEE International Conference on Electro/Information Technology (EIT), pp. 0411–0414 (2018) 9. S. Albawi, T.A. Mohammed, S. 
Al-Zawi, Understanding of a convolutional neural network, in 2017 International Conference on Engineering and Technology (ICET) (2017), pp. 1–6. https:// doi.org/10.1109/ICEngTechnol.2017.8308186 10. Y. Li, S. Ding, Fast lane filtering for autonomous vehicle, in 2019 IEEE National Aerospace and Electronics Conference (NAECON) (2019), pp. 169–172. https://doi.org/10.1109/NAE CON46414.2019.9057872 11. M. Ben Youssef, A. Salhi, F. Ben Salem, Intelligent multiple vehicule detection and tracking using deep-learning and machine learning: an overview, in 2021 18th International MultiConference on Systems, Signals & Devices (SSD) (2021), pp. 632–637. https://doi.org/10. 1109/SSD52085.2021.9429331 12. M. Öztürk, E. Çavu¸s, Vehicle detection in aerial imaginary using a miniature CNN architecture, in 2021 International Conference on INnovations in Intelligent SysTems and Applications (INISTA) (2021), pp. 1–6. https://doi.org/10.1109/INISTA52262.2021.9548348
13. K. Ji, K. Han, Optimal decision-making strategies for self-driving car inspired by game theory, in 2021 Twelfth International Conference on Ubiquitous and Future Networks (ICUFN) (2021), pp. 375–378. https://doi.org/10.1109/ICUFN49451.2021.9528803 14. H. P. Sharma, R. Agarwal, M. Pant, S.V. Karatangi, Self-directed robot for car driving using Genetic Algorithm, in 2021 3rd International Conference on Signal Processing and Communication (ICPSC) (2021), pp. 285–288. https://doi.org/10.1109/ICSPC51351.2021.945 1804 15. B. Asmika, G. Mounika, P.S. Rani, Deep learning for vision and decision making in self driving cars-challenges with ethical decision making, in 2021 International Conference on Intelligent Technologies (CONIT) (2021), pp. 1–5. https://doi.org/10.1109/CONIT51480.2021.9498342 16. D.R. Bolla, B. Varsha, G.R. Chethan Kumar, M. Prajwal, B. Sowmya, Secured transportation system to enhance child safety, in 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT) (2018), pp. 2314–2318. https://doi.org/10.1109/RTEICT42901.2018.9012099 17. D.R. Bolla, J.J. Jijesh, S. Palle, M. Penna, S. Keshavamurthy, An IoT based smart e-fuel stations using ESP-32, in 2020 International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT) (2020), pp. 333–336. https://doi.org/ 10.1109/RTEICT49044.2020.9315676 18. C. Shetty, H. Sarojadevi, Framework for task scheduling in cloud using machine learning techniques, in 2020 Fourth International Conference on Inventive Systems and Control (ICISC) (2020), pp. 727–731, https://doi.org/10.1109/ICISC47916.2020.9171141
Energy and Trust Efficient Cluster Head Selection in Wireless Sensor Networks Under Meta-Heuristic Model Kale Navnath Dattatraya and S Ananthakumaran
Abstract The lifetime and stability expansion of wireless sensor networks (WSNs) is still a challenging aspect. Clustering has been determined to be a viable option for extending network longevity. Sensor nodes in the network are organized into clusters by selecting a Cluster Head (CH) during the clustering phase. The CH is in charge of gathering data from the nodes within its cluster and forwarding the aggregated information to the base station. However, choosing the best CH is a difficult task in the clustering process. For selecting the best CH, the recent research trend suggests using meta-heuristic optimization models. This study provides a new hybrid optimization model for optimal CH selection (CHS) under a variety of criteria, including energy spent, separation distance, delay, QoS, and trust (direct and indirect). The proposed hybrid optimization model, referred to as Hunger game Customized Slimemould Optimization (HCSO), is used to select the optimal CH. The proposed work is evaluated against existing works in terms of the count of Alive Nodes (AN), normalized network energy, and CH separation distance. This assessment compares the proposed work with existing works such as GA, HGS, SMA, GSO, ALO, and MMSA, respectively. Keywords Wireless sensor network (WSN) · Clustering · Cluster head (CH) selection · Multi-objective decision making · Hunger game customized slimemould optimization model (HCSO)
Abbreviations SSA
Sparrow search algorithm
K. N. Dattatraya (B) Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India e-mail: [email protected] S. Ananthakumaran School of CSE, VIT Bhopal University, Madhya Pradesh, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_53
WSN  Wireless sensor network
PSO  Particle swarm optimization
CHS  CH selection
DE  Differential evolution
QoS  Quality of service
HCSO  Hunger game customized slimemould optimization
SLnO  Sea lion optimization
ALO  Ant lion optimization
SMA  Slime mould algorithm
MS-GAOC  Multiple data sinks based GAOC
HGS  Hunger games search
HQCA  High-quality clustering algorithm
PDU-SLnO  Particle distance updated sea lion optimization
CSMA  Carrier sense multiple access
AN  Alive node
GA  Genetic algorithm
MMSA  Modified moth search algorithm
GAOC  Genetic algorithm-based optimized clustering
1 Introduction

WSN encapsulates low-powered as well as low-cost sensor nodes, and therefore WSNs are employed in a vast range of applications such as habitat monitoring, military solicitations, field surveillance, smart home monitoring systems, health monitoring systems, and automobiles as well [1–5]. Sensor nodes are sometimes placed in dangerous areas, which makes it difficult to change batteries or replace nodes that are malfunctioning. On the other hand, enhancing a node's battery performance tends to increase the overall costs [6–9]. Therefore, research is being conducted on enhancing the lifespan of the network as well as its stability via diverse protocols [10, 11].

Clustering is a promising solution, wherein the geographical area is split into smaller sectors [12–14]. Thereby, the workload is distributed evenly among all the nodes. There is a CH in each cluster that is responsible for routing data packets from one cluster to the next. Since a predominant role is played by the CH in a WSN, there is a necessity to select the most optimal CHs [15–18]. In the LEACH protocol, the nodes are probabilistically as well as evenly selected as CHs with respect to parameters such as energy, predicted energy consumption, and the number of neighboring nodes [19–25]. Furthermore, it is critical to consider the centralized usage of information (e.g., battery status) from all nodes at the BS when choosing a CH. However, during the transmission mechanism, it is highly complicated to acquire this information simultaneously at the base station [26–29]. Clustering is a popular two-layer approach that separates a network into tiny, manageable components. It is commonly used to improve the network's scalability.
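The probabilistic LEACH-style CH election mentioned above can be sketched as follows. This is a minimal illustration of the standard LEACH threshold rule, not the selection scheme proposed in this paper, and the function and parameter names are assumptions:

```python
import random

def leach_threshold(p, r):
    # LEACH threshold T(n) for a node that has not yet served as CH
    # in the current epoch: p is the desired CH fraction, r the round.
    return p / (1 - p * (r % round(1 / p)))

def elect_cluster_heads(node_ids, p, r, rng=random.random):
    # Each eligible node draws a uniform value in [0, 1) and becomes
    # CH for this round if the draw falls below the threshold.
    t = leach_threshold(p, r)
    return [n for n in node_ids if rng() < t]

# With p = 0.1, roughly 10% of the nodes self-elect as CH in round 0.
heads = elect_cluster_heads(range(100), p=0.1, r=0)
```

Note that this rule uses only local randomness and the round counter, which is exactly why it cannot exploit the centralized BS-side information discussed above.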
It ideally picks the right node to act as a CH to allocate the tasks across the nodes [2, 4, 6, 30, 31]. The CH is chosen based on the network's specifications and energy consumption. As the vast amounts of data produced by the sensors are similar, clustering algorithms make use of the correlation between the data, resulting in more efficient energy consumption [12, 16]. In this case, the criterion for a node to be picked as a CH is its remaining energy. Lower-energy nodes are grouped together based on their distances from the nearest CH, and they send their data to that CH directly [33–37]. CHs should not be placed at a great distance from the nodes; otherwise, lower-energy nodes would spend more energy on transmission. The CHs are responsible for gathering data from the normal nodes, computing it, and then transmitting it to the base station [6, 12, 38, 39]. Optimal clustering is described in terms of energy efficiency and is used to remove the overhead related to the CH selection process as well as to the nodes associated with their respective CHs. Clustering based on swarm intelligence has been proposed as an effective strategy for optimization protocols, such as PSO, ALO, and so on.

The major contributions of this research work are:

• Introduces a multi-objective hybrid optimization model for optimal CH selection in WSN that includes energy spent, separation distance, delay, distance, QoS, and trust (direct and indirect trust).
• Proposes the Hunger game Customized Slimemould Optimization (HCSO) model for the optimal selection of CHs.

The rest of this paper is arranged as follows: Sect. 2 discusses the literature on CH selection. Section 3 presents the proposed optimal CH selection model with multi-objective decision making and the system model. The results acquired with the projected model are discussed comprehensively in Sect. 4. The paper is concluded in Sect. 5.
2 Literature Review

2.1 Related Works

Panimalar and Kanmani [1] proposed a hybrid SSA with the DE algorithm in 2021 with the goal of addressing energy consumption issues that arise during CH selection in WSN. Furthermore, the high-level search efficiency of the SSA was explored, as well as the vivid potential of differential evolution, to extend the nodes' lifespan. The proposed model was validated with respect to different performance measures.

Yadav and Mahapatra [2] created a novel energy-aware CH selection framework with the support of a hybrid optimization method in 2021. The CH selection in this research was based on restrictions such as energy, distance, latency, and QoS. The authors used the recently proposed PDU-SLnO method to find the best CH.
The SLnO and PSO were combined to create the PDU-SLnO. The proposed model was validated in terms of throughput as well as the number of alive nodes.

Verma et al. [3] introduced a GAOC technique for WSN CH selection in 2019. The most effective characteristics, such as "residual energy, distance to the sink, and node density", were taken into account during this optimal CH selection. Furthermore, the authors introduced the MS-GAOC, which aims to solve the hot-spot problem while also shortening the communication distance between nodes and sinks. In terms of "stability period, network lifespan, number of dead nodes against rounds, throughput, and network's residual energy", the proposed model was evaluated against state-of-the-art protocols.

Baradaran and Navi [4] used the newly developed HQCA model to build high-quality clusters in 2019. CHs were chosen based on "residual energy, minimum and maximum energy in each cluster, and minimum and maximum distances between cluster nodes and base station". The network's energy usage and longevity both improved thanks to the proposed model.

Khan et al. [5] introduced a "Fuzzy-TOPSIS approach" in 2018 that was based on multi-criteria decision making for selecting the best CH in a WSN in order to increase network longevity. The most important criteria, such as "residual energy, node energy consumption rate, number of neighbor nodes, average distance between adjacent nodes, and distance from the sink", were taken into account throughout the CH selection process. In terms of latency and network longevity, the proposed model was evaluated against current models.
3 Proposed Optimal CH Selection Model with Multi-objective Decision Making

3.1 Assumptions

• The wireless communication network is constructed in a square area A of 100 m × 100 m.
• Within this region A, the base station BS is fixed at the center. The base station is a node with no energy constraint and enhanced computation capabilities.
• In A, there are N = 100 nodes Node_i, and these nodes are dispersed randomly following a uniform distribution within the sensing region.
• The nodes, i.e., the sensing nodes and the BS, have fixed coordinate locations. The BS has an infinite amount of energy and will never run out.
• Energy is used for sensing, transmitting, and data transfer in general. If a node's energy is totally exhausted, it is regarded as non-operational and said to be dead.
• Nodes are responsible for sensing, transmitting, and receiving data from other nodes. A node's transmission range is fixed, and it cannot communicate with nodes outside of it.
• All sensor nodes are homogeneous, and once deployed in the field, all nodes are immobile and contain position information. • All these Nodei send a HELLO message inclusive of their local information to BS. • The chosen CH is not changed between rounds to reduce the overhead that occurs during network setup.
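The deployment assumptions above can be sketched as follows; this is an illustrative setup only (function and parameter names are assumptions), showing the uniform random placement and the centered base station:

```python
import random

def deploy_network(n=100, side=100.0, seed=0):
    # n nodes placed uniformly at random in a side x side field,
    # with the base station fixed at the field center (assumptions).
    rng = random.Random(seed)
    nodes = [(rng.uniform(0.0, side), rng.uniform(0.0, side))
             for _ in range(n)]
    base_station = (side / 2.0, side / 2.0)
    return nodes, base_station

nodes, bs = deploy_network()
```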
3.2 Overview of the Proposed Work

This research presents a novel hybrid optimization model for optimal CH selection (CHS) based on constraints such as energy consumed, separation distance, delay, distance, QoS, and trust (direct and indirect trust). The suggested hybrid optimization model, HCSO, is a conceptual hybridization of the traditional Slime mould algorithm (SMA) and the Hunger games search (HGS) model. The block diagram of the projected model is shown in Fig. 1.
3.3 Proposed Scheme

The suggested work is divided into five phases: (a) random SN deployment; (b) neighbor node discovery; (c) CH selection; (d) optimal CH selection; and (e) clustering. A full explanation of each phase follows.

(a) SN deployment at random: In the WSN field, all SNs are initially placed at random, since this is a simple and low-cost deployment technique, as illustrated in Fig. 2. After deployment, the sink node broadcasts a Hello control packet in the network containing information about its position.

(b) Discovery of neighbor nodes: Using the CSMA technique, all SNs broadcast a Hello control packet within their transmission range TR during phase 2. Residual energy, average distance between the node and its sink, node energy consumption rate, average distance between the node and its neighbors, and node location and ID are all included in the Hello control packet. The fields for the average distance to the neighbors and to the sink are initially kept blank, since a node at first has no knowledge of its neighbors. All SNs update their neighborhood tables after receiving the Hello control packets from nearby nodes.

(c) CH selection: Among the nodes Node_i, the node that satisfies the defined multi-objective function covering energy spent, separation distance, distance, delay, QoS, and trust (direct and indirect trust) is selected as the CH. The optimal CH is selected via the developed hybrid HCSO optimization. The selected CH broadcasts an advertisement to all the nodes within its transmission range. The other nodes receive multiple advertisements from different CHs in their transmission range and then decide to associate with the CH which has the minimum
distance. Because the CH is in charge of communication with the base station, it must be chosen carefully. To achieve the most favorable data transmission, this work uses a novel optimization model based on multiple objectives: QoS, distance, delay, and energy. In this study, ten optimal CHs are created, resulting in ten clusters N_cluster. In addition, Cluster_kL denotes the nodes that make up a cluster, where k = 1, 2, …, L and L = 1, 2, …, M. Once an appropriate CH is chosen, data transmission speeds up dramatically, increasing the network's lifespan. The nodes that can act as CH are shown in Fig. 3.

Fig. 1 Architecture of the projected model (multi-constraint inputs (energy spent, separation distance, delay, QoS) feed the HCSO model, which selects the CHs that relay cluster data to the base station)

Overall Objective Function

The optimal CH is selected based on the multi-objective function, i.e., among the candidate nodes, the most optimal nodes that satisfy the multiple objectives are finely tuned via the developed hybrid HCSO optimization. Mathematically, the objective function O is given as per Eq. (1):

O = ψ · F2 + (1 − ψ) · F1,  0 < ψ < 1    (1)
Fig. 2 SN deployment: an illustration (100 m × 100 m field; legend: sensor node, base station)

Fig. 3 Nodes that can act as CH: an illustration (legend: sensor node, base station, nodes that can act as CH)

In which,

F1 = W1 · f_i^energy + W2 · (1 − f_i^dist) + W3 · (1 − f_i^sep_distance) + W4 · f_i^delay + W5 · f_i^Trust    (2)
F2 = (1 / N_cluster) · Σ_{r=1}^{N_cluster} ||Node_i − BS_s||    (3)
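Equations (1)–(3) can be sketched as follows. This is an illustrative minimal version, assuming each per-node fitness term has already been normalized to [0, 1]; the dictionary keys and equal weight values are assumptions, not the paper's tuned settings:

```python
import math

def f2(cluster_heads, bs):
    # Eq. (3): mean Euclidean distance between the selected CHs
    # and the base station.
    return sum(math.dist(ch, bs) for ch in cluster_heads) / len(cluster_heads)

def f1(fit, w=(0.2, 0.2, 0.2, 0.2, 0.2)):
    # Eq. (2): weighted combination of the per-node fitness terms
    # (energy, distance, separation distance, delay, trust).
    return (w[0] * fit["energy"] + w[1] * (1 - fit["dist"])
            + w[2] * (1 - fit["sep_distance"])
            + w[3] * fit["delay"] + w[4] * fit["trust"])

def objective(fit, cluster_heads, bs, psi=0.3):
    # Eq. (1): O = psi * F2 + (1 - psi) * F1, with psi = 0.3.
    return psi * f2(cluster_heads, bs) + (1 - psi) * f1(fit)
```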
The constant ψ = 0.3 in Eq. (1). In addition, the weights W1, W2, W3, W4, and W5 in Eq. (2) correspond to the energy, distance, separation distance, delay, and trust terms, respectively. The quantity ||Node_i − BS_s|| in Eq. (3) denotes the distance between a normal node Node_i and the base station BS_s. The fitness function of the ith normal node corresponding to the distance metric is denoted f_i^dist. The fitness functions of the ith normal node related to the separation distance, trust, energy, and delay metrics are represented by f_i^sep_distance, f_i^Trust, f_i^energy, and f_i^delay.

• Fitness function in terms of energy consumption: The nodes consume their own battery energy for data transmission and reception. Energy is the major criterion that decides the lifespan of the network. In WSN, the node with the highest energy is chosen as the optimal CH, and this selection is made using the projected hybrid model. The required energy for a node to be a CH is computed using Eq. (4):

fit^energy = fit^energy(A) / fit^energy(B)    (4)
In which,

fit^energy(A) = Σ_{j=1}^{M} E(j)    (5)

E(j) = Σ_{i=1, i∈j}^{L} (1 − E(Node_i) · E(CH_j)),  1 ≤ j < M    (6)

fit^energy(B) = M · Max_{i=1}^{L} E(Node_i) · Max_{j=1}^{M} E(CH_j)    (7)
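The energy-fitness ratio of Eqs. (4)–(7) can be sketched as follows; this follows the reconstructed equations above and is an illustrative sketch, not the authors' MATLAB implementation:

```python
def fit_energy(node_energy, ch_energy):
    # node_energy[j]: list of residual energies E(Node_i) of the
    # members of cluster j; ch_energy[j]: E(CH_j).
    m = len(ch_energy)
    # Eqs. (5)-(6): fit_energy(A) sums, over all clusters, the
    # per-cluster term sum_i (1 - E(Node_i) * E(CH_j)).
    a = sum(sum(1 - e_i * ch_energy[j] for e_i in node_energy[j])
            for j in range(m))
    # Eq. (7): fit_energy(B) = M * max_i E(Node_i) * max_j E(CH_j).
    b = m * max(max(es) for es in node_energy) * max(ch_energy)
    # Eq. (4): ratio of the two terms.
    return a / b
```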
The energy of the ith normal node is denoted by E(Node_i), whereas the energy of the jth CH is denoted by E(CH_j).

• Fitness function in terms of distance: The nodes that are close to a CH join its cluster to have reliable data transmission. The node closest to the base station has a better chance of serving as a CH. The f^dist is set to be within the range [0, 1]. The HCSO model is used to choose the best node within this range. The fitness function in terms of distance is mathematically defined in Eq. (8):

fit^dist = fit^dist(A) / fit^dist(B)    (8)
In which,

fit^dist(A) = Σ_{i=1}^{L} Σ_{j=1}^{M} (||Node_i − CH_j|| + ||CH_j − BS_s||)    (9)

fit^dist(B) = Σ_{i=1}^{L} Σ_{j=1}^{M} ||Node_i − CH_j||    (10)
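The distance-fitness ratio of Eqs. (8)–(10) can be sketched as follows; for clarity, each node is paired only with its own CH (an assumption), rather than summing over every node–CH pair:

```python
import math

def fit_dist(nodes, ch_of, chs, bs):
    # nodes: normal-node coordinates; ch_of[i]: index of the CH that
    # node i belongs to; chs: CH coordinates; bs: base station.
    # Eq. (9): node-to-CH plus CH-to-BS distances (term A).
    a = sum(math.dist(n, chs[ch_of[i]]) + math.dist(chs[ch_of[i]], bs)
            for i, n in enumerate(nodes))
    # Eq. (10): node-to-CH distances only (term B).
    b = sum(math.dist(n, chs[ch_of[i]]) for i, n in enumerate(nodes))
    # Eq. (8): ratio of the two sums.
    return a / b
```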
The term fit^dist(A) in Eq. (9) computes the distance between Node_i and its CH_j together with the distance between the jth CH_j and BS_s. In addition, fit^dist(B) in Eq. (10) denotes the distance between the normal nodes and their CHs.

• Fitness function in terms of delay: The latency or delay is an important statistic that shows how consistent the data transfer is. It is critical to pick the appropriate CH, as it is the data transmission mediator, in order to route fresh data packets to the destination with minimal delay. Using the HCSO model, the node with the shortest data transmission latency is chosen as the optimal CH in this study. fit^delay is set to be between 0 and 1. The HCSO model selects the nodes within this range that are most optimal. The data transmission latency is mathematically given in Eq. (11):

fit^delay = Max_{j=1}^{M} (CH_j) / L    (11)
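Reading Eq. (11) as the largest cluster membership divided by the total node count, the delay fitness can be sketched as (an illustrative interpretation, not the authors' exact code):

```python
def fit_delay(cluster_sizes, total_nodes):
    # Eq. (11): largest cluster size over the total node count L;
    # fewer members per CH translates into lower forwarding delay.
    return max(cluster_sizes) / total_nodes
```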
Reducing the number of nodes in a cluster is one way to reduce the latency. The numerator in Eq. (11) indicates the maximum over the available CHs in the network, whereas L is the total number of nodes in the network.

• Fitness function in terms of QoS: If the aforementioned conditions, such as minimal latency, minimal distance, and maximal energy, are met, the QoS is accomplished.

• Fitness function in terms of trust: Both direct and indirect trust are taken into consideration. The direct trust is based upon the nodes' interaction; here, both the energy and the distance metrics are considered:

f_D_trust(A−B) = R / dist(Node_A, Node_B)    (12)

Here, f_D_trust(A−B) denotes the direct trust computed between nodes A and B. In addition, R and dist denote the remaining energy and the distance between nodes A and B, respectively. The indirect trust is based upon the nodes' recommendations; it is the sum of the nodes' trust values. This can be mathematically given as per Eq. (13).
f_ind_trust(A−B) = Σ_C f_D_trust(A−C) · f_D_trust(C−B)    (13)

Here, f_ind_trust(A−B) denotes the indirect trust computed between A and B by considering the recommendations of C. In addition, f_D_trust(A−C) and f_D_trust(C−B) denote the direct trust value computed for C by A and the direct trust value computed for B by C, respectively.

• Fitness function in terms of separation distance between CHs: The separation distance between the CHs is considered for optimal CH selection. The mathematical formula for the separation distance is given in Eq. (14):

f^sep_distance = √((X/A)² + (Y/B)²)    (14)

Here, X and Y denote the dimensions of the node field in m².

(d) Optimal CH selection: Among the available nodes that can act as CH, the most optimal one is selected using the newly projected hybrid optimization model.
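The direct and indirect trust of Eqs. (12)–(13) can be sketched as follows; an illustrative sketch in which the recommender terms are passed in as parallel lists (an assumption about representation, not the authors' data structures):

```python
import math

def direct_trust(remaining_energy, pos_a, pos_b):
    # Eq. (12): direct trust of A in B, i.e., remaining energy R
    # divided by the distance between the two nodes.
    return remaining_energy / math.dist(pos_a, pos_b)

def indirect_trust(trust_a_c, trust_c_b):
    # Eq. (13): indirect trust of A in B accumulated over all
    # recommenders C: sum of direct(A, C) * direct(C, B).
    return sum(ac * cb for ac, cb in zip(trust_a_c, trust_c_b))
```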
3.4 Proposed HCSO

The HCSO model is the conceptual hybridization of the SMA [40] and the HGS [41] models. The SMA model is based on the slime mould's oscillation mode; SMA has outstanding exploitation and predatory behavior. The HGS, on the other hand, is based on animals' hunger-driven actions and behavioral choices. The HGS model has a faster convergence rate and can help locate global solutions without becoming stuck in local optima. The SMA and HGS are merged in this study to develop a new optimization model known as HCSO. The candidate CHs are fed as input to the HCSO model. The solution encoding is shown in Fig. 4.

Fig. 4 Solution encoding: a candidate solution X is the vector [CH1 | CH2 | … | CHn]

The following are the steps taken in the anticipated model:

Step 1: The population of search agents Pop is initialized. In addition, the current iteration itr and the maximal iteration max_itr are initialized.
Step 2: The search agents are initialized as X_i, i = 1, 2, …, N.
Step 3: Generate oppositional solutions using opposition-based learning (OBL). With OBL, the exploitation is maximized, which assists in enhancing the convergence speed of the solutions.
Step 4: While itr < max_itr do.
Step 5: Using Eq. (1), calculate each search agent's fitness.

Step 6: The best solution X_best is updated.

Step 7: The approach-food phase is followed to update the solutions; here, the search agents update their positions based upon the odor in the air. The mathematical formula for the approach-food phase in the SMA model is given in Eq. (15):

X(t+1) = X_best(t) + b · (W · X_A(t) − X_B(t)),  if r < p
X(t+1) = c · X(t),  if r ≥ p    (15)

Here, b is a parameter generated within the range [−a, a], and c is a parameter that decreases from 1 to 0. The search agent located at the highest odor concentration is denoted X_best, X_A and X_B are two search agents, and W is the search agent's weight. The mathematical formula for p is given in Eq. (16):

p = tanh |O(i) − Best|    (16)

Here, O denotes the fitness of the search agent, computed as per Eq. (1), and Best denotes the best fitness acquired so far. The mathematical formulas for a and b are given in Eqs. (17) and (18), respectively. These are the newly projected expressions, wherein the population count N is considered rather than the maximal iteration count utilized in the existing model:

a = arctanh(exp(−itr / N))    (17)

b = exp(−itr / N)    (18)

Step 8: Using Eq. (19), compute the value of W:

W(SI(i)) = 1 + rand · log((opt_fit − O(i)) / (opt_fit − worst_fit) + 1),  condition
W(SI(i)) = 1 − rand · log((opt_fit − O(i)) / (opt_fit − worst_fit) + 1),  otherwise    (19)

Here, SI = Sort(O), the random value rand is generated within the range [0, 1], and opt_fit and worst_fit denote the optimal and worst fitness, respectively.

Step 9: For every search agent, the values of p, b, and c are initialized.

Step 10: Update the position of the search agent using the approach phase model of HGS. This is mathematically shown in Eq. (20).
X(t+1) = W_t(1) · X_best + rad · W_t(2) · |X_best − X(t)|    (20)

Here, rad is a random number generated within the range [−a, a], and W_t(1) and W_t(2) are the hunger weights.

Step 11: End for.
Step 12: itr = itr + 1.
Step 13: End while.
Step 14: Return the best fitness X_best.

The selected optimal CHs are shown in Fig. 5.

Fig. 5 Selected optimal CHs based on defined multi-objectives: an illustration (legend: sensor node, base station, nodes that can act as CH computed based on multi-objectives, selected optimal CHs)

(e) Clustering: In the fifth phase, clustering takes place. As shown in Fig. 6, the CH identifies itself as the CH within its transmission range, and the other nodes in the region reply by sending a joining request. Based on the number of node members with which it is linked, the CH generates a TDMA schedule. CHs are not changed every round to save overhead during the setup process. The clustered network is shown in Fig. 6.
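Steps 1–14 and Eqs. (15)–(20) can be condensed into the following minimization sketch. Several details are assumptions made for illustration and are not the authors' exact implementation: the Eq. (19) branch is fixed to the '+' case, one weight w serves for both hunger weights in Eq. (20), and a greedy replacement keeps the better candidate; a maximization problem can be handled by negating the fitness:

```python
import math
import random

def hcso(fitness, dim, n=20, max_itr=50, seed=1):
    """Condensed HCSO sketch (Steps 1-14), minimizing `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.uniform(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    # Step 3: opposition-based learning - keep the better of x and 1 - x.
    pop = [min((x, [1.0 - v for v in x]), key=fitness) for x in pop]
    best = min(pop, key=fitness)
    for itr in range(1, max_itr + 1):                    # Step 4
        a = math.atanh(min(0.999, math.exp(-itr / n)))   # Eq. (17)
        b = math.exp(-itr / n)                           # Eq. (18)
        c = 1.0 - itr / max_itr                          # decreases 1 -> 0
        f_vals = [fitness(x) for x in pop]
        opt_fit, worst_fit = min(f_vals), max(f_vals)
        denom = (opt_fit - worst_fit) or -1e-12
        for i, x in enumerate(pop):
            p = math.tanh(abs(f_vals[i] - opt_fit))      # Eq. (16)
            ratio = (opt_fit - f_vals[i]) / denom        # in [0, 1]
            w = 1.0 + rng.random() * math.log(ratio + 1.0)  # Eq. (19)
            xa, xb = rng.choice(pop), rng.choice(pop)
            if rng.random() < p:                         # Eq. (15), SMA phase
                cand = [best[d] + b * (w * xa[d] - xb[d]) for d in range(dim)]
            else:
                cand = [c * v for v in x]
            rad = rng.uniform(-a, a)
            # Eq. (20), HGS-style refinement of the candidate.
            cand = [w * best[d] + rad * w * abs(best[d] - cand[d])
                    for d in range(dim)]
            if fitness(cand) < f_vals[i]:                # greedy replacement
                pop[i] = cand
        best = min(pop + [best], key=fitness)
    return best                                          # Step 14
```

In the paper's setting, a candidate vector would encode the selected CH indices and `fitness` would be the multi-objective function O of Eq. (1).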
4 Results and Discussions 4.1 Simulation Procedure The proposed optimal CH selection model using new hybrid optimization model (HCSO) has been implemented in MATLAB. The projected model has been constructed with the parameters shown in Table 1. The evaluation of the proposed work is done over the existing works in terms of count of AN, normalized network
Fig. 6 Clustered network: an illustration (legend: sensor node, base station, candidate CHs computed based on multi-objectives, selected optimal CHs; clusters formed around the selected CHs)

Table 1 Parameter values

Parameter | Value
Network area | 100 × 100 m²
Initial energy EN_In | 0.5
Transmitter energy E_tr | 50 nJ/bit/m²
Number of nodes (N) | 100
Population size | N
Chromosome length | N
Energy of free space model EN_fr | 10 pJ/bit/m²
Data aggregation energy E_Da | 5 nJ/bit/signal
Energy of power amplifier E_power | 0.0013 pJ/bit/m²
Count of rounds | 500, 1000, 1500 and 2000
energy, and CH separation distance. This assessment was made between the proposed work and existing works like GA, HGS, SMA, GSO, ALO, and MMSA, respectively.
4.2 Analysis on the Count of Alive Nodes

The number of alive nodes after each round is calculated to determine whether the proposed work improves the network's lifespan. The analysis was carried out by adjusting the number of rounds from 0 to 500, 1000, 1500, and 2000. Figure 7 depicts the obtained results in terms of the number of alive nodes. According to the obtained
Fig. 7 Analysis on the count of alive nodes
results, the projected model has the maximum number of alive nodes at the end of each round. The implementation of the new hybrid optimization model has resulted in this improvement. For all of the analyzed models used for CH selection, the count of alive nodes is observed to be at its highest at the 0th round. However, as the number of rounds grows, the number of alive nodes decreases, owing to the nodes' continued power consumption. Furthermore, the suggested model has the greatest count of alive nodes at the 2000th round, 38, which is a favorable score when compared to GA = 15, HGS = 35, SMA = 32, GSO = 18, ALO = 12, and MMSA = 23. As a result, the projected model is considered very appropriate for CH selection, as the number of dead nodes after each round is smaller (Fig. 7).
4.3 Analysis on Normalized Network Energy Because of their vast range of applications, wireless sensor networks are gaining a lot of attention. The primary design difficulty for these networks is ensuring that nodes consume as little energy as possible. As a result, the network was examined for normalized network energy using both the proposed and current models. The network’s normalized energy must be kept as high as feasible in order to extend the network’s life span. The normalized energy recorded by both current and proposed work is described in this part by altering the number of rounds from 500 to 1000, 1500, and 2000. The related outcomes are depicted in Fig. 8. The normalized network energy is determined to be greater with both the existing and proposed work at the conclusion of the 0th round. Furthermore, as the number of cycles increases, the normalized network energy appears to steadily decrease. The suggested model has the maximum normalized network energy of 0.19 pJ at the 1000th round, which is
Fig. 8 Analysis on the normalized network energy
the most favorable value. Notably, the suggested model also had the greatest normalized network energy at the end of the 2000th round compared to the current models. As a result, the proposed approach is believed to be very useful for selecting the best CH (Fig. 8).
4.4 Analysis on Trust

Trust evaluations were conducted between the proposed and current models over several rounds. The projected model revealed higher trust levels for every variation in the rounds. In the 450th round, the suggested model had the greatest trust value of 95%. Furthermore, even in the 2000th round, the projected model had a higher trust value of 83%, whereas the current models' trust values in this range are quite low. Only the adoption of the new multi-objective hybrid optimization approach has resulted in this improvement (Fig. 9).
4.5 Statistical Analysis on Count of Alive Nodes

The statistical analysis on the count of alive nodes was evaluated for 2000 rounds, and the best values are listed in Table 2. On observing the recorded outcomes, the projected model has recorded the highest mean value in terms of count of alive nodes. All these improvements are due to the consideration of the multi-objective hybrid optimization.
Fig. 9 Analysis on trust
Table 2 Statistical analysis on count of alive nodes

Approaches | Mean | Median | Std_dev
GA [3] | 62.031 | 98 | 41.665
HGS [42] | 62.334 | 98 | 41.998
SMA [43] | 62.099 | 72.5 | 37.397
GSO [44] | 72.865 | 93 | 32.547
ALO [45] | 61.332 | 89 | 41.618
MMSA [46] | 62.158 | 93 | 41.634
HCSO | 74.329 | 95 | 31.719
4.6 Statistical Analysis on Normalized Energy

Table 3 Statistical analysis on normalized energy

Approaches | Mean | Median | Std_dev
GA [3] | 0.18793 | 0.11432 | 0.1699
HGS [42] | 0.18955 | 0.11727 | 0.16892
SMA [43] | 0.19781 | 0.15755 | 0.17788
GSO [44] | 0.19135 | 0.14353 | 0.17644
ALO [45] | 0.19177 | 0.11544 | 0.16654
MMSA [46] | 0.18917 | 0.11682 | 0.16951
HCSO | 0.20491 | 0.12328 | 0.15578

The statistical analysis of normalized energy was performed for 2000 rounds, and the best results are shown in Table 3. The suggested work's best mean value is 8.3%, 7.5%, 3.5%, 6.6%, 6.4%, and 7.7% better than that of current models such as GA, HGS,
SMA, GSO, ALO, and MMSA, respectively. As a result of the overall evaluation, it is obvious that the suggested model may be used to identify the best CH.
4.7 Statistical Analysis on Trust

The statistical analysis of trust was performed for 2000 rounds, and the best results are shown in Table 4. The suggested work's best mean value, 43.504, is better than those of the traditional methods GA, HGS, SMA, GSO, ALO, and MMSA. As a result of the overall evaluation, it is obvious that the suggested model may be used to identify the best CH.

Table 4 Statistical analysis on normalized trust

Approaches | Mean | Median | Std_dev
GA [3] | 18.04 | 22.734 | 277.5
HGS [42] | 10.747 | 10.14 | 38.563
SMA [43] | 24.358 | 23.003 | 16.846
GSO [44] | 24.686 | 22.474 | 16.874
ALO [45] | 23.788 | 21.796 | 16.294
MMSA [46] | 23.48 | 22.405 | 14.825
HCSO | 43.504 | 42.833 | 28.953

4.8 Statistical Analysis on Separation Distance Between CHs

The statistical analysis of the proposed model for separation distance between CHs is shown in Table 5. The projected model recorded the minimal separation distance between the CHs, which is the most favorable outcome. The mean separation distance between CHs is 19.5%, 5.4%, 43.8%, 79.7%, 17%, and 20.2% improved over the existing models GA, HGS, SMA, GSO, ALO, and MMSA, respectively. Thus, from the overall evaluation, it is clear that the proposed model can be utilized for optimal CH selection.
Table 5 Statistical analysis on separation distance between CHs

Measures | GA [3] | HGS [42] | SMA [43] | GSO [44] | ALO [45] | MMSA [46] | HCSO
Best | 266.79 | 197.68 | 141.24 | 140.11 | 304.38 | 171.7 | 133.33
Worst | 4117.6 | 3699.3 | 9861 | 4223.7 | 4422.1 | 4333.2 | 4423.9
Mean | 1850.2 | 1404.6 | 2638.8 | 824.07 | 1786 | 1856.1 | 1481
Median | 1797 | 1338.7 | 2562.2 | 712.47 | 1726.9 | 1836.7 | 1449.4
Std_dev | 692.26 | 1234.7 | 543.5 | 652.5 | 550.96 | 674.34 | 564.6
4.9 Computational Complexity

The computational complexity of the developed and traditional methods was computed for 2000 rounds and ten iterations, and the obtained results are shown in Table 6. From the obtained results, it is observed that the developed model attained lower complexity than the traditional methods GA, HGS, SMA, GSO, ALO, and MMSA, respectively.

Table 6 Computational complexity of developed model and traditional model

Approaches | Computational complexity (s)
GA [3] | 11.288
HGS | 22.577
SMA | 21.896
GSO | 9.5862
ALO | 6.4774
MMSA | 10.493
HCSO | 5.73
5 Conclusion

This research proposed a novel hybrid optimization model for optimal CHS under multiple objectives such as energy spent, separation distance, delay, distance, QoS, and trust (direct and indirect trust). The HCSO model is the conceptual hybridization of the standard SMA and HGS models. The evaluation of the proposed work was done against the existing works in terms of count of AN, normalized network energy, and CH separation distance. This assessment was made between the proposed work and existing works such as GA, HGS, SMA, GSO, ALO, and MMSA, respectively. The projected model recorded the minimal separation distance between the CHs, which is the most favorable outcome. The mean separation distance between CHs is 19.5%, 5.4%, 43.8%, 79.7%, 17%, and 20.2% improved over the existing models GA, HGS, SMA, GSO, ALO, and MMSA, respectively. Thus, from the overall evaluation, it is clear that the proposed model can be utilized for optimal CH selection.
References

1. K. Panimalar, S. Kanmani, Energy efficient CH selection using improved sparrow search algorithm in wireless sensor networks. J. King Saud Univ. Comput. Info. Sci. (2021)
2. R.K. Yadav, R.P. Mahapatra, Hybrid metaheuristic algorithm for optimal CH selection in wireless sensor network. Pervasive Mob. Comput. (2021)
3. S. Verma, N. Sood, A.K. Sharma, Genetic algorithm-based optimized CH selection for single and multiple data sinks in heterogeneous wireless sensor network. Appl. Soft Comput. (2019)
4. A.A. Baradaran, K. Navi, HQCA-WSN: high-quality clustering algorithm and optimal CH selection using fuzzy logic in wireless sensor networks. Fuzzy Sets Syst. (2019)
5. B.M. Khan, R. Bilal, R. Young, Fuzzy-TOPSIS based CH selection in mobile wireless sensor networks. J. Electric. Syst. Info. Technol. (2018); G. Omeke et al., DEKCS: a dynamic clustering protocol to prolong underwater sensor networks. IEEE Sens. J. 21(7), 9457–9464 (2021). https://doi.org/10.1109/JSEN.2021.3054943
6. H. Ali, U.U. Tariq, M. Hussain, L. Lu, J. Panneerselvam, X. Zhai, ARSH-FATI: a novel meta-heuristic for CH selection in wireless sensor networks. IEEE Syst. J. 15(2), 2386–2397 (2021). https://doi.org/10.1109/JSYST.2020.2986811
7. S. Umbreen, D. Shehzad, N. Shafi, B. Khan, U. Habib, An energy-efficient mobility-based CH selection for lifetime enhancement of wireless sensor networks. IEEE Access 8, 207779–207793 (2020). https://doi.org/10.1109/ACCESS.2020.3038031
8. H.-H. Choi, S. Muy, J.-R. Lee, Geometric analysis-based CH selection for sectorized wireless powered sensor networks. IEEE Wireless Commun. Lett. 10(3), 649–653 (2021). https://doi.org/10.1109/LWC.2020.3044902
9. T.M. Behera, S.K. Mohapatra, U.C. Samal, M.S. Khan, M. Daneshmand, A.H. Gandomi, Residual energy-based cluster-head selection in WSNs for IoT application. IEEE Int. Things J. 6(3), 5132–5139 (2019). https://doi.org/10.1109/JIOT.2019.2897119
10. S. Lata, S. Mehfuz, S. Urooj, F. Alrowais, Fuzzy clustering algorithm for enhancing reliability and network lifetime of wireless sensor networks. IEEE Access 8, 66013–66024 (2020). https://doi.org/10.1109/ACCESS.2020.2985495
11. A. Verma, S. Kumar, P.R. Gautam, T. Rashid, A. Kumar, Fuzzy logic based effective clustering of homogeneous wireless sensor networks for mobile sink. IEEE Sens. J. 20(10), 5615–5623 (2020). https://doi.org/10.1109/JSEN.2020.2969697
12. H. El Alami, A. Najid, ECH: an enhanced clustering hierarchy approach to maximize lifetime of wireless sensor networks. IEEE Access 7, 107142–107153 (2019). https://doi.org/10.1109/ACCESS.2019.2933052
13. C. Wang, J. Li, Y. Yang, F. Ye, Combining solar energy harvesting with wireless charging for hybrid wireless sensor networks. IEEE Trans. Mob. Comput. 17(3), 560–576 (2018). https://doi.org/10.1109/TMC.2017.2732979
14. C. Wang, Y. Zhang, X. Wang, Z. Zhang, Hybrid multihop partition-based clustering routing protocol for WSNs. IEEE Sens. Lett. 2(1), 1–4 (2018), Art no. 7500504. https://doi.org/10.1109/LSENS.2018.2803086
15. B. Zhu, E. Bedeer, H.H. Nguyen, R. Barton, J. Henry, Improved soft-k-means clustering algorithm for balancing energy consumption in wireless sensor networks. IEEE Internet Things J. 8(6), 4868–4881 (2021). https://doi.org/10.1109/JIOT.2020.3031272
16. J. Qiao, X. Zhang, Compressive data gathering based on even clustering for wireless sensor networks. IEEE Access 6, 24391–24410 (2018). https://doi.org/10.1109/ACCESS.2018.2832626
17. S. Jia, L. Ma, D. Qin, S. Yang, Research on energy sensing based fault-tolerant distributed routing mechanism for wireless sensor network. IEEE Access 6, 39775–39786 (2018). https://doi.org/10.1109/ACCESS.2018.2854900
18. N.M. Shagari, M.Y.I. Idris, R.B. Salleh, I. Ahmedy, G. Murtaza, H.A. Shehadeh, Heterogeneous energy and traffic aware sleep-awake cluster-based routing protocol for wireless sensor network. IEEE Access 8, 12232–12252 (2020). https://doi.org/10.1109/ACCESS.2020.2965206
19. W. He, Energy-saving algorithm and simulation of wireless sensor networks based on clustering routing protocol. IEEE Access 7, 172505–172514 (2019). https://doi.org/10.1109/ACCESS.2019.2956068
20. N. Choudhury, R. Matam, M. Mukherjee, J. Lloret, E. Kalaimannan, NCHR: a non-threshold-based cluster-head rotation scheme for IEEE 802.15.4 cluster-tree networks. IEEE Internet Things J. 8(1), 168–178 (2021). https://doi.org/10.1109/JIOT.2020.3003320
734
K. N. Dattatraya and S. Ananthakumaran
21. J. Wang, S. Li, Y. Ge, Ions motion optimization-based clustering routing protocol for cognitive radio sensor network. IEEE Access 8, 187766–187782 (2020). https://doi.org/10.1109/ACC ESS.2020.3030808 22. W. Osamy, A.M. Khedr, A. Aziz, A.A. El-Sawy, Cluster-tree routing based entropy scheme for data gathering in wireless sensor networks. IEEE Access 6, 77372–77387 (2018). https:// doi.org/10.1109/ACCESS.2018.2882639 23. Y. Han, G. Li, R. Xu, J. Su, J. Li, G. Wen, Clustering the wireless sensor networks: A metaheuristic approach. IEEE Access 8, 214551–214564 (2020). https://doi.org/10.1109/ACCESS. 2020.3041118 24. J.S. Lee, H.T. Jiang, An extended hierarchical clustering approach to energy-harvesting mobile wireless sensor networks. IEEE Internet Things J. 8(9), 7105–7114(2021). https://doi.org/10. 1109/JIOT.2020.3038215 25. Q. Ren, G. Yao, Enhancing harvested energy utilization for energy harvesting wireless sensor networks by an improved uneven clustering protocol. IEEE Access 9, 119279–119288 (2021). https://doi.org/10.1109/ACCESS.2021.3108469 26. J. Zhang, R. Yan, Centralized energy-efficient clustering routing protocol for mobile nodes in wireless sensor networks. IEEE Commun. Lett. 23(7), 1215–1218 (2019). https://doi.org/10. 1109/LCOMM.2019.2917193 27. T.M. Behera, S.K. Mohapatra, U.C. Samal, M.S. Khan, M. Daneshmand, A.H. Gandomi, I-SEP: an improved routing protocol for heterogeneous wsn for iot-based environmental monitoring. IEEE Internet Things J. 7(1), 710–717 (2020). https://doi.org/10.1109/JIOT.2019.2940988 28. F.A. Khan, M. Khan, M. Asif, A. Khalid, I.U. Haq, Hybrid and multi-hop advanced zonal-stable election protocol for wireless sensor networks. IEEE Access 7, 25334–25346 (2019). https:// doi.org/10.1109/ACCESS.2019.2899752 29. A.A.H. Hassan, W.M. Shah, A.H.H. Habeb, M.F.I. Othman, M.N. Al-Mhiqani, An improved energy-efficient clustering protocol to prolong the lifetime of the WSN-based IoT. IEEE Access 8, 200500–200517, (2020). 
https://doi.org/10.1109/ACCESS.2020.3035624 30. A. Kelotra, P. Pandey, Energy-aware CH selection in WSN using HPSOCS algorithm. J. Netw. Commun. Syst. 2(1), 24–33 (2019) 31. J. John, P. Rodrigues, Multi-objective HSDE algorithm for energy-aware CH selection in WSN. J. Netw. Commun. Syst. 2(3), 20–29 (2019) 32. A. Sarkar, T. Senthil Murugan, Adaptive cuckoo search and squirrel search algorithm for optimal CH selection in WSN. J. Netw. Commun. Syst. 2(3), 30-39 (2019) 33. P.K. Reddy, M.R. Babu, CH selection in IoT using enhanced self adaptive bat algorithm. J. Netw. Commun. Syst. 2(4), 23–32 (2019) 34. J. Devagnanam, N.M. Elango, Optimal resource allocation of cluster using hybrid grey wolf and cuckoo search algorithm in cloud computing. J. Netw. Commun. Syst. 3(1), 31–40 (2020) 35. S.L. Shelgaonkar, I-CSA based CH selection model in wireless sensor network. J. Netw. Commun. Syst. 3(2) (2020) 36. S. Rathod, Hybrid metaheuristic algorithm for CH selection in WSN. J. Netw. Commun. Syst. 3(4) (2020) 37. A. khare, CH selection in IoT using a novel hybrid self-adaptive heuristic algorithm. J. Netw. Commun. Syst. 4(1) (2021) 38. J. Wang, Optimized CH selection in WSN using GA-WOA. J. Netw. Commun. Syst. 4(1) (2021) 39. K. Srinivas, Cluster based dense using hybrid genetic and grasshopper optimization algorithm in WSN. J. Netw. Commun. Syst. 4(3) (2021) 40. S. Li, H. Chen, M. Wang, A.A. Heidari, S. Mirjalili, Slime mould algorithm: a new method for stochastic optimization. Future Gener. Comput. Syst. 11 (2020) 41. L. Abualigaha, A. Diabatb, S. Mirjalilid, M. Abd Elaziz, A.H. Gandomih, The arithmetic optimization algorithm. Comput. Methods Appl. Mech. Eng. 376 (2021) 42. Y. Yang, H. Chen, A.A. Heidari, A.H. Gandomi, Hunger games search: visions, conception, implementation, deep analysis, perspectives, and towards performance shifts. Expert Syst. Appl. (2021)
Energy and Trust Efficient Cluster Head Selection…
735
43. S. Li, H. Chen, M. Wang, A.A. Heidari, S. Mirjalili, Slime mould algorithm: a new method for stochastic optimization. Future Gener. Comput. Syst. 111, 300–323 (2020) 44. S. He, H. Wu, J.R. Saunders, Group search optimizer: an optimization algorithm inspired by animal searching behaviour. IEEE Trans. Evol. Comput. 13(5) (2009) 45. S. Mirjalili, The ant lion optimizer. Adv. Eng. Softw. 83, 80–98 (2015) 46. K. Navnath, Optimal cluster head selection in wireless sensor network via improved moth search algorithm, in Communication
Performance Analysis of OTFS Scheme for TDL and CDL 3GPP Channel Models I. S. Akila, S. Uma, and P. S. Poojitha
Abstract A new modulation scheme, Orthogonal Time Frequency Space (OTFS) modulation, is considered for next-generation mobile communications. It remains robust at higher carrier frequencies and Doppler shifts. This paper focuses on the evaluation of OTFS modulation with 3GPP channel models for Quadrature Amplitude Modulation (QAM), with the design of a Minimum Mean Square Error (MMSE) equalizer for 5G New Radio (NR) and Beyond 5G (B5G) technologies. The simulation is performed using MATLAB R2021b to compare the OTFS and Orthogonal Frequency Division Multiplexing (OFDM) modulation schemes for the Tapped Delay Line (TDL) and Cluster Delay Line (CDL) 3GPP channel models. It is observed that the proposed OTFS scheme shows significant improvement in SNR performance compared with the OFDM scheme for the 3GPP channel models. Among them, the TDL-E channel model exhibits better performance than the other channel models with a delay spread of 300 ns.

Keywords OTFS · OFDM · MMSE · New radio · CDL · TDL · BLER
1 Introduction

The 5G wireless networks are expected to offer better Quality of Service (QoS) and to overcome the challenges identified in existing technologies. Long Term Evolution (LTE) uses the Orthogonal Frequency Division Multiplexing (OFDM) modulation technique, but it suffers from Inter-Carrier Interference (ICI) at higher frequencies. The upcoming 5G technologies also use the OFDM scheme, with differences in transmission and reception at both uplink and downlink. Recent works employ Orthogonal Time Frequency Space (OTFS) modulation, whose performance is better even at higher delay and Doppler scenarios in comparison

I. S. Akila (B) · S. Uma · P. S. Poojitha
Department of ECE, Coimbatore Institute of Technology, Coimbatore, India
e-mail: [email protected]
S. Uma
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_54
with the existing OFDM-based LTE. The proposed work models OTFS in the delay-doppler domain with a suitable equalizer to cancel out the ICI effects, thus making the entire system stable. The OTFS waveform has its origin in the Zak transform [1, 2], which suits transmission over a long wireless propagation channel. The Zak transform relates the time and frequency domains to the delay-doppler domain [3, 4]. OTFS can be represented using an orthogonal basis function. The OTFS waveform is overlaid on OFDM, where pre-processing and post-processing functions are constructed to reach the delay-doppler domain. OTFS has shown typical SNR improvements of approximately 10 dB. It also reduces the complexity of the channel estimation algorithm. Because OTFS exploits the entire channel, throughput scales linearly rather than gradually [5]. The 3GPP channel models Tapped Delay Line (TDL) and Cluster Delay Line (CDL) provide the power delay taps and the cluster delays for multipath propagation, respectively [6]. This trend has gained global attention, as these waveforms are robust to high Doppler shifts. Contributions to the OTFS literature have also increased considerably in recent years owing to its performance under high Doppler shifts. In [3, 4, 7], the performance of the OTFS transform is compared with OFDM in terms of Bit Error Rate (BER), Block Error Rate (BLER) and Packet Error Rate (PER) for Frequency Division Duplexing (FDD) with 4QAM, 16QAM and 64QAM modulation schemes at a subcarrier spacing of 15 kHz for the TDL channel model with 3GPP specifications [6]. In [8], a high-Doppler fading channel scenario is evaluated at a 4 GHz carrier frequency for the BPSK modulation scheme, and the BER obtained is compared with existing systems.
In [9], OTFS modulation for millimetre wave communications has been carried out at 28 GHz with subcarrier spacing up to 480 kHz for the 3GPP CDL channel model [6], for 16QAM modulation using the LTE turbo code, from which the BER was calculated. In [10], an MMSE equalizer with DFE integration is applied for 16QAM modulation, and BER results are obtained. Recent research works have discussed channel adaptation algorithms and equalization techniques for OTFS modulation [11–14]. The works presented in [15–19] also exhibit emerging directions of recent research, with an inherent complexity. In this paper, OTFS performance has been evaluated and compared with OFDM at 28 GHz for the 16QAM modulation scheme using an MMSE equalizer. The rest of the paper is organized as follows: Sect. 2 presents the fundamentals of the OTFS transform and its working with the MMSE equalizer. Section 3 discusses the simulation results with the performance evaluation.
2 Orthogonal Time Frequency Space Modulation

As cited in [9], a signal can be represented as a function of time, as a function of frequency, or in the Delay-Doppler Domain. The Fourier transform converts Time-Domain (TD) signals into Frequency-Domain (FD) components, while the
Fig. 1 Relationship among time, frequency and delay-doppler domains
Zak transform can be applied for the conversion from the Delay-Doppler Domain into the time and frequency representations, as shown in Fig. 1.
The OTFS system works in the delay-doppler domain, which differs from conventional systems. The key concept of OTFS is the conversion of the time-varying characteristics of the multipath components into a non-fading channel in the Delay-Doppler Domain; as the outcome, the information symbols experience an identical channel gain. This is achieved using the pre-processing and post-processing blocks constructed with the Symplectic Finite Fourier Transform (SFFT) and the Inverse Symplectic Finite Fourier Transform (ISFFT) [3–5], as shown in Fig. 2.

Fig. 2 OTFS system model

The 5G NR downlink transmitter and receiver chain is considered for the proposed work, as shown in Fig. 3. The OTFS system is implemented with the SFFT and ISFFT transforms. The work incorporates Cyclic Redundancy Check (CRC) and Low-Density Parity Check (LDPC) coding for error detection and correction. CRC provides the code block concatenation to LDPC to choose the base graphs. OTFS is implemented on top of the multicarrier OFDM modulation scheme. Channel estimation and equalization are carried out at the receiver side.

Fig. 3 OTFS 5G NR downlink transmitter and receiver structure

The information-bearing symbols s[k, l] at the transmitter in the Delay-Doppler Domain are mapped to a sequence of complex numbers s[n, m] in the time frequency domain using the 2D Inverse Symplectic Finite Fourier Transform, and windowing is performed to remove the dissimilarities. The time frequency signal is then transformed to s(t) using the Heisenberg transform and transmitted through the 3GPP channel models. At the receiver, y(t) represents the received signal, depicted as in Eq. (1):

y(t) = ∫∫ h_c(τ, ν) e^{j2πν(t−τ)} s(t − τ) dτ dν    (1)
According to Eq. (1), the received signal y(t) is a superposition of multipath-propagated signal components, where each component experiences a delay of τ and a Doppler shift of ν. The parameter h_c(τ, ν) represents the delay-doppler impulse response of the channel. From the literature, it is inferred that Doppler shifts are of the order of 10 Hz–1 kHz; in cases of high mobility or at high carrier frequencies, the values can be even larger. The Symplectic Finite Fourier Transform (SFFT) is used to convert the received time frequency samples back to the Delay-Doppler Domain for detection of the symbols.
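To make the superposition in Eq. (1) concrete, the discrete-time sketch below applies a finite set of delay-Doppler paths to a sampled signal. This is a Python/NumPy stand-in for the paper's MATLAB simulation, and the path gains, delays, and Doppler values used are made-up illustrative examples, not parameters from the paper.

```python
import numpy as np

def delay_doppler_channel(s, paths, fs):
    """Discrete-time version of Eq. (1): the output is a superposition of
    delayed, Doppler-shifted copies of the input signal.
    `paths` is a list of (gain, delay_in_samples, doppler_hz) tuples."""
    s = np.asarray(s, dtype=complex)
    t = np.arange(len(s)) / fs
    y = np.zeros(len(s), dtype=complex)
    for h, d, nu in paths:
        s_d = s.copy()
        if d > 0:
            # linear delay: shift right by d samples, zero-fill the head
            s_d = np.concatenate([np.zeros(d, dtype=complex), s[:-d]])
        # h * exp(j*2*pi*nu*(t - tau)) * s(t - tau), with tau = d/fs
        y += h * np.exp(2j * np.pi * nu * (t - d / fs)) * s_d
    return y
```

With a single path of unit gain, zero delay, and zero Doppler, the channel reduces to an identity, which serves as a quick sanity check.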
2.1 OTFS Modulation

Each QAM symbol in the Delay-Doppler Domain s[k, l] is mapped to a time frequency domain symbol s[n, m], which represents a resource element of an OFDM symbol. The result, depicted in Eq. (2), is called the OTFS modulated signal [9]:

s[n, m] = W_tx[n, m] SFFT^{−1}(s[k, l])    (2)

where W_tx[n, m] represents the transmit windowing function.
2.2 OTFS Demodulation

From [9], the SFFT is applied to the periodic time frequency sequence Y[n, m] to obtain a periodic delay-doppler sequence Y[k, l], as depicted in Eq. (3):

Y[k, l] = SFFT(Y[n, m])    (3)
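The ISFFT/SFFT pair of Eqs. (2) and (3) (without the windowing term) can be sketched with 2D FFTs. Index and normalization conventions vary across OTFS papers; the version below is one common unitary choice, not necessarily the exact convention of [9].

```python
import numpy as np

def isfft(x_dd):
    """Delay-Doppler grid -> time-frequency grid (core of Eq. (2)).
    Inverse DFT along the Doppler axis (axis 0), forward DFT along the
    delay axis (axis 1), scaled so the overall transform is unitary."""
    n, m = x_dd.shape
    return np.sqrt(n / m) * np.fft.fft(np.fft.ifft(x_dd, axis=0), axis=1)

def sfft(x_tf):
    """Time-frequency grid -> Delay-Doppler grid (Eq. (3)); the exact
    inverse of isfft under the same normalization."""
    n, m = x_tf.shape
    return np.sqrt(m / n) * np.fft.fft(np.fft.ifft(x_tf, axis=1), axis=0)
```

Because the pair is unitary, demodulation exactly inverts modulation and signal energy is preserved across the domain change.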
2.3 Equalization and Channel Estimation

The MMSE equalizer is an efficient linear equalizer that equalizes the symbols in the Delay-Doppler Domain, thereby recovering the transmitted symbols Y(k, l). Phase tracking signals at the transmitter are used to compensate the common phase error at the equalizer, thus reducing the ICI effects. Channel estimation is configured for an ideal, synchronized channel with the 3GPP channel characteristics [6] and a delay spread of 300 ns.
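As a minimal sketch of the linear MMSE estimate described above, for a generic linear model y = Hx + n with noise variance sigma2 (plain linear algebra on an assumed channel matrix, not the paper's full delay-Doppler equalizer):

```python
import numpy as np

def mmse_equalize(H, y, sigma2):
    """Linear MMSE estimate: x_hat = (H^H H + sigma2*I)^(-1) H^H y,
    assuming unit-power symbols and noise variance sigma2."""
    n = H.shape[1]
    A = H.conj().T @ H + sigma2 * np.eye(n)
    return np.linalg.solve(A, H.conj().T @ y)
```

As sigma2 goes to zero this reduces to the zero-forcing solution; the regularization term is what keeps the estimate stable in noise.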
3 Results and Discussion

In this section, the comparative results of OTFS and OFDM are presented for the TDL and CDL 3GPP channel models. The performance of these models has been compared under a 5G NR downlink system in MATLAB. The parameters used for the simulation are listed in Table 1.

Table 1 Simulation parameters

Parameter               Value
Subcarrier spacing      30 kHz
Carrier frequency       28 GHz
Cyclic prefix           Normal
Modulation scheme       16 QAM
Channel model           TDL and CDL models
Channel delay spread    300 ns
Channel doppler shift   10 Hz
Channel estimation      Ideal
Equalizer               MMSE
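To make the 16 QAM entry of Table 1 concrete, the following sketch shows Gray-coded 16-QAM mapping and hard-decision demapping of the kind used in such link simulations. This is a Python/NumPy stand-in for the MATLAB simulation, and the Gray labeling shown is one common convention, not necessarily the exact 3GPP mapping.

```python
import numpy as np

# Gray-coded 2-bit -> PAM-4 level mapping (one common convention)
_GRAY = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}
_LEVELS = np.array([-3, -1, 1, 3])

def qam16_mod(bits):
    """Map groups of 4 bits to unit-average-power 16-QAM symbols."""
    b = np.asarray(bits).reshape(-1, 4)
    i = np.array([_GRAY[(x[0], x[1])] for x in b])
    q = np.array([_GRAY[(x[2], x[3])] for x in b])
    return (i + 1j * q) / np.sqrt(10.0)  # E[|s|^2] = 1

def qam16_demod(symbols):
    """Hard decision: nearest level per rail, then invert the Gray map."""
    inv = {v: k for k, v in _GRAY.items()}
    out = []
    for s in np.asarray(symbols) * np.sqrt(10.0):
        li = _LEVELS[np.argmin(np.abs(_LEVELS - s.real))]
        lq = _LEVELS[np.argmin(np.abs(_LEVELS - s.imag))]
        out.extend(inv[li] + inv[lq])
    return np.array(out)
```

In a noiseless round trip the demapper recovers the transmitted bits exactly; in the actual simulations, decisions are taken after MMSE equalization.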
Fig. 4 OTFS versus OFDM for TDL-A channel model
3.1 TDL Channel Models

As specified in 3GPP Release 14 of TR 38.901, TDL-A and TDL-C represent NLOS scenarios while TDL-E depicts an LOS component; these models are considered for the simulation of the OFDM and OTFS schemes. The simulation plots of Figs. 4 and 5 exhibit the TDL-A and TDL-C performance of the OTFS and OFDM modulation schemes. A higher BLER persists at increasing SNR owing to the ISI effects that cannot be eliminated at higher Doppler ranges. It is inferred that the BLER is higher for the OFDM scheme. It can also be seen that there is no fluctuation in the plots, owing to the MMSE equalizer design that effectively reduces ICI at the receiver. Figure 6 presents the simulation results of the TDL-E channel, where the saturation curve of the BLER decreases across the SNR scale as a sign of improvement. It is also observed that OTFS outperforms OFDM in terms of the power delay taps of the TDL channel model.
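The BLER plotted in these figures is, by definition, the fraction of transport blocks containing at least one bit error after detection. A minimal sketch of that computation (Python stand-in for the MATLAB post-processing; the block length is an arbitrary example):

```python
import numpy as np

def block_error_rate(tx_bits, rx_bits, block_len):
    """BLER = fraction of blocks with at least one bit error."""
    tx = np.asarray(tx_bits).reshape(-1, block_len)
    rx = np.asarray(rx_bits).reshape(-1, block_len)
    block_err = np.any(tx != rx, axis=1)  # True if any bit in the block differs
    return block_err.mean()
```

Sweeping this statistic over SNR points produces the BLER-versus-SNR curves shown in Figs. 4 to 9.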
3.2 CDL Channel Models

The CDL models are defined in terms of clusters with angles of arrival and departure. These clusters, consisting of multipath components [6], are simulated in MATLAB. For the 16QAM modulation scheme, two NLOS channel models (CDL-A and CDL-C) and
Fig. 5 OTFS versus OFDM for TDL-C channel model
Fig. 6 OTFS versus OFDM for TDL-E channel model
Fig. 7 OTFS versus OFDM for CDL-A channel model
one LOS channel model (CDL-E) have been evaluated with the MMSE equalizer at the receiver. Figures 7 and 8 exhibit the performance of the OFDM and OTFS systems with the CDL-A and CDL-C channel models. A small difference of a few dB on the SNR scale, within expected ranges, is observed when compared with the TDL-A and TDL-C channel models. Figure 9 shows the performance of the CDL-E channel model, where OTFS attains better saturation rates than OFDM across the SNR scale. With reference to Fig. 6, it can be seen that the TDL-E channel performs better than all the other channels and is suitable for Beyond 5G (B5G) technologies. Thus, OTFS simulation with an MMSE equalizer in the Delay-Doppler Domain can outperform the OFDM simulation at higher Doppler and delay spreads for the 3GPP channel models. This modulation scheme can be employed for B5G technologies in the future, as the results obtained show overall acceptable performance ranges. The MMSE equalizer is also well suited to reduce the Inter-Carrier Interference effects.
Fig. 8 OTFS versus OFDM for CDL-C channel model
Fig. 9 OTFS versus OFDM for CDL-E channel model
4 Conclusion

In this paper, the performance of OTFS and OFDM for 5G NR systems is evaluated and compared in terms of the 3GPP TDL and CDL channel models. Performance gains were calculated for the OTFS and OFDM systems with an MMSE equalizer at the receiver. Block error rates were obtained for the 16 QAM modulation scheme with 5G NR specifications for OTFS and OFDM. It is seen that OTFS outperforms OFDM with the MMSE equalizer for the 3GPP channel models on the SNR scale. The results also indicate the better performance of the TDL-E channel model in OTFS compared with all other 3GPP channel models. The proposed approach could be enhanced for BER and throughput by considering the tradeoffs among the performance factors of 5G wireless networks.

Acknowledgements We would like to thank Ashok Kumar Reddy Chavva and Anusha Gunturu from Samsung PRISM and Coimbatore Institute of Technology for their constant support and guidance in the completion of this research work.
References

1. A. Janssen, The Zak transform: a signal transform for sampled time-continuous signals. Philips J. Res. 43(1), 23–69 (1988)
2. J. Zak, Finite translations in solid-state physics. Phys. Rev. Lett. 19(24), 1385–1387 (1967)
3. R. Hadani, S. Rakib, M. Tsatsanis, A. Monk, A.J. Goldsmith, A.F. Molisch, R. Calderbank, Orthogonal time frequency space modulation, in Proceedings of IEEE WCNC (2017), pp. 1–7
4. R. Hadani, S. Rakib, A.F. Molisch, C. Ibars, A. Monk, M. Tsatsanis, J. Delfeld, A. Goldsmith, R. Calderbank, Orthogonal time frequency space (OTFS) modulation for millimetre-wave communications systems, in Proceedings of IEEE MTT-S International Microwave Symposium (2017), pp. 681–683
5. L. Li, H. Wei, Y. Huang, Y. Yao, W. Ling, G. Chen, P. Li, Y. Cai, A simple two-stage equalizer with simplified orthogonal time frequency space modulation over rapidly time-varying channels (2017). arXiv:1709.02505v1 [cs.IT]
6. 3GPP TS 38.901-g10, Study on channel model for frequencies from 0.5 to 100 GHz
7. A. Gunturu, A.R. Godala, A.K. Sahoo, A.K.R. Chavva, Performance analysis of OTFS waveform for 5G NR mm wave communication system, in IEEE Wireless Communications and Networking Conference (WCNC) (2021). https://doi.org/10.1109/WCNC49053.2021.9417346
8. H. Zhang, X. Huang, J.A. Zhang, Comparison of OTFS diversity performance over slow and fast fading channels, in Proceedings of IEEE International Conference on Communications (ICCC) (Changchun, China, 2019), pp. 828–833
9. G.D. Surabhi, A. Chockalingam, Low-complexity linear equalization for OTFS modulation. IEEE Commun. Lett. 24(2), 330–334 (2020)
10. G.D. Surabhi, A. Chockalingam, Low-complexity linear equalization for 2×2 MIMO-OTFS signals, in Proceedings of IEEE 21st International Workshop on Signal Processing Advances in Wireless Communications (SPAWC) (Atlanta, GA, USA, 2020), pp. 1–5
11. 3GPP TS 38.212-g10, NR Multiplexing and channel coding
12. 3GPP TS 38.211-g10, NR Physical channels and modulation
13. 3GPP TS 38.214-g10, NR Physical layer procedures for data
14. F. Long, K. Niue, C. Dong, J. Lin, Low complexity iterative LMMSE-PIC equalizer for OTFS, in Proceedings of IEEE International Conference on Communications (ICC) (Shanghai, China, 2019), pp. 1–6
15. V. Suma, Community based network reconstruction for an evolutionary algorithm framework. J. Artif. Intell. 3(01), 53–61 (2021)
16. B. Vivekanandam, Nakagami-m fading detection with eigen value spectrum algorithms. J. Electron. Inform. 3(2), 138–149 (2021)
17. J.-Z. Chen, The evaluation of performance for a mass-MIMO system with the STSK scheme over 3-D α-λ-μ fading channel. IRO J. Sustain. Wireless Syst. 1(1), 1–19 (2019)
18. A. Bashar, An efficient cell selection approach in 4G networks. J. Trends Comput. Sci. Smart Technol. (TCSST) 2(04), 188–196 (2020)
19. P. Iddalagi, SDWAN–its impact and the need of time. J. Ubiquit. Comput. Commun. Technol. (UCCT) 2(04), 197–202 (2020)
Design of Smart Super Market Assistance for the Visually Impaired People Using YOLO Algorithm D. Jebakumar Immanuel, P. Poovizhi, F. Margret Sharmila, D. Selvapandian, Aby K. Thomas, and C. K. Shankar
Abstract Among the population of India, nearly 1.6 million children are blind or visually impaired from birth. Visually impaired persons face many challenges in day-to-day life, such as requiring guidance to use public transport or to walk independently. One of the most common problems is the dependency on others for the purchase of household items. Hence, a unique solution is required that enables these people to identify and purchase products in the supermarket without depending on others. This research focuses on a portable camera-based assistance system that helps them to identify grocery items, along with a framework to read the text on the labels. The experimental system consists of three modules: object detection and classification; reading the text from the label; and a text-to-audio converter. The camera input is fed to a Raspberry Pi kit. The captured video is segregated into pictures of the grocery items. The deep learning algorithm proposed for object detection is You Only Look Once (YOLO). Its detection accuracy is notably high compared with existing algorithms. The captured image of the grocery item label is preprocessed to blur the unwanted text. Optical character recognition (OCR) is applied over the label to convert it into machine-readable text. At last, the processed text is converted to audio for the guidance of the user. Thus, a flexible, user-friendly supervisory mechanism is developed for helping visually impaired people.

Keywords Deep learning · Visually impaired · Smart super market assistance · Optical character recognition

D. Jebakumar Immanuel (B) · F. Margret Sharmila
Department of Computer Science and Engineering, SNS College of Engineering, Coimbatore, TamilNadu, India
e-mail: [email protected]
P. Poovizhi
Department of Information Technology, Dr. N.G.P. Institute of Technology, Coimbatore, TamilNadu, India
D. Selvapandian
Department of Computer Science and Engineering, Karpagam Academy of Higher Education, Coimbatore, TamilNadu, India
A. K. Thomas
Department of Electronics and Communication Engineering, Alliance College of Engineering and Design, Alliance University, Bangalore, India
e-mail: [email protected]
C. K. Shankar
Department of Electrical and Electronics Engineering, Sri Ramakrishna Polytechnic College, Coimbatore, TamilNadu, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_55
1 Introduction

As per the report of the World Health Organization, one-sixth of the world population is affected by visual impairment; globally, around 2.2 billion people have a vision impairment. Hence, a proper and unique solution is required to guide visually impaired people in the right manner. A major cause of this sight problem is cataracts. Vision impairment can be categorized as low, medium or severe based on the root cause of the problem [1]. According to a survey by Medical Express, the number of visually impaired people will grow from 36 to 115 million by the year 2050, owing to the increase in population and the growing elderly count. These people require assistive technology to stay independent. Recent advances in computer vision, sensor technology, medical innovations and wearable technology have eased the growth of assistive technology for impaired people [2]. In recent days, the popularity of mobile phones and advancements in laptops and related technologies have benefited people in many ways, and these devices have the ability to help people in need. The visually impaired depend on others for their daily tasks; their main problem is shopping for grocery items independently, for which they rely on friends and relatives.
1.1 Inspiration Towards Research Work

Eyesight is a vital faculty of the human being. Millions of people live in this world with impairments such as visual, hearing or speech impairment. Visual impairment has a major impact on people, making them depend on others for their work. In older days, visually impaired people could easily be identified by their way of walking and their wearing of black glasses. Nowadays, due to advancements in computer vision technology, visually impaired people can behave like sighted people. Although many alternative approaches exist for blind people, many still face difficulty in purchasing household items; for example, they face navigation difficulties. Moreover, blind people find it difficult to know whether a person is talking
to them or someone else during a conversation [3]. Making the lifestyle of blind people more comfortable is the underlying reason for carrying out this research work.
1.2 Facts About Blind People [4]

A. Blind people do not see blackness; instead, they see nothing
Blind people usually sense nothing rather than blackness. Common myths lead sighted people to believe the contrary.

B. Blind people can use smartphones and computers
Visually impaired people have the opportunity to use smartphones and computers. They can share their thoughts and connect with their friends through the use of social media.

C. Blind people can live independently
One of the myths about blindness is that blind people cannot live independently. This is not the case in today's scenario; blind people can lead a normal life like anyone else.
Section 1 describes the introduction and the facts and myths about blind people. Section 2 covers the literature related to visual impairment. Section 3 structures the proposed algorithm for assisting visually impaired people, along with its working principle. Section 4 deals with the experimental results. Section 5 describes the conclusion and future work.
2 Related Works

There are many causes of blindness in people: uncorrected refractive errors, cataracts, glaucoma, diabetic retinopathy, eye injury and trachoma. These make people visually impaired and restrict them from navigating to other places independently and from performing everyday tasks. With advances in knowledge, various solutions have been announced, such as the ER (Eye Ring) project, gesture systems, the TRS (text-recognition system) and face-recognition systems. However, these solutions have drawbacks such as being bulky, expensive, less robust and poorly accepted. Hence, progressive methods must arrive to benefit these users, and a technically sound methodology can be planned and implemented to help blind people. Henceforth, a deep learning technique is applied to reach a high level of accuracy [5].
2.1 Assistive Clothing Pattern Recognition for Visually Impaired People

Yang et al. proposed a system for picking garments with complex patterns and colours. A camera-based image system was developed by the authors to recognize clothing patterns in four classes: plaid, striped, patternless and irregular. The components used were a camera, a microphone, a computer and Bluetooth headphones for audio. The camera was mounted on glasses to capture the picture of the clothing, and the patterns and colours were communicated verbally. Mathematical properties were applied over the patterns to enable blind people to identify the clothes [6].
2.2 From Smart Objects to Social Objects: The Next Evolutionary Step of the Internet of Things

Atzori et al. proposed an iris recognition technique along with Daugman's algorithm to help blind people. This method combined biometric and wireless techniques to resolve spurious transactions [7].
2.3 Accessible Shopping Systems for Blind and Visually Impaired Individuals: Design Requirements and the State of the Art

Vladimir proposed a solution to the dependency of blind people in purchasing grocery items. This research work establishes the necessity of assistive shopping systems and analyzes existing approaches. Potential analysis and work have been carried out on guiding blind people to access grocery items in the market [8].
2.4 Touch Access for People with Disabilities

Sjostrom proposed haptic computer interfaces for creating new possibilities of interaction with computers. The Certec touch interface and the haptic interface were compared. The development of the haptic interface helps blind people to do many things, such as playing haptic computer games and learning maths by tracing [9].
2.5 Graphene-Based Web Framework for Energy Efficient IoT Applications

Chen and Yeh adopted an application layer framework named Graphene. This method was developed to improve the efficiency of communication. Recent IoT applications applied in the abstraction layer can be easily integrated with software using this method [10].
2.6 Design of Deep Learning Algorithm for IoT Application by Image-Based Recognition

Jacob and Darney proposed a deep learning framework with PCA. PCA is used to identify images from IoT applications, and a CNN algorithm is used to classify them. The experimental results give better accuracy than SVM [11].
2.7 Computer Vision on IoT Based Patient Preference Management System

Sathesh used adaptive wavelet sampling (AWS) for data management. A patient preference management system based on computer vision runs continuously on the same data to yield the best result. AWS segregates the required data, which can be moved from the application layer to the cloud layer [12].
2.8 Study of Retail Applications with Virtual and Augmented Reality Technologies

AR and VR technologies are described by the author. This is the basis for research on online shopping with the use of AR/VR technology, through which the market share and profit of the industry can be increased [13].
2.8.1 Analysis of Convolutional Neural Network-Based Image Classification Techniques

The author proposed CNN-based classifiers to predict fruits via camera, which helps establish a quick response in the billing process [12].
3 The Proposed Work

Visually impaired people use the smart glass for independent shopping. The smart glass, along with a set of components, acts as a guide for the visually impaired. The object detection algorithm used here is the YOLO algorithm. YOLO divides the image into a grid of cells, and a bounding box is used to describe the placement of an object within them. This algorithm is faster than other object detection algorithms.
3.1 Steps of the Research Work

Step 1: The camera attached to the smart glass captures an image as input.
Step 2: The captured image is sent to the Raspberry Pi kit, where it is preprocessed.
Step 3: The preprocessed image is segmented to change its dimensions into a form that is easy to analyze.
Step 4: The classification algorithm is applied to the segmented image.
Step 5: The classified image is fed as input to a neural network, which gives the classification label and the prediction accuracy.
Step 6: Object detection is performed to obtain bounding boxes for the classified objects.
Step 7: The ultrasonic sensor calculates the period between the sent and received signal to measure the distance between the object and the visually impaired person.
Step 8: The bone conduction headset instructs the visually impaired person: the image text is converted to speech via OCR, and the person acts according to the instruction given by the microcontroller.
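The eight steps can be sketched as a minimal Python pipeline. Every function name and returned value below is an illustrative placeholder, not the paper's actual implementation:

```python
# Hypothetical sketch of the eight-step pipeline; names and values
# are illustrative placeholders, not the paper's implementation.

def preprocess(frame):
    # Step 2: e.g. denoise/normalize the captured frame
    return {"data": frame, "preprocessed": True}

def segment(image):
    # Step 3: reshape into a fixed dimension that is easy to analyze
    image["segmented"] = True
    return image

def classify(image):
    # Steps 4-5: the classifier returns a label and a prediction score
    return "coca-cola", 0.98

def run_pipeline(frame):
    # Step 1 supplies the frame; steps 6-8 (bounding boxes, ultrasonic
    # ranging, text-to-speech over the headset) would follow the return
    image = segment(preprocess(frame))
    return classify(image)

label, confidence = run_pipeline("camera_frame")
```

On the real device, each placeholder would wrap a camera read, the Raspberry Pi preprocessing, and the YOLO detector described below.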
3.2 Smart Glass with Their Nature of Working

The visually impaired person wears the smart glass. The overall flow of the proposed work is shown in Fig. 1. Initially, the smart glass starts capturing images. The output from the camera is fed into the microprocessor for processing, and the system detects and classifies the images. The bone conduction headset receives the signal from the microprocessor and passes it to the visually impaired person. Figure 2 shows the proposed smart glass with the necessary components. A camera is fixed to the glass, which carries other components for communicating with the visually impaired user. The primary front-end components used by the visually impaired are the headset, the glass with camera and the vibration band. The
Fig. 1 Flow chart of the proposed system (start → user wears the smart glass → captures image → object detection → classification of the objects using CNN → information passed via Bluetooth/GPS/GSM to the user through the bone conduction headset and vibration band)
Fig. 2 Block diagram of the proposed system (Raspberry Pi microcontroller connected to GSM/GPS, Bluetooth, ultrasonic distance sensor, battery, bone conduction headset, vibration band, and smart glass with camera)
Table 1 Components of the proposed work

S No.  Components name    Specifications
1      Raspberry Pi       Raspberry Pi 3
2      BC headset         Frequency: 20–20,000 Hz
3      1080p HD camera    1080p
4      Battery with mAh   1 (2500 mAh)
5      UD sensor          8 mA, range distance: 3–3.5 cm
6      VB—Band            1
other back-end components which make the communication possible are the battery, ultrasonic distance sensor, GSM, GPS and Bluetooth/Wi-Fi modules (Table 1). The components are interconnected with each other using Bluetooth/Wi-Fi; the input and output of every component are interlinked, and the components respond without any delay. The Raspberry Pi is a smart-card-sized processor to which all other components are connected [14]. Figure 3 displays the architectural diagram of the proposed work. The smart glass gets its input from the Raspberry Pi kit through Bluetooth. The ultrasonic sensor detects objects and measures the distance between the person and the object. The vibration band (VB), along with the Raspberry Pi, gives feedback to the visually impaired person.
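The ultrasonic sensor's distance estimate follows from the round-trip time of its pulse. A small sketch (the speed-of-sound constant is the standard ~343 m/s figure, not a value given in the paper):

```python
SPEED_OF_SOUND_CM_PER_S = 34300  # ~343 m/s in air

def echo_distance_cm(round_trip_s):
    # The pulse travels to the object and back, so halve the
    # round-trip time before converting to distance
    return SPEED_OF_SOUND_CM_PER_S * round_trip_s / 2

d = echo_distance_cm(0.001)  # a 1 ms echo is roughly 17 cm away
```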
Fig. 3 Architectural diagram of the proposed work
3.3 Working Principle of YOLO Algorithm

The YOLO algorithm is used to identify the objects. Class prediction and bounding boxes for the full image are produced while executing the algorithm. The bounding box has four descriptors:

• Centre of box (bx, by)
• Width (bw)
• Height (bh)
• Value 'V', which denotes the name of the class, i.e. the object
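For drawing boxes or computing overlaps, this centre/width/height representation is commonly converted to corner coordinates. A small sketch of that standard conversion (not spelled out in the paper):

```python
def center_to_corners(bx, by, bw, bh):
    # Convert a (centre, width, height) box to its top-left (x1, y1)
    # and bottom-right (x2, y2) corner coordinates
    half_w, half_h = bw / 2, bh / 2
    return bx - half_w, by - half_h, bx + half_w, by + half_h

# A box centred mid-image, 0.2 wide and 0.4 tall (normalized units)
x1, y1, x2, y2 = center_to_corners(0.5, 0.5, 0.2, 0.4)
```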
The YOLO algorithm is a real-time object detection technique. It uses a CNN to detect objects and requires only a single forward propagation to detect them. The bounding boxes and the probability of the predicted class are produced by the CNN (Fig. 4) [15].
3.3.1 Why YOLO Algorithm?

• Speed
• Learning capabilities
• High accuracy

The following equations have been used:

(Pc, Wx, Wh, Ww, D)  (1)

S * S * (B * 5 + C)  (2)

where,
Fig. 4 a Image classification, b object detection
Fig. 5 Structure of neural network (input layer → hidden layer → output layer)
Pc — probability of class c
Wx, by — centre of the bounding box
Wh — height of the bounding box
Ww — width of the bounding box
D — value
S * S — resized image grid
B — bounding boxes (BB)
C — class probabilities
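Equation (2) gives the size of the network's output tensor. With illustrative values (a 7 × 7 grid and 2 boxes per cell are common YOLO choices, not figures stated in the paper) and the dataset's 35 grocery classes:

```python
S = 7   # grid cells per side (illustrative)
B = 2   # bounding boxes predicted per cell (illustrative)
C = 35  # class count, matching the 35 grocery items in the dataset

# Each box carries 5 numbers: centre x, centre y, width, height, confidence
values_per_cell = B * 5 + C
output_size = S * S * values_per_cell
print(output_size)  # 2205
```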
3.4 Classification

The neural network consists of several layers: input, hidden and output layers. There is no limit on the number of hidden layers. The YOLO algorithm uses a CNN to classify the images. Figure 5 shows the general structure of the CNN.
4 Investigative Results

4.1 Building a Product Recognition System

The YOLO algorithm detects the object and, with the help of the CNN, predicts the object's class. The dataset is taken from the Freiburg Groceries website and contains pictures of grocery items: 5000 images across 35 different grocery classes. These 5000 images are split in an 80:20 ratio, with 80% of the data used for training and the remaining 20% for testing. There are two key components
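The 80:20 split can be reproduced with a short stdlib sketch; the file names and seed below are illustrative, not from the paper:

```python
import random

def split_dataset(items, train_fraction=0.8, seed=42):
    # Shuffle deterministically, then cut at the 80% mark
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# 5000 placeholder image names, as in the Freiburg Groceries set
images = [f"img_{i:04d}.jpg" for i in range(5000)]
train, test = split_dataset(images)
print(len(train), len(test))  # 4000 1000
```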
Fig. 6 Image classification and detection
• Image detection [16]
• Image recognition

Figure 6 shows the object detection with the prediction rate. The reference image is uploaded to the database, and the trained data is matched against the original data by the CNN classifier built into the YOLO algorithm.

Accuracy = No. of correct predictions / Total number of predictions  (3)

Accuracy = (True Positive + True Negative) / (True Positive + True Negative + False Positive + False Negative)  (4)

• True Positive—the labelled class is correctly predicted as present
• True Negative—the non-labelled class is correctly predicted as absent
• False Positive—the labelled class is incorrectly predicted as present
• False Negative—the labelled class is incorrectly predicted as absent
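Equation (4) can be checked numerically; the confusion counts below are made-up illustrations, not results from the paper's experiments:

```python
def accuracy(tp, tn, fp, fn):
    # Eq. (4): correct predictions over all predictions
    return (tp + tn) / (tp + tn + fp + fn)

# Illustrative counts for a hypothetical 1000-prediction test run
acc = accuracy(tp=490, tn=480, fp=10, fn=20)
print(acc)  # 0.97
```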
4.2 Output Image

Figure 7 shows real-time object detection in the supermarket. The detected items are marked with colour-coded rectangular boxes.
Table 2 Observational table

Item  Detection accuracy (%)  Distance (cm)  Time period (s)
–     98                      25             0.8
–     99                      20             0.6
–     97                      10             0.4
4.3 Graphical Representation

Accuracy is the main performance metric. The detection accuracy of the items is displayed in Table 2, together with the distance and the period of detection. Figure 8 shows the graph of accuracy versus distance of the object. Figure 9 shows the comparison of the YOLO algorithm with the MobileNet algorithm: YOLO performs well, showing better accuracy than MobileNet.
5 Conclusion

The proposed work consists of a smart glass with a bone conduction headset to communicate with visually impaired people. Distance is the most important parameter affecting the accuracy of the proposed system. The YOLO algorithm is used to detect objects with high accuracy. The input from the camera is fed into the Raspberry Pi with the help of Bluetooth or Wi-Fi. The image is processed and segmented as
Fig. 7 Real-time grocery item detected image
Fig. 8 Accuracy versus distance (accuracy (%) plotted against distance for Salt, Dettol and Coca-Cola)
S * S grid size. The output produced by the YOLO algorithm acts as input to the vibration band. The image text is converted to speech and sent to the visually impaired person through the bone conduction headset. Thus, this research work helps visually impaired persons act independently, with object detection taking place in real time.
Fig. 9 Comparison of YOLO with MobileNet (accuracy % for Coca-Cola, Dettol and Salt)
6 Future Work

The proposed work can be converted into a mobile app with which visually impaired people can scan a product to know its details.
References

1. K. Manjari, M. Verma, G. Singal, A survey on assistive technology for visually impaired. Internet of Things 11, 100188 (2020), ISSN 2542-6605
2. A. Bhowmick, S.M. Hazarika, An insight into assistive technology for the visually impaired and blind people: state-of-the-art and future trends. J. Multimodal User Inter. 11, 149–172 (2017)
3. http://cs231n.stanford.edu/reports/2016/pdfs/218_Report.pdf
4. https://www.letsenvision.com/blog/5-facts-about-blind-people-and-blindness
5. S. Shaikh, V. Karale, G. Tawd, Assistive object recognition system for visually impaired. Int. J. Eng. Res. Technol. 9(9) (2021), ISSN (Online) 2278-0181
6. X. Yang, S. Yuan, Y.L. Tian, Assistive clothing pattern recognition for visually impaired people. IEEE Trans. Hum. Mach. Syst. 44(2) (2014)
7. L. Atzori, A. Iera, G. Morabito, From smart objects to social objects: the next evolutionary step of the Internet of Things. IEEE Commun. Mag. (2014)
8. V. Kulyukin, A. Kutiyanawala, Accessible shopping systems for blind and visually impaired individuals: design requirements and the state of the art. Open Rehabil. J. 3, 158–168 (2010)
9. C. Sjostrom, Touch access for people with disabilities. CERTEC, Lund University, Sweden (1999)
10. J.I.Z. Chen, L.-T. Yeh, Graphene based web framework for energy efficient IoT applications. J. Inform. Technol. 3(01), 18–28 (2021)
11. I.J. Jacob, P.E. Darney, Design of deep learning algorithm for IoT application by image based recognition. J. ISMAC 3(03), 276–290 (2021)
12. A. Sathesh, Computer vision on IoT based patient preference management system. J. Trends Comput. Sci. Smart Technol. 2(2), 68–77 (2020)
13. T. Senthil Kumar, Study of retail applications with virtual and augmented reality technologies. J. Innovative Image Process. (JIIP) 3(02), 144–156 (2021)
14. A. Suresh, C. Arora, D. Laha, D. Gaba, S. Bhambri, Intelligent smart glass for visually impaired using deep learning machine vision techniques and robot operating system (ROS) (2019)
15. https://www.section.io/engineering-education/introduction-to-yolo-algorithm-for-object-detection/
16. T. Priya, K.S. Sravya, S. Umamaheswari, Machine-learning-based device for visually impaired person. Artif. Intell. Evol. Comput. Eng. Syst., Adv. Intell. Syst. Comput. 1056 (Springer, Singapore, 2020)
Investigating the Effectiveness of Zero-Rated Website on Students on E-Learning

Asemahle Bridget Dyongo and Gardner Mwansa
Abstract The current Covid-19 pandemic environment has led most universities to switch to multimodal teaching strategies that incorporate e-learning. E-learning requires access to websites, which normally comes with data costs. Zero-rating of university websites became an option for offering internet services to students who could not afford data services. This study investigated the effectiveness of zero-rated websites on students and lecturers at a university in the Eastern Cape, where the majority of students emanate from previously disadvantaged backgrounds. A quantitative research approach was employed to derive the desired results. This paper shares some findings and conclusions that may well be found useful in the operationalization of zero-rated websites in academic institutions.

Keywords Zero-rated website · E-learning · Content accessibility
1 Introduction

The higher education sector in South Africa aims at supporting broader and equal access to quality education for all, including initiatives that promote universal access to the internet. The practice of zero-rating in higher education in South Africa is still new and has unknown possibilities. It became a common solution for most universities in accommodating students' data affordability challenges at the beginning of the Covid-19 pandemic. Zero-rating can be defined as negotiated "free services" which do not attract a cost on user data usage. Zero-rated services can be offered in forms such as basic website browsing for news and educational resources, including access to data-intensive applications. There is normally no capping of data usage. The practice is implemented in different ways depending on the objective of the service required [1]. The Human Rights Council of the United Nations General Assembly

A. B. Dyongo · G. Mwansa (B)
Walter Sisulu University, East London, South Africa
e-mail: [email protected]
A. B. Dyongo
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_56
considered the promotion, protection, and enjoyment of human rights through access to the internet and associated information and communication technologies [2]. Developing countries are confronted by a major crisis in access to the internet. According to Digital 2021, South Africa has 64% of its population as active internet users, excluding data usage from social media platforms [3]. This calls for improved access, considering the resolution of the United Nations. As part of its objectives, this study sought to observe the nature of the zero-rating and educational data bundle rollout at an academic institution in the Eastern Cape, an attempt to provide internet services to all students during the lockdown. The institution further considered guidelines in line with the Department of Higher Education and Training (DHET) in rolling out electronic devices to needy students. Zero-rating at the institution was not a new phenomenon: in 2016, a zero-rating initiative for the university's web services was negotiated with four major mobile network operators (MNOs). This time, however, the government was involved in negotiating the zero-rating of most of its web services as a way of catering to students, especially those from impoverished backgrounds. The context of the currently negotiated zero-rating meant that the institutional websites are accessible, yet embedded content like videos and YouTube remains blocked [4]. Due to the geographic location of most students, bad MNO signal in some areas remains a throttling issue. A White Paper on science, technology, and innovation notes that income inequality has stifled the outcomes and stability of economic growth, health care, and education in South Africa [5]. The significance of information and communication technology (ICT) in the teaching and learning space is increasingly becoming a catalyst in transforming the South African educational system.
Data tariffs and costs remain anti-poor, and almost half the South African population remains offline, including university students. Through the zero-rating arrangement, the DHET provisioned a limited number of university websites, although some sites hosted outside South Africa were not included. At the institution under observation, the main Learning Management System (LMS), Blackboard, was initially locally hosted with a zero-rated facility until storage became a challenge due to the limited facility. The university had to migrate to a cloud-based version hosted outside South Africa, and therefore the zero-rating facility on this service fell away. As stated earlier, the objectives were first to investigate the nature of zero-rating of websites in higher education institutions, second to determine the students' content and access requirements for e-learning, and finally to understand the level of content accessibility on zero-rated websites. The study was conducted at an academic institution based in the rural Eastern Cape of South Africa. Section 2 presents the review of literature and Sect. 3 discusses the research methodology used. The analysis of findings is presented in Sect. 4, and Sect. 5 concludes the paper.
2 Literature Review

The literature review focuses on the nature of zero-rating of websites, content and access requirements for e-learning, and content accessibility on zero-rated websites.
2.1 The Nature of Zero-Rating of Websites

Zero-rated services are "free web services" which enable users to access selected internet websites without financial cost, under agreed terms and conditions. In most cases, this is limited to localized websites. The practice is common globally but implemented in different ways depending on the agreed objectives. Third-party websites reached from the localized sites normally attract a charge for data usage, although this may still be negotiated with Internet Service Providers (ISPs). It is also common to serve advertisements as a means of funding the "free services" [1]. Both developed and developing countries provide these services. Zero-rating may be done in the following ways: (1) third-party content providers may subsidize the cost of users' data when they access their service; (2) ISPs can offer selected third-party services for "free" at no cost to those parties; (3) third-party content providers may pay ISPs to provide their services to users for "free"; and finally, (4) users get limited amounts of data in exchange for viewing an advertisement or completing a survey. However, considering limited financial resources, and in response to the Covid-19 pandemic, the South African government working with regulatory entities unanimously agreed to subsidize zero-rating for all higher education institutions. This was done to minimize the internet divide among students, particularly those from rural backgrounds. Data costs had become a barrier for the majority of students, and that impacted the execution of online learning [5].
In fact, according to [6], the following are among websites that qualify for zero-rating: • Local websites offering free access to educational content resources • Local commercial websites that offer all learners unconditional or free access to educational content resources or offer parents or learners direct access to their respective schools’ teaching and learning content resources.
2.2 Content and Access Requirements for E-Learning

E-learning refers to the use of information and communication technologies to enable access to online learning/teaching resources. In its broadest sense, [7] defines e-learning as any learning that is supported by digital electronic tools and media. For a deeper understanding of e-learning, it may be critical to consider the following questions: is e-learning online coursework for students at a distance, like the
case of the University of South Africa (UNISA)? Does it mean using a remote learning environment to support the provision of campus-based or contact-based education? Does it refer to an online tool to enrich, extend and enhance collaboration? Or is it wholly online learning, or part of amalgamated learning? [8]. These questions are also critical when analyzing and recommending e-learning platforms. Despite the disadvantages of e-learning, the literature indicates that it has made an impact on the development of learning [8]. The success of e-learning depends on having greater educational access and support for the educational programs [9]. It can be concluded from the above that e-learning is the most viable solution during the time of Covid-19, as alluded to by different sectors of government trying to save the academic project. Otherwise, the major barrier to e-learning remains the lack of internet connectivity and access to content for the vast majority of disadvantaged institutions. E-learning has made a strong impact on teaching and learning. Its adoption in some institutions has increased lecturers' and students' access to information and has provided a rich environment for collaboration among students and lecturers.
2.3 Content Accessibility on Zero-Rated Websites

The danger of Covid-19 infection has pushed institutions to move their courses online [4, 5]. However, access to the internet remains a major challenge. For instance, in Africa, only 24% of the population has access to the internet. Additionally, poor connectivity, exorbitant costs, and frequent power interruptions are serious challenges in most parts of rural Africa [10]. In sub-Saharan Africa, 89% of students do not have access to computers at home and 82% do not have internet access, which implies that online classes cannot accommodate all students [10]. However, there have been some improvements, especially in handling bandwidth challenges; these include pre-recording lectures and uploading them to zero-rated learning platforms, among others. Zero-rated content is limited to localized content, and most educational institutions lack the capacity to integrate all learning materials on their websites. Hence, most of them cannot fully deliver all their programs online. The digital divide among students within the same institution can be another challenge in terms of accessibility to zero-rated websites [12].
3 Research Methodology

The research design gives an approach by which data can be collected and analyzed in order to make decisions based on the findings. The choice of research design depends on the objectives of the study, which enable the research questions to be answered [11]. The objectives of the study were first to investigate the nature of zero-rating of websites in higher education institutions, second to determine the
students' content and access requirements for e-learning, and finally to understand the level of content accessibility on zero-rated websites. Based on the objectives, a quantitative method made it easier to manage and analyze data [12]. The collected data were analyzed with descriptive statistics techniques. The same methodological approach has been used previously by other researchers when dealing with similar types of objectives [11]. A total of 127 respondents were realized as the sample for the study. Data were collected through a questionnaire structured with closed and open-ended questions. This research was limited to the university under investigation, based in the Eastern Cape, and the results were not expected to support generalization beyond this population.
4 Findings and Analysis

This section presents a summarized account of the data collected through the questionnaires, based on the objectives of the study. Analysis was done quantitatively using SPSS software as follows:
4.1 Nature of Zero-Rating

Designation
Out of the total respondents, 98.4% were students and only 1.6% were lecturers, as indicated in Table 1.

Faculty
The respondents were mainly from the faculties of Science, Engineering, and Technology (61%) and Management Sciences (31%). The rest were from Education and Business Sciences with 7% and 2%, respectively, as indicated in Table 2.

Level of Study/Instructing
Fourth- and third-year students participated most, with 55.1% and 20.5%, respectively, while the first and second years together constituted about 24.4% (11.8% and 12.6%, respectively). The lower participation in first and second years

Table 1 Designation of respondents

          Frequency  Percent  Valid percent  Cumulative percent
Student   125        98.4     98.4           98.4
Lecturer  2          1.6      1.6            100.0
Total     127        100.0    100.0          –
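The frequency and valid-percent columns in Tables 1–13 can be reproduced from raw responses with a short stdlib sketch (SPSS was the tool actually used; the sample list below simply mirrors Table 1's 125 students and 2 lecturers):

```python
from collections import Counter

def frequency_table(responses):
    # For each category: (count, valid percent = count / N * 100,
    # rounded to one decimal place as in the SPSS tables)
    counts = Counter(responses)
    n = len(responses)
    return {k: (v, round(100 * v / n, 1)) for k, v in counts.items()}

sample = ["Student"] * 125 + ["Lecturer"] * 2
table = frequency_table(sample)
print(table)  # {'Student': (125, 98.4), 'Lecturer': (2, 1.6)}
```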
Table 2 Faculty representations of respondents

                                     Frequency  Percent  Valid percent  Cumulative percent
Science, engineering and technology  77         60.6     60.6           60.6
Management sciences                  39         30.7     30.7           91.3
Education                            9          7.1      7.1            98.4
Business sciences                    2          1.6      1.6            100.0
Total                                127        100.0    100.0          –
Table 3 Level of study/instructing

        Frequency  Percent  Valid percent  Cumulative percent
Fourth  70         55.1     55.1           55.1
Third   26         20.5     20.5           75.6
Second  16         12.6     12.6           88.2
First   15         11.8     11.8           100.0
Total   127        100.0    100.0          –
could be attributed to a lack of awareness and connectivity, because the study questionnaires were administered electronically through Google Forms. Table 3 shows the results for level of study and instructing.

Internet Services Used
The majority of respondents used miscellaneous sites (33.1%), which included various internet sites such as job sites, blogs, online newspapers, and Google searches. This was followed by educational sites (26.8%) external to the university. Social media usage was 19.7%, while email was 12.6%. The least used sites were the university sites, which include the learning management system (Blackboard) and other internal content-based sites. Considering internet usage by level of study, fourth-year students were the most active on all sites. Table 4 demonstrates the cross-tabulation of year of study/instructing and usage.

Main Reason for Using a Free Internet Service
The major reasons for most respondents to use zero-rated internet services were to access educational sites (39.4%) or because prepaid data plans were too expensive for them (35.4%). A few were interested in accessing social media like Facebook and informational sites like Wikipedia, and 10.2% were just trying it out, including some who were influenced by family members or friends to use zero-rated internet services. Refer to Table 5 for the breakdown.
Table 4 Level of study/instruction against internet use (count; % of total)

Level of study/  Miscellaneous  Educational       Social     Email      Educational -     Total
instructing      sites          sites - external  media                 university sites
First            5 (3.9)        5 (3.9)           2 (1.6)    0 (0.0)    3 (2.4)           15 (11.8)
Second           2 (1.6)        5 (3.9)           7 (5.5)    2 (1.6)    0 (0.0)           16 (12.6)
Third            5 (3.9)        7 (5.5)           7 (5.5)    6 (4.7)    1 (0.8)           26 (20.5)
Fourth           30 (23.6)      17 (13.4)         9 (7.1)    8 (6.3)    6 (4.7)           70 (55.1)
Total            42 (33.1)      34 (26.8)         25 (19.7)  16 (12.6)  10 (7.9)          127 (100.0)

Table 5 Reasons for using zero-rated services

                                       Frequency  Percent  Valid percent  Cumulative percent
Accessing educational sites            50         39.4     39.4           39.4
Expensive data plans                   45         35.4     35.4           74.8
Social media                           17         13.4     13.4           88.2
Trying it out                          13         10.2     10.2           98.4
Other reasons such as Wikipedia, etc.  2          1.6      1.6            100.0
Total                                  127        100.0    100.0          –
Benefits of Using Zero-Rated Plans
The majority of respondents (61.4%) overwhelmingly agreed that free internet services supported them in their academic activities, while others felt that zero-rated services also benefited them in other areas of need, such as accessing health information, job advertisements, etc. Figure 1 shows the results of how beneficial zero-rating plans have been to the respondents, of whom the majority were students.
Fig. 1 Benefits of using zero-rated plans
Knowledge of Zero-Rating Prior to This Survey
Table 6 illustrates that the majority of respondents were not familiar with the zero-rating concept. Although some users answered NO to the question, evidence from the study shows that they utilized some free internet services.

Access to Zero-Rating Sites and Applications
Table 7 shows that 46.5% of respondents preferred unlimited data valid for a limited number of websites and/or online applications; 44.1% preferred unlimited data access for a limited time. Only 9.4% opted for limited data with unrestricted usage on any site of their choice.
Table 6 Knowledge of zero-rating

                Frequency  Percent  Valid percent  Cumulative percent
No (not aware)  66         52.0     52.0           52.0
Yes (aware)     61         48.0     48.0           100.0
Total           127        100.0    100.0          –
Table 7 Requirement for zero-rating sites and applications

                                                            Frequency  Percent  Valid percent  Cumulative percent
Unlimited data valid for a limited number of websites/apps  59         46.5     46.5           46.5
Unlimited data valid for a limited time (e.g., 1 day)       56         44.1     44.1           90.6
Limited data (e.g., 15 MB) for use on any website or app    12         9.4      9.4            100.0
Total                                                       127        100.0    100.0          –
Table 8 Average usage of data/internet per week

                Frequency  Percent  Valid percent  Cumulative percent
501 MB–2 GB     48         37.8     37.8           37.8
2.1–5 GB        38         29.9     29.9           67.7
50–500 MB       26         20.5     20.5           88.2
More than 5 GB  15         11.8     11.8           100.0
Total           127        100.0    100.0          –
Average Weekly Usage of Mobile Phone Internet/Data
As indicated in Table 8, most respondents (37.8%) use between 501 MB and 2 GB of internet data per week, followed by 29.9% who use between 2.1 and 5 GB, and 20.5% who use less than 500 MB per week. Only 11.8% of respondents were using more than 5 GB. The general feeling of respondents was that the internet should be made freely accessible, as per the UN declaration [2, 15].
4.2 Content and Access Requirements for E-Learning

Technology Device Used
Most respondents used mobile phones (79.5%), compared to 12.6% using laptops. The university has an initiative of allocating laptops to all its students, but students rarely use them for online lessons, preferring mobile phones. A few respondents (0.8%) were using tablets for online learning. Table 9 shows the distribution of device types in use.

Internet Service Providers
Table 10 shows ISPs according to the device in use. It is clear from the observation that MTN was the most used internet service provider, with 63%. The study also established that the data plan used was mainly prepaid.
Table 9 Type of device in use

                   Frequency  Percent  Valid percent  Cumulative percent
Mobile phone       101        79.5     79.5           79.5
Laptop             16         12.6     12.6           92.1
Personal computer  9          7.1      7.1            99.2
Tablet phone       1          0.8      0.8            100.0
Total              127        100.0    100.0          –
774
A. B. Dyongo and G. Mwansa
Table 10  Internet service providers, by device in use

| Device in use | | Cell C | MTN | Telkom | Vodacom | Total |
|---|---|---|---|---|---|---|
| Laptop | Count | 2 | 9 | 2 | 3 | 16 |
| | % of total | 1.6 | 7.1 | 1.6 | 2.4 | 12.6 |
| Mobile phone | Count | 11 | 65 | 12 | 13 | 101 |
| | % of total | 8.7 | 51.2 | 9.4 | 10.2 | 79.5 |
| Personal computer | Count | 0 | 5 | 1 | 3 | 9 |
| | % of total | 0.0 | 3.9 | 0.8 | 2.4 | 7.1 |
| Tablet phone | Count | 0 | 1 | 0 | 0 | 1 |
| | % of total | 0.0 | 0.8 | 0.0 | 0.0 | 0.8 |
| Total | Count | 13 | 80 | 15 | 19 | 127 |
| | % of total | 10.2 | 63.0 | 11.8 | 15.0 | 100.0 |
Frequency and speed of internet use Most respondents used the internet almost every day. However, the most frequent users of educational sites found that the zero-rated services were slower than those they would get through the capped prepaid services. Table 11 shows the distribution of frequency of use against the perceived speed of the different connectivity methods.
Table 11  Frequency and speed of use of internet. Rows: How often do you visit the University website? Columns: How fast are these services when you surf the internet?

| How often | | Data-capped access was faster | Data-capped access was slow | Free access was fast | Free access was slow | Total |
|---|---|---|---|---|---|---|
| Frequently | Count | 19 | 4 | 3 | 18 | 44 |
| | % of total | 15.0 | 3.1 | 2.4 | 14.2 | 34.6 |
| Moderate | Count | 17 | 0 | 2 | 18 | 37 |
| | % of total | 13.4 | 0.0 | 1.6 | 14.2 | 29.1 |
| Rare | Count | 4 | 1 | 3 | 5 | 13 |
| | % of total | 3.1 | 0.8 | 2.4 | 3.9 | 10.2 |
| Very frequently | Count | 7 | 5 | 2 | 8 | 22 |
| | % of total | 5.5 | 3.9 | 1.6 | 6.3 | 17.3 |
| Very rare | Count | 4 | 0 | 0 | 7 | 11 |
| | % of total | 3.1 | 0.0 | 0.0 | 5.5 | 8.7 |
| Total | Count | 51 | 10 | 10 | 56 | 127 |
| | % of total | 40.2 | 7.9 | 7.9 | 44.1 | 100.0 |
Table 12  Clear images and text on zero-rated sites

| Response | Frequency | Percent | Valid percent | Cumulative percent |
|---|---|---|---|---|
| Yes | 82 | 64.6 | 64.6 | 64.6 |
| No | 45 | 35.4 | 35.4 | 100.0 |
| Total | 127 | 100.0 | 100.0 | – |
Table 13  Sound being audible

| Response | Frequency | Percent | Valid percent | Cumulative percent |
|---|---|---|---|---|
| Sometimes | 70 | 55.1 | 55.1 | 55.1 |
| Yes | 42 | 33.1 | 33.1 | 88.2 |
| No | 15 | 11.8 | 11.8 | 100.0 |
| Total | 127 | 100.0 | 100.0 | – |
Table 14  Information in the course easily understandable

| Response | Frequency | Percent | Valid percent | Cumulative percent |
|---|---|---|---|---|
| Yes | 72 | 56.7 | 56.7 | 56.7 |
| No | 55 | 43.3 | 43.3 | 100.0 |
| Total | 127 | 100.0 | 100.0 | – |
Quality of Zero-Rated Services Tables 12, 13 and 14 show the quality of zero-rated services in terms of images, sound, and the general reception of academic content. Generally, the quality of all content-based services, such as text, images, and sound, was within acceptable limits.
4.3 Levels of Content Accessibility on Zero-Rated University Website Most Used University Zero-Rated Website As shown in Table 15, most students (71.7%) had visited Blackboard among the zero-rated university websites. This can be attributed to Blackboard being the major channel for e-learning at the university through the LMS. Perception of Content Delivery Table 16 indicates that most students were neutral on the perceived effectiveness of content delivery. However, an average of about 21% agreed that the online classes were effective, that the platforms were easy to use, and that the online experience was engaging. On the other hand, about 14% believed otherwise.
Table 15  Zero-rated university sites visited

| Site | Frequency | Percent | Valid percent | Cumulative percent |
|---|---|---|---|---|
| Blackboard | 91 | 71.7 | 71.7 | 71.7 |
| Student mail portal | 29 | 22.8 | 22.8 | 94.5 |
| VPN portal | 7 | 5.5 | 5.5 | 100.0 |
| Total | 127 | 100.0 | 100.0 | – |
Table 16  Perception on content delivery

| Question | Number of respondents | Neutral (%) | Agree (%) | Disagree (%) | Strongly disagree (%) | Strongly agree (%) | Total (%) |
|---|---|---|---|---|---|---|---|
| How effective were the online classes/lectures at helping you reach your learning objectives? | 127 | 46.5 | 19.7 | 15.0 | 13.4 | 5.5 | 100 |
| How easy are the online platforms to use? | 127 | 45.7 | 19.7 | 16.5 | 13.4 | 4.7 | 100 |
| How engaging did you find the course over these online platforms? | 127 | 40.2 | 22.8 | 16.5 | 14.2 | 6.3 | 100 |
Preferred Learning Platform Of all the platforms available for e-learning, respondents preferred Microsoft Teams as an enabling platform. Table 17 summarizes the responses.
5 Conclusion The main objective of the study was to investigate the effectiveness of zero-rated websites in promoting e-learning. The findings of the study indicate that students mainly used the internet for a combination of needs that included academic activities,
Table 17  Preferred learning platform

| Platform | Frequency | Percent | Valid percent | Cumulative percent |
|---|---|---|---|---|
| Microsoft Teams | 110 | 86.6 | 86.6 | 86.6 |
| Zoom | 12 | 9.4 | 9.4 | 96.1 |
| Blackboard | 1 | 0.8 | 0.8 | 96.9 |
| Contact | 1 | 0.8 | 0.8 | 97.6 |
| Google Meet | 1 | 0.8 | 0.8 | 98.4 |
| Skype | 1 | 0.8 | 0.8 | 99.2 |
| Telegram | 1 | 0.8 | 0.8 | 100.0 |
| Total | 127 | 100.0 | 100.0 | – |
emails, and social media, because during the COVID-19 pandemic most academic and communication activities were done virtually. From the study, it is clear that zero-rating of internet services provides a more accessible environment for most students and hence contributes to effective delivery of the academic project, especially during the COVID-19 restrictions. Based on the results of the study, zero-rated websites will continue to be an important building block in enhancing the student experience and an anchor for improving students' performance, learning, and results. It is therefore critical that universities promote zero-rated website services and possibly cap them according to content requirements, especially content that is external to the institution.
References 1. J. Romanosky, M. Chetty, Understanding the use and impact of the zero-rated Free Basics platform in South Africa, in Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (2018). https://doi.org/10.1145/3173574.3173766 2. UN General Assembly, Oral Revisions of 30 June, Human Rights Council (2016). https://doi.org/10.1016/0005-2736(76)90226-1 3. S. Kemp, Digital 2021: South Africa (2021). [Online]. Available: https://datareportal.com/reports/digital-2021-south-africa 4. D.W. Hedding, M. Greve, G.D. Breetzke, W. Nel, B.J. van Vuuren, COVID-19 and the academe in South Africa: not business as usual. S. Afr. J. Sci. 116(8), 8–10 (2020). https://doi.org/10.17159/sajs.2020/8298 5. Department of Science and Technology, White Paper on Science, Technology and Innovation (2019). https://www.dst.gov.za/images/2019/White_paper_web_copyv1.pdf 6. V. Maphosa, Factors influencing students' perceptions towards e-learning adoption during the COVID-19 pandemic: a developing country context. Eur. J. Interact. Multimed. Educ. 2(2), e02109 (2021). https://doi.org/10.30935/ejimed/11000 7. South Africa, Disaster Management Act, Directions on Zero-rating of Content and Websites for Education and Health (2021) 8. S. Kumar Basak, M. Wotto, P. Bélanger, E-learning, M-learning and D-learning: conceptual definition and comparative analysis. E-Learn. Digit. Media 15(4), 191–216 (2018). https://doi.org/10.1177/2042753018785180
9. C. Author, The role of e-learning: advantages and disadvantages of adoption in higher education. 2(12), 397–410 (2014) 10. L. Uden, I.T. Wangsa, E. Damiani, The future of e-learning: e-learning ecosystem, in Proceedings of the 2007 Inaugural IEEE-IES Digital EcoSystems and Technologies Conference (2007), pp. 113–117. https://doi.org/10.1109/DEST.2007.371955 11. A. Aborode, O. Anifowoshe, T.I. Ayodele, A.R. Iretiayo, O.O. David, Impact of COVID-19 on education in Sub-Saharan Africa. Preprints (October 2020), pp. 1–29. [Online]. Available: https://www.preprints.org/manuscript/202007.0027/v1 12. K.A. Soomro, U. Kale, R. Curtis, M. Akcaoglu, M. Bernstein, Digital divide among higher education faculty. Int. J. Educ. Technol. High. Educ. 17(1) (2020). https://doi.org/10.1186/s41239-020-00191-5 13. D. Janet, The relationship between research question and research design, in P.A. Crookes, S. Davies (eds.), Research into Practice: Essential Skills for Reading and Applying Research in Nursing and Health Care, 2nd edn. (2004). [Online]. Available: http://www.amazon.co.uk/Research-into-Practice-Essential-Reasearch/dp/0702026867 14. K. McCusker, S. Gunaydin, Research using qualitative, quantitative or mixed methods and choice based on the research. Perfusion (United Kingdom) 30(7), 537–542 (2015). https://doi.org/10.1177/0267659114559116 15. T.G. Assembly, Universal Declaration of Human Rights (Chuukese). Asia-Pacific J. Hum. Rights Law 8(1), 101–106 (2007). https://doi.org/10.1163/157181507782200222
Role of Machine Learning Algorithms on Alzheimer Disease Prediction V. Krishna Kumar, M. S. Geetha Devasena, G. Gopu, and N. Sivakumaran
Abstract Alzheimer's is a kind of dementia that affects reasoning and social aptitudes, resulting in a constant decline in an individual's capacity to function. The build-up of amyloid protein and tau protein in the cerebrum stimulates cell death. Prevention and early prediction remain a big challenge. The proposed model addresses this need through the analysis of parameters such as the Mini-Mental State Examination score. In addition to these parameters, the dice coefficient, which quantifies the difference between normal and affected scan images, is used as one more parameter. SVM and CNN are applied to various neuro parameters and evaluated for prediction accuracy. Keywords Neuro decline · Dementia · Alzheimer's disease and machine learning algorithms
1 Introduction Alzheimer's disease is a neurologic disorder considered a main cause of dementia. It is a progressive disease that begins with modest memory loss and advances to the inability to talk and respond to the environment. Alzheimer's disease (AD) affects the parts of the brain responsible for cognition, memory, and language. It can have a significant impact on a person's capacity to carry out daily tasks. It V. Krishna Kumar (B) · M. S. Geetha Devasena Department of Computer Science and Engineering, Sri Ramakrishna Engineering College, Coimbatore 641022, India e-mail: [email protected] M. S. Geetha Devasena e-mail: [email protected] G. Gopu Department of Electronics and Communication Engineering, Sri Ramakrishna Engineering College, Coimbatore 641022, India e-mail: [email protected] N. Sivakumaran Instrumentation and Control Engineering, NIT, Trichy, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9_57
780
V. Krishna Kumar et al.
affects the individual's day-to-day activities to a greater extent as it progresses, and no full recovery is available. AD is not a normal aspect of growing older. The main known risk factor is age, with most people affected being 65 and older. If a person below 65 develops Alzheimer's disease, it is called younger-onset Alzheimer's disease. Getting an accurate diagnosis of early-onset Alzheimer's can be a long and tedious procedure because health care providers rarely look for AD in younger patients. Symptoms may be wrongly attributed to stress, or findings from different health care specialists may conflict. AD in younger people can be at any stage of the disease: cognitively normal, mild cognitive impairment, or Alzheimer's disease. Scientists are still unsure what causes Alzheimer's disease. There is most likely no single cause, but rather a combination of circumstances that affect each person differently. According to researchers, genetic factors may contribute to the development of Alzheimer's disease. Genes, however, do not equal destiny. A healthy lifestyle can help lower the risk of Alzheimer's disease. Two major long-term studies suggest that getting enough exercise, eating a nutritious diet, drinking less alcohol, and not smoking can all help people live healthier lives. Changes in the brain may begin years before the first symptoms occur. Researchers are looking into the influence of education, diet, and surroundings on the development of AD [1]. Healthy practices, which have been found to prevent cancer, diabetes, and heart disease, may also minimize the risk of subjective cognitive decline, according to emerging scientific evidence. Alzheimer's disease (AD) is a progressive degenerative disease that causes dementia. In the early stages, memory loss is minor, but late-stage patients lose their ability to talk and respond to their surroundings. The human brain has billions of neurons.
Each nerve cell links to a vast number of others to form communication networks [2]. Groups of neurons have specific functions such as thinking, learning, remembering, seeing, hearing, and smelling. To perform their jobs, brain cells work like miniature factories: they take in materials, generate energy, construct equipment, and dispose of waste. In addition to processing and storing data, cells also communicate with one another. Keeping everything functioning takes a great deal of coordination, as well as large amounts of fuel and oxygen. Plaques and tangles are thought to damage and destroy nerve cells [3]. Plaques are clumps of beta-amyloid [4], a protein fragment that fills the spaces between neurons. Tangles are accumulations, inside cells, of a protein called tau [5]. The memory-critical entorhinal cortex and hippocampus are the first parts of the brain to be destroyed. Later on, the disease affects areas of the cerebral cortex that deal with language, logic, and social interaction. As seen in Fig. 1, several additional areas of the brain are eventually damaged [6]. Only early identification is useful for slowing the progression rate of the disease. Symptoms may progress over months or years, and clinical diagnosis is vital as the condition could result in a stroke [7]. There is no single test for Alzheimer's disease [8]. If a specialist suspects the presence of the condition, they will ask the individual, and sometimes their family or caregivers, about their symptoms, experiences, and medical history.
Role of Machine Learning Algorithms on Alzheimer Disease…
781
The neuro-related illness is predicted using the Mini-Mental State Examination score, which decreases intermittently if the individual is affected. Patients with Mild Cognitive Impairment may or may not develop dementia in due course. When the underlying MCI results in a loss of memory, the condition tends to progress to dementia of this type. During the advanced stages of the disease, people experience various complications, such as dehydration and malnutrition, that can result in death [9]. Earlier diagnosis at the MCI stage could help provide patients with quality health care. To address this early-detection issue, MRI scan images taken at frequent intervals are informative for analyzing abnormal biomarkers and can aid early detection of AD [10]. AI strategies have been found to be very helpful for the analysis of Alzheimer's over the past decade, and deep learning algorithms offer considerable scope for improving the prediction accuracy of such health care systems [11].
2 Literature Survey Hett et al. proposed a methodology to predict which patients with mild cognitive impairment might develop AD; biomarkers play a major role in detecting the change from MCI to AD. They proposed a graph-based system that analyzes anatomical structures and their modification. Using the ADNI dataset, they evaluated a multiscale graph-based grading technique to predict possible cases of progression from MCI to AD over a period of three years [12]. These outcomes were compared with state-of-the-art strategies assessed on the same dataset [13]. Ju et al. proposed a deep learning technique that combines the brain network with clinically significant information, such as genetic parameters and gender, for earlier assessment. Functional connections between brain regions were computed from resting-state fMRI (R-fMRI) data. To deliver a definite detection of dementia, an autoencoder is used in which the functional connections of the networks vulnerable to AD and MCI are built. ADNI data is used as the dataset. The approach targets diagnosis at an earlier stage by first preprocessing the R-fMRI data. The time series data (a 90 × 130 matrix), capturing blood oxygen levels in each brain region over a long period, is then acquired, and a brain network is assembled and transformed into a 90 × 90 correlation matrix of the time series. They focused on a three-layered autoencoder model that captures information about the nervous system and then learns the brain networks' attributes completely [14]. Ly et al. proposed a machine learning model for predicting brain aging and estimating a patient's age from neuro scan images. An elevated brain age indicates that a person's brain appears "older" than that of age-matched healthy peers, suggesting that they may have experienced a higher cumulative exposure to brain insults or were more affected by those pathological insults. However, contemporary brain age models include older participants with amyloid pathology in their training sets and may therefore be confounded when examining Alzheimer's disease (AD) [15]. MRI data was additionally used to predict age, resolving dimensionality problems. Notably, the model showed larger differences on AD diagnostic groups than existing models, and additional information on amyloid status helped depict the significant differences between brain age and chronological age among cognitively normal individuals. This age prediction model used brain age as a biomarker for AD. Kruthika et al. classified different subjects using a multistage classifier, employing Bayes, KNN, and SVM. Particle Swarm Optimization (PSO) is used to identify the global and local best features. Image retrieval involves two steps: first, generating features and developing a query image; second, relating these features to those already in the dataset. The PSO algorithm is used to choose the best biomarkers indicating AD or MCI [16]. The data is taken from the ADNI dataset. The MRI images are processed first; feature selection incorporates volumetric and thickness measurements, and the optimal feature lists are then obtained from the PSO algorithm. KNN, Bayes, and SVM were evaluated with the test dataset. Jo et al. examined AD classification and prediction of conversion from MCI to AD by traditional ML algorithms. Deep learning techniques that use neuroimaging data for feature selection without pre-processing, such as CNN or RNN, have improved the classification of Alzheimer's disease and the prediction of MCI conversion [17]. When multimodal neuroimaging and fluid biomarkers were used together, the classification performance was best [18]. Deep learning approaches to the diagnostic categorization of AD utilizing multimodal neuroimaging data continue to improve in performance [19, 20].
Deep learning research in AD is still in its early stages, with researchers hoping to improve performance by incorporating more hybrid data types. Lin et al. proposed a longitudinal structural-MRI-based MCI progression prediction system. Longitudinal data on MCI patients was gathered from the Alzheimer's Disease Neuroimaging Initiative (ADNI) [21]. After preprocessing, regions of interest were segmented and MCI patches distinguished using a discriminative dictionary learning framework. The percentage of patches in each patient classed as severe atrophy patches was evaluated as a feature to be fed into a basic SVM. Finally, using fourfold cross-validation (CV), new subjects were predicted and the AUC (area under the curve) was determined. After fourfold CV, the average accuracy and AUC were 0.973 and 0.984, respectively. The implications of using data from one or two time points were also examined. The prediction system performed admirably and consistently in predicting MCI patients [22], and using longitudinal data to predict MCI progression was more effective and precise. Sungheetha et al. proposed a study in which features are extracted using deep networks and CNN. In mild DR pictures, microaneurysms can be diagnosed in the early stages of the transition from healthy to unwell. The confusion matrix detection results can be used to categorize the severity of diabetes [23]. In the CNN framework,
early identification of the diabetic state was performed using HE detected in an eye's blood vessel. The presence of diabetes in a person can also be identified using the proposed framework. The paper shows that the proposed framework's accuracy is greater than that of other commonly used detection algorithms. Bashar gives an overview of deep learning neural network designs that have been used in diverse applications for accurate classification and automated feature extraction. Deep learning typically uses a deep neural network (DNN) to conduct correct classification. The DNN is one of the most frequently used models for a wide range of applications; a CNN automates feature extraction directly from images by learning, so the DNN mostly achieves high accuracy [24]. Ghazal et al. proposed a framework for detection of Alzheimer's disease utilizing transfer learning for multi-class grouping on brain MRI, attempting to classify the images into four phases: MD, MOD, ND, and VMD. The proposed framework gives 91.70% precision. Aruchamy et al. proposed a work that isolates the white and gray matter using 3D structural brain MR images, extracts 2D slices in several regions, and selects the vital slices to perform feature extraction on them [25]. To sort these slices, extraction is applied on top of first-order statistical features, and the distinctive feature vectors generated by PCA are selected for further review [26].
3 Proposed System The proposed Alzheimer's disease prediction and investigation framework classifies subjects as Cognitively Normal, Mild Cognitive Impairment (MCI), or AD. The parameters required for early prediction and improved health care are retrieved from the ADNI (Alzheimer's Disease Neuroimaging Initiative) dataset. Three significant types of information are available in ADNI: study data, which includes outcomes from examinations such as the MMSE and NPI-Q; image collections such as MRI, PET, and CT; and genetic data [13]. SVM is a reliable classification algorithm in which the attributes are mapped into n-dimensional space and each feature value is assigned to a specific coordinate. The category for the attributes is then predicted using a hyperplane based on the clear distinction between the classes. Identifying the hyperplane is crucial for SVM prediction [27]; Fig. 2 depicts a hyperplane separating two distinct classes. Convolution is the simple technique of producing an activation by applying a filter to an input. A feature map, made by repeatedly applying the same filter to the input, denotes the location and strength of a recognized feature in an input such as an image. The key novelty of a CNN is its capacity to learn a vast number of filters (Fig. 4) in parallel, specific to a training dataset, under the constraints of a given predictive modeling problem such as image classification. As a result, highly specific features can be found anywhere in the input images, as shown in Fig. 3.
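To make the SVM stage concrete, the sketch below trains a linear SVM by sub-gradient descent on hinge loss over tabular "psychological" features (age, education, MMSE score, visit count), using the 70%/30% train/test split described later. Everything here is illustrative: the data is synthetic and the label rule is a made-up threshold on the MMSE column — the actual study uses the ADNI dataset, which is not reproduced here.

```python
# Illustrative linear SVM (hinge loss, sub-gradient descent) on synthetic
# "psychological" features. NOT the paper's model or data.
import numpy as np

rng = np.random.default_rng(0)
n = 300
X = np.column_stack([
    rng.uniform(55, 90, n),   # age (years)
    rng.integers(8, 21, n),   # education (years)
    rng.uniform(10, 30, n),   # MMSE score
    rng.integers(1, 10, n),   # number of visits
]).astype(float)
# Fake label just to have a learnable signal: low MMSE -> "AD" class (-1).
y = np.where(X[:, 2] < 22, -1.0, 1.0)

# Standardize features, then a 70/30 train/test split as in the paper.
X = (X - X.mean(axis=0)) / X.std(axis=0)
split = int(0.7 * n)
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

w, b, lam, lr = np.zeros(X.shape[1]), 0.0, 1e-3, 0.01
for epoch in range(200):
    for i in rng.permutation(len(X_tr)):
        margin = y_tr[i] * (X_tr[i] @ w + b)
        if margin < 1:                       # inside margin: hinge gradient
            w += lr * (y_tr[i] * X_tr[i] - lam * w)
            b += lr * y_tr[i]
        else:                                # outside margin: only regularize
            w -= lr * lam * w

acc = np.mean(np.sign(X_te @ w + b) == y_te)
print(f"test accuracy: {acc:.2f}")
```

Because the synthetic label is a linear threshold on one feature, the learned hyperplane separates the two classes well; on real ADNI features the boundary is far less clean, which is why the paper layers the dice coefficient and a CNN on top.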
Fig. 1 Comparison of normal and affected
Fig. 2 SVM
Firstly, the SVM classification algorithm is implemented and evaluated. Later, one more parameter, the dice coefficient, is added to the SVM to assess progression accuracy. To further consolidate the results, data from MRI (Fig. 5) is also used. SVM classifies by plotting the features in n-dimensional space and assigning the anticipated class to the features based on the distinct classes. The dice coefficient, an additional parameter derived from MRI images, is considered for prediction. The dice coefficient score is calculated as follows:
Fig. 3 Convolutional neural networks
Fig. 4 Proposed AD prediction system (flowchart: psychological parameters and MRI scans feed the Support Vector Machine algorithm for classification accuracy, dice coefficient calculation using SVM for progression accuracy, and the Convolutional Neural Network algorithm for prediction accuracy)
Dice score = (2 × number of true positives) / (2 × number of true positives + number of false positives + number of false negatives)

ResNet is a CNN design that overcomes the "vanishing gradient" problem, allowing for the creation of networks with thousands of convolutional layers that outperform shallower networks [28]. The fundamental building components of ResNet networks are residual blocks. ResNet adds an identity shortcut connection that skips one or more layers (Fig. 6); this skip connection helps to train deeper networks.
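The dice score formula above can be sketched directly for a pair of binary segmentation masks (e.g., thresholded MRI regions); the example masks here are toy values, not MRI data:

```python
# Minimal sketch of the dice score: 2*TP / (2*TP + FP + FN) for two
# binary masks of equal shape.
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()   # overlap of the two masks
    fp = np.logical_and(pred, ~truth).sum()  # predicted but not true
    fn = np.logical_and(~pred, truth).sum()  # true but not predicted
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom

a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
print(dice_score(a, b))  # 2*2 / (2*2 + 1 + 1) = 0.666...
```

A score of 1.0 means the two masks coincide exactly; as the affected region diverges from the normal template, the score falls, which is what makes it usable as an extra scalar feature alongside the psychological parameters.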
Fig. 5 Normal and AD MRI
Fig. 6 Residual learning block diagram
the same algorithm. Prediction with MRI images is additionally calculated using the ResNet model. The proposed system, by combining the psychological parameters and MRI images, is able to identify the progression of the disease, and the CNN algorithm greatly helps in predicting the disease with higher accuracy.
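The residual ("skip") connection idea behind ResNet can be sketched with a toy fully connected block: the block learns a residual F(x) and adds the input back, so gradients can flow through the identity path. The shapes and weights below are toy values, not the actual ResNet architecture used in the paper:

```python
# Toy residual block: y = ReLU(x + W2·ReLU(W1·x)).
# The "+ x" term is the identity shortcut that skips the two layers.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    f = w2 @ relu(w1 @ x)   # the learned residual F(x)
    return relu(x + f)      # skip connection: add the input back

rng = np.random.default_rng(0)
x = rng.normal(size=4)
w1 = rng.normal(size=(4, 4)) * 0.1
w2 = rng.normal(size=(4, 4)) * 0.1
y = residual_block(x, w1, w2)
print(y.shape)  # (4,)
```

With zero weights the block reduces to ReLU(x), i.e., the identity path alone; this is the degenerate case that makes very deep stacks trainable, since each block only needs to learn a small correction on top of the identity.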
Fig. 7 Performance analysis
4 Results The prediction accuracy is obtained from only the significant parameters among the various parameters given in the dataset. The parameters considered for prediction of AD include age, educational qualification, MMSE score, and the frequency of visits. The proposed AD prediction system evaluates the Support Vector Machine and a Convolutional Neural Network. For both algorithms, the training/testing split is 70%/30%. Using the psychological parameter data in ADNI, the accuracy of the AD prediction process was 75%. The imaging data in ADNI was used to calculate the dice coefficient for the test subjects, which was added as a new parameter to the same dataset; processing again with SVM yielded an accuracy of 83%. To improve the prediction accuracy further, a ResNet model was used with the imaging data available in the ADNI database. A prediction accuracy of about 93% was found using the CNN algorithm, as shown in Fig. 7.
5 Conclusion The proposed system, which compares machine learning algorithms to predict Alzheimer's disease, was successfully implemented. Without the dice coefficient, the AD prediction accuracy is observed to be 75%, whereas the additional dice coefficient parameter helps improve the prediction accuracy. The proposed AD prediction system can be further enhanced with more neuro-related parameters, and the dataset can be enlarged with more MRI images to achieve higher accuracy. Machine learning can also be applied effectively for early prediction to render better health care.
References 1. T. Iwatsubo, A. Iwata, K. Suzuki, R. Ihara, H. Arai, K. Ishii et al., Japanese and North American Alzheimer’s Disease Neuroimaging Initiative studies: harmonization for international trials. Alzheimer’s Dement. 14, 1077–1087 (2018) 2. D.A. Nation, M.D. Sweeney, A. Montagne, A.P. Sagare, L.M. D’Orazio, M. Pachicano et al., Blood-brain barrier breakdown is an early biomarker of human cognitive dysfunction. Nat. Med. 25, 270–276 (2019) 3. O. Hansson, J. Seibyl, E. Stomrud, H. Zetterberg, J.Q. Trojanowski, T. Bittner, CSF biomarkers of Alzheimer’s disease concord with amyloid-bPET and predict clinical progression: A study of fully automated immunoassays in BioFINDER and ADNI cohorts. Alzheimer’s Dement 14, 1470–1481 (2018) 4. P.S. Insel, R. Ossenkoppele, D. Gessert et al., Time to amyloid positivity and preclinical changes in brain metabolism, atrophy, and cognition: evidence for emerging amyloid pathology in Alzheimer’s disease. Front. Neurosci. 11, 281–289 (2017) 5. S.J. Van der Lee, C.E. Teunissen, R. Pool, M.J. Shipley, A. Teumer, V. Chouraki, Circulating metabolites and general cognitive ability and dementia: evidence from 11 cohort studies. Alzheimer’s Dement 14, 707–722 (2018) 6. S. Zhao, D. Rangaprakash et al., Deterioation from healthy to mild cognitive impairment and Alzheimer’s disease mirrored in corresponding loss of centrality in directed brain networks. Brain Inf. (2019) 7. M.W. Weiner, D.P. Veitch, P.S. Aisen, L.A. Beckett, N.J. Cairns, R.C. Green, D. Harvey, R.M. Clifford, W. Jagust, J.C. Morris, R.C. Petersen, A.J. Saykin, L.M. Shaw, A.W. Toga, J.Q. Trojanowski, Alzheimer’s Dis N, Recent publications from the Alzheimer’s disease neuroimaging initiative: reviewing progress toward improved AD clinical trials. Alzheimer’s Dement 13, E1–E85 (2017) 8. H.I. Suk, S.W. Lee, D. Shen, A. S. D. N, Initiative. deep ensemble learning of sparse regression models for brain disease diagnosis. Med. Image Anal. 37, 101–113 (2017) 9. N. Tesi, S.J. 
van der Lee, M. Hulsman, I.E. Jansen, N. Stringa, N. van Schoor et al., Centenarian controls increase variant effect sizes by an average twofold in an extreme case-extreme control analysis of Alzheimer’s disease. Eur. J. Hum. Genet. 27, 244–253 (2019) 10. K. Kauppi, A.M. Dale, Combining polygenic hazard score with volumetric mrı and cognitive measures ımproves prediction of progression from mild cognitive ımpairment to Alzheimer’s disease. Front. Neurosci. (2018) 11. M. Grassi, D.A. Loewenstein, D. Caldirola, K. Schruers, R. Duara, G. Perna, A clinicallytranslatable machine learning algorithm for the prediction of Alzheimer’s disease conversion: further evidence of its accuracy via a transfer learning approach. Int. Psychogeriatr. 14, 1–9 (2018). https://doi.org/10.1017/S1041610218001618 12. K. Hett, V.T. Ta, I. Oguz, J.V. Manjón, P. Coupé, Multi-scale graph-based grading for Alzheimer’s disease prediction. Med. Image Anal. 67, 101850 (Jan. 2021) 13. T. Yamane, K. Ishii, M. Sakata, Y. Ikari, T. Nishio, K. Ishii et al., Inter-rater variability of visual interpretation and comparison with quantitative evaluation of 11C-PiB PET amyloid images of the Japanese Alzheimer’s Disease Neuroimaging Initiative (J-ADNI) multicenter study. Eur. J. Nucl. Med. Mol. Imag. 44, 850–857 (2017) 14. R. Ju, C. Hu, P. Zhou, Q. Li, Early diagnosis of Alzheimer’s disease based on resting-state brain networks and deep learning. IEEE/ACM Trans. Comput. Biol. Bioinf. 16(1) (2019) 15. M. Ly, Z.Y. Gary, H.T. Karim, N.R. Muppidi, A. Mizuno, W.E. Klunk, H.J. Aizenstein, Improving brain age prediction models: incorporation of amyloid status in Alzheimer’s disease. Neurobiol. Aging 87, 44–48 (Mar. 2020) 16. K.R. Kruthika, Rajeswari, H.D. Maheshappa, Multistage classifier-basedapproach for Alzheimer’s disease prediction and retrieval. Inform. Med. Unlocked (2019) 17. T. Jo, K. Nho, A.J. 
Saykin, Deep learning in Alzheimer’s disease: diagnostic classification and prognostic prediction using neuroimaging data. Front. Aging Neurosci. 11, (2019). ISSN=1663–4365. https://doi.org/10.3389/fnagi.2019.00220
18. J. Shi, X. Zheng, Y. Li, Q. Zhang, S. Ying, Multimodal neuroimaging feature learning with multimodal stacked deep polynomial networks for diagnosis of Alzheimer's disease. IEEE J. Biomed. Health Inform. 22, 173–183 (2017)
19. F. Zhang, Z. Li, B. Zhang, H. Du, B. Wang, X. Zhang, Multi-modal deep learning model for auxiliary diagnosis of Alzheimer's disease. Neurocomputing (2019)
20. J. Shi, X. Zheng, Y. Li, Q. Zhang, S. Ying, Multimodal neuroimaging feature learning with multimodal stacked deep polynomial networks for diagnosis of Alzheimer's disease. IEEE J. Biomed. Health Inform. 22(1), 173–183 (2018)
21. D.F. Wong, H. Kuwabara, R. Comley et al., Longitudinal changes in [18F]RO6958948 tau PET signal in four Alzheimer's subjects, in 11th Hum. Amyloid Imaging, Miami, USA (11–13 Jan. 2017). Abstract ID 129, 70
22. Y. Lin, K. Huang, H. Xu, Z. Qiao, S. Cai, Y. Wang, L. Huang, Predicting the progression of mild cognitive impairment to Alzheimer's disease by longitudinal magnetic resonance imaging-based dictionary learning. Clin. Neurophysiol. 131(10), 2429–2439 (2020). ISSN 1388-2457. https://doi.org/10.1016/j.clinph.2020.07.016
23. A. Sungheetha, Rajendran, R. Sharma, Design an early detection and classification for diabetic retinopathy by deep feature extraction based convolution neural network. J. Trends Comput. Sci. Smart Technol. 3, 81–94 (2021). https://doi.org/10.36548/jtcsst.2021.2.002
24. A. Bashar, Survey on evolving deep learning neural network architectures. J. Artif. Intell. Capsule Netw. 73–82 (2019). https://doi.org/10.36548/jaicn.2019.2.003
25. C. Ge, Q. Qu, I.Y.H. Gu, A.S. Jakola, Multi-stream multi-scale deep convolutional networks for Alzheimer's disease detection using MR images. Neurocomputing (2019)
26. S. Aruchamy, A. Haridasan, A. Verma, P. Bhattacharjee, S.N. Nandy, S.R.K. Vadali, Alzheimer's disease detection using machine learning techniques in 3D MR images, in 2020 National Conference on Emerging Trends on Sustainable Technology and Engineering Applications (NCETSTEA) (2020), pp. 1–4. https://doi.org/10.1109/NCETSTEA48365.2020.9119923
27. M. Liu, J. Zhang, P.-T. Yap, D. Shen, View-aligned hypergraph learning for Alzheimer's disease diagnosis with incomplete multi-modality data. Med. Image Anal. 36, 123–134 (2017)
28. R. Cuia, M. Liu, RNN-based longitudinal analysis for diagnosis of Alzheimer's disease. Inform. Med. Unlocked (2019)
29. T.M. Ghazal, S. Abbas, S. Munir, M.A. Khan, M. Ahmad et al., Alzheimer disease detection empowered with transfer learning. CMC-Comput. Mater. Continua 70(3), 5005–5019 (2022)
Author Index
A
Abed, Mohammed Hamzah, 491
Abhishek, K., 173
Agarwal, Abhinav, 581
Aggarwal, Bharat Kumar, 337
Aggarwal, Vaibhav, 529
Ahmed, Karim Ishtiaque, 317
Ajay, C., 393
Akalya, V., 403
Akila, I. S., 737
Al-Shammary, Dhiah, 491
Alzubaidy, Hussein K., 491
Ananda, Chintia, 65
Ananda, M., 467
Ananthakumaran, S., 715
Anil Kumar, G., 479
Annamalai, S., 91
Anuradha, T., 327
Arora, Ankita, 529
Arthi, B., 143
Arulkumar, V., 111
Aruna, M., 127, 143
Arya, Meenakshi Sumeet, 643
Auxilia Osvin Nancy, V., 643
B
Balamurugan, M., 91
Balkawade, Shubham, 569
Bansal, Bijender, 337
Bansode, Apoorva, 615
Barak, Dheer Dhwaj, 337
Baranidharan, V., 403, 505
Bhaidasna, Zubin, 225
Bhandage, Nandan, 187
Bhuvana, J., 91
Bilagi, Pradyumna, 187
Bindu, P., 327, 367
Bodas, Anaya, 543
Bolla, Dileep Reddy, 701
Borse, Manas, 415
Briskilal, J., 17
Bunglawala, Warisahmed, 237
C
Charlyn Pushpa Latha, G., 143
Chaudhari, Narendra S., 631
Chauhan, Ankit, 81
D
Dattatraya, Kale Navnath, 715
Debnath, Narayan C., 351
Deepak, R., 505
Dhulavvagol, Praveen M., 187, 615
Divya, J., 615
Doifode, Adesh, 529
Doucet, Antoine, 43
Dwi, Randy, 31
Dyongo, Asemahle Bridget, 765
E
Ezhilarasan, K., 377
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
I. J. Jacob et al. (eds.), Expert Clouds and Applications, Lecture Notes in Networks and Systems 444, https://doi.org/10.1007/978-981-19-2500-9
F
Fernandes, Edwil Winston, 453
G
Gandhi, Ankita, 225
Garetta, Agelius, 271
Geetha Devasena, M. S., 779
Geetha Devi, A., 317
Gnanadeep, M. V., 173
Gopu, G., 779
Goyal, Deepak, 337
Gupta, Ankur, 337
Gupta, Pankaj, 337
Gupta, Preeti, 659
Guruprakash, K. S., 111
H
Hardikar, Shubhankar, 543
Haripriya, V., 91
Harshni, R., 403
Hartato, 271
Hasan, M. K. Munavvar, 505
Hegde, Anantashayana S., 173
Hiremath, Amulya, 615
Hithesh, Chatna Sai, 467
Husein, 271
J
Jagadeesh, C. B., 377
Janahan, Senthil Kumar, 367
Janardanan, Kumar, 675
Jaswanth, M. H., 157
Jayani, Jeconiah Yohanes, 283
Jayanthy, A. K., 675
Jebakumar Immanuel, D., 749
Jenifer Grace Giftlin, C., 127
Jha, Aayush Kumar, 691
Joglekar, Atharva, 543
K
Kalpana, G., 111
Kamble, S. B., 599
Karajanagi, N. M., 599
Karthick, V., 505
Kasbe, Akash, 569
Kathiravan, M., 327
Khan, Imran, 317
Kowdiki, K. H., 599
Krishna Kumar, V., 779
Krishnamoorthy, N., 317
Krishna, Samarla Puneeth Vijay, 467
Kukreti, Sanjeev, 211
Kulkarni, Pradnya, 569
Kumaresan, M., 91
Kumar, Govind, 517
Kumar, Manoj, 691
Kurniadi, Velia Nayelita, 253
L
Leander, Amadeus Darren, 283
M
Mahadik, Atharva, 415
Mahesha, N., 327
Majji, Sankararao, 367, 377
Malarselvi, G., 1
Malik, Sandeep, 581
Mangla, Aditya, 17
Manh, Phuc Nguyen, 351
Margarita, Vania, 31
Margret Sharmila, F., 749
Mathad, K. S., 441
Math, M. M., 441
Matta, Priya, 211
Mazumder, Debarshi, 351
Mehta, Saarthak, 393
Minh, Ngoc Ha, 351
Moulieshwaran, B., 505
Mwansa, Gardner, 765
N
Naresh Kumar, S., 377
Naresh, R., 427
Nathanael, Frandico Joan, 253
Naveeth Babu, C., 143
O
Oktavianus, Benaya, 31
P
Pandian, A., 1
Pandiselvi, C., 301
Pandya, Nitin Kumar, 385
Parashar, Aayush, 691
Parmar, Darshna, 237
Parveen, Suraiya, 555
Patel, Archana, 351
Patel, Astha, 81
Patel, Disha, 225
Patel, Ibrahim, 327
Patel, Syed Imran, 317
Patil, Rachana Yogesh, 415
Patnala, Tulasi Radhika, 377
Pimo, S. John, 377
Poojitha, P. S., 737
Poovizhi, P., 749
Prabha, P. Lakshmi, 675
Pramudji, Agung A., 31
Priyadarsini, K., 367
Puttaraju, Surabhi, 453
R
Rahman, Salka, 555
Rahul, A., 467
Rahul, C. J., 157
Raj, T. Ajith Bosco, 367
Raol, Prachi, 385
Rufai, Ahmad, 43
S
Sainath, Guru, 157
Sai, R. Dilip, 173
Sakthipriya, S., 427
Sanamdikar, S. T., 599
Santhameena, S., 453
Sarlashkar, Rujuta, 543
Sarojadevi, H., 701
Sathya, S., 199
Satpathy, Rabinarayan, 317
Selvan, C., 127
Selvapandian, D., 749
Senthil Kumar, D., 17
Senthil Kumar, S., 143
Shah, Jaimeel, 237
Shaikh, Asma, 659
Shamreen Ahamed, B., 643
Shankar, C. K., 749
Shantala, C. P., 479
Sharma, Sonal, 211
Sharma, Varun N., 157
Sharmila, S. P., 631
Shendkar, Parth, 415
Shirodkar, Ashwin, 615
Shirsat, Neeta, 543, 569
Shubhashree, S., 403
Sindhu, C., 393, 517
Singh, Sachin, 517
Singh, Siddharth, 701
Sivakumaran, N., 779
Sivakumar, S., 301
Sofi, Shabir Ahmad, 555
Sreedhar, Jadapalli, 327
Sridhar, S., 111, 127
Srikanteswara, Ramya, 157, 173
Subash, S., 403
Suganya, R., 199
Susmitha, T., 403
T
Thirumal, S., 367
Thomas, Aby K., 749
Titiksha, V., 403
Totad, S. G., 187, 615
Tuteja, Vritika, 467
U
Uma, S., 737
Undre, Yash, 415
V
Vaghasia, Madhuri, 81
Vala, Brijesh, 385
Venkatesh, A., 505
Vincenzo, 253
W
Warnars, Diana Teresia Spits, 43, 65
Warnars, Harco Leslie Hendric Spits, 31, 43, 65, 253, 271, 283
Y
Yoshua, Riandy Juan Albert, 271