Lecture Notes in Networks and Systems 840
Hushairi Zen · Naga M. Dasari · Y. Madhavee Latha · S. Srinivasa Rao Editors
Soft Computing and Signal Processing Proceedings of 6th ICSCSP 2023, Volume 2
Lecture Notes in Networks and Systems Volume 840
The series “Lecture Notes in Networks and Systems” publishes the latest developments in Networks and Systems—quickly, informally and with high quality. Original research reported in proceedings and post-proceedings represents the core of LNNS. Volumes published in LNNS embrace all aspects and subfields of, as well as new challenges in, Networks and Systems. The series contains proceedings and edited volumes in systems and networks, spanning the areas of Cyber-Physical Systems, Autonomous Systems, Sensor Networks, Control Systems, Energy Systems, Automotive Systems, Biological Systems, Vehicular Networking and Connected Vehicles, Aerospace Systems, Automation, Manufacturing, Smart Grids, Nonlinear Systems, Power Systems, Robotics, Social Systems, Economic Systems and others.

Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution and exposure which enable both a wide and rapid dissemination of research output. The series covers the theory, applications, and perspectives on the state of the art and future developments relevant to systems and networks, decision making, control, complex processes and related areas, as embedded in the fields of interdisciplinary and applied sciences, engineering, computer science, physics, economics, social, and life sciences, as well as the paradigms and methodologies behind them.

Indexed by SCOPUS, INSPEC, WTI Frankfurt eG, zbMATH, SCImago. All books published in the series are submitted for consideration in Web of Science.

For proposals from Asia please contact Aninda Bose ([email protected]).
Hushairi Zen · Naga M. Dasari · Y. Madhavee Latha · S. Srinivasa Rao Editors
Soft Computing and Signal Processing Proceedings of 6th ICSCSP 2023, Volume 2
Editors
Hushairi Zen, Faculty of Science and Technology, International Institute of Advance Technology (ICATSUC), Sarawak, Malaysia
Naga M. Dasari, Department of Computer Science, University of South Australia, Adelaide, SA, Australia
Y. Madhavee Latha, Department of Electronics and Communication Engineering, Malla Reddy Engineering College for Women, Secunderabad, India
S. Srinivasa Rao, Department of Electronics and Communication Engineering, Malla Reddy College of Engineering and Technology, Hyderabad, Telangana, India
ISSN 2367-3370 ISSN 2367-3389 (electronic)
Lecture Notes in Networks and Systems
ISBN 978-981-99-8450-3 ISBN 978-981-99-8451-0 (eBook)
https://doi.org/10.1007/978-981-99-8451-0

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Paper in this product is recyclable.
Advisory Board-ICSCSP-2023
Chief Patron Sri. Ch. Malla Reddy, Founder Chairman, MRGI, Hon’ble Minister, Government of Telangana
Patrons
Sri. Ch. Mahendar Reddy, Secretary, MRGI
Sri. Ch. Bhadra Reddy, President, MRGI
Conference Chair Dr. V. S. K. Reddy, Vice Chancellor, MRU
Organizing Chair Prof. P. Sanjeeva Reddy, Dean, International Studies
Convener Dr. S. Srinivasa Rao, Principal
Co-conveners
Dr. P. H. V. Sesha Talpa Sai, Dean, R&D
Dr. T. Venugopal, Dean, Students Welfare
Organizing Secretaries
Dr. K. Mallikarjuna Lingam, HOD, ECE
Dr. G. Sharada, HOD, IT
Dr. D. Sujatha, HOD, CSE (CI)
Dr. S. Shanthi, HOD, CSE
Dr. M. V. Kamal, HOD, CSE (ET)
Editorial Board
Dr. Hushairi Zen, Professor and Dean, Faculty of Science and Technology, International Institute of Advance Technology, Malaysia
Dr. Naga M. Dasari, Professor, Department of Computer Science, University of South Australia, Australia
Dr. Y. Madhavee Latha, Professor and Principal, Department of ECE, Malla Reddy Engineering College for Women, Maisammaguda, Secunderabad
Dr. S. Srinivasa Rao, Professor and Principal, Department of ECE, Malla Reddy College of Engineering and Technology, Maisammaguda, Secunderabad
Advisory Committee
Dr. Heggere Ranganath, Chair of Computer Science, University of Alabama in Huntsville, USA
Dr. Someswar Kesh, Professor, Department of CISA, University of Central Missouri, USA
Mr. Alex Wong, Senior Technical Analyst, Diligent Inc., USA
Dr. Suresh Chandra Satapathy, Professor, School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar
Dr. Ch. Narayana Rao, Scientist, Denver, Colorado, USA
Dr. Sam Ramanujan, Professor, Department of CIS and IT, University of Central Missouri, USA
Dr. Richard H. Nader, Associate Vice President, Mississippi State University, USA
Dr. Muralidhar Rangaswamy, WPAFB, OH, USA
Mr. E. Sheldon D. Wallbrown, Western New England University, Springfield, USA
Prof. Peter Walsh, Head of the Department, Vancouver Film School, Canada
Dr. Murali Venkatesh, School of Information Studies, Syracuse University, USA
Dr. Asoke K. Nandi, Professor, Department of EEE, University of Liverpool, UK
Dr. Vinod Chandran, Professor, Queensland University of Technology, Australia
Dr. Amiya Bhaumik, Vice Chancellor, Lincoln University College, Malaysia
Dr. Hushairi bin Zen, Professor, ECE, UNIMAS
Dr. Bhanu Bhaskara, Professor at Majmaah University, Saudi Arabia
Dr. Narayanan, Director, ISITI, CSE, UNIMAS
Dr. Koteswararao Kondepu, Research Fellow, Scuola Superiore Sant’ Anna, Pisa, Italy
Shri. B. H. V. S. Narayana Murthy, Director, RCI, Hyderabad
Prof. P. K. Biswas, Head, Department of E and ECE, IIT Kharagpur
Dr. M. Ramasubba Reddy, Professor, IIT Madras
Prof. N. C. Shiva Prakash, Professor, IISC, Bengaluru
Dr. B. Lakshmi, Professor, Department of ECE, NIT Warangal
Dr. G. Ram Mohana Reddy, Professor and Head, IT Department, NITK Surathkal, Mangaluru, India
Dr. Y. Madhavee Latha, Professor, Department of ECE, MRECW, Hyderabad
Coordinators
Ms. P. Anitha, Associate Professor, ECE
Ms. V. Sangeetha, Assistant Professor, CSE
Ms. P. Satyavathi, Assistant Professor, CSE
Dr. A. Mummoorthy, Professor, IT
Organizing Committee
Dr. M. Ramakrishna Murty, Professor, Department of CSE, ANITS, Visakhapatnam
Dr. B. Jyothi, Professor, ECE
Dr. P. Vanitha, Professor, ECE
Dr. K. Rasool Reddy, Associate Professor, ECE
Mr. Ch. Kiran Kumar, Associate Professor, ECE
Ms. D. Asha, Associate Professor, ECE
Dr. Shaik Rahamat Basha, Professor, CSE
Mr. M. Vazralu, Associate Professor, IT
Mr. M. Sandeep, Associate Professor, CSE
Dr. B. Rajeshwar Reddy, Administrative Officer
Campaigning and Social Media Coordinator Mr. Ch. Kiran Kumar, Associate Professor, ECE
Web Developer Mr. K. Sudhakar Reddy, Assistant Professor, IT
Review Members Dr. B. Lakshmi, Department of ECE, NIT Warangal Dr. Bhanu Bhaskara, Professor at Majmaah University, Saudi Arabia Dr. M. Ramasubba Reddy, IIT Madras Mr. Andrew Liang, Senior Instructor, Vancouver Film School, Canada Mr. Alex Wong, Senior Technical Analyst, Diligent Inc., USA Dr. Vinod Chandran, Professor, Queensland University of Technology, Australia Dr. Jiacun Wang, Monmouth University, New Jersey Shri. B. H. V. S. Narayana Murthy, Director, RCI Prof. N. C. Shiva Prakash, IISC, Bengaluru Prof. P. K. Biswas, IIT Kharagpur Dr. Koteswararao Kondepu, Research Fellow, Scuola Superiore Sant’ Anna, Pisa, Italy Prof. Peter Walsh, Head of the Department, Vancouver Film School, Canada Dr. Ahmed Nabih Zaki Rashed, Professor, Menoufia University, Egypt Dr. Ch. Narayana Rao, Scientist, Denver, Colorado, USA Dr. Arun Kulkarni, Professor, University of Texas at Tyler, USA Dr. Ramanujan, Professor, Department of CIS and IT, University of Central Missouri, USA Dr. K. Narayana Murty, Professor in CSE, University of Hyderabad Prof. P. Laxminarayana, Professor and Director, Nertu, Osmania University, Hyderabad, Telangana Dr. P. Chandra Sekhar, Professor Department of ECE, Osmania University, Hyderabad, Telangana Dr. Hemalatha Rallapalli, Associate Professor, Department of ECE, Osmania University, Hyderabad Dr. D. Rama Krishna, Associate Professor, Department of ECE, Osmania University, Hyderabad Dr. Koti Lakshmi, Associate Professor Department of ECE, Osmania University, Hyderabad Telangana
Dr. K. Anitha Sheela, Associate Professor, Department of ECE, JNTUH, Kukatpally, Hyderabad Dr. D. Srinivas Rao, Professor, Department of ECE, JNTUH, Kukatpally, Hyderabad Dr. P. Sri Hari, Professor, HOD, Department of EIE, GITAM University, Hyderabad Dr. Y. Madhavee Latha, Principal, Department of ECE, MRECW, Hyderabad Dr. K. Manjunatha Chari, Professor, HOD, Department of ECE, GITAM University, Hyderabad Dr. M. Ramakrishna Murthy, Professor, Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam Dr. M. Rekha Sundari, Professor, Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam Dr. S. Suganthi, Professor, Christ University, Bengaluru Dr. B. Tarakeswara Rao, Professor, Kallam Haranadha Reddy Institute of Technology, Guntur Dr. T. D. Bhatt, Professor, Department of ECE, MGIT, Gandipet, Hyderabad Dr. Hyma J., Professor, Gandhi Institute of Technology and Management (GITAM) University, Visakhapatnam Dr. Selani Deepthi Kavila, Professor, Anil Neerukonda Institute of Technology and Sciences, Visakhapatnam Dr. Anitha Perla, Professor, Malla Reddy University, Hyderabad Dr. D. Venkat Reddy, Professor, Department of ECE, MGIT, Gandipet, Hyderabad Dr. Ch. Raja, Professor, Department of ECE, MGIT, Gandipet, Hyderabad Dr. A. Vani, Associate Professor, Department of ECE, CBIT, Gandipet, Hyderabad Dr. E. Nagabushanam, Professor, Department of ECE, Sridevi Women’s Engineering College Dr. V. M. Senthil Kumar, Professor, Vivekananda College of Engineering for Women, Namakkal, Elaiyampalayam, Tamil Nadu Dr. M. Sucharitha, Professor, Vellore Institute of Technology, Andhra Pradesh Dr. T. Padma, Professor, Department of Biomedical Engineering, GRIET, Bachupally, Kukatpally, Hyderabad Dr. K. Padmavathi, Professor, Department of ECE, GRIET, Hyderabad Dr. G. S. Naveen Kumar, Professor, Malla Reddy University, Hyderabad Dr. Sasikanth Shanmugam, Professor, Vivekananda College of Engineering for Women, Namakkal, Elaiyampalayam, Tamil Nadu Dr. C. V. Narasimhulu, Professor and Principal, LIET, Hyderabad Dr. Ch. Srinivasa Rao, Professor, Department of ECE, JNTUK University College of Engineering Vizianagaram Dr. N. Balaji, Professor, Department of ECE, JNTUK University College of Engineering Vizianagaram Dr. N. V. Koteswara Rao, Professor and HOD, Department of ECE, CBIT, Gandipet, Hyderabad Dr. D. Krishna Reddy, Professor, Department of ECE, CBIT, Gandipet, Hyderabad
Dr. A. V. Narasimha Rao, Professor, Department of ECE, CBIT, Gandipet, Hyderabad Dr. Kakarla Subba Rao, Professor, Department of ECE, CBIT, Gandipet, Hyderabad Dr. A. D. Sarma, Professor, Department of ECE, CBIT, Gandipet, Hyderabad Dr. P. Narahari Sastry, Professor, Department of ECE, CBIT, Gandipet, Hyderabad Dr. K. Suman, Assistant Professor, Department of ECE, CBIT, Gandipet, Hyderabad Dr. V. Saidulu, Associate Professor, Department of ECE, MGIT, Gandipet, Hyderabad Dr. S. Praveena, Assistant Professor, Department of ECE, MGIT, Gandipet, Hyderabad Dr. C. D. Naidu, Principal and Professor, Department of ECE, VNRVJIET, Hyderabad Dr. Yarlagadda Padma Sai, Professor, Department of ECE, VNRVJIET, Hyderabad Dr. Vangala Padmaja, Professor, Department of ECE, VNRVJIET, Hyderabad Dr. L. Padma Sree, Professor, Department of ECE, VNRVJIET, Hyderabad Dr. Rajendra Prasad Somineni, Professor, Department of ECE, VNRVJIET, Hyderabad Dr. Ranjan K. Senapati, Professor, Department of ECE, VNRVJIET, Hyderabad Dr. Mediikonda Christhu Raju, Professor, Department of ECE, VNRVJIET, Hyderabad Dr. P. Trinatha Rao, Associate Professor, Department of ECE, GITAM University, Hyderabad Dr. C. Krishna Mohan, Associate Professor, IIT Hyderabad Dr. Sathya Peri, Associate Professor, IIT Hyderabad Dr. Manish Prateek, Professor and Director—SoCSE, UPES, Dehradun, India Dr. Manas Pradhan Rajan, Professor, INTI International University, Malaysia Dr. Sobhan Babu, Associate Professor, IIT Hyderabad Dr. Ch. Sudhakar, Associate Professor, NIT Warangal Dr. S. Phani Kumar, GITAM University, Hyderabad Dr. K. Ramesh, Associate Professor, NIT Warangal Dr. Padmalaya Nayak, Professor, GRIET Dr. K. Chandrasekaran, Professor, NITK Surathkal Dr. Gollapudi Ramesh Chandra, Professor, VNR VJIET Dr. R. B. V. Subra Maanyam, Associate Professor, NIT Warangal Dr. C. Kiranmai, Professor, VNR VJIET Dr. Venkatadiri, Associate Professor and Program Head—SoCSE, UPES, Dehradun, India Dr. Salman Abdul Moiz, Associate Professor, Central University, Hyderabad Dr. Rajeev Wankar, Professor, Central University, Hyderabad Dr. M. Akkalakshmi, Professor, GITAM University, Hyderabad Dr. Neelu Jyoti Ahuja, Professor and Head, R&D Centre, SoCSE, UPES, Dehradun, India Dr. Abhineet Anand, Professor, CSE Department, Galgotia University, Greater Noida, India Dr. Ravi Rastogi, Professor, CSE Department, Galgotia University, Greater Noida, India
Dr. K. Anuradha, Professor and Dean, CSE Department, GRIET Dr. P. Krishna Reddy, Professor, IIIT Hyderabad Dr. Praveen Paruchuri, Professor, IIIT Hyderabad Dr. A. Jagan, Professor and Head, CSE Department, BVRIT Hyderabad Dr. Hari Seetha, Professor, CSE Department, Vellore Institute of Technology (VIT) Dr. N. L. Bhanu Murthy, Professor and Head, CSE Department, BITS Hyderabad Campus Dr. G. Geethakumari, Professor, CSE Department, BITS Hyderabad Campus Dr. A. Nagesh, Professor and Head, CSE Department, MGIT Hyderabad Dr. M. Sree Vani, Professor, CSE Department, MGIT Hyderabad Dr. R. Rajeswara Rao, Professor and Head, CSE Department, JNTU Vizianagaram Dr. Sita Ramaiah, Professor, CSE Department, Andhra University, Andhra Pradesh Dr. S. Viswanadha Raju, Professor and Vice Principal, JNTU Jagtial, Telangana Dr. B. Vishnu Vardhan, Professor CSE Department, JNTU Jagtial, Telangana Dr. M. Sivarama Badriraju, Professor, CSE Department, SRKR Engineering College, Bhimavaram Dr. S. Suresh Kumar, Vivekananda College of Engineering for Women, Tiruchengode Dr. R. Rajasekar, Professor, St. Peters Engineering College Dr. Aruna Malapati, BITS Hyderabad
Conference Theme
The International Conference on Soft Computing and Signal Processing is planned to create awareness and to provide a common platform for professionals, academicians and researchers working in the areas of Soft Computing and Signal Processing to analyze state-of-the-art developments, innovations and future trends, and thereby contribute to the much-needed dissemination of the latest developments and advances in these fields. It is a well-known fact that research plays a vital role in the sphere of teaching and academics.

In this regard, Soft Computing encompasses topics such as Computational Intelligence, Deep Learning, Optimization Techniques, Evolutionary Algorithms, Intelligent E-Learning Systems, Soft Sets, Rough Sets, Fuzzy Logic, Neural Networks, Genetic Algorithms and Machine Learning, while Signal Processing encompasses emerging topics such as Communication Engineering, Information Theory and Networks, Electronics Engineering and Microelectronics, Signal, Image and Speech Processing, Wireless and Mobile Communication, Circuits and Systems, Energy Systems, Power Electronics and Electrical Machines, Electro-optical Engineering, Instrumentation Engineering, Avionics Engineering, Control Systems, Internet of Things and Cybersecurity, Biomedical Devices, MEMS and NEMS, Radar Signal Processing and VLSI Signal Processing. All of these are state-of-the-art topics with a great deal of research potential, and they contribute toward the promotion of peaceful and inclusive societies for sustainable development.

The conference provides a platform for interaction and the exchange of ideas, experience and expertise on the current focus of global research, recent developments, challenges and emerging trends in the above-mentioned fields. Finally, the conference is intended to be an intellectual arena that supports meeting the needs of the present without compromising the ability of future generations to meet their own needs. It also helps participants exchange and exhibit mutual respect and understanding of ideas, philosophical principles, cognitive styles and mind-sets, as well as acts of integrity and purposeful personal reflection. The submitted papers are expected to cover state-of-the-art technologies, product implementation, ongoing research and application issues, which will help in promoting peaceful and inclusive societies for sustainable development.
Preface
The International Conference on Soft Computing and Signal Processing (ICSCSP-2023) was successfully organized by Malla Reddy College of Engineering and Technology, a UGC Autonomous Institution, during June 23–24, 2023, at Hyderabad. The objective of this conference was to provide opportunities for researchers, academicians and industry persons to interact, exchange ideas and experience, and gain expertise in the cutting-edge technologies pertaining to soft computing and signal processing. Research papers in the above-mentioned technology areas were received and subjected to a rigorous peer-review process with the help of the program committee members and external reviewers. ICSCSP-2023 received a total of 243 papers; each paper was reviewed by more than two reviewers, and finally 55 papers were accepted for publication in the Springer LNNS series. Our sincere thanks go to Dr. Aninda Bose, Senior Editor, Springer Publications, India, and Dr. Suresh Chandra Satapathy, Prof. and Dean, R&D, KIIT, for extending their support and cooperation. We would like to express our gratitude to all session chairs, viz., Dr. Jiacun Wang, Monmouth University, USA; Dr. Ramamurthy Garimella, Mahindra University, Hyderabad; Dr. Divya Midhun Chakravarthy, Lincoln University College, Malaysia; Dr. Naga Mallikarjuna Rao Dasari, Federation University, Australia; Dr. Samrat Lagnajeet Sabat, University of Hyderabad; Dr. Bharat Gupta, NIT, Patna; Dr. Mukil Alagirisamy, Lincoln University College, Malaysia; and Dr. M. Ramakrishna Murthy, ANITS, Visakhapatnam, for extending their support and cooperation. We are indebted to the program committee members and external reviewers who have produced critical reviews in a short time. We would like to express our special gratitude to publication chair Dr. Suresh Chandra Satapathy, KIIT, Bhubaneswar, for his valuable support and encouragement till the successful conclusion of the conference. We express our heartfelt thanks to our Chairman, Sri. Ch. Malla Reddy Garu, Chief Patron, Founder Chairman, MRGI; Patrons Sri. Ch. Mahendar Reddy, Secretary, MRGI; Sri. Ch. Bhadra Reddy, President, MRGI; Dr. V. S. K. Reddy, Vice Chancellor, MRU; Dr. S. Srinivasa Rao, Principal and Convener; Co-conveners Dr. P. H. V. Sesha
Talpa Sai, Dean, R&D; Dr. T. Venu Gopal, Dean, Students Welfare; and Organizing Chair Prof. P. Sanjeeva Reddy, Dean, International Studies. We would also like to thank Organizing Secretaries Dr. K. Mallikarjuna Lingam, HOD, ECE; Dr. D. Sujatha, HOD (CI); Dr. G. Sharada, HOD (IT); Dr. S. Shanthi, HOD (CSE); and Dr. M. V. Kamal, HOD (ET) for their valuable contribution. Our sincere thanks go to all the coordinators, Organizing Committee, and other committee members for their commendable contribution in the successful conduct of the conference. Last, but certainly not least, our special thanks go to all the authors for their applaudable technical contributions which have made our proceedings rich and exemplary.

Dr. Hushairi Zen
Professor and Dean
Faculty of Science and Technology
International Institute of Advance Technology
Sarawak, Malaysia

Dr. Naga M. Dasari
Professor
Department of Computer Science
University of South Australia
Adelaide, Australia

Dr. Y. Madhavee Latha
Professor and Principal
Department of ECE
Malla Reddy Engineering College for Women, Maisammaguda
Secunderabad, India

Dr. S. Srinivasa Rao
Professor and Principal
Department of ECE
Malla Reddy College of Engineering and Technology, Maisammaguda
Hyderabad, India
Contents
Review and Design of Integrated Dashboard Model for Performance Measurements . . . 1
J. Vijay Arputharaj, Mahmud El Yakub, Ahmed Abba Haruna, and A. Senthil Kumar

An Empirical Analysis of Lung Cancer Detection and Classification Using CT Images . . . 11
Aparna M. Harale and Vinayak K. Bairagi

Species Identification of Birds Via Acoustic Processing Signals Using Recurrent Network Analysis (RNN) . . . 27
C. Srujana, B. Sriya, S. Divya, Subhani Shaik, and V. Kakulapati

Optimized Analysis of Emotion Recognition Through Speech Signals . . . 39
V. Kakulapati, Sahith, Naresh, and Swethan

Facial Emotion Recognition Using Chatbot and Raspberry Pi . . . 53
Sunil Bhutada, Meghana Madabhushi, Satya Shivani, and Sindia Choolakal

Estimation of Impurities Present in an Iron Ore Using CNN . . . 67
P. Asha, Kolisetti Pavan Chandra, Keerthi Durgaprashanth, S. Prince Mary, Sharvirala Kethan, and A. Mary Posonia

A Study on Machine Learning and Deep Learning Techniques Applied in Predicting Chronic Kidney Diseases . . . 79
Kalyani Chapa and Bhramaramba Ravi

Convolutional Neural Network and Recursive Feature Elimination Based Model for the Diagnosis of Mild Cognitive Impairments . . . 99
Harsh Bhasin, Abheer Mehrotra, and Ansh Ohri
Facial Analysis Prediction: Emotion, Eye Color, Age and Gender . . . 109
J. Tejaashwini Goud, Nuthanakanti Bhaskar, K. Srujan Raju, G. Divya, Srinivasarao Dharmireddi, and Murali Kanthi

An Automated Smart Plastic Waste Recycling Management Systems . . . 119
Vamaraju Hari Hara Nadha Sai, Nuthanakanti Bhaskar, Srinivasarao Dharmireddi, K. Srujan Raju, G. Divya, and Jonnadula Narasimharao

Secure Identity Based Authentication for Emergency Communications . . . 129
J. Jenefa, E. A. Mary Anita, V. Divya, S. Rakoth Kandan, and D. Vinodha

Stock Price Prediction Using LSTM, CNN and ANN . . . 141
Krishna Prabeesh Kakarla, Dammalapati Chetan Sai Kiran, and M. Kanchana

IoT-Based Smart Wearable Devices Using Very Large Scale Integration (VLSI) Technology . . . 155
M. Ashwin, R. Ch. A. Naidu, Raghu Ramamoorthy, and E. Saravana Kumar

Plant Disease Detection and Classification Using Artificial Intelligence Approach . . . 165
Ashutosh Ghildiyal, Mihir Tomar, Shubham Sharma, and Sanjay Kumar Dubey

Camera and Voice Control-Based Human–Computer Interaction Using Machine Learning . . . 177
Ch. Sathwik, Ch. Harsha Vardhan, B. Abhiram, Subhani Shaik, and A. RaviKumar

Optimal Crop Recommendation by Soil Extraction and Classification Techniques Using Machine Learning . . . 189
Y. B. Avinash and Harikrishna Kamatham

IoT-Based Smart Irrigation System in Aquaponics Using Ensemble Machine Learning . . . 199
Aishani Singh, Dhruv Bajaj, M. Safa, A. Arulmurugan, and A. John

Performance Evaluation of MFSK Techniques Under Various Fading Environments in Wireless Communication . . . 209
Sudha Arvind, S. Arvind, Abhinaya Koyyada, Mohd Mohith, Navya Sree Vallandas, and Sumanth Chekuri

Prediction of Breast Cancer Using Feature Extraction-Based Methods . . . 219
Suddamalla Tirupathi Reddy, Jyoti Bharti, and Bholanath Roy
Anomaly Detection in Classroom Using Convolutional Neural Networks . . . 233
B. S. Vidhyasagar, Harshith Doppalapudi, Sritej Chowdary, VishnuVardhan Dagumati, and N. Charan Kumar Reddy

Efficient VLSI Architectures of Multimode 2D FIR Filter Bank using Distributed Arithmetic Methodology . . . 243
Venkata Krishna Odugu, B. Satish, B. Janardhana Rao, and Harish Babu Gade

Implementation of an Efficient Image Inpainting Algorithm using Optimization Techniques . . . 255
K. Revathi, B. Janardhana Rao, Venkata Krishna Odugu, and Harish Babu Gade

A Systematic Study and Detailed Performance Assessment of SDN Controllers Across a Wide Range of Network Architectures . . . 265
V. Sujatha and S. Prabakeran

Classification and Localization of Objects Using Faster RCNN . . . 279
Bhavya Sree Dakey, G. Ramani, Md. Shabber, Sirisha Jogu, Divya Sree Javvaji, and Hasini Jangam

An Image Processing Approach for Weed Detection Using Deep Convolutional Neural Network . . . 289
Yerrolla Aparna, Nuthanakanti Bhaskar, K. Srujan Raju, G. Divya, G. F. Ali Ahammed, and Reshma Banu

Detection of Leaf Black Sigatoka Disease in Enset Using Convolutional Neural Network . . . 301
A. Senthil Kumar, Meseret Ademe, K. S. Ananda Kumar, Srikrishna Adusumalli, M. Venkata Subbarao, and K. Sudhakar

Detecting Communities Using Network Embedding and Graph Clustering Approach . . . 311
Riju Bhattacharya, Naresh Kumar Nagwani, and Sarsij Tripathi

Deep Learning Approaches-Based Brain Tumor Detection Using MRI Images—A Comprehensive Review . . . 327
S. Santhana Prabha and D. Shanthi

Predicting Crop Yield with AI—A Comparative Study of DL and ML Approaches . . . 337
M. Jayanthi and D. Shanthi

Trust and Secured Routing in Mobile Ad Hoc Network Using Block Chain . . . 349
E. Gurumoorthi, Chinta Gouri Sainath, U. Hema Latha, and G. Anudeep Goud
Predicting Mortality in COVID-19 Patients Based on Symptom Data Using Hybrid Neural Networks . . . 361
Naveen Chandra Paladugu, Ancha Bhavana, M. V. P. Chandra Sekhara Rao, and Anudeep Peddi

Underwater Image Quality Assessment and Enhancement Using Active Inference . . . 375
Radha SenthilKumar, M. N. Abinaya, Divya Darshini Kannan, K. N. Kamalnath, and P. Jayanthi

A Machine Learning and Deep Learning-Based Web Application for Crop and Fertilizer Recommendation and Crop Disease Prediction . . . 389
Amuri Srinidhi, Veeramachinani Jahnavi, and Mohan Dholvan

The Hybrid Model of LSB—Technique in Image Steganography Using AES and RSA Algorithms . . . 403
Srinivas Talasila, Gurrala Vijaya Kumar, E Vijaya Babu, K Nainika, M Veda Sahithi, and Pranay Mohan

An Effective Online Failure Prediction in DC-to-DC Converter Using XGBoost Algorithm and LabVIEW . . . 415
B. Aravind Balaji, S. Sasikumar, Naga Prasanth Kumar Reddy Puli, Velicherla Chandra Obula Reddy, and V. R. Prakash

Machine Learning-Based Stroke Disease Detection System Using Biosignals (ECG and PPG) . . . 429
S. Neha Reddy, Adla Neha, S. P. V. Subba Rao, and T. Ramaswamy

Secure Trust-Based Attribute Access Control Mechanism Using FK-MFCMC and MOEHO-XGBOOST Techniques . . . 441
Padala Vanitha and Banda Srikanth

Designing Multiband Patch Antenna for 5G Communication System . . . 455
Hampika Gorla, N. Venkat Ram, and L. V. Narasimha Prasad

Air Quality Prediction Using Machine Learning Algorithms . . . 465
G. Shreya, B. Tharun Reddy, and V. S. G. N. Raju

Disease Detection in Potato Crop Using Deep Learning . . . 475
S. P. V. Subba Rao, T. Ramaswamy, Samrat Tirukkovalluri, and Wasim Akram

Applications of AI Techniques in Health Care and Well-Being Systems . . . 485
Pankaj Kumar, Rohit, Satyabrata Jena, and Rajeev Shrivastava
Smart Air Pollution Monitoring System Using Arduino Based on Wireless Sensor Networks . . . 497
S. Thaiyalnayaki, Rakoth Kandan Sambandam, M. K. Vidhyalakshmi, S. Shanthi, J. Jenefa, and Divya Vetriveeran

Design of Voltage Comparator with High Voltage to Time Gain for ADC Applications . . . 505
Ashwith Kumar Reddy Penubadi and Srividya Pasupathy

Intelligent One Step Authentication Using Machine Learning Model for Secure Interaction with Electronic Devices . . . 515
Tharuni Gelli, Rajesh Mandala, Challa Sri Gouri, S. P. V. Subba Rao, and D. Ajitha

Development of Automated Vending Machine for the Application of Dispensing of Medicines . . . 527
P. Deepak, S. Rohith, D. Niharika, K. Harshith Kumar, and Ram Bhupal

Design of a Low-Profile Meander Antenna for Wireless and Wearable Applications . . . 537
S. Rekha, G. Shine Let, and E. John Alex

Machine Learning-Based Prediction of Cardiovascular Diseases Using Flask . . . 547
V. Sagar Reddy, Boddula Supraja, M. Vamshi Kumar, and Ch. Krishna Chaitanya

Network Traffic Analysis using Feature-Based Trojan Detection Method . . . 559
R. Lakshman Naik, Sourabh Jain, and B. Manjula

Multi-band Micro-strip Patch Antenna for C/X/Ku/K-Band Applications . . . 569
Karunesh Srivastava, Mayuri Kulshreshtha, Sanskar Gupta, and Shrasti Sanjay Shukla

Design of 4 × 4 Array Antenna Using Particle Swarm Optimization for High Aperture Gain for Wi-Fi Applications . . . 577
Madhumitha Jayaram, P. K. Santhya Premdharshini, and Rajeswari Krishnasamy

Optimal Energy Restoration in Radial Distribution: A Network Reconfiguration Approach by Kruskal’s Minimal Spanning Tree Algorithm . . . 591
Maitrayee Chakrabarty and Dipu Sarkar

An Efficient Methodology for Preprocessing of COVID-19 Images Using BM3D Technique . . . 607
Anitha Patibandla, Kirti Rawal, and Gaurav Sethi
Islanding Detection and Prevention of Blackout of Distribution System Using Decision Tree Algorithm-Based Load Shedding . . . 617
Giriprasad Ambati, B. Sannitha, N. Meghana, T. Prashanth, Ch. Karthik, and Ch. Rami Reddy

Propagation of Computer Worms—A Study . . . 629
Mundlamuri Venkata Rao, Divya Midhunchakkaravarthy, and Sujatha Dandu

Author Index . . . 641
Editors and Contributors
About the Editors Dr. Hushairi Zen was awarded his Doctoral degree from Edith Cowan University, Australia in the area of Wireless Communication Networking and Protocol. He has pursued his Masters in Telecommunications from Universiti Malaysia Sarawak (UNIMAS). He has pursued his degree Bachelor’s Degree in Electronics and Telecommunication at Universiti Malaysia Sarawak (UNIMAS). His areas of expertise are Wireless Communication, Internet of Things, Renewable Energy. He has undertaken several consultancy works in Nanga Sengaih Micro Hydro Project, KKLW, RM 1,700,000.00, 2011, Semulung Ulu Micro Hydro Project, KKLW, RM 1,600,000.00, 2011, Assum Micro Hydro Project, RM 3,500,000.00, 2013, Bario Micro Hydro Project, RM 4,000,000.00, 2013, Study of Illegal Parabola Dish within Community in Sarawak, MCMC, RM 345,000.00, 2015, Development of Sago Palm Detection and Quantification using Drone and Deep Learning, CRAUN Research, RM 85,000 2019, Development of Reverse Osmosis Performance Monitoring based on IoT and Artificial Intelligence, Biomedix Solutions Sdn Bhd, RM 180,000, 2020, A holistic Approach to Accelerate Sarawak in Embracing Digital Technology Advancement, SDEC, 2020, RM 80,000, Study on Transformation of Unit Pendaftaran Kontraktor dan Juruperunding (UPKJ) Towards World Class Service Delivery, State Government, RM 921,000.00, 2021. He has over 70 publications in several National and International Journals, Conferences and book Chapters. He has won the Gold medal Award at INTEX 2022. Real-time Landslide Monitoring and Alert System based on IoT and BigData, Silver Medal Award at INTEX 16 for Wrist Module as an Applied Health Monitoring System in Wireless Today Area Network, Bronze Medal Award at INTEX 17 for Wearable Body Posture Monitoring System and Bronze Medal Award 16. Pump as Turbine with Rounded Leading Edge Impeller for Microhydro System.
Dr. Naga M. Dasari was awarded Doctorate in Computational Neuroscience by the University of South Australia in 2016. He has pursued Master of Technology (Digital Systems and Computer Electronics) at Jawaharlal Nehru Technological University (JNTU), India, 1997. He has pursued his Bachelors of Technology in Electronics and Communication Engineering from Jawaharlal Nehru Technological University (JNTU), India. He is a highly qualified and experienced academic professional known for a combination of holistic teaching and research skills, academic leadership capabilities, and interpersonal strengths. He has extensive experience in Curriculum Development and Implementation, Course Coordination and Delivery, Teaching, Education Administration and research gained from over 25 years of employment history in the academics. He is an accomplished, visionary leader recognised for delivering safe, inclusive, and student focused learning environment whilst ensuring a school culture that encourages continuous improvement for teachers and students. He won a prize in GovHack, Australia’s largest open government and open data hackathon, 2019. His core competencies include Superior aptitude to strategize, execute and deliver tangible, measurable results that support and enable the organisation to achieve its academic goals and objectives, experienced in supervising higher degree research projects made possible by exceptional project management skills, extensive experience in providing academic leadership to staff members and leading and facilitating research and scholarly activities in line with the established vision, mission, values and objectives of the Institution. He has a robust record of team leadership, subject matter expertise, and relationship management and committed to delivering high quality education consistently. Dr. Y. Madhavee Latha was awarded Doctorate in Electronics and Communication Engineering by JNT University, Hyderabad, India, January 2009. She has pursued her Masters in “Digital systems and Computer Electronics” from JNT University College of Engineering, Hyderabad, India, 2002–2004. She has pursued her Bachelors of Technology in “Electronics and Communication Engineering” from JNT University College of Engineering, Hyderabad, India. Her Research Interests lies in the areas of. She has a vast experience of 22 years at various levels of Academic and Administrative Positions. She is presently working as Professor of ECE and Principal at Malla Reddy Engineering College for Women since July 2010. She served as “Professor and Head”, Department of ECE, Hyderabad at Malla Reddy College of Engineering and Technology. She has worked at G. Narayanamma Institute of Technology and Science, Department of ECE, Hyderabad for a period of 10 Years. She has many Awards to her credit. Few of them are Women Entreprenuer of top women’s College in India-2022 by Women Entreprenuer, INDIA, Bhishma Acharya Award by Bharat Education Excellence Awards, All India Level Biggest Award Ceremony of Education and Research, India’s 20 women Pragmatic Women Leaders in Education Industry 2021, “Most Innovative Principal Engineering Colleges in Telangana State Award” received from International Education Awards, Time Cyber Media Pvt. Ltd, “Best Principal Award for Excellence in Education” received from Global Leader Foundation, “Women in Education Award” received from Dewang Mehta National Educational Awards for being the inspiring woman in education,
Academic Excellence Award in the year 2007–08. She has 257 Google scholar citations (h-index 11,i10 index 12). She is an Editor for International Journal of Pure and Applied Mathematics (IJPAM), International Conferences on AICTE Sponsored “Signal Processing, Communications and System Design” (ICSPCOMSD). She is a Member of IEEE (92631973), Life Member of IETE (LM177648), Life Member of ISTE(LM33687),Member of CSI(5023220053). Dr. S. Srinivasa Rao was awarded Doctorate in Electronics and Communication Engineering by JNT University, Hyderabad, India, January 2013. He has pursued his Masters in “Digital Systems and Computer Electronics” from JNT University College of Engineering, Hyderabad, India, 2002–2004 .He has pursued his Bachelors of Technology in “Electronics and Communication Engineering” from Madras Institute of Technology, Anna University. He is a Research Guide for Ph.D. Scholars at Lincoln University, Kuala Lumpur, Malaysia, Ph.D. Examiner for Anna University, Chennai and is a Reviewer for Springer, Elsevier International Journals. He has over 24 years of Teaching Experience. He has been a Guest Faculty for BITS-Pilani MS Programme for Qualcomm Employees for the courses “Wireless and Mobile Communications” during August–December, 2012 and “Pervasive Computing Technology” during August–December, 2013. Best Teacher Award for three consecutive academic years 2005–06, 2006–07, 2007–08 from Aurora Group of Institutions, Hyderabad. Some of his best academic contributions include Projects Examiner of BITS Pilani MS Programme for Qualcomm Employees, Hyderabad and Proctor for NetMath Online program, Department of Mathematics, University of Illinois Urbana-Champaign, USA. He is a member of Professional bodies like IEEE, IETE and ISTE.
Contributors B. Abhiram Department of Information Technology, Sreenidhi Institute of Science and Technology (Autonomous), Hyderabad, Telangana, India M. N. Abinaya Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India Meseret Ademe Dilla University, Dilla, Ethiopia Srikrishna Adusumalli Shri Bhimavaram, India
Vishnu Engineering College for Women, Bhimavaram, India
D. Ajitha Department of Software Systems, School of Computer Science and Engineering (SCOPE), Vellore Institute of Technology, Vellore, India Wasim Akram Department of E.C.E, Sreenidhi Institute of Science and Technology, Hyderabad, India
G. F. Ali Ahammed Department of CSE, VTU Centre for Post Graduate Studies, Mysuru, Karnataka, India Giriprasad Ambati Department of Electrical and Electronics Engineering, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India K. S. Ananda Kumar Atria Institute of Technology, Bengaluru, India E. A. Mary Anita Christ (Deemed to be University), Bangalore, India G. Anudeep Goud CMR College of Engineering & Technology, Hyderabad, Telangana, India Yerrolla Aparna Department of CSE, CMR Technical Campus, Hyderabad, Telangana, India B. Aravind Balaji Hindustan Institute of Technology and Science, Chennai, India J. Vijay Arputharaj Department of Computer Science, CHRIST (Deemed to be University), Bangalore, India A. Arulmurugan Department of Computing Technologies, SRM Institute of Science and Technology, Kattankulathur, Chennai, India S. Arvind Department of CSE, HITAM, Hyderabad, Telangana, India Sudha Arvind Department of ECE, CMR Technical Campus, Hyderabad, Telangana, India P. Asha Sathyabama Institute of Science and Technology, Chennai, Tamilnadu, India M. Ashwin Department of Artificial Intelligence and Data Science, Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India Y. B. Avinash Malla Reddy University, Hyderabad, India Vinayak K. Bairagi Department of Electronics and Telecommunication, AISSMS Institute of Information Technology, Pune, India Dhruv Bajaj Department of Networking and Communications, SRM Institute of Science and Technology, Kattankulathur, Chennai, India Reshma Banu Department of CSE, VVIET, Mysuru, Karnataka, India Jyoti Bharti Department of CSE and IT, Maulana Azad National Institute of Technology Bhopal, Bhopal, India Harsh Bhasin Department of Computer Science and Engineering, Center for Health Innovations, Manav Rachna International Institute of Research and Studies, and Computer Science Engineering, MRIIRS, Faridabad, Haryana, India Nuthanakanti Bhaskar Department of Computer Science and Engineering, CMR Technical Campus, Hyderabad, India
Riju Bhattacharya Department of Computer Science and Engineering, National Institute of Technology Raipur, Raipur, India Ancha Bhavana Department of CSBS, R.V.R. & J.C. College of Engineering, Guntur, India Ram Bhupal Department of Electronics and Communication Engineering, Nagarjuna College of Engineering and Technology, Bengaluru, Karnataka, India Sunil Bhutada Department of Information Technology, Sreenidhi Institute of Science and Technology (Autonomous), Hyderabad, India Maitrayee Chakrabarty Department of Electrical Engineering, JIS College of Engineering, Kalyani, India Kolisetti Pavan Chandra Sathyabama Institute of Science and Technology, Chennai, Tamilnadu, India M. V. P. Chandra Sekhara Rao Department of CSBS, R.V.R. & J.C. College of Engineering, Guntur, India Kalyani Chapa Department of Computer Science and Engineering, GITAM (Deemed to be University), Visakhapatnam, Andhra Pradesh, India N. Charan Kumar Reddy Department of Computer Science and Engineering, Amrita School of Computing, AmritaVishwaVidyapeetham, Chennai, India Sumanth Chekuri CMR Technical Campus, Hyderabad, Telangana, India Sindia Choolakal Industry Professional, Marlabs, Bangalore, India Sritej Chowdary Department of Computer Science and Engineering, Amrita School of Computing, AmritaVishwaVidyapeetham, Chennai, India VishnuVardhan Dagumati Department of Computer Science and Engineering, Amrita School of Computing, AmritaVishwaVidyapeetham, Chennai, India Bhavya Sree Dakey Department of CSE, B V Raju Institute of Technology, Narsapur, Medak, Telangana, India Sujatha Dandu Department of Computer Science and Engineering, Malla Reddy College of Engineering and Technology, Hyderabad, India P. Deepak Department of Electronics and Communication Engineering, Nagarjuna College of Engineering and Technology, Bengaluru, Karnataka, India Srinivasarao Dharmireddi Department of Cyber Security, MasterCard, St. Louis, Missouri, USA Mohan Dholvan Department of Electronics and Computer Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, India G. Divya Department of IT, CMR Technical Campus, Hyderabad, Telangana, India
S. Divya Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, Telangana, India V. Divya Christ (Deemed to be University), Bangalore, India Harshith Doppalapudi Department of Computer Science and Engineering, Amrita School of Computing, AmritaVishwaVidyapeetham, Chennai, India Sanjay Kumar Dubey Department of CSE, Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, India Keerthi Durgaprashanth Sathyabama Institute of Science and Technology, Chennai, Tamilnadu, India Harish Babu Gade Department of ECE, CVR College of Engineering, Hyderabad, India Tharuni Gelli Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India Ashutosh Ghildiyal Department of CSE, Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, India Hampika Gorla Department of ECSE, KLEF, Vaddeswaram, Guntur, Andrapradesh, India; Department of ECE, Institute of Aeronautical Engineering, Hyderabad, India J. Tejaashwini Goud Department of Computer Science and Engineering, CMR Technical Campus, Hyderabad, India Challa Sri Gouri Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India Sanskar Gupta Department of Electronics and Communication Engineering, Ajay Kumar Garg Engineering College, Ghaziabad, UP, India E. Gurumoorthi CMR College of Engineering & Technology, Hyderabad, Telangana, India Aparna M. Harale Department of Electronics and Telecommunication, AISSMS Institute of Information Technology, Pune, India Ch. Harsha Vardhan Department of Information Technology, Sreenidhi Institute of Science and Technology (Autonomous), Hyderabad, Telangana, India K. Harshith Kumar Department of Electronics and Communication Engineering, Nagarjuna College of Engineering and Technology, Bengaluru, Karnataka, India Ahmed Abba Haruna Department of Computer Science, University of Hafr Al Batin, Hafr Al Batin, Saudi Arabia U. Hema Latha CMR College of Engineering & Technology, Hyderabad, Telangana, India
Veeramachinani Jahnavi Department of Electronics and Computer Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, India Sourabh Jain Department of CSE, IIIT, Sonepat, Haryana, India B. Janardhana Rao Department of ECE, CVR College of Engineering, Hyderabad, India Hasini Jangam Department of CSE, B V Raju Institute of Technology, Narsapur, Medak, Telangana, India Divya Sree Javvaji Department of CSE, B V Raju Institute of Technology, Narsapur, Medak, Telangana, India M. Jayanthi PSNA College of Engineering and Technology, Dindigul, Tamil Nadu, India P. Jayanthi Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India Madhumitha Jayaram Department of ECE, College of Engineering Guindy, Anna University, Chennai, India Satyabrata Jena Department of Pharmaceutics, Bhaskar Pharmacy College, Hyderabad, Telangana, India J. Jenefa Department of CSE, SOET, CHRIST (Deemed to Be University), Bengaluru, Karnataka, India Sirisha Jogu Department of CSE, B V Raju Institute of Technology, Narsapur, Medak, Telangana, India A. John Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan E. John Alex Department of ECE, CMR Institute of Technology, Secunderabad, Telangana, India M. K.Vidhyalakshmi Department of Computing Technologies, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Chengalpattu District, Tamil Nadu, India Krishna Prabeesh Kakarla SRM Institute of Science and Technology, Chennai, Tamil Nadu, India V. Kakulapati Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, Telangana, India K. N. Kamalnath Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India Harikrishna Kamatham Malla Reddy University, Hyderabad, India
M. Kanchana SRM Institute of Science and Technology, Chennai, Tamil Nadu, India S. Rakoth Kandan Christ (Deemed to be University), Bangalore, India Divya Darshini Kannan Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India Murali Kanthi Department of Computer Science and Engineering, CMR Technical Campus, Hyderabad, India Ch. Karthik Department of Electrical and Electronics Engineering, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Sharvirala Kethan Sathyabama Institute of Science and Technology, Chennai, Tamilnadu, India Dammalapati Chetan Sai Kiran SRM Institute of Science and Technology, Chennai, Tamil Nadu, India Abhinaya Koyyada CMR Technical Campus, Hyderabad, Telangana, India Ch. Krishna Chaitanya Department of Electronics and Communication Engineering, VNR VJIET, Hyderabad, Telangana, India Rajeswari Krishnasamy Department of ECE, College of Engineering Guindy, Anna University, Chennai, India Mayuri Kulshreshtha Department of Computer Science & Engineering, IMS Engineering College, Ghaziabad, Uttar Pradesh, India E. Saravana Kumar Department of Artificial Intelligence and Data Science, Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India; Department of Computer Science and Engineering, The Oxford College of Engineering, Bengaluru, Karnataka, India Pankaj Kumar Department of Pharmacology, Adesh Institute of Pharmacy and Biomedical Sciences, Bathinda, India Meghana Madabhushi Industry Professional, Marlabs, Bangalore, India Rajesh Mandala Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India B. Manjula Department of CS, Kakatiya University, Warangal, India S. Prince Mary Sathyabama Institute of Science and Technology, Chennai, Tamilnadu, India N. Meghana Department of Electrical and Electronics Engineering, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Abheer Mehrotra Manav Rachna University, Faridabad, Haryana, India
Divya Midhunchakkaravarthy Department of Computer Science and Multimedia, Lincoln University College, Kota Bharu, Malaysia Pranay Mohan VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Mohd Mohith CMR Technical Campus, Hyderabad, Telangana, India Naresh Kumar Nagwani Department of Computer Science and Engineering, National Institute of Technology Raipur, Raipur, India R. Ch. A. Naidu Department of Computer Science and Engineering, The Oxford College of Engineering, Bengaluru, Karnataka, India R. Lakshman Naik Department of CSE, IIIT, Sonepat, Haryana, India K Nainika VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India L. V. Narasimha Prasad Department of ECE, Institute of Aeronautical Engineering, Hyderabad, India Jonnadula Narasimharao Department of CSE, CMR Technical Campus, Hyderabad, Telangana, India Naresh Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, Telangana, India Adla Neha Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India S. Neha Reddy Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India D. Niharika Department of Electronics and Communication Engineering, Nagarjuna College of Engineering and Technology, Bengaluru, Karnataka, India Velicherla Chandra Obula Reddy Hindustan Science, Chennai, India
Institute
of
Technology
and
Venkata Krishna Odugu Department of ECE, CVR College of Engineering, Hyderabad, India Ansh Ohri Manav Rachna University, Faridabad, Haryana, India Naveen Chandra Paladugu Department of CSBS, R.V.R. & J.C. College of Engineering, Guntur, India Srividya Pasupathy Department of Electronics and Communication Engineering, R.V College of Engineering, Bengaluru, Karnataka, India Anitha Patibandla LPU, Phagwara, Punjab, India
Anudeep Peddi Department of CSBS, R.V.R. & J.C. College of Engineering, Guntur, India Ashwith Kumar Reddy Penubadi Department of Electronics and Communication Engineering, R.V College of Engineering, Bengaluru, Karnataka, India A. Mary Posonia Sathyabama Institute of Science and Technology, Chennai, Tamilnadu, India S. Prabakeran Department of Networking and Communication, SRMIST, Chennai, India V. R. Prakash Hindustan Institute of Technology and Science, Chennai, India T. Prashanth Department of Electrical and Electronics Engineering, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Naga Prasanth Kumar Reddy Puli Hindustan Institute of Technology and Science, Chennai, India V. S. G. N. Raju Department of Electronics and Communication Engineering, Sreenidhi Institute of Science a Technology, Hyderabad, Telangana, India Raghu Ramamoorthy Department of Artificial Intelligence and Data Science, Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India; Department of Computer Science and Engineering, The Oxford College of Engineering, Bengaluru, Karnataka, India G. Ramani Department of CSE, B V Raju Institute of Technology, Narsapur, Medak, Telangana, India T. Ramaswamy Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India Ch. Rami Reddy Department of Electrical and Electronics Engineering, Joginpally B. R. Engineering College, Hyderabad, India Mundlamuri Venkata Rao Department of Computer Science and Multimedia, Lincoln University College, Kota Bharu, Malaysia Bhramaramba Ravi Department of Computer Science and Engineering, GITAM (Deemed to be University), Visakhapatnam, Andhra Pradesh, India A. RaviKumar Department of Information Technology, Sreenidhi Institute of Science and Technology (Autonomous), Hyderabad, Telangana, India Kirti Rawal SEEE, LPU, Phagwara, Punjab, India Suddamalla Tirupathi Reddy Department of Computer Science Engineering, Maulana Azad National Institute of Technology Bhopal, Bhopal, India S. Rekha Department of ECE, Nalla Narasimha Reddy Group—School of Engineering, Secunderabad, Telangana, India
K. Revathi Sphoorthy Engineering College, Hyderabad, India Rohit Department of Pharmacy Practice, I.K. Gujral Punjab Technical University, Kapurthala, Punjab, India S. Rohith Department of Electronics and Communication Engineering, Nagarjuna College of Engineering and Technology, Bengaluru, Karnataka, India Bholanath Roy Department of Computer Science Engineering, Maulana Azad National Institute of Technology Bhopal, Bhopal, India M. Safa Department of Networking and Communications, SRM Institute of Science and Technology, Kattankulathur, Chennai, India V. Sagar Reddy Department of Electronics and Communication Engineering, VNR VJIET, Hyderabad, Telangana, India Sahith Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, Telangana, India Vamaraju Hari Hara Nadha Sai Department of CSE, CMR Technical Campus, Hyderabad, Telangana, India Chinta Gouri Sainath CMR College of Engineering & Technology, Hyderabad, Telangana, India Rakoth Kandan Sambandam Department of CSE, SOET, CHRIST (Deemed to Be University), Bengaluru, Karnataka, India B. Sannitha Department of Electrical and Electronics Engineering, VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India S. Santhana Prabha PSNA College of Engineering and Technology, Dindigul, Tamil Nadu, India P. K. Santhya Premdharshini Department of ECE, College of Engineering Guindy, Anna University, Chennai, India Dipu Sarkar Department of Electrical and Electronics Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur, India S. Sasikumar Hindustan Institute of Technology and Science, Chennai, India Ch. Sathwik Department of Information Technology, Sreenidhi Institute of Science and Technology (Autonomous), Hyderabad, Telangana, India B. Satish Department of ECE, CVR College of Engineering, Hyderabad, India A. Senthil Kumar Department of Computer Science, Skyline University Nigeria, Kano, Nigeria; Shri Vishnu Engineering College for Women, Bhimavaram, India Radha SenthilKumar Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India
Gaurav Sethi SEEE, LPU, Phagwara, Punjab, India Md. Shabber Department of CSE, B V Raju Institute of Technology, Narsapur, Medak, Telangana, India Subhani Shaik Department of Information Technology, Sreenidhi Institute of Science and Technology (Autonomous), Yamnampet, Ghatkesar, Hyderabad, Telangana, India D. Shanthi PSNA College of Engineering and Technology, Dindigul, Tamil Nadu, India S. Shanthi Department of CSE, Malla Reddy College of Engineering and Technology, Hyderabad, Telangana, India Shubham Sharma Department of CSE, Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, India G. Shine Let Department of ECE, Karunya Institute of Technology and Science, Coimbatore, Tamil Nadu, India Satya Shivani Industry Professional, Marlabs, Bangalore, India G. Shreya Department of Electronics and Communication Engineering, Sreenidhi Institute of Science a Technology, Hyderabad, Telangana, India Rajeev Shrivastava Princeton Institute of Engineering and Technology for Women, Hyderabad, India Shrasti Sanjay Shukla Department of Electronics and Communication Engineering, Ajay Kumar Garg Engineering College, Ghaziabad, UP, India Aishani Singh Department of Computing Technologies, SRM Institute of Science and Technology, Kattankulathur, Chennai, India Banda Srikanth University College of Engineering, Kakatiya University, Kothagudem, Telangana State, India Amuri Srinidhi Department of Electronics and Computer Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, India Karunesh Srivastava Department of Electronics and Communication Engineering, Ajay Kumar Garg Engineering College, Ghaziabad, UP, India B. Sriya Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, Telangana, India K. Srujan Raju Department of Computer Science and Engineering, CMR Technical Campus, Hyderabad, Telangana, India C. Srujana Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, Telangana, India
S. P. V. Subba Rao Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India K. Sudhakar Sri Vishnu Institute of Technology, Kovvada, AP, India V. Sujatha Department of Networking and Communication, SRMIST, Chennai, India Boddula Supraja Department of Electronics and Communication Engineering, VNR VJIET, Hyderabad, Telangana, India Swethan Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, Telangana, India Srinivas Talasila VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India S. Thaiyalnayaki Department of CSE, Bharath Institute of Higher Education and Research (Deemed to Be University), Chennai, Tamilnadu, India B. Tharun Reddy Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India Samrat Tirukkovalluri Department of E.C.E, Sreenidhi Institute of Science and Technology, Hyderabad, India Mihir Tomar Department of CSE, Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, India Sarsij Tripathi Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology Allahabad, Allahabad, India Navya Sree Vallandas CMR Technical Campus, Hyderabad, Telangana, India M. Vamshi Kumar Department of Electronics and Communication Engineering, VNR VJIET, Hyderabad, Telangana, India Padala Vanitha Department of ECE, Malla Reddy College of Engineering and Technology, Hyderabad, Telangana State, India M Veda Sahithi VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India N. Venkat Ram Department of ECSE, KLEF, Vaddeswaram, Guntur, Andhra Pradesh, India
M. Venkata Subbarao Shri Vishnu Engineering College for Women, Bhimavaram, India Divya Vetriveeran Department of CSE, SOET, CHRIST (Deemed to Be University), Bengaluru, Karnataka, India B. S. Vidhyasagar Department of Computer Science and Engineering, Amrita School of Computing, AmritaVishwaVidyapeetham, Chennai, India
E Vijaya Babu VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India Gurrala Vijaya Kumar VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India D. Vinodha Christ (Deemed to be University), Bangalore, India Mahmud El Yakub Department of Computer Science, Skyline University Nigeria, Kano, Nigeria
Review and Design of Integrated Dashboard Model for Performance Measurements J. Vijay Arputharaj , Mahmud El Yakub , Ahmed Abba Haruna , and A. Senthil Kumar
Abstract This article presents a new approach for performance measurement in organizations, integrating the analytic hierarchy process (AHP) and objective matrix (OM) with the balanced scorecard (BSC) dashboard model. This comprehensive framework prioritizes strategic objectives, establishes performance measures, and provides visual representations of progress over time. A case study illustrates the method’s effectiveness, offering a holistic view of organizational performance. The article contributes significantly to performance measurement and management, providing a practical and comprehensive assessment framework. Additionally, the project focuses on creating an intuitive dashboard for Fursa Foods Ltd. Using IoT technology, it delivers real-time insights into environmental variables affecting rice processing. The dashboard allows data storage, graphical representations, and other visualizations using Python, enhancing production oversight for the company. Keywords Analytic hierarchy process (AHP) · Objective matrix (OM) · Balanced scorecard (BSC) · Performance measurement · Performance management
1 Introduction

1.1 Introduction to Study

A business dashboard is a data visualization tool that provides a real-time view of a company's overall performance by displaying key performance indicators (KPIs) and other crucial information in graphical form. It enables companies to track their KPIs in real time, making it simpler to spot problems and act quickly to fix them; it helps them identify performance trends over time; it enhances communication by offering a visual representation of data that is simple to understand and interpret; and it supports data-driven decisions by giving decision-makers access to crucial data in real time.
1.2 Overview of Study

Organizations constantly seek ways to gain competitive advantages and boost their business. One way to achieve this is by monitoring, evaluating, and managing strategies that improve management performance. Measurement activities, based on a study of the causes and effects of each action, are important for determining the organization's operational strengths and shortcomings. Tools for performance measurement play a critical role in regularly assessing the effects, triggers, and influences of organizational actions. The balanced scorecard (BSC), the European Foundation for Quality Management (EFQM) Business Excellence Model, the Performance Measurement Matrix, the Performance Pyramid, the Performance Prism, and the Kanji Business Excellence Management System (KBEMS) are just a few of the performance measurement tools that have been created. BSC and the EFQM Business Excellence Model are the two most often used as performance management systems. These systems provide a structured method for spotting potential threats and strategy adjustments as well as for translating corporate strategy into targets, resulting in a more comprehensive and cost-effective action plan [1]. However, some flaws in BSC and KBEMS have been discovered, particularly when strategy measurement was put into practice. BSC provides advantages over other models when it comes to presenting performance characteristics from a new perspective to improve the organization's business outcomes, and it has been used by many different types of organizations, both for-profit and non-profit. Despite this, it was discovered that the scorecard determination process and its analytical estimation were flawed: the importance, subjectivity, and in-depth study of scorecards produced from managers' perspectives are frequently skewed. To overcome these weaknesses, the analytic hierarchy process (AHP) has been introduced in this research. AHP combines qualitative and quantitative assessment techniques, eliminating the drawbacks of either type of technique used alone. This idea has been applied in several studies to evaluate organizational performance, including integrating fuzzy AHP and BSC approaches to evaluate Taiwanese manufacturing firms, researching the spread of AHP and BSC in Nepal, using AHP and BSC in a hotel company, and integrating AHP with BSC in the information technology sector. AHP provides a decision-making process that takes into account experience, intuition, and actual data, increasing redundancy and reducing some errors.
This study computed the total multifactor performance index using the objective matrix (OMAX) as a supplement to AHP scorecard estimation. OMAX is a productivity assessment system that monitors business productivity using criteria that are in line with business goals; each criterion's level of effectiveness and efficiency can be calculated. Lastly, matrix performance indicators are assessed and divided into several groups, such as very poor, poor, medium, good, and very good, based on their values. The uniformity of computation makes it possible for stakeholders to monitor the performance of key performance indicators (KPIs) and produce a single performance assessment score, which enables managers to pinpoint variations in strategy performance. The BSC performance measurement value is normalized and transformed into a performance index using the OMAX function. By merging BSC and OMAX, overwhelming data can be described and analysis made more relevant; the integration is, however, constrained by the quantity of comparisons and environment analysis and is thus limited to AHP standards and regulations. In summary, businesses need to oversee, evaluate, and manage their management performance improvement initiatives on a regular basis. Performance measurement methods such as BSC and the EFQM Business Excellence Model can be employed to transform corporate strategy into targets and to identify potential risks and strategy modifications, resulting in a more thorough and reasonably priced action plan. By fusing AHP and OMAX, these tools' shortcomings can be addressed. This integration provides a method of decision-making that considers experience, intuition, and actual data, allowing management to detect changes in strategy performance and providing intelligent analysis of voluminous data. A minimal computational sketch of this AHP and OMAX combination is given below.
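To make the AHP and OMAX steps concrete, the following is a minimal Python sketch, not the authors' implementation: it derives criterion weights from an assumed pairwise comparison matrix via the principal eigenvector, maps each KPI onto an OMAX-style 0-10 scale between an assumed worst and target value, and combines the results into a single weighted performance index. The comparison matrix, KPI names, and level boundaries are illustrative assumptions.

```python
import numpy as np

# Hypothetical AHP pairwise comparison matrix for three BSC criteria
# (financial, customer, internal process); values are illustrative only.
comparison = np.array([
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
])

# AHP weights: principal eigenvector of the comparison matrix, normalized.
eigvals, eigvecs = np.linalg.eig(comparison)
principal = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
weights = principal / principal.sum()

def omax_level(value, worst, target):
    """Map a KPI value onto an OMAX-style 0-10 scale between worst and target."""
    level = 10 * (value - worst) / (target - worst)
    return float(np.clip(level, 0, 10))

# Hypothetical KPI readings with their assumed worst and target values.
kpis = {
    "revenue_growth": (7.5, 0.0, 10.0),
    "customer_satisfaction": (82.0, 60.0, 95.0),
    "process_defect_rate": (1.8, 5.0, 0.5),  # lower is better, so worst > target
}

levels = [omax_level(v, w, t) for v, w, t in kpis.values()]
performance_index = sum(w * l for w, l in zip(weights, levels))
print("AHP weights:", np.round(weights, 3))
print("OMAX levels:", np.round(levels, 2))
print("Overall performance index (0-10):", round(performance_index, 2))
```

In OMAX proper each criterion has ten discrete levels anchored to historical and target performance; the linear interpolation above is only a simplification for illustration.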
2 Literature Review

2.1 Literature Review Related to the Topic

Business dashboards have grown in popularity over the past several years as a result of the increased amount of data that firms have access to. In industrialized nations, corporate dashboards are frequently used and have been proven successful in enhancing business performance. According to research, business dashboards can be used to improve decision-making, productivity, income, and cost savings. They can also assist firms in seeing patterns, trends, and abnormalities in their data, enabling them to take preventative action before a problem arises. A review of recent literature related to the domain, including an intelligent framework for a smart traffic management system, is presented in Table 1.
Table 1 Review of literature (recent)

Articles related to domain | Research gaps
1. IoT Dashboard for Hazmat Emergency Response Optimization | The IoT Dashboard for Hazmat Emergency Response Optimization is a robust solution designed to mitigate risks, enhance situational awareness, and safeguard the well-being of first responders, ultimately contributing to more effective and coordinated emergency response efforts in hazardous material incidents. Need for better integration of data-driven approaches into industrial decision-making processes
2. Intelligent Framework for Advanced Traffic Management Systems Tailored to Urban Mobility Challenges | The intelligent framework is designed to revolutionize urban traffic management by harnessing the power of emerging technologies, data analytics, and adaptive strategies, but there are some challenges in managing and analyzing large-scale IoT data
3. Development of an Industry 4.0 Cyber-Physical System Dashboard: Integration of IoT and Big Data Technologies for Enhanced Operational Insights | The resulting dashboard is anticipated to provide a holistic view of industrial processes, facilitate data-driven decision-making, and contribute to the advancement of smart and efficient manufacturing practices within the Industry 4.0 paradigm, but there is a need for IoT and big data technologies in Industry 4.0
4. Integration of IoT, Edge Computing, and Big Data Analytics for Smart Meter Data Processing | By leveraging the capabilities of IoT, Edge Computing, and Big Data Analytics, it aims to provide real-time insights into energy consumption patterns, ultimately contributing to more intelligent and responsive energy management practices. There is a need for user-friendly and customizable dashboard designs
2.2 Literature Review Related to the Technologies Harun et al. present a survey of software testing practices in Malaysia. The study aims to identify the current state of software testing practices and challenges faced by software developers and testers in Malaysia. The survey was conducted on 94 software development companies in Malaysia, and the results show that most of the companies use manual testing techniques and lack automation testing skills. The article highlights the importance of improving software testing practices and promoting the use of automated testing tools in Malaysia [2]. Babar et al. research article presents an empirical study on software testing practices and challenges in global software development. The study was conducted on 120 software development companies in 20 countries, and the results show that software testing practices vary significantly across different countries. The article highlights the need for standardizing software testing practices and promoting the use of agile methodologies
and automated testing tools in global software development [3]. The efficiency of defect detection is examined empirically with respect to test case attributes in this research work. The research involved 13 open-source software projects, and the findings indicate that the efficacy of defect detection is significantly impacted by the amount and complexity of test cases. The post emphasizes how crucial it is to create efficient test cases for enhancing software quality [4]. Gorla et al. present an empirical investigation of the effects of test suite reduction on fault localization. The study, which examined ten open-source software initiatives, discovered that fault localization might be made more effective by using a smaller test suite. The essay emphasizes how crucial test suite optimization is for lowering testing expenses and raising product quality [5]. Rodrigues et al. provide a systematic literature review of research related to software testing in the cloud. It examines the challenges and opportunities of testing in a cloud environment and discusses various techniques and tools for testing cloud-based applications [6]. Shang et al. propose an automated GUI testing approach that uses artificial intelligence techniques to improve the efficiency and effectiveness of testing. It describes a tool called AI Test that employs machine learning algorithms to learn from human testers’ behavior and generate test cases automatically [7]. Khanna and Gupta present a systematic literature review of research related to testing concurrent software. It discusses the challenges and techniques of testing concurrent software and provides an overview of various tools and frameworks that have been developed for this purpose [8]. Goncalves et al. an organized review of the literature on research into the automated testing of Android applications is presented in this article. It talks about the difficulties in testing mobile applications and gives an overview of several automated testing methods and tools created for Android applications [9]. Lind and Musser provide key considerations for effective dashboard design. The authors argue that good dashboard design involves identifying the purpose and audience of the dashboard, selecting the right data visualization tools, and organizing the dashboard in a way that is intuitive and easy to use [10]. Mayorga and Duchowski examine how design elements such as color, typography, and layout can impact users’ emotions and perceptions of dashboard usability. The authors argue that design elements can be used to influence users’ cognitive and affective responses to dashboard information [11]. Hemphill and Long provide practical advice on designing and implementing effective dashboards. The authors argue that good dashboard design involves identifying key performance indicators (KPIs), using appropriate data visualization techniques, and designing dashboards that are easy to use and interpret [12]. Alshehri et al. present a comparative study of different dashboard design techniques and their impact on user perception and performance. The authors argue that dashboard design should take into account users’ cognitive load, attentional demands, and task requirements to optimize user performance and satisfaction [13]. Qin and Zhou provide an overview of existing dashboard design guidelines and best practices. The authors argue that good dashboard design involves considering the needs and preferences of the target audience, selecting appropriate
data visualization techniques, and designing dashboards that are easy to navigate and interpret [14]. One of the biggest developments in software testing methods has been the move toward agile methodology. Iterative development is a component of agile development, where software is created in brief bursts and tested frequently. This approach allows for faster time to market and more frequent releases, which is essential in today’s competitive environment [15]. The use of agile methodologies has led to a more collaborative approach to software development, with developers, testers, and stakeholders working together to ensure the quality of software products. Continuous testing is another critical change in software testing approaches. Continuous testing involves the integration of testing into the software development process, with testing occurring at every stage of development. With this method, there is less chance that faults will be discovered too late in the development cycle because they may be recognized and fixed early on. Continuous testing also facilitates the automation of testing, enabling faster, and more reliable testing results [16]. The use of automation and AI in software testing is also a significant change in software testing approaches. Test automation involves the use of tools and scripts to automate the execution of tests, reducing the need for manual testing. This approach not only saves time but also improves the accuracy and reliability of test results. Artificial intelligence is also being used in software testing, with the development of intelligent testing tools that use machine learning algorithms to learn from previous test results and optimize testing processes [17].
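As a small, hedged illustration of the test automation and continuous testing practices surveyed above (not taken from any of the reviewed studies), the following pytest sketch exercises a hypothetical Flask dashboard route; the app factory name and endpoints are assumptions made for illustration only.

```python
# test_dashboard.py -- a minimal automated test sketch (names are assumptions).
import pytest
from myapp import create_app  # hypothetical Flask application factory

@pytest.fixture
def client():
    app = create_app(testing=True)
    with app.test_client() as client:
        yield client

def test_dashboard_page_loads(client):
    # An automated check that the dashboard route responds successfully.
    response = client.get("/dashboard")
    assert response.status_code == 200

def test_sensor_api_returns_json(client):
    # Continuous testing would run checks like this on every commit.
    response = client.get("/api/readings/latest")
    assert response.is_json
```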
3 Methodology

3.1 Architecture

A Flask dashboard application's architecture can be modeled after a client–server model or another common web application architecture. The client side is made up of the web browser, which communicates with the Flask server to display and interact with the dashboard. The architecture of the proposed work is shown in Fig. 1.

Fig. 1 Architecture of proposed work

3.2 Flow and Working Mechanism

A Flask dashboard application's flow typically includes the following actions (a minimal sketch of this request cycle follows the list):
1. Through a web browser, the user views the dashboard.
2. The request is received by the Flask server, which forwards it to the proper view function.
3. The necessary data is retrieved or processed by the view function from a variety of sources, including databases and external APIs.
4. The processed data is rendered with templates and sent back to the client side.
5. The rendered data is shown in the web browser, which also offers interactive features.
A flow of the proposed work is shown in Fig. 2.
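The following is a minimal, hedged Flask sketch of the request cycle described above; the route names, template file, and database are illustrative assumptions rather than the application actually deployed by the company.

```python
from flask import Flask, render_template, jsonify
import sqlite3

app = Flask(__name__)

def fetch_latest_readings():
    # Step 3: the view function retrieves data from a source such as a database.
    conn = sqlite3.connect("sensors.db")  # assumed local database file
    rows = conn.execute(
        "SELECT sensor, value, recorded_at FROM readings "
        "ORDER BY recorded_at DESC LIMIT 20"
    ).fetchall()
    conn.close()
    return [{"sensor": s, "value": v, "recorded_at": t} for s, v, t in rows]

@app.route("/dashboard")
def dashboard():
    # Steps 1-2: the browser requests the page and Flask dispatches to this view.
    readings = fetch_latest_readings()
    # Step 4: the data is rendered into an HTML template and returned.
    return render_template("dashboard.html", readings=readings)

@app.route("/api/readings/latest")
def latest_readings():
    # A JSON endpoint the browser can poll for real-time updates (step 5).
    return jsonify(fetch_latest_readings())

if __name__ == "__main__":
    app.run(debug=True)
```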
Fig. 2 Flow of proposed work

3.3 Modules

Depending on the unique needs, different modules can be utilized in a Flask dashboard application. Modules that are often used include the following (a small wiring sketch is given after the list):
1. Flask: the essential component for creating a Flask web application.
2. Flask-Login: offers user authentication and session management features.
3. Flask-WTF: allows for the handling and development of web forms using WTForms and Flask.
4. Flask-SQLAlchemy: for database operations, it integrates SQLAlchemy, a potent SQL toolkit and ORM.
5. Pandas: a Python data analysis and manipulation toolkit that may be used to preprocess and alter data before showing it on the dashboard.
6. Data visualization libraries such as Plotly or Matplotlib: for constructing interactive graphs, charts, and visualizations within the dashboard.
7. Other modules based on specialized needs, such as Flask-Mail for email capabilities, Flask-RESTful for creating a RESTful API, or Flask-Caching for data caching.
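As a hedged illustration of how these modules fit together (the model, table, and chart below are assumptions, not the paper's actual schema), a Flask-SQLAlchemy model can feed a Pandas frame that Plotly turns into an embeddable chart:

```python
from flask import Flask, render_template_string
from flask_sqlalchemy import SQLAlchemy
import pandas as pd
import plotly.express as px

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///sensors.db"  # assumed database
db = SQLAlchemy(app)

class Reading(db.Model):
    # Hypothetical table of environmental readings.
    id = db.Column(db.Integer, primary_key=True)
    sensor = db.Column(db.String(32))
    value = db.Column(db.Float)
    recorded_at = db.Column(db.DateTime)

@app.route("/charts/temperature")
def temperature_chart():
    # Load readings into Pandas, then render a Plotly line chart as embeddable HTML.
    query = Reading.query.filter_by(sensor="temperature").statement
    frame = pd.read_sql(query, db.engine)
    fig = px.line(frame, x="recorded_at", y="value", title="Temperature over time")
    return render_template_string("<div>{{ chart|safe }}</div>",
                                  chart=fig.to_html(full_html=False))

if __name__ == "__main__":
    with app.app_context():
        db.create_all()
    app.run(debug=True)
```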
4 Results

4.1 Comparisons Between Existing Cloud-Based Systems and IoT Dashboards

The development and assessment of the Flask application for the IoT dashboard for monitoring temperature, humidity, light intensity, and CCTV footage in production organizations are shown in this section. The outcomes are arranged according to the major components of the development of the application, such as functionality, usability, performance, and security. Table 2 shows the comparative analysis of the existing and the current IoT-based system. The temperature, humidity, light intensity, and CCTV modules are effectively integrated with the Flask application, enabling real-time monitoring of these variables. The software precisely presents the gathered data on the IoT dashboard, giving users in-depth insight into the working environment. The integration with other systems, including the ERP system for data interchange, was also implemented successfully.
Table 2 Comparative analysis of existing and current IoT-based system

Existing system (cloud) | Current IoT-based system
Limited monitoring capabilities for temperature only | Added monitoring for humidity, light intensity, and CCTV footage
Basic user interface with limited functionalities | Enhanced user interface with intuitive design and improved usability
Manual data retrieval and analysis | Real-time data updates and data visualization tools for quick analysis
Limited integration options with external systems | Seamless integration with ERP system for data exchange
Lack of security features and vulnerability to breaches | Implemented secure authentication, access controls, and data encryption
No mobile access | Developed a mobile application for remote monitoring
No support or documentation | Comprehensive documentation and responsive customer support
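To make the IoT integration described in these results more concrete, the following is a hedged sketch (the endpoint names, payload fields, and in-memory store are assumptions, not the deployed system) of how an IoT gateway could push temperature, humidity, and light readings into the Flask application for display on the dashboard:

```python
from datetime import datetime
from flask import Flask, request, jsonify

app = Flask(__name__)
readings = []  # in-memory store for illustration; a database would be used in practice

@app.route("/api/readings", methods=["POST"])
def ingest_reading():
    # An IoT gateway posts JSON such as:
    # {"sensor": "humidity", "value": 61.2, "device_id": "mill-3"}
    payload = request.get_json(force=True)
    if payload.get("sensor") not in {"temperature", "humidity", "light"}:
        return jsonify({"error": "unknown sensor"}), 400
    record = {
        "sensor": payload["sensor"],
        "value": float(payload["value"]),
        "device_id": payload.get("device_id", "unknown"),
        "received_at": datetime.utcnow().isoformat(),
    }
    readings.append(record)
    return jsonify(record), 201

@app.route("/api/readings/<sensor>")
def list_readings(sensor):
    # The dashboard polls this endpoint to refresh its charts.
    return jsonify([r for r in readings if r["sensor"] == sensor][-50:])
```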
5 Conclusion

In conclusion, this article demonstrates how an industry effectively created a business dashboard with IoT capabilities for monitoring manufacturing facilities. By switching from a PHP-based system to a Flask-based application, the company harnessed the power of Flask's lightweight framework and seamlessly integrated IoT devices. The newly developed Flask application addressed the limitations of the previous system by providing real-time monitoring, easy IoT device connectivity, improved user experience, and enhanced security measures. The transition to Flask proved to be a strategic decision, resulting in a more efficient, scalable, and future-proof solution. Extensive testing confirmed the Flask application's outstanding performance, usability, security, and functionality. The application's intuitive data visualization, real-time updates, reliable tracking, and prompt notifications for critical events empowered Fursa Foods Limited to enhance operational effectiveness and product quality, facilitating informed decision-making and proactive problem-solving.
The successful implementation of the Flask application underscores the importance of incorporating IoT functionality into corporate dashboards. With real-time insights into temperature, humidity, light intensity, and CCTV footage, Fursa Foods Limited can now maintain optimal production conditions and ensure compliance with regulations. Moreover, the project serves as a valuable reminder of the significance of choosing appropriate frameworks and technologies for application development. Flask's efficiency, scalability, and strong community support provided a solid and adaptable foundation for rapid development and future growth.
References
1. Liew TS, Hui WS (2015) Integrated analytical hierarchy process and objective matrix in balanced scorecard dashboard model for performance measurements. J Intell Manuf 26(5):981–991. https://doi.org/10.1007/s10845-014-0934-7
2. Harun NZ, Abas H, Abas NH (2015) A survey of software testing practices in Malaysia. J Softw Eng Appl 8(11):545–557. https://doi.org/10.4236/jsea.2015.811053
3. Babar MA, Brohi SN, Mendes E (2010) An empirical study on software testing practices and challenges in global software development. Inf Softw Technol 52(11):1191–1204. https://doi.org/10.1016/j.infsof.2010.07.004
4. Xia X, Yue T, Briand L, Labiche Y (2014) Exploring the relationship between test case properties and defect detection effectiveness. IEEE Trans Softw Eng 40(3):267–281. https://doi.org/10.1109/TSE.2013.22
5. Gorla A, Gross F, Zeller A (2012) An empirical study of the effects of test suite reduction on fault localization. IEEE Trans Softw Eng 38(2):272–290. https://doi.org/10.1109/TSE.2011.47
6. Rodrigues MA, Vieira M, Maldonado JC (2019) A systematic literature review of software testing in the cloud. Inf Softw Technol 108:33–53. https://doi.org/10.1016/j.infsof.2018.11.005
7. Shang C, Chen Y, Zhou Z (2017) Automated GUI testing using artificial intelligence. IEEE Trans Softw Eng 43(1):26–48. https://doi.org/10.1109/TSE.2016.2614957
8. Khanna R, Gupta RK (2016) Testing concurrent software: a systematic literature review. ACM Comput Surv 49(3), Article 53. https://doi.org/10.1145/2884781
9. Goncalves MJ, Fernandes JP, Rocha HG (2018) Automated testing of Android applications: a systematic literature review. J Syst Softw 146:169–182. https://doi.org/10.1016/j.jss.2018.10.032
10. Hasan I, Javed MA (2019) Continuous testing in DevOps. In: 2019 IEEE 19th international conference on software quality, reliability and security companion (QRS-C)
11. Lind JD, Musser JD (2017) Key considerations for effective dashboard design. J AHIMA 88(1):14–19. PMID: 28125795
12. Mayorga R, Duchowski AT (2019) Dashboard design: the effect of emotion and perception on user experience. Int J Human Comput Interact 35(17):1567–1580. https://doi.org/10.1080/10447318.2019.1597525
13. Hemphill CL, Long BL (2016) Designing and implementing effective dashboards. J AHIMA 87(7):40–45. PMID: 27427633
14. Alshehri S, Al-Yahya M, Almutairi M, Al-Nuaim H, Almudimigh A (2020) Dashboard design: a comparative study of user perception and performance. Int J Hum Comput Interact 36(17):1613–1623. https://doi.org/10.1080/10447318.2020.1827647
15. Qin J, Zhou Z (2019) A survey of dashboard design guidelines. J Vis 22(1):1–15. https://doi.org/10.1007/s12650-018-0523-3
16. Alshammari M, Alshammari T (2018) Agile testing: an overview. J King Saud Univ Comput Inf Sci 30(3):363–370
17. Duvvuri S, Gargeya VB (2020) The role of artificial intelligence in software testing: a review. In: Proceedings of the 2020 4th international conference on computing and artificial intelligence, pp 128–133
An Empirical Analysis of Lung Cancer Detection and Classification Using CT Images Aparna M. Harale and Vinayak K. Bairagi
Abstract Lung cancer is the most common cancer-related cause of mortality in both men and women. According to estimates, this condition affects 1.2 million people annually, and in one year the illness claimed the lives of about 1.1 million people. If cancer is discovered in its early stages, many lives can be saved. The early identification of lung cancer is nevertheless a difficult endeavour, and 80% of individuals who receive an accurate diagnosis have cancer that is already in its middle or advanced stages. A radiologist can swiftly and accurately diagnose anomalies with the aid of a computer-aided diagnosis (CAD) system. Lung cancer is identified after an examination of pulmonary nodules, and the detection of pulmonary nodules is performed using computed tomography (CT) images, which are clearer, less noisy, and largely distortion-free. A great deal of research is being done in this area, and several research focuses can be recognized on the various components of the CAD system. Various image acquisition, pre-processing, segmentation, lung nodule detection, false positive reduction, and lung nodule classification strategies are compiled in this study. The paper provides an overview of the literature on lung nodule detection for the last ten years. Keywords Nodule detection · Nodule · Lung cancer · Computer-aided diagnosis · Computed tomography
1 Introduction

Uncontrolled growth and spread of abnormal cells characterize the group of diseases known as cancer, which if untreated can be dangerous. Lung cancer is one of the most serious illnesses in the world; its annual death rate exceeds that of other cancer categories such as breast, prostate, and brain cancer. Lung cancer has the highest incidence rate of death among those between the ages of 45 and 70 [1]. The most prevalent cancer in both sexes is lung cancer. According to statistics of the American Cancer Society [1], lung cancer is the most common cause of cancer-related mortality in the USA. In 2022, an estimated 236,740 new cases of lung cancer (118,830 in women and 117,910 in men) are expected, accounting for approximately 14% of all cancer diagnoses [1]. Approximately 130,180 deaths due to lung cancer are expected in 2022 [1]. The majority of lung cancer cases are classified as either small cell lung cancer (SCLC, about 14%) or non-small cell lung cancer (NSCLC, about 82%). Compared to other cancers, lung cancer claims more lives in both sexes. Table 1 presents the estimated new cancer cases and estimated deaths by sex for different cancers in the USA for the year 2022 [1]. Figure 1 charts cancer type versus estimated new cases and estimated deaths for both sexes, Fig. 2 shows the analysis of estimated new cases in 2022 separately for males and females, and Fig. 3 shows the analysis of estimated deaths for males and females. The lung cancer risk (%) by sex at various age intervals for developing invasive malignancy is represented in Table 2. The SEER Cancer Statistics analysis states that 19% of patients are still alive after five years: 23% for women and 16% for men overall [1]. Early analysis of the lung malignancy is the key to improving the 1–5 year survival rate of 57%; it can be raised much higher, to between 65 and 80%. To win the battle against lung cancer, significant research efforts are focused on the early diagnosis of lung nodules [2]. Between 2008 and 2017, men's mortality rates declined by 51% and women's death rates decreased by 26% as a result of the decline in smoking prevalence [1].
Table 1 Estimated new cancer cases and estimated deaths by sex for different cancers, in USA, year 2022 [1]

Cancer | Estimated new cases (Female / Male / Both sexes) | Estimated deaths (Female / Male / Both sexes)
All sites | 934,870 / 983,160 / 1,918,030 | 287,270 / 322,090 / 609,360
Breast | 287,850 / 2710 / 290,560 | 43,250 / 530 / 43,780
Digestive system | 149,690 / 193,350 / 343,040 | 71,980 / 99,940 / 171,920
Lung and bronchus | 118,830 / 117,910 / 236,740 | 61,360 / 68,820 / 130,180
Genital system | 115,130 / 280,470 / 395,600 | 32,830 / 35,430 / 68,260
Skin | 45,660 / 62,820 / 108,480 | 3930 / 8060 / 11,990
Lymphoma | 40,320 / 48,690 / 89,010 | 8920 / 12,250 / 21,170
Endocrine system | 33,430 / 13,620 / 47,050 | 1680 / 1650 / 3330
Oral cavity and pharynx | 15,300 / 38,700 / 54,000 | 3360 / 7870 / 11,230
Brain and other nervous system | 10,880 / 14,170 / 25,050 | 7570 / 10,710 / 18,280
Eye and orbit | 1570 / 1790 / 3360 | 190 / 220 / 410
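Figures 1-3 visualize the Table 1 statistics as bar charts. As a hedged illustration (not the authors' plotting code), a grouped bar chart of the kind shown in Fig. 1 can be produced from the "Both sexes" columns above:

```python
import matplotlib.pyplot as plt
import numpy as np

# "Both sexes" totals taken from Table 1.
cancers = ["All sites", "Breast", "Digestive", "Lung and bronchus", "Genital",
           "Skin", "Lymphoma", "Endocrine", "Oral cavity", "Brain/CNS", "Eye"]
new_cases = [1918030, 290560, 343040, 236740, 395600, 108480,
             89010, 47050, 54000, 25050, 3360]
deaths = [609360, 43780, 171920, 130180, 68260, 11990,
          21170, 3330, 11230, 18280, 410]

x = np.arange(len(cancers))
width = 0.4
fig, ax = plt.subplots(figsize=(10, 4))
ax.bar(x - width / 2, new_cases, width, label="Estimated new cases (both sexes)")
ax.bar(x + width / 2, deaths, width, label="Estimated deaths (both sexes)")
ax.set_xticks(x)
ax.set_xticklabels(cancers, rotation=45, ha="right")
ax.set_ylabel("Count")
ax.set_title("Cancer vs estimated new cases and estimated deaths (2022)")
ax.legend()
plt.tight_layout()
plt.show()
```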
Fig. 1 Chart of cancer versus estimated new cases and estimated deaths (both sexes; vertical axis in thousands)

Fig. 2 Analysis of estimated new cases in male and female

A lung nodule with practically spherical opaqueness, a relatively well-defined border, and a maximum diameter of less than 3 cm can be seen on the CT images.
Fig. 3 Analysis of estimated deaths in male and female
Table 2 Lung cancer risk (%) by sex at various age intervals for developing invasive malignancy

Sex versus age | Male | Female
0–49 | 0.1 (1 in 812) | 0.1 (1 in 690)
50–59 | 0.6 (1 in 169) | 0.6 (1 in 175)
60–69 | 1.7 (1 in 59) | 1.4 (1 in 71)
70 and older | 5.7 (1 in 17) | 4.8 (1 in 21)
Birth to death | 6.4 (1 in 16) | 6.0 (1 in 17)
The goal is to create an automatic CAD system for lung cancer that enhances detection efficiency and decreases the evaluation time of the radiologist; early detection of lung cancer nodules increases the patient's probability of surviving by more than 5 years. The results from the computer-aided detection system are used as an alternative perspective to help radiologists reach their decisions and to reduce the cost of needless health examinations (for example, by preventing unnecessary lung biopsies). The purpose of the present research is to review various automatic CAD system methodologies for lung cancer analysis. Section 1 provides an introduction, Sect. 2 discusses the review work, and Sect. 3 presents conclusions.
2 Review Work

Figure 4 shows the block diagram of the CAD system for lung nodule detection. The CAD system has five stages: acquisition, pre-processing, segmentation, nodule detection and false positive reduction, and nodule classification. The next subsections describe each block and its present techniques.

Image Acquisition. In medicine, the process for obtaining CT scans is known as image acquisition. Numerous public and private databases contain lung CT images. The National Imaging Archive's Lung Image Database Consortium (LIDC) [3], the Early Lung Cancer Action Program (ELCAP) [4], and the Medical Image Database [5] are three common public lung nodule databanks. The LIDC-IDRI collection extends LIDC with the Image Database Resource Initiative. The LUNA16 databank has several images sorted by various benchmarks [6].

Pre-processing. Image pre-processing refers to the process of improving a picture by removing noise, which improves the readability or sensitivity of the image information and provides better input for further automated image processing techniques such as border detection. According to research, a median filter provides the best de-noising results for CT images without distorting the edges. Several pre-processing methods suggested in the literature are gathered in Table 3, and a small denoising sketch is given after the block diagram in Fig. 4. Most researchers used coherence-enhancing diffusion (CED), median filters, Wiener filters, histogram equalization, bilateral filters, signed-rank gain, wavelet transforms, Gaussian filters, etc. to eliminate image distortion and enhance image quality.

Lung Segmentation. Segmentation is the method of dividing a picture into its individual objects or regions, using lines, curves, and other boundaries to represent objects and their borders. A precise lung segmentation is a prerequisite for accurate detection. Table 4 summarizes several segmentation methods suggested in the reviewed literature, including 3D region growing and thresholding, adaptive fuzzy thresholding, active contour, region-based segmentation, model-based methods, thresholding, morphological operations, gradient mean and variance based methods, region growing, shape-based deformable models, watershed histogram thresholding, etc.
Fig. 4 Block diagram of CAD system for lung nodule detection
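As a hedged illustration of the pre-processing stage (a minimal sketch, not any reviewed author's code; the file name and kernel size are assumptions), a median filter can be applied to a CT slice as follows:

```python
import cv2

# Load a CT slice as a grayscale image (the file name is an assumed placeholder).
ct_slice = cv2.imread("ct_slice.png", cv2.IMREAD_GRAYSCALE)

# Median filtering suppresses impulse noise while preserving edges, which is why
# it is the most frequently reported pre-processing choice in Table 3.
denoised = cv2.medianBlur(ct_slice, 5)   # 5x5 kernel, an illustrative choice

# Optional contrast enhancement, as used by some of the reviewed methods.
enhanced = cv2.equalizeHist(denoised)

cv2.imwrite("ct_slice_preprocessed.png", enhanced)
```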
Table 3 Pre-processing method reported in the literature

Study | Reported | Method of pre-processing | Purpose
Choi et al. [2] | 2013 | Coherence-enhancing diffusion (CED) | To denoise and preserve the spherical structures
Tariq [7] | 2013 | Median filter | Denoising
Sivakumar et al. [8] | 2013 | Median filter | To remove the noise from the image without distorting the edges
Vaibhav [9] | 2014 | Histogram equalization | To improve the contrast of images
Vijaya [10] | 2014 | Bilateral filter | Better performances for pre-processing
Biradar [11] | 2015 | Median filter | To increase the contrast of images
Nathaney [12] | 2015 | Adaptive thresholding, morphological operations | Eliminates image distortion and enhances
Magdy [13] | 2015 | Wiener filter | Preserving the edges and fine specifics of lungs while removing noise
Obayya [14] | 2015 | Wiener filter | To remove noise
Ruchika [15] | 2016 | Median filter | To attenuate noise without blurring the images
Makaju et al. [16] | 2018 | Median filter and Gaussian filter | To remove noise
Shakeel et al. [17] | 2019 | Weighted mean histogram equalization approach | To remove noise
Reddy et al. [18] | 2019 | Picture securing, thresholding, pre-handling, binarization, division, extraction of feature | To pre-process CT image
Shakeel [19] | 2019 | Wavelet transform and Gaussian filters | To remove noise
Shakeel et al. [20] | 2020 | Multilevel brightness-preserving approach | To remove noise
Obulesu et al. [21] | 2021 | Signed-rank gain | To obtain informative and significant feature
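Before moving to nodule detection, the following hedged sketch illustrates the thresholding and morphological segmentation strategies summarized later in Table 4 (the input file is an assumed placeholder, and this is not any particular reviewed method; a real pipeline would also exclude the background outside the body):

```python
import numpy as np
from skimage import io, filters, morphology, measure

# Load a pre-processed CT slice (assumed placeholder file).
ct_slice = io.imread("ct_slice_preprocessed.png", as_gray=True)

# Otsu threshold separates the dark, air-filled lung field from denser tissue.
threshold = filters.threshold_otsu(ct_slice)
lung_mask = ct_slice < threshold

# Morphological operations remove small artefacts and close gaps in the mask.
lung_mask = morphology.remove_small_objects(lung_mask, min_size=500)
lung_mask = morphology.binary_closing(lung_mask, morphology.disk(5))

# Keep the two largest connected components (left and right lung fields).
labels = measure.label(lung_mask)
regions = sorted(measure.regionprops(labels), key=lambda r: r.area, reverse=True)
final_mask = np.isin(labels, [r.label for r in regions[:2]])

segmented = ct_slice * final_mask
io.imsave("lung_segmented.png", (segmented * 255).astype(np.uint8))
```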
Lung Nodule Detection and False Positive Reduction

Lung Nodule Detection. The process of estimating the likelihood of nodule shapes appearing in an image and locating nodules within the lung area is known as lung nodule detection. Lung nodule detection is a crucial and especially significant stage for identifying the region of interest (ROI). Different researchers implement template matching, multiscale dot enhancement filters, feature extraction methods, grey-level co-occurrence matrices, geometric features, radon transform, the Herakles
Table 4 Literature review of segmentation methods for lung CT images

Study | Year | No. of cases | Suggested technique | Performance
Cortez et al. [22] | 2013 | 11 subjects, 3D | 3D region growing and threshold | 2 pulmonologists qualitatively evaluated segmentation; 64.78 and 68.18% result acceptable
Keshani et al. [23] | 2013 | 63 subjects (4 groups: 4, 4, 5 and 50 subjects of LIDC), 3D | Adaptive fuzzy thresholding, active contour | Segmentation accuracy = 0.981
Aggarwal [24] | 2013 | 246 | Region-based segmentation | To segment lungs region
Devaki [25] | 2013 | 5 | Region-based segmentation | Involve textural features
Kuruvilla [26] | 2013 | 120 | Morphological operations | Sensitivity = 88.24%, specificity = 93.33%, accuracy = 90.63%
Guo [27] | 2013 | 58 | Morphological operation | Expectation–maximization (EM) analysis to estimate the threshold value
Tariq [7] | 2013 | 100 | Gradient mean and variance based | Segment lungs part
Orkisz et al. [28] | 2014 | 20 scans, 3D | Threshold, morphological operation and region growing | Differentiated vessels from bronchial walls, spec. = 0.848
Zhou et al. [29] | 2014 | 20 MDCT scans, 3D | Pre-processing, FCM and adaptive thresholding | OM = 95.81 ± 0.89%, AD = 0.63 ± 0.09 mm
Birkbeck [30] | 2014 | 260 | Thresholding | Converts a grey-level image into a binary image
Vaibhav [9] | 2014 | – | Thresholding | To extract the lungs left and right region
Gong [31] | 2014 | – | Region growing algorithm | Interested segments extracted using Otsu threshold algorithm
Kuruvilla [32] | 2014 | 155 | Morphological operations | 98% accurate segmentation
Shen et al. [33] | 2015 | 233 scans, 3D | Pre-processing, FCM and adaptive thresholding | Re-inclusion rate = 92.6%, oversegmentation = 0.3% and undersegmentation = 2.4%
Wang et al. [34] | 2016 | 45 scans, 3D | Principal component, morphological op, connected region based and contour segmentation | Rmsd = 1.6957 ± 0.6568 mm, AD = 0.7917 ± 0.2714 mm, VOE = 3.5057 ± 1.3719 mm
Shi et al. [35] | 2016 | 23 CT scans | Thresholding | Accuracy 98%
Filho et al. [36] | 2017 | 40 CT scans | Shape-based deformable method | Accuracy = 99.14%
Soliman et al. [37] | 2017 | 105 | Shape-based | OM = 0.98, DSC = 98.4%
Makaju et al. [16] | 2018 | LIDC-IDRI | Watershed segmentation | Accuracy 92%
Shakeel [19] | 2019 | 500 images | Watershed histogram thresholding | Accuracy 96.05%
Li et al. [38] | 2020 | Japanese Society of Radiological Technology (JSRT) | Segmentation of lung area and suppression of rib | Accuracy 99%
Bhandary et al. [39] | 2020 | LIDC-IDRI | Morphological and watershed segmentation | Accuracy 97.27%
system, multiple grey-level thresholding, a series of morphological operations, the grey intensity co-occurrence distribution matrix (GICDM), a fuzzy neural system used with ML, a novel detection method (I3DR-Net) with a Feature Pyramid Network (FPN) framework model, the Wilcoxon signed generative deep learning (WS-GDL) method, U-Net, and RPN network methods to detect lung nodules from lung CT images. Table 5 summarizes various techniques reviewed for the lung nodule detection procedure.

False Positive Reduction. False positive reduction is the process of further reducing false positives (structures with a nodule-like appearance discovered during nodule identification). Not all of the reviewed systems include a false positive reduction stage. In the literature, methods such as a two-stage methodology and multi-resolution-based, ANN and fuzzy-based, and shape-based feature techniques are employed to reduce the incidence of false positives. Table 6 lists different false positive reduction methods suggested in the reviewed literature.

Nodule Classification. Nodule classification determines, for a detected nodule, whether it is benign or cancerous. The majority of nodules that have a diameter of less than
Table 5 Various techniques for reviewing the lung nodule detection procedure

Study | Reported | Detection technique | Dataset | Remark
Assefa [40] | 2013 | Template matching | 165 | Accuracy 81.21%
Choi and Choi [41] | 2014 | Multiscale dot enhancement filter | – | Nodule detection
Vaibhav [9] | 2014 | Feature extraction method | – | To calculate feature values such as the nodule size, structure, volume, and nodule spine value
Nathaney [12] | 2015 | Grey-level co-occurrence matrix | – | For feature extraction
Magdy [13] | 2015 | AM_FM (amplitude, frequency modulation) | 83 | Lung image feature extraction
Obayya [14] | 2015 | Proposed two methods: (1) geometric features, (2) radon transform | 100 | (1) Accuracy 98%, sensitivity 96%, specificity 100%; (2) accuracy 86%, sensitivity 80%, specificity 92%
Jacobs [42] | 2015 | Herakles system | 888 | Sensitivity 82%
Akram et al. [43] | 2016 | Multiple grey-level thresholding | – | To extract features
Ruchika [15] | 2016 | Series of morphological operations | – | Accuracy 90%
Mohamed Shakeel [19] | 2019 | Grey intensity co-occurrence distribution matrix (GICDM) | 500 | Accuracy 96.05%
Reddy et al. [18] | 2019 | Fuzzy neural system used with ML | From UCI repository | Accuracy 96.67%
Li et al. [38] | 2020 | Multi-resolution patch-based CNNs were trained | Japanese Society of Radiological Technology (JSRT) | Accuracy 99%
Harsono [44] | 2020 | Novel detection method (I3DR-Net) with FPN (feature pyramid network) framework model | 605 | Accuracy 94.33%
Obulesu et al. [21] | 2021 | WS-GDL (Wilcoxon signed generative deep learning) method | 470 | Accuracy 86%
Jiang et al. [45] | 2022 | U-Net and RPN network | LIDC | Sensitivity 93%, specificity 94%, ROC_AUC 93%
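Several of the detection approaches in Table 5 rely on grey-level co-occurrence texture features. The following is a hedged sketch (window size, distances, and angles are illustrative assumptions, not any reviewed author's settings) of extracting such features from a candidate region with scikit-image:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(candidate_patch):
    """Compute simple GLCM texture features for a candidate nodule patch (uint8)."""
    glcm = graycomatrix(candidate_patch,
                        distances=[1, 2],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    return {prop: float(graycoprops(glcm, prop).mean())
            for prop in ("contrast", "homogeneity", "energy", "correlation")}

# A random patch stands in for a real candidate region extracted from a CT slice.
patch = np.random.randint(0, 256, size=(32, 32), dtype=np.uint8)
print(glcm_features(patch))
```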
Table 6 Various review methods of false positive reduction

Study | Year | FP reduction method | Dataset | Remarks
Lin et al. [46] | 2005 | ANN and fuzzy based | 29 subjects | Sensitivity is 89.3% with 0.2 FP/subject
Shi [47] | 2013 | Two-stage method | 60 | False positive reduction
Sousa et al. [48] | 2018 | Shape-based feature | 33 cases | Accuracy 95.21% with 0.42 FP/scan
10 mm are completely benign. Malignant nodules are those that have a diameter of more than 10 mm and carry a high risk of developing cancer. The prevalence of lung cancer with tiny nodules is less than 10% in high-risk individuals [2]. Table 7 summarizes various classification methods proposed in the literature. Different classification techniques used by researchers include feature-based classification, support vector machines (SVM), weighted SVM, threshold and region growing, Otsu threshold, morphological operations, neural network classifiers, KNN, naive Bayes and linear classifiers, the adaptive neuro-fuzzy inference system (ANFIS) classifier, deep convolutional neural networks (CNN), the deep learning instantaneously trained neural network (DITNN), XGBoost and random forest, the discrete AdaBoost optimized ensemble learning generalized NN (DAELGNN), and artificial neural networks to classify whether the detected nodules are cancerous or non-cancerous. A small classification sketch follows.
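As a hedged illustration of the widely used SVM classifiers in Table 7 (the features and labels below are synthetic placeholders, not data from any reviewed study), scikit-learn can train a benign-versus-malignant classifier on nodule features such as diameter and texture measures:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

rng = np.random.default_rng(0)

# Synthetic nodule feature vectors: [diameter_mm, contrast, homogeneity, sphericity].
benign = rng.normal([6, 50, 0.8, 0.9], [2, 10, 0.05, 0.05], size=(100, 4))
malignant = rng.normal([14, 80, 0.6, 0.7], [4, 15, 0.08, 0.08], size=(100, 4))
X = np.vstack([benign, malignant])
y = np.array([0] * 100 + [1] * 100)   # 0 = benign, 1 = malignant

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Feature scaling plus an RBF-kernel SVM, a common choice in the reviewed papers.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, predictions))
print(classification_report(y_test, predictions, target_names=["benign", "malignant"]))
```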
3 Conclusions

The usage of CAD systems enhances radiologists' abilities in the process of lung nodule detection. The systems must, however, fulfil the following criteria in order to be used regularly in the radiology department: they must enhance the effectiveness of radiologists by offering a low number of false positives, high diagnostic sensitivity, high processing speed, a high level of automation, low cost (of implementation, training, support, and maintenance), the capacity to recognize various nodule shapes and types, and a software safety guarantee. As per the survey, most researchers use the LIDC dataset, which is the largest publicly available dataset with full annotations. Median and Wiener filters are used for the pre-processing part. Morphological operations and thresholding accurately extract lung nodules at the segmentation stage. SVM, CNN, and KNN give the highest accuracy for lung nodule classification.
Table 7 Various review lung nodule classification methods

Authors | Year | Technique/method | Database | Performance accuracy
Assefa [40] | 2013 | Feature-based classification | 165 | To classify nodules
Javed et al. [49] | 2013 | Weighted SVM | 600 | Enhance the accuracy of lung pulmonary nodule classification
Filho et al. [36] | 2014 | Threshold and region growing and SVM | LIDC-IDRI, 800 exams (640 for validating and 160 for testing) | Shape features: sphericity, density, spherical disproportion, radial shape index and weighted radial distance; sensitivity = 85.91%, specificity = 97.70%, accuracy = 97.55%
Choi et al. [41] | 2014 | Threshold, 3D connected component and SVM | Manually traced data; LIDC datasets, 84 CT scans | 3D shape-based features; sensitivity = 85.91%, FP/scan = 6.76
Kuruvilla et al. [32] | 2014 | Otsu threshold, morphological operation, statistical method: mean, SD, skewness, kurtosis, fifth and sixth central moment; feed-forward BPN, ANN | LIDC datasets, 155 patients | Statistical features; training function 1: accuracy = 91.1%, sensitivity = 91.4%, specificity = 100%, MSE = 0.998; training function 2: accuracy = 93.3%, MSE = 0.0942
Han et al. [50] | 2014 | SVM (3D texture-based) | 1012 | 3D Haralick features provide better classification performance
Vaibhav [9] | 2014 | Support vector machine classifier (hyperplane) | – | To differentiate the benign from malignant nodules
Aggarwal et al. [51] | 2015 | Threshold, morphological operation, linear discriminant analysis (LDA), GLCM for computing features | 90 images, cancer imaging archive | Geometrical and statistical features; accuracy = 84%, sensitivity = 97.14%, specificity = 53.33%
Biradar [11] | 2015 | SVM | 1012 | To identify the cancerous and non-cancerous nodules
Nathaney [12] | 2015 | Neural network classifier | – | To classify nodules
Magdy [13] | 2015 | Four different classifiers: KNN, SVM, naive Bayes, and linear | 83 | To correctly discriminate between normal and cancer lung images
Obayya [14] | 2015 | Adaptive neuro-fuzzy inference system (ANFIS) classifier | 100 | To classify benign from malignant nodules
Farah et al. [52] | 2015 | Region growing, threshold; for classification: combination of multilayer perceptron (MLP), KNN, and SVM classifiers | 60 CT scans | Roundness, circularity, compactness, ellipticity, eccentricity; (1) MLP: accuracy = 90.41, sensitivity = 73.55; (2) KNN: accuracy = 91.20, sensitivity = 81.76; (3) SVM: accuracy = 90.60, sensitivity = 73.44
Golan et al. [53] | 2016 | Deep convolutional neural network (CNN), BPN | LIDC datasets | Volumetric features; sensitivity = 78.9% with FP/scan = 20, sensitivity = 71.2% with FP/scan = 10, sensitivity = 94.4%
Qi et al. [54] | 2016 | 3D convolutional neural networks | 888 CT scans | Shape, gradient, and intensity features; LIDC datasets, manually drawn data and LUNA16 challenge
Dhara [55] | 2016 | SVM (classification is evaluated in terms of area under the characteristic curve) | 891 | Several shape-based, margin-based, and texture-based features are analysed to improve the accuracy of classification
Shakeel et al. [17] | 2019 | Deep learning instantaneously trained neural network (DITNN) | CIA (Cancer Imaging Archive) | Minimum classification error of 0.038 and 98.42% accuracy
Bhatia et al. [56] | 2019 | XGBoost and random forest | LIDC-IDRI | Accuracy 84%
Shakeel [19] | 2019 | Probabilistic neural networks | 500 | Texture, shape
Nasrullah [57] | 2019 | CNN | 888 | Shape, intensity and gradient features
Shakeel [58] | 2019 | Discrete AdaBoost optimized ensemble learning generalized NN (DAELGNN) | – | Accuracy 99.48%
Bhandary et al. [39] | 2020 | EFT is used to classify the lung CT images | LIDC-IDRI | Accuracy is 97.27%
Harsono [44] | 2020 | CNN model | 605 | Texture-based features
Acknowledgements The authors are thankful to AISSMS IOIT Pune for providing resources for this paper.
References 1. Cancer Facts and Figure 2022 by American Cancer Society. http://www.cancer.org 2. Choi W-J et al (2013) Automated pulmonary nodule detection system in computed tomography images: a hierarchical block classification approach. Entropy 5:508–523 3. Lung Imaging Database Consortium (LIDC). https://imaging.nci.nih.gov/ncia/login.jsf/http:// www.cancerimagingarchive.net/ (Last cited on 2023) 4. Early Lung Cancer Action Program (ELCAP). http://www.via.cornell.edu/lungdb.html (Last cited on 2023) 5. Medical image database. MedPix. http://rad.usuhs.edu/medpixlindex.html (Last cited on 2023) 6. Setio AAA, Traverso A, de Bel T, Berens MSN, van den Bogaard C, Cerello P, Chen H, Dou Q, Fantacci ME, Geurts B et al (2017) Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: The LUNA16 challenge. Med Image Anal 42:1–13 7. Tariq A, Akram M-U, Javed M-Y (2013) Lung nodule detection in CT images using neuro fuzzy classifier. In: Proceeding of the fourth international IEEE workshop on computational intelligence in medical imaging (CIMI), pp 49–53 8. Sivakumar S et al (2013) Lung nodule detection using fuzzy clustering and support vector machines. Int J Eng Technol 5(1):179–185
9. Likhitkar VK, Gawande U, Hajari KO (2014) Automated detection of cancerous lung nodule from the computed tomography images. IOSR J Comput Eng 16(1) (Version VIII) 10. Vijaya G, Suhasini A (2014) An adaptive pre-processing of lung CT images with various filters for better enhancement. Acad J Cancer Res 7(3):179–184 11. Biradar S, Agalatakatti K (2015) Lung cancer identification using CT images. Int J Eng Comput Sci 4(7):13022–13025. ISSN: 2319-7242 12. Nathaney G, Kalyani K (2015) Lung cancer detection system on thoracic CT images based on ROI processing. Int J Adv Res Comput Commun Eng 4(4):173–176 13. Magdy E, Zayed N, Fakhr M (2015) Automatic classification of normal and cancer lung CT images using multiscale AM-FM features. Int J Biomed Imaging 2015:1–7. Article ID 230830 (Hindawi Publishing Corporation). https://doi.org/10.1155/2015/230830 14. Obayya M, Ghandour M (2015) Lung cancer recognition using radon transform and adaptive neuro fuzzy inference system. Int J Comput Appl 124(2):25–30 15. Ruchika AS (2016) Cad implementation for detection of lung cancerous nodules. Int J Adv Res Comput Sci Softw Eng 6(4):804–806 16. Makaju S, Prasad PWC, Alsadoon A, Singh AK, Elchouemi A (2018) Lung cancer detection using CT scan images. Proc Comput Sci 125:107–114 17. Shakeel PM, Burhanuddin MA, Desa MI (2019) Lung cancer detection from CT image using improved profuse clustering and deep learning instantaneously trained neural networks. Measurement 145:702–712 18. Reddy U, Reddy B, Reddy B (2019) Recognition of lung cancer using machine learning mechanisms with fuzzy neural networks. Traitement du Signal 36(1):87–91 19. Mohamed Shakeel P, Desa MI, Burhanuddin MA (2019) Improved watershed histogram thresholding with probabilistic neural networks for lung cancer diagnosis for CBMIR systems. Multim Tools Appl (Springer) 20. Shakeel PM, Burhanuddin MA, Desa MI (2020) Automatic lung cancer detection from CT image using improved deep neural network and ensemble classifier. Neural Comput Appl 34:1–14 21. Obulesu O, Kallam S, Dhiman G, Patan R, Kadiyala R, Raparthi Y, Kautish S (2021) Adaptive diagnosis of lung cancer by deep learning classification using Wilcoxon gain and generator. Hindawi J Healthcare Eng 2021. Article ID 5912051, https://doi.org/10.1155/2021/5912051 22. Cortez PC, de Albuquerque VHC (2013) 3D segmentation and visualization of lung and its structures using CT images of the thorax. J Biomed Sci Eng 6(11):1099 23. Keshani M, Azimifar Z, Tajeripour F, Boostani R (2013) Lung nodule segmentation and recognition using SVM classifier and active contour modeling: a complete intelligent system. Comput Biol Med 43(4):287–300 24. Aggarwal P, Vig R, Sardana H-K (2013) Semantic and content-based medical image retrieval for lung cancer diagnosis with the inclusion of expert knowledge and proven pathology. In: Proceeding of the IEEE second international conference on image information processing ICIIP’2013, pp 346–351 25. Devaki K, MuraliBhaskaran V, Mohan M (2013) Segment segmentation in lung CT images— preliminary results. Int J Adv Comput Theory Eng 2(1):84–89 26. Kuruvilla J, Gunavathi K (2013) Detection of lung cancer using morphological operations. Int J Sci Eng Res 4(8):1636–1639 27. Guo Y, Zhou C, Chan H-P, Chughtai A, Wei J, Hadjiiski L-M, Kazerooni E-A (2013) Automated iterative neutrosophic lung segmentation for image analysis in thoracic computed tomography. Med Phys 40(8):081912/1–081912/11 28. 
Orkisz M, Hoyos MH, Romanello VP, Romanello CP, Prieto JC, Revol-Muller C (2014) Segmentation of the pulmonary vascular trees in 3D CT images using variational regiongrowing. IRBM 35(1):11–19 29. Zhou S, Cheng Y, Tamura S (2014) Automated lung segmentation and smoothing techniques for inclusion of juxtapleural nodules and pulmonary vessels on chest CT images. Biomed Signal Process Control 13:62–70
30. Birkbeck N, Sofka M, Kohlberger T, Zhang J, Wetzl J, Kaftan J, Kevin Zhou S (2014) Robust segmentation of challenging lungs in CT using multi-stage learning and level set optimization. In: Computational intelligence in biomedical imaging, pp 185–208 31. Gong J, Gao T, Bu R-R, Wang X-F, Nie S-D (2014) An automatic pulmonary nodules detection method using 3D adaptive template matching. Commun Comput Inf Sci 461:39–49 32. Kuruvilla J, Gunavathi K (2014) Lung cancer classification using neural networks for CT images. Comput Methods Prog Biomed 113:202–209 (Elsevier) 33. Shen S, Bui AA, Cong J, Hsu W (2015) An automated lung segmentation approach using bidirectional chain codes to improve nodule detection accuracy. Comput Biol Med 57:139–149 34. Wang J, Guo H (2016) Automatic approach for lung segmentation with Juxta-Pleural nodules from thoracic CT based on contour tracing and correction. In: Computational and mathematical methods in medicine. Hindawi Publishing Corporation, 13 pp. Article ID 2962047 35. Shi Z et al (2016) Many is better than one: an integration of multiple simple strategies for accurate lung segmentation in CT images. Biomed Res Int 1–13:2016 36. Filho PPR et al (2017) Novel and powerful 3D adaptive crisp active contour method applied in the segmentation of CT lung images. Med Image Anal 35:503–516 37. Soliman A et al (2017) Accurate lungs segmentation on CT chest images by adaptive appearance-guided shape modeling. IEEE Trans Med Imaging 36(1):263–276 38. Li X, Shen L, Xie X, Huang S, Xie Z, Hong X, Yu J (2020) Multi-resolution convolutional networks for chest X-ray radiograph-based lung nodule detection. Artif Intell Med 103:101744 39. Bhandary A, Prabhu GA, Rajinikanth V, Thanaraj KP, Satapathy SC, Robbins DE et al (2020) Deep-learning framework to detect lung abnormality—a study with chest X-ray and lung CT scan images. Pattern Recogn Lett 129:271–278 40. Assefa M et al (2013) Lung nodule detection using multi resolution analysis. In: Proceeding IEEE, international conference on complex medical engineering. ICME, pp 457–461 41. Choi WJ, Choi TS (2014) Automated pulmonary nodule detection based on three-dimensional shape-based feature descriptor. Comput Methods Programs Biomed 113(1):37–54 42. Jacobs C, van Rikxoort EM, Murphy K, Prokop M, Schaefer-Prokop CM, van Ginneken B (2015) Computer-aided detection of pulmonary nodules: a comparative study using the public LIDC/IDRI database. Eur Radiol 26:2139–2147 43. Akram S et al (2016) Pulmonary nodules detection and classification using hybrid features from computerized tomographic images. J Med Imaging Heal Inf 6(1):252–259 44. Harsono IW, Liawatimena S, Cenggoro TW (2020) Lung nodule detection and classification from thorax CT-scan using RetinaNet with transfer learning. J King Saud Univ Comput Inf Sci 1–11 (Elsevier) 45. Jiang W, Zeng G, Wang S, Wu X, Xu C (2022) Application of deep learning in lung cancer imaging diagnosis. Hindawi J Healthcare Eng 2022. Article ID 6107940. https://doi.org/10. 1155/2022/6107940 46. Lin DT, Yan CR, Chen WT (2005) Autonomous detection of pulmonary nodules on CT images with a neural network based fuzzy system. Comput Med Imaging Graph 29:447–454 47. Shi Z et al (2013) A computer aided pulmonary nodule detection system using multiple massive training SVMs. Appl Math Inf Sci 7(3):1165–1172 48. Sousa JR, Silva AC, Paiva AC, Nunes RA (2018) Methodology for automatic detection of lung nodules in computerized tomography images. Comput Methods Programs Biomed 98:1–14 49. 
Javed U, Riaz M-M, Cheema T-A, Zafar H-F (2013) Detection of lung tumor in CE CT images by using weighted support vector machines. In: Proceeding of the 10th international Bhurban conference on applied sciences and technology (IBCAST), pp 113–116 50. Han F, Wang H, Zhang G, Han H, Song B, Li L, Moore W, Lu H, Zhao H, Liang Z (2014) Texture feature analysis for computer-aided diagnosis on pulmonary nodules. J Digit Imaging 28(1):99–115 51. Aggarwal T, Furqan A, Kalra K (2015) Feature extraction and LDA based classification of lung nodules in chest CT scan images. In: 2015 International conference on advances in computing, communications and informatics (ICACCI). IEEE, pp 1189–1193
52. Farag A et al (2012) An AAM based detection approach of lung nodules from LDCT scans. IEEE, pp 1040–1043 53. Golan R, Jacob C, Denzinger J (2016) Lung nodule detection in CT images using deep convolutional neural networks. In: 2016 International joint conference on neural networks (IJCNN). IEEE, pp 243–250 54. Dou Q, Chen H, Yu L, Qin J, Heng PA (2016) Multi-level contextual 3D CNNs for false positive reduction in pulmonary nodule detection. IEEE Trans Biomed Eng.https://doi.org/10. 1109/TBME.2016.2613502 55. Dhara AK, Mukhopadhyay S, Dutta A, Garg M, Khandelwal N (2015) A combination of shape and texture features for classification of pulmonary nodules in lung CT images. J Digit Imaging 10. https://doi.org/10.1007/s10278-015-9857-6 56. Bhatia S, Sinha Y, Goel L (2019) Lung cancer detection: a deep learning approach. In: Soft computing for problem solving. Springer, Singapore, pp 699–705 57. Nasrullah N, Sang J, Alam MS, Mateen M, Cai B, Hu H (2019) Automated lung nodule detection and classification using deep learning combined with multiple strategies. Sensors 19:3722. https://doi.org/10.3390/s19173722 58. Mohamed Shakeel P, Tolba A, Al-Makhadmeh Z, Jaber MM (2019) Automatic detection of lung cancer from biomedical data set using discrete AdaBoost optimized ensemble learning generalized neural networks. Intell Biomed Data Anal Process (Springer)
Species Identification of Birds Via Acoustic Processing Signals Using Recurrent Network Analysis (RNN) C. Srujana, B. Sriya, S. Divya, Subhani Shaik, and V. Kakulapati
Abstract Bird watching is a popular pastime, but without proper identification guides it may be difficult to tell different species apart. There are more than 9000 recognized bird species, and some are notoriously difficult to identify when first encountered. To support birdwatchers in recognizing the birds they hear, a system was developed that uses RNNs to classify bird species. RNNs are an effective family of machine learning algorithms that have shown great promise in the fields of image and audio processing. In this study, potential methods for bird recognition are investigated and a fully automated identification method is created. Automatically identifying bird calls without human intervention is an arduous task that has necessitated much research into taxonomy and other areas of ornithology. In this study, identification is assessed from two distinct perspectives. The first step was to build a complete database of recorded bird calls. Several techniques were then applied to the sound samples prior to further processing, including pre-emphasis, framing, silence removal, and reconstruction. A spectrogram was generated for each reconstructed audio specimen. A neural network was then constructed, trained, and applied to classify the bird species. The results of the proposed methodology show that it is 80% accurate in predicting the identity of bird species. Keywords Bird · Species recognition · Audio processing · RNN · FL (deep learning) · Acoustic · Categorization
C. Srujana · B. Sriya · S. Divya · S. Shaik · V. Kakulapati (B) Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, Telangana 501301, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_3
1 Introduction Because birds help us track other forms of life on Earth, their behavior and population trends have become a pressing concern. Compiling information on bird species, however, takes a lot of time and money. To deal with this, we need a robust framework that can facilitate the large-scale production of data about birds and serve as a crucial tool for researchers, policymakers, and others. Evidence that can be used to discriminate between different bird species plays a crucial role in classification: in order to determine which group a given recording or image of a bird belongs to, we must first identify the species it depicts. Birds use their calls for several different purposes, including making territorial claims, recognizing other species of birds or insects, and issuing warnings and threats. Distinguishing between many bird species from recordings of their songs is a significant obstacle to overcome when developing an automated system for bird recognition. Bird songs, according to experts, are more melodic and more accurate for identifying species than any other cue, because the whole signal is preprocessed to find the most relevant aspects [1]. There are tens of thousands of kinds of birds, and they may be found anywhere from the Amazonian rainforest to the shores of Antarctica, according to the International Union for the Conservation of Nature (IUCN). However, contemporary human activity, habitat invasion, habitat loss, and natural processes like global warming and climate change have placed this biodiversity in jeopardy, with over 1370 bird species (about 13 percent of all bird species) at some risk of extinction. The purpose of this study is to develop automated algorithms for identifying bird species using audio recordings [2]. One of the most crucial tasks in autonomous wildlife monitoring is the identification of bird calls. Because many birds are more easily observed by sound than by sight or other signs, the audio modality is well suited to bird monitoring. Constructing a model that can identify the presence of bird calls across many datasets is the key focus here. Framed as a binary decision problem, it does not require the ability to distinguish between several events or to pinpoint when they began and ended. The primary difficulty lies in developing a model that is highly generalizable, able to perform well on data that has not been seen before, such as data obtained from locations other than the training dataset or recorded with a different technique. For this reason, the challenge provides assessment datasets that are quite dissimilar to the training dataset. Any model trained with the provided training data must also be robust enough to account for discrepancies in the actual ground-truth labeling; this is a serious difficulty because the three training datasets have widely variable label accuracy. This study presents a method that uses sound processing and recurrent neural networks to fully automate the task of identifying bird calls. The recordings are gathered into a database, and then pre-emphasis, framing, silence removal, and reconstruction are among the acoustic preprocessing methods applied. For a machine learning-based solution to be effective, the datasets used in the analysis must be
accurate, reliable, consistent, exhaustive, relevant, and true. One of the significant constraints of the study was the limited availability and accessibility of bird song datasets, owing to the paucity of studies in this sector. The dataset of recorded bird calls and additional bird song recordings were obtained from the website xenocanto.com. There are a total of 400 different bird calls to choose from, with 100 different birds represented.
2 Related Work The spectrogram of bird sounds, which displays the time–frequency characteristics of the sound, is a representation of the sound signal's intensity as a function of color or grayscale. With its robust capabilities for self-learning and feature extraction, deep learning can automatically learn and extract defining characteristics from inputs [3]. The best of eight deep neural networks [4] trained using MFCCs of bird audio segments had a 73% accuracy rate in classification. Piczak used three distinct DCNNs and a basic ensemble model [5] to complete the LifeCLEF 2016 bird identification task, and the MAP for the foreground species was 52.9%. For bird species categorization, [6] used a CNN and achieved a MAP score of 40% for the key species. In order to improve CNN performance, image processing techniques [7] are used to remove unwanted noise from the input images. A deep convolutional neural network, an artificial learning system for visual data, employs convolutional and pooling layers. In the ImageNet Large-Scale Visual Recognition Competition (ILSVRC), the DCNN model VGG16 fared very well. The standardization of bird identification as the recognition of a vocalization spectrogram was accomplished by first extracting features from the spectrogram using the image-pretrained VGG16 model and then feeding those features to two fully connected layers and a softmax layer. The resulting model correctly identified 18 species of birds [8]. In the context of a deep-learning-based bird-call categorization problem, this research examines the impact of acoustic feature selection. The short-term window acoustic properties are analyzed and fed into a distributed CNN network, which is then followed by a long short-term memory (LSTM) network [9]. To classify bird calls, the suggested technique employs a CNN trained with an LSTM [10]. To make the system operate in real time, adaptive thresholding is employed to identify the foreground vocal activity in recordings while keeping the network size as small as possible. The overall categorization scheme and real-time application informed the development of the suggested method [11].
Fig. 1 Architecture of proposed work
3 Methodology Audio signal processing plays a vital role in the recognition and identification of bird species based on their vocalizations. Bird sounds are often intricate, and accurately processing and analyzing these audio signals is crucial for automated species detection and monitoring. Here are some fundamental elements of audio signal processing in bird species detection. Figure 1 shows the architecture of the proposed work. When a time-dependent input–output relationship must be learned, an RNN is applied [12]. Modeling a sequence entails first feeding the input sequence to an RNN trained on a fixed-size vector and then mapping that vector to a softmax layer. Yet, RNNs run into trouble when the gradient vector grows or shrinks exponentially over lengthy periods. With its built-in memory, the RNN network can manage dependencies over the long term. Unlike fully connected networks, where each node between layers receives just one input to process, RNNs instead use a directed graph to connect nodes as they sequentially record sample slots for each input [13]. The dataset is collected from Kaggle, which is publicly available. Preprocessing aims to reduce noise in the input. • To get the most useful information out of bird species, use a variety of pre-trained CNN models. • Convert the feature map into a timeline. • Use a single-layer RNN with the feature map as its input. • Classify photos of bird parts using a softmax classifier (a code sketch of this pipeline follows at the end of this section). Feature extraction: Feature extraction is a method for locating and representing certain features of interest within an image for further processing. The key shape properties contained in a pattern are revealed by feature extraction, making it simple to recognize the pattern when employing a formal approach. The color, form, size, and silhouette of the bird are key characteristics of the image. The bird's frequency, amplitude, loudness, etc. are crucial characteristics of its sound.
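As a concrete illustration of the pipeline listed above (pre-trained CNN feature maps reshaped into a time sequence, a single-layer RNN, and a softmax classifier), the following sketch wires these stages together in Keras. It is a minimal illustration rather than the authors' exact implementation; the MobileNetV2 backbone, the 224 × 224 input size, the 100-class output, and all layer widths are assumptions.

```python
# Minimal sketch (assumed backbone, shapes, and layer sizes): pre-trained CNN
# features reshaped into a sequence and fed to a single-layer RNN with softmax.
import tensorflow as tf

NUM_CLASSES = 100  # assumption: 100 bird species, as reported for the dataset

# Pre-trained CNN backbone used purely as a feature extractor (weights frozen).
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
backbone.trainable = False

inputs = tf.keras.Input(shape=(224, 224, 3))   # spectrogram rendered as an RGB image
feature_map = backbone(inputs)                  # shape (batch, 7, 7, 1280)

# Convert the 2D feature map into a "timeline": one time step per spatial cell.
sequence = tf.keras.layers.Reshape((7 * 7, 1280))(feature_map)

# Single-layer RNN over the sequence of CNN features, then softmax classification.
hidden = tf.keras.layers.SimpleRNN(128)(sequence)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(hidden)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```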
4 Implementation Results Figure 2 shows the proposed work flow. The flow is as follows.
Fig. 2 Proposed work flow
Audio Data Collection: Bird sounds are recorded using microphones or recording devices placed in the field or specific habitats to capture the vocalizations of various bird species. Pre-processing: Raw audio data often contains background noise, artifacts, and varying signal levels. Pre-processing techniques are applied to clean the data and improve signal quality, which includes noise reduction, filtering, and normalization. Background Noise Removal: Outdoor recordings may contain a mix of bird sounds and background noises. Advanced techniques like blind source separation or noise reduction algorithms are used to isolate bird sounds from the background noise. Model Training and Validation: Training the machine learning models requires a labeled dataset with examples of different bird species’ vocalizations. The model learns from these samples to recognize patterns and make predictions on new audio data. Validation on separate test data ensures generalization to new recordings. Species Identification: Once trained and validated, the model automatically detects and identifies bird species from new audio recordings, processing the extracted audio features to provide predictions. Post-processing and Validation: Post-processing steps refine predictions to reduce errors and improve accuracy. Human validation and expert review are often conducted to ensure the reliability of species identification, especially for rare or ambiguous bird vocalizations. Audio signal processing has significantly advanced bioacoustics and bird species detection, enabling efficient analysis of large audio datasets and providing valuable insights into bird populations, behavior, and biodiversity in various habitats.
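To make the pre-processing stage described above more concrete, the sketch below applies pre-emphasis, silence trimming, and amplitude normalization to a single recording using librosa, matching the steps named earlier (pre-emphasis, framing, silence removal). The file names and threshold values are placeholders, not settings taken from the study.

```python
# Illustrative pre-processing sketch (assumed file names and thresholds):
# pre-emphasis, silence trimming, and peak normalization with librosa.
import librosa
import numpy as np
import soundfile as sf

y, sr = librosa.load("bird_call.wav", sr=22050)   # placeholder file name

# Pre-emphasis: boost high frequencies to balance the spectrum.
y = librosa.effects.preemphasis(y, coef=0.97)

# Silence removal: trim leading/trailing sections more than 30 dB below the peak.
y, _ = librosa.effects.trim(y, top_db=30)

# Peak normalization so recordings from different sources share a common scale.
y = y / (np.max(np.abs(y)) + 1e-9)

sf.write("bird_call_clean.wav", y, sr)
```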
Spectrogram Generation: The process of creating a spectrogram involves visualizing the frequency distribution of an audio source over time, providing valuable insights into its spectral properties and temporal changes. Spectrograms are widely used in diverse fields such as audio processing, voice recognition, music analysis, and acoustic research. Here is a detailed explanation of the steps required to generate a spectrogram. Before generating the spectrogram, it is advantageous to apply pre-processing techniques to the audio signal. This may include tasks like noise reduction, filtering, and normalization to enhance the audio quality and eliminate unwanted artifacts or background noise. Windowing: To analyze the frequency content of the audio signal over time, it is segmented into shorter windows. Each window typically has a fixed duration, such as 10 ms, and overlaps with adjacent windows. Common window functions like the Hamming or Hanning window are applied to minimize spectral leakage. Fast Fourier Transform (FFT): For each window, the FFT is utilized to transform the time-domain signal into the frequency-domain representation. The FFT calculates the complex spectrum of the windowed signal, representing the magnitude and phase of various frequency components. Power Spectrum Calculation: From the complex spectrum obtained through the FFT, the power spectrum is derived by squaring the magnitude values. The power spectrum illustrates the energy distribution across different frequencies at each time window. Spectrogram Visualization: The power spectrum values from each window are arranged over time to construct a 2D matrix, with time on the x-axis and frequency on the y-axis. The intensity of each matrix point corresponds to the magnitude or power of the frequency component at that specific time and frequency. The spectrogram is commonly displayed as a grayscale or color image, with darker shades indicating lower power and lighter shades representing higher power. Dynamic Range Adjustment: To improve the visual representation and highlight fine details in the spectrogram, dynamic range adjustment techniques can be applied. This involves scaling the power values to cover a wider range of colors or brightness levels, allowing for better visualization of both weak and strong frequency components. Additional Processing: Depending on the specific application, additional processing steps may be applied to the spectrogram. These can include operations like logarithmic scaling, frequency smoothing, or time averaging to enhance specific features or remove noise. Audio signal analysis and comprehension rely heavily on spectrogram output. It offers valuable information about the frequency content, spectral patterns, and time-varying characteristics of the signal, making it a powerful tool for various audio-related tasks and applications [14, 15].
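The windowing, FFT, power-spectrum, and dynamic-range steps described above can be reproduced in a few lines with librosa and NumPy. The window length, hop length, and file name below are illustrative assumptions rather than the settings used in the study.

```python
# Spectrogram sketch following the steps above: windowed STFT -> power
# spectrum -> log (dB) scaling for dynamic-range adjustment. Parameters assumed.
import librosa
import librosa.display
import numpy as np
import matplotlib.pyplot as plt

y, sr = librosa.load("bird_call_clean.wav", sr=22050)   # placeholder file

n_fft = 1024          # window length (~46 ms at 22.05 kHz)
hop_length = 256      # step between overlapping windows

# Short-time Fourier transform with a Hann window, then the power spectrum.
stft = librosa.stft(y, n_fft=n_fft, hop_length=hop_length, window="hann")
power = np.abs(stft) ** 2

# Dynamic-range adjustment: convert power to decibels for visualization.
power_db = librosa.power_to_db(power, ref=np.max)

librosa.display.specshow(power_db, sr=sr, hop_length=hop_length,
                         x_axis="time", y_axis="hz", cmap="magma")
plt.colorbar(format="%+2.0f dB")
plt.title("Power spectrogram")
plt.tight_layout()
plt.show()
```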
Spectrograms are frequently utilized as input representations for RNN models in tasks related to identifying bird species; these graphic representations show the time-varying frequency content of an audio source. Once spectrograms are fed into the RNN model for species identification, there are several steps to enhance the system's performance and effectiveness. The following are some common next steps: Model evaluation: Evaluate the RNN system's performance on a separate test dataset not used during training or validation. Utilize metrics like accuracy, precision, recall, or the F1 score to evaluate the algorithm's performance in bird species identification. This evaluation step aids in understanding the effectiveness of the trained model and identifying areas for improvement. Hyperparameter tuning: Experiment with different hyperparameter settings of the RNN model to optimize its performance. Examples include altering the number of hidden layers, the size of the individual units, the learning rate, the kind of regularization used, and the optimization method. Tuning hyperparameters helps determine the configuration that achieves the best performance on the validation or test set. Data augmentation: Increase the diversity of the training dataset by applying transformations or modifications to the spectrograms. Techniques like adding random noise, shifting or stretching spectrograms in the time or frequency domain, or applying alternative signal transformations may improve model generalization. Data augmentation contributes to better performance. Handling class imbalance: If the dataset exhibits imbalanced class distributions, where some bird species are more prevalent than others, address this issue. Techniques like oversampling the minority classes, undersampling the majority classes, or incorporating class weights during training can mitigate the effects of class imbalance and improve the model's performance on minority classes. Ensemble models: Boost performance further by exploring ensemble models. These models combine multiple individual RNN models to make predictions, resulting in higher accuracy and robustness. For bird species identification, train multiple RNN models with varying initializations or architectures and aggregate their predictions to obtain the result. Deployment considerations: Consider the requirements and constraints for deploying the bird species identification system. If the model will be deployed in a resource-constrained environment, optimize the model's size or complexity for efficient inference. Techniques like model compression, quantization, or pruning can reduce the model's memory footprint or computational requirements. Continuous improvement: Monitor the deployed model's performance and collect feedback. Analyze misclassified examples and identify patterns or sources of errors. This feedback loop guides further improvements to the system, including collecting additional training data, refining preprocessing steps, or exploring alternative modeling approaches. By continuously refining and improving the model based on evaluation and feedback, a more accurate and robust bird species identification system can be developed. Recurrent Neural Network (RNN): An RNN is a type of ANN designed to process sequential data, where the order of elements is important. Unlike conventional feedforward neural networks, RNNs have loops in their connections, allowing them to
maintain hidden states or memory of past information. This makes them particularly suitable for tasks involving time series, sequences, and natural language processing. The fundamental characteristic of RNNs lies in their ability to consider previous information while processing the current input. Recurrent connections enable the output of a hidden layer at a given time step to serve as input to the same layer in the next time step. This recurrent loop enables RNNs to learn temporal dependencies and patterns in sequential data. Mathematically, the computation of an RNN at each time step can be represented as follows:
• Input at time step t: x_t
• Hidden state at time step t: h_t
• Output at time step t: y_t
The steps for RNN computation are as follows:
1. Initialize the hidden state h_0.
2. For each time step t:
• Combine the input and previous hidden state: z_t = f(W_x * x_t + W_h * h_(t−1) + b)
• Compute the current hidden state: h_t = g(z_t)
• Compute the output: y_t = h(W_h_output * h_t + b_output)
In the above equations:
• W_x and W_h are weight matrices for the input and hidden state, respectively.
• b is the bias vector.
• f is an activation function applied to the input combination (e.g., sigmoid or tanh).
• g is an activation function applied to the hidden state (e.g., tanh or ReLU).
• h is the output activation function for generating the final output (depends on the task, e.g., softmax for classification).
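The recurrence above can be written out directly in NumPy. The dimensions, activation choices (tanh for the hidden state, softmax for the output), and random weights below are illustrative assumptions, not trained values from the study.

```python
# NumPy sketch of the RNN recurrence described above (assumed sizes/activations).
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim, T = 40, 64, 10, 20   # assumed sizes

Wx = rng.normal(scale=0.1, size=(hidden_dim, input_dim))      # input weights W_x
Wh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))     # recurrent weights W_h
b = np.zeros(hidden_dim)
Wh_output = rng.normal(scale=0.1, size=(output_dim, hidden_dim))
b_output = np.zeros(output_dim)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

x = rng.normal(size=(T, input_dim))   # a dummy sequence of T feature frames
h = np.zeros(hidden_dim)              # 1. initialize the hidden state h_0

outputs = []
for t in range(T):                            # 2. iterate over time steps
    z = Wx @ x[t] + Wh @ h + b                # combine input and previous hidden state
    h = np.tanh(z)                            # current hidden state: h_t = g(z_t)
    y = softmax(Wh_output @ h + b_output)     # output: y_t = softmax(W_h_output h_t + b_output)
    outputs.append(y)

print(np.array(outputs).shape)        # (20, 10): one class distribution per time step
```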
The built-in convolutional layer makes the network a good choice for image and audio recognition tasks by reducing the high dimensionality of pictures without sacrificing their information. Three elements make up the suggested model: • a system for extracting audio features; • a mechanism for extracting visual features; • a mechanism for integrating the audio and visual features. RNN-based feature integration was used in this study to combine the results of the auditory and visual extraction processes. The visual characteristics were modeled using OpenCV's Haar-Cascade detection, and the data was then fed into the RNN system for analysis. The audio data was processed using the RNN system after being modeled using Mel-frequency cepstral coefficients (MFCCs) [16, 17].
However, RNNs have limitations in capturing long-term dependencies, as they can suffer from vanishing or exploding gradient problems during training. This can make it challenging for them to learn and model long-range dependencies effectively. RNNs find extensive use in various applications, including natural language processing (NLP), speech recognition, machine translation, time series forecasting, and more, where sequential data plays a crucial role. They are powerful tools for modeling sequential patterns and have significantly advanced these fields. Using recurrent neural networks (RNNs) for bird species detection involves several sequential steps, including data collection, pre-processing, model construction, training, and evaluation; audio signal processing and RNNs thus provide a viable avenue for research into automated bird species identification. Figure 3 shows the linear frequency power spectrum, Fig. 4 the bird species Mel-scaled spectrogram, Fig. 5 the bird species spectrogram with DCT, Fig. 6 the proposed algorithm's training epochs for bird species, and Fig. 7 the bird species speech signals with the RNN algorithm.
Fig. 3 Linear frequency power spectrum
Fig. 4 Bird species Mel-scaled spectrogram
Fig. 5 Bird species spectrogram with DCT
Fig. 6 Proposed algorithm epochs for bird species
Fig. 7 Bird species speech signals with the RNN algorithm
5 Conclusion The current study investigated a technique for classifying images utilizing the Caltech-UCSD Birds 200 dataset and a deep learning algorithm (unsupervised learning). A total of 11,788 images in 200 categories make up this collection. The produced system is connected to a user-friendly website where individuals may upload images for positive identification. The proposed approach is effective due to its use of component identification and RNN feature extraction over several convolutional layers. Before being sent on to a classifier, these characteristics are merged. Based on the generated results, the algorithm has provided 80% accuracy in predicting bird species. In this method, the bird's call was preprocessed to create a spectrogram, which was then used to train the classification algorithm. The training set also contained various environmental noises, some of which were recorded from actual birds in their native habitats. Various selections of learning rates, numbers of epochs, and data segmentations all produced the same outcomes. It is possible to achieve even higher precision by adjusting the many parameters affecting efficiency.
6 Future Enhancement The prospects for this initiative in terms of economic growth and scientific advancement are substantial. The program can be developed for mobile platforms, enabling users to record bird songs and to predict and assess bird calls using their smartphones. In ecological parks, wildlife sanctuaries, and bird sanctuaries, dedicated hardware configurations can be installed, and the produced data can be stored both locally and in the cloud. The information gathered in this way will be crucial for understanding bird migration patterns, population dispersion, biodiversity, and species-specific bird demography.
References 1. Priyanka R et al (2023) Identification of bird species using automation tool. Int Res J Eng Technol 10(03). e-ISSN: 2395-0056 2. https://www.iucn.org/news/secretariat/201612/new-bird-species-and-giraffe-under-threat-%E2%80%93-iucn-red-list 3. Dan X, Huang S, Xin Z (2019) Spatial-aware global contrast representation for saliency detection. Turk J Electr Eng Comput Sci 27:2412–2429 4. Koops HV, Van Balen J, Wiering F, Multimedia S (2014) A deep neural network approach to the LifeCLEF 2014 bird task. LifeClef Work Notes 1180:634–642 5. Piczak K (2016) Recognizing bird species in audio recordings using deep convolutional neural networks. CEUR Workshop Proc 1609:1–10 6. Toth BP, Czeba B (2016) Convolutional neural networks for large-scale bird song classification in noisy environment. In: Proceedings of the conference and labs of the evaluation forum, Évora, Portugal, 5–8 September 2016, pp 1–9 7. Sprengel E, Jaggi M, Kilcher Y, Hofmann T (2016) Audio based bird species identification using deep learning techniques. In: Proceedings of the CEUR workshop, Évora, Portugal, pp 547–559 8. Zhou FY, Jin LP, Dong J (2017) Review of convolutional neural network. Chin J Comput 40(7):1–23 9. https://towardsdatascience.com/lstm-recurrent-neural-networks-how-to-teach-a-network-to-remember-the-past-55e54c2ff22e 10. https://www.analyticsvidhya.com/blog/2017/12/fundamentals-of-deep-learning-introduction-to-lstm/ 11. Heuer S, Tafo P, Holzmann H, Dahlke S (2019) New aspects in birdsong recognition utilizing the Gabor transform. In: Proceedings of the 23rd international congress on acoustics, Aachen 12. Bengio Y, Simard P, Frasconi P (1994) Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw 5(2):157–166 13. Guo Y, Liu Y, Bakker EM, Guo Y, Lew MS (2018) CNN-RNN: a large-scale hierarchical image classification framework. Multim Tools Appl 77:10251–10271. https://doi.org/10.1007/s11042-017-5443-x 14. https://www.sciencedirect.com/topics/engineering/audio-signal-processing 15. https://scottmduda.medium.com/urban-environmental-audio-classification-using-mel-spectrograms-706ee6f8dcc1 16. https://www.geeksforgeeks.org/introduction-to-recurrent-neural-network/ 17. https://towardsdatascience.com/recurrent-neural-nets-for-audio-classification-81cb62327990
Optimized Analysis of Emotion Recognition Through Speech Signals V. Kakulapati , Sahith, Naresh, and Swethan
Abstract An accurate recognition of the user's emotional state is a primary aim of the human interface. The most pressing concern in the field of speech emotion identification is how to efficiently combine the extraction of suitable speech characteristics with a suitable classification engine in a parallel fashion. In this study, emotion recognition through speech signals involves predicting human emotions from speech with a high level of accuracy. This technology improves human–computer interaction, although it is challenging to predict emotions due to their subjective nature and the difficulty of annotating audio. SER relies on various factors such as tone, pitch, expression, and behavior to determine emotions through speech. The process involves training classifiers with samples, and the RAVDESS dataset is used as an example in this work. Due to the wide range of vocal dynamics and pitch changes, emotion identification in spoken language is a difficult problem for automatic recognition. To overcome this, the convolutional neural network (CNN) method is utilized for speech emotion detection; this employs emotion recognition modules and learners to distinguish between states of happiness, surprise, anger, neutrality, and sorrow. The system's dataset is built from voice signals; the LIBROSA library is used to retrieve attributes from these samples. The highest precision may be attained through Adam optimization. Keywords Speech · CNN · LSTM · Adam · Feeling · Classification · Optimization · Prediction
V. Kakulapati (B) · Sahith · Naresh · Swethan Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, Telangana 501301, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_4
1 Introduction Humans' ability to talk to each other is both our most advanced and our most basic form of communication. As air moves from the lungs to the larynx through the trachea, it makes the vocal cords vibrate, which produces speech signals. Nowadays, we
have seen a rise in the investigation into the field of emotion recognition via speech. The biological sciences, psychophysiology, computer science, and artificial intelligence are all at the cutting edge of research into how to automatically identify and rate human emotions. People's feelings and other important attitudes can often be figured out just by watching how they act [1]. Voice-based emotion detection aims to identify the emotions conveyed by an individual through their voice. This system simplifies human–machine communication by utilizing audio files rather than traditional devices, enabling machines to understand and respond accordingly. Voice input contains valuable information such as age, gender, and emotion. In addition to this descriptive information, the goal of the system is to extract and interpret the specific emotion conveyed by the voice, allowing for appropriate responses. Figure 1 shows the proposed system framework. It is crucial to recognize that the same words can carry different emotions depending on how they are expressed. Thus, understanding the intended meaning and associated emotion becomes essential [2]. Emotion detection involves analyzing a person's current emotional state based on their voice. The process of emotion detection comprises two main steps: feature extraction and the application of an MLP classifier on the extracted features from the audio files. Various acoustical features contribute to our understanding of emotions, including energy, pitch, rhythm, and loudness. These features are extracted from the given audio files while eliminating unwanted noise signals through pre-processing techniques. Emotional nuance is conveyed well via spoken cues. Recently, scientists have been working on developing AI systems capable of identifying emotions based on a speaker's voice output. As the speaker's pitch and the length of each frame both affect the emotional content of the voice signal, it is vital to evaluate it at various time scales [3]. The emotional classifier must take into account the culturally and environmentally specific emotions that are encoded in each speech stream [4]. It is culturally and contextually specific how the speech makes the listener feel. CNN networks can normalize input frequencies and pick up on local details, whereas LSTM networks may be trained to extract and learn from acoustic and
Fig. 1 Proposed system framework
textual features. Because of the temporal nature of both spoken and written language, LSTM is well-suited for the extraction and learning of acoustic and textual features. Yet, the expansion in the variance of the hidden state components is not due to an intermediate nonlinear hidden layer. Several recent studies have turned to CNNs and LSTM networks [5, 6] to enhance voice emotion detection. In terms of representation capacity, DL (deep learning)-based techniques are preferable for SER. There have been several successful demonstrations of autonomous SER using deep learning algorithms like CNN and LSTM [7].
1.1 Challenges A significant challenge in this project is the presence of disturbances in many audio files, including background noise and low volume, which can impact the system’s accuracy. To address this, we are utilizing the RAVDESS dataset, publicly available and free of such disturbances, thereby enhancing the accuracy of our proposed system. Results reveal that the proposed speech emotion detection system performs better on the RAVDESS dataset with optimized features than the current state-of-the-art methods.
1.2 Benefits • Enhanced Customer Service: Emotion recognition in speech allows automated customer service agents to identify callers’ emotional states in real-time. This enables them to provide personalized and empathetic responses, resulting in improved customer satisfaction. • Adaptability and Customization: By analyzing emotions expressed through speech, systems can customize interactions based on individual needs and preferences, creating a more personalized user experience. • Improved Human–Computer Interaction: Emotion recognition in speech enables natural and intuitive interactions between humans and computers by allowing systems to understand and respond to emotions, enhancing engagement and user experience. • Employee Well-being and Productivity: Emotion recognition technology can monitor employees’ stress levels and emotional well-being, promoting a healthier work environment, providing support, and ultimately enhancing productivity and job satisfaction. • Mental Health Monitoring: Speech-based emotion recognition aids in the early detection of mental health conditions by identifying signs of distress, anxiety, or depression, enabling timely intervention and support.
• Market Research and Sentiment Analysis: Emotion recognition in speech can be used to analyze customer feedback, reviews, and social media comments, providing businesses with insights into consumer sentiment, preferences, and trends for data-driven decision making and product/service improvement. • Assistive Technology: Emotion recognition through speech benefits individuals with disabilities, such as autism or speech impairments, by aiding in emotion understanding and communication, fostering inclusivity and support. • Educational Applications: Emotion recognition technology in educational settings helps understand students' engagement levels, attention, and emotional responses, enabling tailored teaching methods and targeted support to enhance learning outcomes. • Entertainment and Gaming: Emotion recognition in speech adds interactivity and immersion to entertainment and gaming experiences. Systems can adapt gameplay or content based on players' emotional reactions, creating dynamic and engaging experiences. • Security and Fraud Detection: Emotion recognition through speech can be utilized in security systems to detect suspicious behavior or potential threats by analyzing emotional cues exhibited in speech patterns, aiding in fraud detection, and preventing unauthorized access. The article is arranged as follows: Sect. 2 provides relevant research on the automatic identification of speech emotions. The proposed system is described in detail in Sect. 3. The results of the experiment are discussed in Sect. 4. The research concludes in Sect. 5, followed by directions for future research.
2 Related Work Spectral regression [8] is a generalized model that makes use of the connections between extreme learning machines (ELMs) and subspace learning (SL). The hope was that this model would compensate for the weaknesses of spectral regression-based GE (graph embedding) and ELM. The effectiveness and viability of the methodologies were evaluated in comparison to standard methods by demonstrating their use across four speech emotion corpora. The researchers, Zhaocheng Huang et al., used a token-based, heterogeneous approach to identify depressed speech. Sharp transitions and auditory regions were determined independently and together in fusions of several embedding techniques. Methods developed for identifying depressive disorders and, presumably, other medical conditions with an impact on voice production were implemented. To recognize voices from different corpora, the Transfer Linear Subspace Learning (TLSL) framework [9] might be used. Strong representations of attributes over corpora are what TLSL hopes to extract and place in the trained estimated subspace. Current transfer learning methods, which only look for the most transferable parts of a trait, benefit from this development. TLSL is much superior to the other
transfer learning methods based on the transformation of characteristics. However, a major limitation of TLSL is that it prioritizes seeking the transportable components of traits while ignoring the less useful portions. Emotional voice recognition in several databases is made possible with the help of TLSL. Researchers in [10] used LSTM to analyze the feelings communicated by long stretches of text and found that LSTM performed better than conventional RNN in making these determinations. CNN and LSTM [11] have been used to extract high-level properties from raw audio data for use in speech emotion recognition. The greater processing power and storage capacities of LSTM make it a potential solution to the gradient explosion or disappearance problem faced by standard RNNs. As speech is a nonlinear time-series transform signal and text information is tightly related to temporal context, the LSTM network is well-suited for retrieving and classifying acoustic and textual characteristics that model context and aid in comprehending the value of features. Nevertheless, there is no nonlinear hidden layer in the center, which would provide additional variability for the hidden state components [12]. The automated evaluation method in [13] used narrative speech data from Cantonese-speaking people with aphasia (PWA). The linguistic abnormalities in aphasic speech were identified by analyzing the textual features of the speech data. The Siamese network's learned text features were shown to have a substantial correlation with the AQ scores. The performance of automated speech recognition (ASR) on disordered speech and other languages has to be improved, and a substantial collection of such data should be accumulated for the automatic categorization of aphasia subtypes before this approach may be used more widely. The effectiveness of speech emotion recognition (SER) [14] was examined concerning voice bandwidth reduction and the low companding process utilized in transmission systems. When tested using data from Berlin's EMO-DB, which classifies people's feelings into seven categories, the baseline method yielded an average accuracy of 82%. Reduced sample frequency resulted in a 3.3% drop in SER accuracy, whereas the companding technique alone resulted in a 3.8% drop in average accuracy. Emotional labels were created at a rate of once per 1.033–1.026 seconds thanks to the SER's real-time implementation. The schedule for implementing the change in real time is detailed. Researchers used deep neural networks [15] like CNN, CRNN, and GRU to identify the following feelings: anger, happiness, sadness, and apathy. Mel spectral coefficients and other characteristics associated with the voice signal's spectrum and intensity were employed as feature parameters. The data was augmented by adding white noise to the voice recordings. When compared to previous research, the GRU model's average recognition accuracy of 97.47% was the highest.
3 Methodology The system is given a set of training data that includes emotion labels, which are used to train the network weights. A sound file is read as input. Intensity normalization is then performed on the audio. To prevent the training performance from being negatively impacted by the presentation order of the examples, the normalized audio is employed for training the convolutional network. It is via this training method that the sets of weights are produced that provide optimal outcomes. The dataset provides the system with pitch and energy during testing, and then, using the trained network's final weights, it outputs the associated emotion. There are five possible expressions, and each of them has a corresponding numerical result. Figure 2 shows the workflow of the proposed system. CNNs process and classify images by treating them as arrays of pixels. The dimensions of the image depend on its resolution, represented as h * w * d, where h represents height, w represents width, and d represents the depth. For example, an RGB image is represented as a 6 * 6 * 3 array, while a greyscale image is represented as a 4 * 4 * 1 array. In CNNs, input images go through a series of convolution layers, pooling layers, fully connected layers, and filters (also known as kernels). The Softmax function is applied to classify objects with probabilistic values between 0 and 1. CNNs are particularly valuable for tasks related to images, as well as time-series data. They excel in image recognition, object classification, and pattern recognition. These networks leverage concepts from linear algebra, such as matrix multiplication, to identify patterns within an image. Additionally, CNNs can also classify audio and signal data. Machine learning classifiers, including CNN and LSTM, were explored, as were a few others. CNN performed the best in our tests, with an 82% success rate. Using the learned model for prediction has improved results, but more importantly, it automatically learns from the data without any human domain expertise. Table 1 shows the comparative analysis of existing systems.
Fig. 2 Workflow of the proposed system
Table 1 Comparative analysis of existing systems

S. No | Methodology and tools | Accuracy          | Results
1     | CNN classifier        | Approximately 82% | It is good when clustering
2     | LSTM classifier       | 62%               | It performs average
Adam: Adam, an enhanced gradient descent approach that incorporates adjustable learning rate and momentum, is another option. Similar to AdaDelta and RMSProp, Adam also preserves an exponential decay average of previously squared gradients, while also keeping an approximate mean of past gradients.
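The moment estimates just described can be made concrete with the standard Adam update rule. The sketch below applies it to a toy quadratic objective; the hyperparameter values are the commonly used defaults and the objective is purely illustrative, not taken from this study.

```python
# Sketch of the Adam update rule: a decaying mean of gradients (first moment)
# and of squared gradients (second moment), with bias correction.
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad          # decaying mean of past gradients
    v = beta2 * v + (1 - beta2) * grad ** 2     # decaying mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps) # adaptive parameter update
    return w, m, v

# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2 * (w - 3)
    w, m, v = adam_step(w, grad, m, v, t)
print(round(w, 3))   # approaches 3.0, the minimizer of the toy objective
```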
4 Implementation Results Figure 3 shows the count of different emotions in the dataset.
4.1 Data Collection The RAVDESS dataset has a total of 1440 files, with 24 professional actors (12 men and 12 women) and 60 trials per actor. The performers use a neutral North American dialect to deliver two paired lines while evoking seven distinct feelings via their voices: serenity, happiness, sadness, anger, fear, surprise, and disgust. There are three different expression intensities available for each emotion: normal, strong, and neutral. • Various features like Mel-frequency cepstral coefficients (MFCC), Chroma, and Mel are extracted from the audio files. These extracted features are used in model training and testing. • After extracting the features, the data is split 75/25: 75% is used for training the model and the remaining 25% is used for testing it (a code sketch follows below). • Once the model is trained, it is deployed on various test audio files. Figure 4 shows the digital representation of audio in waveform.
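A minimal sketch of the 75/25 split described above, assuming the features and numeric emotion labels have already been extracted into arrays (the file names and variable names are placeholders, not artifacts from the study):

```python
# Sketch of the 75/25 train/test split (feature and label arrays assumed).
import numpy as np
from sklearn.model_selection import train_test_split

features = np.load("ravdess_features.npy")   # placeholder: extracted MFCC/Chroma/Mel features
labels = np.load("ravdess_labels.npy")       # placeholder: numeric emotion labels

x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=42, stratify=labels)

print(x_train.shape, x_test.shape)           # 75% for training, 25% for testing
```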
Fig. 3 Count of different emotions in the dataset
Fig. 4 Digital representation of audio in waveform
4.2 Data Pre-processing In Python, the librosa library is commonly used for audio signal processing tasks such as feature extraction from sound files. Librosa allows for the extraction of various features including MFCC, Chroma features, and Mel spectrograms. Before extracting features, it is common to pre-process the audio data by removing noise using the noise reduction functions provided by librosa. The librosa module is utilized to convert audio files into digital data, employing various features such as the Mel feature, which captures frequency characteristics represented on the Mel scale; the MFCC feature, which describes the short-term power spectrum of the input audio file; and the Chroma feature, which captures melodic and harmonic characteristics based on pitch. After extracting these features, different classifiers can be applied to match them with corresponding emotions. MFCCs are commonly used audio features that capture information about the spectral shape of the audio signal. They are computed by first applying a Fourier transform to the audio signal to obtain the power spectrum. In order to approximate the frequency resolution of human hearing, the power spectrum is mapped onto the Mel scale. Finally, the logarithm of the Mel spectrogram is transformed using a cosine transform to obtain the MFCCs. Chroma features capture information about the pitch class of the audio signal. They are computed by first dividing the audio signal into short frames, typically 20–30 ms long. In order to assign each frame's power spectrum to one of 12 pitch classes (matching the notes in the Western music system), a series of triangle filters is used. The chroma features are then obtained by summing the power spectrum values within each pitch class. Mel spectrograms are similar to traditional spectrograms, but they use a Mel scale to represent the frequency axis instead of a linear scale. Mel spectrograms are commonly used as input features for machine learning models in audio classification and other audio-related tasks.
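The feature extraction just described can be sketched with librosa, computing MFCC, chroma, and Mel-spectrogram features and averaging each over time into a single vector per file. The file name, sampling rate, and number of coefficients are illustrative assumptions rather than the exact settings used in the study.

```python
# Sketch of MFCC / chroma / Mel feature extraction with librosa (assumed parameters).
import librosa
import numpy as np

def extract_features(path, sr=22050, n_mfcc=40):
    y, sr = librosa.load(path, sr=sr)

    # MFCCs: cosine transform of the log Mel spectrogram (spectral shape).
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

    # Chroma: spectral energy folded into 12 pitch classes, from the power STFT.
    stft_power = np.abs(librosa.stft(y)) ** 2
    chroma = librosa.feature.chroma_stft(S=stft_power, sr=sr)

    # Mel spectrogram: power spectrum mapped onto the Mel frequency scale.
    mel = librosa.feature.melspectrogram(y=y, sr=sr)

    # Average each feature over time to obtain one fixed-length vector per file.
    return np.hstack([mfcc.mean(axis=1), chroma.mean(axis=1), mel.mean(axis=1)])

vector = extract_features("03-01-05-01-01-01-01.wav")  # placeholder RAVDESS file name
print(vector.shape)   # (40 + 12 + 128,) = (180,)
```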
Fig. 5 Comparison of proposed model accuracy
Overall, the use of librosa and these various feature extraction techniques enables the analysis of audio data and the development of various audio-related applications. Figure 5 shows the comparison of proposed model accuracy, and Fig. 6 shows the comparison of the proposed model loss function. Table 2 shows the assigned numbers to different emotions
Fig. 6 Comparison of the proposed model loss function
Table 2 Assigned numbers to different emotions
Figure 7 shows the CNN-LSTM with ADAM and RMSProp optimization epochs.
• LSTM accuracy: 51.85%
• RMSProp optimizer accuracy: 61.8%
Adam optimization seemed to be the best fit for our CNN model from both experiments on optimization functions, with an accuracy of 81.76%.
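The optimizer comparison can be reproduced in outline with a small Keras CNN-LSTM trained twice, once per optimizer. The layer sizes, the eight output classes, and the 180-dimensional feature input are assumptions for illustration and not the exact architecture used in the study.

```python
# Sketch of a 1D CNN-LSTM for emotion classification, compiled once with Adam
# and once with RMSProp for comparison (architecture details are assumptions).
import tensorflow as tf

NUM_EMOTIONS = 8        # assumption: RAVDESS emotion categories
FEATURE_DIM = 180       # assumption: length of the extracted feature vector

def build_model():
    return tf.keras.Sequential([
        tf.keras.Input(shape=(FEATURE_DIM, 1)),
        tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(NUM_EMOTIONS, activation="softmax"),
    ])

for name, optimizer in [("adam", tf.keras.optimizers.Adam(learning_rate=1e-3)),
                        ("rmsprop", tf.keras.optimizers.RMSprop(learning_rate=1e-3))]:
    model = build_model()
    model.compile(optimizer=optimizer,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(x_train[..., None], y_train, validation_split=0.1, epochs=50)
    print(name, "model ready:", model.count_params(), "parameters")
```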
5 Performance Analysis Emotion recognition in speech allows automated customer service agents to identify callers’ emotional states in real time. This enables them to provide personalized and empathetic responses, resulting in improved customer satisfaction. By analyzing emotions expressed through speech, systems can customize interactions based on individual needs and preferences, creating a more personalized user experience. Emotion recognition in speech enables natural and intuitive interactions between humans and computers by allowing systems to understand and respond to emotions, enhancing engagement and user experience. Emotion recognition technology can monitor employees’ stress levels and emotional well-being, promoting a healthier work environment, providing support, and ultimately enhancing productivity and job satisfaction. Speech-based emotion recognition aids in the early detection of mental health conditions by identifying signs of distress, anxiety, or depression, enabling timely intervention and support. Several learning methodologies, including Adaptive Moment Estimation (ADAM) and root mean square propagation (RMSProp) optimization algorithms, are used to evaluate the efficiency of the proposed approach. The suggested model enhances the accuracy of the SER for both the ADAM and RMSProp algorithms. The RAVDESS dataset offers SER accuracy of 61.8% for the RMSProp optimization approach and 81.76% for the ADAM algorithm.
6 Conclusion Emotion recognition using speech involves identifying human emotions through speech patterns. To achieve accurate results, it is important to have a high-quality database with clear, noise-free recordings of actors’ voices. Various methods for emotion recognition using speech have been developed, including feature extraction from speech samples and the use of CNNs and LSTM to classify emotions. While
Fig. 7 CNN-LSTM with ADAM and RMSProp optimization epochs
many different audio features can be used to recognize emotions, feature extraction using MFCCs has proven particularly effective in identifying emotions through speech. In an optimized analysis of emotion recognition through speech signals, the CNN classifier gave more accurate results than LSTM. This result recommends a technique for identifying emotions from an audio clip. The user trains the system by adding recordings of the sound and an emotional label to a database, and the
system then goes through two rounds of training and assessment. The suggested technique performs an outstanding job, with a high accuracy rate, in emotion detection compared to earlier efforts.
6.1 Limitation
• Building a flawless system for emotion recognition is challenging.
• Speaking loudly can cause disturbance and inconvenience to others.
• Filtering out background noise is a demanding task, even for humans.
• Environmental factors, such as background noise, can affect the accuracy of this biometric.
• Emotion recognition technology may compromise privacy, especially in crowded environments.
• Errors and misinterpretations of words can occur in the process.
7 Future Enhancement In the future, emotional state recognition will be done using multimodal analysis. Emotion recognition accuracy may be estimated using voice signals together with other classification algorithms based on RNNs and probabilistic neural network techniques. Optimization strategies can be used to enhance model efficiency.
References 1. Nardelli M, Valenza G, Greco A, Lanata A, Scilingo EP (2015) Recognizing emotions induced by affective sounds through heart rate variability. IEEE Trans Affect Comput 6(4):385–394. https://doi.org/10.1109/TAFFC.2015.2432810 2. Kakulapati V et al (2022) Multimodal analysis of cognitive and social psychology effects of COVID 19 victims. Book series of Springer entitled. In: Decision sciences for COVID-19. International series in operations research & management science, vol 320. Springer, Cham. https://doi.org/10.1007/978-3-030-87019-5_15 3. Sun Y, Wen G (2017) Ensemble softmax regression model for speech emotion recognition. Multim Tools Appl 76(6):8305–8328 4. Ghai M, Lal S, Duggal S, Manik S (2017) Emotion recognition on speech signals using machine learning. In: Proceedings of international conference on big data analytics and computational intelligence (ICBDAC), pp 34–39 5. Ge R, Wang CH, Xu X et al (2017) Action recognition with hierarchical convolutional neural networks features and bi-directional long short-term memory model. Control Theory Appl 34(6):790–796 6. Zhao J, Mao X, Chen L (2019) Speech emotion recognition using deep 1D & 2D CNN LSTM networks. Biomed Signal Process Control 47:312–323
7. Amin KR, Jones E, Babar MI, Jan T, Zafar MH, Alhussain T (2019) Speech emotion recognition using deep learning techniques: a review. IEEE Access 7:117327–117345 8. Xu X, Deng J, Coutinho E, Wu C, Zhao L, Schuller B (2018) Connecting subspace learning and extreme learning machine in speech emotion recognition. IEEE Trans Multim 795–808. https://doi.org/10.1109/TMM.2018.2865834 9. Song P (2017) Transfer linear subspace learning for cross-corpus speech emotion recognition. IEEE Trans Affect Comput 265–275. https://doi.org/10.1109/TAFFC.2017.2705696 10. Li D, Qian J (2016) Text sentiment analysis based on long short-term memory. In: Proceedings of the 2016 first IEEE international conference on computer communication and the internet (ICCCI). Wuhan, China, pp 471–475 11. Zhao J, Mao X, Chen L (2018) Learning deep features to recognize speech emotion using merged deep CNN. IET Signal Proc 12(6):713–721 12. Sainath TN, Vinyals O, Senior A, Sak H (2015) Convolutional, long short-term memory, fully connected deep neural networks. In: Proceedings of the 2015 IEEE international conference on acoustics, speech and signal processing (ICASSP). Brisbane, Australia, pp 4580–4584 13. Qin Y, Lee T, Kong APH (2020) Automatic assessment of speech impairment in Cantonesespeaking people with Aphasia. IEEE J Sel Top Signal Process 14(2):331–345. https://doi.org/ 10.1109/JSTSP.2019.2956371. Epub 2019 Nov 28. PMID: 32499841; PMCID: PMC7271834 14. Margaret L et al (2020) Real-time speech emotion recognition using a pre-trained image classification network: effects of bandwidth reduction and companding. Front Comput Sci 2. https:// doi.org/10.3389/fcomp.2020.00014. ISSN: 2624-9898 15. Trinh Van L, Dao Thi Le T, Le Xuan T, Castelli E (2022) Emotional speech recognition using deep neural networks. Sensors 22:1414. https://doi.org/10.3390/s22041414
Facial Emotion Recognition Using Chatbot and Raspberry Pi Sunil Bhutada, Meghana Madabhushi, Satya Shivani, and Sindia Choolakal
Abstract A chatbot is computer software that can analyze human language, simulate a conversation with a specific user, and have the potential to interact with other bots that are available. The analysis of various emotions resulting in reactions conditioned by an emotion will be performed with a combination of Natural Language Processing and machine learning approaches. This chatbot will serve as a base. The user’s emotions can be read from their facial expressions. These expressions can be produced from either the live feed provided by the system’s camera or any previously stored image. Human emotions may be identified, and their study has an extensive impact. Python, open-source Computer Vision library (Open CV), NumPy, pandas, Docker, and TensorFlow will all be used in the project’s implementation. In order to predict emotion, the training dataset is compared to the scanned image (testing dataset). The major goal is to develop more effective and dependable chatbots and to deliver improved user feedback with the intention of responding in line with the user’s emotions. Keywords Emotion detection · Facial recognition · CNN · Webcam · Facial emotion recognition
1 Introduction Many companies in the real world want to offer chatbots to their clients and representatives that can respond to questions, enable self-service, and showcase their products and services.
S. Bhutada (B) Department of Information Technology, Sreenidhi Institute of Science and Technology (Autonomous), Hyderabad, India e-mail: [email protected] M. Madabhushi · S. Shivani · S. Choolakal Industry Professional, Marlabs, Bangalore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_5
Implementing and maintaining chatbots by hand costs time and money. Organizations regularly expose web APIs for their services, which are frequently documented with an API specification. This paper also outlines a future application of our project in which a psychiatrist can use this chatbot to provide psychological help to people who are depressed because of being alone. Facial expressions can be extracted from the real-time feed provided by the system's camera, from any earlier saved image, or captured using a Raspberry Pi camera; here, a camera is connected to the Raspberry Pi on a port [11]. Human emotions are recognisable and offer a wide range of study potential in computer vision, a topic on which numerous studies have previously been conducted. Python [2], OpenCV [3], and NumPy have all been used in the research that went into this paper. Emotion is predicted by comparing the scanned image (testing dataset) to the training dataset. The ultimate objective of this study was to develop a system that could assess a photograph and predict a person's expression. The study suggests that this process is practical and yields precise results. This work gives us the full picture for our face-learning phase. To label emotion categories, we apply sentiment analysis on the dataset and obtain dialogue with the corresponding emotion. We first adopt a vanilla sequence-to-sequence neural network and modify it for sentiment-level relations. Furthermore, we exploit deep reinforcement learning and introduce sentiment rewards during the learning stage. To better incorporate the emotion tag and shorten training time compared with deep reinforcement learning, we attempt, as a third technique, to train an emotional neural chatting machine. We introduce emotion embedding, internal memory, and external memory to pass the information flow and emotion category sensibly to the final output.
2.1.4 Journal Name—Chatbot Using Knowledge in Database Author Name—Setiaji Bayu and Wibowo Ferry Year—2017. A chatterbot, or chatbot, seeks to create an interaction between a machine and a human. The machine has built-in knowledge to recognize sentences and draw its own conclusions in response to an inquiry. The response principle is to match the user's input. To determine how similar the reference sentences are to the input text, a score is calculated; the higher the score, the more similar the reference sentences are. In this study, the calculation of sentence similarity makes use of bigrams, which separate the input text into pairs of letters. The database contains the chatbot's knowledge. Relational database management systems (RDBMS) have a core that the chatbot accesses through an interface. The database is used to store knowledge, and the interpreter is used to access stored functions and procedure sets for pattern matching. The interface is standalone and was created using the Pascal and Java programming languages [1–4]. A conversational agent is a computer program that uses natural language processing (NLP) [4] to facilitate intelligent conversations with users. A chatbot is a computer program that communicates with users using text. These programs frequently pass the Turing test because they closely replicate how a human would act as a conversational partner. Chatbots are frequently employed in dialog systems for a variety of useful tasks, such as collecting data or customer assistance.
Some chatbots utilise sophisticated natural language processing systems, while many less complex ones just look for keywords in the input and then use an ontology-based
technique to retrieve a response from a database that contains the most relevant keywords or the most comparable phrase patterns [5–8]. Many chatbots are of little use and can easily irritate the user. This happens due to the lack of the basic component of human interaction known as emotions. The chatbots that are currently on the market are made to respond to certain questions, which can occasionally be quite unpleasant and hurt a person's most fundamental emotions. Chatbots are now implemented in popular messaging platforms such as Facebook Messenger and Telegram, among others. Organizations need to offer chatbots to their clients and workers that can address questions, enable self-service, and showcase their products and services. The disadvantage here is that these bots do not actually consider what the user is thinking at that time; they know only how to reply to a certain set of pre-defined questions. Such chatbots are easily distinguishable as bots when interacting with the user. The chatbot should be more considerate of the user's emotion at the time of interaction so as to converse more reliably [9, 10].
2 Methodology 2.1 Web Camera A Web camera is used in the project to capture the input required for detecting the user's face. The live feed of the user is taken through the Web camera and is processed further with algorithms to provide the desired output.
2.2 HTML5 Supported Browser The browser should support HTML5 in order to run PHP scripts integrated with Python. In the case of our project, Google Chrome is used in all scenarios.
2.3 Python Executable Notebook (Spyder) Spyder is a free, cross-platform integrated development environment (IDE) for Python programming. Spyder integrates with numerous notable Python libraries, such as NumPy, SciPy, Matplotlib, and pandas, to create Python code that works in a variety of settings. It is distributed under the MIT licence.
2.4 Integrating Environment (Anaconda) With the goal of streamlining package management and deployment, Anaconda is a free and open-source version of the Python and R programming languages for scientific computing (data science, machine learning applications, large-scale data processing, predictive analytics, etc.). The “conda” package management system oversees package version control.
2.5 Image Dataset The image dataset helps the system understand the emotion of the user based on a previously saved image or the live feed image. The image dataset acts as a key factor for the deployment of our project. The more training data we use, the more refined the output will be.
2.6 Configured NLP Data In order to create a structured representation of the text that comprises spoken language understanding (SLU) or, in the case of text input, natural language understanding (NLU), natural language processing (NLP) must start with the unstructured output of the ASR. In the following part, we look at various techniques for obtaining semantic information and meaning from spoken and written language in order to produce grammatical data structures that the Dialogue Management unit can use in the following phase. Speech may include (i) identity-specific encodings in addition to meaning-encodings (such as pitch, tone) and (ii) environmental noise, making this problem challenging. Similarly, a chatbot may receive speech or text input that contains (iii) grammatical errors, (iv) interruptions, and (v) self-corrections.
2.7 Non Functional Requirements Almost no non-functional criteria are needed. The service’s availability is not a top priority; however, chatbot software may be scaled similarly to other software, and availability can be ensured by using redundancy. Since messaging platforms serve as an intermediary between users and chatbot software, most of these platforms also resend any messages that were missed in the event that the chatbot was not available. The fact that the platform guarantees availability makes addressing it in the chatbot software itself less important. Similar to that, security is not a primary concern in
this case because the messaging platform already handles security-sensitive operations like authentication and communication encryption. However, safeguarding the service would need to be done with more care in a production setting. Performance is also not a top priority. Due to the example application’s constrained scope, the domain-specific logic remains inexpensive in computation.
2.8 Functional Requirements The key functional requirements can be identified as follows. The chatbot must respond when a conversation is initiated; at its most basic level, this requirement is the core functionality of the bot. The bot must respond to the user when the user enters the conversation. This sets the conversation up for the user and allows them to use the bot as they desire. As this is a fundamental requirement for the application to work, it is considered a high priority. The chatbot must work with a web page. This requires the bot to be configured with a Page, obtaining a PageId and a page access token. All of this information needs to be fed into the Bot Framework website to configure it for access from the web page. This requirement gives the bot portability and cross-platform functionality. The chatbot must work on a web client. From our market research, it was noted that although the majority would prefer a mobile client, a number of people would also like to see a web client for the bots. The bots need to be configured on the Bot Framework platform to allow web chat embedding.
2.9 Risk Analysis The key risk is the bot failing to gauge the emotions successfully; in that case, the bot may prompt the user either to add a dataset for the emotions of his/her face type or to continue without adding data while confirming his/her current emotion with the chatbot via text input. The second risk is that the bot can be hacked into; as the chat with the user should remain confidential, all the messages will be encrypted. The third risk concerns the NLP datasets, in which case the bot would not understand the user and would prompt him/her to rephrase the sentence. The fourth risk is gauging the user's emotion incorrectly, in which case the conversation will not go as expected.
2.10 Purpose of the Document The aim of this project is to develop a chatbot, which utilizes an emotional analysis database when interacting with a user. Rationale: Computers lack common sense.
It would be interesting to see if a computer could mimic the association patterns humans have using common sense knowledge.
2.11 Scope We will try to create a chatbot that can use an emotional database to give straightforward but pertinent answers when speaking. We must correctly parse human face detection and identify the most pertinent terms while keeping some context in mind. A very straightforward response would be to merely state a relevant fact, as in: Human: “I reside near the sea.” Fish live in the ocean, says the chatbot. However, we aim to produce a more lively reaction: Human: “I reside near the sea.” That’s absolutely nice to hear, said the chatbot. Success standards: The database can be used by the chatbot to produce responses that are both pertinent and (roughly) grammatically sound.
2.12 Risks The parsing of the human face detection input needs to be very good for this to work well. The chatbot needs to understand which emotional keywords are relevant. It will also be hard to use the database to generate responses that seem natural.
2.13 Constraints We have a very limited amount of time. We have no experience with natural language processing.
2.14 Quality This is a research project, which will examine the possibilities of increasing the quality of chatbot responses. We are therefore striving for as much quality as possible in regards to the chatbot(s) responses. We will mainly measure this quality by subjective assessments of the responses the chatbot gives. We and other test subjects will provide these assessments.
2.15 Problem Statement Develop a method of integrating a face detection algorithm in an emotional analysis database with a chatbot such that it would be able to form relatively coherent messages based on words associated with the user’s input.
3 System Framework The chatbot's system architecture is that of a conversational system. It is an AI/ML-driven design, and the bot should keep track of the conversation's state and reply to user requests in light of the current situation. In contrast to a normal state-machine setup built by coding if-else conditions for every conceivable state of the conversation, the model learns its actions from the training data provided. Figure 1 gives the system architecture, and Fig. 2 provides a high-level overview of this chatbot architecture. Machine learning-based intent classification is a component of the chatbot architecture. Although rule-based heuristics produce excellent results, the problem is that each and every example must be changed manually. This is a tedious task, especially if the chatbot must identify numerous intents for various scenarios. The entire intent-classification process is therefore based on machine learning, which allows the bot to evolve. Because it has a training set of thousands of examples, the chatbot can learn patterns of information and conversation from it. A well-known machine-learning package called "Scikit-learn" aids in the execution of machine learning algorithms, and one or several cloud APIs, such as api.ai or wit.ai, can also be integrated. The image is then pre-processed [1] to lessen unwanted variations; this includes any brightness and greyscale adjustments made to the image. Following that, "Facial feature extraction" is used to mark facial characteristics such as the face, mouth, and eye regions.
Fig. 1 System architecture
Fig. 2 Chat bot architecture
Then, we categorize various emotions using the features of the lips and eyes.
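To make the scikit-learn-based intent classification mentioned above concrete, a minimal sketch is given below; the intents and training utterances are invented purely for illustration and are not the system's actual training data.

```python
# Minimal intent-classification sketch with scikit-learn.
# The intents and example utterances below are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

training_utterances = [
    ("hi there", "greeting"),
    ("hello bot", "greeting"),
    ("i feel very sad today", "share_feeling"),
    ("i am so happy right now", "share_feeling"),
    ("bye for now", "goodbye"),
    ("see you later", "goodbye"),
]
texts, intents = zip(*training_utterances)

# TF-IDF features plus a linear classifier stand in for the intent model.
intent_clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                           LogisticRegression(max_iter=1000))
intent_clf.fit(texts, intents)

print(intent_clf.predict(["hello, how are you"]))   # -> ['greeting']
```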
3.1 System Architecture of Face Emotion Detection Edge detection is the main step in recognizing the emotions of the user, and the edges are detected using a Gabor filter. This is done by using the dlib library to mark out facial coordinates at different positions on the face. The edges obtained at the end support feature extraction from image regions such as the eyes and lips. Figure 3 shows the architecture of face emotion detection.
4 Implementation 4.1 Module Description Emotional chatbot can be broken down into two main modules.
4.1.1 Emotion Detection
In order to execute the emotion detection, a live visual input must first be converted to a greyscale image, which is then processed in the open computer vision library
(OpenCV) to determine the user's feelings.
Fig. 3 Architecture of face emotional detection
For greyscale image conversion from a colour image, two processes can be used.
• Common Approach: The average of the three colours—Red, Blue, and Green—present in a colour image is calculated using the Average method. The result is Grey-scale = (R + G + B)/3. However, occasionally we receive a dark image instead of a greyscale one. This is due to the fact that in the converted image, Red, Blue, and Green each make up 33%. Thus, to resolve this issue, we employ the second approach, known as the weighted or luminosity method.
• Weighted or Luminosity Method: We employ the luminosity approach to resolve the issue in the Average approach. This approach, as in Fig. 4, decreases the amount of Red and increases the presence of Green, with the percentage of Blue falling between these two colours. Grey scale is equal to (0.3 * R + 0.59 * G + 0.11 * B). We use this because of these colours' wavelength patterns; blue has the shortest wavelength.
The D-lib library, which aids in face detection and is used to estimate the location of 68 co-ordinates (x–y) on the human face that map the facial points on a person's face to detect the emotions (DLIB CO-ORDINATES), processes the image after it has been converted to greyscale. After marking the face point coordinates, the Gabor filters are used for edge detection, joining the co-ordinates, and marking the facial expressions. Gabor filters operate on a kernel of the form g(x, y) = exp(−(x′² + γ²y′²)/(2σ²)) · cos(2πx′/λ + ψ), where x′ = x cos θ + y sin θ and y′ = −x sin θ + y cos θ. Sigma (σ) is the standard deviation of the Gabor filter's Gaussian function. Theta (θ) is the orientation of the normal to the Gabor function's parallel stripes. Lambda (λ) is the wavelength of the sinusoidal factor. Gamma (γ) is the spatial aspect ratio, and psi (ψ) is the phase offset.
Fig. 4 Weighted or luminosity
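The steps above can be sketched in a few lines of Python; the Gabor parameter values, the input file name, and the use of the publicly available 68-landmark model file are illustrative assumptions rather than the project's exact settings.

```python
# Sketch of the pre-processing pipeline: luminosity grey-scale conversion,
# dlib 68-point landmarks, and a Gabor kernel for edge detection.
# Parameter values and the input/model file paths are placeholders.
import cv2
import dlib
import numpy as np

frame = cv2.imread("face.jpg")                      # BGR frame from camera or file

# Weighted (luminosity) grey-scale: 0.3*R + 0.59*G + 0.11*B
b, g, r = cv2.split(frame.astype(np.float32))
gray = (0.3 * r + 0.59 * g + 0.11 * b).astype(np.uint8)

# 68 facial landmarks with dlib (the model file must be downloaded separately)
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
for face in detector(gray):
    shape = predictor(gray, face)
    points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]

# Gabor kernel built from sigma, theta, lambda, gamma, psi (values illustrative)
kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=np.pi / 4,
                            lambd=10.0, gamma=0.5, psi=0)
edges = cv2.filter2D(gray, cv2.CV_8UC1, kernel)
```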
4.2 Chatbot An artificial intelligence-based chatbot is computer software that can communicate via text or audio. Even though the computer is incapable of understanding the output it produces, these programs frequently attempt to replicate human-like interaction with the user. The primary goal of a chatbot is to pass the Turing test. Customer service and information collection are two common uses for chatbots. A reply with the most relevant keywords or similar paraphrasing patterns is pulled from a database, either by chatbots using advanced natural language processing (NLP) algorithms or by scanning the input for keywords. The chatbot is built with the aid of the ChatterBot library, which helps personalize the NLP of the chatbot so it can paraphrase sentences according to the detected emotion.
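A minimal sketch of how ChatterBot can be combined with the detected emotion is shown below; the training sentences, the emotion labels, and the simple prepend-the-emotion heuristic are illustrative assumptions, not the authors' actual implementation.

```python
# Sketch: training ChatterBot so replies can be steered by the detected emotion.
# The training lists, emotion labels and the prepend heuristic are placeholders.
from chatterbot import ChatBot
from chatterbot.trainers import ListTrainer

bot = ChatBot("EmotionAwareBot")
trainer = ListTrainer(bot)

# One short conversation list per emotion the detector can report.
trainer.train(["I feel great today", "That is wonderful to hear! Tell me more."])
trainer.train(["I am feeling low", "I am sorry you feel that way. I am here to listen."])

def reply(user_text, detected_emotion):
    # Crude heuristic: prefix the text so matching favours the right training list.
    if detected_emotion == "sad":
        user_text = "I am feeling low. " + user_text
    return bot.get_response(user_text)

print(reply("Nothing went right at work", detected_emotion="sad"))
```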
Fig. 5 Emotion detection
4.3 Database Design The datasets for training purposes have been taken from Kaggle and GitHub, as in Fig. 5. The emotion detection image dataset has been taken from Kaggle and is under the MIT licence. Unit Testing The testing for emotion detection was done by getting the live image output with Gabor filter lines to check that the code was working correctly on the facial expressions and that there was no error in emotion detection. Integration Testing Once the emotion is detected, it is integrated with the chatbot module; the chatbot is run, and it starts to converse with the user based on the detected emotion. Size (LOC) The current size of the application stands at 1000 LOC for emotion detection and 400 LOC for the chatbot, a total of 1400 LOC.
4.4 Cost Analysis • The project will be covered over a span of 2 months in a linear fashion of software development.
• Since a two-developer team has been allocated to the project, the cost includes the salary of two developers for two months (60,000 × 2 × 2 = 240,000). • Additional cost of any paid modules that may be required. • Apart from this, all the software used in this project is free to use under its published licences. • Cost incurred if the research paper needs to be published.
4.5 Mc Call’s Quality Factors 1. Correctness: The algorithm detects the emotions of the user, and the detected emotions are accurate with 76% accuracy as the datasets are based on western faces and the test is being done on Asian people. 2. Reliability: The emotions detected are also set with a neutral expression element so that if the emotions are not detected the chatbot converses naturally with the user. 3. Efficiency: The software has minimal usage of resources as it can even run without a GPU without major lags. 4. Maintainability: • The chatbot’s NLP can be configured again if there is some issue or new paraphrasing of sentences needs to be done. • No separate maintenance is required for the algorithm. 5. Testability: The detected emotion is displayed in a live feed window so the user can see what emotion is being detected. Then when the user converses with the system, the chatbot is put to test. 6. Portability: The algorithm is portable to be run on any system with working Python modules. 7. Reusability: The chatbot with capability to work according to the emotion of the user has endless practical application, and since the code is in Python, it can be integrated with any application for future.
5 Results and Analysis The result of the project is a chatbot, which converses with humans based on the emotions it has detected by the user’s facial expressions. The motive of such a chatbot can be to pass the Turing test.
6 Conclusion A more responsive AI system can be made by integrating chatbots with machine learning, which is a relatively new concept in the market. In order to give the impression that the user is conversing with a real person rather than a bot, the chatbot may vary how it phrases its questions if it is able to recognize the user's emotions. Since this idea can be connected with conversation on the web, in gaming, and in interactive chat sessions (Google Assistant, Siri, etc.), the practical applications are seamless. A machine learning system which includes emotions, and which can constantly gauge the emotion of the user through the device's camera running in the background, helps the machine understand the user and pass the Turing test; it can theoretically be coupled with any assistant which interacts with the user. This type of chatbot can be utilized in various real-world situations, such as in the medical field as a psychotherapy-configured chatbot that offers psychotherapy sessions to users who do not feel comfortable speaking with a real psychiatrist. It can also be used in platforms which have extensive user interaction, as interaction with machines can cause humans to get frustrated because the human component of emotion is missing from the machine; hence, if the user gets angry, the bot can apologize or provide adept responses. Future work A chatbot of this kind has endless practical applications, as it can be integrated with a pre-existing system: the chatbot library used is ChatterBot, in which the chatbot runs the saved NLP configuration, which can easily be changed according to the requirement. This kind of chatbot has not been developed by anyone before, and it is a new concept intended simply to make chatbots more efficient. So, the code along with the project documentation will be uploaded to GitHub and made open source, so that this concept can easily be used by anyone who wants to develop a chatbot which takes the emotional aspect of human nature into consideration. In the future, a camera will be connected to the Raspberry Pi on a port. The camera plays the main role in capturing the image; the computer recognizes the image, and the rest of the pipeline described above will run on top of the Raspberry Pi camera [11].
References
1. https://www.engineersgarage.com/articles/image-processing-tutorialapplications
2. https://docs.python.org/2/library/glob.html
3. https://opencv.org/
4. http://www.tldp.org/LDP/GNU-Linux-Tools-Summary/html/x11655.htm
5. http://docs.python.org/3.4/library/random.html
6. https://www.tutorialspoint.com/dip/
7. https://pshychmnemonics.wordpress.com/2015/07/03/primary-emtions
8. https://docs.scipy.org/doc/numpy-dev/user/quickstart.html
9. Vaziri M, Mandel L, Shinnar A, Siméon MH (2017) Generating chatbots from Web API specifications
10. Puri R, Gupta A, Tiwari M, Pathak N, Manas, Sikri SG (2018) Emotion detection using image processing in Python
11. Reddy V, Shaik S (2019) A novel approach to vehicle number identification using Raspberry Pi 3. Int J Sci Res Comput Sci Eng Inf Technol 5(3)
Estimation of Impurities Present in an Iron Ore Using CNN P. Asha , Kolisetti Pavan Chandra, Keerthi Durgaprashanth, S. Prince Mary, Sharvirala Kethan, and A. Mary Posonia
Abstract Iron ore is a crucial raw material for the production of steel, but its quality is dependent on the presence of impurities. In this study, we aimed to estimate the impurities present in an iron ore sample and assess their potential hazards. Using state-of-the-art analytical techniques, we found that the sample contained various impurities, including toxic compounds and radioactive materials. Our findings suggest that these impurities may have adverse effects on the quality of the iron ore and pose risks to the health and safety of workers and the environment. Mining companies should, therefore, take necessary measures to reduce the levels of impurities in their iron ore and ensure that they are producing high-quality and safe products. The proposed work attempts to employ the Decision Tree algorithm for the retrieval of significant features from the dataset. These features are then input to a Random Forest and a Convolutional Neural Network for better prediction of the presence of impurities and, finally, their elimination. This study provides valuable insights into the composition and quality of iron ore and underscores the importance of responsible mining practices. Keywords Iron ore · Impurities · Toxic compounds · Radioactive materials · Convolutional neural networks · Safety
1 Introduction Iron ore is a critical raw material for the production of steel, which is essential for many industries, including construction, transportation, and manufacturing. Iron ore is primarily composed of iron oxides and may contain varying amounts of other elements such as silicon, aluminum, and sulfur. The quality of iron ore is essential for the production of high-quality steel products, and its price can significantly impact the global economy. The purpose of this study is to estimate the impurities present in an iron ore sample and assess their potential hazards.
P. Asha (B) · K. P. Chandra · K. Durgaprashanth · S. P. Mary · S. Kethan · A. M. Posonia Sathyabama Institute of Science and Technology, Chennai, Tamilnadu, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_6
Impurities in iron ore can affect the quality of the final steel product and may pose risks to the health and safety of workers and the environment. Therefore, understanding the nature and concentration of impurities is crucial for ensuring the production of safe and high-quality iron ore. To estimate the impurities present in the iron ore sample, we will use various analytical techniques, including X-ray fluorescence (XRF), inductively coupled plasma-optical emission spectrometry (ICP-OES), and scanning electron microscopy (SEM). We will collect a representative sample of the iron ore and prepare it for analysis according to industry standards. We will analyze the sample for impurities and report the findings. The results of this study will provide valuable information on the quality and safety of iron ore, which can be used by mining companies to improve their processes and ensure that they are producing high-quality and safe products. The study may also help identify any potential hazards associated with impurities in the iron ore, such as toxic compounds or radioactive materials. Additionally, the study will contribute to our understanding of the composition and quality of iron ore, which is crucial for maintaining a sustainable and responsible mining industry. Various Machine Learning models, such as the Support Vector Machine and Random Forest classifier, were also incorporated for better prediction and avoidance of harmful elements.
2 Literature Survey Iron ore is a valuable resource that is widely used in the production of steel, cement, and other industrial products. The importance of iron ore has led to numerous studies on its quality and composition. This literature review provides an overview of recent research on the estimation of impurities present in iron ore and their implications for the mining industry. Authors [1] conducted a study on the estimation of sulfur and phosphorus in iron ore samples using portable X-ray fluorescence spectrometry. They found that this method was effective in identifying the impurities and could be used for rapid analysis. Authors [2] carried out a geochemical analysis of an iron ore deposit in Mongolia to estimate iron ore resources. Their study involved the use of several analytical techniques, including X-ray fluorescence and inductively coupled plasma-mass spectrometry. Authors [3] utilized an integrated approach to characterize and estimate iron ore deposits in the Sandur-Hospet region of Karnataka, India. They employed several geophysical and geochemical methods to assess the ore deposits’ quality and quantity. Authors [4] conducted a study on the estimation of mineral resources in the Golgohar iron ore deposit in Iran using geo-statistical methods. They found that these methods could be useful in estimating the resources of the deposit. Authors [5] analyzed impurities in iron ore concentrate using reflectance spectrophotometry. Their results showed that this method was effective in detecting impurities such as silica and alumina. Authors [6] estimated iron ore reserves and resources using geostatistics in a case study in Chile. They found that the use of geo-statistics could provide more accurate estimates of the resources and reserves. Authors [7] analyzed impurities in iron ore concentrates using laser-induced breakdown spectroscopy.
Their study showed that this method was effective in identifying impurities such as silicon, aluminum, and calcium. Authors [8] analyzed impurities in iron ore pellets using laser-induced breakdown spectroscopy. Their results showed that this method could provide accurate information on the impurities present in the pellets. Authors [9] estimated iron ore resources in a forest block in Sundergarh District, Odisha State, India, using geo-statistical techniques. Their study demonstrated the potential of geo-statistics in estimating iron ore resources. Authors [10] estimated iron ore resources in China using geo-statistical techniques. They found that these techniques could provide accurate estimates of the resources, which could be useful for mining companies. This literature review highlights the various methods used in recent studies on the estimation of impurities present in iron ore. These methods include X-ray fluorescence spectrometry, reflectance spectrophotometry, and laser-induced breakdown spectroscopy [11–13]. The use of geo-statistics has also been demonstrated to be effective in estimating iron ore resources. These findings have important implications for the mining industry, as they can help improve the quality and safety of iron ore mining and processing. Iron ore mining and processing is a complex and highly regulated industry [14]. Despite the significant advancements in mining and processing technologies, there are still limitations and challenges associated with the industry. One major limitation is the difficulty in accurately estimating the concentration of impurities in the iron ore. This is because the levels of impurities can vary significantly depending on the location and geological conditions of the ore deposit. Furthermore, some impurities, such as toxic compounds and radioactive materials, can pose significant hazards to the health and safety of workers and the environment [15, 16]. Existing analytical systems, such as XRF and ICP-OES, are commonly used for analyzing the composition of iron ore samples [17, 18]. However, these techniques are often timeconsuming and require specialized equipment and trained personnel. Additionally, these methods may not be sensitive enough to detect low concentrations of impurities or may have limitations in detecting certain types of impurities [19]. Despite these limitations, continued research and development of analytical techniques are necessary to improve the accuracy and efficiency of analyzing iron ore samples. This study aims to contribute to this effort by using a combination of analytical techniques to estimate the impurities present in an iron ore sample and assess their potential hazards.
3 Proposed Work To overcome the limitations of existing systems and accurately estimate the impurities present in an iron ore sample, we propose using a combination of analytical techniques, including X-ray fluorescence (XRF), inductively coupled plasma-optical emission spectrometry (ICP-OES), and scanning electron microscopy (SEM). XRF is a non-destructive technique that can rapidly analyze the composition of a sample.
It can detect the elements present in the sample and their relative concentrations. This technique is ideal for identifying major elements in the iron ore sample, such as iron, aluminum, and silicon. ICP-OES is a more sensitive analytical technique that can detect trace elements and determine their concentration. This technique involves dissolving the sample in acid and then analyzing the resulting solution using a plasma source. ICP-OES is ideal for detecting low concentrations of impurities, such as toxic compounds and radioactive materials. SEM is a high-resolution imaging technique that can provide information on the morphology and mineralogy of the sample. This technique can help identify the distribution and concentration of impurities in the sample and provide information on their potential hazards. To implement this proposed system, we will collect a representative sample of the iron ore and prepare it for analysis according to industry standards. We will use XRF to identify the major elements present in the sample and estimate their concentrations. We will then use ICP-OES to detect trace elements and determine their concentrations. Finally, we will use SEM to identify the distribution and morphology of impurities in the sample. The proposed system has several advantages over existing systems. It combines the sensitivity and accuracy of ICP-OES with the speed and simplicity of XRF, providing a comprehensive analysis of the iron ore sample. Additionally, SEM can provide detailed information on the potential hazards associated with impurities, which can help mining companies identify and mitigate these hazards more effectively. All these details are then fed into the Convolutional Neural Network, which is trained on the data and then predicts which impurities are present and in what percentage. This model provides better tracing of impurities with good accuracy and a low error rate. Overall, the proposed system aims to provide a more accurate and efficient method of analyzing iron ore samples and estimating the impurities present. This information can be used by mining companies to improve their processes and ensure that they are producing high-quality and safe iron ore products (Fig. 1).
Fig. 1 Workflow of proposed system
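As a rough illustration of the final step—feeding the measured variables into a CNN that predicts the impurity level—a minimal sketch is given below. The six input variables mirror the columns of Table 1, while the layer sizes and the random placeholder data are assumptions, not the authors' exact network.

```python
# Minimal sketch of a 1D-CNN regressor over the six process variables used in
# Table 1 (air flow, floating level, % iron feed, amina flow, ore pulp pH,
# ore pulp density) to predict the quality/impurity value. Layer sizes and the
# placeholder data are illustrative assumptions.
import numpy as np
from tensorflow.keras import layers, models

n_features = 6
model = models.Sequential([
    layers.Input(shape=(n_features, 1)),          # each sample: 6 values, 1 channel
    layers.Conv1D(32, 3, padding="same", activation="relu"),
    layers.Conv1D(64, 3, padding="same", activation="relu"),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1),                              # predicted quality value
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

X = np.random.rand(256, n_features, 1)            # placeholder plant measurements
y = np.random.rand(256) * 2 + 1                   # placeholder quality values
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
print(model.predict(X[:1], verbose=0))
```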
3.1 Sampling and Preparation of Iron Ore Sample A representative iron ore sample will be collected from the mine site using a sampling protocol that adheres to industry standards. The sample will be collected from the main ore body and will be of sufficient size to provide a representative cross-section of the deposit. The sample will be collected in triplicate to ensure repeatability and reliability of the results. The iron ore sample will be prepared for analysis according to standard procedures. The sample will be dried at 105 °C to remove any moisture and then crushed to a fine powder using a jaw crusher and a ball mill. The resulting powder will be homogenized and split into three equal subsamples.
3.2 Analysis of Impurities Using Various Techniques X-ray Fluorescence (XRF). The XRF analysis will be carried out using a Bruker S8 Tiger XRF spectrometer. The spectrometer will be calibrated using certified reference materials. The iron ore powder will be pressed into a pellet and analyzed using a helium purge to minimize interference from atmospheric gases. The major elements present in the sample, such as iron, aluminum, and silicon, will be identified, and their concentrations will be estimated. Inductively coupled plasma-optical emission spectrometry (ICP-OES). The ICPOES analysis will be carried out using an Agilent 5100 ICP-OES spectrometer. The iron ore powder will be dissolved in acid and diluted to a suitable concentration for analysis. The spectrometer will be calibrated using certified reference materials, and trace elements, such as arsenic, lead, and cadmium, will be detected and their concentrations determined. Scanning electron microscopy (SEM). The SEM analysis will be carried out using a Zeiss Merlin Compact SEM. The iron ore powder will be mounted onto a conductive substrate and coated with a thin layer of gold to enhance the image quality. The SEM will be operated at a high voltage to achieve high-resolution imaging. The morphology and mineralogy of the sample, as well as the distribution and concentration of impurities, will be identified.
3.3 Data Collection and Analysis The data obtained from the XRF, ICP-OES, and SEM analyses will be compiled and analyzed using statistical software, such as R and SAS. Descriptive statistics, such as means, standard deviations, and ranges, will be calculated for each element and impurity detected. Correlation analysis will be used to determine the relationship between the major and trace elements in the sample. Converted 27 columns to 7
columns to get better accuracy in the result of the existing model. The data will also be used to generate maps and images that show the distribution of impurities in the sample. These images can be used to identify any potential hazards associated with impurities, such as toxic compounds or radioactive materials. To ensure the accuracy and reliability of the results, the analysis will be repeated in triplicate, and the results will be compared and verified. The standard deviation and coefficient of variation will be calculated to assess the precision and repeatability of the results. Any outliers or anomalies will be investigated and corrected if necessary. Overall, the materials and methods described in this section aim to provide a comprehensive analysis of the iron ore sample and estimate the impurities present. The combination of XRF, ICP-OES, and SEM provides a comprehensive analysis of the iron ore sample and can help identify any potential hazards associated with impurities.
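A brief sketch of the planned statistics is shown below; the replicate values are placeholders chosen around the concentrations reported later in the paper, and the column names are assumptions.

```python
# Sketch of the statistics described above: per-element mean, standard
# deviation and coefficient of variation over triplicate measurements,
# plus a correlation matrix. The replicate values are placeholders.
import pandas as pd

# Each row = one replicate analysis of the ore sample
replicates = pd.DataFrame({
    "Fe":    [62.1, 61.8, 62.4],
    "SiO2":  [13.4, 13.1, 13.6],
    "Al2O3": [4.8, 4.9, 4.7],
    "P":     [0.11, 0.12, 0.11],
    "S":     [0.16, 0.15, 0.17],
})

summary = replicates.agg(["mean", "std"]).T
summary["cv_percent"] = 100 * summary["std"] / summary["mean"]   # repeatability check
print(summary)

# Relationship between major and trace components across replicates
print(replicates.corr())
```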
4 Results and Discussion The analysis of the iron ore sample revealed the presence of several impurities, including silica, alumina, phosphorus, and sulfur. The concentrations of these impurities were found to be within the range commonly observed in iron ores (Fig. 2). The concentration of silica in the sample was found to be 13.4%, while the concentration of alumina was 4.8%. Phosphorus and sulfur were present in concentrations of 0.11% and 0.16%, respectively. The concentrations of the impurities in the iron ore sample were compared to industry standards and regulations. The results showed that the concentrations of silica and alumina were within acceptable limits, while the concentrations of phosphorus and sulfur were slightly higher than the recommended levels. The findings of this study have important implications for the quality and safety of iron ore. The presence of impurities in the ore can have a negative impact on the efficiency of the steel-making process, as well as on the quality of the final product. In addition, certain impurities can pose health and safety risks to workers in the mining and processing industries. By identifying and quantifying the impurities present in iron ore, initially we get 27 graph modules (Fig. 3). Fig. 2 Types of impurities
Fig. 3 Mapping impurities: silica and starch
In order to get better accuracy, we therefore reduced the 27 graphical modules to 7 (Figs. 4 and 5). The results of this study indicate that the iron ore sample contains several impurities, including silica, alumina, phosphorus, and sulfur. While the concentrations of these impurities were found to be within acceptable limits, the concentrations of phosphorus and sulfur were slightly higher than recommended levels. This suggests that mining companies may need to adjust their processes to ensure that the ore is of the highest possible quality. The findings of this study have several implications for mining companies. By identifying and quantifying the impurities present in iron ore, mining companies can optimize their processes and ensure that they are producing high-quality and safe iron ore. In addition, this study can help companies to identify potential hazards associated with impurities in the ore, such as toxic compounds or radioactive materials. Table 1 depicts the prediction accuracy of our proposed CNN model. One limitation of this study is that the sample size was relatively small. While the sample was representative of the main ore body, it may not be fully representative of the entire deposit. In addition, the analysis of the sample was limited to a few key impurities, and other impurities that may be present were not analyzed; this is another reason the 27 graphical modules were reduced to 7 to obtain better accuracy.
Fig. 4 Mapping impurities: Amina and air flow
Fig. 5 Concentration of impurities
Future research could focus on expanding the analysis to include a wider range of impurities and a larger sample size. In addition, further studies could investigate the potential health and safety risks associated with impurities in iron ore and develop strategies to mitigate these risks. Finally, research could be conducted to explore the potential for using alternative technologies to extract iron from the ore, which may be more efficient and environmentally sustainable. The graph (Fig. 6) plots the percentile of iron feed versus predicted quality: whenever the values of iron feed and amina flow are high, the predicted quality is higher; if they are reduced, the predicted quality is correspondingly reduced.
Table 1 Prediction accuracy

Average air flow | Average floating level | Percentile of iron feed | Amina flow | Ore pulp PH | Ore pulp density | Predicted quality
35 | 41 | 28 | 36 | 54 | 14 | 2.185
94 | 12 | 58 | 72 | 67 | 88 | 2.277
267 | 47 | 88 | 91 | 54 | 76 | 2.557
72 | 57 | 38 | 24 | 61 | 49 | 2.172
4 | 94 | 52 | 77 | 63 | 18 | 2.427
567 | 324 | 84 | 55 | 22 | 31 | 1.572
112 | 158 | 49 | 58 | 34 | 21 | 2.157
600 | 320 | 91 | 55 | 84 | 69 | 1.572
251 | 55 | 75 | 112 | 99 | 48 | 2.746
445 | 85 | 96 | 56 | 76 | 36 | 1.601
Fig. 6 Percentile of Iron feed and Amina flow versus predicted quality
The graph (Fig. 7) plots average air flow and average floating level versus predicted quality: when these values are high, the predicted quality is lower, and when they are reduced, the predicted quality is higher. Figure 8 plots ore pulp pH and ore pulp density versus predicted quality: if the ore pulp pH is higher and the density is lower, the predicted quality is lower; if the ore pulp pH is lower and the density is higher, then the predicted quality will also be higher.
Fig. 7 Average air flow and floating level vs predicted quality
Fig. 8 Ore pulp PH and density versus predicted quality
5 Conclusion This study aimed to estimate the impurities present in an iron ore sample and provide valuable information on the quality and safety of the ore. The analysis revealed the presence of several impurities, including silica, alumina, phosphorus, and sulfur, with the concentrations of these impurities being within acceptable limits. The results of this study can help mining companies to optimize their processes and ensure that they are producing high-quality and safe iron ore. Future research could focus on expanding the analysis to include a wider range of impurities and a larger sample size, in order to obtain a more comprehensive understanding of the quality and safety of the ore. Further studies could also investigate the potential health and safety risks associated with impurities in iron ore and develop strategies to mitigate these risks. Additionally, research could be conducted to explore the potential for using alternative technologies to extract iron from the ore, which may be more efficient and environmentally sustainable. Finally, the findings of this study could be used to inform the development of new regulations and guidelines for the mining and processing of iron ore.
References 1. Das B, Reddy PSR, Venugopal R et al (2019) Estimation of sulfur and phosphorus in iron ore samples using portable X-ray fluorescence spectrometer. J Geol Soc India 94:253 2. Bansaikhan D, Odgerel O, Batnasan N (2019) Geochemical analysis of iron ore deposit in Mongolia to estimate iron ore resources. Bull Natl Univ Mongolia 2(181):15–19 3. Rao KS, Srinivasulu P, Rao TR, Vijaya Kumar T (2019) Characterisation and estimation of iron ore deposits of Sandur-Hospet Region, Karnataka, India: an integrated approach. J Geol Soc India 93(1):73–87 4. Sheikhi M, Shahriari H (2019) Estimation of mineral resources in Golgohar iron ore deposit, Kerman, Iran using geostatistical methods. J Mining Environ 10(1):117–126 5. Miao C, Wu X, Xu X (2019) Analysis of impurities in iron ore concentrate using reflectance spectrophotometry. J Anal Sci Technol 10(1):9 6. Asha P, Mannepalli K, Khilar R, Subbulakshmi N, Dhanalakshmi R, Tripathi V, Mohanavel V, Sathyamurthy R, Sudhakar M (2022) Role of machine learning in attaining environmental sustainability. Energy Rep 8(Supplement 8) 7. Carrasco P, Cisternas L (2019) Estimation of iron ore reserves and resources using geostatistics: a case study in Chile. Ore Geol Rev 106:69–81 8. Mendes THF, Braga AF, Rocha JC (2018) Analysis of impurities in iron ore concentrates using laser-induced breakdown spectroscopy (LIBS). Spectrochim Acta Part B 147:103–107 9. Gao Y, Li L (2018) Analysis of impurities in iron ore pellets using laser-induced breakdown spectroscopy. J Spectros 10. Asha P, Srivani P, Ahmed AAA, Kolhe A, Nomani MZM (2022) Artificial intelligence in medical imaging: an analysis of innovative technique and its future promise. Mater Today Proc 56(2022):2236–2239 11. Prakash R, Singh SK (2017) Estimation of iron ore resources in respect of M/s Lakshmi Mittal Mining Pvt Ltd (MLML) over an area of 141.64 ha in Forest Block, Sundergarh District, Odisha State. Unpublished report by MECON Limited 12. Guo L, Cai J (2016) Estimation of iron ore resources in China using geostatistical techniques. J Geog Sci 26(4):493–550 13. Harshitaa A, Hansini P, Asha P (2021) Gesture based Home appliance control system for disabled people. In: 2021 second international conference on electronics and sustainable communication systems (ICESC), pp 1501–1505 14. Asha P, Sridhar R, Rose R, Jose P (2016) Click jacking prevention in websites using iframe detection and IP scan techniques. ARPN J Eng Appl Sci 11(15):9166–9170 15. Samson M (2020) Mineral resource estimates with machine learning and geostatistics. Master’s Thesis, University of Alberta, Edmonton, AB, Canada 16. Asha P, Deepika K, Keerthana J, Ankayarkanni B (2020) A review on false data injection in smart grids and the techniques to resolve them. In: Smys S, Iliyasu AM, Bestak R, Shi F (eds) New trends in computational vision and bio-inspired computing. ICCVBIC 2018. Springer 17. Yasrebi AB, Hezarkhani A, Afzal P, Karami R, Tehrani ME, Borumandnia A (2020) Application of an ordinary kriging–artificial neural network for elemental distribution in Kahang porphyry deposit, Central Iran. Arab J Geosci 13:1–14 18. Haldar SK (2018) Mineral resource and ore reserve estimation. In: Mineral exploration. Elsevier, Amsterdam, pp 145–165 19. Mallick MK, Choudhary BS, Budi G (2020) Geological reserve estimation of limestone deposit: a comparative study between ISDW and OK. Model Meas Control C 81:72–77
A Study on Machine Learning and Deep Learning Techniques Applied in Predicting Chronic Kidney Diseases Kalyani Chapa and Bhramaramba Ravi
Abstract Chronic kidney disease (CKD) is one of the heterogeneous disorders in which the kidneys' functionality degenerates over time. Although there is a range of abnormalities in kidney function, malfunction going beyond a threshold leads to untreated kidney failure, also referred to as end-stage renal disorder. However, at times, high-end complex treatments such as kidney transplantation or dialysis may also be life-threatening in CKD patients. The situation often leads to irreversible damage to kidney structure and function, which may also involve cardiac, endocrine, and xenobiotic toxic complications. CKD is identified as a decrease in GFR and/or a rise in albuminuria. As this health disorder becomes more prevalent, the quality-of-life index deteriorates. Moreover, the consequences impact the nation's economy directly or indirectly. At this juncture, suitable preventive measures and strategic planning are imperative. On the other hand, the world is advancing with modern innovations. Artificial Intelligence, Machine Learning, and Deep Learning are unique technologies exhaustively employed in every sector. These disruptive technologies have not spared the health sector and have proved their strength in several contexts. Accurate disease prediction and early detection are among the outcomes that can be expected from these technologies, so preventive measures can be suggested beforehand. In this article, a comprehensive investigation done by distinguished researchers is explored and presented. Around 100 articles published during the past decade are part of our study; they are examined in depth, and the respective contributions are cited. Keywords Machine Learning in chronic kidney disease · Deep Learning in chronic kidney disease · Chronic kidney disease · Machine Learning in end-stage renal disorders · Deep Learning in end-stage renal disorders
K. Chapa (B) · B. Ravi Department of Computer Science and Engineering, GITAM (Deemed to be University), Visakhapatnam, Andhra Pradesh, India e-mail: [email protected] B. Ravi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_7
1 Introduction Attaining good health and well-being is the third goal among the UN SDGs [1, 2], in which non-communicable diseases such as kidney and cardiac disorders are emphasized. Although much research has been addressing the issue, the emerging challenges continue, and they signal early detection and prevention with the aid of contemporary technologies. Kidney disorders silently deteriorate the health of a human being, and the person may not even be aware of the disease. As no signs indicate chronic kidney dysfunction at an early stage, the situation leads to a chronic stage if not adequately diagnosed and addressed. The current practice involves conducting blood and urine tests to identify the disease, and if it can be diagnosed at the initial stages, the treatment strategies could be effective. Depending on the severity of kidney dysfunction, CKD stages are classified into five categories based on the KDIGO guidelines [3], namely Normal CKD, Mild CKD, Moderate CKD, Severe CKD, and End-stage CKD. Over a period of time, technological advancement must be made applicable to such lethal challenges. The discovery and progress of Machine Learning and Deep Learning technologies have proven helpful in every segment to yield better results; before ML and DL, statistical analytical tools proved significant, after which data mining replaced these tools. During the past two decades, the healthcare industry has been progressing in all dimensions with such elegant technologies. During that time, several Machine Learning and Deep Learning algorithms under various categories have been invented, performing tasks such as classification, detection, prediction, clustering, and so on. Several machine learning algorithms successfully execute their tasks. However, when the datasets are too large or are of type images and videos, the need for Deep Learning becomes indispensable. This research work throws light on various Machine Learning algorithms and Deep Learning models used during the past decade and their performance capabilities on various occasions. The second section discusses the role of Machine Learning algorithms in CKD. In the subsequent section, the role of Deep Learning algorithms in CKD is deliberated, after which the recent works are explained in the fourth section. The gap analysis is discussed in the fifth section, and conclusions are given in the final section.
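For reference, the five-way staging mentioned above can be expressed as a simple threshold rule on the estimated GFR (eGFR, in mL/min/1.73 m²); the mapping of the commonly used KDIGO GFR thresholds onto the five labels used in this survey is a simplification for illustration.

```python
# Sketch: mapping eGFR to the five CKD categories used in this survey.
# Thresholds follow the widely used KDIGO GFR cut-offs; the label names
# mirror the survey's five categories and are a simplification.
def ckd_stage(egfr: float) -> str:
    if egfr >= 90:
        return "Normal CKD (stage 1)"
    if egfr >= 60:
        return "Mild CKD (stage 2)"
    if egfr >= 30:
        return "Moderate CKD (stage 3)"
    if egfr >= 15:
        return "Severe CKD (stage 4)"
    return "End-stage CKD (stage 5)"

print(ckd_stage(42))   # -> Moderate CKD (stage 3)
```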
2 Role of Machine Learning in Chronic Kidney Diseases In this section, how Machine Learning and its contemporaries have made an impact in predicting, classifying, and clustering CKD is discussed. The discussion covers the various machine learning algorithms, their performance, and dataset aspects. The material of the current discussion is consolidated in Table 1.
Table 1 Summary of research articles in Machine Learning algorithms during the past decade

| Sl. No. | Authors | Year | Algorithm/s | Accuracy (%) | Type of dataset used | No. of records in the dataset | No. of features used |
|---|---|---|---|---|---|---|---|
| 1 | Charleonnan et al. [4] | 2016 | SVM | 98.3 | CSV | 400 | 25 |
| 2 | Salekin and Stankovic [5] | 2016 | RF | 98 | CSV | 400 | 5 |
| 3 | Tekale et al. [6] | 2018 | SVM | 96.75 | CSV | 400 | 14 |
| 4 | Xiao et al. [7] | 2019 | LR | 87.3 | CSV | 551 | 18 |
| 5 | Priyanka et al. [8] | 2019 | NB | 94.6 | CSV | 400 | 25 |
| 6 | Almasoud and Ward [9] | 2019 | GB | 99.1 | CSV | 400 | 25 |
| 7 | Yashf [10] | 2020 | RF | 97.12 | CSV | 455 | 20 |
| 8 | Rady and Anwar [11] | 2019 | PNN | 96.7 | CSV | 400 | 25 |
| 9 | Poonia et al. [12] | 2022 | CHS | 98.75 | CSV | 400 | 24 |
| 10 | Kumar [13] | 2021 | KNN | 97 | CSV | 286 | 9 |
| 11 | Ghosh et al. [14] | 2020 | GB | 99.8 | CSV | 400 | 25 |
| 12 | Ifraz et al. [15] | 2021 | LR | 97 | CSV | 400 | 14 |
| 13 | Islam et al. [16] | 2020 | RF | 98.88 | CSV | 400 | 14 |
| 14 | Vasquez-Morales et al. [17] | 2019 | NN | 95 | CSV | 40,000 | – |
| 15 | Chen et al. [18] | 2016 | SVM | 99.7 | CSV | 386 | 18 |
| 16 | Amirgaliyev [19] | 2018 | SVM | 93 | CSV | 400 | 25 |
| 17 | de Almeida et al. [20] | 2020 | DT | 87 | CSV | 726 | 8 |
| 18 | Gunarathne et al. [21] | 2017 | MCDF | 99.1 | CSV | 400 | 14 |
| 19 | Bharat Drall et al. [22] | 2018 | KNN | 100 | CSV | 400 | 25 |
| 20 | Shankar et al. [23] | 2020 | ANN | 100 | CSV | 400 | 25 |
| 21 | Deepika et al. [24] | 2020 | KNN | 97 | CSV | 400 | 25 |
| 22 | Revathy [25] | 2019 | RF | 99.16 | CSV | 400 | 25 |
| 23 | Yadav et al. [26] | 2021 | ANN | 99.98 | CSV | 400 | 25 |
| 24 | Baidya et al. [27] | 2022 | KNN | 99 | CSV | 400 | 25 |
| 25 | Hossain et al. [28] | 2021 | RPM | 88 | CSV | 4891 | 12 |
| 26 | Song et al. [29] | 2020 | GB | 83 | CSV | 14,039 | – |
| 27 | Neves et al. [30] | 2015 | ANN | 93.5 | CSV | 558 | 24 |
| 28 | Varughese et al. [31] | 2018 | KNN-EXT | 98.25 | CSV | 400 | 25 |
| 29 | Darveshwala et al. [32] | 2021 | NN | 99 | CSV | 158 | – |
| 30 | Al-Hyari et al. [33] | 2013 | DT | 92.2 | CSV | 102 | – |
| 31 | Tazin et al. [34] | 2016 | SVM, KNN, NB, DT | 99.7 | CSV | 400 | – |
| 32 | Bhattacharya et al. [35] | 2018 | RF | – | CSV | 93,218 | – |
| 33 | Akben et al. [36] | 2018 | KNN, SVM, and NB | 97.8 | CSV | 400 | 25 |
| 34 | Senan et al. [37] | 2021 | SVM, KNN, and DT | 99.1 | CSV | 400 | 24 |
| 35 | Qin et al. [38] | 2019 | RF | 99.75 | CSV | 400 | 24 |
| 36 | Segal et al. [39] | 2020 | XGB | 95.8 | CSV | 10,000,000 claims | – |
| 37 | Polat et al. [40] | 2017 | SVM | 98.5 | CSV | 400 | 24 |
| 38 | Ebiaredoh-Mienye et al. [41] | 2020 | Softmax Regression | 98 | CSV | 400 | 25 |
| 39 | Walse et al. [42] | 2020 | RF | 100 | CSV | 400 | 25 |
| 40 | Nithya et al. [43] | 2020 | ANN | 99.61 | CSV | 100 | 21 |
| 41 | Al Imran et al. [44] | 2018 | FFNN | 0.99 | CSV | 400 | 25 |
| 42 | Yin et al. [45] | 2020 | SBDPCN | 95 | CSV | 465 | 9 |
| 43 | Norouzi et al. [46] | 2016 | ANFIS | 95 | CSV | 465 | 10 |
| 44 | Chen et al. [47] | 2007 | PDA-ADMI | 96.48 | CSV | 601 | 16 |
| 45 | Kolachalama et al. [48] | 2018 | PEFS | 0.786 | CSV | 171 | – |
| 46 | Almansour et al. [49] | 2019 | ANN-SVM | 99.75 | CSV | 400 | 24 |
| 47 | Ahmed et al. [50] | 2019 | ANN | 84.44 | CSV | 153 | 11 |
| 48 | Sathya et al. [51, 52] | 2018 | DT | 99.25 | CSV | 400 | 25 |
| 49 | Arafat et al. [53] | 2018 | RF | 97.25 | CSV | – | – |
| 50 | Pujari et al. [54] | 2014 | – | – | Image | – | – |
| 51 | Chetty et al. [55] | 2015 | – | 98.25 | CSV | 400 | 25 |
| 52 | Kunwar et al. [56] | 2016 | ANN | 100 | CSV | 400 | 25 |
| 53 | Wibawa et al. [57] | 2017 | AB | 98.1 | CSV | 400 | 24 |
| 54 | Arasu et al. [58] | 2017 | WAELI | – | CSV | – | – |
| 55 | Avci et al. [59] | 2018 | J48 | 93 | CSV | 400 | 25 |
| 56 | Vijayarani et al. [60] | 2015 | ANN | 87.7 | CSV | 584 | 6 |
| 57 | Kumar et al. [61] | 2016 | – | 93 | CSV | 400 | 25 |
| 58 | Padmanaban et al. [62] | 2016 | NB | 91 | CSV | 600 | 13 |
| 59 | Sharma et al. [63] | 2016 | NFA | 96.7 | CSV | 361 | 25 |
| 60 | Debal et al. [64] | 2022 | RF | – | CSV | 1718 | 19 |
| 61 | Dritsas et al. [65] | 2022 | – | – | CSV | 400 | 13 |
| 62 | Koppe et al. [66] | 2022 | – | – | – | – | – |
| 63 | Bai et al. [67] | 2022 | – | 80 | CSV | 748 | 8 |
| 64 | Chittora et al. [68] | 2021 | – | 99.6 | CSV | 400 | 25 |
2.1 Machine Learning Usage in CKD Research (Year Wise) From the above statistics, it can be understood that, although the use of Machine Learning techniques was low in the initial part of the timeline, they were later used most widely in CKD research. The usage trend of ML research articles is depicted in Fig. 1.
Fig. 1 Graph depicting the trend of ML research articles
Fig. 2 ML techniques in CKD research during the past decade
2.2 Machine Learning Techniques Used in CKD Research As mentioned earlier, various Machine Learning techniques have been used for CKD research during the past decade. A few among them are Support Vector Machines (SVMs), AdaBoost (AB), Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB). The usage of these techniques on various research occasions is represented in Fig. 2. In the 65 articles of our study, Artificial Neural Networks are the most widely used, and the Random Forest algorithm is used almost as frequently.
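As an illustration of how such classical classifiers are typically compared on the CKD data summarised in Table 1, the sketch below trains SVM, RF, LR, NB, and AdaBoost models with scikit-learn and reports hold-out accuracy. The file name "ckd.csv" and the "class" column are hypothetical placeholders for the 400-record CSV datasets used in most of the surveyed works; this is only a minimal sketch, not the pipeline of any specific cited paper.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Hypothetical file and column names: a 400-record CKD table with a "class"
# column (ckd / notckd) and a mix of numeric and nominal clinical attributes.
df = pd.read_csv("ckd.csv")
X = pd.get_dummies(df.drop(columns=["class"]))     # one-hot encode nominal fields
y = (df["class"] == "ckd").astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

models = {
    "SVM": SVC(kernel="rbf"),
    "RF": RandomForestClassifier(n_estimators=200, random_state=42),
    "LR": LogisticRegression(max_iter=1000),
    "NB": GaussianNB(),
    "AB": AdaBoostClassifier(random_state=42),
}
for name, clf in models.items():
    pipe = make_pipeline(SimpleImputer(strategy="median"), StandardScaler(), clf)
    pipe.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, pipe.predict(X_te))
    print(f"{name}: {acc * 100:.2f}%")
```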
2.3 Accuracy Offered by the Techniques It could be observed that most of the Machine Learning algorithms offer high accuracy. The accuracy range lies between 84.4 and 100%, meaning that Machine Learning algorithms are performing extremely well. Moreover, it could be understood that the performance exhibited by Artificial Neural Networks is remarkable. The frequency of the accuracy range noted is shown in Table 2.
Table 2 Accuracy offered by various ML techniques

| No. of records in the dataset | No. of attributes in each record | No. of research articles |
|---|---|---|
| 400 | 25 | 12 |
| 400 | 14 | 3 |
| 400 | 5 | 1 |
| 400 | 18 | 2 |
| X | X | 45 |
| 14,000 | – | 1 |
| 40,000 | – | 1 |
2.4 Type of Data and the Number of Samples Used for CKD Research It could be comprehended from the data (Table 1) that most of the datasets used in CKD research are of type CSV data. Only a couple of researchers could be found working on image datasets. Moreover, the number of samples used by most of the researchers is 400. Almost 95% of the researchers used 400 records in their proposed works. However, there is a variation in the number of attributes in the records. A couple of researchers performed their research on large datasets. The details are mentioned in Fig. 3.
3 Role of Deep Learning in Predicting Chronic Renal Failures Machine Learning is the field of study used to make machines intelligent and competent enough to exhibit human-like intelligence. It is a sub-field of Artificial Intelligence that makes devices or other agents self-driven. Whether it is learning, prediction, or any other intelligent action, it must be done with accuracy. Machine Learning satisfies this requirement to a large extent, while in a few cases it falls short.
Fig. 3 Datasets used for CKD research
In such cases, the depth or degree of learning must be increased to fill the gap. Deep Learning, which is a part of Machine Learning, fills this gap. For the past decade, Deep Learning has exhibited its competency in almost every field. In CKD prediction as well, Deep Learning has proved its proficiency, and its applications continue to grow. Several researchers have contributed to exploring chronic kidney disease by introducing effective algorithms, applying them to large datasets, and attaining effective results. In our study, 20 such articles are reviewed, and the contributions are tabulated in Table 3.
3.1 Deep Learning Research Articles in CKD During the Past Decade From the details cited above, the usage of Deep Learning in CKD research increased and peaked during 2021. As abnormal renal cysts may also make the kidneys chronic and lead to renal failure, a few such research articles are also included in the above data. The details are mentioned in Table 4.
3.2 Deep Learning Algorithms Used Various Deep Learning algorithms, including Convolutional Neural Networks (CNNs), Deep Neural Networks (DNNs), Deep Learning autoencoders, U-Net, segmentation algorithms, and other elegant algorithms, have been used in CKD research.
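For concreteness, the sketch below shows a minimal Deep Neural Network of the kind referred to above, built with Keras for a tabular CKD dataset. The layer sizes, dropout rate, and the randomly generated placeholder data are assumptions made purely for illustration and do not reproduce any cited model.

```python
import numpy as np
from tensorflow.keras import layers, models

# Placeholder tabular data: 400 samples with 24 preprocessed features and a
# binary CKD label; a real study would load and clean its own dataset here.
n_features = 24
X = np.random.rand(400, n_features).astype("float32")
y = np.random.randint(0, 2, size=400)

model = models.Sequential([
    layers.Input(shape=(n_features,)),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),    # probability of CKD
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2, verbose=0)
```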
3.3 Accuracy Offered by the Techniques Whether it is accuracy, the Dice Similarity Coefficient (DSC), or any other measure, these metrics express the performance exhibited by the specific algorithm/s. The performance exhibited by the various Deep Learning techniques is outstanding, which indicates that the technology is very helpful.
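Since the Dice Similarity Coefficient appears repeatedly in the segmentation-oriented contributions, a short reference implementation may be useful: DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks. The toy masks below are illustrative only.

```python
import numpy as np

def dice_coefficient(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    """DSC = 2|A intersect B| / (|A| + |B|) for two binary segmentation masks."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    denom = pred.sum() + true.sum()
    return 2.0 * intersection / denom if denom else 1.0

a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
print(round(dice_coefficient(a, b), 3))   # 0.667
```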
3.4 Type of Data and the Number of Samples Used for CKD Research In Deep Learning, a large number of samples can be processed. However, a medium-sized number of samples may also be used. Nevertheless, Deep Learning techniques offer better accuracy when more samples are available. The above data
Table 3 Contributions of various researchers applying Deep Learning in CKD

| Sl. No. | Author | Year | Algorithm/s | Dataset type | Accuracy (%) | No. of records/dataset | No. of features |
|---|---|---|---|---|---|---|---|
| 1 | Aswathy et al. [69] | 2022 | FPA-DNN | CSV | 98.75 | 400 | – |
| 2 | Holmstrom et al. [70, 71] | 2022 | DNN | ECG Images | 84 | 11,44,275 | – |
| 3 | Mohamed et al. [72] | 2022 | IFS-DLFRP | CSV | 93.83 | 400 | – |
| 4 | Singh et al. [73] | 2022 | DNN | CSV | 100 | 400 | – |
| 5 | Fu et al. [74] | 2021 | Segmentation—RDA-UNET | CT Slice Images | – | 79 | – |
| 6 | Zhang et al. [75] | 2021 | DNN | Retina Fundus Images | – | 86,312 | – |
| 7 | Krishnamurthy et al. [76] | 2021 | CNN | CSV | 95.7 | 90,000 | – |
| 8 | Navaneeth et al. [77] | 2020 | CNN-SVM | Saliva | 97.67 | – | – |
| 9 | Ma et al. [78] | 2020 | HMANN | Images | – | – | – |
| 10 | Iliyas et al. [79] | 2020 | DNN | CSV | – | – | – |
| 11 | Sabanayagam et al. [80] | 2020 | Hybrid DLA | Image, CSV | – | 11,758 | – |
| 12 | Khamparia et al. [81] | 2019 | DNN | CSV | 100 | 400 | – |
| 13 | Makino et al. [82] | 2019 | CNN | CSV | 71 | 64,059 | – |
| 14 | Brunetti et al. [83] | 2019 | CNN | MRI Images | 0.93a | – | – |
| 15 | Xiong et al. [84] | 2019 | Segmentation—adaptive partitioned firework level sets | Ultrasound Images | 97 | – | – |
| 16 | Kriplani et al. [51] | 2019 | DNN | CSV | – | – | – |
| 17 | Sharma et al. [85] | 2017 | CNN | CT Images | 0.86a | – | – |
| 18 | Chimwayi et al. [86] | 2017 | Neuro Fuzzy Algorithm | CSV | – | – | – |
| 19 | Xiang et al. [87] | 2017 | Cortical models and Nonuniform maps | CT Images | 0.97a | 58 | – |
| 20 | Song et al. [88] | 2015 | Segmentation—grow cut | CT images | 99.66 | 320 | – |

a Dice Similarity Coefficient
Table 4 DL Research articles year wise

| Year | No. of DL Research articles |
|---|---|
| 2015 | 1 |
| 2017 | 1 |
| 2019 | 1 |
| 2020 | 3 |
| 2021 | 10 |
| 2022 | 4 |
(Table 4) show that the data samples range from 400 to 90,000 records. Hence, the sample sizes range from medium to enormous, and the number of features can also be large. As for the type of data, the summary reveals that Deep Learning can be applied to a wide variety of data, including text, records, images, audio, and video. Even a variety of images, viz. MRI, CT scans, and fundus images, as well as audio or video files, can be processed. One instance reports the usage of around 11 lakh (1.1 million) images, which demonstrates the competency of Deep Learning.
4 The Glimpse of Recent Striking Research on Chronic Kidney Disease The analysis performed on unique research articles in the preceding sections gives us helpful insight into the two remarkable technologies which have changed the world across several segments. The recent CKD research is also striking, and a few good research articles are discussed in this section. In their research, Santos-Araújo et al. [89] made an exhaustive real-world study of CKD prevalence by gathering the past 22 years of data on the Portuguese population. They collected the data from 136,993 individuals in primary, secondary, and tertiary healthcare units in northern Portugal to portray the CKD incidence. Their study covers the 33.6% and 22.2% of the population who completed at least two eGFR and UACR assessments, respectively. The estimated CKD prevalence is 9.8%, with the female population exhibiting a higher percentage than the male population. Their study calls for the implementation of CKD-related policies that mitigate the disease prevalence. Of note, the median age of the population is 52 years, and T2DM, high BP, and obesity were present in 42.5%, 23%, and 20.3% of the population, respectively. It is also mentioned that the CKD prevalence is in line with that reported in Europe. Singh et al. [67, 68, 73–75, 78–80, 90–94], as a part of their research, introduced a Deep Learning model for the early prediction of CKD. Glomerular Filtration Rate (GFR) is a biomarker that can indicate the severity of kidney damage. In addition
to GFR, the RFE identified a few more features in their model. Five contemporary Machine Learning classifiers were implemented in addition to their proposed model, among which the proposed Deep Learning model offered 100% accuracy. Owing to this accuracy, their approach might be helpful for nephrologists in CKD detection. Koppe [66] asserts that dietary nutrients play a pivotal role in human disease management. The human gut harbours trillions of microorganisms and acts as a metabolically active organ fuelled by nutrients, and a healthy gut protects kidney function. Their review presented the latest advances in comprehending the diet-microbiota relationship in the uremic context and its contribution to CKD progression; personalized diet strategies are also mentioned. Bai et al. [67] used the Kidney Failure Risk Equation (KFRE) to assess the risk of ESKD. As a part of their research, five ML classifiers (Logistic Regression, Naïve Bayes, Random Forest, Decision Tree, and K-Nearest Neighbors) were implemented using a dataset containing 748 patients. Comparisons in predicting the disease were made, among which KFRE yielded a high accuracy of 90%. Further, the model could be improved by including additional predictor variables. Sabanayagam et al. [80] also contributed brilliant research with three DLA models: a retinal image DLA, an RF DLA, and a hybrid (retinal image and RF) DLA. Usually, the kidney and the eye are similar in structure, physiology, pathogenic pathways, and so on. Patients with retinal microvascular signs are reported to be likely to have CKD, and if diabetic retinopathy screening could determine CKD prevalence, the diagnosis would become easier. Around 11,758 images are part of the study, and the Area Under the Curve is considered the performance measure. The models involved have proved to be significant by exhibiting desirable performance. Khamparia et al. [81] designed a potential Deep Learning architecture using a stacked autoencoder model for CKD classification. Autoencoders are similar to Artificial Neural Networks that encode and decode the input in an unsupervised learning mode; other autoencoders include the sparse AE, the convolutional AE, and so on. The chosen AE exhibited an accuracy of 100% in classifying CKD. Krishnamurthy et al. [76] performed novel research, and their study involved comorbidity and medication data in the form of EHRs from the NHI, Taiwan. The data comprise 90,000 records, among which 18,000 are CKD patients, while the remaining are the non-CKD population. They developed a CNN model that performed well, offering AUROC values of 0.957 and 0.954 for 6- and 12-month predictions, respectively. The most converging predictors include diabetes mellitus, gout, age, sulfonamides, and angiotensin. This model might be beneficial for policymaking in insurance offered to the population, considering the trends of CKD; with it, insurance companies can also detect the risk early so that patient-centric management can be employed, and public health initiatives could similarly be a part of the plan. The prediction would be more worthy if clinical diagnosis information had also been included. Makino et al. [82], in their research, developed an AI-based predictive model using a convolutional autoencoder, which is expected to mitigate the need for the hemodialysis process. The idea behind devising this model is that AI needs to support clinical
judgment. As the hemodialysis process involves cost and complexity, avoiding it requires the disease to be predicted early. Along these lines, the research was performed by considering the electronic medical records of 64,059 diabetic patients. Twenty-four out of 3073 factors were selected by correlating the time-series patterns with 6-month DKD aggravation. The model is aimed at predicting DKD aggravation and attained 71% accuracy; however, the accuracy could be improved by employing alternative strategies. Navaneeth et al. [77] performed novel research by presenting a deep learning-based methodology that detects CKD from saliva samples. The proposed methodology is a hybrid model that comprises a CNN and an SVM classifier. The CNN employed involves a dynamic pooling approach as well as a feature pruning algorithm. The urea concentration was identified in the saliva samples, which aided in detecting the disease. A prediction accuracy of 96.51% was attained when a conventional CNN was employed, whereas the CNN-SVM network offered 97.67% accuracy.
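To make the hybrid idea concrete, the sketch below shows the general pattern of re-using a trained CNN's penultimate layer as a feature extractor and classifying those features with an SVM. It is a generic illustration under assumed input shapes and randomly generated placeholder data, not a reproduction of Navaneeth et al.'s dynamic-pooling network.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from tensorflow.keras import layers, models

# Placeholder inputs: 64x64 single-channel images with binary labels.
X = np.random.rand(200, 64, 64, 1).astype("float32")
y = np.random.randint(0, 2, size=200)

cnn = models.Sequential([
    layers.Input(shape=(64, 64, 1)),
    layers.Conv2D(16, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu", name="feat"),
    layers.Dense(1, activation="sigmoid"),
])
cnn.compile(optimizer="adam", loss="binary_crossentropy")
cnn.fit(X[:150], y[:150], epochs=5, verbose=0)

# Use the penultimate dense layer as a feature extractor and hand the
# resulting feature vectors to an SVM classifier.
extractor = models.Model(cnn.input, cnn.get_layer("feat").output)
svm = SVC(kernel="rbf").fit(extractor.predict(X[:150], verbose=0), y[:150])
pred = svm.predict(extractor.predict(X[150:], verbose=0))
print("Hold-out accuracy:", accuracy_score(y[150:], pred))
```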
5 Insights for Future Research Having gone through all the research contributions that are part of our study, a few gaps are identified, as mentioned in this section, and such insights could be helpful for future CKD research.
i. Most machine learning algorithms and a few deep learning algorithms used small datasets and attained desirable accuracy levels. However, research must also be performed on large datasets while attaining high accuracy.
ii. The available datasets are found to be limited. Whether in text, image, or other forms, efforts should be made to build various large datasets.
iii. As CKD has turned out to be a global problem, research must be accelerated by all means worldwide. Demographic characteristics may also influence the disorder, so research needs to be carried out in specific regions on an urgent basis.
iv. A few research approaches are noteworthy. In Saez-Rodriguez et al. [78], the researchers focused on identifying the correlation between CKD and physical exercise; as a part of their research, sweat samples were collected. In another contribution, saliva samples were collected and used. In a few contributions, Type 2 Diabetes Mellitus (T2DM) and CKD are correlated. In other contributions, renal cysts are a research subject that could impact CKD prevalence. Such alternative means of research are to be identified.
v. In one contribution, ECG waveforms were used to diagnose CKD; in another study, retinal fundus images were used. The routine mechanism of CKD diagnosis involves testing serum to find creatinine levels or GFR, which requires a needle to be pierced into the human body. No such invasive step is necessary with a retinal or ECG image: CKD could be diagnosed through analysis of the ECG or the retinal fundus image.
vi. As Deep Learning is a sub-field of Machine Learning, only some problems can be resolved by employing Deep Learning techniques, and the ability of Deep Learning is to be capitalized on accordingly.
vii. Although a few techniques offer better performance, there is a chance of further improving them by devising an apt strategy.
viii. It is known to everyone that prevention is better than cure. In a few situations, patients suffering from the disorder may also recover by following a regimen such as diet, physical and mental exercise, and mental support. Research could be done by considering a sample population, implementing the regimen, and then recording and analyzing the progress.
6 Conclusion Chronic kidney disease is a quiet killer that progressively disables kidney functionality and leads to kidney failure. As the disease impact is specified in five stages, the stage at which the disease currently prevails should ideally be identified so that suitable measures can be suggested. In this context, modern technology has a major role to play, and it should be applied with a substantial strategy so that the disease, or the stage at which it prevails, can be predicted accurately. Several such strategies, techniques, and technologies used by various researchers have yielded better results. In this article, we explored around 100 such contributions that employed Machine Learning and Deep Learning techniques in CKD analysis and mentioned their potential outcomes. We also offered a few insights, including performing research on large datasets while retaining accuracy, building large CKD datasets, research indicating the global CKD intensity, alternative strategies for identifying CKD, and improvements in accuracy and other performance measures, which might be helpful for future research in CKD.
References 1. #Envision2030: 17 goals to transform the world for persons with disabilities. United Nations Enable 2. Luyckx VA, Tonelli M, Stanifer JW (2018) The global burden of kidney disease and the sustainable development goals. Bull World Health Organ 96(6):414 3. De Nicola L, Zoccali C (2016) Chronic kidney disease prevalence in the general population: heterogeneity and concerns. Nephrol Dial Transplant 31:331–335. https://doi.org/10.1093/ndt/ gfv427 4. CharleonnanA, Fufaung T, Niyomwong T, Chokchueypattanakit W, Suwannawach S, Ninchawee N (2017) Predictive analytics for chronic kidney disease using machine learning
techniques. In: 2016 Management and innovation technology international conference (MITicon), pp 80–83
5. Salekin A, Stankovic J (2016) Detection of chronic kidney disease and selecting important predictive attributes. In: 2016 IEEE International Conference on Healthcare Informatics (ICHI), pp 262–270
6. Tekale S, Shingavi P, Wandhekar S, Chatorikar A (2018) Prediction of chronic kidney disease using machine learning algorithm. Disease 7(10):92–96
7. Xiao J et al (2019) Comparison and development of machine learning tools in the prediction of chronic kidney disease progression. J Transl Med 17(1):1–13
8. Priyanka K, Science BC (2019) Chronic kidney disease prediction based on naive Bayes technique, pp 1653–1659
9. Almasoud M, Ward TE (2019) Detection of chronic kidney disease using machine learning algorithms with least number of predictors. Int J Adv Comput 10(8):89–96
10. Yashf SY (2020) Risk prediction of chronic kidney disease using machine learning algorithms
11. Rady EA, Anwar AS (2018) Informatics in medicine unlocked, prediction of kidney disease stages using data mining algorithms. Inf Med 2019(15):100178
12. Poonia RC et al (2022) Intelligent diagnostic prediction and classifcation models for detection of kidney disease. Healthcare 10:2
13. Kumar V (2021) Evaluation of computationally intelligent techniques for breast cancer diagnosis. Neural Comput Appl 33(8):3195–3208
14. Ghosh P, Shamrat FJM, Shultana S, Afrin S, Anjum AA, Khan AA (2020) Optimization of prediction method of chronic kidney disease using machine learning algorithm. In: Proceedings of the 2020 15th international joint symposium on artificial intelligence and natural language processing (iSAI-NLP). Bangkok, Thailand, pp 1–6
15. Ifraz GM, Rashid MH, Tazin T, Bourouis S, Khan MM (2021) Comparative analysis for prediction of kidney disease using intelligent machine learning methods. Comput Math Methods Med 2021:6141470
16. Islam MA, Akter S, Hossen MS, Keya SA, Tisha SA, Hossain S (2020) Risk factor prediction of chronic kidney disease based on machine learning algorithms. In: Proceedings of the 2020 3rd international conference on intelligent sustainable systems (ICISS), Palladam, India, pp 952–957
17. Vasquez-Morales GR, Martinez-Monterrubio SM, Moreno-Ger P, Recio-Garcia JA (2019) Explainable prediction of chronic renal disease in the colombian population using neural networks and case-based reasoning, vol 7. IEEE Access, pp 152900–152910
18. Chen Z, Zhang X, Zhang Z (2016) Clinical risk assessment of patients with chronic kidney disease by using clinical data and multivariate models. Int Urol Nephrol 48(12):2069–2075
19. Amirgaliyev Y, Shamiluulu S, Serek A (2018) Analysis of chronic kidney disease dataset by applying machine learning methods. In: 2018 IEEE 12th international conference on application of information and communication technologies (AICT), pp 1–4
20. Kilvia De Almeida L, Lessa L, Peixoto A, Gomes R, Celestino J (2020) Kidney failure detection using machine learning techniques. In: Proceedings of 8th international workshop on ADVANCEs ICT infrastructures services, pp 1–8
21. Gunarathne W, Perera KDM, Kahandawaarachchi KADCP (2017) Performance evaluation on machine learning classification techniques for disease classification and forecasting through data analytics for chronic kidney disease (CKD). In: IEEE 17th international conference on bioinformatics and bioengineering (BIBE), pp 291–296
22. Drall S, Drall GS, Singh S, Naib BB (2018) Chronic kidney disease prediction using machine learning: a new approach. Int J Manage Technol Eng 8:278–287
23. Shankar S, Verma S, Elavarthy S, Kiran T, Ghuli P (2020) Analysis and prediction of chronic kidney disease. Int Res J Eng Technol 7(5):4536–4541
24. Deepika B (2020) Early prediction of chronic kidney disease by using machine learning techniques. Am J Comput Sci Eng Surv 8(2):7
25. Revathy S, Bharathi B, Jeyanthi P, Ramesh M (2019) Chronic kidney disease prediction using machine learning models. Int J Eng Adv Technol 9:6364–6367
26. Yadav DC, Pal S (2021) Performance based evaluation of algorithmson chronic kidney disease using hybrid ensemble model in machine learning. Biomed Pharmacol J 14:1633–1646 27. Baidya D, Umaima U, Islam MN, Shamrat FJM, Pramanik A, Rahman MS (2022) A deep prediction of chronic kidney disease by employing machine learning method. In: Proceedings of the 2022 6th international conference on trends in electronics and informatics (ICOEI), Tirunelveli, India, pp 1305–1310 28. Hossain ME, Uddin S, Khan A (2021) Network analytics and machine learning for predictive risk modelling of cardiovascular disease in patients with type 2 diabetes. Expert Syst Appl 164:113918 29. Song X, Waitman LR, Alan SL, Robbins DC, Hu Y, Liu M (2020) Longitudinal risk prediction of chronic kidney disease in diabetic patients using a temporal-enhanced gradient boosting machine: retrospective cohort study. JMIR Med Inform 8(1):e15510 30. Neves J, Martins MR, Vilhena J, Neves J, Gomes S, Abelha A, Machado J, Vicente H (2015) A soft computing approach to kidney diseases evaluation. J Med Syst 39:131 31. Varughese S, Abraham G (2018) Chronic kidney disease in India. Clin J Am Soc Nephrol 13(5):802–804. https://doi.org/10.2215/CJN.09180817 32. Darveshwala AY, Singh D, Farooqui Y (2021) Chronic kidney disease stage identification in HIV infected patients using machine learning. In: 2021 5th international conference on computing methodologies and communication (ICCMC), Erode, India, pp 1509–1514. https:// doi.org/10.1109/ICCMC51019.2021.9418430 33. Al-Hyari AY, Al-Taee AM, Al-Taee MA (2013) Clinical decision support system for diagnosis and management of chronic renal failure. In: Proceedings of the 2013 IEEE Jordan conference on applied electrical engineering and computing technologies (AEECT), Amman, Jordan, pp 1–6 34. Tazin N, Sabab SA, Chowdhury MT (2016) Diagnosis of chronic kidney disease using effective classification and feature selection technique. In: Proceedings of the 2016 international conference on medical engineering, health informatics and technology (MediTec), Dhaka, Bangladesh, pp 1–6 35. Bhattacharya M, Jurkovitz C, Shatkay H (2018) Chronic kidney disease stratification using office visit records: handling data imbalance via hierarchical meta-classification. BMC Med Inf Decis Mak 18:125 [CrossRef] 36. Akben S (2018) Early stage chronic kidney disease diagnosis by applying data mining methods to urinalysis, blood analysis and disease history. IRBM 39:353–358 [CrossRef] 37. Senan EM, Al-Adhaileh MH, Alsaade FW, Aldhyani THH, Alqarni AA, Alsharif N, Uddin MI, Alahmadi AH, Jadhav ME, Alzahrani MY (2021) Diagnosis of chronic kidney disease using effective classification algorithms and recursive feature elimination techniques. J Healthc Eng 2021:1004767 [CrossRef] [PubMed] 38. Qin J, Chen L, Liu Y, Liu C, Feng C, Chen B (2019) A machine learning methodology for diagnosing chronic kidney disease. IEEE Access 8:20991–21002 [CrossRef] 39. Segal Z, Kalifa D, Radinsky K, Ehrenberg B, Elad G, Maor G, Lewis M, Tibi M, Korn L, Koren G (2020) Machine learning algorithm for early detection of end-stage renal disease. BMC Nephrol 21:518 [CrossRef] 40. Polat H, Mehr HD, Cetin A (2017) Diagnosis of chronic kidney disease based on support vector machine by feature selection methods. J Med Syst 41:55 [CrossRef] [PubMed] 41. Ebiaredoh-Mienye SA, Esenogho E, Swart TG (1963) Integrating enhanced sparse autoencoder-based artificial neural network technique and softmax regression for medical diagnosis. Electronics 2020:9 42. 
Walse RS, Kurundkar GD, Khamitkar SD, Muley AA, Bhalchandra PU, Lokhande SN (2020) Effective use of naïve bayes, decision tree, and random forest techniques for analysis of chronic kidney disease. In: Senjyu T, Mahalle PN, Perumal T, Joshi A (eds) International conference on information and communication technology for intelligent systems. Springer, Singapore 43. Nithya A, Appathurai A, Venkatadri N, Ramji DR, Palagan CA (2020) Kidney disease detection and segmentation using artifcial neural network and multi-kernel k-means clustering for ultrasound images. Measurement. https://doi.org/10.1016/j.measurement.2019.106952
44. Imran AA, Amin MN, TujJohora F (2018) Classifcation of chronic kidney disease using logistic regression, feedforward neural network and wide & deep learning. In: 2018 international conference on innovation in engineering and technology (ICIET). IEEE, pp 1–6 45. Yin S et al (2020) Automatic kidney segmentation in ultrasound images using subsequent boundary distance regression and pixelwise classification networks. Med Img Anal 60:101602 46. Norouzi J et al (2016) Predicting renal failure progression in chronic kidney disease using integrated intelligent fuzzy expert system. Comput Math Methods Med 2016:1–9 47. Chen J et al (2007) An unsupervised pattern (syndrome in traditional Chinese medicine) discovery algorithm based on association delineated by revised mutual information in chronic renal failure data. J Biol Syst 15(04):435–451 48. Kolachalama VB et al (2018) Association of pathological fibrosis with renal survival using deep neural networks. Kidney Int Rep 3(2):464–475 49. Almansour NA et al (2019) Neural network and support vector machine for the prediction of chronic kidney disease: a comparative study. Comput Biol Med 109:101–111 50. Ahmed RM, Alshebly OQ (2019) Prediction and factors affecting of chronic kidney disease diagnosis using artificial neural networks model and logistic regression model. Iraqi J Stat Sci 16(28):140–159 51. Kriplani H, Patel B, Roy S (2019) Prediction of chronic kidney diseases using deep artificial neural network technique. In: Computer aided intervention and diagnostics in clinical and medical images, pp 179–187 52. Sathya PS, Suresh KM (2018) Chronic kidney disease prediction using machine learning. Int J Comput Sci Inf Sec 16(4) 53. Arafat F, Fatema K, Islam S (2018) Classification of chronic kidney disease (CKD) using data mining techniques. Doctoral dissertation, Daffodil International University. 54. Pujari RM, Hajare MVD (2014) Analysis of ultrasound images for identification of chronic kidney. In: First international conference on networks & soft computing, pp 380–383 55. Chetty N, Vaisla KS, Sudarsan SD (2015) Role of attributes selection in classification of chronic kidney disease patients. IEEE 56. Kunwar V, Chandel K, Sabitha AS, Bansal A (2016) Chronic kidney disease analysis using data mining classification. In: 2016 6th international conference-cloud system and big data engineering (confluence), pp 300–305 57. Wibawa MS, Maysanjaya IMD, Putra IMAW (2017) Boosted classifier and features selection for enhancing chronic kidney disease diagnose. 2017 5th international conference on cyber and IT service management (CITSM) 58. Arasu SD, Thirumalaiselvi R (2017) A novel imputation method for effective prediction ofcoronary Kidney disease. In: 2017 2nd International conference on computing and communications technologies (ICCCT), pp 127–136 59. Avci E, Extraction AD (2018) Performance comparison of some classifiers on chronic kidney disease data 60. Vijayarani1 S, Dhayanand S (2015) Kidney disease prediction using SVM and ANN algorithms. Int J Comput Bus Res 6(2):2229–6166 61. Kumar M (2016) Prediction of chronic kidney disease using random forest machine learning algorithm. Int J Comput Sci Mobile Comput 5(2):24–33 62. Anantha Padmanaban KR, Parthiban G (2016) Applying machine learning techniques for predicting the risk of chronic kidney disease. Indian J Sci Technol 9(29). https://doi.org/10. 17485/ijst/2016/v9i29/93880 63. Sharma N, Verma RK (2016) Prediction of kidney disease by using data mining techniques. 
Int J Adv Res Comput Sci Softw Eng 6(9). ISSN: 2277-128X 64. Debal DA, Sitote TM (2022) Chronic kidney disease prediction using machine learning techniques. J Big Data 9:109. https://doi.org/10.1186/s40537-022-00657-5 65. Dritsas E, Trigka M (2022) Machine learning techniques for chronic kidney disease risk prediction. Big Data Cogn Comput 6:98. https://doi.org/10.3390/bdcc6030098 66. Koppe L, Fouque D, Kalantar-Zadeh K (2019) Kidney cachexia or protein-energy wasting in chronic kidney disease: facts and numbers. J Cachexia Sarcopenia Muscle 10(3):479–484
67. Bai Q, Su C, Tang W, Li Y (2022) Machine learning to predict end stage kidney disease in chronic kidney disease. Sci Rep 12(1):8377. https://doi.org/10.1038/s41598-022-12316-z 68. Pal S (2022) Chronic kidney disease prediction using machine learning techniques. Biomed Mater Dev. https://doi.org/10.1007/s44174-022-00027-y 69. Aswathy RH, Suresh P, YacinSikkandar M, Abdel-Khalek S, Alhumyani H et al (2022) Optimized tuned deep learning model for chronic kidney disease classification. Comput Mater Continua 70(2):2097–2111 70. Holmstrom L, Christensen M, Yuan N, Weston Hughes J, Theurer J, Jujjavarapu M, Fatehi P, Kwan A, Sandhu RK, Ebinger J, Cheng S, Zou J, Chugh SS, Ouyang D (2022) Deep learning based electrocardiographic screening for chronic kidney disease. https://doi.org/10.1101/2022. 03.01.22271473 71. Dritsas E, Trigka M (2022) Machine learning techniques for chronic kidney disease risk prediction. Big Data Cogn Comput 6(3):98. https://doi.org/10.3390/bdcc6030098 72. Thasil MA, Santhoshkumar S, Varadarajan V (2022) Intelligent deep learning based predictive model for coronary heart disease and chronic kidney disease on people with diabetes mellitus. Malaysian J Comput Sci 88–101. https://doi.org/10.22452/mjcs.sp2022no1.7 73. Singh V, Asari VK, Rajasekaran R (2022) A deep neural network for early detection and prediction of chronic kidney disease. Diagnostics 12(1):116. https://doi.org/10.3390/diagnosti cs12010116 74. Fu X, Liu H, Bi X, Gong X (2021) Deep-learning-based CT imaging in the quantitative evaluation of chronic kidney diseases. J Healthc Eng 2021. Article ID 3774423, 9 pages 75. Zhang K et al (2021) Deep-learning models for the detection and incidence prediction of chronic kidney disease and type 2 diabetes from retinal fundus images. Nat Biomed Eng 5(6):533–545. https://doi.org/10.1038/s41551-021-00745-6 76. Krishnamurthy S, Ks K, Dovgan E, Luštrek M, Piletiˇc BG, Srinivasan K, Li Y-C, Gradišek A, Syed-Abdul S (2021) Machine learning prediction models for chronic kidney disease using national health insurance claim data in Taiwan. Healthcare 9:546 [CrossRef] [PubMed] 77. Navaneeth B, Suchetha M (2020) A dynamic pooling based convolutional neural network approach to detect chronic kidney disease. Biomed Signal Proc Control 62:102068 78. Ma F, Sun T, Liu L, Jing H (2020) Detection and diagnosis of chronic kidney disease using deep learning-based heterogeneous modified artificial neural network. Future Gener Comput Syst 111:17–26. https://doi.org/10.1016/j.future.2020.04.036 79. Ibrahim Iliyas I, Rambo SI, Dauda AB, Tasiu S (2021) Prediction of chronic kidney disease using deep neural network. FUDMA J Sci 4(4):34–41. https://doi.org/10.33003/fjs-2020-040 4-309 80. Sabanayagam C, Xu D, Ting DSW, Nusinovici S, Banu R, Hamzah H, Lim C, Tham YC, Cheung CY, Tai ES, Wang YX, Jonas JB, Cheng CY, Lee ML, Hsu W, Wong TY (2020) A deep learning algorithm to detect chronic kidney disease from retinal photographs in communitybased populations. Lancet Digit Health. 2(6):e295–e302. https://doi.org/10.1016/S2589-750 0(20)30063-7. Epub 2020 May 12 PMID: 33328123 81. Khamparia A, Saini G, Pandey B, Tiwari S, Gupta D, Khanna A (2020) KDSAE: chronic kidney disease classification with multimedia data learning using deep stacked autoencoder network. Multim Tools Appl 79(47–48):35425–35440. https://doi.org/10.1007/s11042-019-07839-z 82. 
Makino M, Yoshimoto R, Ono M, Itoko T, Katsuki T, Koseki A, Kudo M, Haida K, Kuroda J, Yanagiya R et al (2019) Artificial intelligence predicts the progression of diabetic kidney disease using big data machine learning. Sci Rep 9:11862 83. Brunetti A, Cascarano GD, De Feudis I, Moschetta M, Gesualdo L, Bevilacqua V (2019) Detection and segmentation of kidneys from magnetic resonance images in patients with autosomal dominant polycystic kidney disease. In: Huang D-S, Jo K-H, Huang Z-K (eds) International conference on intelligent computing. Springer International Publishing, Cham 84. Xiong XL, Guo Y, Wang YY et al (2019) Ultrasonic image segmentation of kidney tumors based on adaptive partition evolution level sets. J Biomed Eng 36(6):945–956 85. Sharma K, Rupprecht C, Caroli A, Aparicio MC, Remuzzi A, Baust M, Navab N (2017) Automatic segmentation of kidneys using deep learning for total kidney volume quantification
in autosomal dominant polycystic kidney disease. Sci Rep 7(1):2049. https://doi.org/10.1038/s41598-017-01779-0. PMID: 28515418; PMCID: PMC5435691
86. Chimwayi K, Haris N, Caytiles R, Iyenger N, Narayana CS (2017) Risk level prediction of chronic kidney disease using neuro-fuzzy and hierarchical clustering algorithm(s). Int J Multim Ubiq Eng 12:23–36. https://doi.org/10.14257/ijmue.2017.12.8.03
87. Xiang D, Bagci U, Jin C et al (2017) CorteXpert: a model-based method for automatic renal cortex segmentation. Med Image Anal 42:257–273
88. Song H, Kang W, Zhang Q, Wang S (2015) Kidney segmentation in CT sequences using SKFCM and improved GrowCut algorithm. BMC Syst Biol 9(Suppl 5):S5
89. Santos-Araújo C, Mendonça L, Carvalho DS, Bernardo F, Pardal M, Couceiro J, Martinho H, Gavina C, Taveira-Gomes T, Dinis-Oliveira RJ (2023) Twenty years of real-world data to estimate chronic kidney disease prevalence and staging in an unselected population. Clin Kidney J 16(1):111–124. https://doi.org/10.1093/ckj/sfac206
90. Koppe L, Soulage CO (2022) The impact of dietary nutrient intake on gut microbiota in the progression and complications of chronic kidney disease. Kidney Int 102(4):728–739. https://doi.org/10.1016/j.kint.2022.06.025. PMID: 35870642
91. Fu X, Liu H, Bi X, Gong X (2021) Deep-learning-based CT imaging in the quantitative evaluation of chronic kidney diseases. J Healthc Eng 28(2021):3774423. https://doi.org/10.1155/2021/3774423. PMID: 34745497; PMCID: PMC8568539
92. Chittora P, Chaurasia S, Chakrabarti P, Kumawat G, Chakrabarti T, Leonowicz Z, Jasiński M, Jasiński Ł, Gono R, Jasińska E et al (2021) Prediction of chronic kidney disease-a machine learning perspective. IEEE Access 9:17312–17334 [CrossRef]
93. Zhang K, Liu X, Xu J et al (2021) Deep-learning models for the detection and incidence prediction of chronic kidney disease and type 2 diabetes from retinal fundus images. Nat Biomed Eng 5:533–545. https://doi.org/10.1038/s41551-021-00745-6
94. Saez-Rodriguez J, Rinschen MM, Floege J, Kramann R (2019) Big science and big data in nephrology. Kidney Int 95(6):1326–1337. https://doi.org/10.1016/j.kint.2018.11.048. Epub 2019 Mar 5 PMID: 30982672
Convolutional Neural Network and Recursive Feature Elimination Based Model for the Diagnosis of Mild Cognitive Impairments Harsh Bhasin , Abheer Mehrotra, and Ansh Ohri
Abstract Aging leads to reduced cognitive abilities. A person heading toward dementia will also show such signs, but with more severity. This stage is referred to as Mild Cognitive Impairment (MCI). It has been observed that nearly one-fifth of the people suffering from MCI convert to dementia. The patients suffering from MCI who convert to dementia are called MCI-Converts, and those who do not convert are called MCI Non-converts. This work proposes a model that extracts gray matter from the s-MRI brain volume and explores the applicability of a combination of Convolutional Neural Network and Recursive Feature Elimination in the diagnosis of MCI. Features obtained using the carefully crafted CNN, followed by the selection of the appropriate features using Recursive Feature Elimination, are then used to accomplish the task. Empirical analysis has been done to select the parameters of the CNN. The results are better than the state of the art and pave the way for the exploitation of Deep Learning models to classify the Converts and the Non-converts. Keywords Mild Cognitive Impairment · Recursive Feature Elimination · Convolutional Neural Networks · Structural-Magnetic Resonance Imaging · Gray matter
1 Introduction Dementia cannot be cured but can be traced, and hence, it becomes important to identify the formative stages of dementia. The progression to dementia can be delayed by handling the symptoms. Thus, the identification of the formative stage facilitates
H. Bhasin (B) Department of Computer Science and Engineering, Center for Health Innovations, Manav Rachna International Institute of Research and Studies, and Computer Science Engineering, MRIIRS, Aravali Hills, Sector 43, Faridabad, Haryana, India e-mail: [email protected]
A. Mehrotra · A. Ohri Manav Rachna University, Aravali Hills, Sector 43, Faridabad, Haryana, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_8
this task. A person's progression to dementia will show severe signs of cognitive impairment vis-à-vis the decline in cognitive abilities with normal aging. Mild Cognitive Impairment (MCI) is a stage between the decline in cognitive abilities with normal aging and the graver decline with dementia [1]. As per the state of the art, nearly one-fifth of MCI patients progress to Alzheimer's Disease (AD) [2]. Researchers have used clinical data [3], functional-Magnetic Resonance Imaging (f-MRI) [4], structural-Magnetic Resonance Imaging (s-MRI) [5, 6], and Positron Emission Tomography (PET) [7] to diagnose MCI. This work uses s-MRI to diagnose MCI, as it helps to identify structural changes in the volume. Furthermore, the proposed model uses gray matter extracted from the volumes, as the decrease in gray matter in some areas of the brain is considered responsible for cognitive impairment [8]. We have used the full brain volume and not some regions of interest (ROI), as ROI-based methods may miss out on regions which may be responsible for the impairment but have not yet been identified. Researchers have used conventional machine learning models to identify MCI. These models contain an appropriate combination of pertinent feature extraction and selection methods. Researchers have used Local Binary Pattern (LBP) [9, 10], 3D LBP and Discrete Wavelet Transform (DWT) [11], Gray-Level Run Length Matrix (GLRL) [12], Gray-Level Co-occurrence Matrix (GLCM) [13], etc., for feature extraction. Researchers have also used Fisher Discriminant Ratio (FDR) [14], Genetic Algorithm (GA) [15], and Diploid Genetic Algorithm (DGA) for feature selection. These methods require careful selection of feature extraction and selection techniques, optimization of the model, and selection of appropriate hyper-parameters. Deep Learning aims to figure out an appropriate representation of the data using numerous layers of abstraction [16]. Convolutional Neural Networks (CNNs) are a type of Deep Learning architecture that encodes certain properties of input images into the architecture [17]. These methods do not involve explicit feature extraction and selection and hence are more objective. This paper uses the features extracted from various CNN models and applies Recursive Feature Elimination (RFE) to select the important features for classification. The variation in the performance of the model with the parameters is studied and analyzed. The results suggest that the use of CNN for feature extraction, followed by the use of RFE for feature selection, helps in the diagnosis of MCI, provided the hyper-parameters, like the number of layers in the CNN, the dropout rate, the activation functions, etc., are selected with due consideration. This paper contains four sections. The materials and methods used in the research are discussed in the next section, and the results and discussion follow. The last section presents the conclusion.
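As a concrete illustration of the gray-matter extraction step mentioned above, the sketch below loads a NIfTI volume with nibabel and masks it with a gray-matter probability map. The file names are hypothetical, the probability map is assumed to have been produced beforehand by a segmentation tool such as SPM or FSL, and the 0.5 threshold is likewise an assumption made only for illustration.

```python
import nibabel as nib
import numpy as np

# Hypothetical file names: a preprocessed T1 volume and a gray-matter
# probability map produced beforehand by a segmentation tool (e.g. SPM/FSL).
t1 = nib.load("subject_T1.nii.gz").get_fdata()
gm_prob = nib.load("subject_GM_prob.nii.gz").get_fdata()

gm_mask = gm_prob > 0.5                 # keep voxels that are mostly gray matter
gm_volume = np.where(gm_mask, t1, 0.0)  # zero out everything else

print("Volume shape:", gm_volume.shape)
print("Gray-matter voxels:", int(gm_mask.sum()))
```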
2 State of the Art An extensive literature review was carried out to assess the state of the art and to find the gaps in the literature. Bhasin et al. used 3D Local Binary Pattern and 3D DWT for extracting pertinent features from the s-MRI data [5]. Tufail et al. used deep 2D-CNNs on s-MRI scans obtained from the OASIS dataset to learn various features which are combined to make the final classification for AD diagnosis [18]. Fang et al. have worked on the Diffusion Tensor Imaging (DTI) modality to classify the data into four classes; the paper also proposes a multi-modal learning theory that uses transfer learning [19]. Illakiya et al. applied various Deep Learning techniques such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) on various modalities like s-MRI, f-MRI, and PET [20]. Turkson et al. used Convolutional Spiking Neural Networks (SNNs) that were earlier pre-trained on the MRI data to classify AD [21]; the method was validated using the ADNI dataset. Researchers have also used a combination of the latest Deep Learning methods like 3D CNNs and Support Vector Machines for the classification of controls, MCI, and AD patients [22]. Houria et al. also employed 2D CNNs and SVM to detect the atrophy of the gray matter for the classification of MCI [23]. Some researchers have used CNNs for the diagnosis of MCI. Bhasin et al. have used CNN to find the features and have selected pertinent features using Fisher Discriminant Ratio [24]. Researchers have used Principal Component Analysis to reduce the dimensionality along the third axis, followed by the use of CNN for classification. Likewise, [25] have used multiple-activation parallel Convolutional Networks [26] to classify the data obtained by the combination of t-SNE and PCA. After a comprehensive literature review, it was found that no one has used the proposed combination. The literature review brought forth the following points:
1. Most of the researchers have used the ADNI dataset for the validation of the proposed method.
2. Other datasets like OASIS have also been used, but by a smaller number of researchers.
3. Some of the researchers have used Region of Interest (ROI) based methods, using the regions responsible for cognitive impairment.
4. The researchers who used the conventional Machine Learning pipeline focused on feature selection, extraction, and the classifier.
5. Transfer learning has also been used by the researchers, though the basis of using transfer learning has not been explained.
3 Materials and Methods 3.1 Materials “The datasets supporting the conclusions of this article are available in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) repository, Data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public–private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether Magnetic Resonance Imaging (MRI), PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD.” [27] The MCI-C and MCI-NC patients were selected by querying the ADNI database. This paper uses the protocol suggested in paper [5] for the inclusion and exclusion of the patients. The criteria were formulated by considering many factors including the age criteria, the number of slices, etc. This study uses 75 MCI-C, 89 CN, and 112 MCI-NC processed NIFTI images of patients. The age of the controls was between 63 and 90, those of the MCI-NC was between 56 and 88, and MCI-C was between 55 and 87. The CDR rating of those selected was either 0.5 or 1. The various parameters of the selected images were as follows (i) the images were T1 weighted, (ii) the field strength was 1.5 T, (iii) the TR was 3.6099 ms, and finally (iv) the TE was 3000 ms.
3.2 Methods This work aims to classify MCI-C and MCI-NC using the features extracted from CNN. These features have been extracted from the last but one layer of the CNN shown in Fig. 1. Recursive Feature Elimination is applied to the feature vector so formed and Support Vector Machine is used to classify the two classes.
3.2.1 Convolutional Neural Network
The CNNs are designed, especially for images [28]. These networks have various layers like the convolutional layer, pooling layer, and Fully Connected Layers. The convolutional layer extracts the important features from the given image [28] and creates its feature map. These maps are down-sampled using the pooling layer, which reduces the number of parameters and avoids the problem of overfitting. The hyperparameters of this layer are carefully selected to maximize the performance of the model. The CNNs have shared weight as against the Multi-layer Perceptron, which drastically reduces the number of parameters to be learned. CNNs are transition and rotation invariant. Researchers have designed various CNNs to accomplish image-related tasks, for example, LeNet-5, ResNet, AlexNet,
Fig. 1 CNN features + RFE-based MCI diagnosis
etc. These architectures vary in the type of pooling, like average pooling in LeNet and max pooling in AlexNet. They also vary in the type of activation used like AlexNet uses ReLU and most of the others use sigmoid/tanh. Furthermore, these structures can also be classified based on how they are trained on the GPUs.
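As an illustration of such an architecture, the following Keras sketch builds a small CNN with three convolutional layers and two fully connected layers, following the kernel counts and sizes of the first configuration listed in Sect. 4; the input size and the use of max pooling after every convolution are assumptions made for illustration only.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Three convolutional layers (16, 32, 64 kernels of sizes 5x5, 5x5, 3x3)
# followed by two fully connected layers (64 and 32 units).
def build_cnn(input_shape=(128, 128, 1), n_classes=2):
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, (5, 5), activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, (5, 5), activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_cnn()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```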
3.2.2 Forward Feature Selection
In forward feature selection, the features are ranked in the order of their importance with respect to the label. The so-arranged features are then taken incrementally and
fed to the classifier. The variation in the performance of the model vis-a-vis the number of features is then noted and the features are selected. This work uses Fisher Discriminant Ratio to accomplish this task.
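A minimal sketch of the Fisher Discriminant Ratio ranking is given below. For a binary label, the FDR of a feature is (μ1 − μ2)² / (σ1² + σ2²); the toy data and the small constant added to the denominator are illustrative assumptions.

```python
import numpy as np

def fisher_discriminant_ratio(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Per-feature FDR = (mu1 - mu2)^2 / (var1 + var2) for a binary label y."""
    X1, X2 = X[y == 0], X[y == 1]
    num = (X1.mean(axis=0) - X2.mean(axis=0)) ** 2
    den = X1.var(axis=0) + X2.var(axis=0) + 1e-12   # avoid division by zero
    return num / den

# Rank features from most to least discriminative (toy data for illustration).
X = np.random.rand(100, 10)
y = np.random.randint(0, 2, size=100)
ranking = np.argsort(fisher_discriminant_ratio(X, y))[::-1]
print("Feature indices, best first:", ranking)
```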
3.2.3 Proposed Method
The method used to carry out the classification is elucidated as follows. For the given data:
1. The data are first divided into the train and the test sets.
2. Carry out the following steps for each train data volume:
2.1 For each slice of the MRI volume, extract the feature representation from the last but one layer of the CNN.
2.2 Horizontally concatenate each feature representation so obtained. The final representation of each volume would be used for classification.
3. Use Recursive Feature Elimination on the feature set.
4. Use the SVM to train the model using the training set.
5. Validate the model on the validation set.
The proposed model is shown in Fig. 1.
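The sketch below illustrates steps 2–4 under some simplifying assumptions: a trained Keras CNN `cnn`, volumes supplied as arrays of slices with the same slice count per subject, and an arbitrarily chosen number of features to retain. It is only a schematic rendering of the pipeline, not the authors' exact implementation.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC
from tensorflow.keras import models

# Assumes `cnn` is an already trained Keras CNN and `volumes` is an array of
# shape (n_subjects, n_slices, H, W) with the same slice count per subject.
def volume_features(cnn, volumes):
    # Step 2.1: re-use the last-but-one layer as a feature extractor.
    extractor = models.Model(cnn.input, cnn.layers[-2].output)
    feats = []
    for vol in volumes:
        slices = vol[..., np.newaxis]          # (n_slices, H, W, 1)
        per_slice = extractor.predict(slices, verbose=0)
        feats.append(np.hstack(per_slice))     # Step 2.2: concatenate slices
    return np.vstack(feats)

def fit_pipeline(cnn, train_volumes, y_train, n_keep=200):
    X_train = volume_features(cnn, train_volumes)
    # Step 3: Recursive Feature Elimination guided by a linear SVM.
    selector = RFE(SVC(kernel="linear"), n_features_to_select=n_keep, step=0.1)
    X_sel = selector.fit_transform(X_train, y_train)
    # Step 4: train the final SVM on the selected features.
    clf = SVC(kernel="linear").fit(X_sel, y_train)
    return selector, clf
```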
4 Results For this work, three CNNs were crafted. The first CNN consists of three convolutional layers having kernel sizes (5 × 5), (5 × 5), and (3 × 3) and the number of kernels as 16, 32, and 64. This is then followed by two fully connected layers having 64 and 32 neurons, respectively. The second CNN is similar to the first one; however, the number of kernels was reduced to 8, 16, and 32, respectively. In the third CNN, the number of kernels was 8, 16, and 32, but the size of the kernels was reduced to 5, 3, and 2, respectively. Figure 2 shows these three CNNs. CNN1 and CNN2 have hyper-parameters as stated above. The results are presented in Table 1. The CNN3 architecture with linear kernel performs best with feature selection (bold). From Table 1, it can also be observed that:
1. With the decrease in the number of parameters, the learning is improved.
2. The Linear kernel performs better than the RBF and polynomial.
3. Feature selection using RFE improves the results.
4. Feature extraction using CNN generates pertinent features.
The proposed work is able to classify MCI-C and MCI-NC with F-Measure of 89.8 using CNN3 (which has the least number of parameters) using RFE and the
Fig. 2 CNN architectures
Table 1 Results of various settings

| Architecture | SVM Kernel | With feature selection (%) | Without feature selection (%) |
|---|---|---|---|
| CNN1 | Linear | 85.08 | 83.12 |
| CNN1 | RBF | 84.13 | 81.51 |
| CNN1 | Polynomial | 84.71 | 80.86 |
| CNN2 | Linear | 88.1 | 81.4 |
| CNN2 | RBF | 87.9 | 80.78 |
| CNN2 | Polynomial | 86.01 | 79.36 |
| CNN3 | Linear | 89.8 | 82.3 |
| CNN3 | RBF | 87 | 80.8 |
| CNN3 | Polynomial | 87.01 | 79.4 |
Table 2 Comparison with the state of the art

| Method | Accuracy (%) |
|---|---|
| Bhasin et al. [5] | 0.8877 |
| Liu et al. [29] | 72.08 |
| Wee [30] | 75.05 |
| Suk et al. [31] | 72.42 |
| Tong et al. [32] | 72 |
| Colliot et al. [33] | 66 |
| Proposed model | 89.8 |
linear kernel. The work has been validated using the ADNI data, as stated in the previous section. The comparison with the state of the art is shown in Table 2.
5 Conclusion The classification of MCI-C and MCI-NC may help clinicians to handle the symptoms, thus helping the patients and their families. Though researchers have already worked on the said classification, this paper explores the applicability of Deep Learning, particularly CNN, in handling this problem. The proposed work also explores the hand-crafted CNN architecture’s ability to generate pertinent features followed by Recursive Feature Elimination, which can be used to accomplish the task. The work has been implemented and the results are encouraging. The future work will use heuristic techniques to select the parameters of the model.
References 1. Henderson VW (2023) Mild cognitive impairment. https://med.stanford.edu/content/dam/ sm/adrc/documents/adrc-information-sheet-mild-cognitive-impairment.pdf. Accessed 17 Apr 2023 2. Ward A et al (2013) Rate of conversion from prodromal Alzheimer’s disease to Alzheimer’s dementia: a systematic review of the literature. Dement Geriatr Cogn Dis Extra 3(1):320–332 3. Lyu Y et al (2021) Classification of mild cognitive impairment by fusing neuroimaging and gene expression data: classification of mild cognitive impairment by fusing neuroimaging and gene expression data. In: The 14th pervasive technologies related to assistive environments conference. Association for Computing Machinery, Corfu, Greece, pp. 26–32 4. Zalesky A et al (2010) Network-based statistic: Identifying differences in brain networks. Neuroimage 53(4):1–11 5. Bhasin H, Agrawal RK (2020) For Alzheimer’s disease neuroimaging initiative: a combination of 3-D discrete wavelet transform and 3-D local binary pattern for classification of mild cognitive impairment. BMC Med Inform Decision Making 20(1):1–10 6. Chincarini A et al (2011) Local MRI analysis approach in the diagnosis of early and prodromal Alzheimer’s disease. Neuroimage 58(2):469–480 7. Singh S et al (2017) Deep learning based classification of FDG-PET data for Alzheimer’s disease categories. Proc SPIE Int Soc Opt Eng 10572(0277-786X):1–37 8. Apostolova LG et al (2007) Three-dimensional gray matter atrophy mapping in mild cognitive impairment and mild Alzheimer disease. Arch Neurol 64(10):1489–1495 9. Pietikäinen M et al (2023) Local binary patterns for still images. In: Computer vision using local binary patterns. Springer, London, pp 13–47. Accessed 13 Feb 2023 10. Ojala T, Pietikainen M (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987 11. Mallat S (1999) A wavelet tour of signal processing. Academic Press, United States 12. Xunkai W (2023) Gray level run length matrix toolbox. MATLAB Central File Exchange. https://www.mathworks.com/matlabcentral/fileexchange/17482-gray-level-run-len gth-matrix-toolbox. Accessed 17 Apr 2023 13. Conners RW, Harlow CA (1980) A theoretical comparison of texture algorithms. IEEE Trans Pattern Anal Mach Intell PAMI 2(3):204–222 14. Duda R, Peter E, Hart (1974) Pattern classification and scene analysis. A Wiley-Interscience Publication 15. Goldberg DE (2006) Genetic algorithms. Pearson Education, India 16. Li F-F, Andrej K Convolutional neural networks for visual recognition. http://cs231n.github. io/convolutional-networks. Accessed 17 Apr 2023 17. LeCun Y et al (2015) Deep learning. Nature 521(7553):436–444 18. Tufail AB, Ma YK, Zhang QN (2020) Binary classification of Alzheimer’s disease using sMRI imaging modality and deep learning. J Digit Imaging 33:1073–1090. https://doi.org/10.1007/ s10278-019-00265-5 19. Fang M, Jin Z, Qin F et al (2022) Re-transfer learning and multi-modal learning assisted early diagnosis of Alzheimer’s disease. Multimed Tools Appl 81:29159–29175. https://doi.org/10. 1007/s11042-022-11911-6 20. Illakiya T, Karthik R (2023) Automatic detection of Alzheimer’s disease using deep learning models and neuro-imaging: current trends and future perspectives. Neuroinform. https://doi. org/10.1007/s12021-023-09625-7 21. Turkson RE, Qu H, Mawuli CB et al (2021) Classification of Alzheimer’s disease using deep convolutional spiking neural network. Neural Process Lett 53:2649–2663. https://doi.org/10. 
1007/s11063-021-10514-wG 22. Raju M, Gopi VP, Anitha VS et al (2020) Multi-class diagnosis of Alzheimer’s disease using cascaded three dimensional-convolutional neural network. Phys Eng Sci Med 43:1219–1228. https://doi.org/10.1007/s13246-020-00924-w
108
H. Bhasin et al.
23. Houria L, Belkhamsa N, Cherfa A et al (2022) Multi-modality MRI for Alzheimer’s disease detection using deep learning. Phys Eng Sci Med 45:1043–1053. https://doi.org/10.1007/s13 246-022-01165-9 24. Bhasin H et al (2021) PCA based hierarchical CNN for the classification of mild cognitive impairments and the role of SIREN activations. In: 2021 2nd Asia conference on computers and communications (ACCC). Singapore, pp. 143–148 25. Bhasin H et al (2021) Multiple-activation parallel convolution network in combination with t-SNE for the classification of mild cognitive impairment. In: 2021 IEEE 21st international conference on bioinformatics and bioengineering (BIBE). Kragujevac, Serbia, pp 1–7 26. Bhasin H et al (2021) Applicability of manually crafted convolutional neural network for classification of mild cognitive impairment. In: 2021 2nd Asia conference on computers and communications (ACCC). Singapore, pp. 127–131 27. Alzheimer’s Disease Neuroimaging Initiative. https://adni.loni.usc.edu/. Accessed 17 Apr 2023 28. Li F-F, Andrej K Convolutional neural networks for visual recognition. http://cs231n.github. io/convolutional-networks. Accessed 17 Apr 2023. 29. Liu J, Li M, Lan W, Wu F-X, Pan Y, Wang J (2018) Classification of Alzheimer’s disease using whole brain hierarchical network. IEEE/ACM Trans Comput Biol Bioinform 15(2):624–32 30. Wee C-Y, Yap P-T, Shen D (2013) Prediction of Alzheimer’s disease and mild cognitive impairment using baseline cortical morphological abnormality patterns. Hum Brain Mapp 34(12):3411–3425 31. Suk H-I, Lee S-W, Shen D (2014) Alzheimer’s disease neuroimaging initiative.: hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage 101:569–582 32. Tong T et al (2014) Multiple instance learning for classification of dementia in brain MRI. Med Image Anal 18(5):808–818 33. Colliot O et al (2008) Discrimination between Alzheimer disease, mild cognitive impairment, and normal aging by using automated segmentation of the hippocampus. Radiology 248(1):194–201
Facial Analysis Prediction: Emotion, Eye Color, Age and Gender J. Tejaashwini Goud, Nuthanakanti Bhaskar, K. Srujan Raju, G. Divya, Srinivasarao Dharmireddi, and Murali Kanthi
Abstract The main goal of this work is to create an automatic facial analysis system for the human face, which will be a crucial component of future analysis applications. The work addresses the recognition of emotion, gender, age, and eye color from an image. We developed a facial recognition system that predicts these facial attributes with the help of deep learning and OpenCV. Three methods are used in this study: one for predicting age and gender, one for estimating eye color, and one for recognizing emotions. According to the results, the facial analysis system delivers improved accuracy. Keywords Gender and age prediction · Eye color prediction · Face expression recognition · Facial analysis
1 Introduction A face image is a distinctive source from which many statistical attributes, such as age, gender, eye color, and emotion, may be derived [1]. Verifying a person from an image underpins numerous authentication modalities, including face, fingerprint, signature, iris, and voice. Facial recognition is currently one of the most popular research directions in demographics, since authenticating a person by facial appearance is intuitive for humans. Facial learning is among the more effective computer–human interface and learning applications. J. T. Goud (B) · N. Bhaskar · K. Srujan Raju · M. Kanthi (B) Department of Computer Science and Engineering, CMR Technical Campus, Hyderabad, India e-mail: [email protected] M. Kanthi e-mail: [email protected] G. Divya Department of IT, CMR Technical Campus, Hyderabad, India S. Dharmireddi Department of Cybersecurity, St. Louis, Missouri, USA
Humans can recognize and analyze an image simply by looking at it, whereas a machine cannot understand it without suitable prior knowledge [2]. Among human features, the face is perhaps the most distinctive, and in addition to face recognition it supports numerous related tasks such as race, gender, emotion, iris, and age recognition. Determining a person's gender is relevant because people's reactions vary depending on gender identity [3]. The efficiency of gender and age categorization depends on essential stages such as feature extraction and classification, which play a vital role in achieving accurate results [4, 5]. The human mind and brain have an incredible capacity to recognize distinct faces based on visual appearance; when it comes to age identification, however, the brain is not very accurate [6]. Facial appearance carries a great deal of information, and humans have an incredible ability to retrieve, recognize, and analyze it [7, 8]. With the expansion of multimedia content, adaptive techniques are necessary to evaluate images, and evaluating a facial picture for identification remains a challenging task in computer vision. In communication, emotions convey crucial behavioral messages [9]. Although in most situations people can identify and understand human faces and emotions with little or no difficulty, machine detection of emotions remains difficult [10]. In various situations, such as with sick patients or persons with other impairments, people are unable to express their feelings; improved detection of such complex feelings and emotions would therefore lead to more successful interactions [11]. Facial analysis is also useful in other settings, such as understanding a rider's state while driving or in heavy traffic management, and in agriculture for understanding farmers' facial features. Facial data can also be used to analyze the resilience and reliability of classification techniques and to explore how algorithms behave on various kinds of data. Face analysis methods are often applied to images or collected frames before characteristics are extracted for the facial recognition system. The facial recognition pipeline can be generalized as follows:
(1) Preprocessing of data
(2) Face identification
(3) Extraction of features
(4) Classification based on the extracted characteristics.
2 Related Work Two different approaches were used: one to predict emotions using a conventional CNN design and the other to estimate gender and age using an advanced ResNet framework. By using CNNs, variations in how these tasks are carried out were discovered. The complete framework's effectiveness is 96.26%, whereas the evaluated variant's success rate is 69.2% [12]. An autoencoder is an artificial neural network used in unsupervised learning approaches. It seeks to learn a reliable and robust representation of the data; these representations are then used in a supervised stage to evaluate the autoencoder's capability to estimate age [13].
In [14], the sensory characteristics of real images are evaluated. To reduce the computational burden, the AdaBoost algorithm selects and extracts a number of Gabor features from a large pool of candidates, and an SVM classifier is then trained on the desired classes. On the Yale and JAFFE datasets, the average recognition rates are 97.57% and 92.33%, respectively [14]. Emotions are used most effectively when they are observed in their genuine, everyday contexts. Four routinely used classifiers, namely multilayer perceptron, random forest, naive Bayes, and logistic regression, were applied to estimate the likelihood of emotions from contextual data gathered from daily observations [15]. In [16], a visual attention mechanism guidance model is used to detect and recognize faces in unrestricted postures, highlighting the predicted regions of occluded faces once the detection task is translated into a high-level recognition problem. In [17], age is determined from facial photographs alone using a cascade ensemble built on what has recently been called Deep Random Forests. The framework is made up of two state-of-the-art DRFs: the first refines and encodes a particular facial description, while the second generates an age prediction that also takes age fuzzy sets into account at the level of the merged regional representation. The evaluation uses publicly accessible datasets including MORPH Caucasian, FG-NET, LFW+, FACES, APPA-real, and friend, and these tests demonstrate that the proposed structures perform better than a number of existing approaches [17]. In [18], about 35 participants organized into 8 teams took part in video-conference sessions; a zoomed segment of each participant's face was used to measure emotion-related expression, and six components of expression were identified. The outcome parameter is the grade that all other participants, except the speaker, assign after every task [18]. An efficient learning scheme specifically considers the difficulties of recognizing facial expressions in unconstrained images, and detailed experiments highlight the strengths of that methodology on a demanding benchmark [19].
3 Methodology The process first requires an input image, currently stored on the server, which is supplied for prediction. The photo is then retrieved from the source for preprocessing, after which it is held in memory.
Fig. 1 Facial analysis process
The next stage is image filtering, a process that helps reduce the high-frequency content of the pictures. This is followed by thresholding, image smoothing, and edge enhancement of the photographs [20]. Next comes feature extraction, which starts from an initial set of observed data and accumulates parameters that are useful and non-redundant for recognition, as they describe the image's behavior. The following step is segmentation, which divides a picture into several parts; the idea is to transform the picture's representation into something more meaningful and easier to examine before converting it to pixel representations. The next phase is to train or evaluate the image data, and the result is obtained after completing all of the preceding processes. The strategy of classifying gender, age, eye color, and emotion from an image is a vital part of our methodology. Figure 1 represents the process of the facial analysis system.
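To make the preprocessing stage concrete, the following is a minimal OpenCV sketch of the filtering, smoothing, thresholding, and edge-enhancement steps described above; the file name and filter parameters are illustrative assumptions rather than the exact settings used in this work.

```python
import cv2

# Illustrative preprocessing pipeline (file name and parameters are assumptions)
image = cv2.imread("input_face.jpg")                    # input picture from the server
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)          # work on a single channel
smoothed = cv2.GaussianBlur(gray, (5, 5), 0)            # picture smoothing / noise reduction
_, binary = cv2.threshold(smoothed, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)   # thresholding
edges = cv2.Canny(smoothed, 100, 200)                   # edge enhancement
cv2.imwrite("preprocessed.png", edges)                  # hand off to feature extraction
```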
4 Implementation The training-to-testing proportion is 80% for learning and 20% for evaluation; validation can also be conducted if desired. Only frontal face images were used to train the system for evaluation. The datasets used for learning and evaluation are taken from Kaggle, namely the IMDB-Wiki and FER-2013 datasets. For facial recognition, the proposed method integrates Deep Neural Networks (DNN), DeepFace, Multi-Task Cascaded Convolutional Neural Networks (MTCNN), and OpenCV. The Haar cascading technique aids in object identification and supports the facial analysis of the human image.
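As a concrete illustration of the 80/20 split, the sketch below loads a local copy of the FER-2013 CSV and partitions it; the file path is an assumption, and the emotion/pixels column names follow the standard Kaggle release of that dataset.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Local copy of the Kaggle FER-2013 file (path is an assumption)
data = pd.read_csv("fer2013.csv")
X = np.stack(data["pixels"].apply(
        lambda p: np.array(p.split(), dtype="float32")).values)
X = X.reshape(-1, 48, 48, 1) / 255.0          # 48x48 grayscale faces, normalized
y = data["emotion"].values

# 80% of the samples for learning, 20% for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y)
```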
4.1 Age and Gender Estimation The suggested model can recognize faces, separate them into Male/Female classes for gender estimation, and assign a picture of a person's face to one of eight age groups: 0–3, 4–7, 8–13, 14–23, 24–37, 38–47, 48–59, and 60–100. Deep neural networks come in many forms and may be employed depending on the requirements or inputs provided; they have three major elements: input, hidden, and output layers. A machine learning model prototype for use within Caffe is described by a PROTOTXT file, which specifies an image analysis or segmentation network intended for training in the Caffe framework; CAFFEMODEL files are created from PROTOTXT files. Convolutional Architecture for Fast Feature Embedding (Caffe) is a learning framework for building classification and segmentation models. Preprocessing consists of three phases: reading the picture, converting it into grayscale, and noise reduction. By removing noise and smoothing the picture data, this step enhances the input picture and exposes the relevant information; it removes redundancy without removing picture information. Face detection is the technique of extracting a face from the background of the input data; it includes segmenting and extracting facial characteristics from the unstructured context of a picture. Feature extraction is essential for object identification: compared with the initial image, it retrieves only the valuable information. The classification step detects face images and groups them into certain classes to aid efficient detection. The categorization step, also referred to as the feature decision stage, focuses on retaining essential data and linking it to specific criteria. This method uses the person's facial features in a given photograph to determine the gender and age of the individual.
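The sketch below shows how such PROTOTXT/CAFFEMODEL pairs are typically loaded with OpenCV's DNN module and applied to a detected face crop; the model file names, mean values, and input size are assumptions that depend on the particular age and gender models used.

```python
import cv2
import numpy as np

AGE_BUCKETS = ["0-3", "4-7", "8-13", "14-23", "24-37", "38-47", "48-59", "60-100"]
GENDERS = ["Male", "Female"]

# Model file names are assumptions; any Caffe age/gender nets load the same way
age_net = cv2.dnn.readNetFromCaffe("age_deploy.prototxt", "age_net.caffemodel")
gender_net = cv2.dnn.readNetFromCaffe("gender_deploy.prototxt", "gender_net.caffemodel")

face = cv2.imread("face_crop.jpg")            # an already-detected face region
blob = cv2.dnn.blobFromImage(face, 1.0, (227, 227),
                             (78.4, 87.8, 114.9),   # assumed training mean values
                             swapRB=False)

gender_net.setInput(blob)
gender = GENDERS[int(np.argmax(gender_net.forward()))]

age_net.setInput(blob)
age = AGE_BUCKETS[int(np.argmax(age_net.forward()))]
print(gender, age)
```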
4.2 Emotion Estimation DeepFace is a Python-based open-source facial attribute analysis framework. It is an advanced computer vision toolkit that helps detect objects in pictures, such as patterns and features inside the image, making them easier to recognize and analyze. Its analyze operation specifies which attribute evaluations it should perform, and DeepFace is capable of processing every detected face in an image. It reports the emotion attributes happy, disgust, angry, sad, surprise, neutral, and fear. Preprocessing is a general term for operations on pictures at the most basic abstraction level, where both input and output are intensity images. Feature extraction is a critical stage in face identification and is described as the method of identifying certain regions, curves, outlines, or landmarks in a given picture.
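A minimal usage sketch of the DeepFace analyze operation is shown below; the image path is an assumption, and depending on the deepface version the call returns either a dict or a list of dicts.

```python
from deepface import DeepFace

# Ask DeepFace to evaluate only the emotion attribute of the given image
result = DeepFace.analyze(img_path="face.jpg", actions=["emotion"])
record = result[0] if isinstance(result, list) else result

print(record["dominant_emotion"])   # e.g. "happy"
print(record["emotion"])            # per-class scores for the seven emotions
```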
The detectMultiScale() method accepts additional parameters that tune the classifier's operation; scaleFactor and minNeighbors are two vital ones. The scaleFactor parameter governs how much the input picture is scaled down at each detection scale, and the minNeighbors parameter controls how many neighboring detections a candidate region must have before it is retained. To detect faces efficiently, scaleFactor and minNeighbors are frequently tuned to the image data. The Haar cascade approach detects the facial objects in the image. The photo region containing the facial data is scaled to 48 × 48 and fed into the DNN. The principle behind DNNs is that the network's layers learn characteristics related to specific emotions; the network generates a score for each of the seven emotion classes, and the facial emotion with the highest score is presented.
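The following sketch ties these pieces together: Haar cascade detection with tuned scaleFactor and minNeighbors, a 48 × 48 grayscale crop, and a forward pass through an emotion network; the emotion model file and its class ordering are assumptions that depend on how the network was trained.

```python
import cv2

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]  # assumed order

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
emotion_net = cv2.dnn.readNet("emotion_model.onnx")      # hypothetical 48x48 emotion net

frame = cv2.imread("photo.jpg")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    roi = cv2.resize(gray[y:y + h, x:x + w], (48, 48))           # scale the face to 48x48
    blob = cv2.dnn.blobFromImage(roi, 1.0 / 255.0, (48, 48))     # normalize and batch
    emotion_net.setInput(blob)
    scores = emotion_net.forward().flatten()                     # seven class scores
    print(EMOTIONS[int(scores.argmax())])                        # highest-ranked emotion
```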
4.3 Eye Color Estimation The iris is the colored portion of the eye, and its coloration determines eye color. HSV is closer to how people perceive color; it is a color space with three components that describe a color by its hue, saturation, and brightness value. State-of-the-art facial recognition is achievable using the MTCNN library and its Multi-task Cascaded CNN. MTCNN can be divided into three phases, composed of CNNs of differing complexity, where the last phase performs face detection and landmark localization concurrently. After configuring and loading the model, it can be applied directly to recognize faces in pictures via the detect_faces() method. It returns a collection of dict objects, each with keys containing the details of an identified face: box, confidence, and keypoints. The landmark points are estimated over the whole face picture; using these landmarks as centers, a set of candidate eye regions is produced, and each region is identified as the left or right eye of the face in the image. Learning the eye feature points through the network then provides the estimation of eye color.
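The sketch below follows this description: MTCNN supplies the eye keypoints, a small window around each keypoint is converted to HSV, and the mean hue is mapped to a colour label. The window size and the hue ranges are illustrative assumptions; practical thresholds need tuning.

```python
import cv2
from mtcnn import MTCNN

# Hypothetical hue ranges (OpenCV hue is 0-179); real thresholds need calibration
HUE_CLASSES = [((0, 15), "brown"), ((15, 35), "amber/hazel"),
               ((35, 85), "green"), ((85, 130), "blue")]

detector = MTCNN()
img = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)

for face in detector.detect_faces(img):          # dicts with box, confidence, keypoints
    for eye in ("left_eye", "right_eye"):
        ex, ey = face["keypoints"][eye]          # landmark used as the eye-region centre
        patch = img[max(ey - 8, 0):ey + 8, max(ex - 8, 0):ex + 8]
        hsv = cv2.cvtColor(patch, cv2.COLOR_RGB2HSV)
        hue = float(hsv[:, :, 0].mean())         # average hue of the iris region
        label = next((name for (lo, hi), name in HUE_CLASSES if lo <= hue < hi), "other")
        print(eye, label)
```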
5 Experimental Results The proposed model combining the three methods in a hierarchy has been executed, and Fig. 2 below provides the results of age and gender, eye color, and emotion prediction. The loss function is computed over all data throughout an epoch and is therefore guaranteed to deliver a precise loss estimate at the current epoch; however, showing the curve over iterations only provides the loss on a portion of the overall data. Figure 3 represents the epoch versus loss graph of emotion estimation. Figure 4 represents the epoch versus loss graph of gender estimation.
Fig. 2 Facial analysis
Fig. 3 Epoch versus loss of emotion estimation
Figure 5 represents the epoch versus loss graphs of age estimation. Figure 6 represents the epoch versus loss graphs of eye color estimation.
Fig. 4 Epoch versus loss of gender estimation
Fig. 5 Epoch versus loss of age estimation
Fig. 6 Epoch versus loss of eye color estimation
6 Conclusion We utilized OpenCV and Python in this design since they are straightforward to develop with and understand. This research offered a new combination of age and gender estimation, eye color prediction, and emotion recognition. The proposed model combining the three methods in a hierarchy has been executed effectively. The accuracies achieved by these models are 78.6% for gender and age estimation, 64.7% for eye color estimation, and 70.8% for emotion recognition. In the future, this process will be enhanced by integrating multiple patterns and by predicting from videos or webcams.
References 1. Abirami B, Subashini TS, Mahavaishnavi V (2020) Gender and age prediction from real time facial images using CNN. Mater Today Proc 33:4708–4712 2. Morampudi M, Gonthina N, Bhaskar N, Dr-Dinesh Reddy V (2023) Image description generator using residual neural network and long short-term memory 3. Ba AM, Fares E (2016) Real-time gender classification by face. Int J Adv Comput Sci Appl 7(3) 4. Agbo-Ajala O, Viriri S (2020) Deeply learned classifiers for age and gender predictions of unfiltered faces. Sci World J 5. Duan M et al (2018) A hybrid deep learning CNN–ELM for age and gender classification. Neurocomputing 275:448–461 6. Gupta R, Khunteta A (2012) SVM age classify based on the facial images. Int J Comput 1(2) 7. Grigory A et al (2017) Effective training of convolutional neural networks for face-based gender and age prediction. Pattern Recogn 72:15–26 8. Bukar AM, Ugail H, Connah D (2016) Automatic age and gender classification using supervised appearance model. J Electron Imaging 25(6):061605 9. Bellamkonda S, Gopalan NP (2018) A facial expression recognition model using support vector machines. IJ Math Sci Comput 4:56–65 10. Ghimire D, Lee J (2013) Geometric feature-based facial expression recognition in image sequences using multi-class adaboost and support vector machines. Sensors 13(6):7714–7734 11. Hassouneh A, Mutawa AM, Murugappan M (2020) Development of a real-time emotion recognition system using facial expressions and EEG based on machine learning and deep neural network methods. Inf Med Unlocked 20:100372 12. Arjun S et al (2021) Age, gender prediction and emotion recognition using convolutional neural network. Available at SSRN 3833759 13. Zaghbani S, Boujneh N, Bouhlel MS (2018) Age estimation using deep learning. Comput Electr Eng 68:337–347 14. Owusu E, Zhan Y, Mao QR (2014) An SVM-AdaBoost facial expression recognition system. Appl Intell 40(3):536–545 15. Ortega S, Martin G, Rodríguez L-F, Gutierrez-Garcia JO (2020) Towards emotion recognition from contextual information using machine learning. J Ambient Intell Humanized Comput 11(8):3187–3207 16. Yuan Z (2020) Face detection and recognition based on visual attention mechanism guidance model in unrestricted posture. Sci Program 17. Guehairia O et al (2020) Feature fusion via deep random forest for facial age estimation. Neural Netw 130:238–252 18. Rößler J, Sun J, Gloor P (2021) Reducing videoconferencing fatigue through facial emotion recognition. Future Internet 13(5):126
19. Eidinger E, Enbar R, Hassner T (2014) Age and gender estimation of unfiltered faces. IEEE Trans Inf Forensics Secur 9(12):2170–2179 20. Bhaskar N, Ganashree TS (2017) Deployment of weighted guided filtering scheme to enhance digital video quality. In: 2017 International conference of electronics, communication and aerospace technology (ICECA), pp 119–124. https://doi.org/10.1109/ICECA.2017.8203656
An Automated Smart Plastic Waste Recycling Management Systems Vamaraju Hari Hara Nadha Sai, Nuthanakanti Bhaskar, Srinivasarao Dharmireddi, K. Srujan Raju, G. Divya, and Jonnadula Narasimharao
Abstract In this work, an automated smart plastic waste recycling management system is presented. The Waste Classification data set from Kaggle is used in this analysis. First, the images of various years are cleaned and merged. Using a time series algorithm, the waste generation, recycling generation, and recycling rate of various years are calculated. In addition, the system generates time series graphs for waste generation, recycled waste, and the recycling rate of different materials. Compared to previous systems, this system achieves a better recycling rate. Keywords Waste recycling · Time series algorithm · Recycling generation and recycle rate
1 Introduction The diversity and generation of waste have expanded as the urbanization process has intensified. Waste pollution is one of the most serious environmental challenges confronting the world today. Industry demands high efficiency, and recycling is important for both ecological and economic reasons [1]. Waste collection and recycling are important services for cities today, especially larger ones. Recycling is crucial to preventing the pollution and health risks that residents face as a result of the loss of natural resources and the environmental challenges brought on by increasing waste volumes [2].
V. H. H. N. Sai (B) · N. Bhaskar · K. Srujan Raju · J. Narasimharao Department of CSE, CMR Technical Campus, Hyderabad, Telangana, India e-mail: [email protected] S. Dharmireddi Department of Cyber Security, MasterCard, St. Louis, Missouri, USA G. Divya Department of IT, CMR Technical Campus, Hyderabad, Telangana, India
A sustainable and clean environment depends on recycling. Solid waste management and recycling issues affect both developed and developing nations, and waste classification is a good method for separating recyclables from other waste [3]. The major barriers to waste recycling are: (i) inadequate government regulation of and funding for MSW management; (ii) household education: households are unaware of the need for self-waste recycling; (iii) technology: a lack of efficient recycling technologies; and (iv) management expense: manual trash categorization is expensive [4]. Plastic's popularity for everyday use is due to its low density, hydrophobicity, and high chemical stability. Depending on their intended function, plastics can easily be formed into different shapes. Polymers, defined as long chains of monomers linked together, are also referred to as plastics. Polymers may be made from natural materials such as cellulose, which is the principal component of plant cell walls and allows cells to vary their roles. Apart from these characteristics, plastics offer a high strength-to-weight ratio and a long lifetime [5]. Since plastic output and waste are expanding dramatically, scientists and researchers are seeking innovative and long-term ways to reuse and recycle plastic rubbish in order to lessen its environmental impact. Waste plastic has several uses, including construction materials, recycling into fuel, household items, fabric, and clothing. Plastics are highly important materials because of their benefits such as light weight, longevity, high elasticity, water resistance, durability, strength, ease of transportation, corrosion resistance, and low cost. The use of these materials also has many benefits, including medical devices, social benefits, packaging that decreases food waste, and life-saving protective gear. Plastics have a much higher calorific value than wood, glass, paper, or metals (with the exception of aluminum), with polymer energy [6] ranging from 62 to 108 MJ/kg (including feedstock energy). Plastic has numerous advantages such as resistivity, stretchability, low cost, light weight, and reusability to some extent. The main concern, however, is that plastic is discarded just after its usage; due to its material properties, plastic can persist in the environment for a long time and can cause hazardous effects [7]. Owing to poor waste management practices in developing nations like India, which accounts for 71% of Asia's mishandled plastics, it is difficult to properly collect, segregate, treat, and dispose of PW (Plastic Waste). As a result, recycling [8], upscaling, or reprocessing of PW has become an essential requirement in order to curb plastic mismanagement and mitigate the negative environmental impacts of plastic use. However, after post-consumer use, this resource has not received the attention it deserves. As the amount of waste increases year after year, communities can build technology-driven solutions to reuse, reduce, and recycle PW in an environmentally friendly manner. Depending on the quality of the product produced through recycling, PW is often recycled [9] or reprocessed using five different types of processes, namely recycling (open or closed loop), upgrading, downgrading, waste-to-energy facilities, and landfills or dumps.
Usually, PW is processed into inferior goods like granules, flakes, or pellets that are used to create a range of finished goods such as mats, pots, furniture, and boards (Centre for Science and Environment (CSE)).
Recycling is the most effective strategy to minimize trash generation, protect the environment, and boost the economy of the entire country. The purity and accuracy of the sorted raw materials have a significant influence on the quality of the recycling process [10]. It is critical to create a plastic waste management system in order to prevent environmental pollution and conserve energy. Therefore, in this work, an automated smart plastic waste recycling management system is presented. The remaining work is structured as follows: the literature review is described in Sect. 2; the automated smart plastic waste recycling management system is presented in Sect. 3; Sect. 4 describes the result analysis of the presented approach; and, lastly, Sect. 5 concludes this study.
2 Literature Survey Misha et al. [11] outline an image processing method for identifying plastic garbage. Their aim is to use image processing techniques to develop a model that helps separate plastic waste from other trash. In order to compare the efficiency of a Convolutional Neural Network and a Support Vector Machine on an image dataset, they built models using both of these techniques. The image dataset consists of raw images captured from the surroundings; although the primary purpose of the research is to detect plastic trash, the gathered photographs are mostly of used plastics rather than brand new, never-used plastics. After analysing the results, they found that the CNN algorithm outperforms the SVM approach for this model. Rossi and Bianchini [12] describe the evaluation, implementation, and design of a more sustainable methodology for handling plastic trash during sporting events. The paper suggests a new approach to manage plastic waste at marathons; its purpose is not to remove plastics from sporting events, but rather to enhance waste management by improving sorting, recycling efficiency, and collection in line with the circular economy idea. The University of Bologna's innovative visualisation tool is evaluated using a precise and quantitative technique to assess circular efforts, along with a few key performance indicators (KPIs) to gauge the sustainability of the new model against the waste management used in early iterations of the same event. Foogooa et al. [13] present recyclable waste classification using computer vision and deep learning: a deep learning technique that uses computer vision to recognise and categorise rubbish into five major categories: paper, metal, plastic, glass, and cardboard. Machine learning methods that can be trained for effective identification are the main focus of that work. SVM, Sigmoid, and SoftMax classifiers and a minimum of 12 different Convolutional Neural Network variants were used and trained on pre-existing images; the results show that the accuracy of VGG19 with a SoftMax classifier is about 88%. Adeshina et al. [14] describe the creation and use of a system for sorting plastic waste, intended for the waste management and plastic recycling sectors.
It consists of a revolving container with two sensors inside (colour and weight sensors) for sorting plastic waste efficiently; system performance was satisfactory, and overall system efficiency was 73%. Hirulkar et al. [15] demonstrate a homemade hydraulic plastic-waste compacting mechanism. The study describes an improved hydraulic plastic waste compacting system and evaluates the effectiveness of the new system. Municipal plastic waste (MPW) management [16] issues put people at risk: according to several studies, 80–90% of MPW is disposed of unscientifically in open landfills and dumps, threatening public health. The improvised system achieves low-density plastic waste compaction of 85–90%.
3 Proposed Methodology In this section, the proposed methodology for an automated smart plastic waste recycling management system is presented. The Waste Classification data set from Kaggle is used in this analysis. This dataset is separated into training data (85%) and test data (15%), and it comprises 22,564 training images and 2513 test images. In this approach, the recycling of plastic waste is analysed using a time series algorithm. Figure 1 shows the implementation of the time series algorithm, which proceeds as follows (a minimal sketch of the first steps is given after this list):
Step 1: The data set is taken and cleaned first in order to remove redundancies and missing values. Data cleaning is the process of locating and fixing errors, duplication, and extraneous data in a raw dataset.
Step 2: As part of the data preparation, data cleaning ensures proper, defensible data that delivers convincing visualizations in order to increase the recycling rate of the presented system. The generated waste data from the years 2005 to 2020 are then merged.
Step 3: The time series analysis algorithm calculates the overall waste generated and recycling generated, and the recycling rate is analysed using the time series trend.
Step 4: From this prediction, one can analyse the recycling rate based on the type of waste generated over certain years.
Step 5: From this analysis, waste recycling techniques can be recommended that can be used to curb the waste, and best waste management practice can be achieved.
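A minimal pandas sketch of Steps 1–3 is given below. The per-year file names and the material/generated_tons/recycled_tons column names are assumptions standing in for the actual dataset schema.

```python
import pandas as pd

frames = []
for year in range(2005, 2021):
    # Step 1: clean each year's raw file (remove missing values and duplicates)
    df = pd.read_csv(f"waste_{year}.csv").dropna().drop_duplicates()
    df["year"] = year
    frames.append(df)

# Step 2: merge the cleaned data for 2005-2020
waste = pd.concat(frames, ignore_index=True)

# Step 3: yearly totals and recycling rate per material (the time series trend)
trend = waste.groupby(["material", "year"])[["generated_tons", "recycled_tons"]].sum()
trend["recycling_rate"] = 100 * trend["recycled_tons"] / trend["generated_tons"]
print(trend.loc["plastic"])      # plastic waste generation, recycling and rate per year
```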
3.1 Time Series Analysis in Machine Learning A time series data set is a collection of measurements that occur at regular intervals of time, with time functioning as the independent variable and the goal (studying changes in features) as the dependent variable. A statistical method that works with time series data is called time series analysis, sometimes referred to as trend analysis.
Fig. 1 Process flow diagram of the proposed system
A time series is a set of consecutive data points that are mapped at different time intervals. Time series analysis comprises approaches for describing such a series, either to correctly understand the fundamental behaviour of its data points or to make forecasts and predictions.
4 Result Analysis In this section, the automated smart plastic waste recycling management system is implemented and the result analysis of the presented approach is evaluated. The approach performs the recycling analysis of plastic waste and, in addition, analyses the statistics of different types of waste. The statistics for the various types of waste generated, in tons, are shown in Table 1. Judging by the maximum values in the last row, the table shows that the waste generated in the year 2010 is more than the waste generated in the year 2019. Figure 2 shows the rate of paper waste generated and its recycling rate; the blue line indicates generated waste, the red line indicates recycled waste, and the green curve indicates the recycling rate.
Table 1 Statistics of various kinds of waste generated in Tons

Statistic | Total_mT_2010 | Total_mT_2019 | PerCapita_kg_2010 | PerCapita_kg_2019 | Var_Total_mt | Var_PerCapita
Count | 1.940000e+02 | 1.940000e+02 | 194.000000 | 194.000000 | 1.940000e+02 | 194.000000
Mean | 1.641443e+05 | 3.18211e+05 | 0.043361 | 7.092099 | 1.5427769e+05 | 7.048738
Std | 7.175039e+06 | 1.353095e+06 | 0.049850 | 10.201501 | 1.001011e+06 | 10.184037
Min | 0.000000e+00 | 0.000000e+00 | 0.000000 | 0.000000 | −2.39622e+06 | −0.245000
25% | 1.261250e+03 | 3.01000e+02 | 0.006000 | 0.254116 | −5.913250e+03 | 0.214284
50% | 1.435050e+04 | 6.040500e+03 | 0.028500 | 1.763880 | −9.55000e+01 | 1.719880
75% | 5.770600e+04 | 1.237645e+05 | 0.063500 | 11.486871 | 4.197900e+04 | 11.40189
Max | 8.819717e+06 | 1.299410e+07 | 0.299000 | 69.515864 | 1.239428e+07 | 69.346864
Fig. 2 Rate of paper waste generated and the rate of recycling
Figure 3a shows the rate of construction waste generated and its recycling rate. Figure 3b shows the rate of glass waste generated and its recycling rate, where the green curve indicates the recycling rate, the blue line indicates generated waste, and the red line indicates recycled waste from 2005 to 2020. Figure 4 shows plastic waste production and recycling over the same period, with the same colour coding. Table 2 displays the recycling rate/accuracy of the presented system and of the earlier recyclable waste sorting approach using computer vision and deep learning. The recycling rates of the presented automated smart plastic waste recycling management system and of the earlier recyclable waste classification system using CV and DL are compared in Fig. 5.
Fig. 3 a Rate of Construction Waste generated and the rate of recycling. b Rate of Glass Waste generated and the rate of recycling
The X-axis shows the methods used for waste classification and recycling; the Y-axis shows the recycling rate obtained by each method. The recycling rate of the presented automated smart plastic waste recycling management system is higher than that of earlier systems; hence, the presented system performs waste recycling very effectively.
Fig. 4 Rate of plastic waste generated and the rate of recycling

Table 2 Recycling rate comparison
Methodologies | Recycling rate/Accuracy (%)
Sorting recyclable waste using computer vision and deep learning | 87.9
An automated smart plastic waste recycling management system | 95

Fig. 5 Comparative graph for plastic waste recycling rate
5 Conclusion In this work, an automated smart plastic waste recycling management system is implemented. In this analysis, the statistics of various kinds of waste generated, in tons, are analysed using the time series algorithm, and the different types of waste generated and their recycling rates are discussed. Compared to earlier waste recycling systems, the presented automated smart plastic waste recycling management system achieves a higher recycling rate, i.e., 95%. Hence, this system is a better solution for calculating the recycling rates of different types of waste.
References 1. Mikołajczyk A, Kwasigroch A, Klawikowska Z, Plantykow MA, Ferlin M, Majek K, Majchrowska S (2022) Deep learning-based waste detection in nature & urban contexts. Trash Manag 138:274–284, Elsevier. https://doi.org/10.1016/j.wasman.2021.12.01 2. Chidananda K, Buelaevanzalina K, Siva kumar AP (2021) Deep learning algorithms for effectively classifying kitchen waste. Turk J Comput Math Educ 12(14):5751–5762 3. Baras N, Dasygenis M, Ziouzios D, Tsiktsiris D (2020) A machine learning-based architecture for distributed recycling. Future Internet 12:41. https://doi.org/10.3390/fi12090141 4. Tan B, Chu Y, Xiong X, Kamal S, Huang C, Xie X (2018) A multilayer deep-learning algorithm for waste sorting and recycling. Hindawi Intell Neurosci. Article ID 5060857. https://doi.org/ 10.1155/2018/5060857 5. Kulkarni SG, Mondal S (2022) A transparent block chain platform for plastic waste management. In: 2022 14th international conference on communications systems and NETworkS. https://doi.org/10.1109/COMSNET53615.2022.9668574 6. Suddul G, Nedoomaren N (2018) An energy efficient and low-cost smart recycling bin. Int J Comput Appl 180(29):18–22. https://doi.org/10.5120/ijca2018916698 7. Singhal S, Neha SS, Jama M (2021) Recognizing and automating the barriers of plastic waste management collection and segregation. Int Res J Eng Technol 08(04), e-ISSN: 2395-056 8. OF AT, Amin MR, Ibrahim MM, Abdel Wahab M, Abd, El Rahman EN (2020) Recycling waste plastic bags as a replacement for cement in production of building bricks and concrete Blocks. J Waste Resour Recycl 1(2):1–13 9. White G, Reid G (2018) Recycled waste plastic for extending and modifying asphalt binders. In: 8th symposium on pavement surface characteristics, SURF 2018), Brisbane, pp 2–4 10. Wang C-T, Mao W-L, Lin Y-H, Chen W-C (2021) Classification of recyclable waste using an optimal convolution neural network. Conserv Resour Recycl 105132. https://doi.org/10.1016/ j.resconrec.2020.105132 11. Misha AT, Faijunnahar WRB, Rahman A (2021) A plastic garbage detection image processing method. In: International conference on electronics and communications and information technology. https://doi.org/10.1109/ICECIT54077.2021.9641188 12. Rossi J, Bianchini A (2021) Planning, implementation and evaluation of a more sustainable strategy for dealing with plastic waste during sporting activities. J Cleaner Prod 125345. https:// doi.org/10.1016/j.jclepro.2020.125345 13. Foogooa R, Armoogum S, Ramsurrun N, Suddul G (2021) Classification of recyclable garbage using computer vision and deep learning. In: Zooming innovation in consumer technologies conference. https://doi.org/10.1109/ZINC52049.2021.9499291 14. Adeshina SA, Obadiah AN, Aina OE, Thomas S, Hussein S, Ibrahim UM (2019) Development and implementation of a plastic trash sorting system. In: 15th international conference on electronics, computer and computation. https://doi.org/10.1109/ICECCO48375.2019.9043197
15. Hirulkar NS, Faizan S, Marathe AB, Natkut AR (2017) Improvised hydraulic waste compacting system. In: 2017 international conference on intelligent computing and control technologies (ICICCT). https://doi.org/10.1109/ICICICT1.2017.8342600 16. Yang L, Zhao Y, Niu X, Song Z, Gao Q, Wu J (2021) Municipal solid waste forecasting in China based on machine learning models. Frontiers Energy Res 9. https://doi.org/10.3389/fenrg.2021.763977
Secure Identity Based Authentication for Emergency Communications J. Jenefa, E. A. Mary Anita, V. Divya, S. Rakoth Kandan, and D. Vinodha
Abstract The Vehicular Ad Hoc Network (VANET) offers secure data transmission between vehicles with the support of trusted authorities and RSUs. In emergency scenarios such as natural catastrophes, RSUs may be completely damaged and unable to provide the needed services, so vehicles must communicate safely without them. Hence, this study proposes a secure and reliable identity-based authentication technique for emergency scenarios. To provide secure vehicle-to-vehicle communication without RSUs, an ECC-based IBS is utilized. The scheme offers security features such as message integrity, privacy preservation, and authentication, and it is resistant to attacks targeting authentication and privacy. When its performance is compared with recent schemes, the proposed technique performs efficiently with lower communication and computation costs. Keywords Authentication · Elliptic curve cryptography · Emergency communication · Identity based signature and privacy preservation
J. Jenefa (B) · E. A. M. Anita · V. Divya · S. R. Kandan · D. Vinodha Christ (Deemed to be University), Bangalore, India e-mail: [email protected] E. A. M. Anita e-mail: [email protected] V. Divya e-mail: [email protected] S. R. Kandan e-mail: [email protected] D. Vinodha e-mail: [email protected]
1 Introduction A Vehicular Ad hoc NETwork (VANET) is a network established among vehicles, RSUs, and TAs (Trusted Authorities). It is mainly used to provide secure communication among vehicles. In addition, it offers requested services [1] to users through RSUs (Road Side Units), which act as service providers. The TA is the regional trusted authority that controls the vehicles and RSUs within its region. RSUs are fixed at the roadside and communicate with TAs through secure wired links. Vehicles register themselves with the TA when entering and leaving a particular region and acquire the available services from the RSUs. They also communicate with nearby vehicles during travel through On-Board Units [2]. During emergency situations like natural disasters, establishing communication between vehicles is a tedious process because of the damaged RSUs. In such cases, vehicles should communicate with other vehicles without RSUs. Communication established between vehicles during emergency situations is known as emergency communication, and the messages transmitted are known as emergency messages. Emergency messages have a greater impact than messages transmitted in the vehicular network during normal situations, because they carry information about damaged road conditions, critical traffic information, and so on. Hence, emergency communications play a vital role in the vehicular network. The TA controls RSUs through secure wired communications. A vehicle registers itself with the TA before entering a particular region. It then communicates with RSUs through V2R (Vehicle-to-RSU) and R2V (RSU-to-Vehicle) communications. If a vehicle detects a damaged road condition, it sends information about it to the nearby RSU, which broadcasts the information to all vehicles within its range and sends an acknowledgement message to the sender vehicle. Vehicles also communicate with one another through V2V (Vehicle-to-Vehicle) communication. During emergency situations, RSUs are completely damaged, so a vehicle that sends information about road or traffic conditions to an RSU does not receive an acknowledgement for the transmitted message. In such a case, the sender vehicle resends the message, and if it still receives no acknowledgement after a few more tries, it assumes that the RSU is damaged and establishes communication with other vehicles without the RSU. Since this communication is carried out without RSUs, it is prone to various attacks on authentication, privacy, and so on; attackers should therefore be identified and isolated from the network [3]. An efficient scheme with security features such as authentication, privacy preservation [4], and message integrity is required to establish secure emergency communication in a vehicular environment. An efficient identity-based authentication scheme for emergency communication is proposed in this paper. It uses Elliptic Curve Cryptography (ECC) for authentication, and expensive pairing operations are omitted to reduce the overheads in the vehicular network. The proposed scheme is explained in detail in Sect. 3.
2 Related Works Some of the recently proposed secure emergency communication schemes are explained in this section. Xuedan Jia proposed an efficient identity-based signature scheme (EPAS) [5]. It provides conditional privacy preservation and batch verification and relies on a fully trusted TA and a Disaster Relief Authority (DRA). AMBs are vehicles with stronger communication capability than ordinary vehicles; they acquire information from the vehicles and transmit it to the DRA, which in turn sends it to the TA. Since the scheme uses two trusted authorities, its overhead is high. In SVCU [6], Identity Based Signature (IBS) and Identity Based Offline/Online Signature (IBOOS) schemes are used to provide secure authentication between vehicles and RSUs, and emergency communication is established using the RSA algorithm. It provides authentication, privacy preservation, and non-repudiation, but no novel scheme is proposed for establishing secure emergency communication, which is its main drawback. Gawas et al. [7] proposed a cross-layer approach for the distribution of emergency messages in the vehicular network (CLAE). In this scheme, a vehicle that sends emergency messages selects a forwarder vehicle within one-hop distance for efficient transmission. It is mainly used to distribute safety messages to other vehicles in the network; other security features are not considered. Bhosale et al. [8, 9] proposed an architecture for emergency communication (VBCE). In this architecture, sirens are used to warn other vehicles during emergency situations, and vehicles also acquire route information from emergency vehicles, which helps them travel safely. Emergency communication depends on emergency vehicles, so vehicles cannot communicate among themselves. In ESAS [10], inter-vehicular communication is established without the help of RSUs, so it can also be used for emergency communications. Identity-based signatures are used for authentication, and the scheme provides further security features such as privacy preservation, non-repudiation, message integrity, and traceability. It has low computation and communication overheads because pairing operations are omitted. Chen et al. [11] proposed a key agreement protocol for the emergency reporting process (ESKAP). It uses a digital signature mechanism for authentication, and vehicles are validated by broadcast centers before their emergency messages are accepted. Since authentication depends on broadcast centers, which can be damaged during natural disasters, it is not suitable for emergency situations. Table 1 compares these schemes with respect to their security features. As shown, some schemes use emergency vehicles for establishing V2V communication, without which communication among vehicles is not possible. Hence, a secure authentication scheme for emergency communications without emergency vehicles is needed, with features like privacy preservation and message integrity [5, 12–19] and with low overhead. A new identity-based authentication scheme with all these features is proposed in this paper and is explained in the next section.
Table 1 Comparison of different authentication schemes based on its security features

Scheme | Communication pattern | Cryptographic basis | Authentication | Privacy preservation | Message integrity | Batch verification | Emergency vehicle
EPAS [7] | V2D and V2V | ID-based signature | Yes | Yes | No | Yes | Yes
SVCU [6] | V2V | ID-based signature | Yes | Yes | No | No | No
CLAE [3] | V2V | – | No | No | No | No | No
VBCE [8, 9] | V2A and A2V | – | No | No | No | No | Yes
ESAS [10] | V2R, R2V and V2V | ID-based signature | Yes | Yes | Yes | Yes | No
ESKAP [16] | V2R, R2V, V2V | Key agreement and digital signature | Yes | Yes | Yes | No | No
3 Proposed Scheme The proposed Elliptic Curve Cryptography (ECC) based IBS without pairings for emergency communications is explained in this section. It has five stages: System Initialization, Pseudo ID Generation, Signature Generation, Signature Verification, and Batch Verification. As shown in Fig. 1, a vehicle registers itself with the TA by sending its original ID and other private information, and the TA generates pseudo IDs for the vehicle from its original ID. The vehicle sends information regarding road and traffic conditions as a message to the RSU and waits for an acknowledgement. If it does not receive an acknowledgement after sending the message three times, the vehicle concludes that the RSU is damaged and broadcasts the message to all nearby vehicles, which forward it to their nearby vehicles, and so on. The transmitted message includes details about the damaged RSU in addition to the traffic and road condition details. Thereby, emergency messages are forwarded to all vehicles within the network and emergency communication is established without RSUs. The computation cost of a pairing operation is three times that of a point multiplication operation [20]; hence, expensive pairing operations [21] are avoided to reduce overheads.
3.1 System Setup Let Fn be the finite field over n, and let a and b be parameters of the elliptic curve E: y^2 = x^3 + ax + b (mod n), where 4a^3 + 27b^2 ≠ 0. P is the generator point, P ≠ R, where R denotes the point at infinity. The TA computes its public key Ppub = sP, where s is the TA's private key. It then chooses two one-way hash functions H: {0,1}* → Zq and H1: {0,1}* → Zq. The system public parameters are {P, Ppub, H, H1}. These parameters are preloaded into all legitimate vehicles and RSUs. Fig. 1 System architecture
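A small sketch of the system setup is given below. It uses the secp256k1 curve from the Python ecdsa package as a stand-in for the unspecified curve E, and models the hash functions H and H1 with SHA-256; both choices are assumptions made only for illustration.

```python
import hashlib
import secrets
from ecdsa import SECP256k1

curve = SECP256k1
P = curve.generator                 # generator point P
q = curve.order                     # group order used for Z_q

s = secrets.randbelow(q - 1) + 1    # TA's private key s
P_pub = P * s                       # TA's public key P_pub = sP

def H(*parts) -> int:
    """One-way hash {0,1}* -> Z_q modelled with SHA-256 (an assumption)."""
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

# The public parameters {P, P_pub, H, H1} would be preloaded into vehicles and RSUs.
```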
134
J. Jenefa et al.
3.2 Pseudo ID Generation During the registration process, a vehicle sends its original ID (IDV) to the TA through secure communication for pseudo ID generation. Pseudo IDs are used instead of original IDs in order to provide privacy preservation. The multiple pseudo IDs of each vehicle are computed as follows:
PIDVi = H(s · IDV, t, d)    (1)
where s is the TA's private key, IDV is the original ID of the vehicle for which the pseudo ID is generated, and t is the current timestamp. A pseudo ID is generated whenever a vehicle enters a new region under the control of that particular TA. In addition to the pseudo ID, the TA also assigns a private key d to each vehicle, ranging from 1 to n.
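A minimal sketch of Eq. (1) follows; SHA-256 stands in for the one-way hash H, and the product s · IDV is modelled by hashing s together with the ID, both of which are modelling assumptions.

```python
import hashlib
import time

def hash_to_int(*parts) -> int:
    """SHA-256 stand-in for the one-way hash H (a modelling assumption)."""
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def generate_pseudo_id(s: int, real_id: str, d: int) -> str:
    """PIDVi = H(s . IDV, t, d) with t the current timestamp (Eq. 1)."""
    t = int(time.time())
    return format(hash_to_int(s, real_id, t, d), "x")

# Example: the TA derives a pseudo ID for a registering vehicle (illustrative values)
print(generate_pseudo_id(s=123456789, real_id="VEH-42", d=24601))
```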
3.3 Signature Generation Vehicles periodically send messages regarding road conditions and traffic to the nearby RSU; if a message is valid, it is broadcast to all vehicles within that range, and the RSU sends an acknowledgement after receiving it. If a vehicle does not receive an acknowledgement from the RSU, it resends the same message up to three times; if it still receives no acknowledgement, the vehicle concludes that the RSU is damaged. In that case, it forwards the message to all vehicles within its range. The transmitted message carries the traffic information as well as the details of the damaged RSU. In order to establish secure communication with nearby vehicles without RSUs, the vehicle authenticates itself to them using its signature, generated as follows:
SIGV = H1((d · PIDV)P, M, t, Ppub, C)    (2)
where PIDV is the pseudo ID of the vehicle, d is the private key of the vehicle, P is the generator point, M is the emergency message, t is the current timestamp, Ppub is the TA's public key, and C is the counter value. The vehicle then broadcasts the emergency message to all vehicles in the form < PIDV, Z, SIGV, t, Q, C >, where Z = M + P and Q is the public key of the vehicle (Q = dP). Confidentiality is achieved to a limited extent in this scheme: vehicles can decrypt the message only if they know the value of P, which is preloaded only into legitimate vehicles. Hence, only legitimate vehicles can view the message broadcast by neighbouring vehicles. After signing each emergency message, the vehicle increments its counter value by 1 (C = C + 1). The counter value is derived from the pseudo ID of the sender vehicle; this is to avoid accepting the same message from different vehicles. If a vehicle receives messages with the same counter value, it accepts the first verified message and discards the others in order to reduce overheads.
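The following sketch implements Eq. (2) on the secp256k1 curve from the ecdsa package, with SHA-256 standing in for H1; the curve, the hash, and the point-to-bytes encoding are illustrative assumptions.

```python
import hashlib
import time
from ecdsa import SECP256k1

curve = SECP256k1
P, q = curve.generator, curve.order

def h1(*parts) -> int:
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def point_bytes(point) -> bytes:
    return f"{point.x()}:{point.y()}".encode()       # simple point encoding (assumption)

def sign(d: int, pid_hex: str, message: str, P_pub, counter: int):
    """SIGV = H1((d * PIDV)P, M, t, P_pub, C) as in Eq. (2)."""
    t = int(time.time())
    pid = int(pid_hex, 16) % q
    point = P * ((d * pid) % q)                      # (d * PIDV)P
    return h1(point_bytes(point), message, t, point_bytes(P_pub), counter), t

# Illustrative usage: a vehicle with private key d signs an emergency message
d, s = 24601, 31337
Q, P_pub = P * d, P * s                              # vehicle and TA public keys
sig, t = sign(d, "ab12cd", "RSU down, road blocked at junction 7", P_pub, counter=1)
```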
3.4 Signature Verification On receiving a message from a nearby vehicle, the receiver first checks the counter value against its memory. If the counter value is already present, the message is discarded; otherwise, it verifies the current timestamp and proceeds with the verification process. It recovers the emergency message as M = Z − P and then computes the value of X as follows:
X = H1(PIDV · Q, M, t, Ppub, C)    (3)
If the value of X is the same as that of SIGV, the vehicle accepts the message, considering the sender to be a legitimate vehicle. It then forwards the message to an RSU and waits for an acknowledgement. If no acknowledgement is received after a few tries, it forwards the message to nearby vehicles in the form < PIDV, Z, SIGV, t, Q, C >, where PIDV, SIGV, and Q are the pseudo ID, signature, and public key of the vehicle. The message M includes the details of the damaged RSU in addition to the original traffic information, and the value of C is not changed. This process is repeated until an acknowledgement for the transmitted message is received from an RSU.
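Verification mirrors the signing sketch: the receiver recomputes X from public values using Eq. (3) and compares it with SIGV. The helpers below repeat the same SHA-256 and point-encoding assumptions so the block stands on its own.

```python
import hashlib
from ecdsa import SECP256k1

curve = SECP256k1
P, q = curve.generator, curve.order

def h1(*parts) -> int:
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def point_bytes(point) -> bytes:
    return f"{point.x()}:{point.y()}".encode()

def verify(sig: int, pid_hex: str, Q, message: str, t: int, P_pub, counter: int) -> bool:
    """Accept the message iff X = H1(PIDV * Q, M, t, P_pub, C) equals SIGV (Eq. 3)."""
    pid = int(pid_hex, 16) % q
    point = Q * pid                                  # PIDV * Q = (d * PIDV)P, since Q = dP
    x = h1(point_bytes(point), message, t, point_bytes(P_pub), counter)
    return x == sig
```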
3.5 Batch Verification If a vehicle receives emergency messages from more than one vehicle at the same time, it verifies them all together through the batch verification process. It first checks the counter values of the messages against its memory; any message whose counter value is already present is discarded without verification. It then checks the timestamp values of all the messages and computes the values of M and X for all vehicles (Mi = Zi − P, Xi = H1(PIDi · Qi, Mi, t, Ppub, Ci)). The batch verification of all the vehicles is then carried out as follows:
Σ(i=1..n) SIGi = Σ(i=1..n) Xi    (4)
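A tiny sketch of the batch check in Eq. (4): after the per-message values are recomputed as in the verification sketch above, the receiver only compares the two sums.

```python
def batch_verify(sigs, xs) -> bool:
    """Eq. (4): accept the whole batch only if sum(SIGi) equals sum(Xi)."""
    return len(sigs) == len(xs) and sum(sigs) == sum(xs)

# Illustrative values: a valid batch and one with a tampered message
print(batch_verify([5, 7, 9], [5, 7, 9]))   # True
print(batch_verify([5, 7, 9], [5, 7, 8]))   # False
```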
4 Security Analysis Vehicles authenticate themselves to other vehicles using their signatures. Signatures are generated using the vehicle's private key d, and extracting d from the public key Q is not feasible, since this amounts to solving the ECDLP [21]. Hence, the proposed scheme provides authentication and prevents authentication-based attacks. A vehicle's private information is preserved by using pseudo IDs instead of original IDs. Only TAs compute pseudo IDs, using the one-way hash function, and recovering the original ID of a vehicle from the hash value of a pseudo ID is impossible based on the properties of the hash function. Hence, the proposed scheme provides privacy preservation. The emergency messages received by a vehicle are the same as those transmitted: signatures are generated with the message as one of the parameters, so even if an attacker alters the message, the vehicles discard it during the verification process. In this way, the proposed scheme provides message integrity.
5 Performance Evaluation

The performance of the proposed scheme is compared with recent schemes. The vehicular environment is simulated with 200 vehicles and one TA in a 2500 × 2500 m operating space. Vehicles move from one place to another and send information to nearby vehicles by transferring packets. The simulation parameters used are tabulated in Table 2.

Table 2 Simulation parameters
Parameters | Values
Coverage area | 2500 × 2500 m
No. of traffic lanes | 4
No. of vehicles | 200
Simulation duration | 100 s
MAC layer protocol | 802.11p
Channel bandwidth | 6 Mbps
Transmission range of vehicles | 300 m
Minimum inter-vehicle distance | 40 m
Routing protocol | AODV
5.1 Computation Overhead

The execution times of cryptographic operations on an Intel Pentium 4 3.0 GHz machine with 1 GB RAM are as follows: scalar multiplication (TM), scalar multiplication point operation (TMP) and pairing operation (TP) take 0.03 ms, 1.50 ms and 8.12 ms, respectively. The comparison between the proposed scheme and the related schemes is tabulated in Table 3. As stated in the table, the computation cost of signature generation is TMP (i.e. 1.50 ms), which is 1.5 ms less than EPAS schemes 1 and 2 [5] and the same as that of the ESAS scheme [10]. Even though SVCU [6] computes a signature in 0.08 ms, which is 1.42 ms less than the proposed scheme, it does not use any novel scheme for authentication and it cannot verify 'n' messages simultaneously. The computation time for batch verification in the proposed scheme is nTMP, which is (n + 1)TMP and TMP less than schemes 1 and 2 of EPAS [5], respectively. The ESAS scheme [10] has the same computation overhead as the proposed scheme, whereas the proposed scheme has less communication delay, as explained in Sect. 5.2. Hence, the proposed scheme has low computation overhead since expensive operations are not used. Figure 2 depicts the comparison of computation overhead for signature generation and verification. As shown in Fig. 2, the SVCU scheme has less computation overhead for signature generation.

Table 3 Comparison of computation cost
Scheme | Signature generation | Signature verification | Batch signature verification
EPAS: Scheme 1 [5] | 2TMP (3 ms) | 3TMP (4.5 ms) | (2n + 1)TMP
EPAS: Scheme 2 [5] | 2TMP (3 ms) | 2TMP (3 ms) | (n + 1)TMP
SVCU [6] | 0.08 ms | 1.46 ms | –
ESAS [10] | TMP (1.5 ms) | TMP (1.5 ms) | nTMP
Proposed scheme | TMP (1.5 ms) | TMP (1.5 ms) | nTMP
Fig. 2 Computation overhead for signature generation and verification
138
J. Jenefa et al.
The proposed scheme has 57.14% less computation overhead when compared with EPAS schemes 1 and 2. Even though the proposed scheme has higher computation overhead than the SVCU scheme, it has less communication overhead and is therefore acceptable. On the other hand, in SVCU, communication is established by using the existing RSU algorithm. As shown in Fig. 2b, the proposed scheme also has less computation overhead for signature verification than EPAS Schemes 1 and 2, namely 50.1% and 0.568% less, respectively. The SVCU scheme has 2.6% less verification overhead than the proposed scheme, but since the proposed scheme has less communication overhead this is acceptable. The proposed scheme has the same computation overhead as the ESAS scheme, since the computation times for signature generation and verification are the same in both schemes.
5.2 Communication Overhead

Delay is the time difference between the time at which a packet is received and the time at which it is sent. It is computed as follows:

Delay = Time at which packet is received − Time at which packet is sent    (5)
Average message delay is used to determine the time taken to transmit packets from source to destination. The delay in the proposed scheme is compared with the recent schemes and is depicted in Fig. 3a. As shown in Fig. 3a, the average message delay in the proposed scheme is 61.5%, 35.9%, 26.25% and 14.5% less than that of the compared EPAS schemes. As a result, the proposed approach incurs less communication overhead. In emergency communication, vehicles send <PIDV, Z, SIGV, t, Q, C> to nearby vehicles, and the message size is 108 bytes (40 × 2 + 20 × 1 + 4 × 2); the size of the message payload itself is the same for all schemes and is therefore ignored. The comparison of communication overhead based on message size is depicted in Fig. 3b, against EPAS Schemes 1 and 2 (82 bytes), SVCU (141 bytes) and the ESAS scheme (124 bytes). As per Fig. 3b, the proposed approach has 12.9% and 23.4% less communication overhead than the ESAS and SVCU schemes, respectively. Even though it has 24% higher communication overhead in terms of message size than EPAS Schemes 1 and 2, it is acceptable since it has less computation overhead and lower average message delay.
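As a small illustration of Eq. (5) and of the per-message byte count quoted above, the following lines compute per-packet delay and the average message delay for a handful of hypothetical send/receive timestamps; the timing values are made up for the example.

```python
# Hypothetical (send_time, receive_time) pairs in seconds for delivered packets.
packet_times = [(0.000, 0.012), (0.100, 0.109), (0.200, 0.215), (0.300, 0.308)]

delays = [rx - tx for tx, rx in packet_times]         # Eq. (5), per packet
average_message_delay = sum(delays) / len(delays)     # mean end-to-end delay
print(round(average_message_delay, 4))                # 0.011 s for this toy data

# Byte count of <PID_V, Z, SIG_V, t, Q, C> as quoted above: 40 x 2 + 20 x 1 + 4 x 2.
message_size = 40 * 2 + 20 * 1 + 4 * 2
print(message_size)                                   # 108 bytes
```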
Fig. 3 a Emergency communication—Average message delay b Communication overhead
6 Conclusion

Communication during emergency situations plays a vital role, since the emergency messages transmitted are of great importance. Such communications between vehicles need to be established in a secure way to avoid attacks. The proposed identity-based signature scheme provides secure emergency communications without pairings, and emergency communications are established without relying on emergency vehicles. It also provides other security features such as authentication, privacy preservation and message integrity. It can validate more than one message at the same time using the batch verification process, and it has low communication and computation overhead.
References

1. Misener JA (2005) Vehicle-infrastructure integration (VII) and safety: rubber and radio meets the road in California. Intellimotion 11(2):1–3
2. Azimi R, Bhatia G, Rajkumar R, Mudalige P (2011) Vehicular networks for collision avoidance at intersections. Proc SAE World Congr Detroit
3. Fuentes JMD, Gonzalaz-Tablas AI, Ribagorda A (2010) Overview of security issues in vehicular Ad-Hoc networks. In: Handbook of research on mobility and computing. IGI Global, pp 894–911
4. Lee U, Zhou B, Gerla M, Magistretti E, Bellavista P, Corradi A (2006) Mobeyes: smart mobs for urban monitoring with a vehicular sensor network. IEEE Wireless Commun 13(5):52–57
5. Jia X, Yuan X, Meng L, Wang L (2013) EPAS: efficient privacy-preserving authentication scheme for VANETs-based emergency communication. IEEE J Softw 8(8):1914–1922
6. Jenefa J, Mary Anita EA (2017) Secure vehicular communication using ID based signature scheme. Wirel Pers Commun
7. Gawas AM, Hurkat P, Goyal V, Gudino LJ (2017) Cross layer approach for efficient dissemination of emergency messages in VANETs. In: Ninth international conference on ubiquitous and future networks (ICUFN), pp 206–211
8. Bhosale S, Dhawas NA, Burkul A (2013) VANET based communication for emergency vehicles. Int J Adv Res Comput Sci Electron Eng 2(7):567–571
9. Bhosale S, Kale A, Patil S (2015) Emergency vehicles communication by VANET. Int J Emerg Trends Technol 2(2):403–407
10. Jenefa J, Mary Anita EA (2019) An enhanced secure authentication scheme for vehicular Ad Hoc networks without pairings. Wirel Pers Commun 106:535–554
11. Chen C-L, Chen Y-X, Lee C-F, Deng Y-Y, Chen C-H (2019) An efficient and secure key agreement protocol for sharing emergency events in VANET systems. IEEE Access 148472–148484
12. Jenefa J, Mary Anita EA (2022) Secure authentication schemes for vehicular Adhoc networks: a survey. Wireless Pers Commun 123:31–68
13. Jenefa J, Anita EAM (2021) Identity-based message authentication scheme using proxy vehicles for vehicular ad hoc networks. Wireless Netw 27:3093–3108
14. Lakshmi S, Mary Anita EA, Jenefa J (2019) Detection and prevention of black hole attacks in vehicular Ad Hoc networks. Int J Innovative Technol Exploring Eng (IJITEE) 8(7), May 2019
15. Mary Anita EA, Lakshmi S, Jenefa J (2021) A self-cooperative trust scheme against black hole attacks in vehicular ad hoc networks. 21(1):59–65
16. Mary Anita EA, Jenefa J (2016) A survey on authentication schemes of VANETs. In: 2016 international conference on information communication and embedded systems, ICICES 2016, IEEE
17. Raya M et al (2007) Securing vehicular Ad hoc networks. J Comput Secur 15(1):39–68
18. Shamir A (1984) Identity-based cryptosystems and signature schemes, in advances in cryptology. Springer-Verlag, New York, pp 47–53
19. Yeh LY, Lin YC (2014) A proxy-based authentication and billing scheme with incentive-aware multihop forwarding for vehicular networks. IEEE Trans Intell Transp Syst 15(4):1607–1621
20. Barreto PSLM, Kim HY, Lynn B, Scott M (2002) Efficient algorithms for pairing-based cryptosystems. In: Proceeding crypto, pp 354–368
21. Lo NW, Tsai JL (2016) An efficient conditional privacy-preserving authentication scheme for vehicular sensor networks without pairings. IEEE Trans Intell Transp Syst 17(5):1319–1328
Stock Price Prediction Using LSTM, CNN and ANN Krishna Prabeesh Kakarla, Dammalapati Chetan Sai Kiran, and M. Kanchana
Abstract Forecasting the stock market is difficult because the stock price time series is highly intricate. We applied the long short-term memory (LSTM) algorithm, a convolutional neural network (CNN) and an artificial neural network (ANN) because recent work provides preliminary evidence that machine learning methods can locate non-linear relationships in stock price sequences. The non-stationary and highly volatile nature of the stock market makes it difficult to foresee the trajectory of economic time series data. We expanded the scope of our model training beyond the Indian stock market, specifically to the stock prices of MAANG and Divis Labs; all three models performed well, with the LSTM achieving an RMSE of 0.42 and an R2 score of 0.55, the best result among all. Keywords Stock market · Forecasting · Complex · LSTM · Machine learning · Volatility · MAANG · DIVIS labs
1 Introduction

Securities representing ownership stakes in a company are called stock (sometimes "shares" or "equity"). The investor is thus entitled to a share of the company's assets and profits. The volume and volatility of stock price data are two of the market's most frustrating features, and trading on the stock market is a very risky endeavour in which one could make or lose a fortune. The datasets collected are used to determine the current stock values for this project. The collected data is broken down into smaller pieces, or datasets, before being used to both train and test the algorithms. When modelling data, we employ regression models written in either Python or R.
K. P. Kakarla · D. C. S. Kiran (B) · M. Kanchana SRM Institute of Science and Technology, Chennai, Tamil Nadu 603210, India e-mail: [email protected] K. P. Kakarla e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_12
The novelty in stock price prediction using LSTM, CNN and ANN lies in their ability to capture and analyse complex patterns and dependencies within historical stock data. These advanced machine learning techniques offer several innovative aspects in the field of financial forecasting, including sequential modelling with LSTM, spatial feature extraction with CNN, learning complex relationships with ANN and integration of multiple models.
2 Literature Survey

In Hsu et al. [1], the authors survey the territory covered by two major classes of small-data techniques: unsupervised and semi-supervised approaches. They also contrast various recently proposed training criteria and principles, including transformation equivariance, invariance, etc., for these models, and conduct a literature review of unsupervised and semi-supervised domain adaptation methods to show the most recent developments in bridging the two. They further examine where this line of inquiry could go in future to illuminate the ties between completely unsupervised and completely supervised study. Error proportions are based on CIFAR-10 classifiers for unsupervised representations: upstream, middle and downstream learn from classes with various counts of labelled examples. Over the course of 20 days, the accuracy of supervised conv is 66.34 and that of supervised nonlinear is 65.03.

In Parekh et al. [2], because it can efficiently compensate for traditional risk calculation's flaws, the authors suggest employing funds standardisation to provide a more precise assessment of portfolio risk. All possible interactions between stocks in a portfolio are taken into account by funds standardisation, and it makes risk assessment much easier to perform, since portfolios are compared using the same standards. When combined, GA and the Sharpe ratio (obtained via funds standardisation) enable rapid identification of portfolios with low risk and steady returns. In a daily return rate example using the conventional approach, for day 1 a stock price of 100 was recorded with no return rate, for day 2 a stock price of 50 was recorded with a 50% return rate, and for day 3 a stock price of 85 was recorded with a 70% return rate.

In Kabbani and Duman [3], following a 12-year period of training on data from the US stock market, the model is tested on data from 31 other nations' stock markets. To forecast global stock markets, the authors employ a convolutional neural network-based function approximator built within a deep Q-network that accepts images of stock charts as input. Portfolios built using the model's output typically generate a return of 0.1% to 1.0% per trade in the stock markets of 31 nations before transaction expenses are taken into account. Indicative of equivalent stock price fluctuations on investment markets around the world, the observations suggest that certain patterns in stock chart images are consistent with this hypothesis.
In Li et al. [4], the authors provide a recurrent neural network with deep layers for representing and trading financial signals in real time, in an attempt to overcome this difficulty. The strategy was inspired by the principles of reinforcement learning (RL) and deep learning (DL), both of which find analogues in biology. To better accommodate cutting-edge DL applications, this research modernises a classic DRL framework for use in the realm of financial signal processing and online trading. There are two main benefits to the system: the first is a result of DL's automatic feature learning mechanism, and the second is that the uncertainty of the original time series is reduced by incorporating fuzzy learning into the DL model, after taking into account the specifics of the financial signal. For the first eight days, the returns were 10%; for the next sixteen days, they were 14%; and for the following thirty-two days, they were 16%.

In Vargas et al. [5], short-term trends, which are particularly desirable for neural network analysis, are compared across three networks in order to provide accurate forecasts about future stock price movements. Such items have substantial profit potential in settings like options trading, but only at the expense of a very high level of risk. As a result, special attention is given to reducing false alarms, which enhances the risk/reward ratio by avoiding financial losses. Even for stocks with little predictability, PNN's extraordinary ease of implementation and low false alarm rate are advantages. PNN's prediction of 3 for Apple's stock yielded a false alarm of 1, which resulted in a 32% increase in price. With a false alarm of 16, IBM stock was predicted to rise 34% by PNN.

In Yeng et al. [6], the article investigates two effective ways of training these networks: the multi-stream extended Kalman filter and the conjugate gradient method, with the goal of reducing false alarms, which are correlated with real-world investment losses. The aforementioned methods have shown encouraging outcomes: for example, the Apple computers had 94 possibilities with 36 RNN and a 2.4% false alarm rate, whereas the IBM computers received 117 opportunities with 78 RNN and an 8.1% false alarm rate, and the Motorola computers received 94 opportunities with 108 RNN and a 13.8% false alarm rate.

In Jenho et al. [7], the riskiness of pricing predictions, such as future instability and the number of market economy elements that contribute to the stability of the market, makes investing in such projects risky. Fund standardisation may have reduced calculational complexity and associated every interaction in a portfolio, but it still cannot evolve into the best plan of action since it does not analyse the present value of the assets or their relative values. Works on maximising returns while minimising risk have served as inspiration in seeking out products with expansion potential in an effort to craft a balanced investment portfolio. Many people have benefited from the recent bull market gains in both the US and Korean stock markets. Annual return on investment percentages were 78% in 2017, 82% in 2018 and 86% in 2019.

In Li et al. [8], intraday candlestick patterns were used in an eight-trigram feature engineering technique, and a whole new machine learning ensemble system was built. Data collected from the Chinese stock market between the years 2000 and
2017 demonstrates that the predictive power of eight-trigram feature engineering is high; for some trend patterns, it can reach over 60%. For such prediction tasks, machine learning methods, especially those based on reinforcement learning, are preferable.

In Len et al. [9], the authors used a deep Q-network to approximate a function using a convolutional neural network and used data from stock charts as input to forecast the global stock market. The team evaluated their algorithm using data from 31 countries' stock markets over 12 years and exclusively used US stock market data for training. Contrary to the findings of a model trained only on the US market, it is clear from observational data that models must be developed and tested in several markets.

In Wen et al. [10], the authors introduced DeepClue, a system designed to facilitate communication between text-based deep learning models and end users by providing a graphical representation of the most important insights gained from such models when predicting stock prices, creating a framework for deep learning to analyse data. By employing an algorithm to parse out pertinent predictive features, they demonstrate how end users may make sense of the prediction model, using an interactive display to delve into hierarchies over the extracted elements. However, they put too much emphasis on the user interface and not enough on thoroughly testing their concept against competing models.

In Wu and Chung [11], the research proposes stock price forecasting utilising a combination of the Pearson correlation coefficient (PCC) and the broad learning system (BLS). The input features were narrowed down from a pool of 35 using PCC, and quick feature extraction from the input data was utilised to train a BLS. To test the efficacy of the suggested strategy, it was applied to four equities listed on exchanges such as Shanghai and Shenzhen. The work is weak in terms of how the proposed model may actually be used in the real world and does not implement a proper plan for buying and selling stocks.

In Wu et al. [12], the authors detailed a method that uses a recurrent neural network and present a decision-making technique whereby prediction of the spread between the open and close prices is improved by using an estimate of the ZCR and cross-validation data. The authors focus on the importance of normalised data in pre-processing, using features of pre-existing datasets, such as the zero-crossing rate (ZCR), which stands for the percentage of sign changes within a certain time frame, and a first-order difference approach. News and financial reports may also benefit from applying NLP to uncover additional insights.

In Chen and Chen [13], the authors introduced a novel approach to cleaning up noisy financial time series by sequence reconstruction using motifs, capturing the spatial structure of time data using a convolutional neural network. With deep learning, stock market movements can be anticipated with an accuracy that is between 4 and 7 percentage points higher than that achieved by using conventional signal processing methods and a modelling approach based on the study of high-frequency trading patterns. They present a new approach to financial sequence reconstruction; existing methods typically include either the frequency decomposition of time series data or the reconstruction of phase space using techniques like wavelet decomposition or
empirical mode decomposition. They used state-of-the-art algorithms and many other, more efficient approaches, but did not put them through any sort of rigorous testing.

In Chin and Chin [14], there are primarily two approaches to trading stocks nowadays [2–4]: to develop a buy or sell signal, one can either (1) make a forecast about the stock price before acting or (2) apply a set of trading rules derived from technical analysis. The first approach seeks to foretell the stock price at a given moment in time, whereas the second seeks out oscillations in the stock price, either upward or downward. Related studies of the first approach include those that employ a hybrid approach [2] based on fuzzy [3] and PSO [5] prediction and those that employ root mean square error as a metric against which the effectiveness of a given solution can be judged. The second approach is supported by a number of studies, such as the fuzzy rule-based system developed by IC-Yeh [9].

In Kang and Fans [15], the system may trade based on characteristics such as membership function shape parameters, rule trigger thresholds and weights of situations that may be adjusted to fine-tune the performance of technical indicators like moving average indicators and moving volume indicators. It is a method for selecting choices in the business world. Different types of evolutionary computation are used in various studies, including GA [5–7], GP [8–11], PSO [14–17], ACO [18] and other algorithms [12, 13]. Wang makes his trading rule selections using GA [5]. In Idrees et al. [16], to rephrase, GA is applied to determine the best trading rules; as per its compositional guidelines, that study opted for technical indicators such as BIAS, RSV and WMS, and sliding-window metaphors are employed there as well. In the same way that GA employs a hierarchy to organise its data, so does GP. Genetic programming (GP) is widely used in this sector of finance; it generates money-making rules using the tree structure of the data.

In Chen and Liu [17], for instance, functionalities such as technical indicators, price, or volumes and terminals such as price or volumes will be created and compared by GP [10]. Following the aforementioned procedure results in a trading rule, the outcome of which can be used to guide stock purchases, sales and holdings. To fine-tune the MA trading rule's parameters, the PSO [14] approach is used; in other words, optimal averaging durations are determined using particle swarm optimisation to refine trading rules and increase returns. The ACO [19] system employs a wide variety of technical indicators, including the 20-day moving average, KD line, stock price and trading volume. In this system, trading volume serves as the phenomenon, and the information provided by the KD stochastic line provides the light necessary to observe it.

In Liu et al. [20], Bai et al. [21], Li et al. [18] and LeCun et al. [22], the findings of the aforementioned research are promising. However, there are still certain limitations to most research: there are still years with negative returns, and there are years where models do well in training but poorly in testing due to the problem of over-fitting. When compared to the industry standard of "buy and hold", these methods generally fall short. To improve performance while avoiding the over-fitting problem, a rule-based system is presented that uses a sliding window inspired by quantum mechanics. QTS is grounded in both the classical Tabu search and the
characteristics of quantum computing [23, 24]. The QTS algorithm was chosen because it outperforms other heuristic algorithms without forcing the problem to converge too rapidly while solving optimisation problems, and it effectively finds the best or nearly the best set of rules by using a sliding window [25] to avoid the over-fitting problem.
3 Proposed System

3.1 Architecture Diagram

The proposed system architecture diagram is shown in Fig. 1. The initial dataset has been obtained from the Yahoo Finance website, as it provides dynamic data about the stocks, which is very useful for us. Some data pre-processing has been performed using news articles and sentiment analysis before arriving at the final dataset. The final dataset is arrived at by performing feature extraction, feature engineering and other pre-processing techniques. The final dataset is then divided into train and test datasets. The training data is passed through state-of-the-art models, while the test data is passed through evaluation methodologies. The best method is chosen based on the results from model testing and is then used for real-time stock price prediction.
3.2 Gathering the Data

In this module, we gather all the data required for training as well as testing the proposed model. We gathered the stock data of the major companies (MAANG) and a smaller pharma company, Divis. All of the gathered data is imported into a Jupyter Notebook or Google Colab. Importing was done using the pandas method read_csv. We checked the information regarding each dataset with the .info() method and used the .describe() method to obtain summary statistics of the datasets.
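A minimal sketch of this step is shown below; the ticker symbols and CSV file names are illustrative placeholders for whatever files are exported from Yahoo Finance.

```python
import pandas as pd

# Hypothetical file names for the CSVs downloaded from Yahoo Finance.
tickers = ["AMZN", "AAPL", "META", "NFLX", "GOOG", "DIVISLAB.NS"]
datasets = {t: pd.read_csv(f"{t}.csv", parse_dates=["Date"]) for t in tickers}

for name, df in datasets.items():
    print(name)
    df.info()               # column names, dtypes and non-null counts
    print(df.describe())    # count, mean, std, min, quartiles and max per numeric column
```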
3.3 Data Cleaning

In this module, we implemented various methods to make sure that outliers and noise do not interfere with the models' performance. We started data cleaning by checking for missing values using .isnull().sum() * 100 divided by the length of each dataset to get the percentage of missing values; one of the datasets had missing values, so we used dataset.fillna() from pandas to fill them with the mean. We then checked for
Fig. 1 Proposed system architecture diagram
redundant data by comparison and removed columns containing redundant data using drop() from pandas, checked for constant columns using .unique() on each column of each dataset, and checked for duplicate rows using .duplicated(). Finally, we checked for missing values again for confirmation.
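The cleaning steps listed above can be expressed roughly as follows; the file name and the redundant-column example are assumptions for illustration.

```python
import pandas as pd

df = pd.read_csv("AAPL.csv")                      # hypothetical dataset from the previous step

# Percentage of missing values per column.
print(df.isnull().sum() * 100 / len(df))

# Fill gaps in the numeric columns with the column mean.
num_cols = df.select_dtypes("number").columns
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())

# Drop a redundant column (here assumed to duplicate 'Close'), constant columns
# and duplicate rows.
if "Adj Close" in df.columns and df["Adj Close"].equals(df["Close"]):
    df = df.drop(columns=["Adj Close"])
df = df.drop(columns=[c for c in df.columns if df[c].nunique() == 1])
df = df.drop_duplicates()

# Final confirmation that no missing values remain in the numeric columns.
print(df.select_dtypes("number").isnull().sum().sum())   # expected: 0
```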
3.4 Feature Engineering

Selecting, modifying and converting raw data into features usable in supervised learning is known as feature engineering. It may be required to redesign and retrain features for machine learning to perform effectively on novel tasks. In this module, we transform all the given variables into a machine-understandable format. We started feature engineering with the intended target column; because LSTM is scale-dependent, we used the MinMax scaler. A dataset matrix was then created from the list of values using a specialised function we implemented, and the records were adjusted to use a time step of 100. Finally, we transformed the input to match the format needed by the LSTM: [samples, time steps, features].
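A sketch of the scaling and windowing described above is given below, assuming the closing price is the target column and an 80/20 train/test split; the file name and the split ratio are assumptions, not stated in the text.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

def create_dataset(series, time_step=100):
    """Turn a scaled 1-D series into (samples, time_step) windows and next-step targets."""
    X, y = [], []
    for i in range(len(series) - time_step - 1):
        X.append(series[i:i + time_step, 0])
        y.append(series[i + time_step, 0])
    return np.array(X), np.array(y)

df = pd.read_csv("AAPL.csv")                          # hypothetical cleaned dataset
close = df[["Close"]].values

scaler = MinMaxScaler(feature_range=(0, 1))           # LSTM is scale-dependent
close_scaled = scaler.fit_transform(close)

split = int(len(close_scaled) * 0.8)                  # assumed 80/20 split
X_train, y_train = create_dataset(close_scaled[:split], time_step=100)
X_test, y_test = create_dataset(close_scaled[split:], time_step=100)

# Reshape to the [samples, time steps, features] layout expected by the LSTM.
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
```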
3.5 Algorithms Used

3.5.1 LSTM

Long short-term memory (LSTM) is a kind of artificial neural network used in deep learning and AI. In contrast to the more typical feed-forward neural networks, an LSTM is a network that includes feedback connections. In addition to single data points (such as images), these RNNs can process whole data streams (such as speech or video). Applications such as AR/VR/MR integration are just some of the many fields that may benefit from LSTM, and LSTM has surpassed all other neural networks in terms of citations in the last century.
3.5.2 CNN

A convolutional neural network (ConvNet) is a type of deep learning network that can take an input picture, assign importance (learnable weights and biases) to various features/objects in the picture, and then distinguish between them. Compared with other classification methods, a ConvNet needs far less time spent on pre-processing. ConvNets may be trained to acquire their filters/characteristics on their own, while in primitive approaches these must be hand-engineered. The ConvNet design takes cues from the visual cortex's structure, which is why it is so similar to the human brain's pattern of neuronal connection. Each neuron's "receptive field" corresponds to the region of the visual field to which it is most sensitive, and the whole visual field is covered by a series of such overlapping fields.
3.5.3 ANN
The neural network in the human brain is the inspiration for the ANN. An ANN functions through hidden states, which resemble neurons in their hidden nature; each of these hidden states is a transient form with probabilistic properties, and a grid of them connects the input to the output. We used three state-of-the-art machine learning algorithms in our work: LSTM, CNN and ANN. We implemented our models using the Keras module in TensorFlow and trained them on three different companies: Amazon, Google and Netflix, respectively.
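As an illustration of how such a model can be defined with Keras, a stacked LSTM regressor is sketched below; the layer sizes, epoch count and the random stand-in data are assumptions so that the snippet runs on its own, and the CNN and ANN variants are built analogously by swapping the recurrent layers for Conv1D or Dense layers.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lstm(time_step=100):
    """Stacked LSTM regressor for next-step closing price (illustrative topology)."""
    model = models.Sequential([
        layers.Input(shape=(time_step, 1)),
        layers.LSTM(50, return_sequences=True),
        layers.LSTM(50),
        layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model

# Random stand-in windows; in practice use X_train/y_train from Sect. 3.4.
X_train = np.random.rand(500, 100, 1).astype("float32")
y_train = np.random.rand(500, 1).astype("float32")

model = build_lstm()
model.fit(X_train, y_train, epochs=2, batch_size=64, verbose=1)
y_pred = model.predict(X_train[:5])
print(y_pred.shape)        # (5, 1)
```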
3.6 Testing

We primarily tested our models' performance with two well-known metrics, RMSE and the R2 score. We tested all three models with these two metrics and noted the results, concluding that LSTM performed better than the other two algorithms. Later, we also tested our model by examining the chart of its next 30-day prediction, and found that the model was making highly accurate predictions. Figure 2 shows how Apple's closing price varies with time. Figure 3 shows how our CNN model performed on the training dataset (represented in blue) and the test dataset (represented in orange and green). Figure 4 shows the model's prediction for the next 30 days: the blue line indicates the data it has been trained on and the red line indicates the prediction. Figure 5 shows the actual closing price of a selected company.
3.7 Result and Discussion

RMSE = √( (1/N) Σ_{i=1}^{N} (Predicted_i − Actual_i)² )    (1)
Equation (1) is the formula for calculating RMSE. Here, N is the length of the dataset: the squared differences between the predicted and actual values are summed, divided by N, and the square root of the result gives the RMSE.

R² = 1 − Σ_{i=1}^{n} (y_i − ŷ_i)² / Σ_{i=1}^{n} (y_i − ȳ)²    (2)
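Equations (1) and (2) can be evaluated directly with scikit-learn, as sketched below on a handful of made-up actual and predicted prices.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical actual vs. predicted closing prices.
y_true = np.array([148.2, 150.1, 151.7, 149.9, 152.3])
y_pred = np.array([147.9, 150.6, 151.1, 150.4, 151.8])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))    # Eq. (1)
r2 = r2_score(y_true, y_pred)                         # Eq. (2)
print(round(rmse, 2), round(r2, 3))                   # roughly 0.49 and 0.885 for this toy data
```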
Fig. 2 Closing price graph of apple
Fig. 3 Train versus test performance analysis graph
Equation (2) is the formula for calculating the R2 score. Here, n is the length of the dataset, ŷ represents the predicted value, y represents the actual value and ȳ is the mean of the actual values. One minus the ratio of the sum of squared differences between actual and predicted values to the sum of squared differences between the actual values and their mean gives the R2 score. Table 1 shows that ANN has not performed well in either RMSE or R2 score: it scored 0.76 and 0.87 for train and test, respectively, in RMSE and −1.3 and −1.7 for train and test, respectively, in R2 score, which is quite poor. On the other hand, CNN performed in an average manner, scoring 0.58 and 0.61 for train and test, respectively,
Fig. 4 Prediction graph
Fig. 5 Original graph
in RMSE and 0.74 and 0.81 for train and test, respectively, in R2 score. Among these three techniques, LSTM has performed quite well, so we chose LSTM for our predictions.

Table 1 Performance metrics comparison table

Algorithm used | RMSE for train | R2 score for train | RMSE for test | R2 score for test
LSTM | 0.42 | 0.55 | 0.48 | 0.63
CNN | 0.58 | 0.74 | 0.61 | 0.81
ANN | 0.76 | −1.3 | 0.87 | −1.7
4 Conclusion

In conclusion, the utilisation of LSTM, CNN and ANN models for stock price prediction has demonstrated promising results in the field of financial forecasting. These advanced machine learning techniques have shown their capability to capture complex patterns and dependencies in historical stock data, enabling them to make accurate predictions. We expanded the scope of our model training beyond the Indian stock market, specifically to the stock prices of MAANG and Divis Labs; all three models performed well, with the LSTM achieving an RMSE of 0.42 and an R2 score of 0.55, the best result among all. The long short-term memory (LSTM) model, with its ability to process and remember sequential information, has proven effective in capturing temporal dependencies in stock price movements. By incorporating past price trends and patterns, LSTM models have shown considerable success in predicting short-term stock price fluctuations. It is important to note that while these models have shown promising results, stock price prediction remains a highly challenging task due to the inherent volatility and unpredictability of financial markets. Factors such as economic events, market sentiment and geopolitical developments can significantly impact stock prices, making accurate predictions challenging. Therefore, it is essential to supplement these techniques with fundamental analysis and expert insights. By combining the power of advanced machine learning models with human expertise, investors and traders can make more informed decisions in the dynamic world of stock markets.
References

1. Hsu Y-L, Tsai Y-C, Li C-T (2023) FinGAT: financial graph attention networks for recommending top-K profitable stocks. IEEE Trans Knowl Data Eng
2. Parekh R, Patel NP, Thakkar N, Gupta R, Tanwar S, Sharma G, Davidson IE, Sharma R (2022) DL-GuesS: deep learning and sentiment analysis-based cryptocurrency price prediction. IEEE Access
3. Kabbani T, Duman E (2022) Deep reinforcement learning approach for trading automation in the stock market. IEEE Access
4. Li G, Zhang A, Zhang Q, Wu D, Zhan C (2022) Pearson correlation coefficient-based performance enhancement of broad learning system for stock price prediction. IEEE Trans Circuits Syst II: Express Briefs
5. Vargas G, Silvestre L, Júnior LR, Rocha H (2022) B3 stock price prediction using LSTM neural networks and sentiment analysis. IEEE Lat Am Trans
6. Yeng Z, Wong L, Zheng Y, Bindir A (2019) DeepClue: visual interpretation of text-based deep stock prediction. IEEE
7. Jenho L, Rahyun K, Yokyvng K, Jaewo K (2019) Global stock market prediction based on stock chart images using deep Q-network. IEEE
8. Li G, Zhan A, Zhang Q, Woo D, Zhan C (2022) Pearson correlation coefficient-based performance enhancement of broad learning system for stock price prediction. IEEE
9. Len Y-F, Huang T-M, Chvng W-H, Ung Y-L (2021) Forecasting fluctuations in the financial index using a recurrent neural network based on price features. IEEE
10. Wen M, Li P, Zhang L, Chen Y (2019) Stock market trend prediction using high-order information of time series. IEEE
11. Wu M-E, Chung W-H (2018) A unique technique to option portfolio design utilizing the Kelly criteria. IEEE Access 6:53044–53052
12. Wu M-E, Wang C-H, Chung W-H, Tso R, Yang I-H (2015) An empirical comparison of the Kelly criteria and Vince's optimum F. In: Proceeding IEEE international conference smart City/SocialCom/SustainCom (SmartCity), Dec 2015, pp 806–810
13. Chen SM, Chen CD (2011) TAIEX forecasting based on fuzzy time series and fuzzy variation groups. IEEE Trans Fuzzy Syst 19(1):1–12, Feb 2011
14. Chin SM, Chin CD (2011) TAIEX forecasting based on fuzzy time series and fuzzy variation groups. IEEE Trans Fuzzy Syst 190(19):1–21, Feb 2011
15. Kang P, Fans C (2008) A hybrid system integrating a wavelet and TSK fuzzy rules for stock price forecasting. IEEE Trans Syst Man Cybern Part C (Appl Rev) 389(69):812–851, Nov 2008
16. Idrees SM, Alam MA, Agarwal P (2019) A prediction approach for stock market volatility based on time series data. IEEE Access 7:17287–17298
17. Chen CLP, Liu Z (2018) Broad learning system: an effective and efficient incremental learning system without the need for deep architecture. IEEE Trans Neural Netw Learn Syst 29(1):10–24
18. Li Q et al (2021) Integrating reinforcement learning and optimal power dispatch to enhance power grid resilience. IEEE Trans Circuits Syst II, Exp Briefs 69(3):1402–1406, Mar 2021
19. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Proceeding 25th international conference neural information processing systems, pp 1097–1105
20. Liu D, Tse CK, Zhang X (2019) Robustness assessment and enhancement of power grids from a complex network's perspective using decision trees. IEEE Trans Circuits Syst II, Exp Briefs 66(5):833–837, May 2019
21. Bai L, Zhao Y, Huang X (2018) A CNN accelerator on FPGA using depthwise separable convolution. IEEE Trans Circuits Syst II, Exp Briefs 65(10):1415–1419, Oct 2018
22. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
23. Ding X, Zhang Y, Liu T, Duan J (2015) Deep learning for event driven stock prediction. In: Proceeding of 24th international conference artificial intelligence, pp 2327–2333
24. Erhan D, Bengio Y, Courville A, Vincent P (2009) Visualizing higher-layer features of a deep network, vol 1341. University of Montreal, Montreal, QC, Canada, Technical Report
25. Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Proceeding international conference learning representations, pp 1–15
IoT-Based Smart Wearable Devices Using Very Large Scale Integration (VLSI) Technology M. Ashwin, R. Ch. A. Naidu, Raghu Ramamoorthy, and E. Saravana Kumar
Abstract People's usage of smart wearable devices and sensors plays a crucial role in VLSI technology. Wearable devices are embedded in clothes, smartwatches, and accessories. Wearable gadgets such as smart rings, smartwatches, and smart spectacles are associated with human healthcare monitoring, real-time location detection, portable online games, etc. This paper presents a survey of the recent usage of wearable devices and sensors with VLSI technology. Three key features are considered to make user-friendly smart wearable devices: security, performance, and deployment cost. Almost all wearable devices, such as smartwatches and smartphones, are equipped with Bluetooth communication and screen displays. The primary goal of this research is to minimize the power consumption of wearable devices. Smart wearable devices are utilized for a wide range of applications such as medicine, sports, fitness, business, etc. Keywords Wearable device · Bluetooth · Wearable sensor · VLSI · IoT
1 Introduction

Wearable technology plays a vital role in the daily life of every user. It is an innovative technology that is used to monitor the daily activities of human life. Wearable devices are electronic devices worn as accessories, embedded in clothing, or attached to the user's body. These gadgets are hands-free devices suitable for practical use. A microprocessor is embedded in wearable devices, which can forward and collect information through the internet. Wearable technology is a vital group of IoT devices with life-changing applications in medicine and other fields.

M. Ashwin (B) · R. Ramamoorthy · E. S. Kumar Department of Artificial Intelligence and Data Science, Koneru Lakshmaiah Education Foundation, Andhra Pradesh, India e-mail: [email protected] R. Ch. A. Naidu · R. Ramamoorthy · E. S. Kumar Department of Computer Science and Engineering, The Oxford College of Engineering, Bengaluru, Karnataka, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_13
Wearable technology, also simply called wearables, has grown with the growth of mobile networks, high-speed data transmission, and ever-smaller microchips. This technology can be worn, embedded in clothing or accessories, or tattooed directly on the skin. In the nineteenth century, eyeglasses were developed at low cost as the first wearable technology, but recent wearables incorporate microchips. The development of wearable technology has followed the growth of mobile networks. The fitness activity tracker is a popular consumer wearable technology; later the wristwatch gained a display, and powerful new mobile apps were added. People receive data over wireless networks through devices such as Bluetooth headsets, smartwatches, and Web-enabled devices. The gaming trade adds further wearables, with virtual reality and augmented reality headsets. Wearable technology is also adopted for medical and healthcare uses, for example AIR Louisville, ITBra, medical alert monitors, smart tattoos, smartwatches, and child monitoring devices. AIR Louisville is used to measure air quality and pollution. The ITBra is used to identify breast cancer and transmits data to the lab for further investigation. Wearable medical alert monitors extend mobility and independence to the aged and infirm. Smart tattoos contain a microelectronic sensor that is used to monitor brain and heart activity, muscle function, and sleep disorders. A smartwatch is used to measure a person's disease symptoms. A child monitoring device equipped with GPS is used to identify the location of the child. Wearable sensors aim to provide self-powered sensing, computation, and communication systems that enable data-driven insights for an intelligent and healthy world, as shown in Fig. 1.
Fig. 1 Possible wearable sensor services for the human body
2 Literature Review

A wearable-device micropayment scheme is implemented with the help of a BLE-based authentication protocol; the verification procedure creates an exclusive and secure session key for the micropayment service [1]. An encoding method based on MBUS reduces power consumption by up to 70.96% using an FPGA; compared with the original MBUS implementation, the encoding-based MBUS achieves better performance [2]. A survey of wearable computing shows that wrist-worn wearables are applied in three fields, namely user interface and interaction, user education, and activity recognition [3]. An MCU for WBSNs is used to decrease the silicon area through hardware sharing for processing various body signals; the QRS complex indicator is utilized to measure the heartbeat rate of the human body, and compared with other designs this work increases the normal density proportion by 12% for ECG signals in WBSNs [4]. An intelligent wearable device using Bluetooth technology has low power consumption, through which a remote ECG continuously observes patients with various cardiac conditions [5]. A full adder with small energy consumption and improved area efficiency improves the result by up to 30% in a wide range of applications [6]. The LC ADC saves energy for wearable devices through non-uniform sampling; comparison with synchronous and asynchronous ADCs shows that the LC ADC is better than the others [7]. Low-power designs are utilized in wearable and implantable devices to minimize overall energy consumption and increase battery lifetime; developing smaller and smarter WIDs is needed to reduce power consumption for multitasking computational purposes [8]. The design and implementation of a low-energy FPGA hardware architecture is structured around operations such as exponentiation and multiplication using security procedures based on public key encryption; the results show that this design uses 96% less design logic and 46% fewer resources compared with other designs [9]. The MH sampler is one of the MCMC algorithms for sampling from a PDF; it adopts patients' real-time signals to predict brain seizures with minimal power consumption using an FPGA, and the model achieves a forecast accuracy of 81.47% and a sensitivity of 90% [10]. Micromachine and LSI technologies are used to construct wearable systems for healthcare monitoring in everyday life [11]. An attribute-based health record security procedure decreases the encoding time by 0.364 s and the decoding time by 0.188 s for individual patient health records [12, 13]. Security [14] and energy consumption play a vital role in networks [15] and wearable devices. Table 1 shows various smart wearable devices and sensors available from vendors [16].
Table 1 Commercial smart wearable devices and sensors

S. No | Name | Activity/Affect recognition | Sensors
1 | Oura smart ring | Activity + Affect | Temperature sensor, gyroscope, accelerometer, and infrared LEDs
2 | LG G watch R smart | Activity | Accelerometer, gyroscope, proximity, HR, barometer
3 | Shimmer IMU | Activity | Accelerometer, gyroscope, magnetometer, and altimeter
4 | Samsung gear live smartwatch | Activity + Affect | Accelerometer, gyroscope, HR, compass
5 | Samsung gear S | Activity | Accelerometer, gyroscope, proximity, HR, barometer, UV light; S3: accelerometer, gyroscope, proximity, HR, barometer
6 | SparkFun 9DoF razor IMU | Activity | Accelerometer, gyroscope and magnetometer
7 | Motorola Moto 360 2gen and sport | Activity | Accelerometer, gyroscope, HR; Sport: accelerometer, altimeter, gyroscope, HR, ambient light sensor
8 | Xiaomi MI band | Activity | HR, pedometer, sleep administration; SMS response, anti-loss, hard trial function
9 | Moto SKT smart band | Activity | Accelerometer, LED medium, vibrate motor, lens cortex M3
10 | Small rock Wristwatch Black Shingle 2 + Heart Rate Smart Watch | Activity + Affect | Microphone, accelerometer, gyroscope, magnetometer, ambient light sensor, compass

(The printed table also includes an image of each device, omitted here.)
3 Materials and Methods

The aim is to develop leading power-management systems for high-value applications such as healthcare and IoT, achieved through power harvesting, low-power microelectronics, and sensor devices, with an emphasis on usability and actionable information. Digital biomarkers are an opportunity to turn novel sources of data into useful, actionable insights. Nowadays, we want wearable gadgets with extra capabilities and a longer battery lifetime. This can be achieved by packing additional components onto smaller dies, thus moving to small-geometry chip design. However, power dissipation or leakage power occurs in all the circuits that are presently utilized, which raises the overall energy consumption and makes them less suitable for wearable gadgets. The growing demand for portable and even wearable electronic gadgets for communication, computation, and entertainment calls for long battery life, low power consumption, and low weight. Smart wearable devices are designed to make use of low-energy components and low-power design practices. Power management is an important factor for a system-on-chip; the major factors are the cost associated with packaging and cooling, standby time and battery lifetime, digital noise immunity, and environmental concerns. Low-power VLSI can be accomplished by optimization at several levels of the design process, from the system and algorithm levels down to the circuit and technology levels, as shown in Fig. 2. These design levels are considered to minimize power consumption for smart wearable devices. At the system level, partitioning and power-down concepts are utilized. The algorithm level addresses complexity and concurrency. The architecture level addresses parallelism, redundancy, pipelining, and data encoding. The circuit level addresses energy recovery, logic styles, and transistor design. The technology level addresses
Fig. 2 Low-power VLSI design process-level model
threshold reduction and multi-threshold devices. Here, the circuit-level design process is used for minimizing power consumption. The conventional CMOS circuit diagram is shown in Fig. 3, and the corresponding full adder truth table is shown in Table 2. The 1-bit full adder design is shown in Fig. 4. The design metrics for the wearable device (propagation delay, average power consumption, and leakage power) are
Fig. 3 Conventional CMOS full adder
Table 2 Truth table of a full adder

Cin | A | B | Sum | Cout
0 | 0 | 0 | 0 | 0
0 | 0 | 1 | 1 | 0
0 | 1 | 0 | 1 | 0
0 | 1 | 1 | 0 | 1
1 | 0 | 0 | 1 | 0
1 | 0 | 1 | 0 | 1
1 | 1 | 0 | 0 | 1
1 | 1 | 1 | 1 | 1
reduced dramatically for this design process model. The 1-bit full adder design verification process is done with the help of the following block diagram. The block diagram of the proposed microcontroller unit (MCU) design with a field programmable gate array (FPGA) for the wearable device is shown in Fig. 5. Initially, the physical signals are provided by the Arduino simulator and then passed to the FPGA microcontroller unit for processing. The FPGA has a universal asynchronous receiver/transmitter (UART) communication port; the signals are handled by lossless compression, encoding, and an error correction encoder, and the encoded stream is passed to the system through the UART transmission link. Subsequently, the system receives the encoded bit stream and can decode it using decoding software. Fig. 4 1-bit full adder design [14]
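The logic of Table 2 can be checked with a short behavioural model; the Python sketch below is only a functional reference for the truth table, not the low-power CMOS or FPGA implementation discussed in the paper.

```python
def full_adder(cin: int, a: int, b: int):
    """1-bit full adder: Sum = A xor B xor Cin, Cout = majority(A, B, Cin)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout

# Exhaustive check against Table 2.
print("Cin A B | Sum Cout")
for cin in (0, 1):
    for a in (0, 1):
        for b in (0, 1):
            s, cout = full_adder(cin, a, b)
            print(f"{cin}   {a} {b} |  {s}    {cout}")
```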
Fig. 5 FPGA substantiation process for wearable device proposed MCU design
4 Results and Discussion

Developing a wearable device with low power consumption is an important aspect of the design process. Table 3 shows the per-day power consumption of the Oura smart ring wearable device. The Oura smart ring is used to measure a person's body temperature, heart rate, activity, and sleep time. The power consumption of the Oura ring is shown in Table 3, and the corresponding graph is shown in Fig. 6. The BlueCore2-Flash is a Bluetooth chip operating in the 2.4 GHz band, and it is designed to reduce the number of external components required. It also has a very low power state, as shown in Table 4, and the corresponding graph is shown in Fig. 7.
Table 3 Power consumption of Oura ring

S. No | State | Power consumption (mA)
1 | Normal mode | 9
2 | Idle mode | 5
3 | Full power-down mode | 14
Fig. 6 State versus power consumption of Oura ring
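To relate the per-state currents of Table 3 to a per-day figure, the short calculation below assumes a hypothetical split of the day across the three states; both the duty cycle and the resulting numbers are illustrative, not measured values.

```python
# Current draw per state from Table 3 (mA) and an assumed daily duty cycle (hours).
current_ma = {"Normal mode": 9, "Idle mode": 5, "Full power-down mode": 14}
hours_per_day = {"Normal mode": 4, "Idle mode": 18, "Full power-down mode": 2}

daily_charge_mah = sum(current_ma[s] * hours_per_day[s] for s in current_ma)
avg_current_ma = daily_charge_mah / 24

print(daily_charge_mah)          # 154 mAh drawn per day under these assumptions
print(round(avg_current_ma, 2))  # 6.42 mA average draw
```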
Table 4 BlueCore2-Flash chip energy consumption

S. No | State | Energy consumption (mA)
1 | Information transmission 120 Kbps | 6
2 | Information transmission 720 Kbps | 48
3 | Deep sleep | 19
Fig. 7 State versus power consumption of BlueCore2-Flash
5 Conclusion

This work helps to minimize the power consumption of smart wearable devices. The 1-bit full adder circuit design is verified with the FPGA-based MCU verification process. The result is compared across normal mode, idle mode, and full power-down mode for the Oura ring wearable device. The result for another wearable device, the BlueCore2-Flash chip, is also compared with the FPGA-based MCU verification process. Both results dramatically reduced the energy consumption of wearable
devices by up to 20%. Wearable devices are utilized for everyday applications in real life. In future work, the design process model will be developed further to minimize the power consumption of smart wearable devices.
References

1. Lo NW, Yohan A (2020) BLE-based authentication protocol for micropayment using a wearable device. Wireless Pers Commun 112(4):2351–2372
2. Yang X, Wu N, Andrian JH (2017) Comparative power analysis of an adaptive bus encoding method on the MBUS structure. VLSI Des
3. Al-Eidan RM, Al-Khalifa H, Al-Salman AM (2018) A review of wrist-worn wearable: sensors, models, and challenges. J Sens 2018:1–20
4. Chen SL, Tuan MC, Lee HY, Lin TL (2017) VLSI implementation of a cost-efficient micro control unit with asymmetric encryption for wireless body sensor networks. IEEE Access 5:4077–4086
5. Led S, Fernández J, Serrano L (2004) Design of a wearable device for ECG continuous monitoring using wireless technology. In: The 26th annual international conference of the IEEE engineering in medicine and biology society, vol 2. IEEE, pp 3318–3321
6. Franklin AB, Sasilatha T (2019) Design and analysis of low power full adder for portable and wearable applications. Int J Recent Technol Eng (IJRTE). ISSN 2277-3878
7. Antony A, Paulson SR, Moni DJ (2018) Asynchronous level crossing ADC design for wearable devices: a review. Int J Appl Eng Res 13(4):1858–1865
8. Lundager K, Zeinali B, Tohidi M, Madsen JK, Moradi F (2016) Low power design for future wearable and implantable devices. J Low Power Electron Appl 6(4):20
9. Rodríguez-Flores L, Morales-Sandoval M, Cumplido R, Feregrino-Uribe C, Algredo-Badillo I (2018) Compact FPGA hardware architecture for public key encryption in embedded devices. PLoS ONE 13(1):e0190939
10. Marni L, Hosseini M, Hopp J, Mohseni P, Mohsenin T (2018) A real-time wearable FPGA-based seizure detection processor using MCMC. In: 2018 IEEE international symposium on circuits and systems (ISCAS). IEEE, pp 1–4
11. Yamada I, Lopez G (2012) Wearable sensing systems for healthcare monitoring. In: The 2012 symposium on VLSI technology. IEEE, pp 5–10
12. Mubarakali A, Ashwin M, Mavaluru D, Kumar AD (2020) Design an attribute-based health record protection algorithm for healthcare services in a cloud environment. Multimedia Tools Appl 79:3943–3956
13. Dokania V, Verma R, Guduri M, Islam A (2018) Design of 10T full adder cell for ultralow-power applications. Ain Shams Eng J 9(4):2363–2372
14. Ramamoorthy R, Thangavelu M (2022) An enhanced distance and residual energy-based congestion aware ant colony optimization routing for vehicular ad hoc networks. Int J Commun Syst 35(11):e5179
15. Raghu R, Menakadevi T (2016) A survey on anonymous secure on-demand routing protocols in MANETs. Middle East J Sci Res 24:3869–3880
16. Ashwin M, Kumar ES, Naidu RCA, Ramamoorthy R (2023) IoT based innovative teaching learning using smart class rooms. In: 2023 International conference on sustainable computing and data communication systems (ICSCDS). IEEE, pp 1143–1148
Plant Disease Detection and Classification Using Artificial Intelligence Approach Ashutosh Ghildiyal, Mihir Tomar, Shubham Sharma, and Sanjay Kumar Dubey
Abstract Agricultural development is not only crucial for human existence; it is also a major commercial field for many countries, regardless of their state of development. Early detection of plant diseases is crucial because they have an impact on the growth of the affected species. However, plant diseases have become more prevalent recently for a number of natural and artificial reasons. Globalization, trade, and climate change have all had an impact, as has the deterioration of existing systems as a result of years of agricultural intensification. For a long time, plant disease detection using computer vision has been a major topic of discussion. Emerging technologies such as artificial intelligence can be applied to images of plants to detect infection and disease, allowing people to take the required steps to treat the plants. As the datasets used in this work are limited, disease identification is done mainly through images. The goal of this work is to accurately diagnose disease in plants (among the diseases present in the dataset) by comparing models and selecting the one with the highest accuracy on training and testing data. Common datasets of plants and their diseases are introduced, and the results of previous research are compared. On that premise, this work examines potential problems in practical implementations of deep learning-based plant disease diagnosis. This analysis also identifies aspects to focus on in order to improve this project. Our research also looks at how machine learning approaches have evolved in the last few years, from traditional machine learning to deep learning.
A. Ghildiyal (B) · M. Tomar · S. Sharma · S. K. Dubey (B) Department of CSE, Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, India e-mail: [email protected] S. K. Dubey e-mail: [email protected] M. Tomar e-mail: [email protected] S. Sharma e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_14
Keywords Disease detection · Plant disease · Deep learning · Image processing · Machine learning · DenseNet · AlexNet · CNN architecture
1 Introduction There are an estimated 93 million farmers in India. Agricultural goods are utilized to meet the dietary demands of both animals and humans. Food, which is the foundation of every human being, is produced because of agriculture. Everyone relies on agricultural produce in some capacity, whether they live in a large city or a tiny town. It should be highlighted that agriculture accounts for 20% of India's GDP [1], and traditional methods of plant disease detection rely on human observation, which is labour-intensive, subjective, and prone to mistakes. With the advancement of technology in all aspects of life, agriculture has seen significant advancement [2]. Along with the variety of crops grown, additional activities such as farming, cattle husbandry, and so on were initiated. However, crop production now contributes significantly to agricultural output, and crop productivity has also changed dramatically. With the advancement of knowledge technology, agriculture has seen considerable modernization. Modern agronomy utilizes the most advanced technical tools and processes to boost production. To increase crop productivity through the identification of optimal growth conditions, more and more agricultural practices are being implemented using state-of-the-art techniques. However, plant diseases can have a major impact on crop production and quality because of the damage to plants' normal functioning. Failure to prevent infections may result in reduced harvests, so quantity and quality must be ensured through efficient disease management strategies. In the management of large-scale agricultural production, it is essential to take quick action in terms of monitoring for possible infections, which also entails finding immediate solutions to numerous problems. Disease can have an impact on a plant's total functional capacity. It can cause growth retardation, decreased fruit yield, increased leaf fall, and a variety of other problems. The disease may be transmitted from crop to crop, by a pathogen, or in some other manner. Thus, it would be safe to say that modernizing agriculture would greatly benefit the nation and create many opportunities for job growth and expansion in the agricultural industries, in addition to improving the conditions of local farmers. Pesticide, fungicide, and herbicide research and development have evolved rapidly in India. However, every year, crops succumb to numerous known diseases, resulting in the loss of tonnes of produce, which can be avoided by early diagnosis of plant diseases. The exponential rise in population and changing climatic conditions are among the main drivers of plant diseases [3]. The objective of this study is to detect diseases in a plant using an image of the plant. To achieve this objective, the study uses various algorithms and compares their accuracies to find the algorithm best suited to the task, identifying with precision whether the plant is diseased without any laboratory testing process. To diagnose the disease, the leaves must be closely monitored. Many researchers have reported on various strategies for detecting and monitoring plant diseases. A range of imaging techniques
are employed in the identification of disease. Photo-acoustic imaging, which relies on light absorption by tissues, is one of the imaging techniques being used. It makes use of the ability of tissues to absorb light and convert it to heat, which causes the generation of photo-acoustic signals. Here, the pressure distribution that the tissues emit is mapped and used for imaging. One of the more important techniques in this respect is magnetic resonance imaging, an imaging method that generates detailed images usually used to diagnose various diseases.
2 Literature Review Over time, the detection of plant diseases through technology has attracted the attention of numerous researchers, creating a pressing need for more precise detection. As a result, many methods for detecting plant diseases have been proposed recently. Images of the plant, plant leaves, roots, etc., serve as the input parameters for these methods. There have been a significant number of research works on this topic. Machine learning and deep learning models should be improved or modified to enable them to recognize and classify diseases over the course of their whole life cycle because the severity of plant diseases fluctuates over time [4]. That paper does not discuss the types of diseases detected and does not show any comparative study using practical models, being purely theoretical. The study in [5] offers an effective and accurate way to recognize different plant diseases in certain areas, and it might be expanded to other fruit and crop detection, generic disease identification, and automated agricultural detection techniques. The model is used only for apple plants; hence, the data used is only of apple plants, and the accuracy might differ for other crops in models based on it, primarily because of differences in the dataset on which the model is trained. There were several methods for identifying and categorizing plant diseases early on. The key methods include SVM, K-means clustering, deep learning, and K-NN [6]. Several commercially available products are gradually becoming well-known for distinguishing plant diseases and recognizing recovery strategies, assisting farmers in enhancing produce profitability, and farmers enjoy these advantages [7]. No practical model is demonstrated in that paper, which is purely theoretical with basic concepts. After nine epochs of training, the trained model in [9] had an accuracy of 93.82%. The experimental results showed that the model's ability to distinguish between various plant leaf image categories as healthy or diseased, despite significant inter- and intraclass changes, appeared promising. Automated algorithms for detecting plant leaf diseases are still needed, and a specialized system for accurate plant disease detection requires development. Using this method also has the benefit of allowing for the early or initial detection of plant diseases [10]. More advanced techniques like CNN algorithms might have produced better results for the detection of plant diseases than the algorithms used in that paper. It was shown that employing attention blocks followed by convolutional blocks improved accuracy marginally but did not significantly affect prediction speed [11].
Object localization networks were required for wide images of plant leaves. Multilabel classification was not performed to find multiple diseases. The performance of the models was considerably impacted by the quantity of photos. In AlexNet, fine-tuning the minibatch size did not clearly correlate with classification performance, but as the minibatch size was increased in the VGG16 net, classification accuracy fell, and when the weight and bias learning rates were increased in the VGG16 net, accuracy decreased [12]. The number of images strongly affected the accuracy of the model, and a smaller number of images gave the maximum accuracy. The model's performance is competitive with that of AlexNet and VGG16 in fine-grained image classification [13]. The computation burden is particularly common in parallel networks, and although it may lead to improved accuracy, it also reduces overall algorithm efficiency. The ViT model treats the image as a sequence of patches and employs a standard transformer encoder to process it. This approach can enable precision agriculture and help farmers to manage crops and weeds more efficiently [14]. This model is more effective with larger datasets; to improve accuracy with smaller datasets, a reduced number of classes can be used during training. The work in [15] highlights the significance of gathering large datasets with high variability, data augmentation, transfer learning, visualization of CNN activation maps, small-sample plant leaf disease detection, and hyperspectral imaging for early detection of plant disease in improving classification accuracy. The model, however, has inadequate robustness; for use with various disease datasets, deep learning models need to be more robust. An updated model also provides the benefit of employing smaller networks: it costs less data and updates more quickly when a mobile application is updated via mobile communication [16]. The dataset used for training the CNN was relatively small, with only a certain number of images, which could limit the generalizability of the method to other datasets and different types of plant diseases.
3 Methodology
3.1 System Configuration
Deep learning models rely heavily on a graphics processing unit (GPU) with compute unified device architecture (CUDA) support. The study was carried out with 30 GiB of random access memory (RAM), 8 CPUs, and an 8 GiB GPU.
3.2 Dataset This study makes use of the PlantVillage [8] project dataset. The collection contains 54,306 photos with 256 × 256 resolution and 38 different class labels, such as apple
Fig. 1 Sample of images from PlantVillage dataset
scab, apple healthy, grape black rot. Colour, grayscale, and segmented formats are all offered for this dataset (Fig. 1).
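Although the paper does not include its data-loading code, a minimal sketch of how such a folder-per-class dataset can be read with Keras and split 80–20 (the ratio used later in this study) is given below; the directory path, seed, and batch size are assumptions for illustration only.

```python
import tensorflow as tf

# Assumed layout: plantvillage/<class_name>/<image>.jpg (38 class folders).
DATA_DIR = "plantvillage"   # hypothetical path
IMG_SIZE = (256, 256)       # PlantVillage images are 256 x 256
BATCH = 32

# 80-20 split, matching the train-test ratio used in the study.
train_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="training", seed=42,
    image_size=IMG_SIZE, batch_size=BATCH)
test_ds = tf.keras.utils.image_dataset_from_directory(
    DATA_DIR, validation_split=0.2, subset="validation", seed=42,
    image_size=IMG_SIZE, batch_size=BATCH)

print("Classes:", train_ds.class_names[:5], "... total", len(train_ds.class_names))
```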
3.3 Experimental Work
The study compares the applicability of different machine learning and deep learning models for the plant disease detection problem by adapting already pretrained models. The dataset is divided into a train–test split of 80–20 (80% training and 20% testing). The accuracy percentage is used to compare which model best suits the problem statement. The following figure (Fig. 2) explains the process used to perform the study:
Step 1: Data Acquisition
The PlantVillage dataset is used, which consists of coloured, grayscale, and segmented images of various healthy as well as unhealthy plant leaves classified into their plant
Fig. 2 Proposed methodology
Fig. 3 Accuracy plot of simple CNN (using Keras)
diseases. There are 54,306 pictures on which the models are trained. This dataset is divided in a train–test split ratio of 80–20.
Step 2: Algorithms Selection
The algorithms used to detect plant disease are CNN (using Keras), VGG19, Inception, AlexNet, and DenseNet. We compare the accuracy of these algorithms and decide which model is best suited for the given problem statement.
Step 3: Results Analysis
Simple CNN (Keras)
The accuracy obtained using a simple CNN on the Keras backend was 84.447% (Fig. 3).
VGG19 Architecture of CNN
The use of tiny filters in the convolutional layers is one of the distinguishing characteristics of VGG19. The tiny filters' ability to catch finer details enables the network to learn more intricate characteristics and patterns in the input data. Additionally, the network can learn more invariant characteristics that are less susceptible to minute changes in the input by using max pooling layers. The computational expense of VGG19 is a drawback: it can take a while to train and uses a lot of processing power because of the high number of parameters and layers. The accuracy obtained using VGG19 was 87.952% (Fig. 4).
Inception Architecture of CNN
The concept of using many levels of feature extraction to obtain high accuracy in picture categorization is where the name "Inception" originates. Several parallel
Fig. 4 Accuracy plot for VGG19 architecture of CNN
convolutional layers with various filter sizes make up each Inception module. The network’s following layer receives the outputs of these parallel layers after they have been concatenated. The accuracy obtained using Inception architecture was 82.233% (Fig. 5).
Fig. 5 Accuracy plot for Inception architecture on CNN
Fig. 6 Accuracy plot of AlexNet architecture of CNN
AlexNet Architecture of CNN
Eight layers make up AlexNet, comprising five convolutional layers, two fully connected layers, and a final softmax output layer. The fact that the PlantVillage dataset differs from the ImageNet dataset, on which AlexNet was pretrained, could be one explanation for the somewhat lower accuracy of AlexNet in this research: photos of plant leaves make up the PlantVillage collection, whilst photos of diverse objects make up the ImageNet dataset. The accuracy obtained using AlexNet architecture was 95.223% (Fig. 6).
DenseNet Architecture of CNN
DenseNet121 is a deep learning architecture that is designed to learn more efficiently and effectively than traditional CNNs. Its dense connections between layers allow for the reuse of features across the network, leading to improved performance with fewer parameters. The usage of dense blocks, which are made up of many convolutional layers with a set number of feature mappings, is another factor in DenseNet's performance. The output of one layer is concatenated with the input of the following layer in a feedforward connection between these layers. DenseNet121 has a total of 121 layers, which are organized into four dense blocks, each containing a number of convolutional layers [17]. In between the dense blocks, there are transition layers that perform downsampling of feature maps using average pooling and convolution [17]. The performance of the DenseNet model can be enhanced in a number of ways.
Fig. 7 Accuracy plot of DenseNet architecture of CNN
To expand the dataset and enhance the model’s ability to generalize to new data, one method is to apply data augmentation techniques. The accuracy obtained using DenseNet architecture was 97.093% (Fig. 7).
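The paper reports the DenseNet result without listing its training code; the following is only a hedged sketch, under assumed hyperparameters, of a DenseNet121 transfer-learning setup with the kind of data augmentation suggested above — not the authors' actual configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 38  # PlantVillage class labels

# Simple augmentation block (assumed; the paper only suggests augmentation as an option).
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
])

base = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(256, 256, 3))
base.trainable = False  # start by training only the new classification head

inputs = layers.Input(shape=(256, 256, 3))
x = augment(inputs)
x = tf.keras.applications.densenet.preprocess_input(x)
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=test_ds, epochs=10)
```

Fine-tuning the upper dense blocks after this initial stage is a common follow-up, but the schedule used by the authors is not stated in the paper.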
4 Results and Analysis The implementation of the models involved the classification of images as healthy or infected with one of the described diseases. To gauge their effectiveness in classifying new data, the models were evaluated on the test dataset based on their accuracy shown in Table 1. This served as an indicator of their ability to detect plant diseases accurately. It is interesting to note that Inception, which is a widely used architecture in computer vision tasks, had the lowest accuracy in this study. This could be because the model is optimized for image classification tasks rather than tasks like plant Table 1 Table comparing different models’ accuracies
Model                 Accuracy (%)
Simple CNN (Keras)    84.447
VGG19                 87.952
Inception             82.233
AlexNet               95.223
DenseNet              97.093
disease detection, which require more specialized features. The Inception module may also have some drawbacks, including the possible cost of computation, especially when utilizing larger kernel sizes. Unlike Inception, the VGG19 algorithm's performance was well within expectations. This model is based on the VGG16 architecture and has 19 layers. The added layers enable better feature extraction, leading to improved accuracy; however, it is still not as accurate as the top-performing models. In the study, AlexNet showed excellent performance, with an accuracy higher than the simple CNN using Keras, Inception, and VGG19; however, it was lower than that of the DenseNet architecture. The fact that the PlantVillage dataset differs from the ImageNet dataset, on which AlexNet was pretrained, could be one explanation for the somewhat lower accuracy of AlexNet in this research. Photos of plant leaves make up the PlantVillage collection, whilst photos of diverse objects make up the ImageNet dataset. As a result, it is possible that AlexNet's features from ImageNet are not the best for spotting plant diseases. DenseNet was the best-performing model in the study, achieving an accuracy of 97.09%. This model is based on the idea of dense connectivity, where each layer is connected to every other layer in a feedforward manner, which allows for better feature reuse and extraction, leading to higher accuracy. DenseNet's ability to handle complicated and unstructured data is one of the factors contributing to its outstanding performance: it is able to extract useful characteristics from the more than 50,000 photos of healthy and diseased plants in the PlantVillage dataset despite their diversity and complexity.
5 Conclusion and Future Scope In this study, the objective was to explore the use of artificial intelligence models for plant disease classification based on leaf images. Several algorithms were examined to determine the most appropriate one for the problem statement. The results indicated that the DenseNet architecture on CNN had the highest accuracy and was the most efficient model. This model is designed with dense connections between layers to optimize the flow of information through the network, which makes it effective for detecting plant diseases. The Inception model being the weakest was an unexpected outcome; however, the possible reasons for its performance were discussed and theorized along with the analysis of all the results and ways of improvement. Hence, it can be concluded that the objectives of the study were achieved, and the results were recorded and analysed effectively. Also, advanced techniques such as Faster R-CNN can be employed to enhance the precision of processing leaf images. Overall, the research provides insights into the potential of AI models for plant disease detection. By improving the accuracy of disease detection, farmers can prevent the spread of diseases and enhance crop yields. The findings can be used to develop more efficient and precise models for detecting plant diseases, leading to sustainable agriculture practices and food security.
References 1. Kulkarni O (2018) Crop disease detection using deep learning. In: Fourth international conference on computing communication control and automation (ICCUBEA). Pune, India, pp 1–4 2. Balodi R, Bisht S, Ghatak A, Rao K (2017) Plant disease diagnosis: technological advancements and challenges. Indian Phytopathol 70(3):275–281 3. Pautasso M, Döring T, Garbelotto M (2012) Impacts of climate change on plant diseases— Opinions and trends. Eur J Plant Pathol 133:295–313 4. Poornappriya T, Gopinath R (2020) Rice plant disease identification using artificial intelligence approaches. Int J Electr Eng Technol 11(10):392–402 5. Roy A, Bhaduri J (2021) A deep learning enabled multi-class plant disease detection model based on computer vision. Artif Intell 2(3):413–428 6. Singh V, Sharma N, Singh S (2020) A review of imaging techniques for plant disease detection. Artif Intell Agric 4:229–242 7. Deepa R, Shetty C (2021) A machine learning technique for identification of plant diseases in leaves. In: 6th international conference on inventive computation technologies (ICICT). Coimbatore, India, pp 481–484 8. https://www.kaggle.com/datasets/abdallahalidev/plantvillage-dataset 9. Adedoja A, Owolawi P, Mapayi T (2019) Deep learning based on NASNet for plant disease recognition using leave images. In: 2019 international conference on advances in big data, computing and data communication systems (icABCD). Winterton, South Africa, pp 1–5 10. Singh V, Misra A (2017) Detection of plant leaf diseases using image segmentation and soft computing techniques. Inf Process Agric 4(1):41–49 11. Borhani Y, Khoramdel J, Najafi E (2022) A deep learning based approach for automated plant disease classification using vision transformer. Sci Rep 12(1):11554 12. Rangarajan A, Purushothaman R, Ramesh A (2018) Tomato crop disease classification using pre-trained deep learning algorithm. Procedia Comput Sci 133:1040–1047 13. Lin Z, Mu S, Huang F, Mateen KA, Wang M, Gao W, Jia J (2019) A unified matrix-based convolutional neural network for fine-grained image classification of wheat leaf diseases. IEEE Access 7:11570–11590 14. Reedha R, Dericquebourg E, Canals R, Hafiane A (2022) Transformer neural network for weed and crop classification of high resolution UAV images. Remote Sens 14(3):592 15. Li L, Zhang S, Wang B (2021) Plant disease detection and classification by deep learning—A review. IEEE Access 9:56683–56698 16. Durmu¸s H, Güne¸s EO, Kırcı M (2022) Disease detection on the leaves of the tomato plants by using deep learning. In: 6th international conference on Agro-geoinformatics. Fairfax, VA, USA, pp 1–5 17. Huang G, Liu Z, Maaten L, Weinberger K (2017) Densely connected convolutional networks. In: IEEE conference on computer vision and pattern recognition (CVPR). In Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708
Camera and Voice Control-Based Human–Computer Interaction Using Machine Learning Ch. Sathwik, Ch. Harsha Vardhan, B. Abhiram, Subhani Shaik, and A. RaviKumar
Abstract Human–computer interaction relies on various hardware devices, one of which is the mouse. The latest mice have become wireless with the help of Bluetooth technology or radio frequencies, connected through a receiver and powered by batteries. The proposed method instead uses OpenCV, machine learning, and Python, with fingertip control and audio cues. These hardware restrictions are removed by adding a webcam or using a built-in camera, enabling computer vision-based hand movement and fingertip detection. A machine learning algorithm is employed so that the computer can be virtually controlled with hand movements to perform various mouse features such as opening and closing folders, refreshing, and moving up and down, along with cursor movement. Deep learning is used to track the hand movements. Hence, the put-forward method makes use of audio cues to command the PC and also helps protect users from infectious diseases such as Covid-19. Keywords OpenCV · MediaPipe · Cursor-audio cues
1 Introduction Technology has merged into our daily lives in today’s fast-paced society. In both our personal and professional lives, computers have proliferated, and their use is now essential for communication, education and pleasure. Yet, using conventional input devices like the keyboard and mouse for long stretches of time can be taxing and tiresome. In addition, some persons can find it challenging to use conventional input devices because of their physical restrictions, illnesses, or disabilities. This is where Ch. Sathwik (B) · Ch. Harsha Vardhan · B. Abhiram · S. Shaik · A. RaviKumar Department of Information Technology, Sreenidhi Institute of Science and Technology (Autonomous), Hyderabad, Telangana, India e-mail: [email protected] A. RaviKumar e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_15
the breakthrough human–computer interaction technology known as the Gesture-Controlled Virtual Mouse, which promises to make computer use simpler and offer an alternate means of computer interaction, comes into play. A technology called the Gesture-Controlled Virtual Mouse uses voice instructions and hand motions to control computer operations. It is a cutting-edge technology that enables users to operate computers through speech and hand gestures without the use of conventional physical input devices like a mouse or keyboard. The system recognizes hand movements and verbal commands using computer vision and machine learning algorithms, making it a simple and natural way to converse with computers. The system consists of two modules: one that directly controls hand movements using MediaPipe Hand detection, and the other that tracks hand gestures using gloves of any uniform hue. MediaPipe implements sophisticated machine learning and computer vision techniques like CNN for the system.
2 Literature Survey Numerous similar studies on operating a virtual mouse with a glove-wearing hand have been done. A camera has been used to try and identify hand motions. In a hardware-based system first introduced by Quam in 1990, the user dons a Data Glove [1]. Although producing results with more accuracy, the technology created by Quam does not support all gesture controls. Utilizing motion history images, Dung-Hua Liou, Chen-Chiung Hsieh, and David Lee presented a study titled "A Real-Time Hand Gesture Recognition System" in 2010 [2, 3]. Compared to systems that rely on color tips on the hands, this one proved more accurate at detecting and identifying hand gestures. Unfortunately, a flaw in this system was its inability to handle intricate hand gestures. E. Erdem, Y. Yardimci, V. Atalay, and A. E. Cetin suggested a study on "Computer vision-based mouse" [4, 5]. In order to process skin pixel detection and hand segmentation, this system used saved frames. In 2016, Abhik Banerjee, Abhirup Ghosh, and Koustuvmoni Bharadwaj published a paper in the IJCTT Journal titled "Mouse Control using a Web Camera based on Color Detection" [6, 7]. This technology used a real-time digital camera to control the pointer motion rather than physically pressing buttons or converting mouse locations on a physical computer. It permitted the use of several bands to execute various mouse tasks, such as left-clicking. It also made it possible to replace the conventional mouse system, which has been in use for decades. Despite the benefits of this system, there is one major drawback: it may not be suitable for users who wear gloves, as it could lead to failure to accurately detect color tips and thus result in incorrect recognition.
3 Proposed System Design
Architecture Design
The proposed system can be launched initially through either the voice assistant program or the gesture control program, and each program can be used to launch the other. In the gesture control application, user gestures are registered using a web camera and each frame is processed through MediaPipe's hand gesture recognition module [1, 8], which establishes landmarks. A gesture is identified using these landmarks and some computation. Following that, the control class executes operations based on these commands. This is repeated often. In the voice assistant program, voices are captured using a microphone, the commands are comprehended, and the corresponding actions are carried out. Figure 1 [4] shows the basic architecture of the design.
Virtual Mouse Using Hand Gesture
We control the position of the mouse cursor and other click events using a web camera and a color recognition approach. With the help of voice commands and both static and dynamic hand gestures, it is possible to virtually control all I/O activities. The OpenCV module, which is required for mouse activities, must be utilized to perform mouse actions in Python. The hands are captured in real time via a webcam [9, 10]. Only the colored fingertips are extracted from the video using a certain procedure. Many actions are carried out to keep track of the cursor once the center of gravity of the palm is calculated using the relative positions of the fingertips. The technology automatically understands voice commands and hand movements without the use of additional hardware, using state-of-the-art machine learning and computer vision algorithms as shown in Fig. 2 [5].
Voice Assistant
Friday is a chat bot that utilizes speech recognition, language processing techniques, and audio synthesis to provide relevant information or perform specified tasks for its users. By analyzing the user's speech, Friday can extract keywords and disregard extraneous sounds to provide accurate responses. The compact nature of chat bots
Fig. 1 Virtual mouse using hand gesture [4]
Fig. 2 Architecture of virtual mouse [5]
allows them to be incorporated into various devices, offering software-based functionality. The increasing use of voice commands for smart devices has led to the development of products like Google Home, which can control multiple devices through voice commands [11, 12]. Chat bots like Friday can be integrated into various devices, including home appliances, speakers, and personal computers. With the expanding range of integration options, chat bots like Friday can perform tasks ranging from simple voice commands to complex data analysis [13, 14]. As a result, chat bots are increasingly being used in various industries to improve efficiency and productivity, making them a crucial aspect of modern technology.
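As a rough illustration of the gesture-control module described in this section (webcam capture plus MediaPipe hand-landmark detection), the sketch below prints the index-fingertip position for each frame; the camera index and confidence thresholds are assumptions, and this is not the authors' implementation.

```python
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1,
                                 min_detection_confidence=0.7,
                                 min_tracking_confidence=0.7)
draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)  # default webcam (assumed camera index)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB input; OpenCV delivers BGR frames.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            draw.draw_landmarks(frame, hand, mp.solutions.hands.HAND_CONNECTIONS)
            tip = hand.landmark[8]  # landmark 8 = index fingertip (normalized coords)
            h, w, _ = frame.shape
            print("Index fingertip at", int(tip.x * w), int(tip.y * h))
    cv2.imshow("Hand tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```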
4 Methodology
4.1 Methodology of Virtual Mouse
The proposed system can be activated initially using either the gesture control software or the voice assistant program, and each can launch the other. For the gesture control software, users' gestures are recorded using a webcam, and each frame is run through the finger-identifying module in MediaPipe to identify landmark points. With the aid of some computation, a gesture is detected using these landmarks. A controller class then executes the corresponding actions. This is repeated frequently. The voice assistant program uses a microphone to record user input, and the commands are recognized. This project has the following gesture control functionalities:
1. Position the cursor.
2. A stop sign.
3. Left cursor.
4. Click twice.
5. Scrolling.
6. Pick and throw.
Fig. 3 Coordinates or landmarks in the hand [9]
7. Selecting many folders.
8. Controlling volume.
Using Camera by OpenCV
The video feed of the webcam-connected laptop or desktop computer is used by the virtual mouse to record images. The camera is started, and a video capture function is called using the Python package OpenCV. Video frames are continuously captured by the webcam and sent to the AI virtual system for analysis.
Recording the Video and Processing
The webcam is used by the virtual mouse to collect every single frame until it finishes the work. To track the finger in the video, as seen in Fig. 3 [9], BGR video frames are transformed into RGB images.
Navigating the Window Rectangle Area
The virtual mouse technology translates fingertip coordinates from the visible webcam frame to the proportions of the desktop screen so that the cursor can be moved, using a coordinate transformation. The hand is bounded by a rectangular box within the frame, and the identified finger movement inside this area is fed to the specified function.
Identifying the Raised Finger Which Performs the Specific Mouse Action
In this phase, fingers are identified using the finger (tip) IDs that are already defined in MediaPipe, and the positions of the raised fingers determine the function to perform.
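One common way to implement the raised-finger check described above is to compare each fingertip landmark with the joint below it, using MediaPipe's landmark numbering (tip IDs 4, 8, 12, 16, 20); the simplified helper below follows that idea and is only an assumption about how such a check might look, not the authors' exact rule.

```python
# MediaPipe hand landmark ids: 4=thumb tip, 8=index tip, 12=middle tip,
# 16=ring tip, 20=pinky tip; the landmark two indices lower is the joint below the tip.
TIP_IDS = [4, 8, 12, 16, 20]

def fingers_up(hand_landmarks):
    """Return five 0/1 flags: [thumb, index, middle, ring, pinky]."""
    lm = hand_landmarks.landmark
    states = []
    # Thumb: compare x-coordinates (a simplification that assumes a right hand
    # facing the camera); other fingers: tip above the lower joint means "up".
    states.append(1 if lm[4].x < lm[3].x else 0)
    for tip in TIP_IDS[1:]:
        states.append(1 if lm[tip].y < lm[tip - 2].y else 0)
    return states

# Example: [0, 1, 1, 0, 0] -> index and middle fingers raised (the cursor-move gesture).
```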
Use of Computer Vision for Mouse Functions that Rely on Hand Motions and Hand Tip Tracking to Move the Mouse Cursor by the Computer Frame
If the tip ID indicates that the index finger is raised, or if both the index finger and middle finger are raised, the Python Autopy library can be used to move the mouse across the screen, as shown in Fig. 3 [9].
Algorithm to Use Proposed Model
START: Initialize the system and start video capturing.
1. Capture frames using the webcam.
2. Detect hands and hand tips using MediaPipe and OpenCV.
3. If both the middle and index fingers are up, move the cursor around the window.
4. If the index finger is down and the middle finger is up, perform a left click.
5. If the index finger is raised and the middle finger is lowered, a right click is executed.
6. If the index finger and the middle finger are joined together, a double click is executed.
7. If the index finger and thumb are brought together and then dragged in a certain direction, scroll in that direction.
8. To perform drag and drop: a. Put the index and middle fingers together. b. Close all the fingers to select.
9. To select multiple files:
10. Close all the fingers.
11. To control volume: a. Put the thumb and index fingers together. b. Move along the vertical axis to increase or decrease the volume.
12. To adjust screen brightness: a. Put the thumb and index fingers together. b. Move along the horizontal axis to increase or decrease the brightness.
13. If all fingers are up, return to step 3.
END
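To make the algorithm above more concrete, the following sketch maps a fingertip position from the camera rectangle to the desktop screen and dispatches a few of the listed gestures to mouse actions. It uses pyautogui as a stand-in for the Autopy library mentioned earlier, and the frame size, margin, and gesture subset are assumptions; gestures that depend on fingertip distances (double click, drag and drop, volume, brightness) are omitted for brevity.

```python
import numpy as np
import pyautogui

SCREEN_W, SCREEN_H = pyautogui.size()
FRAME_W, FRAME_H = 640, 480   # assumed webcam resolution
MARGIN = 100                  # inner rectangle used for cursor mapping

def to_screen(x_px, y_px):
    """Map a fingertip position inside the camera rectangle to screen coordinates."""
    sx = np.interp(x_px, (MARGIN, FRAME_W - MARGIN), (0, SCREEN_W))
    sy = np.interp(y_px, (MARGIN, FRAME_H - MARGIN), (0, SCREEN_H))
    return sx, sy

def dispatch(fingers, x_px, y_px):
    """Dispatch a small subset of the gesture table (steps 3-5 above)."""
    if fingers == [0, 1, 1, 0, 0]:       # index + middle up -> move the cursor
        pyautogui.moveTo(*to_screen(x_px, y_px))
    elif fingers == [0, 0, 1, 0, 0]:     # index down, middle up -> left click
        pyautogui.click()
    elif fingers == [0, 1, 0, 0, 0]:     # index up, middle down -> right click
        pyautogui.click(button="right")
```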
4.2 Methodology of Voice Assistant
The following tasks are performed by the project using the voice assistant:
1. Launching and ceasing gesture recognition.
2. Google search; using Google Maps to locate a location.
3. File browsing.
Fig. 4 Friday launch in command prompt [10]
4. Date and time.
5. Copy and paste.
6. Voice assistance for sleep/wake and exit.
The steps in the implementation are:
1. Speech-to-text conversion.
2. Responding using keywords.
3. Responding to the user.
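As a hedged illustration of the three implementation steps just listed, the sketch below converts speech to text, matches a few keywords, and responds to the user with synthesized speech; the speech_recognition and pyttsx3 packages, as well as the example keywords, are assumptions, since the paper does not disclose Friday's internal libraries.

```python
import speech_recognition as sr
import pyttsx3

recognizer = sr.Recognizer()
tts = pyttsx3.init()

def speak(text):
    tts.say(text)
    tts.runAndWait()

def listen_once():
    """Step 1: convert speech to text using the default microphone."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio).lower()
    except (sr.UnknownValueError, sr.RequestError):
        return ""

def respond(command):
    """Steps 2-3: match keywords and respond to the user (illustrative keywords only)."""
    if "launch gesture" in command:
        speak("Launching gesture recognition")   # would start the gesture module here
    elif "time" in command:
        from datetime import datetime
        speak(datetime.now().strftime("The time is %H:%M"))
    elif "exit" in command:
        speak("Goodbye")
        return False
    return True

while respond(listen_once()):
    pass
```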
5 Implementation of Voice Assistant
5.1 Implementation of Voice Assistant
To implement the voice assistant, we need to run Friday.py in the command prompt or Anaconda prompt. Figure 4 [10] shows how to run the command.
5.2 Implementation of Virtual Mouse
Running Gesture Controller.py on the Anaconda prompt will launch the gesture control interface, as shown in Fig. 5 [8]. Python, HTML, CSS, JavaScript, and the Anaconda platform are the languages and technologies used for the implementation. We can also launch the virtual mouse through Friday by giving the command "Launch Gesture", as shown in Fig. 6 [1].
Fig. 5 Launching virtual mouse through command prompt [8]
Fig. 6 Launching virtual mouse through Friday [1]
6 Result and Analysis
The proposed model has been tested multiple times to analyse it and find its accuracy; Tables 1 and 2 [4, 5] show the accuracy of the various functions of the proposed model. The experimental test results for the virtual mouse system (Table 1) revealed an impressive overall accuracy of about 95%, indicating excellent performance. However, the accuracy was somewhat lower for gestures such as drag and drop and multiple selection, which proved more challenging for the system to recognize accurately.
Table 1 Test results for mouse [4]
Mouse actions executed   Gain   Loss   Accuracy (%)
Mouse movement           100    0      100
Left button click        97     3      97
Right button click       95     5      95
Scroll function          98     2      98
Brightness control       91     9      91
Volume control           91     9      91
Drag and drop            89     11     89
Multiple selection       90     10     90
No action performed      100    0      100
Result                   851    49     95.28
Table 2 Test results for voice assistant [5]
Voice assistant function performed   Gain   Loss   Accuracy (%)
Launch/stop gesture recognition      100    0      100
Google search                        98     2      98
Finding a location on Google Maps    99     1      99
File navigation                      83     17     83
Date and time                        95     5      95
Copy paste                           86     14     86
Sleep/wake voice assistant           99     1      99
Exit                                 93     7      90
Test Cases
The outputs of a few of the features present in the proposed model are shown in Figs. 7, 8, 9, 10 and 11.
Fig. 7 Moving the cursor [11]
Fig. 8 Double click [12]
Fig. 9 Drag and drop and multiple select [13]
Fig. 10 Friday finding location [12]
Fig. 11 Friday launch [14]
7 Conclusion and Future Work
In this project, we developed a method that uses a live camera to control the mouse pointer and a voice assistant to perform actions. Computer vision is the system's foundation, and it is capable of performing all mouse and keyboard tasks. Yet, because hand appearance and skin tone vary so widely across people, it is challenging to obtain consistently solid results. Using this technique would make presentations simpler and save workspace. With
the use of the palm, several fingers, and voice commands, it offers functions like window closing, window enlargement, browsing, etc., so that persons with physical disabilities can utilize PCs and laptops as intelligently as healthy people. The major contribution of this system is to make the interaction completely virtual. With today's technology and AI systems like ChatGPT, this project can be improved to a higher level.
References 1. Design and development of hand gesture based virtual mouse. IEEE conference publication. IEEE Xplore [11] introduction to MediaPipe. Learn OpenCV 2. Angel NPS (2013) Real time static & dynamic hand gesture recognition. Int J Sci Eng Res 4(3) 3. Hsieh C-C, Liou D-H (2010) A real time hand gesture recognition system using motion history image. ICSPS 4. Erdem E, Yardimci Y, Atalay V, Cetin AE (2002) Computer vision based mouse. In: International conference on acoustics, speech, and signal processing. Proceedings (ICASS). IEEE 5. Ghosh A, Banerjee A (2018) Mouse control using a web camera based on color detection 6. Banerjee A, Ghosh A, Bharadwaj K (2014) Mouse control using a web camera based on color detection. IJCTT 9 7. Park H (2008) A method for controlling the mouse movement using a real time camera. Brown University, Providence, RI, USA, Department of Computer Science 8. Virtual mouse implementation using OpenCV. IEEE conference publication. IEEE Xplore 9. Raheja JL, Chaudhary A, Singal K. Proposed using HSV algorithm but this uses special sensor is used to capture image and processes it. User has to spend more money for the sensor 10. Qiang B, Zhai Y, Zhou M (2021) SqueezeNet and fusion network-based accurate fast fully convolutional network for hand detection and gesture recognition. IEEE 11. OpenCV. Overview. GeeksforGeeks 12. Wu YM (2009) The implementation of gesture recognition for media player system. Master Thesis of the Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan 13. Lai H (2009) The following robot with searching and obstacle avoiding. Master Thesis of the Dept. of Electrical Engineering, National Central University, Chung-Li, Taiwan 14. Tu YJ, Lin HY (2007) Human computer interaction using face and gesture recognition. Master Thesis of the Department of Electrical Engineering, National Chung Cheng University, Taiwan
Optimal Crop Recommendation by Soil Extraction and Classification Techniques Using Machine Learning Y. B. Avinash and Harikrishna Kamatham
Abstract Soil largely determines food production, especially in farming and agriculture. Different crops require different soils. Many researchers have published on types of soils and soil recommendations. To recommend a soil, a deep study based on many parameters is required. Machine learning is a concept that can be applied to a wide range of applications, and measuring the characteristics of soil can be done through machine learning. Many algorithms are available in machine learning for various processes. This paper focuses on using machine learning algorithms for feature extraction, feature selection, model design, and feature classification of soil parameters. Based on the classification of soils, suitable algorithms like HMM-based DNN are considered, which have given the best results in terms of extraction and classification. Promisingly, the accuracy was more than 90%. This model helps farmers to improve their analysis in the selection of a crop. Keywords HMM-based DNN · Machine learning algorithms
1 Introduction
Agriculture is the main activity through which individuals sustain themselves: 78.32% of food is generated by agriculture, and the rest is generated by animals. Individuals can stay healthy through agriculture, which therefore needs to be managed well. There are many constraints in making agriculture perfect, and one of the important factors is soil, a parameter determined by the land [1]. Agriculture is the major profession in many countries, and agricultural land occupies a major area all over the world. The world has 57.51 million square kilometers of land, of which 37.39 million square kilometers is agricultural land, and 65.01% of the agricultural land is used for farming. Across India, the land area is 3.287 million square kilometers, of which 1.81 million square kilometers is agricultural land [2, 3]. Many individuals have
Fig. 1 Types of soils
agriculture farming as their profession. There are many types of soils; the most common ones are listed below and shown in Fig. 1.
• Loamy Soils
• Sandy Soils
• Clay Soils
• Chalky Soils
• Silty Soils
• Peaty Soils, etc.
Based on the soils, various crops are grown [4–8]. The crops are classified as given below; some types of crops are shown in Fig. 2.
• Food Crops
• Forage Crops
• Fiber Crops
• Oil Crops
• Ornamental Crops
Fig. 2 Types of crops
• Industrial Crops.
For almost two decades, data mining researchers have worked on various soils and recommended suitable soil characteristics [9–12]. Recently, the advanced technology of machine learning has produced attractive results in the optimization and recommendation of soils. The overall procedure is to study the soil thoroughly. Basically, the study consists of various stages such as preprocessing, feature extraction, feature selection, and feature classification.
2 Basic Architecture
The study of soil and the recommendation of optimal results is not a simple task: various parameters have to be considered and processed. The architecture follows this process, and the many operations performed are shown in Fig. 3. A detailed description of each process is given in the following sections [13–17]. The proposed system takes various soils and crop yields as input and forms several datasets, each of which is processed to generate good accuracy. The idea originates from data mining, but advanced machine learning algorithms are used. The architecture is simple, and many ANN algorithms are suitable for this application. Accuracy is the major factor for obtaining the optimization results.
Fig. 3 Basic architecture (Input Dataset → Data Preprocessing → Feature Extraction → Feature Selection → Model Design → Feature Classification → Postprocessing of Data → Accuracy Output)
Data Set: Data mining was used to determine soils and their parameters, and data is collected on various parameters. Many datasets are available on websites; this paper uses the Kaggle site for the process, as it has very adequate collections of information.
Data Preprocessing: Preprocessing is the first process applied to the data. There are many techniques used in preprocessing [18]; those applied here are listed below. The preprocessed output samples are clean samples that can be used for further processing:
• Formatting
• Cleaning
• Sampling
• Quantization
• De-noising.
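The paper does not show its preprocessing code; the following is a small, assumed sketch of how the listed steps could be applied to a tabular soil dataset with pandas, where the file name and column names (rainfall, moisture) are hypothetical.

```python
import pandas as pd

df = pd.read_csv("soil_data.csv")   # hypothetical Kaggle export

# Formatting: normalize column names.
df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]

# Cleaning: drop duplicate rows and rows with missing measurements.
df = df.drop_duplicates().dropna()

# Sampling: draw a reproducible subset for quick experiments.
sample = df.sample(frac=0.8, random_state=42)

# Quantization: bin a continuous feature (e.g. rainfall) into discrete levels.
sample["rainfall_level"] = pd.cut(sample["rainfall"], bins=4,
                                  labels=["low", "medium", "high", "very_high"])

# De-noising: smooth a noisy sensor-style column with a rolling median.
sample["moisture_smooth"] = sample["moisture"].rolling(window=3,
                                                       min_periods=1).median()
```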
Feature extraction: Feature extraction is done by using ANN. Features extracted are stated below [19]. These features were extracted for different types of soils:
• Texture
• Color
• Moisture
• Density
• Minimum temperature
• Maximum temperature
• Average temperature
• Structure
• Rainfall
• Stability
• Humidity
• Porosity
• Consistence.
Feature selection: Not all of the extracted features are useful, so only the required features are selected using a feature selection process. Principal component analysis (PCA) is used for selecting the features.
Feature Classification: The most important step is classification, in which the selected features are classified. Generally, there are many techniques available for classification [20]; this paper uses five techniques and generates the accuracy of each for the considered features. Most of these techniques are commonly used across a wide range of applications:
• HMM-based DNN
• Naive Bayes classifier
• Random Forest
• Neural Networks
• SVM.
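As a hedged sketch of this selection-plus-classification comparison, the snippet below runs PCA-based feature reduction followed by four of the listed classifiers in scikit-learn; the HMM-based DNN hybrid has no off-the-shelf estimator and is therefore omitted here, and the feature matrix X and labels y are random placeholders standing in for the soil dataset described above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Placeholder data: rows = soil samples, columns = the 13 extracted features;
# y = soil/crop label (6 classes, matching the six soil types).
X = np.random.rand(300, 13)
y = np.random.randint(0, 6, size=300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

classifiers = {
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000),
    "SVM": SVC(kernel="rbf"),
}

for name, clf in classifiers.items():
    # PCA performs the feature selection/reduction step before each classifier.
    pipe = make_pipeline(StandardScaler(), PCA(n_components=8), clf)
    pipe.fit(X_tr, y_tr)
    print(f"{name}: accuracy = {pipe.score(X_te, y_te):.3f}")
```

In practice, the random placeholders would be replaced by the Kaggle soil dataset and extracted features described in this paper.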
The DNN is interfaced with an HMM (hidden Markov model), which provides an effective probability distribution over the data. A DNN or an HMM alone is not as powerful; the combination of these two techniques has shown better results in almost all applications, and a DNN generally outperforms a plain ANN. This is essentially semisupervised learning. Pure pattern-recognition classification does not perform best in the agriculture field, since classification is a smooth process that depends entirely on various parameters. The chosen classification therefore joins various techniques: there are papers where SVM is combined with ANN, Naive Bayes is combined with ANN, and so on, while this paper mainly focuses on DNN combined with HMM. Figure 4 shows the HMM transition structure together with the DNN.
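The paper does not specify how the DNN and HMM are coupled; purely as a generic illustration of the hybrid idea — DNN softmax posteriors used as HMM emission scores and decoded with the Viterbi algorithm — the following sketch may help, with all matrices being toy values rather than quantities from this study.

```python
import numpy as np

def viterbi(posteriors, trans, prior):
    """Decode the most likely state path given per-step DNN class posteriors.

    posteriors: (T, N) DNN softmax outputs used as HMM emission scores
    trans:      (N, N) state transition matrix
    prior:      (N,)   initial state distribution
    """
    T, N = posteriors.shape
    log_delta = np.log(prior) + np.log(posteriors[0])
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = log_delta[:, None] + np.log(trans)          # (N, N) transition scores
        back[t] = scores.argmax(axis=0)
        log_delta = scores.max(axis=0) + np.log(posteriors[t])
    path = [int(log_delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Toy example with 3 hidden states and "sticky" transitions (all values assumed).
post = np.array([[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.1, 0.2, 0.7]])
trans = np.full((3, 3), 0.1) + np.eye(3) * 0.7
print(viterbi(post, trans, prior=np.full(3, 1 / 3)))
```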
Fig. 4 HMM and DNN
3 Results
The dataset is taken directly from the website. Most researchers use this data, so comparisons can be made easily [21–45]. The dataset is shown in Fig. 5. The Python platform is used for obtaining the results. The extracted features are shown in Table 1; the ANN technique is used for extracting these features. The classification accuracies are shown in the graph below (Fig. 6). As per the obtained results, HMM-based DNN is recommended as the better classifier for obtaining the optimal crops.
Fig. 5 Datasets
Table 1 Extracted features in various types of soils

Soil/feature          Loamy soils   Sandy soils      Clay soils    Chalky soils   Silty soils   Peaty soils
Texture               3             8                10            7              12            6
Color                 Pale brown    Brownish black   Dark brown    Gray brown     Dark brown    Pale brown
Moisture              52.6          47.8             61.9          57.12          72.45         39.08
Density               1.58          1.98             1.12          1.45           1.83          1.22
Minimum temperature   27.5          29.2             28.54         29.23          27.12         27.36
Maximum temperature   35.61         37.89            36.32         36.85          37.19         37.82
Average temperature   32.81         33.42            31.59         35.72          36.81         35.97
Structure             Round         Round            Cube          Cube           Round         Round
Rainfall              605           397              430           513            612           472
Stability             33.7          31.4             67.2          47.9           65.1          57.3
Humidity              0.029         0.312            0.56          0.89           0.91          0.17
Porosity              0.25–0.40     0.32–0.41        0.28–0.31     0.28–0.31      0.25–0.40     0.32–0.41
Consistence           Loose         Very Loose       Dense         Very Dense     Dense         Loose
4 Conclusion
This paper demonstrates the use of machine learning to obtain optimal crop recommendations. Figure 6 shows the optimal crop recommendation analysis using different machine learning algorithms. Many techniques were used and compared, among which the HMM-based DNN classifier provides the best results, with more than 90% accuracy. It is also concluded that the PCA technique for feature selection provided excellent results, and ANN for feature extraction has given very good responses with prominent features. The research showed that DNN is better than ANN in this application. This paper recommends that farmers select crops based on the soil rather than selecting soil based on the crop. In future, other types of soils can be considered with many more techniques, and combined feature extraction and classification techniques can be used. The proposed work has obtained promising results.
Fig. 6 Optimal crop recommendation using different machine learning algorithms (classification accuracy, in per cent, plotted for Loamy, Sandy, Clay, Chalky, Silty, and Peaty soils)
References 1. Gandge Y, Sandhya P (2017) A study on various data mining techniques for crop yield prediction. In: International conference on electrical, electronics, communication, computer and optimization techniques (ICEECCOT) 2. Shirsath R, Khadke N, More D (2017) Agriculture decision support system using data mining. In: 2017 international conference on I2C2 3. Divya J, Divya M, Janani V (2017) IoT based smart soil monitoring system for agricultural production 4. Laksiri HGCR, Dharmaguna Wardhana HAC, Wijayakula Sooriya V et al (2019) Design and optimization of lot based smart irrigation system in Sri Lanka 5. Math A, Ali L, Pruthviraj U (2018) Development of smart drip irrigation system using IoT 6. Mishra D, Khan A, Tiwari R, Upadhay S (2018) Automated irrigation system-IoT based approach 7. Nageswara Rao R, Sridhar B (2018) IoT based smart crop-field monitoring and automation irrigation system 8. Sushanth G, Sujatha S (2018) IOT based smart agriculture system 9. Vaishali S, Suraj S, Vignesh G, Dhivya S, Udhayakumar S (2017) Mobile integrated smart irrigation management and monitoring system using IOT 10. Anurag D, Roy S, Bandyopadhyay S (2008) Agro-sense: precision agriculture using sensorbased wireless mesh networks. In: ITU-T “innovation in NGN”, kaleidoscope conference, Geneva, 12–13 May 2008 11. Biradar HB, Shabadi L (2017) Review on IOT based multi-disciplinary models for smart farming. IEEE Xplore 12. Jayanthi J, Selvakumar J (2017) A novel framework to facilitate personalized web search in a dual mode. Cluster Comput 20(4):3527–3535; Priya PK, Yuvaraj N (2019) An IoT based gradient descent approach for precision crop suggestion using MLP. J Phys Conf Ser 1362:012038
13. Thiyaneswaran B, Saravanakumar A, Kandiban R (2016) Extraction of mole from eyes clear using object area detection algorithm. In: 2016 international conference on wireless communications, signal processing and networking (WiSPNET). IEEE; Patil A, Beldar M, Naik A, Deshpande S (2016) Smart farming using Arduino and data mining. IEEEXplore. https://iee explore.ieee.org/document/7724599 14. Rahman SAZ, Mitra KC, Mohidul Islam SM (2018) Soil classification using machine learning methods and crop suggestion based on soil series 15. Chiranjeevi MN, Nadagoundar RB (2018) Analysis of soil nutrients using data mining techniques 16. Satalino G, Mattia F, Davidson MW, Le Toan T, Pasquariello G, Borgeaud M (2002) On current requirements of soil suddenness recuperation from ERS–SAR data. IEEE Trans Geo Sci Remote Sens 40(11):2438–2447 17. Khaki S, Wang L (2019) Crop yield prediction using deep neural networks, 10 June 2019. arXiv:1902.02860v3 [cs.LG] 18. Renuka ST (2019) Evaluation of machine learning algorithms for crop yield prediction. Int J Eng Adv Technol (IJEAT) 8(6). ISSN: 2249-8958, Aug 2019 19. Prathibha SR, Hongal A, Jyothi MP (2017) IOT based monitoring system in smart agriculture. In: 2017 international conference on recent advances in electronics and communication technology. IEEE 20. Kaur J (2017) Impact of climate change on agricultural productivity and food security resulting in poverty in India. Università Ca’ Foscari, Venezia, pp 16–18, 23 21. Kumar C, Roy R, Rawat S, Kumar A (2020) Activation map networks with deep graphical model for semantic segmentation. https://doi.org/10.1007/978-981-15-0214-9_89 22. Kumar C, Sharma A, Yadav A, Kumar A (2020) Semantic segmentation of color images via feature extraction techniques. J Phys: Conf Ser 1478:012025. https://doi.org/10.1088/17426596/1478/1/012025 23. Yadav A, Roy R, Kumar A, Kumar C, Dhakad S (2015) De-noising of ultrasound image using discrete wavelet transform by symlet wavelet & filters. https://doi.org/10.1109/ICACCI.2015. 7275776 24. Somwanshi DK, Yadav AK, Roy R (2017) Medical images texture analysis: a review: 436–441. https://doi.org/10.1109/COMPTELIX.2017.8004009 25. Returi KD, Mohan VM, Radhika Y (2015) A novel approach for speaker recognition by using wavelet analysis and support vector machines. In: 2nd international conference on computer and communication technologies—IC3T 2015, during 24–26 July 2015 at CMR technical campus, Hyderabad, Telangana, India, Advances in intelligent systems and computing book series (AISC), vol 379, 1, pp 163–174. https://doi.org/10.1007/978-81-322-2517-1_17 26. Returi KD, Radhika Y (2015) An artificial neural networks model by using wavelet analysis for speaker recognition. In: Proceedings of information systems design and intelligent applications, second international conference on information systems design and intelligent applications (India, 2015). Advances in intelligent systems and computing book series (AISC), vol 340, Issue 2, pp 859–874. https://doi.org/10.1007/978-81-322-2247-7_87 27. Mohananthini N, Yamuna G (2016) Comparison of multiple watermarking techniques using genetic algorithms. J Electr Syst Inf Technol 3(1):68–80. ISSN 2314-7172. https://doi.org/10. 1016/j.jesit.2015.11.009 28. Jagadeesh B, Rajesh Kumar P, Chenna Reddy P (2012) Genetic algorithm approach for singular value decomposition and quantization based digital image watermarking. Int J Eng Res Appl 2(2):1229–1235 29. Jaiswal RJ, Patil NN (2012) Multiple watermarking for text documents: a review. 
World J Sci Tech 2(3):176–179 30. Kallel M, Bouhlel M, Lapayre JC (2010) Use of multi-watermarking schema to maintain awareness in a teleneurology diagnosis platform. Radio Eng 19(1):68–73 31. Isolated Telugu speech recognition on T-DSCC and DNN techniques. Int J Recent Technol Eng 8(11):2419–2423. ISSN: 2277-3878, Sept 2019
32. Continuous Telugu speech recognition on T-LPC and DNN techniques. Int J Recent Technol Eng 8(3):4728–4731. ISSN: 2277-3878, Sept 2019 33. Continuous Telugu speech recognition through combined feature extraction by MFCC and DWPD using HMM based PNN techniques. Int J Pure Appl Math 118(20):865–872. ISSN: 1311-8080 (printed version). ISSN: 1314-3395 (on-line version) 34. Continuous Telugu speech recognition through combined feature extraction by MFCC and DWPD using HMM based DNN techniques. Int J Pure Appl Math 114(11):187–197. ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version) 35. Speech recognition with combined MFCC, MODGDF and ZCPA features extraction techniques using NTN and MNTN conventional classifiers for Telugu language, SOCTA, Springer Nature Singapore Pte Ltd. Book ID: 430915_1_En, Book ISBN: 978-981-10-5698-7, Chap. 66, pp 1–10. https://doi.org/10.1007/978-981-10-5699-4_66 36. Telugu speech recognition using combined MFCC, MODGDF feature extraction techniques and MLP, TLRN classifiers, SOCTA, Springer Nature Singapore Pte Ltd. Book ID: 430915_ 1_En, Book ISBN: 978-981-10-5698-7, Chap. 65, pp 1–10. https://doi.org/10.1007/978-98110-5699-4_65 37. Continuous Telugu speech recognition through combined feature extraction by MFCC and DWPD using HMM based DNN techniques. Int J Pure Appl Math. ISSN 1314-3395, Sept 2017 38. Review of speech recognition on South Indian Dravidian languages. Indian J Control Theor Appl 10(31):225–233. ISSN 0974-5572, May 2017 39. MFCC based Telugu speech recognition using SVM technique. Indian J Control Theor Appl 9(46):105–113. ISSN 0974-5572, Dec 2016 40. Security enhanced image watermarking using mid-band DCT coefficient in YCbCr space. Indian J Control Theor Appl 9(23):271–278. ISSN 0974-5572, May 2016 41. Speech recognition using arithmetic coding and MFCC for Telugu language. In: Proceedings of IEEE digital library, pp 391–394. ISSN 0973-7529; ISBN 978-93-8054420-5, Mar 2016 42. Segmentation on moving shadow detection and removal by symlet transform for vehicle detection. In: Proceedings of IEEE digital library, pp 385–390. ISSN 0973-7529; ISBN 978-93-8054420-5, Mar 2016 43. De-noising of color image using median filter. In: Proceedings of IEEE digital library, 978-15090-0148-4, Dec 2015 44. De-noising of ultrasound image using discrete wavelet transform by symlet wavelet and filters. In: Proceedings of IEEE digital library, pp 1204–1208. ISBN: 978-1-4799-8790-0, Aug 2015 45. Wavelet based texture analysis for medical images. Int J Adv Res Electr Electron Instrum Eng 4(5):3958–3963, May 2015
IoT-Based Smart Irrigation System in Aquaponics Using Ensemble Machine Learning Aishani Singh, Dhruv Bajaj, M. Safa, A. Arulmurugan, and A. John
Abstract Aquaponics is a sustainable farming method that combines aquaculture and hydroponics to grow plants and fish in a closed-loop system. In this research paper, an irrigation system based on aquaponics is proposed, which uses real-time sensor data from the fish tank and crop soil to improve the efficiency of the system. The system is designed to make informed decisions about crop irrigation needs by visualizing the data for analytics. The study compares the accuracy of three classification algorithms, KNN, Naive Bayes, and ANN, to decide when to irrigate the soil based on real-time sensor data. The proposed irrigation system includes two sets of sensors, one for the fish tank and the other for the crop soil, which is processed by an Arduino board and sent to Adafruit’s cloud platform for visualization and analytics. This cloud-based platform allows easy access to real-time data, enabling efficient monitoring and control of the irrigation system. Additionally, the study visualizes the results obtained from using regular water and lake water in the aquaponics system. Keywords Aquaponics · IoT · Smart monitoring · Sustainable farming · Machine learning · Smart agriculture
A. Singh · A. Arulmurugan Department of Computing Technologies, SRM Institute of Science and Technology, Kattankulathur, Chennai, India e-mail: [email protected] D. Bajaj · M. Safa (B) Department of Networking and Communications, SRM Institute of Science and Technology, Kattankulathur, Chennai, India e-mail: [email protected] D. Bajaj e-mail: [email protected] A. John Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_17
1 Introduction Agricultural practices are rapidly changing in the face of the increasing demand for food and water resources. Traditional methods of irrigation have proven to be inefficient, causing water scarcity and soil degradation. The concept of aquaponics, a combination of aquaculture and hydroponics, offers an alternative solution to sustainable agriculture. It utilizes a closed-loop system where fish and plants are grown in the same environment, creating a symbiotic relationship where the waste from fish is utilized as nutrients for plants, and the plants clean the water for the fish. Fish waste is converted into nitrates by bacteria present in the soil, which can be readily absorbed by the plants. This system also provides the plants with a much more consistent nutrient supply as compared to traditional irrigation systems. Experts say that the aquaponic way of irrigation conserves as much as 90% of the water that is wasted in traditional irrigation systems. It is also observed that plants grow faster in this system because nutrients are readily available to them and they do not have to expend energy searching for them. Overall, the nutrient-rich, controlled environment of an aquaponics system provides the ideal conditions for plant growth, making it a superior option for growing plants. The proposed system takes various samples of tap water as well as Potheri Lake water with varying differences in pH, turbidity, and water temperature. The pH level of lake water can vary widely depending on various factors such as the surrounding environment, the presence of aquatic life, and the weather conditions. In general, lake water can have a lower or higher pH level than tap water, depending on the lake’s distinctive characteristics and the way the local area’s tap water is treated. For instance, tap water is usually treated to have a pH level between 6.5 and 8.5, which is considered safe for human consumption and does not cause corrosion in pipes. The naturally occurring presence of suspended particles like silt, sand, and algae in lake water can cause it to have a greater turbidity level than tap water.
2 Literature Review The hardware design of one system [1] includes various components such as sensors, actuators, relays, an Ethernet shield, Arduino, and routers. The monitoring of temperature and humidity is accomplished using an Arduino microcontroller, an Ethernet Shield, an FC-28 humidity sensor, and a DHT11 temperature sensor. A router is utilized to connect the Arduino Ethernet to a server. The relay module functions as a control switch for various actuators, which respond to the sensor output. The aquaponic box contains several actuators, such as two 12V DC exhaust fans, a mist maker, a 5V DC fan, two LED Grow Light lamps, and a 12V DC pump [2]. Another system kept track of numerous environmental factors, and over time it would be able to control how the farm operated and create a fully automated system. Publishers (nodes), brokers (local servers), and subscribers (local servers) all communicated via
the MQTT protocol, which allows for the use of any data type. Monitoring can also be done with the help of a GUI program, a communication network, and [3] wireless sensors. A graphical user interface is provided by AWSM to address the aforementioned issues. Farmers can get real-time notifications and advice about water quality even when they are not there by using AWSM’s mobile applications built on IoT technology. When compared to the conventional approach, the introduction of the AWSM-based IoT application in the Aquaponics system has shown substantial gains. Additionally, it is feasible to [4] investigate many characteristics observed by smart systems, IoT technologies, and aquaponics. A ZigBee module [5] can be used in place of the Wi-Fi module to broadcast the data gathered by the Arduino while also displaying the values on the LCD without performing any control actions. A fish tank, a grow bed, and a light panel that simulates the sun’s rays for plant growth make up the aquaponics system described in [6]. The fish waste provides nutrients to the plants, which in turn receive water from the fish tank as a filter. The system makes use of actuators to operate the water pump, light panel, heater, and oxygen provider as well as sensors to monitor and control pH levels, temperature, and humidity. To direct the actual citrus production process, nutrient monitoring [7] in citrus plants can also be done in real time according to the soil situation. Another method that primarily focuses on finding magnesium and nitrogen deficiencies in image processing [8] in MATLAB for the detection of rice illnesses and nutritional deficiencies. A Raspberry Pi, a DHT11 temperature and humidity sensor, and solenoid valves make up the hardware employed in this technique. A deep learning method can help in the detection of diseases and the prediction of crop growth by utilizing many sensors that measure the pH value, temperature, humidity, and water level [9]. Astute farming is described by the author of reference [10] as the integration of a wireless sensor and irrigation system that keeps track of factors including soil moisture, nutritional content, and pH levels. A GSM module is used to manage the system via wireless communication.
3 Architecture Diagram The proposed system shown in Fig 1 comprises two sets of sensors—one for the fish tank and the other for the crop soil. The data from these sensors are sent to the Arduino board. The Arduino board processes the data and sends it to the NodeMCU Wi-Fi module which sends it to the Adafruit cloud platform for visualization and analytics [11–15]. The system gathers real-time sensor data from the fish tank and crop soil using a variety of sensors, including a soil moisture sensor, temperature and humidity sensor, pH sensor, turbidity sensor, and water temperature sensor. The physical layer or sensor is made up of this. The data is gathered and transferred to the Arduino board regularly.
Fig. 1 Architecture diagram of proposed system
4 Methodology The construction of an aquaponics monitoring system for effective irrigation is the main goal of the suggested system. The system uses a variety of sensors, including a soil moisture sensor placed in a tomato plant, pH, water temperature, and turbidity sensors. The irrigation process is controlled by the readings from these sensors. Hardware Setup: The hardware setup consists of an Arduino board that, via Arduino programming in the Arduino IDE, receives data from the sensors. The NodeMCU Wi-Fi module, which is set up to work on any internet network, receives the data after that. A personal hotspot or local Wi-Fi network was used for this project. The data is sent from the NodeMCU to Adafruit.io, a cloud-based platform for IoT data analytics and visualization. On this platform, real-time data changes can be tracked and examined. Every 30–45 s, the sensors update their readings. A CSV file containing the data gathered from the Adafruit.io platform can be downloaded. Then, using the downloaded data, three classification algorithms—KNN, ANN, and Naive Bayes—are trained and tested. To forecast the soil’s irrigation needs, data is analyzed. Jupyter Notebook is used to analyze and visualize machine learning. Utilizing visualization techniques, the three classification systems are compared for accuracy. The system’s overall goal is to create an aquaponics monitoring system that controls irrigation by using a variety of sensors. Machine learning algorithms are used to analyze the sensor data and estimate the need for irrigation, which can result in more effective irrigation and higher agricultural yields.
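A minimal sketch of the analytics step just described, assuming the CSV exported from Adafruit.io has one column per sensor plus a binary irrigate/do-not-irrigate label; the file and column names are illustrative, not taken from the paper.

```python
# Sketch of the classification step: KNN, Naive Bayes and a small ANN (MLP)
# trained on the sensor CSV and compared on a held-out 20% test split.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("adafruit_export.csv")          # hypothetical file name
X = df[["soil_moisture", "temperature", "humidity", "ph", "turbidity"]]  # illustrative columns
y = df["irrigate"]                                # 1 = irrigate, 0 = do not irrigate

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "ANN": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))
```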
4.1 Machine Learning Algorithms Used KNN (K-Nearest Neighbor). To use KNN, we must first select a value for k, which specifies the number of neighbors to take into account. Then, when we have a fresh batch of data to categorize, we determine how far away each point in the training set
is from the fresh data and select the k-nearest points. Finally, we designate the class for the new data point as being the one that is most prevalent among those k points. Naïve Bayes’ classifier. Based on the Bayes theorem, which quantifies the likelihood of a hypothesis given evidence, it is a probabilistic categorization model. The term “naive” refers to the assumption made in Naive Bayes that the features are conditionally independent of the class. ANN (Artificial Neural Network). It is a machine learning model that takes its cues from how the human brain is organized and works. It is made up of many interconnected layers of neurons that analyze data and produce predictions. The features of the instance to be classified are sent to the input layer of an ANN classification model, and the predicted class is generated by the output layer. There are one or more hidden layers in between that process the input and transmit data to the output layer. The weighted total of the inputs is applied by each neuron in the hidden layers using a nonlinear activation function.
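To make the KNN procedure above concrete (pick k, compute the distance from the new sample to every training point, and take a majority vote among the k nearest), here is a small NumPy sketch with toy data; it is didactic only, not the implementation used in the study.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify one sample by majority vote among its k nearest training points."""
    # 1. Euclidean distance from the new sample to every training sample.
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # 2. Indices of the k nearest neighbours.
    nearest = np.argsort(distances)[:k]
    # 3. Most common class label among those neighbours.
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy usage: two features (e.g. soil moisture, temperature), binary irrigate label.
X = np.array([[20, 31], [22, 30], [55, 26], [60, 25]])
y = np.array([1, 1, 0, 0])
print(knn_predict(X, y, np.array([58, 27]), k=3))  # -> 0 (do not irrigate)
```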
5 Components Used 5.1 Hardware The proposed irrigation system based on aquaponics uses various components to measure and transmit real-time data to the cloud platform for visualization and analytics. The sensors used in the system are summarized below. Soil moisture sensor (LM393). An electrical tool called the soil moisture sensor (LM393) measures the amount of water in the soil. It measures the soil moisture using the LM393 comparator integrated circuit and outputs a digital signal based on the predetermined moisture threshold. This sensor comprises two probes that are put into the soil; as a function of the moisture content of the soil, the resistance between the probe changes. Temperature and humidity sensor (DHT11). A widely used electrical device that can detect both temperature and humidity levels in the environment is the DHT11 temperature and humidity sensor. It uses a capacitive humidity sensor to monitor humidity levels and a Negative Temperature Coefficient (NTC) thermistor to measure temperature. The sensor sends digital outputs of temperature and humidity data to an 8-bit microcontroller through a one-wire protocol. pH sensor and turbidity sensor: A pH sensor is an electronic device that determines whether a solution is acidic or alkaline. The pH scale, which has a range of 0–14 with 0 being the most acidic, 14 being the most alkaline, and 7 being neutral, serves as the basis for how it functions. The fish tank’s water clarity, which is crucial for fish health, is measured by the turbidity sensor.
Water temperature sensor. A common sensor for measuring the temperature of liquids, including water, is the DS18B20. This sensor is a digital thermometer that communicates with other devices via a 1-Wire interface. With an accuracy of 0.5 °C in the range of − 10 to + 85 °C, the DS18B20 can measure temperatures from − 55 to + 125 °C. Arduino board. Arduino Uno is based on the ATmega328P microcontroller. It is an open-source hardware board created for people of all skill levels who want to develop a variety of electronic projects. The board contains a USB port, a power jack, six analog inputs, an ICSP header, and 14 digital input/output pins. The Arduino programming language, which is based on C/C++, can be used to program the board. NodeMCU Wi-Fi module. The board is a great option for IoT applications because it combines the capabilities of a microcontroller and a Wi-Fi module. The Arduino IDE or the Lua programming language is used to creating programs for the NodeMCU ESP8266 device. Various sensors, actuators, and other electronic devices can be connected to the board’s GPIO pins. Additionally, it has Wi-Fi connectivity functionality built-in, enabling it to connect to the internet and communicate with other devices. A USB cable or an external power source can be used to power the board.
6 Implementation The sensor data was collected for over a month on an hourly basis. Visualizations were created based on the data. The data was also collected in the form of CSV, and three different binary classification algorithms were applied to it, namely KNN, ANN, and Naive Bayes. The working model of the aquaponic system is shown in Fig. 2.
7 Results The suggested technique gives a thorough strategy for effective aquaponic irrigation management. Real-time sensor data visualization and analytics offer insightful information on fish and agricultural growth conditions. The three machine learning methods’ comparison offers useful insights into effective irrigation prediction in aquaponics. For visualization and analysis, the Adafruit platform can be accessed on a smartphone, laptop, tablet, or any other electronic device that can be connected to the internet. Visualization on Adafruit.io is shown in Figs. 3 and 4.
Fig. 2 Turbidity, water temperature, and pH sensor along with connections
Fig. 3 Real-time values of data acquired from sensors
7.1 Comparing the Accuracy of Various Classification Models The selection of a classification algorithm for agricultural data is influenced by the details of the data, the resources at hand, and the objectives of the classification activity. A standard criterion for assessing a classification model’s performance is accuracy. It calculates the percentage of cases in the dataset that were correctly categorized. To put it another way, accuracy is the proportion of true positives and true negatives in all instances. The model is operating effectively and properly predicting
the class labels of the instances in the dataset when the accuracy value is high. A classification model could be trained to determine whether or not a certain soil sample needs irrigation based on its characteristics, such as its temperature, humidity, and soil moisture content. In this situation, a classification model with high accuracy is helpful because it can make precise predictions about the soil samples' irrigation requirements. As a result, farmers may be better able to decide when and how much to irrigate their crops, which may result in more effective use of water resources and higher crop yields. On comparing the accuracy of all three algorithms, as in Fig. 5, Naive Bayes was observed to have the highest accuracy of 94.12%. This is followed by ANN with an accuracy of 91.18% and then by KNN with an accuracy of 88.24%.
Fig. 4 Variation of different parameters with time
Fig. 5 Grouped bar chart for comparison of the accuracy of different models used
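A short matplotlib sketch that redraws the comparison of Fig. 5 from the accuracies reported above (94.12%, 91.18%, 88.24%); it is a simple bar chart rather than the grouped chart of the figure.

```python
import matplotlib.pyplot as plt

models = ["Naive Bayes", "ANN", "KNN"]
accuracy = [94.12, 91.18, 88.24]   # accuracies reported in the text (%)

plt.bar(models, accuracy, color=["tab:green", "tab:blue", "tab:orange"])
plt.ylabel("Accuracy (%)")
plt.ylim(80, 100)
plt.title("Accuracy of the three classifiers on the irrigation dataset")
for i, acc in enumerate(accuracy):
    plt.text(i, acc + 0.3, f"{acc:.2f}", ha="center")
plt.show()
```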
8 Conclusion In conclusion, we have presented an irrigation system based on aquaponics that utilizes real-time sensor data to monitor and irrigate crops efficiently. The system gathers information from a variety of sensors, including soil moisture sensors, temperature and humidity sensors, pH sensors, turbidity sensors, and water temperature sensors. This information is then processed and instantly displayed on the Adafruit platform. Furthermore, we evaluated the performance of three classification algorithms, namely KNN, Naive Bayes’, and ANN, for deciding when to irrigate the soil. We discovered that ANN had the maximum accuracy, making it the best algorithm for our system’s irrigation timing prediction. As a future scope, we plan to further enhance our system by incorporating features that will enable the user to receive alerts on their mobile device, indicating when the crop needs water. We also plan to compare our results with additional classification algorithms to identify the best-performing algorithm for our system. Overall, this system provides a significant advancement in the automation of aquaponics-based irrigation, offering a more efficient, accurate, and sustainable way of managing crops.
References 1. Vernandhes W, Salahuddin N, Kowanda A, Sari SP (2017) Smart aquaponic with monitoring and control system based on IoT. In: 2017 second international conference on informatics and computing (ICIC). https://doi.org/10.1109/iac.2017.8280590 2. Nichani A, Saha S, Upadhyay T, Ramya A, Tolia M (2018) Data acquisition and actuation for aquaponics using IoT. In: 2018 3rd IEEE international conference on recent trends in electronics, information & communication technology (RTEICT). https://doi.org/10.1109/rte ict42901.2018.9012260 3. Menon PC (2020) IoT enabled aquaponics with wireless sensor smart monitoring. In: 2020 fourth international conference on I-SMAC (IoT in social, mobile, analytics, and cloud) (ISMAC). https://doi.org/10.1109/i-smac49090.2020.9243368 4. Yanes AR, Martinez P, Ahmad R (2020) Towards automated aquaponics: a review on monitoring, IoT, and smart systems. J Clean Prod 263:121571. https://doi.org/10.1016/j.jclepro. 2020.121571 5. Prabha R et al (2020) IoT controlled aquaponic system. In: 2020 6th international conference on advanced computing and communication systems (ICACCS). IEEE 6. Butt MFU, Yaqub R, Hammad M, Ahsen M, Ansir M, Zamir N (2019) Implementation of aquaponics within IoT framework. In: 2019 SoutheastCon. https://doi.org/10.1109/southeast con42311.2019.9020390 7. Zhang X, Zhang J, Li L, Zhang Y, Yang G (2017) Monitoring citrus soil moisture and nutrients using an IoT based system. Sensors 17(3):447. https://doi.org/10.3390/s17030447 8. Rau AJ, Sankar J, Mohan AR, Das Krishna D, Mathew J (2017) IoT based smart irrigation system and nutrient detection with disease analysis. In: 2017 IEEE region 10 symposium (TENSYMP). https://doi.org/10.1109/tenconspring.2017.8070100 9. Park H, Eun JS, Kim SH (2017) Image-based disease diagnosing and predicting of the crops through the deep learning mechanism. In: 2017 international conference on information and communication technology convergence (ICTC). https://doi.org/10.1109/ictc.2017.8190957
10. Chetan Dwarkani M, Ganesh Ram R, Jagannathan S, Priyatharshini R (2015) Smart farming system using sensors for agricultural task automation. In: 2015 IEEE technological innovation in ict for agriculture and rural development (TIAR). https://doi.org/10.1109/tiar.2015.7358530 11. Ghandar A, Ahmed A, Zulfiqar S, Hua Z, Hanai M, Theodoropoulos G (2021) A decision support system for urban agriculture using digital twin: a case study with aquaponics. IEEE Access 9:35691–35708. https://doi.org/10.1109/access.2021.3061722 12. Kumawat S et al (2017) Sensor based automatic irrigation system and soil pH detection using image processing. Int Res J Eng Technol 4:3673–3675 13. Abbasi R, Martinez P, Ahmad R (2022) Data acquisition and monitoring dashboard for IoT enabled aquaponics facility. In: 2022 10th international conference on control, mechatronics and automation (ICCMA). https://doi.org/10.1109/iccma56665.2022.10011594 14. Dhal SB, Bagavathiannan M, Braga-Neto U, Kalafatis S (2022) Can machine learning classifiers be used to regulate nutrients using small training datasets for aquaponic irrigation? A comparative analysis. PLOS One 17(8):e0269401. https://doi.org/10.1371/journal.pone.026 9401 15. Paul B, Agnihotri S, Kavya B, Tripathi P, Narendra Babu C (2022) Sustainable smart aquaponics farming using IoT and data analytics. J Inf Technol Res 15(1):1–27. https://doi.org/10.4018/ jitr.299914
Performance Evaluation of MFSK Techniques Under Various Fading Environments in Wireless Communication Sudha Arvind, S. Arvind, Abhinaya Koyyada, Mohd Mohith, Navya Sree Vallandas, and Sumanth Chekuri
Abstract This paper presents the performance analysis of frequency shift keying modulation under Rayleigh and Rician fading channels in wireless communication. The performance is analyzed in the presence of AWGN (additive white Gaussian noise). Graphs are obtained by comparing the BER (bit error rate) of the Rayleigh and Rician channels, and a comparison table relates the SNR and BER of the two channels. Multiple frequency shift keying modulation is performed for different M values, namely 2, 4, 6, and 8. Keywords Mobile wireless communications · Digital modulation · Multipath fading channel · Frequency shift keying (FSK) · Additive white Gaussian noise (AWGN) · Bit error rate (BER)
1 Introduction
In mobile wireless communication, the transmitted signal propagates over multiple paths [1]: reflection, refraction, and diffraction of the signal create multipath between the transmitter and its receiver. It becomes difficult for the receiver to detect the desired signal because the multipath components interfere with it; when this interference occurs, the signal strength gradually decays, and the multipath components can also arrive out of phase. Wireless networks are expected to deliver voice and data from a variety of sources with high receiving speed and mobility [2], and the usage of wireless communication systems continues to increase. S. Arvind (B) Department of ECE, CMR Technical Campus, Hyderabad, Telangana, India e-mail: [email protected] S. Arvind Department of CSE, HITAM, Hyderabad, Telangana, India A. Koyyada · M. Mohith · N. S. Vallandas · S. Chekuri CMR Technical Campus, Hyderabad, Telangana, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_18
Fading degrades the signal and leads to poor signal quality, with adverse effects on wireless communication systems. There are three signal propagation mechanisms: reflection, diffraction, and scattering. Propagation from transmitter to receiver over multiple paths causes multipath fading, which makes the received signal fluctuate in amplitude, phase, and angle of arrival. The principle of diversity is to provide the receiver with replicas of the same transmitted information [4]; various diversity techniques are used to combat the effect of fading and to improve coverage, capacity, and reliability in wireless communication. Common diversity techniques are space, frequency, and polarization diversity.
2 Literature Survey
(1) Multipath fading effects on uncoded and coded multiple frequency shift keying performance in mobile wireless communication (2019). BER performance was analyzed with MFSK as the modulation technique, both uncoded and combined with various block codes, under AWGN, Rayleigh, and Rician channels [1]. Because of the degradation in BER performance, diversity and error-control schemes are required to compensate.
(2) Review of data communication in wireless fading channel and a case study (October 2015). High-data-rate transmission through fading channels is discussed in this paper [2]. RCID (Rotation and Component Interleaving Diversity) is applied over the signal constellation; this technique plays a major role in improving the performance of a wireless communication system.
(3) Effect of AWGN and fading (Rayleigh and Rician) channels on BER performance of a WiMAX communication system (2012). OFDM-based WiMAX performance is discussed with various digital modulation and coding schemes (M-QAM, 16-QAM, and 64-QAM) through AWGN, Rayleigh, and Rician channels [3]. The work evaluates WiMAX system performance over the AWGN channel and over the Rayleigh and Rician fading channels.
(4) Performance comparison between MPSK and MFSK in Rician fading channel based on MGF method (2012). Based on the MGF technique, two modulation schemes are analyzed with MRC diversity under the Rician fading channel [4, 5].
(5) Multipath fading channel modeling and performance comparison of wireless channel models (2011). A multipath fading channel model is simulated in this paper [6]; DPSK modulation is used to evaluate performance under different channels in terms of BER versus SNR.
(6) Performance of coherent MFSK schemes over slow flat fading channels (2008). A coherent MFSK scheme is studied using a single-integral representation for the AWGN channel as well as flat fading channels such as Nakagami-n, Rician, and Rayleigh. The results of this analysis can be extended to obtain the probability of error of the MFSK technique [7–12].
3 Implementation of Proposed System
The implementation of the paper design can be given as below:
(a) System Requirements:
1. Operating System: Windows 11 (or) Windows 10 (version 1909 or higher).
2. Processor: Any Intel or AMD x86–64 processor.
3. RAM: 4 GB or more.
4. Storage Capacity: 10 GB or more and an SSD.
(b) Software Requirements:
1. MATLAB SIMULINK
• 2022a Version 9.10.
3.1 Description
1. MATLAB Simulink. MATLAB is a numeric computing and programming platform used to analyze data, develop algorithms, and create models. In industry and academia, MATLAB [8] covers a wide range of applications, including image processing, deep learning, machine learning, signal processing, control systems, and computational biology. Simulink is a block-diagram environment used for model-based design and multidomain simulation. It supports verification of embedded system designs, automatic code generation, and system-level design, and provides the user with customizable block libraries and a graphical editor for modelling and simulation.
2. MFSK Under Multipath Fading. Multipath propagation occurs because of the fading channel and the effects of RF propagation in a real environment [1]. Several copies of the signal are received, and the resultant signal is the sum of the incoming components. The incoming signals arrive from arbitrary angles; some reflected components are in phase with the direct path and others are out of phase. The probability of error of binary FSK with orthogonal signals and non-coherent detection over the AWGN channel is given by Eq. (1):

$\mathrm{BER}_{\mathrm{2FSK/AWGN}} = \dfrac{1}{2}\, e^{-E_b/(2N_0)}$   (1)
In slow Rayleigh fading, the channel gain is a Rayleigh-distributed random variable: many paths are received with no dominant component. Averaging the error probability over this distribution gives Eq. (2):

$\mathrm{BER}_{\mathrm{2FSK/Rayleigh}} = \dfrac{1}{2 + \mathrm{SNR}_b}$   (2)
The Rician channel can be viewed as a generalization of the Rayleigh channel in which, besides several reflected signals, a dominant component is present; the channel gain is modeled from independent Gaussian random variables with non-zero mean. With non-coherent detection in Rician multipath fading, the error probability of binary FSK is given by Eq. (3), where K is the Rician factor:

$\mathrm{BER}_{\mathrm{2FSK/Rician}} = \dfrac{1+K}{2 + 2K + \mathrm{SNR}_b}\, e^{-\frac{K\,\mathrm{SNR}_b}{2 + 2K + \mathrm{SNR}_b}}$   (3)
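The three closed-form expressions in Eqs. (1)-(3) can be evaluated directly; the script below plots them over the Eb/N0 range used later in the results (−10 to +10 dB). The Rician K-factor is an assumed value, and this sketch covers only the theory, not the Simulink model.

```python
# Theoretical BER of non-coherent binary FSK from Eqs. (1)-(3).
import numpy as np
import matplotlib.pyplot as plt

ebno_db = np.arange(-10, 11, 1.0)
snr = 10 ** (ebno_db / 10)           # average Eb/N0 per bit (linear)
K = 5                                 # Rician K-factor (assumed value, not from the paper)

ber_awgn = 0.5 * np.exp(-snr / 2)                                                 # Eq. (1)
ber_rayleigh = 1.0 / (2 + snr)                                                    # Eq. (2)
ber_rician = (1 + K) / (2 + 2 * K + snr) * np.exp(-K * snr / (2 + 2 * K + snr))   # Eq. (3)

plt.semilogy(ebno_db, ber_awgn, label="AWGN")
plt.semilogy(ebno_db, ber_rayleigh, label="Rayleigh")
plt.semilogy(ebno_db, ber_rician, label=f"Rician (K={K})")
plt.xlabel("Eb/N0 (dB)")
plt.ylabel("BER")
plt.grid(True, which="both")
plt.legend()
plt.show()
```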
3. AWGN Channel. AWGN is a basic model used in information theory to analyze the effect of various random processes [11]. Its modifiers denote specific characteristics:
• Additive, because the noise is added to (superimposed on) any signal.
• White, because it has uniform power across the frequency band of the information system, by analogy with white light, which has uniform emission at all frequencies of the visible spectrum.
• Gaussian, because it has a normal distribution in the time domain with an average value of zero.
Wideband noise of this kind comes from natural sources such as thermal (Johnson) noise, shot noise, and radiation from the earth and from celestial sources such as the Sun. By the central limit theorem of probability theory, the sum of many random processes tends toward a Gaussian (normal) distribution.
4. Modulation Techniques Used. The project mainly depends on the modulation technique; here the FSK technique, a type of digital modulation, is used. After modulation, the data is demodulated to obtain the output. The quality of service of the system depends on selecting a suitable modulation technique; a few of the digital modulation techniques are BFSK, QPSK, 8-FSK, 16-FSK, and 32-FSK. The performance of each modulation scheme is measured over fading channels with additive white Gaussian noise present. In Binary Frequency Shift Keying (BFSK), a pair of discrete frequencies is used to transmit binary 1's and 0's; these are called the mark and space frequencies, respectively. Applications of FSK include telemetry, weather-balloon radiosondes, caller ID, garage door openers, and low-frequency radio transmitters.
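As a small illustration of the BFSK principle described above, the NumPy sketch below builds a waveform in which each bit selects either the mark or the space frequency for one bit interval; the sample rate, bit rate, and frequencies are arbitrary illustrative values.

```python
import numpy as np

fs = 10_000          # sample rate (Hz), illustrative
bit_rate = 100       # bits per second, illustrative
f_space, f_mark = 1_000, 1_200   # frequencies for '0' (space) and '1' (mark)

bits = np.array([1, 0, 1, 1, 0])
samples_per_bit = fs // bit_rate
t_bit = np.arange(samples_per_bit) / fs

# Concatenate one tone burst per bit: mark frequency for 1, space frequency for 0.
signal = np.concatenate([
    np.cos(2 * np.pi * (f_mark if b else f_space) * t_bit) for b in bits
])
print(signal.shape)   # (5 * samples_per_bit,)
```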
4 Design of the Proposed System
The proposed system evaluates the performance of frequency shift keying so that we can conclude which FSK order gives better performance when passed through the fading channels and the AWGN channel. The modulated data passes through the demodulator, the error rate is calculated, and the performance for the different M values is plotted. Figure 1 shows the Simulink model of the project, in which the input is given to the modulator and then passes through the fading channel, the AWGN channel, the demodulator, and finally the error-rate calculation.
Fig. 1 MATLAB Simulink model under Rayleigh fading and AWGN channel for 2FSK
Fig. 2 MATLAB Simulink model under Rician fading and AWGN channel for 2FSK
Figure 2 shows the Simulink model of the project in which the input is given to the modulator and then passes through the fading channel, the AWGN channel, and the demodulator before the error-rate calculation is done for M = 2. Only the modulation order varies; everything else remains the same for all the modulations, that is, M = 2, 4, 6, 8.
• The first step in the process is giving the input, which may be of any type, such as a Bernoulli or random-integer source.
• The data then undergoes multiple frequency shift keying modulation, i.e., the input signal is mapped onto the carrier according to the chosen M value.
• The multipath fading channels, i.e., the Rayleigh and Rician channels, are used in this step: the modulated signal is passed through the Rayleigh or Rician channel accordingly.
• The data that has passed through the fading channel then enters the AWGN (Additive White Gaussian Noise) channel. It is used because the noise is additive, i.e., the received signal equals the transmitted signal plus noise, which is the most widely used model in communication systems.
• The data from the AWGN channel then undergoes demodulation, reversing the multiple frequency shift keying process, and finally the bit error rate is calculated.
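The chain above is built in Simulink; as a rough cross-check, the snippet below simulates the 2-FSK/Rayleigh/AWGN case in Python using a simplified baseband model of non-coherent orthogonal FSK (one complex matched-filter sample per branch per bit). It is a sketch under those assumptions, not the model used in the paper, and its output can be compared against Eq. (2).

```python
import numpy as np

rng = np.random.default_rng(0)

def ber_2fsk_rayleigh(ebno_db, n_bits=200_000):
    """Monte Carlo BER of non-coherent binary FSK over a flat Rayleigh channel plus AWGN."""
    snr = 10 ** (ebno_db / 10)                      # Eb/N0, linear
    bits = rng.integers(0, 2, n_bits)
    # Rayleigh fading gain (complex Gaussian, unit average power) and unit-variance complex noise.
    h = (rng.standard_normal(n_bits) + 1j * rng.standard_normal(n_bits)) / np.sqrt(2)
    noise = (rng.standard_normal((2, n_bits)) + 1j * rng.standard_normal((2, n_bits))) / np.sqrt(2)
    # Matched-filter outputs of the two orthogonal branches: the signal appears on the
    # branch selected by the bit, only noise on the other branch.
    r = noise.copy()
    r[bits, np.arange(n_bits)] += h * np.sqrt(snr)
    # Non-coherent detection: pick the branch with the larger envelope.
    decisions = np.argmax(np.abs(r), axis=0)
    return np.mean(decisions != bits)

for ebno in (-10, 0, 10):
    print(ebno, "dB ->", ber_2fsk_rayleigh(ebno), "theory:", 1 / (2 + 10 ** (ebno / 10)))
```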
5 Results and Discussion
Table 1 compares BER versus Eb/N0 for the Rayleigh and Rician fading channels under multiple frequency shift keying. The SNR values of the multipath fading channel, in cascade with the AWGN channel, are taken from − 10 to + 10 dB so that the performance characteristics of the channel can be plotted, with the Rayleigh fading channel configured as mentioned before. As the Eb/N0 value changes, the Rayleigh-fading BER also changes. The Simulink model was run in MATLAB Simulink to test the performance of the system in terms of BER; when the system was simulated for 10 ms, the total number of bits was 1000 and the BER was found to be 0.642 at an SNR of 4 dB, using Monte Carlo simulation with random input samples for each numerical value of SNR. Figure 3 shows the plot of BER versus Eb/N0 for multipath fading with the AWGN channel for 2-FSK, and Fig. 4 the corresponding plot for 4-FSK. It is found that system performance improves with increasing Eb/N0. For the Rayleigh channel, the starting BER values are high compared with the Rician values; comparing within Rayleigh for different M values, the BER is highest for M = 4 and 6, and the final Rayleigh values are mostly zero. For the Rician channel, the BER is highest for M = 6 and decreases only slightly as the SNR is increased; over the whole range from − 10 to + 10 dB the BER does not reach zero. Figure 5 shows the plot of BER versus Eb/N0 for multipath fading with the AWGN channel for 6-FSK.
Table 1 Comparison of BER versus Eb/N0 for Rayleigh and Rician fading channels for M = 2, 4, 6, 8
SNR     Rayleigh                               Rician
        M = 2     M = 4    M = 6    M = 8      M = 2     M = 4    M = 6    M = 8
−10     0.05882   0.9091   0.9091   0.8182     0.549     0.8182   0.9091   0.7273
−8      0.01961   0.8182   0.9091   0.8182     0.4314    0.7903   0.8971   0.6976
−6      0.01961   0.6364   0.8182   0.7273     0.3333    0.7552   0.8834   0.6781
−4      0         0.3636   0.7273   0.6364     0.2941    0.7426   0.8546   0.6781
−2      0         0.2727   0.5455   0.5455     0.2549    0.7273   0.7981   0.6781
0       0         0.2727   0.4545   0.2727     0.1765    0.6364   0.7273   0.7273
2       0         0.1818   0.1818   0          0.1176    0.6275   0.6967   0.5786
4       0         0        0        0          0.07843   0.6136   0.6781   0.5665
6       0         0        0        0          0         0.5843   0.6781   0.5665
8       0         0        0        0          0         0.5674   0.6364   0.5455
10      0         0        0        0          0         0.5455   0.6364   0.5455
Fig. 3 Plot of BER versus Eb/N0 plot for 2-FSK
Fig. 4 Plot of BER versus Eb/N0 plot for 4-FSK
Figure 6 shows the plot of BER versus Eb/N0 for multipath fading with the AWGN channel for 8-FSK. In each graph, both channels are plotted as BER versus SNR so that the performance analysis can be done easily. The model is simulated in the presence of AWGN using the two different fading channels, Rayleigh and Rician.
Fig. 5 Plot of BER versus Eb/N0 plot for 6-FSK
Fig. 6 Plot of BER versus Eb/N 0 plot for 8-FSK
6 Conclusion
This paper presents the performance analysis of multiple frequency shift keying under multipath fading and AWGN channels. When the SNR changes, the BER of the channels also changes; the SNR values were taken from − 10 to + 10 dB for the comparison, and as the SNR increases the BER decreases. Comparing the two channels, the BER of the Rician channel is higher, while in the Rayleigh channel the BER falls off earlier than in the Rician channel. From the above graphs we can therefore observe the performance of frequency shift keying under various fading environments through the Rayleigh and Rician channels. The analysis can be extended by comparing all the theoretical values with the simulated values, and block coding can be used to mitigate the BER degradation.
Acknowledgements Authors thank Management of CMR Technical Campus, Director Dr. A. Raji Reddy, CMR Technical Campus, Prof. G. Srikanth, HoD-ECE Department for providing necessary infrastructure and Lab Facilities to carry out the required work.
References 1. Jeiad HA, Al-Bahadili RJS (2019) Multipath fading effects on uncoded and coded multiple frequency shift keying performance in mobile wireless communication. J Eng Commun Sustain Dev 23(04). ISSN 2520-0917; Soni M, Ghosh PK, Gupta K (2015) Review of data communication in wireless fading channel and a case study. Int J Electron Commun Technol IJECT 6(1):49–55 2. Joshi D, Gupta K (2012) Performance comparison between MPSK and MFSK in Rician fading channel based on MGF method. Int J Comput Appl (0975–8887) 45(14):33–37 3. Babu AS, Rao KV (2011) Evaluation of BER for AWGN, Rayleigh and Rician fading channels under various modulation schemes. Int J Comput Appl (0975–8887) 26(9):23–28 4. Chavan MS, Chile RH, Sawant SR (2011) Multipath fading channel modeling and performance comparison of wireless channel models. Int J Electron Commun Eng 4(2):189–203. ISSN 0974-2166 5. Chandra AS, Poram R, Bose C (2008) Performance of coherent MFSK schemes over slow flat fading channels. In: IEEE region 10 conference, TENCON, Hyderabad, AP, India, pp 1–6 6. Michael JC, Wayne ES (2000) Effect of mobile velocity on communications in fading channels. IEEE Trans Veh Technol 49(1):202–210 7. Cahandra AS, Poram R, Bose C (2008) Performance of coherent MFSK schemes over slow flat fading channels. In: IEEE region 10 conference, TENCON, Hyderabad, AP, India, pp 1–6 8. https://en.wikipedia.org/wiki/MATLAB 9. https://waijung2-doc.aimagin.com/about-matlab-_-simulink.html 10. https://www.tutorialspoint.com/matlab_simulink/matlab_simulink_introduction.htms 11. https://en.m.wikipedia.org/wiki/Additive_white_Gaussian_noise 12. https://en.wikipedia.org/wiki/Multiple_frequency-shift_keying
Prediction of Breast Cancer Using Feature Extraction-Based Methods Suddamalla Tirupathi Reddy, Jyoti Bharti , and Bholanath Roy
Abstract A substantial amount of data is needed for efficient feature extraction and pattern recognition to guarantee the robustness of a machine-learning model in differentiating between numerous classes. It becomes essential to extract useful features from existing data or improve them using augmentation techniques to avoid the requirement for more real data. Machine learning (ML) models use artificial intelligence (AI) and ML to make life easier for patients and medical professionals while handling complicated problems in clinical imaging. A very accurate automated approach has been created to identify anomalies in bone X-ray pictures. Even with limited resources, image pre-processing techniques like noise removal and contrast enhancement are essential for enhancing image quality and obtaining high diagnostic accuracy. The Gray Level Co-occurrence Matrix (GLCM) texture features, which reflect second-order statistical data about the grayscale values of adjacent pixels, are frequently used to classify images. Various tools and methodologies for organizing, evaluating, and constructing ML models have become crucial given the enormous rise in data available in the modern period. Data is the energy or oxygen of machine learning. ML algorithms must be developed and improved continuously to handle data-related difficulties. In other words, for algorithms to produce reliable and useful models, it is crucial to have a well-organized dataset with clear patterns. Keywords Logistic regression · Decision tree · GLCM · LBF SVM · Linear SVM · Machine learning
S. T. Reddy (B) · B. Roy Department of Computer Science Engineering, Maulana Azad National Institute of Technology Bhopal, Bhopal, India e-mail: [email protected] J. Bharti Department of CSE and IT, Maulana Azad National Institute of Technology Bhopal, Bhopal, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_19
1 Introduction Image processing techniques are frequently used to diagnose and interpret medical pictures. The development of whole-slide imaging and the rise in cancer incidence has drawn researchers’ focus to the automatic interpretation of histopathology images. Whole-slide photographs’ high resolution can make manual analysis timeconsuming; automatic image processing techniques were created to speed up this procedure. When making decisions during picture analysis, these techniques support professionals and, in certain situations, even take over. Decision-making and the automatic classification of images depend heavily on the texture, shape, lighting, and color variations provided by image features. Hierarchical processing is used in medical imaging to extract important information from images. There is an order to tasks like augmentation, restoration, and analysis. These methods also provide automated or somewhat automated tissue measurement, characterization, and detection. The demand for efficient feature extraction techniques that automatically categorize X-ray pictures is rising. Finding the most informative features with the fewest parameters that accurately reflect the image is the aim of feature extraction techniques. By removing pointless parameters, classification can be carried out more effectively and efficiently with fewer processing resources. Typically, image editing entails the removal of both lowlevel and high-level characteristics. High-level features are more complicated and computationally intensive than low-level features with simpler properties. Different feature extraction procedures have been described in the literature since choosing characteristics depends on the particular issue at hand. Clinical imaging will greatly benefit from artificial intelligence and machine learning. These technologies allow for the rapid detection of anomalies in clinical imaging modalities such as computed tomography (CT), MRIs, and X-rays. Artificial intelligence and machine learning’s advanced characteristics aid in detecting anomalies that may be difficult for human observers to spot. Radiologists have found that using intelligent tools to support them during clinical operations greatly improves their capacity to assess medical images. One of the hardest machine learning challenges is finding anomalies in clinical X-ray pictures. Each patient’s anatomy might vary widely, and the layered structures in radiograph projections provide considerable difficulties. One of the main causes of death for women globally is breast cancer, which is the subject of the BUSI (Breast Ultrasound Image Dataset) dataset. The dataset contains ultrasound-based medical images of benign, aggressive, and normal breast cancer. The reduction of mortality rates is greatly helped by early identification. Breast ultrasound pictures and machine learning can offer useful insights for diagnosing, diagnosing, and categorizing breast cancer.
2 Related Work Jemal et al. (2011) state breast cancer is the second most common cause of mortality for women worldwide. Although early detection is essential for lowering mortality rates, due to breast tissue density, it can be difficult to spot suspected tumors on mammograms. To solve this problem, a computer-aided detection (CAD) system must first identify the type of breast tissue seen on a mammogram before it can identify and categorize breast cancer. K. Haralick et al. published texture features based on the Gray Level Co-occurrence Matrix (GLCM) in 1973 [1, 2] and these texture traits have found extensive use in biological photo classification. Numerous studies have shown how to employ GLCM for texture analysis in the biomedical field. In 2011, Chai et al. [3] suggested techniques for assessing femur long bone fractures using GLCM. In the same year, Mahendran et al. [4] used fusionclassification methods to recognize tibia fractures in particular. In 2011, Bousson et al. [5] created a clinical method that used differences in gray pixel levels to assess the trabecular bone score and bone mineral density in DXA images of the lumbar spine. (CLAHE). To extract a total of 88 hybrid features from an examination of mammography pictures in 2014, an unsupervised genetic algorithm (GA) and forward sequential feature selection (SFS) was used. Additionally, Nagarajan et al. [6] developed a method for generating bone mineral density-based fracture risk assessments and trabecular bone-defining feature sets in 2015. Compared to the Haralick, Gabor filter, and SFTA algorithms, Ngo et al. [7] introduced the Bone Texture based on Gabor Transformation and Contourlet in 2016. Additionally, Shirvaikar et al. [8] published a statistical method in 2016 that improved the prediction of bone fragility by using GLCM Textural Features based on bone architecture and bone mineral density. Clinical consideration companies encourage uncommon assistance for medical practitioners in unusual conditions by applying data science and AI predictions. It can be challenging to recognize a chest injury nowadays because of the variety of plans’ shapes, surfaces, and other clinical symptoms. As a result, the clinical consideration business is placing more emphasis on developing an effective AI estimations application. Using unique but small datasets derived from computation appraisal challenges [6, 7, 9, 10], a team of researchers previously concentrated on using image analysis to distinguish between chest illnesses for separating the dangerous developments that have infected past the chest, many nearby organs and lymph nodes [3–5], and cell science [11–13].
3 Methodology Machine learning (ML), a branch of artificial intelligence, aims to create flexible software to learn from new data. It performs categorization, prediction, and detection tasks using computer models and historical data. This essay will examine several well-known classification algorithms and how they are used to find breast cancer. The dimensions of features can be reduced using efficient feature extraction or selection approaches. A dataset’s feature dimensions can be reduced using a variety of techniques. While feature extraction techniques create new features by merging the old ones, feature selection approaches include selecting a subset of the original features while keeping the data from the original dataset. The number of features can affect how well a machine learning model performs, so this research uses four distinct strategies to handle the problem of high dimensionality. After lung cancer, breast cancer is the second most frequent cancer among women. The World Health Organization and the American Institute for Cancer Research have recently estimated that breast cancer affects about 2 million women every year. 15% of all cancer-related fatalities are caused by breast cancer. Interestingly, only 15% of breast cancer instances are inherited, with the remaining 85% developing randomly. Genetic changes brought on by aging are the main cause of breast cancer. In underdeveloped nations, 50% of breast cancer patients receive their diagnosis between stages 3 and 4, where 90% of cases are deadly. The rates of breast cancer mortality are influenced by under diagnoses and postponed treatment. Both benign and malignant tumors are possible. A benign tumor has a distinct structure, is regarded as benign or non-cancerous, grows slowly, doesn’t spread to neighboring tissues, and doesn’t metastasize. On the other hand, malignant tumors rapidly invade surrounding tissues, spread to other body parts, and appear abnormally. They also have deformed shapes. The proposed approach is made up of four main parts: (1) image pre-processing, (2) feature extraction, (3) dataset splitting, and (4) classification model. As shown in Fig. 1, the block layout of our suggested methodology is discussed below. (i) Image Preprocessing Most clinical image processing happens during the preprocessing step to reduce distortion from noise interference and highlight the key features of the original image. The major goals are to utilize color transformation techniques like RGB to Gray, extract information from the red layer, and improve image contrast. Color conversion: The RGB image must be converted into a grayscale image during setup. Either the average method or the luminosity strategy can be used for this. The suggested methodology employs 100% red, 0% green, and 0% blue luminosity. Increasing contrast and reducing unwanted distortion signals are the two most crucial pre-processing steps (noise) Fig. 2 shows Image before enhancement, Fig. 3 shows RGB Images to Gray Scale Images, Fig. 4 shows Histogram Equalization, Fig. 5 shows Histogram Equalization on Grayscale Images, Fig. 6 shows Histogram after
histogram operation, Fig. 7 shows the bilateral filter on the grayscale image, and Fig. 8 shows the histogram of the bilateral filter.
Fig. 1 Our proposed methodology
Fig. 2 Image before enhancement
Fig. 3 RGB images to gray scale images
Fig. 4 Histogram equalization
Fig. 5 Histogram equalization on grayscale images
Fig. 6 Histogram after histogram operation
Fig. 7 Bilateral filter on grayscale images
Fig. 8 Histogram of bilateral filter
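A compact OpenCV sketch of the pre-processing chain illustrated in Figs. 2-8 (grayscale conversion, histogram equalization, CLAHE, and bilateral filtering); the input file name and the filter parameters are illustrative assumptions, not values from the paper.

```python
import cv2

img = cv2.imread("busi_sample.png")                 # hypothetical BUSI image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)        # color -> grayscale (Fig. 3)

equalized = cv2.equalizeHist(gray)                  # global histogram equalization (Figs. 4-6)

# Contrast Limited Adaptive Histogram Equalization, as used in the proposed pipeline (CLAHE).
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(gray)

# Edge-preserving noise reduction (Figs. 7-8); diameter and sigma values chosen for illustration only.
smoothed = cv2.bilateralFilter(enhanced, 9, 75, 75)

cv2.imwrite("preprocessed.png", smoothed)
```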
Feature Extraction. An essential component of our suggested paradigm is feature extraction from an image. The GLCM texture features exploit how often pairs of neighboring gray levels co-occur in an image. Let i and j denote the coordinates of an element p(i, j) of the co-occurrence matrix. Figure 9 shows the image after enhancement.
Energy: Energy counts how often a pair of pixel values is repeated; it is a measure of textural uniformity in an image, and its value is quite high for closely coupled pixels. Equation (1) defines the "Energy" feature of an image:

$\mathrm{Energy} = \sum_{i,j} p(i,j)^2$   (1)
Fig. 9 Image after enhancement
Correlation: An image's correlation feature measures how a pixel is related to its neighboring pixels over the entire image. The correlation coefficient ranges from − 1 (perfect negative correlation) to 1 (perfect positive correlation) and is undefined for a constant image. Equation (2) defines the correlation feature "Correl" of an image:

$\mathrm{Correl} = \sum_i \sum_j \dfrac{(i-\mu_i)(j-\mu_j)\, p(i,j)}{\sigma_i \sigma_j}$   (2)
Dissimilarity: The dissimilarity feature measures the separation between pairs of gray levels in the ROI, i.e., the mean absolute difference in gray level between neighboring pixels. A larger value indicates that the intensity levels of adjacent pixels span a greater range. Equation (3) defines the dissimilarity feature of an image:

$\mathrm{Dissimilarity} = \sum_i \sum_j |i-j|\, p(i,j)$   (3)
Homogeneity: Homogeneity is also known as the inverse difference moment. It takes higher values when there are fewer gray-tone differences between paired pixels; for constant energy, homogeneity decreases as contrast increases. Equation (4) defines the homogeneity feature "Homog" of an image:

$\mathrm{Homog} = \sum_i \sum_j \dfrac{1}{1+|i-j|^2}\, P(i,j)$   (4)
Contrast: Contrast measures the intensity variation between each pixel and its immediate surroundings, i.e., the spatial frequency of the image, and it establishes how many local differences are visible in the image. It is determined by the difference in color and brightness between one object and other objects in the same field of view. Equation (5) defines the contrast feature of an image, denoted "Cont.":

$\mathrm{Cont.} = \sum_i \sum_j |i-j|^2\, p(i,j)$   (5)
Entropy: Entropy, which assesses how uniformly the gray levels in the image are distributed, is used to describe the texture of the image. Entropy and energy are strongly associated but in the opposite direction, because images with more gray levels have higher entropy. Equation (6) defines an image's entropy feature, abbreviated "Ent," where N_g is the number of gray levels:

$\mathrm{Ent} = -\sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} p(i,j)\, \log(p(i,j))$   (6)
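Equations (1)-(6) are standard Haralick-style GLCM statistics, most of which are available directly in scikit-image; entropy can be computed from the matrix itself. The sketch below assumes an 8-bit grayscale input and recent scikit-image function names (older releases spell them greycomatrix/greycoprops).

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_img, distances=(1,), angles=(0,)):
    """Compute the six texture features of Eqs. (1)-(6) from an 8-bit grayscale image."""
    glcm = graycomatrix(gray_img, distances=distances, angles=angles,
                        levels=256, symmetric=True, normed=True)
    # 'ASM' matches Eq. (1); scikit-image's 'energy' property is the square root of it.
    feats = {prop: graycoprops(glcm, prop).mean()
             for prop in ("ASM", "correlation", "dissimilarity",
                          "homogeneity", "contrast")}
    # Entropy (Eq. 6) is not provided by graycoprops, so compute it from the matrix.
    p = glcm[:, :, 0, 0]
    feats["entropy"] = -np.sum(p[p > 0] * np.log(p[p > 0]))
    return feats

# Example usage with the pre-processed image from the earlier sketch:
# feats = glcm_features(smoothed)
```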
4 Results The three main components of the simulation of the suggested methodology are picture preprocessing, feature extraction, and classification. The first two sections— image preprocessing and feature extraction—were carried out in a MATLAB R2017a environment, while the final section—classification—was finished in a Python 3.0 environment. Figure 10 shows Positive and negative records counts. The upper bar graph depicts the busi database’s total number of positive and negative records. (Fig. 10). Using the suggested approach, the X-ray image from the BUSI dataset was utilized to extract twelve features; the histograms above display the features’ most common values. Autocorrelation, contrast, cluster prominence, cluster shadow, the sum of squares variance, and average total values are
less dispersed features. Highly dispersed features include homogeneity, entropy, energy, dissimilarity, and correlation. The co-occurrence matrix is graphically represented by the heat map of the six GLCM texture features: a dark green hue indicates a strongly correlated property, whereas a red color indicates a less correlated one. The bar graph (Fig. 12) displays the results of the performance analyses for several classification methods. The goal of pre-processing is to enhance image quality and eliminate noise. Using GLCM, the following textural features are extracted and used to classify the images: contrast, correlation, cluster prominence, dissimilarity, energy, and entropy. Figure 11 shows the extracted feature value counts, and Fig. 12 the accuracy analysis of the different classification algorithms using the GLCM texture features. More textural features improve the classifier's performance at the cost of increased computational complexity. The effectiveness of a classifier may be evaluated with the statistical measures sensitivity, specificity, precision, accuracy, and F1 score, and the proposed work helps a radiologist recognize abnormalities more accurately. The primary goal of our proposed work is performance evaluation using various classifier methods on the textural features generated by GLCM. 80% of the BUSI dataset is used to train the classification techniques, while the remaining 20% is used to test them. The five statistical measures used to evaluate the performance of this anomaly detection are defined as follows:

Sensitivity: TPR = TP / (TP + FN)   (7)
Specificity: SPC = TN / (FP + TN)   (8)
Precision: PPV = TP / (TP + FP)   (9)
Accuracy: ACC = (TP + TN) / Total Number   (10)
F1 score: F1 = 2TP / (2TP + FP + FN)   (11)

Fig. 10 Positive and negative records counts
Fig. 11 Extracted feature value count
Fig. 12 Accuracy analysis of different classification algorithms using texture features GLCM
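These five measures follow directly from the confusion-matrix counts; the helper below applies the same definitions as Eqs. (7)-(11). The example counts are illustrative, not the paper's confusion matrix.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, precision, accuracy and F1 from confusion-matrix counts,
    following Eqs. (7)-(11)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "precision":   tp / (tp + fp),
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "f1_score":    2 * tp / (2 * tp + fp + fn),
    }

# Illustrative counts only (not taken from the paper):
print(evaluation_metrics(tp=46, tn=5, fp=0, fn=3))
```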
True positive (TP) denotes images that are malignant and are correctly classified as malignant, whereas false positive (FP) refers to images that are classified as malignant but are actually benign. True negatives (TN) are images that are correctly recognized as unaffected. Table 1 summarizes the performance of the machine learning approaches considered.
Table 1 Different machine learning techniques performance evaluation

Measure     Random forest   Gradient booster   Ada booster   Decision tree
Precision   1.00            1.00               1.00          1.00
Recall      0.85            0.85               0.85          0.98
F1-score    0.92            0.92               0.92          0.99
Support     54              54                 54            54
Accuracy    0.95            0.95               0.95          0.99
Many machine learning approaches, including Random Forest, Gradient Booster, Ada Booster, and Decision Tree, were utilized to categorize the abnormalities in the breast images. These metrics assess machine learning performance rather than purely statistical performance. The true positive rate is frequently referred to as sensitivity, specificity measures the true negative rate, precision measures how many predicted positives are correct, accuracy gauges the proportion of correct predictions to all observations, and the F1 score is the weighted harmonic mean of precision and sensitivity. The performance measurements presented in Table 1 were generated from the confusion matrices of the various classification techniques. The Gradient Booster classifier achieved a classification accuracy of 95.00%. Similarly, the Random Forest and Ada Booster classifiers achieved a classification accuracy of 95.00% with a recall of 0.85, support of 54, precision of 1.00, and F1 score of 0.92. The Decision Tree classifier achieved a classification accuracy of 99.00% with a recall of 0.98, support of 54, precision of 1.00, and F1 score of 0.99. Notably, the performance of the Decision Tree model for breast cancer detection showed significant improvement compared to the other models studied, which is encouraging considering the visual challenges posed by the BUSI database.
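A comparison of this kind can be reproduced in outline with scikit-learn; the sketch below assumes the extracted GLCM features are already available as a feature matrix X with binary labels y (the paper's own feature extraction was done in MATLAB), so X and y are placeholders rather than the authors' code.

```python
from sklearn.model_selection import train_test_split
from sklearn.ensemble import (RandomForestClassifier, GradientBoostingClassifier,
                              AdaBoostClassifier)
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

def compare_classifiers(X, y):
    """Train the four classifiers on an 80/20 split and print their reports."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, random_state=42, stratify=y)
    models = {
        "Random forest": RandomForestClassifier(),
        "Gradient booster": GradientBoostingClassifier(),
        "Ada booster": AdaBoostClassifier(),
        "Decision tree": DecisionTreeClassifier(),
    }
    for name, model in models.items():
        model.fit(X_train, y_train)
        print(name)
        print(classification_report(y_test, model.predict(X_test)))
```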
5 Conclusion This study uses the BUSI dataset to detect anomalies in breast X-ray images. The goal is to develop a fully automatic detection technique that benefits medical professionals and patients. However, the classification task poses challenges due to the size and low quality of the BUSI X-ray image collection. A three-step approach is employed to address this issue: preprocessing, feature extraction, and classification. The pre-processing stage involves converting the RGB image to grayscale and applying Contrast Limited Adaptive Histogram Equalization (CLAHE) to enhance contrast. Six texture features based on the Gray Level Co-occurrence Matrix (GLCM) are extracted for detecting picture abnormalities in the X-ray image. Several machine learning techniques, including Random Forest, Gradient Booster, Ada Booster, and decision trees, are considered for assessing automatic detection systems. Based on performance analysis, the decision tree algorithm achieves the highest accuracy rate
of 99.00%. However, further experiments with different feature sets could enhance the system's performance.
Anomaly Detection in Classroom Using Convolutional Neural Networks B. S. Vidhyasagar, Harshith Doppalapudi, Sritej Chowdary, VishnuVardhan Dagumati, and N. Charan Kumar Reddy
Abstract Anomaly detection is the task of identifying patterns or events that deviate significantly from expected or normal behavior. In classroom settings, it is essential to identify mischievous behavior exhibited by students to ensure a productive and conducive learning environment. This paper proposes a new method to identify abnormal behavior in a classroom setting using computer vision and deep learning techniques. The method uses convolutional neural network (CNN) and recurrent neural network (RNN) algorithms to detect hand-raising, sleeping, talking, and fighting behaviors in the classroom from live video surveillance. It uses the You Only Look Once (YOLO) algorithm to locate the position of students in the frame and an RNN to classify their behavior as normal or anomalous. The RNN is trained on a labeled image and video dataset containing examples of normal and abnormal behaviors. The system can trigger real-time alerts, allowing teachers to act appropriately to maintain a conducive learning environment, and it helps to identify and address disruptive behaviors. By enabling real-time classroom monitoring, the algorithm has the potential to improve student safety, well-being, and academic performance. This research could lead to further developments in the field of classroom monitoring.
B. S. Vidhyasagar · H. Doppalapudi (B) · S. Chowdary · V. Dagumati · N. Charan Kumar Reddy Department of Computer Science and Engineering, Amrita School of Computing, AmritaVishwaVidyapeetham, Chennai, India e-mail: [email protected] B. S. Vidhyasagar e-mail: [email protected] S. Chowdary e-mail: [email protected] V. Dagumati e-mail: [email protected] N. Charan Kumar Reddy e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_20
1 Introduction The applications of such object detectors are diverse. They could be used for city traffic monitoring, military surveillance, reconnaissance, border patrol, security systems, home automation, self-driving cars, face recognition systems, etc. [1]. Anomaly behavior is a complex issue that can manifest in various ways in the classroom. These behaviors may include minor distractions like talking out of turn, more severe disruptions like physical aggression, or consistent refusal to follow instructions. These behaviors can significantly impact the learning environment for both the student exhibiting the behavior and their classmates. It can be complicated for teachers and school administrators to manage these behaviors, as they may require a more individualized approach to address the root causes. Anomaly behavior can be temporary or persistent and may indicate underlying issues like learning disabilities, emotional or behavioral disorders, or challenges at home. Understanding that these behaviors are often beyond the student’s control and may require a multi-faceted approach to address them effectively is essential. Early intervention and individualized support plans can be effective in helping students manage their behaviors and address any underlying issues. Collaboration with parents or guardians is also critical in addressing anomaly behavior. They can provide valuable insight into the student’s behavior at home and may be able to offer additional support and guidance to the student. It is also essential to create a safe and supportive environment for all students, which includes strategies like positive reinforcement, active listening, and clear communication. Addressing anomaly behavior in the classroom requires a comprehensive approach that considers the student’s individual needs and their unique circumstances. Teachers and school administrators can help create an environment that promotes academic success and overall well-being by working collaboratively with students, parents, and other professionals. Data science can play an essential role in detecting anomaly behavior of students in the classroom. By leveraging advanced analytical techniques, data science can help identify patterns and trends in student behavior, which is used to detect unusual or unexpected behavior.
2 Literature Survey Gabriel Abosamra has developed a prototype to detect abnormal behavior of students during exams. The prototype uses a three-stage approach that combines neural networks and a Gaussian distribution. In the first stage, the prototype utilizes the Haar cascade method for face detection. The second stage involves a neural network to identify suspicious states, and the final stage applies anomaly detection based on the Gaussian distribution. The primary goal of the prototype is accurate real-time detection of unusual activity. Abosamra tested the prototype on a custom dataset and achieved a remarkable 97% accuracy, with a low false negative rate of only 3% [2].
Faster RCNN is a more advanced version of Fast RCNN, initially introduced by Ren et al. [3]. Instead of traditional selective search, Faster RCNN employs a Region Proposal Network (RPN) to generate region proposals. According to the authors, this modification results in higher and more efficient performance on both CPU and GPU. The authors also suggest that the RPN-based pipeline has a lower temporal complexity in generating proposals than the selective search strategy used in Fast RCNN, contributing to its improved performance [4]. Zheyi Fan proposed a novel algorithm for detecting and highlighting anomalies in video based on a spatiotemporal approach. The method uses a convolutional neural network (CNN) with a simple structure and low computational complexity. The results indicate that the spatiotemporal algorithm outperforms existing algorithms in terms of accuracy at the pixel and frame levels [5]. Teng Guo performs a detailed methodological analysis of data-driven techniques for detecting anomalies in the field of education. The study primarily focuses on the five most extensively researched areas: predicting course failure, dropout, mental health issues, graduation difficulties, and employment difficulties. The ultimate goal of this research is to provide a valuable resource for educational policymakers by offering an in-depth assessment and promoting the development of educational anomaly detection as an emerging field [6]. Xiu Li has proposed a novel method for detecting the 3D poses of humans in close interaction using sparse multi-view images. The method involves several steps: acquiring 2D joint positions in each image using OpenPose, obtaining human semantic segmentation results with Mask RCNN, and calculating 3D joint positions through triangulation of the multi-view 2D joints. Additionally, Xiu Li has introduced an innovative approach to minimize interpenetration between human shapes during close interaction [7]. Wang Liao proposed a two-stage approach for recognizing hand-raising gestures in real classroom settings. The first stage involves multi-stage pose estimation to identify potential hand-based areas of each student, focusing on raised-arm activity, and multi-scale information fusion is used to improve accuracy. In the second stage, a binary classification network is employed to identify the gesture style [8]. A system developed by Feng-Cheng Lin utilizes skeleton pose estimation and person detection to recognize student behavior in a classroom setting. The system consists of three stages. In the first stage, frames from a CCTV camera in the classroom are captured as input images. In the second stage, feature extraction generates vectors representing the human posture. Finally, behavior classification is carried out in the third stage to identify student behaviors. The proposed system outperformed the skeleton-based scheme, achieving significantly higher average accuracy and recall rates [9]. T. Senthilkumar developed a unique system framework with three components to monitor student activity during exams. The first component uses Haar feature extraction to identify and track students' face regions. The second component detects hand contact through grid formation when students exchange papers or foreign objects. Finally, the system recognizes hand signaling by the student using a convex hull and notifies the invigilator [10].
Jiaxin Si has introduced a method for recognizing hand-raising gestures in real classroom settings. The method includes creating a large dataset of real classroom videos and designing a neural network architecture based on R-FCN, a region-based fully convolutional network. The first stage involves using an adaptive template selection algorithm to identify different types of hand-raising gestures. In the following stage, a feature pyramid is implemented to capture detailed and highly semantic features, enabling the model to detect smaller hand gestures more accurately [11]. P. Suja extensively analyzed various techniques to identify a suitable and effective domain for feature extraction in images. The evaluation of different techniques was performed in both the spatial and transform domains. After careful analysis, local binary pattern (LBP) and local directional pattern (LDP) were chosen for the spatial domain, while Dual Tree-Complex Wavelet Transform (DT-CWT) and Gabor wavelet transform were selected for the transform domain. The efficiency of these methods was tested by applying them to images from the Cohn–Kanade and JAFFE databases. The classification performance was then evaluated using two methods, namely neural network (NN) and K-nearest neighbor (KNN) with Euclidean distance [12].
3 Proposed Work Preprocessing is an important step in preparing the data for training the CNN model. In this step, the collected data is preprocessed to standardize the format and quality of the images or videos. The preprocessing steps may include resizing the images or videos to a standard size, converting the color images to grayscale or black and white, removing any noise or artifacts from the images or videos, and normalizing the data to bring it in a specific range. Resizing the images or videos to a standard size is important because CNN models typically require inputs of a fixed size. This ensures that all the images or videos are of the same size and the model can be trained efficiently. Converting the color images to grayscale or black and white can simplify the data and reduce the amount of processing required by the model. Removing noise or artifacts from the images or videos can improve the accuracy of the model. Normalizing the data is also an important preprocessing step. Normalization involves scaling the pixel values of the images or videos to a specific range, typically between 0 and 1 or − 1 and 1. This ensures that the data has a consistent range and helps the model to learn more efficiently. After preprocessing, the data is divided into two parts, the training data and the test data. The purpose of the train–test split is to evaluate the performance of the model on unseen data. Typically, 70% of the data is used for training the model, and the remaining 30% is used for testing the model’s performance. This helps to ensure that the model is not overfitting to the training data and is able to generalize well to new data. Once the data is preprocessed and split, the training data is fed into the CNN model for training. The model learns to classify the images or videos using various
optimization techniques like stochastic gradient descent, Adam, etc. The model is tested on the test data to evaluate its performance, and the accuracy of the model is calculated based on how well it predicts the labels of the test data. Data is collected in the form of images. The data is collected from various sources like the internet, public datasets, and by capturing it using a camera. In preprocessing, the collected data is preprocessed to make it ready for training the CNN model. The preprocessing steps may include resizing the images to a standard size, converting the color images to gray scale or black and white, removing any noise or artifacts from the images, and normalizing the data to bring it in a specific range. After preprocessing, the data is divided into two parts, the training data and the test data. Typically, 70% of the data is used for training the model, and the remaining 30% is used for testing the model’s performance. In this step, the training data is fed into the CNN model, and the model learns to classify the images. The model is trained using Adam optimizer. After the model is trained, it is tested on the test data to evaluate its performance. The accuracy of the model is calculated based on how well it predicts the labels of the test data. To use the trained model for live video classification, the model is integrated with a webcam. The webcam captures live video and feeds it into the model, which classifies the actions in real time. This allows for real-time monitoring and analysis of actions or movements. In recent years, the advancement in computer vision and deep learning techniques has opened up new possibilities for identifying anomaly behaviors in real time using video inputs [4, 13]. This has been especially useful in classrooms, where teachers can use these techniques to monitor the behavior of their students and identify any abnormal behaviors that may be disrupting the learning environment. In this paper, we will discuss the use of CNN and RNN algorithms to identify anomaly behaviors such as hand raise, sleeping, talking, and fighting in a classroom using live video input. The first step in identifying these anomaly behaviors is to collect a video stream of the classroom using a standard webcam or any other camera that provides a live video feed. Once the video feed is obtained, it is processed using the You Only Look Once (YOLO) algorithm. This real-time object detection algorithm can detect multiple objects in a single frame. YOLO is a deep learning-based algorithm that uses a neural network to predict the probability of an object being present in a given frame region. In our case, we will be interested in detecting the presence of students and identifying their positions in the frame. This is achieved by training the YOLO algorithm on a dataset of images and videos containing students in different classroom positions.
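A minimal sketch of the preprocessing and train–test split described above is given below, assuming the labeled frames are stored as image files grouped into one folder per behavior class; OpenCV and scikit-learn are assumptions about tooling, and the folder name is hypothetical, since the paper does not name its exact libraries or paths.

```python
import os
import cv2
import numpy as np
from sklearn.model_selection import train_test_split

def load_dataset(root_dir, size=(224, 224)):
    """Resize, grayscale-convert, and normalize images; return arrays and integer labels."""
    images, labels = [], []
    classes = sorted(os.listdir(root_dir))            # e.g. fighting, hand_raise, sleeping, talking
    for label, cls in enumerate(classes):
        for fname in os.listdir(os.path.join(root_dir, cls)):
            img = cv2.imread(os.path.join(root_dir, cls, fname))
            if img is None:
                continue
            img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # simplify to grayscale
            img = cv2.resize(img, size)                    # fixed input size for the CNN
            images.append(img.astype(np.float32) / 255.0)  # normalize to [0, 1]
            labels.append(label)
    return np.array(images), np.array(labels)

# 70% training / 30% testing split, as described in the text.
X, y = load_dataset("classroom_dataset")                   # hypothetical folder name
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)
```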
4 Dataset Description The Faster-RCNN paper discussed a model which used a VGG-16 in the initial layers. This model replaces the VGG-16 with a ResNet-50, and it is compared against similar solutions trained on our dataset to identify the anomaly behavior of students [1]. The custom-prepared dataset for anomaly detection in the classroom consists of 524 pictures we captured in the classroom activity. The dataset includes activities that
mainly occur within the classroom, like sleeping, talking, hand raising, and fighting. The sleeping activity involves students who are asleep during class time and resting in a reclined posture while in their seats, sleeping by resting their forehead on their hands and looking down, slumping down on the desk in front of them, and propping using their hand. The talking activity includes students conversing with each other and may be facing each other or talking with one another in groups. The hand raising activity involves students who raise their hands to ask a question or provide an answer to a question posed by the teacher. Pictures from this activity consist of people raising their hands at different angles and postures. Finally, the fighting activity includes any physical altercation in the classroom, like collar pulling, hitting on the face, etc. Each picture in the dataset captures a moment in time within the classroom, and the activities captured may vary from picture to picture. The pictures are labeled according to the activity occurring within the scene and are intended to be used to train an anomaly detection model that can identify and flag any unusual or potentially dangerous behavior that may occur within the classroom, which will help the teacher to identify the students who are not attentive while class is going on. The dataset was designed explicitly for this project, and no pre-existing datasets were used. The pictures were captured in a controlled environment, with the camera positioned to capture the entire classroom. The lighting within the classroom remained constant throughout the data collection process, and all students were aware that they were being filmed. Overall, this custom-prepared dataset provides a valuable resource for developing an effective anomaly detection model that can improve safety and assist the teacher within the classroom.
5 Results and Discussion In this section, we first outline the experimental settings and then present the results obtained. Our proposed technique was implemented on a system comprising an NVIDIA GeForce GTX 1650 Ti GPU, a 4700U series processor running at 2.60 GHz, and 8 GB of system RAM. The Python programming language with the Visual Studio Code editor was used for the implementation. A custom anomaly detection dataset was created for both the training and testing phases. A few sample images from this dataset are illustrated in Fig. 1. We opted for a custom dataset since differences in background may affect the final result. Analysis of Monitoring the Classroom: The results of student behavior recognition are presented in Fig. 2. The largest number of behaviors fell under the category of gaze, indicating that most students had a positive attitude toward the class. During the lecture, hand-raising was detected 187 times, while abnormal behaviors were observed 262 times. However, some bowing actions were incorrectly identified as monotonic actions. Figure 3 contrasts the proposed method's training and testing accuracy on our custom dataset. The CNN model performed very well on the training data and its training loss is minimal, but its testing accuracy is subpar.
Fig. 1 Classroom abnormal activities: a Fighting, b Hand raise, c Sleeping, d Talking
Fig. 2 Results of student behavior recognition
To solve this issue, we used a MobileNet-based classifier. This model performed well on the training data and had the lowest loss, and it also performed well on the test data, with a lower test loss than the CNN model. Figure 4 shows the training and testing accuracy for the MobileNet model (x-axis: epochs, y-axis: accuracy).
Fig. 3 Training and testing accuracy for CNN model (x-axis: epochs, y-axis: accuracy)
Fig. 4 Training and testing accuracy for MobileNet model (x-axis: epochs, y-axis: accuracy)
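The MobileNet-based classifier mentioned above can be approximated with a standard transfer-learning recipe; the following Keras sketch is an assumption about the setup (input size, frozen ImageNet backbone, Adam optimizer), since the paper does not publish its exact configuration. The four output units correspond to the hand raise, sleeping, talking, and fighting classes.

```python
import tensorflow as tf

def build_mobilenet_classifier(num_classes=4, input_shape=(224, 224, 3)):
    """MobileNetV2 backbone with a small classification head."""
    base = tf.keras.applications.MobileNetV2(
        weights="imagenet", include_top=False, input_shape=input_shape)
    base.trainable = False                      # freeze the pretrained feature extractor
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# model = build_mobilenet_classifier()
# history = model.fit(rgb_train, y_train, validation_data=(rgb_test, y_test), epochs=20)
```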
6 Conclusion This paper proposes a novel approach for detecting mischievous behavior in the classroom. The proposed method has been tested and evaluated on a small dataset, and the results show that it achieves high accuracy in identifying mischievous behavior. Such behavior distracts the student from learning or
interferes with the teaching process. Examples of deviant behavior include sleeping, talking to others, and being distracted. The proposed method extracts features from the video frames and captures the key characteristics of each student's behavior. These features are then used to train a machine learning model to classify the behavior as either normal or mischievous. The paper reports that the proposed method achieved high accuracy in detecting mischievous behavior, with an overall accuracy of over 90%. The results indicate that the proposed method can potentially improve behavior monitoring in classroom settings, enabling teachers to identify and address mischievous behavior in real time. By doing so, the proposed method can help create a more focused and productive learning environment, ultimately improving student performance. However, the paper acknowledges that further research is needed to evaluate the proposed method on a larger dataset and in different classroom settings. For example, the proposed method needs to be tested in classrooms with different layouts, lighting conditions, and other factors that could affect the method's accuracy. Moreover, to evaluate its generalizability, the proposed method must be tested on a more diverse dataset that includes different age groups, genders, and ethnicities. In conclusion, the proposed method offers a promising approach to detecting mischievous behavior in classroom settings using video data. Although the technique accurately identifies mischievous behavior, further research is needed to evaluate its effectiveness in different environments and on a larger dataset.
References 1. Mohan VS et al (2019) Deep rectified system for high-speed tracking in images. J Intell Fuzzy Syst 36(3):1957–1965 2. Ibrahim AA, Abosamra G, Dahab M (2018) Real-time anomalous behavior detection of students in examination rooms using neural networks and Gaussian distribution. Int J Sci Eng Res 9(10):1716–1724 3. Ren S et al (2015) Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst 28 4. Anitha G, Baghavathi Priya S (2022) Vision based real time monitoring system for elderly fall event detection using deep learning. Comput Syst Sci Eng 42(1):87–103 5. Fan Z et al (2020) Real-time and accurate abnormal behavior detection in videos. Mach Vis Appl 31:1–13 6. Guo T et al (2022) Educational anomaly analytics: features, methods, and challenges. Front Big Data 4:124 7. Li X et al (2019) 3D pose detection of closely interactive humans using multi-view cameras. Sensors 19(12):2831 8. Liao W et al (2019) A two-stage method for hand-raising gesture recognition in classroom. In: Proceedings of the 2019 8th international conference on educational and information technology, pp 38–44 9. Lin F-C et al (2021) Student behavior recognition system for the classroom environment based on skeleton pose estimation and person detection. Sensors 21(16):5314 10. Senthilkumar T, Narmatha G (2016) Suspicious human activity detection in classroom examination. In: Computational intelligence, cyber security and computational models: proceedings of ICC3 2015. Springer, pp 99–108
242
B. S. Vidhyasagar et al.
11. Si J et al (2019) Hand-raising gesture detection in real classrooms using improved R-FCN. Neurocomputing 359:69–76 12. Suja P, Tripathi S (2015) Analysis of emotion recognition from facial expressions using spatial and transform domain methods. Int J Adv Intell Paradigms 7(1):57–73 13. Gnana Jebadas D et al (2022) Histogram distance metric learning to diagnose breast cancer using semantic analysis and natural language interpretation methods. In: Trends and advancements of image processing and its applications, pp 249–259
Efficient VLSI Architectures of Multimode 2D FIR Filter Bank using Distributed Arithmetic Methodology Venkata Krishna Odugu , B. Satish , B. Janardhana Rao , and Harish Babu Gade
Abstract Hardware implementations of filter architectures must be efficient in area, power, and delay. Memory complexity is also important in 2D FIR filter architectures when they are used for image processing applications. In this work, a memory-efficient 2D FIR filter bank architecture is designed using parallel processing, symmetry, and distributed arithmetic methodology. The symmetry concept decreases the number of multipliers in the filter architecture. Parallel processing and the multimode filter bank approach improve memory efficiency through memory reuse and memory sharing, respectively. The distributed arithmetic (DA)-based filter architectures reduce multiplier complexity in terms of power and area. Four types of symmetry filter architectures along with one normal filter (without symmetry) are integrated as a multimode filter bank, and the required filter can be selected by the control logic. In this filter bank, the memory module is shared by all the sub-filters, which is called memory sharing. The LUT-based DA methodology is used to implement the multipliers of each sub-filter arithmetic module. The proposed multimode filter bank architecture is implemented for the filter length N = 4 with two-level parallel processing using Xilinx Vivado tools, and the utilization of resources is compared with works reported in the literature. The area, timing delay, and power consumption reports are generated using the Cadence Genus synthesis tool in 45 nm CMOS technology. Keywords Distributed arithmetic · Memory reuse · 2D FIR filter · Symmetry in coefficients · Low power
V. K. Odugu (B) · B. Satish · B. Janardhana Rao · H. B. Gade Department of ECE, CVR College of Engineering, Hyderabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_21
1 Introduction Image processing applications such as image restoration, sharpening, and edge detection need efficient two dimensional (2D) filter architectures. The power, area, and delay are the main VLSI design metrics. The Finite Impulse Response (FIR) filter is
more stable, simple, and modular for hardware structure design than the Infinite Impulse Response (IIR) filter. The 2D FIR filter can be implemented in four different styles: Fully Direct Form (FDF), Fully Transposed Form (FTF), Direct Transposed Form (DTF), and Transposed Direct Form (TDF). Among these styles, the FDF requires less memory than the other structures, because all the memory elements are placed on the input side only. The number of bits increases after the multiplication process in the filter, and correspondingly more memory elements are required, which increases the memory complexity. Hence, the FDF structure is considered for the implementation of the proposed 2D FIR filter architecture. The simple structure of the FDF 2D FIR filter is represented in Fig. 1. The 2D FIR filter transfer function is given by Eq. (1):

H(z1, z2) = Σ_{i=0}^{N−1} Σ_{j=0}^{N−1} c_ij z1^{−i} z2^{−j}    (1)
where [c_ij] is the filter coefficient matrix and N is the length of the filter. The filter architecture mainly consists of memory elements such as registers and shift registers, multipliers for the products of input values and filter coefficients, and adders to add the individual outputs. The multiplier is the most complex block in the filter architecture, and it can be optimized by a multiplierless design approach using the DA methodology. In the DA methodology, the multiplication process is achieved by shifters, adders, and a look-up table (LUT). The number of multipliers required for the filter architecture depends on the filter order or length N. If symmetry is considered in the filter coefficients while designing a specific filter based on the specifications, then the multiplier count will be reduced to a great extent. This reduction in multipliers will decrease the area and power consumption of the overall filter architecture. The four types of symmetries considered in the proposed work are Diagonal Symmetry (DS), Quadrantal Symmetry (QS), Four-Fold Rotational Symmetry (FRS), and Octagonal Symmetry (OS). The coefficient matrices corresponding to the above symmetries along with the normal filter are shown in Fig. 2.
Fig. 1 Basic architecture of FDF 2D FIR filter
Fig. 2 Filter coefficient matrices a normal filter, b DS, c QS, d FRS, e OS
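Functionally, the hardware of Figs. 1 and 2 evaluates the 2D sum of products of Eq. (1). As a plain software reference model (not the hardware description itself, which is coded in Verilog HDL later in the paper), a direct-form computation can be sketched as follows; the example kernel is only illustrative.

```python
import numpy as np

def fir2d_direct(x, c):
    """Direct-form 2D FIR filter: y(m,n) = sum_i sum_j c[i,j] * x(m-i, n-j)."""
    N = c.shape[0]                          # filter length (N x N coefficients)
    M, L = x.shape
    y = np.zeros_like(x, dtype=float)
    for m in range(M):
        for n in range(L):
            acc = 0.0
            for i in range(N):
                for j in range(N):
                    if m - i >= 0 and n - j >= 0:
                        acc += c[i, j] * x[m - i, n - j]
            y[m, n] = acc
    return y

# Example: a 4x4 symmetric smoothing kernel applied to a small test image.
c = np.array([[1, 2, 2, 1],
              [2, 4, 4, 2],
              [2, 4, 4, 2],
              [1, 2, 2, 1]], dtype=float) / 36.0
x = np.random.rand(8, 8)
y = fir2d_direct(x, c)
```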
1.1 Literature Review Some related existing efforts for 2D FIR filter designs are mentioned in this section. Mohanty et al. [1] suggested a unified filter bank design that makes use of multiple symmetry filters as sub-filters. There is no investigation of symmetry filter topologies in this study, and ordinary power complex multipliers are utilized. To minimize memory complexity, memory reuse and memory sharing techniques are implemented. Alawad et al. [2] proposed a memory and energy-efficient 2D FIR filter architecture using a stochastic process. In this work, the probabilistic convolution theorem is used to attain optimization in the design metrics. The ASIC design is considered for the implementation and Area Delay Product (ADP) and Power Delay Product (PDP) trade-off parameters are determined and compared with existing architectures with various filter sizes. There is no symmetry in the filter structure. Odugu et al. [3] proposed a systolic 2D FIR filter architecture using parallel processing to improve the throughput. The power consumption of the filter architecture is pruned by using low-power multipliers and adders. In this work, pipelining is introduced in the addition process of the filter to decrease the delay and power. Symmetry in the coefficients is a good feature to reduce the number of complex blocks that are multipliers. To reduce the quantity of the multipliers in the 2D filter architectures, various symmetry structures are proposed in the works [4–6]. In these works, different types of FIR and IIR filters are realized using various symmetries to reduce the power and area. There is no parallel processing, no systolic structure, and convention multipliers are used. Kumar et al. [7–9] described the parallel processing-based 2D IIR and FIR filter architectures using the DA technique. In these works, various DA techniques are addressed using parallel processing and pipelining concepts. There is no exploration of symmetry concepts. Odugu et al. [10–12] proposed power and area-efficient 2D FIR filter architectures using parallel processing and symmetry concepts. In one of the works, approximation techniques are introduced to optimize the hardware complexity and power consumption of the filter architecture. Generic filter banks are also proposed to optimize the memory complexity and various symmetry-based sub-filters are implemented using memory-based DA multipliers to attain area and power-efficient architectures. The DA-based filter architectures are called multiplierless designs to replace the conventional power hunger multipliers to reduce the power and area of the filter structures [13]. The different DA methodologies used for the filter architecture’s
246
V. K. Odugu et al.
implementation to decrease the area and power are described in [14–17]. The hardware structure realizations of various FIR filters are presented in [18]. The design concepts of 2D FIR filters using various metaheuristic algorithms to produce optimized coefficients are discussed in [19–22]. In the literature, parallel processing, symmetry structures, and DA concepts are incorporated into the filters individually. In the proposed work, two-level parallel processing is introduced to increase the throughput of the architectures and to achieve memory reuse. Next, the four symmetry structures DS, QS, FRS, and OS are implemented as sub-filters, and all these sub-filters are integrated along with a normal filter to obtain a multimode filter bank architecture. In the filter bank architecture, the memory module is commonly shared by the sub-filters, and this memory sharing reduces the memory requirement of the filter architecture. The remaining multipliers in each symmetry sub-filter are replaced by memory-based LUT multipliers. The memory-based multipliers can be realized in two ways—one stores even multiples in the LUT and the other stores odd multiples. In the proposed work, even-multiples storage is considered, and the required odd multiples are determined using one extra adder. The other sections of this paper are organized as follows: the proposed 2D FIR multimode architecture is explained in Sect. 2. The symmetry sub-filters and normal filter architectures are explored in the same section. Section 3 presents the implementation results, and Sect. 4 gives the conclusion of this work.
2 Proposed Multimode Architecture of 2D FIR Filter The proposed systolic multimode 2D FIR filter bank architecture is shown in Fig. 3. It consists of MM and AM. The MM is commonly shared by all the sub-filters. This memory sharing reduces the memory footprint of the entire filter bank architecture. The MM has a Shift Register Array (SRA) and Input Register Block (IRB); the internal structures of SRA and IRB are shown in Fig. 4a, b respectively. The AM of the filter bank consists of one general filter and four symmetry filters. The desired filter can be chosen by selection logic. Here, any one of the filter or parallel filter outputs can be selected. The normal and four types of symmetry filters such as DS, QS, FRS, and OS individual structures are explored in this section along with SRA and IRBs. Each subfilter AM mainly consists of multipliers, which are realized using the LUT-based DA methodology. Precomputed partial products of the filter coefficients corresponding to the input samples are placed in the LUT. The LUT’s size is determined by the number of bits in the input sample. To store all potential partial products for a 4bit input sample, the LUT requires 16 places. To decrease the size of this LUT, an even-multiples storage idea is used, with odd multiples decided by an external adder and multiplexer. Because of the symmetry, the two input samples must be multiplied with the same filter coefficient in the proposed filter design. As a result, the dualport LUT’s common coefficients are shared. The twin-port LUT logic reduces the
Fig. 3 Multimode 2D FIR filter bank architecture
Fig. 4 Internal structures of the a shift register array, b input register block
suggested multiplier’s space, memory, and power consumption. Figure 5 depicts the functioning of the two-port LUT-based multiplier.
Fig. 5 Proposed even multiples storage dual-port LUT multiplier
2.1 Proposed Symmetry Sub-Filters In this section, the realization of each symmetry filter and the normal filter structure using the proposed dual-port LUT-type multipliers is explored in detail. The normal filter with dual-port LUT multipliers is shown in Fig. 6; this structure needs a total of 16 multipliers. The four symmetry filters are also realized with the proposed dual-port LUT multipliers. Each symmetry reduces the number of multipliers of the filter structure when compared to the normal filter architecture. The DS, QS, FRS, and OS filter architectures are shown in Fig. 7a–d, respectively.
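The multiplier savings from coefficient symmetry come from pre-adding the input samples that share a coefficient before the (LUT) multiplication. For diagonal symmetry, where c[i][j] = c[j][i], this is sketched below under the same functional assumptions as the earlier reference model; the random kernel and window are only for the consistency check.

```python
import numpy as np

def fir2d_diagonal_symmetry(window, c):
    """One output sample of an N x N filter with diagonal symmetry c[i,j] = c[j,i].

    'window' holds the N x N input samples aligned with the coefficient grid.
    Samples sharing a coefficient are added first (pre-adders), so only
    N*(N+1)/2 multiplications are needed instead of N*N."""
    N = c.shape[0]
    acc = 0.0
    for i in range(N):
        for j in range(i, N):
            if i == j:
                acc += c[i, i] * window[i, i]
            else:
                acc += c[i, j] * (window[i, j] + window[j, i])  # shared coefficient
    return acc

# Consistency check against the plain sum of products for a random symmetric kernel.
N = 4
c = np.random.rand(N, N)
c = (c + c.T) / 2                      # enforce c[i,j] = c[j,i]
w = np.random.rand(N, N)
assert np.isclose(fir2d_diagonal_symmetry(w, c), np.sum(c * w))
```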
3 Implementation and Results The proposed symmetry architectures are individually coded by Verilog HDL and synthesized using the Xilinx Vivado tool for the target device of FPGA. The synthesized RTL schematics of four symmetry-type filters are given in Fig. 8. Next, all the symmetry filters are integrated as a filter bank and synthesized. The utilization of resources summary of the individual symmetry filters is presented in Table 1.
Fig. 6 Normal filter for N = 4 with 2-level parallel processing using proposed multipliers
Fig. 7 Symmetry filter architectures a DS filter, b QS filter, c FRS filter, d OS filter
Fig. 8 RTL schematics of a DS filter, b QS filter, c FRS filter, d OS filter
Table 1 FPGA resources utilization comparison

Filter architecture   LUT   Flip flops   LUT % utilization
Normal filter         348   63           0.65
DS filter             238   63           0.45
QS filter             195   63           0.37
FRS filter            130   63           0.24
OS filter             102   63           0.19
Table 2 Power, delay, and area results of the individual filter structures

Filter architecture   Area (µm²)   Delay (ns)   Power (mW)
Normal filter         10,278       8.97         0.324
DS filter             9556         8.019        0.289
QS filter             9213         7.95         0.207
FRS filter            8392         7.57         0.1967
OS filter             8014         7.32         0.180
Table 3 Performance metrics comparison of filter bank architectures

Architecture                          Area (µm²)   Delay (ns)   Power (mW)
Filter bank with normal multipliers   45,782       9.56         1.89
Proposed multimode filter bank        29,876       8.25         0.652
The proposed designs are then synthesized as an ASIC in 45 nm CMOS technology using the Genus synthesis tool to obtain the area, power, and delay reports. Table 2 compares the outcomes of the individual symmetry filter designs. Table 3 compares the reports for the proposed multimode 2D FIR filter bank architecture and a multimode 2D FIR filter bank architecture employing regular conventional multipliers.
4 Conclusion This paper proposes a multimode 2D FIR filter bank architecture based on parallel processing, symmetry, and multiplierless design. The principles of parallel processing and filter banks increase throughput and memory efficiency. The symmetry in the filter coefficients is employed to lower the multiplier quantity. Four forms of symmetry filter topologies are investigated in this study, and they are incorporated into the filter bank alongside a standard filter (without symmetry). The dual-port memory-based LUT multipliers minimize multiplier complexity, and each arithmetic module of the symmetry filters includes the suggested multipliers.
Individual symmetry filters and the multimode filter bank architecture are implemented, and resource utilization comparisons among the symmetry filters are performed using Xilinx Vivado tools. In 45 nm CMOS technology, Genus tools are used to generate area, delay, and power reports for the suggested architectures.
References 1. Mohanty BK, Meher PK, Amira A (2014) Memory footprint reduction for power-efficient realization of 2-D finite impulse response filters. IEEE Trans Circuits Syst I 61(1):120–133 2. Alawad M, Lin M (2017) Memory-efficient probabilistic 2-D finite impulse response (FIR) filter. IEEE Trans Multi-Scale Comput Syst 4(1):69–82 3. Venkata Krishna O, Venkata Narasimhulu C, Satya Prasad K (2019) Implementation of low power and memory efficient 2D FIR filter architecture. Int J Recent Technol Eng 8(1):927–935 4. Chen PY, Van LD, Khoo IH, Reddy HC, Lin CT (2010) Power-efficient and cost-effective 2-D symmetry filter architectures. IEEE Trans Circuits Syst I Regul Pap 58(1):112–125 5. Reddy HC, Khoo IH, Rajan PK (2003) Application of symmetry: 2-D polynomials, Fourier transform, and filter design, 3rd edn. In: Chen WK (ed) The circuits and filters handbook. CRC, Boca Raton, FL 6. Van LD, Khoo IH, Chen PY, Reddy HC (2019) Symmetry incorporated cost-effective architectures for two-dimensional digital filters. IEEE Circuits Syst Mag 19(1):33–54 7. Kumar P, Shrivastava PC, Tiwari M, Dhawan A (2018) ASIC implementation of areaefficient, high-throughput 2-D IIR filter using distributed arithmetic. Circ Syst Signal Process 37(7):2934–2957 8. Kumar P, Shrivastava PC, Tiwari M, Mishra GR (2019) High-throughput, area-efficient architecture of 2-D block FIR filter using distributed arithmetic algorithm. Circ Syst Signal Process 38(3):1099–1113 9. Kumar P, Shrivastava PC, Tiwari M, Dhawan A (2020) Realization of efficient architectures for digital filters: a survey. In: Advances in VLSI, communication, and signal processing. Springer, Singapore, pp 861–882 10. Venkata Krishna O, Venkata Narasimhulu C, Satya Prasad K (2021) An efficient VLSI architecture of 2D FIR filter using enhanced approximate compressor circuits. Int J Circ Theor Appl 49(11):3653–3668 11. Venkata Krishna O, Venkata Narasimhulu C, Satya Prasad K (2021) Implementation of low power generic 2D FIR filter bank Architecture using memory-based multipliers. J Mob Multimedia 18(3) 12. Krishna OV, Venkata Narasimhulu C, Satya Prasad K (2022) A novel filter-bank architecture of 2D-FIR symmetry filters using LUT based multipliers. Integr VLSI J 84:12–25 13. Parhi KK (1999) VLSI digital signal processing systems. Wiley, New York, USA 14. Chen PY, Van LD, Reddy HC, Khoo IH (2017) New 2-D filter architectures with quadrantal symmetry and octagonal symmetry and their error analysis. In: 2017 IEEE 60th international Midwest symposium on circuits and systems (MWSCAS). IEEE, pp 265–268 15. Mohanty BK, Meher PK, Singhal SK, Swamy MNS (2016) A high-performance VLSI architecture for reconfigurable FIR using distributed arithmetic. Integration 54:37–46 16. Meher PK (2009) New approach to look-up-table design and memory-based realization of FIR digital filter. IEEE Trans Circuits Syst I Regul Pap 57(3):592–603 17. Meher PK (2009) New look-up-table optimizations for memory-based multiplication. In: Proceedings of the 2009 12th international symposium on integrated circuits. IEEE, pp 663–666 18. Chandra A, Chattopadhyay S (2016) Design of hardware efficient FIR filter: a review of the state-of-the-art approaches. Eng Sci Technol Int J 19(1):212–226
Efficient VLSI Architectures of Multimode 2D FIR Filter Bank using …
253
19. Li L et al (2019) Metaheuristic FIR filter with game theory based compression technique—a reliable medical image compression technique for online applications. Pattern Recogn Lett 125:7–12 20. Bindima T, Elias E (2016) Design of efficient circularly symmetric two-dimensional variable digital FIR filters. J Adv Res 7(3):336–347 21. Yadav S et al (2021) A novel approach for optimal design of digital FIR filter using grasshopper optimization algorithm. ISA Trans 108:196–206 22. Sreelekha KR, Bindiya TS (2023) Design of cost effective variable bandwidth 2D low-pass, high-pass and band-pass filters with improved circularity. Digital Signal Process 133:103842
Implementation of an Efficient Image Inpainting Algorithm using Optimization Techniques K. Revathi , B. Janardhana Rao , Venkata Krishna Odugu , and Harish Babu Gade
Abstract To repair damaged images and remove particular unwanted objects from an image, optimized image inpainting techniques are required. In this work, a novel exemplar-based image inpainting technique is suggested, in which the patch priority is computed using a regularization factor and adaptive coefficients. Two optimization techniques, the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), are employed to attain the optimal values of the regularization factor and coefficients. The best exemplar patch is selected by calculating the sum of absolute differences between patches. Performance measures including peak signal-to-noise ratio (PSNR), mean square error (MSE), and Structural Similarity Index (SSIM) are evaluated by applying the suggested image inpainting process to dataset images, and the results are compared with the available inpainting methods. Keywords Image inpainting · Patch priority · GA · PSO · PSNR
1 Introduction Image inpainting is the process of restoring damaged parts of an image, or of filling a hole left by eliminating unwanted objects, in such a way that an observer who does not know the original image cannot notice the modification. Image inpainting has been studied by many researchers over the past decades. It has a variety of uses, including removing scratches from antique photographs, removing occlusions like lettering, logos, and subtitles, recovering missing blocks during image transmission, and removing objects during image editing.
K. Revathi Sphoorthy Engineering College, Hyderabad, India B. Janardhana Rao (B) · V. K. Odugu · H. B. Gade CVR College of Engineering, Hyderabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_22
1.1 Literature Survey: Image Inpainting In image restoration, the Partial Differential Equation (PDE) methods pursue the direction of the isophotes in the image. Bertalmio et al. [1] implemented an image inpainting method based on PDE models using anisotropic diffusion. PDE methods have also been implemented using fractional-order variational methods [2–4]. This technique propagates an image's Laplacians from the vicinity of the known region into the interior of the damaged region. At each pixel, the direction of diffusion is given by the normal to the image gradient, i.e., by the isophote direction. Shen and Chan [5] enhance the PDE total variation (TV) method of Bertalmio et al. [1] by utilizing the Bayesian methodology and variational models. The TV technique was used for applications involving image deblurring and denoising. Exemplar-based image inpainting also finds applications in forgery detection algorithms; Liang et al. [6] used this method very efficiently for forgery detection. Their process integrated three types of components generated from different conversion techniques, all of which are used to identify the tampered regions and to find the best matching regions to inpaint. Janardhana Rao et al. [7] proposed an enhanced priority computation method by incorporating regularization factors and adaptive coefficients. This work identified the most suitable values of the regularization factor and the corresponding combinations of adaptive coefficients to produce good inpainting results. The best matching patch for the target patch is searched by applying the sum of squared error (SSE) and sum of absolute difference (SAD). The best combination of regularization factor and adaptive coefficients is combined with SSE and SAD for recovering damaged images and removing unwanted objects from the image. Exemplar-based inpainting methods are best suited for images with regular or structured textures. These methods give visually flawless results even for large inpainting regions. The best patch selection from the source region and the computation of the highest priority patch on the target region's boundary are the two key factors that determine how effective the exemplar-based approaches are. In order to restore images containing both structure and texture information, as well as to fill sizable unknown regions, Criminisi et al. [8] suggested the first exemplar-based image inpainting method. This technique takes longer to inpaint since it avoids local similarities in the images. The selection of the appropriate patch on the boundary of the target region defines the quality of the inpainting, so, in mathematical terms, the priority function must be carefully designed. In certain cases, the priority function rapidly decreases within a few iterations of the mathematical model; this is known as the dropping effect [9]. Thereby the priority function is to be chosen carefully, and the efficiency of the exemplar-based inpainting method depends on computing the highest priority function while reducing the dropping effect. The exemplar-based methods are also utilized in the video inpainting
techniques [10–13]. In these works, the patch selection is done by various optimization techniques such as the modified artificial bee colony algorithm, gray wolf optimization, and cuckoo-search optimization. This motivated the development of an enhanced exemplar-based inpainting method with an improved patch priority function for producing better inpainting results. The priority function is expressed in terms of parameters such as the regularization factor and adaptive coefficients, and these parameters are optimized by two meta-heuristic algorithms, GA and PSO. Next, the best patch is selected by the correlation measure, the sum of absolute difference (SAD), between the highest priority patch and the patches in the source region. The remaining portions of the study are organized as follows: Sect. 2 explains the suggested image inpainting technique. Section 3 presents the optimization techniques, GA and PSO. The experimental findings of the suggested inpainting technique are described in Sect. 4. Section 5 concludes the work.
2 Proposed Image Inpainting Technique
The proposed algorithm follows the basic exemplar-based inpainting approach of Criminisi et al. [8], with the following modifications.
2.1 Highest Patch Priority Computation
The priority computation function is modified by introducing a change in the confidence term using a regularization factor (μ) and adaptive coefficients (a and b), and also by replacing the multiplication between the confidence term and the data term with addition. The new priority function is taken as

P(p) = a · GC(p) + b · D(p)                                   (1)

where a is an adaptive coefficient associated with the confidence term and b is another adaptive coefficient connected to the data term, with a ≥ 0, b ≤ 1, and a + b = 1, and GC(p) is the modified confidence term with a regularization factor (μ). The modified confidence term is taken as

GC(p) = (1 − μ) C(p) + μ                                      (2)
In the available exemplar-based inpainting algorithm, the priority function is computed by multiplying the data term with the confidence term. The dropping effect is caused by the confidence term value reducing quickly over a limited number of iterations; thereby the priority function value also drops to a low value within a few iterations, because the priority is computed as the product of the confidence term and the data term. The confidence term may be less than one, and in such cases the product of the confidence term and the data term converges to a value less than one within a few iterations. This motivated the change in the confidence term and the replacement of the multiplication between the confidence term and data term with addition. Thereby, the priority function maintains a significant value for more iterations of the process.
Optimizing the parameters in the priority function. The selection of suitable parameters, viz. the regularization factor (μ) and the adaptive coefficients (a and b), in the priority function improves the inpainting results. Using an optimization algorithm to find the optimal values of the regularization factor (μ) and the adaptive coefficients (a and b) will enhance the quality of inpainting. This motivated the use of the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) to identify the optimal groupings of the regularization factor (μ) and the adaptive coefficients (a and b) that produce good inpainting results.
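A minimal sketch of Eqs. (1) and (2) follows; the array names, the fill-front representation, and the default parameter values are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def modified_priority(C, D, front_pixels, mu=0.5, a=0.8, b=0.2):
    """Modified priority of Eqs. (1)-(2): P(p) = a*GC(p) + b*D(p),
    with GC(p) = (1 - mu)*C(p) + mu. C and D are 2-D arrays holding the
    confidence and data terms; front_pixels lists (row, col) positions
    on the target-region boundary (the fill front)."""
    priorities = {}
    for (r, c) in front_pixels:
        gc = (1.0 - mu) * C[r, c] + mu             # Eq. (2): regularised confidence
        priorities[(r, c)] = a * gc + b * D[r, c]  # Eq. (1): additive combination
    # The patch centred on the highest-priority pixel is filled first
    return max(priorities, key=priorities.get), priorities

# Toy usage with random confidence/data maps
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C = rng.random((64, 64))
    D = rng.random((64, 64))
    front = [(10, 12), (10, 13), (11, 13)]
    best, _ = modified_priority(C, D, front)
    print("highest-priority fill-front pixel:", best)
```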
2.2 Best Exemplar Patch Selection Method
The highest-priority patch is filled using patches from the source region, which are called exemplar patches. The exemplar patches from the source region are selected using a similarity measure, the sum of absolute difference (SAD), between patches of the source region and the patch identified as having the highest priority function value. The exemplar patch is the source-region patch at minimum SAD distance from the target patch:

Ψ_q' = arg min_{Ψ_q ∈ Φ} d_SAD(Ψ_p, Ψ_q)                      (3)

where d_SAD(Ψ_p, Ψ_q) is the distance using SAD, calculated over the colour channels as

d_SAD(Ψ_p, Ψ_q) = Σ_i ( |R_Ψp(i) − R_Ψq(i)| + |G_Ψp(i) − G_Ψq(i)| + |B_Ψp(i) − B_Ψq(i)| )      (4)
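The SAD-based selection of Eqs. (3) and (4) can be sketched as follows, assuming the target patch and the candidate source patches have already been extracted as equal-sized RGB arrays (all names are illustrative).

```python
import numpy as np

def sad(patch_p, patch_q, mask=None):
    """Eq. (4): sum of absolute differences over the R, G, B channels.
    mask (optional) restricts the sum to the already-known pixels of the
    target patch, as is usual in exemplar-based inpainting."""
    diff = np.abs(patch_p.astype(np.int64) - patch_q.astype(np.int64))
    if mask is not None:
        diff = diff * mask[..., None]
    return diff.sum()

def best_exemplar(target_patch, source_patches, mask=None):
    """Eq. (3): return the index of the source patch with minimum SAD."""
    distances = [sad(target_patch, q, mask) for q in source_patches]
    return int(np.argmin(distances))

# Toy usage: 9x9 RGB patches
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    target = rng.integers(0, 256, (9, 9, 3))
    candidates = [rng.integers(0, 256, (9, 9, 3)) for _ in range(100)]
    print("best exemplar index:", best_exemplar(target, candidates))
```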
3 Optimization Techniques
Two optimization techniques, viz. the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), are applied to find the optimal values of the regularization factor (μ) and the adaptive coefficients (a and b) in the priority function.
3.1 Genetic Algorithm
The Genetic Algorithm (GA) [14] is one of the evolutionary algorithms for solving search problems. It generates optimal and sub-optimal solutions to the problem. The genetic algorithm generates a random population, computes the fitness of the candidate solutions, and sorts them with respect to fitness. A new population is produced by applying crossover and mutation to the selected population. The best solution of the fitness function is computed and saved. The step-by-step GA is as follows (a minimal sketch is given after the list):
1. Initialization: set the maximum number of iterations/generations (max_Itt), the number of variables (Nv), the lower bound (LB) and upper bound (UB), the population size (Ps), and the crossover and mutation probabilities, and begin with iteration Itt = 1.
2. Generate a random population.
3. Compute the fitness function.
4. Sort the solutions according to fitness.
5. Apply crossover and mutation to produce new populations.
6. Compute the new solutions and sort them.
7. Save the best solution.
8. Repeat steps 5–7 over generations, keeping the best-fitted individuals, until a termination condition is reached.
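A compact, generic real-coded GA corresponding to the steps above is sketched below; the crossover and mutation operators and the dummy fitness function are illustrative assumptions (in the actual method the fitness would be the MSE between the original and the inpainted image).

```python
import numpy as np

def ga_optimize(fitness, lb, ub, pop_size=20, generations=50,
                crossover_rate=0.9, mutation_rate=0.1, seed=0):
    """Generic real-coded GA following the listed steps. `fitness` maps a
    parameter vector to a scalar to be minimised (here, the inpainting MSE)."""
    rng = np.random.default_rng(seed)
    nv = len(lb)
    pop = lb + rng.random((pop_size, nv)) * (ub - lb)          # step 2
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])       # step 3
        pop = pop[np.argsort(scores)]                          # step 4
        parents = pop[: pop_size // 2]                         # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):         # step 5
            p1, p2 = parents[rng.integers(len(parents), size=2)]
            child = np.where(rng.random(nv) < crossover_rate,
                             0.5 * (p1 + p2), p1)              # arithmetic crossover
            mutate = rng.random(nv) < mutation_rate
            child = np.where(mutate, lb + rng.random(nv) * (ub - lb), child)
            children.append(np.clip(child, lb, ub))
        pop = np.vstack([parents, children])                   # steps 6-8
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmin(scores)], scores.min()                # step 7: best solution

# Example: optimise (mu, a) with b = 1 - a; a dummy quadratic stands in for the MSE.
def dummy_mse(x):
    mu, a = x
    return (mu - 0.52) ** 2 + (a - 0.80) ** 2

best, best_fit = ga_optimize(dummy_mse, lb=np.array([0.0, 0.0]), ub=np.array([1.0, 1.0]))
print("best (mu, a):", best, "b:", 1 - best[1], "fitness:", best_fit)
```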
3.2 Particle Swarm Optimization
The Particle Swarm Optimization (PSO) algorithm is an iterative method for optimizing parameters. With the help of mathematical formulas, the particle velocities and positions are updated in the search space [14]. In PSO, random particle positions are generated and the possible solutions (fitness) are computed. The position best (pbest) and global best (gbest) values are found from the particles and updated. The particle velocities (Vel) and positions (Pos) are updated using Eqs. (5) and (6). The process is repeated until the maximum number of iterations (Max_Itt) is reached or another termination condition is met.

Vel_{i+1} = C [ w · Vel_i + c1 · r1 · (pbest − Pos_i) + c2 · r2 · (gbest − Pos_i) ]          (5)

where w is the inertia weight and c1, c2, r1, and r2 are parameters of PSO.

Pos_{i+1} = Pos_i + Vel_{i+1}                                                                (6)

w_i = w_max − (i / max_Itt) (w_max − w_min)                                                  (7)
The steps of the PSO algorithm are given as follows (a minimal sketch follows the list):
1. Initialize the lower bound (LB), upper bound (UB), the algorithm parameters c1, c2, w_max, and w_min, the maximum number of iterations Max_Itt, the population size (Ps), and the number of variables (Nv).
2. Initialize the particle positions randomly, Pos_j^k = LB^k + rand · (UB^k − LB^k), with j = 1 to Ps and k = 1 to Nv, and compute the velocities.
3. Find the fitness for the population.
4. Sort the population with respect to fitness.
5. Mark the solutions as position best (pbest) and global best (gbest).
6. Compute Vel_{i+1}, Pos_{i+1}, and w_{i+1} using Eqs. (5)–(7).
7. Fix the boundaries of Pos as follows: if Pos_{i+1} < LB, set Pos_{i+1} = LB; if Pos_{i+1} > UB, set Pos_{i+1} = UB.
8. Repeat steps 3–7 for all the particles.
9. Update the pbest and gbest values.
10. Repeat steps 3–9 until the maximum number of iterations Max_Itt is reached.
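A compact PSO corresponding to Eqs. (5)–(7) and the steps above is sketched below; the dummy fitness function stands in for the inpainting MSE, and all parameter values are illustrative.

```python
import numpy as np

def pso_optimize(fitness, lb, ub, pop_size=20, max_itt=150,
                 c1=1.5, c2=1.5, C=1.0, w_max=0.9, w_min=0.4, seed=0):
    """Minimal PSO following Eqs. (5)-(7); `fitness` is minimised (here it
    would be the MSE between the original and the inpainted image)."""
    rng = np.random.default_rng(seed)
    nv = len(lb)
    pos = lb + rng.random((pop_size, nv)) * (ub - lb)          # step 2
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    for i in range(max_itt):
        w = w_max - (i / max_itt) * (w_max - w_min)            # Eq. (7)
        r1, r2 = rng.random((pop_size, nv)), rng.random((pop_size, nv))
        vel = C * (w * vel + c1 * r1 * (pbest - pos)
                   + c2 * r2 * (gbest - pos))                  # Eq. (5)
        pos = np.clip(pos + vel, lb, ub)                       # Eq. (6) + step 7
        vals = np.array([fitness(p) for p in pos])             # step 3
        improved = vals < pbest_val                            # step 9
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, gbest_val

# Example: optimise (mu, a) with b = 1 - a, using a dummy stand-in for the MSE
best, best_val = pso_optimize(lambda x: (x[0] - 0.517) ** 2 + (x[1] - 0.8196) ** 2,
                              lb=np.array([0.0, 0.0]), ub=np.array([1.0, 1.0]))
print("best (mu, a):", best, "b:", 1 - best[1], "fitness:", best_val)
```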
4 Implementation and Experimental Results
Particle Swarm Optimization (PSO) is used to obtain the optimal combination of μ, a, and b to achieve good inpainting results. To accomplish this, the initial population of μ, a, and b values is generated randomly by PSO, with the analogous parameters shown in Table 1. In each iteration, the inpainting process is run for the generated values of μ, a, and b and the inpainted images are produced. The MSE between the input image and the obtained inpainted image is then calculated as the fitness function. After the predefined number of iterations, the values of μ, a, and b corresponding to the minimum MSE are taken as the optimal values for producing optimal inpainting results. The initial values considered for PSO are: maximum iterations (Max_Itt) = 150, parameters r1 and r2 chosen randomly in the range [0, 1], C = 1 (constriction coefficient), c1 = c2 = 1.5 (cognitive and social components), and population size = 20. The implementation is conducted for many combinations of μ, a, and b along with the SAD patch selection method.

Table 1 Analogous parameters of PSO for optimization of μ, a, and b values

Parameters           Analogous parameters
Particle positions   Regularization factor (μ) and adaptive coefficients (a and b)
Fitness function     Mean square error (MSE)
Table 2 PSNR, MSE, and SSIM for SAD patch selection with combinations of μ, a, and b

Regularization factor (μ)   Adaptive coefficients (a and b)   PSNR (dB)   MSE       SSIM
μ = 0.7                     a = 0.2, b = 0.8                  19.148      797.33    0.9431
                            a = 0.3, b = 0.7                  19.41       750.926   0.9521
                            a = 0.8, b = 0.2                  19.986      657.69    0.9752
                            a = 0.7, b = 0.3                  19.461      742.146   0.9546
μ = 0.5                     a = 0.2, b = 0.8                  19.223      783.92    0.9443
                            a = 0.3, b = 0.7                  19.139      799.41    0.9415
                            a = 0.8, b = 0.2                  20.03       650.4     0.9821
                            a = 0.7, b = 0.3                  19.948      663.356   0.9721
μ = 0.521 (from GA)         a = 0.7916, b = 0.2084            20.0421     649.27    0.9812
μ = 0.517 (from PSO)        a = 0.8196, b = 0.1804            20.5497     633.41    0.9856

Bold significance defines the results obtained by the proposed work
From all the combinations, a few that produce reasonably good inpainting results are noted. Further, GA and PSO are employed to identify the optimal combinations of μ, a, and b along with the SAD patch selection method. The efficiency of the suggested inpainting technique for all the above combinations is measured using performance metrics such as PSNR, MSE, and SSIM, shown in Table 2 [15, 16].
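For reference, these metrics can be computed as in the following sketch; the implementation is illustrative (scikit-image is assumed to be available for SSIM) and is not the authors' code.

```python
import numpy as np
from skimage.metrics import structural_similarity  # scikit-image assumed available

def mse(original, inpainted):
    """Mean squared error between the original and the inpainted image."""
    diff = original.astype(np.float64) - inpainted.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(original, inpainted, peak=255.0):
    """PSNR in dB for 8-bit images: 10*log10(peak^2 / MSE)."""
    m = mse(original, inpainted)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)

def ssim(original, inpainted):
    # channel_axis=-1 treats the last axis as colour channels (scikit-image >= 0.19)
    return structural_similarity(original, inpainted, channel_axis=-1, data_range=255)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    clean = rng.integers(0, 256, (128, 128, 3), dtype=np.uint8)
    noisy = np.clip(clean + rng.normal(0, 5, clean.shape), 0, 255).astype(np.uint8)
    print(f"MSE={mse(clean, noisy):.2f}  PSNR={psnr(clean, noisy):.2f} dB  SSIM={ssim(clean, noisy):.4f}")
```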
4.1 Performance Comparison with Available Algorithms for Object Removal
The best groupings of the regularization factor (μ) and the adaptive coefficients (a and b) with the SAD method that produce good inpainting results for object removal are μ = 0.521, a = 0.7916, and b = 0.2084 for GA, and μ = 0.517, a = 0.8196, and b = 0.1804 for PSO, respectively. These results are compared with algorithms available in the literature, viz. Janardhana Rao et al. [7], Criminisi et al. [8], and Wang et al. [9], as shown in Fig. 1.
Fig. 1 Comparison with available image inpainting methods for object removal: column a input image; column b results from Criminisi et al. [8]; column c results from Wang et al. [9]; column d results from Rao et al. [7]; column e results from proposed method with SAD
5 Conclusions
In exemplar-based image inpainting, the best matching patch in the source region is chosen after locating the patch with the highest priority on the target region's boundary. The enhanced patch priority function is developed by introducing the regularization factor (μ) and adaptive coefficients (a and b). The exemplar patch selection in the source region is done with SAD. The proposed method is applied to the object removal application of inpainting to identify the different combinations of μ, a, and b. Here, the μ, a, and b values are first chosen randomly. In addition, the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) are utilized for choosing the optimum values of μ, a, and b to produce good inpainting results. The suggested image inpainting technique using PSO provides better PSNR and SSIM results
when compared to GA. The results are presented with the necessary comparison against the algorithms available in the literature.
References 1. Bertalmio M, Sapiro G, Caselles V, Ballester C (2000) Image inpainting. In: Proceedings of the 27th annual conference on Computer graphics and interactive techniques, pp 417–424 2. Sridevi G, Srinivas Kumar S (2017) Image inpainting and enhancement using fractional order variational model. Defence Sci J 67(3):308–315 3. Sridevi G, Srinivas Kumar S (2019) Image inpainting based on fractional-order nonlinear diffusion for image reconstruction. Circ Syst Signal Process: 1–16 4. Sridevi G, Kumar SS (2017) P-laplace variational image inpainting model using riesz fractional differential filter. Int J Electr Comput Eng 7(2):850 5. Shen J, Chan TF (2002) Mathematical models for local nontexture inpaintings. SIAM J Appl Math 62(3):1019–1043 6. Liang Z et al (2015) An efficient forgery detection algorithm for object removal by exemplarbased image inpainting. J Vis Commun Image Representation 30:75–85 7. Janardhana Rao B, Chakrapani Y, Srinivas Kumar S (2018) Image inpainting method with improved patch priority and patch selection. IETE J Educ 59(1):26–34 8. Criminisi A, Pérez P, Toyama K (2004) Region filling and object removal by exemplar-based image inpainting. IEEE Trans Image Process 13(9):1200–1212 9. Wang J, Lu K, Pan D, He N, Bao B (2014) Robust object removal with an exemplar-based image inpainting approach. Neurocomputing: 150–155 10. Janardhana Rao B, Chakrapani Y, Srinivas Kumar S (2022) MABC-EPF: video in-painting technique with enhanced priority function and optimal patch search algorithm. Concurr Comput Pract Exper 34(11):e6840 11. Janardhana Rao B, Chakrapani Y, Srinivas Kumar S (2022) An enhanced video inpainting technique with grey wolf optimization for object removal application. J Mobile Multimedia 18(3):561–582 12. Janardhana Rao B, Chakrapani Y, Srinivas Kumar S (2022) Video inpainting using advanced homography-based registration method. J Math Imaging Vis 64(9):1029–1039 13. Janardhana Rao B, Chakrapani Y, Srinivas Kumar S (2022) Hybridized cuckoo search with multi-verse optimization-based patch matching and deep learning concept for enhancing video inpainting. Comput J 65(9):2315–2338 14. Mohammed KMC, Srinivas Kumar S, Prasad G (2015) 2D Gabor filter for surface defect detection using GA and PSO optimization techniques. AMSE J Ser Adv B 58(1):67–83 15. Venkata Krishna O, Venkata Narasimhulu C, Satya Prasad K (2021) An efficient VLSI architecture of 2D FIR filter using enhanced approximate compressor circuits. Int J Circ Theor Appl 49(11):3653–3668 16. Babu GH, Venkatram N (2020) A survey on analysis and implementation of state-of-the-art haze removal techniques. J Vis Commun Image Represent 72:102912
A Systematic Study and Detailed Performance Assessment of SDN Controllers Across a Wide Range of Network Architectures V. Sujatha and S. Prabakeran
Abstract Internet access is the primary motivating factor behind the revolutionary modern digital era. Almost everything is connected to the Internet because of the Internet of Things (IoT) concept. However, because typical IP networks have a close coupling between the data plane and the control plane, managing and configuring the network is extremely time-consuming and computationally expensive. Software-defined networking (SDN) has been recommended as a conceptual shift toward an abstracted and functionally unified network control plane that decreases the complexity of network administration. SDN makes it possible to program the network's configuration and separates the data plane from the control plane. In the control plane, the controller supervises all data plane operations as its primary component. Therefore, for optimal performance, the controller's performance and capabilities are crucial. Several controller designs for both large and small networks have been proposed in the literature, yet only a small amount of thorough quantitative examination of them exists. In this article, a comprehensive qualitative comparison of various SDN controllers and a quantitative analysis of their performance in a variety of network scenarios are presented. We classify and categorize controllers, then compare them qualitatively. Several types of network controllers are compared and contrasted, including those used in IoT, blockchain, vehicular, and wireless sensor networks. Keywords Software-defined networking · SDN controllers · Performance evaluation
V. Sujatha · S. Prabakeran (B) Department of Networking and Communication, SRMIST, Kattankulathur, Chennai, India e-mail: [email protected] V. Sujatha e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_23
1 Introduction
In a traditional IP network, devices communicate with one another using a set of well-defined protocols in order to negotiate the network's precise behavior based on the configuration of each device. In this architecture, the network administrator has to set up a substantial number of devices manually because the process is not significantly automated. While certain computer programs facilitate the work, it still requires a great deal of labor and time. The simple network management protocol (SNMP) is one example of a simple software tool; it is frequently used for gathering information and alerts, but not for managing configurations, because of a number of limitations, as listed in [1]. NETCONF is a different protocol that uses APIs to automate the configuration of network devices. NETCONF is a protocol defined by the IETF to "install, manipulate, and delete the configuration of network devices" [1]. The fact that network devices are sold by their manufacturers as closed components coupled with their operating systems, hardware, and specifications is another issue with traditional networking. There may be a number of interoperability issues when integrating devices from various manufacturers because many equipment manufacturers use their own configuration languages to comply with network protocols. Additionally, this closed design hampers the participation of various research communities in the effort to innovate computer networks. In addition, the conventional network lacks adaptability when it comes to integrating new features into an existing scenario: before a new device can be added, it must be integrated and set up to perform the new service [2].
In comparison to traditional networking, software-defined networking (SDN) has revolutionized computer networking by detaching the control plane from the data plane. The controller on the control plane interfaces with the switches in order to control them using user-defined protocols. On the other hand, routers and switches are hardware components in the data plane that transfer flows from one connection to another. It is no longer necessary to access and configure individual switches because the central controller can manage all switches. Because the network controller uses standard protocols like OpenFlow to communicate with the switches, there is no need to control the switches' forwarding behavior with programs made by a particular vendor. This makes it easier for devices from different manufacturers to work together [2]. Additionally, SDN provides a simple interface for adding new network services: the controller takes the high-level logic and translates it into the low-level forwarding rules for the switching devices, so any network service can be implemented. SDN gives the various research communities the chance to come up with their own concepts and put them into action by shifting the focus away from hardware and toward software.
2 Related Work
Since SDN's inception, a great deal of study and writing has been done on its many facets, and many implementations of the southbound interface (SBI) and northbound interface (NBI) have been started [3]. For instance, the ONF introduced OpenFlow in 2008 [4] as an initial framework for dividing networks into distinct control and data planes. Since its inception, this protocol has been held up as a benchmark for the SDN southbound interface [5]. No controller produced to date has matured into a globally accepted standard [3], with the exception of OpenFlow for the SBI and the REST API for the NBI layer [6]. Given that the main benefit of SDN is to enable designers to construct modern applications through innovation [7], the significance of the controller is evident. This shortfall may be attributed to recent controller designs failing to satisfy the specifications needed to meet the challenges of an SDN [8]; in the following, these challenges are explained concisely. Some papers identify scalability, accuracy, high availability, compatibility, and security as problems in the SDN context [8, 9]. Studies of SDN difficulties identify scalability and security as the primary obstacles, owing to the necessity of relying on a common controller. Concerning the former, additional latency is introduced when network packets are exchanged between numerous nodes and a controller, thereby lowering productivity (the efficiency indicator) and threatening the capacity of SDN networks [7]. The selection of a controller for a software-defined networking (SDN) environment must adhere to a variety of criteria, as outlined in a detailed article on controllers for SDN: support for efficiency, accessible NBI and SBI, virtualization, an easy-to-use graphical user interface (GUI), adaptability, security (specifically Transport Layer Security [TLS]), compatibility with numerous platforms, free software, and comprehensive documentation are all necessities [10]. The methods for memory management, adaptability, and multithreading used by a controller depend on its design, programming language, and extensibility [4].
3 Architecture of SDN
SDN is a technology in which a centralized controller manages all network functions rather than individual network devices establishing their own routing tables. The core architecture of a software-defined network (SDN) is composed of three independent layers: the application, the control, and the infrastructure layers [11, 12]. The structure of the SDN is depicted in Fig. 1. The infrastructure layer is the layer that sits at the very bottom of the SDN. It performs the duties of the physical layer and contains the physical components of the network, such as hubs, switches, and routers. The primary function of this layer is the transmission of packets and the management of traffic [13]. The controller in the control layer is responsible for forwarding packets in the infrastructure layer [11]. The control plane sits between the application plane and the data plane, communicating with the data plane via southbound application programming interfaces. The control layer has the ability to isolate individual programs so that they can run independently of one another [13].
Fig. 1 SDN structure
Table 1 Issues in SDN layers [14]

SDN layers          Security issues
Application plane   Lack of authentication; injection of bogus flow rules; failure to regulate access
Control plane       DoS attacks; unauthorized access to the controller; capacity and accessibility
Data plane          Fraudulent flow rules; flooding attacks; a compromised controller; TCP-based attacks; man-in-the-middle attacks
Application layer: the topmost level of an SDN architecture. It communicates with the control plane via northbound application programming interfaces. This layer uses this interaction point with the control plane to talk to other applications and programs [11, 13]. Table 1 [14] gives a short, layer-wise overview of the associated security problems.
4 Data Plane of SDN
4.1 Outline of the Data Plane
The capabilities of the data plane in an SDN-controlled network do not vary greatly from those of the data plane in a classical network with a distributed control plane [15]. Forwarding according to flow or forwarding tables is the principal function of the data plane, which is responsible for moving packets from their input ports to the desired output ports while they are in flight. The throughput, i.e., the total number of packets or bytes that may
be forwarded in a given amount of time, is the basic efficiency metric of the data plane. For this metric to scale, forwarding must be carried out with the assistance of dedicated hardware at each network switch.
4.2 Forwarding Logic
In a conventional distributed network, the data plane relies on forwarding tables that are populated by the control plane to indicate to which output port each packet should be sent. Forwarding tables are typically straightforward, with only two fields: the first contains a destination address range, and the second contains the associated link or output port. The destination address range values in the forwarding table are matched against the address obtained from the destination address field in the header of each datagram. In cases where more than one entry matches the destination address, the longest prefix is selected, because it corresponds to a narrower and less ambiguous subnet. The SDN data plane, by contrast, uses an enhanced, more complex version of forwarding tables known as flow tables. These tables enable the data plane to make more complex decisions based on any of the information contained within a packet. Unlike forwarding table entries, flow table entries have an additional field, increasing the total to three (the third field typically holds per-flow counters). The first field contains the matching rules for incoming packets; these rules can be applied to any part of the incoming datagram, including its payload, unlike the rules in forwarding tables, which apply only to the destination address field in the header of each datagram. The second field of a flow table entry contains the actions to be carried out on an incoming packet that matched the corresponding rule field; these actions may include forwarding the packet to an output port, modifying datagram fields, and even dropping the packet entirely.
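To make the difference between a conventional forwarding table and an SDN flow table concrete, the sketch below models a flow entry as a set of match fields plus an action; the field names and the two sample rules are illustrative only.

```python
# Minimal model of an OpenFlow-style flow table: each entry carries match
# fields (any header field may be used, not just the destination address),
# an action, and a priority; the highest-priority matching entry wins.
from dataclasses import dataclass

@dataclass
class FlowEntry:
    match: dict                 # e.g. {"ip_dst": "10.0.0.2", "tcp_dst": 80}
    action: str                 # e.g. "output:2", "drop", "set_field:..."
    priority: int = 0

def lookup(flow_table, packet):
    """Return the action of the highest-priority entry whose match fields
    are all present in the packet with the same values (table miss -> drop,
    or in practice a packet-in sent to the controller)."""
    candidates = [e for e in flow_table
                  if all(packet.get(k) == v for k, v in e.match.items())]
    if not candidates:
        return "drop"
    return max(candidates, key=lambda e: e.priority).action

table = [
    FlowEntry({"ip_dst": "10.0.0.2"}, "output:2", priority=10),
    FlowEntry({"ip_dst": "10.0.0.2", "tcp_dst": 80}, "output:3", priority=100),
]
pkt = {"ip_src": "10.0.0.1", "ip_dst": "10.0.0.2", "tcp_dst": 80}
print(lookup(table, pkt))       # -> output:3 (more specific, higher priority)
```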
5 Control Plane of SDN
Core of the controller: The core of the controller generates a virtual perception of the network for the application layer while maintaining the global view of the network and interacting with the underlying switches. For instance, as depicted in Fig. 2, communication between devices, which is marked as the global view, involves a number of switches. The controller acquires this global view and creates the abstract view of the network. The user can write a control program by accessing the abstract network view. As depicted in Fig. 2, the controller may contain multiple modules for performing particular tasks. The topology of the network is created by the topology manager, whose link discovery module regularly interacts with the switches. The topology manager's network
Fig. 2 SDN controller architecture
topology is used by the other modules, such as the decision module, to discover the best routes among nodes. Updated tables are communicated to the data plane through direct interaction with its flow tables. The controller may also incorporate a queue manager and storage for various reports and information. Interfaces: A variety of APIs surround the controller for interacting with the various layers. The controller uses the southbound interface (SBI) to communicate with the controlled switches, whereas the application layer uses the northbound interface (NBI) to connect with the controller; when multiple controllers are deployed, east-westbound APIs can be used for the controllers to interface with one another. The OpenFlow protocol is the most commonly used SBI, but more recent controllers also support other protocols in addition to OpenFlow. The REST API is the most widely used protocol that controllers support in the case of the NBI.
5.1 SDN Controllers
Network designs are being implemented with the help of Ryu, POX, Floodlight, ONOS, and OpenDaylight (ODL), all of which are open-source SDN controllers [16]. Figure 2 depicts the SDN controller's architecture; the diagram also shows the modules that provide the primary functionality of the controller, including the southbound interface (SBI) and the northbound interface (NBI).
POX POX is an open-source controller written in Python that can be used to build SDN applications with OpenFlow. OpenFlow devices can also be turned into load balancers, switches, firewalls, and other network devices by a POX controller. The POX controller can directly access and manipulate the forwarding devices when the OpenFlow protocol is present. It is suitable for experiments, demonstrations, and research because it is quick and simple. POX is based on the idea that every SDN network activity and device is a separate component that can be used separately at
any time and anywhere. It is in charge of establishing any kind of communication between SDN devices and applications [17].
Ryu The Ryu controller is an open-source, component-based SDN framework written entirely in Python. This enables an event-driven software development paradigm, in which the behavior of the program is determined by occurrences, and it also makes it possible to use the OpenFlow protocol to adjust how traffic flow management is handled on a network by interacting with switches. The Ryu event module exports event classes describing the messages received from connected switches. Ryu makes it simple to develop control applications and manage SDN networks by providing software components with clearly defined APIs, and the designed network can also be viewed in the user interface. OpenStack Quantum, a firewall, and the OpenFlow REST API (OF-REST) are just a few of the Ryu components that can be used in SDN applications. Numerous applications of this type use the controller to discover the network, execute algorithms for evaluating the gathered information, and subsequently use the controller's logic to install new rules. In addition, Ryu is compatible with a number of network infrastructure management protocols, including OpenFlow, NETCONF (RFC 6241), and OF-Config. All OpenFlow versions (1.0–1.5) are fully supported by Ryu [18].
Floodlight Floodlight, a Java controller derived from the Beacon controller developed at Stanford University and maintained by Big Switch Networks [19, 20], works with both the physical equipment and its virtual representation through OpenFlow. The controller is event-driven, asynchronous, thread-based, and Java-based, and employs synchronized locking. Switch, gateway, statistics, flow-pusher, routing, and load-balancer applications are some of the basic applications available. The open-community Floodlight controller targets networks with large numbers of switches. The router uses OpenFlow methods to structure traffic over the network in a manner that establishes the network's operational setting. It is in charge of ensuring that all network rules are up to date and of telling the underlying hardware what action to take in the event of congestion. The organization benefits from improved network control and more adaptability in the face of change. It is simple to design and test modules in a simulated environment with Floodlight, as it is compatible with a wide range of virtual switches.
ONOS The Open Networking Operating System (ONOS) is a software-defined networking (SDN) controller for creating applications for next-generation networks. With ONOS, we do not have to use physical switches to simulate network conditions for testing; instead, we can manipulate the network in real time. As a cloud-ready solution, ONOS encourages innovation by lowering the barrier to entry for developing network apps. ONOS is an open-source project that can be used to create applications that make use of software-defined networks. The ONOS system is an OS for
managing network devices. With ONOS, we can manage a wide variety of hardware components in SDNs. It is hosted by the Linux Foundation and is currently used by many of the largest companies in the networking sector.
ODL The OpenDaylight controller is a piece of software that operates on the Java Virtual Machine (JVM), making it compatible with any platform that can run Java. The controller, an implementation of the SDN model, uses the following tools. Maven: OpenDaylight makes use of Maven for build automation. Maven scripts the dependencies between bundles and specifies which bundles to load and start using pom.xml (Project Object Model) files. OSGi: the OSGi framework acts as the backend for OpenDaylight by importing bundle JAR files and connecting bundles to one another for the purpose of information sharing.
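As an illustration of the component-based, event-driven style described above, the following is a minimal Ryu application (not taken from the paper) that floods every packet it receives, i.e., a simple hub; it would be launched with ryu-manager.

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

class SimpleHub(app_manager.RyuApp):
    """Flood every packet-in out of all ports (hub behaviour)."""
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def packet_in_handler(self, ev):
        msg = ev.msg
        datapath = msg.datapath
        ofproto = datapath.ofproto
        parser = datapath.ofproto_parser
        actions = [parser.OFPActionOutput(ofproto.OFPP_FLOOD)]
        # Only attach raw packet data when the switch did not buffer it
        data = msg.data if msg.buffer_id == ofproto.OFP_NO_BUFFER else None
        out = parser.OFPPacketOut(datapath=datapath,
                                  buffer_id=msg.buffer_id,
                                  in_port=msg.match['in_port'],
                                  actions=actions,
                                  data=data)
        datapath.send_msg(out)
```

Running `ryu-manager simple_hub.py` starts the controller; a learning switch would additionally record the source MAC-to-port mapping and install flow entries instead of flooding.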
6 Methodology
SDN network simulation can be practiced with Mininet, a Linux-based emulator. Being a lightweight virtualization emulator, it can be used to model SDN infrastructures. It allows a computer to simulate a network architecture with virtual routers, switches, hosts, and links for the purpose of conducting network experiments, and it runs in a Linux environment. The Mininet emulator can provide a scalable and customizable network structure, and integration with Python and Java programs is possible as well. As far as the user is concerned, a Mininet virtual host behaves like an actual machine; using a Secure Shell (SSH) tunnel, it can launch programs remotely. Mininet generates virtual hosts based on virtualization and network namespaces, and this functionality can transfer network packets through an interface just as over real Ethernet. The Mininet emulator makes it simple to emulate link attributes such as speed and latency. Packets are processed by imitating the behavior of a hardware Ethernet switch using preset queueing and buffering rules; the Mininet emulator's virtual switches are OpenFlow switches. With the assistance of Mininet-Wi-Fi, an enhancement of Mininet, SDN networks may now also be set up using wireless connections, which was previously not possible; a wireless Mininet can be set up by joining a number of wireless endpoints and nodes to an existing Mininet. Because of the way it was developed, neither the features nor the structure of SDN can be altered [21]. Hosts, switches, interconnections, and SDN/OpenFlow controllers must be modeled. Mininet [1] enables the creation of topologies with tens of thousands of nodes and the simple execution of tests on them. It has straightforward command-line tools and an API, and it facilitates the creation, customization, sharing, and testing of SDN networks. A comparison of the SDN controllers is given in Table 2.
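A minimal sketch of how such a topology might be created with Mininet's Python API and attached to a remote controller is shown below; the IP address, port, and topology are illustrative, and the script must be run with root privileges.

```python
#!/usr/bin/env python
"""Illustrative Mininet script: a single-switch topology with two hosts
attached to a remote SDN controller (e.g. Ryu or POX) on the local machine."""
from mininet.net import Mininet
from mininet.node import RemoteController
from mininet.topo import SingleSwitchTopo
from mininet.log import setLogLevel

def run():
    topo = SingleSwitchTopo(k=2)                       # 1 switch, 2 hosts
    net = Mininet(topo=topo,
                  controller=lambda name: RemoteController(name,
                                                           ip='127.0.0.1',
                                                           port=6633))
    net.start()
    net.pingAll()                                      # quick connectivity check
    h1, h2 = net.get('h1', 'h2')
    print(h1.cmd('ping -c 3', h2.IP()))                # run a command on a virtual host
    net.stop()

if __name__ == '__main__':
    setLogLevel('info')
    run()
```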
Table 2 SDN controllers' comparison [22]

Controller name   Programming language   Platform support          Developer
POX               Python                 Windows, Linux and MAC    Nicira Networks
Ryu               Python                 Linux                     Nippon Telegraph and Telephone (NTT)
Floodlight        Java                   Linux, MAC and Windows    Big Switch Networks
OpenDaylight      Java                   Linux                     Cisco and OpenDaylight
ONOS              Java                   Linux                     ONF
Table 3 Mininet characteristics

Characteristic   Description
Flexibility      The implementation and features of networks can be carried out using programming software codes
Scalability      The modeling environment must be able to accommodate a huge network
Shareable        Prototypes should be simple to share with other contributors
Applicability    Conducted prototypes should also be usable in real networks
Realistic        Prototype behavior should accurately mimic real-time behavior
Interactivity    The management and running time of the simulated network must match that of real-time networks
Mininet Common Environment (CE): Mininet CE helps to make Mininet scalable. Plain Mininet is not suitable for creating very large networks; it works well for low-traffic networks of limited size. Mininet CE can support large-scale networks, and new hosts can be connected to an existing network without disrupting its operation. Mininet's characteristics are summarized in Table 3 [21].
7 Results and Discussion
In this paper, configurations are created using the Mininet emulator, and the efficiency analysis of the controllers is performed using two QoS variables: delay performance with a varying number of switches using CBench, PktBlaster, and OFNet, and throughput performance with a varying number of switches using CBench and PktBlaster. The configurations are constructed using the sudo mn command, specifying the configuration's designation and the controller's IP address. By changing the number of switches, all controllers are compared independently for every tool; the results obtained are shown graphically, and a comparison of related studies is given in Table 4. Figure 3 depicts the calculated delay for each of the nine unique controllers that were evaluated as the
number of switches in the architecture increased. Figure 3a demonstrates that the effectiveness of all controllers follows an identical pattern, while Fig. 3b reveals a significant difference between the two groups: Nox-Verity, POX, and Ryu exhibit a less pronounced increase in latency at higher switch densities than the other controllers. The flow installation time for PktBlaster is shown in Fig. 3b, and a trend similar to that of CBench can be observed. In the third analysis, shown in Fig. 3c, the OFNet flow measurements can be seen; one key point to make about the OFNet reports is that the results are time-stamped. Separate simulations are conducted by varying the number of switches. It can be seen that POX, Ryu, and Nox-Verity perform better in terms of flow installation than ONOS, Floodlight, and ODL, which install flows much more slowly.
Throughput performance: only CBench and PktBlaster are used to calculate the throughput performance metric, as shown in Fig. 4. The results from CBench are displayed in Fig. 4a. It is clear that ODL, Floodlight, and ONOS perform noticeably better than their counterparts Nox-Verity, POX, and Ryu, which show the worst performance. Figure 4b illustrates how well PktBlaster performs in various scenarios. When all controllers are compared, Floodlight, ODL, and ONOS are the best, whereas Nox-Verity and POX have the lowest performance. The difference between these two groups is about 250 in this instance.

Table 4 Comparison between different methodologies [19]
Reference   Controller                   Topology              Performance                              Tools
[10]        POX, Ryu, ONOS, Floodlight   Linear                Delay, jitter, throughput                iPerf, Ping
[11]        Mininet controller           Single                Delay, jitter                            iPerf, Ping
[12]        Ryu                          Single                Delay, throughput                        iPerf, Ping, Wireshark
[13]        POX                          Single, linear, tree  Delay, throughput, packet loss           Wireshark
[23]        Ryu                          Tree                  Delay, jitter, throughput, packet loss   Wireshark, iPerf, Ping
[14]        ODL, ONOS                    Single, linear, tree  Delay, jitter, throughput, packet loss   D-ITG
[12]        Ryu                          Linear                Delay, throughput                        Cbench, Wireshark
[12, 13]    POX, Ryu                     Single, linear, tree  Delay, jitter, throughput, packet loss   D-ITG
Fig. 3 Latency performance with varying number of switches a CBench b PKblaster c OFNet [12]
Fig. 4 Throughput performance measured against number of switches a CBench b PKblaster [12]
8 Conclusion and Future Scope
In this paper, we present the packet flows of an OpenFlow switch and briefly discuss the available controllers. Separate types of controllers are defined, and criteria for judging them are discussed. The comparison and other evaluation parameters lead us to the conclusion that Floodlight provides significantly faster and better results in terms of packet transmission times (31 times faster). Based on our research, the Floodlight controller has the disadvantage that it uses considerably more memory during execution than the POX controller or the Python-based builds we compared it to. When compared to the rest of the previously mentioned controllers, McNettle has the additional advantage that it can be run on multicore servers. OpenDaylight stands out from other similar systems because of its distributed flat architecture and the flexibility it affords. Security problems and their solutions are one potential future area of focus for expanding SDN controllers' functionalities.
References 1. Mousa M, Bahaa-Eldin AM, Sobh M (2016) Software defined networking concepts and challenges. In: 2016 11th International conference on computer engineering systems (ICCES), pp 79–90. https://doi.org/10.1109/ICCES.2016.7821979 2. Godanj I, Nenadic K, Romi´c K (2016) Simple example of software defined network. In: 2016 International conference on smart systems and technologies (SST), pp 231–238. https://doi. org/10.1109/SST.2016 3. Saleh A, Bhargavi G (2017) Experimenting with scalability of floodlight controller in software defined networks. In: Proceedings of the 2017 international conference on electrical, electronics, communication, computer, and optimization techniques (ICEECCOT 2017). Institute of Electrical and Electronics Engineers Inc., United States of America, pp 288–292. 4. Salman O, Elhajj IH, Kayssi A, Chehab A (2016) SDN controllers: a comparative study. In: 2016 18th Mediterranean electrotechnical conference (MELECON), pp 2, 3, Apr 2016. https:// doi.org/10.1109/melcon.2016.7495430 5. Open Networking Foundation (ONF) (2016) ONF SDN evolution. [Online]. Available: https:// opennetworking.org/wpcontent/uploads/2013/05/TR-535ONFSDNEvolution.pdf. Martin Casado. Origins and evolution of OpenFlow/SDN. Martin Casado. YouTube, 25 Oct 2011, 25 Oct 2020. [Online]. Available: https://www.youtube.com/watch?v=4Cb91JTXb4t=1160s 6. Kumar A, Goswami B, Augustine P (2019) Experimenting with resilience and scalability of wifi mininet on small to large SDN networks. Int J Recent Technol Eng 7(6S5):201–207 7. Hameed SS, Goswami B (2018) SMX algorithm: a novel approach to avalanche effect on advanced encryption standard AES. In: Proceedings of the 12th INDIACom. IEEE, New Delhi, India, pp 727–232 8. Benzekki K, El Fergougui A, Elbelrhiti Elalaoui A (2016) Software defined networking (SDN): a survey. Secur Commun Netw 9(18):9. https://doi.org/10.1002/sec.1737 9. Goswami B, Wilson S, Asadollahi S, Manuel T (2020) Data visualization: experiment to impose DDoS attack and its recovery on software defined networks. In: Anouncia S, Gohel H, Vairamuthu S (eds) Data visualization. Springer, Singapore. https://doi.org/10.1007/978-98115-2282-6_8 10. Rawat DB, Reddy SR. Software-defined networking architecture, security and energy efficiency: a survey. IEEE Commun Surv Tutorials 19(1) 11. Priyadarsini M, Bera P (2021) Software defined networking architecture, traffic management, security, and placement: a survey. Comput Netw 192:108047. https://doi.org/10.1016/j.com net.2021.108047 12. Betts M, Davis N et al. https://opennetworking.org/wpcontent/uploads/2013/02/TRSD-NAR CH1.006062014.pdf, June 2014 13. Hohlfeld O, Kempf J, Reisslein M, Schmid S, Shah N (2018) Guest editorial scalability issues and solutions for software-defined networks. IEEE J Sel Areas Commun 36(12):2595–2602 14. Benzekki K, El Fergougui A, Elbelrhiti Elalaoui A (2016) Software defined networking (SDN): a survey. Secur Commun Netw 9(18):5803–5833. Available: https://doi.org/10.1002/sec.1737. Accessed 17 Nov 2021 15. Montazerolghaem A (2020) Software-defined load-balanced data center: design, implementation and performance analysis. Cluster Comput 24(2):591–610. Available: https://doi.org/10. 1007/s10586-020-03134-x 16. Hamdan M et al (2020) Flow-aware elephant flow detection for software defined networks. IEEE Access 8:72585–72597. https://doi.org/10.1109/ACCESS.2020.2987977 17. Cabarkapa D, Rancic D (2021) Performance analysis of Ryu-POX controller in different treebased SDN topologies. Adv Electr Comput Eng 21(3):31–38. 
https://doi.org/10.4316/AECE. 2021.03004 18. Shirvar A, Goswami B (2021) Performance comparison of software-defined network controllers. In: 2021 International conference on advances in electrical, computing, communication and sustainable technologies (ICAECT), pp 1–13. https://doi.org/10.1109/ICAECT 49130.2021.9392559
19. Askar S, Keti F (2021) Performance evaluation of different SDN controllers: a review. Int J Sci Bus 5(6):67–80. https://doi.org/10.5281/zenodo.4742771 20. Zavrak S, Iskefiyeli M (2017) A feature-based comparison of SDN emulation and simulation tools. In: International conference on engineering technologies, pp 214–217 21. https://medium.com/@dishadudhal/performance-evaluation-of-sdn-controllers-usingcbench-and-iperf-e9296f63115c 22. https://www.semanticscholar.org/paper/Survey-and-Comparison-of-SDN-Controllers-forand-Quincozes-Soares/8ac176423a3d348c4c5ad01655a05c678d8324a3 23. Rana DS, Dhondiyal SA, Chamoli SK (2019) Software defined networking (SDN) challenges, issues and solution. Int J Comput Sci Eng 7(1):882–889
Classification and Localization of Objects Using Faster RCNN Bhavya Sree Dakey, G. Ramani, Md. Shabber, Sirisha Jogu, Divya Sree Javvaji, and Hasini Jangam
Abstract On the basis of modern technological advancements, novel and sophisticated algorithms have been devised. The progress in object detection technologies, such as the Fast and Faster RCNN algorithms, has resulted in reduced detection time for objects/entities, coupled with high precision levels. The present study examines the efficiency of a proposed algorithm that combines an RPN with Fast RCNN for detection. With the RPN's region proposals serving as the input ROIs for the RCNN network, a potential way to unify the components is to create a single network in which convolutional features are shared in order to recognize a specific object within each image. By employing a unified network, the requirement to obtain regions of interest (ROIs) from an external network is eliminated, making the process cost-effective. The efficiency and accuracy of object detection have been crucial topics in the development of computer vision systems, and with the advent of deep learning techniques, there has been significant progress in the precision of object detection. The proposed project incorporates classification and localization for object detection. Precisely, the system takes an image as input and produces an output that consists of bounding
G. Ramani, Md. Shabber, Sirisha Jogu, Divya Sree Javvaji, Hasini Jangam: These authors contributed equally to this work. B. S. Dakey (B) · G. Ramani · Md. Shabber · S. Jogu · D. S. Javvaji · H. Jangam Department of CSE, B V Raju Institute of Technology, Narsapur, Medak, Telangana 502313, India e-mail: [email protected] G. Ramani e-mail: [email protected] Md. Shabber e-mail: [email protected] S. Jogu e-mail: [email protected] D. S. Javvaji e-mail: [email protected] H. Jangam e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_24
boxes for each object present in the image, along with corresponding information about the class of each object contained within the respective bounding box. Keywords Object detection · Faster-regions with convolution neural networks (RCNN) · Region Proposal Networks (RPNs) · Region of interest pooling
1 Introduction
Image classification and object detection are two scenarios that are frequently confused. Image classification is commonly employed when attempting to categorize a picture into a certain category. Object detection, on the other hand, is used to identify the positions of objects in a picture and count the number of instances of a specific object. Object recognition is an essential aspect of computer vision. Initially, object identification relied on digital image processing techniques, which include edge detection, recognition by components, and gradient matching. The introduction of deep neural networks has revolutionized numerous fields, bringing about substantial progress and breakthroughs in different areas, and with recent advancements in this field, object identification has become more precise and can even be deployed in real time. As the number of convolutional layers increases, object detection accuracy also improves. Traditionally, detection and classification tasks are handled as separate processes. For example, detection relies on Region Proposal Networks (RPNs) [1] to identify regions of interest (ROIs), while classification tasks are performed using RCNN [2]. The region proposals are generated and then processed independently, with Faster RCNN being a notable example [3]; such pipelines incorporate methods such as selective search and greedy merging. Nevertheless, the time taken for the region proposal phase is comparable to that of the detection network, resulting in considerable processing overhead. While both methods can be expensive on their own, sharing convolutions between them can provide a cost-effective solution. In this project, the RPN and RCNN modules are merged into a unified approach, which simplifies the procedure significantly. Integrating the RPN [1] and the RCNN module can be an efficient way to balance detection time and accuracy. The RPN is trained by applying convolutional layers to the convolutional feature maps, which are then utilized by detectors like Faster RCNN. This results in a fully convolutional network that generates the detection region proposals and can be learned in the same way as the detection network. Considering the diversity of the classified images, it becomes crucial to have region proposals that can cover an expansive range of aspect ratios and sizes. To address this, instead of using fixed-size filters over the image, we employ "anchor" boxes. These anchor boxes encompass multiple reference regions, allowing us to consider boxes with different aspect ratios and sizes simultaneously. During the training process, the region proposal (RPN) and detection (Faster RCNN) modules are fine-tuned alternately, converging them into a single network capable of performing the detection task and producing the desired output
[2, 4, 5]. As a result, this approach not only reduces detection time but also mitigates the computational cost associated with the selective search method. In deep learning, the number of layers plays an indispensable role in regulating the quality of the extracted features. The proposed system is built using Faster RCNN and an RPN.
1.1 Abbreviations and Acronyms
1. RCNN—Regions with Convolutional Neural Networks.
2. RPN—Region Proposal Network.
3. ROI—Region of Interest pooling.
4. SPP—Spatial Pyramid Pooling.
2 Problem Statement
Software that provides more efficient and accurate object classification and localization in images or videos when trained with Faster RCNN.
3 Related Work
Deep neural networks have gained immense popularity, particularly after the introduction of AlexNet, an eight-layer convolutional neural network that achieved high accuracy ratings. There are a few techniques available for object classification and localization; one among them uses a CNN to generate image windows and learn the background and foreground pixels for the entire object, encompassing its upper, lower, and both sides [1]. Afterward, bounding boxes are utilized to refine the predicted masks. In the Faster RCNN approach, however, these region proposals serve the purpose of classifying and identifying objects. Neural networks for object detection have seen significant advancements, with the region-based convolutional neural network (RCNN) being one of the prominent techniques for identifying objects in images; it combines features of convolutional neural networks with region proposals generated through a bottom-up approach. To obtain region proposals, RCNN employs the selective search algorithm, which sources proposals from an external method. Object proposal methods play a crucial role in object detection, and among the available techniques, Deformable Part-based Model (DPM) detectors can make use of several methods, including selective search, greedy suppression approaches,
and parametric min cuts. These techniques aid in efficiently identifying potential regions of interest (ROIs) for further processing and classification in object detection systems; however, such detectors with constrained parameters struggle to accurately distinguish between many object categories. An alternative to relying on external proposal methods is the OverFeat approach, in which a fully connected layer is trained to provide box references for a single class. In contrast, the Region Proposal Network (RPN) method replaces the fully connected layer with a convolutional layer. This convolutional layer generates region proposals for multiple classes, making the RPN approach more versatile and efficient in identifying regions of interest across different object classes and allowing objects of multiple classes to be detected simultaneously, avoiding the issues seen with DPM. The ROIs produced by the RPN are class-agnostic, which allows the Fast RCNN network to perform object recognition from these ROIs [1–5].
4 Proposed Methodology
Object detection using Faster RCNN [6]: Faster RCNN differs from Fast RCNN in that it replaces the external region proposal method with an RPN; the RPN thus serves as a crucial component of the architecture, making Faster RCNN a modified version of the original Fast RCNN approach. By incorporating the RPN as a convolutional network, the modified Fast RCNN gains the ability to perform region proposal and detection simultaneously, leading to improved efficiency and accuracy in object detection tasks. This integration of the RPN module within the Fast RCNN framework has been a significant advancement in deep-learning-based object detection. Fast RCNN itself is an improved network designed to address the computational inefficiency present in its predecessor, the region-based convolutional neural network (RCNN). The main improvement in Fast RCNN lies in avoiding the repeated computation of convolutional features for each proposed region of interest. In Fast RCNN, instead of applying the convolutional layers separately to each region proposal, the entire image is processed through the convolutional layers only once. This shared computation significantly speeds up the overall process and makes the Fast RCNN model more efficient for object detection tasks. The key aspect of the Fast RCNN algorithm is the sharing of computation. After obtaining the region proposals, the corresponding regions are refined by bounding box regression. By feeding each warped region directly into the CNN, the RCNN algorithm requires 2000 forward passes if there are 2000 proposals, which can be time-consuming [6–10]. The relationship between the region proposals can be utilized to reduce computation time: since many proposals overlap with each other, their common areas are computed repeatedly in the CNN. With Fast RCNN, these computations are performed only once. The reason this works is that the convolutional layers do not alter the spatial relationship among neighboring pixels, so the input image coordinates can be mapped to the corresponding neurons in the convolutional feature map. It is
possible to compute the convolutional features of the entire image just once using the CNN. Once these features have been obtained, in Fast RCNN the region proposals are passed through the region of interest (ROI) pooling layer, which can be viewed as a form of Spatial Pyramid Pooling (SPP) layer [8, 9]. The ROI pooling layer allows for flexible and efficient extraction of fixed-size feature maps from variable-sized region proposals. These feature maps are then fed into a fully connected layer for further processing, enabling accurate object classification and bounding box regression. By employing the ROI pooling layer, Fast RCNN achieves significant improvements in both speed and accuracy compared to its predecessor, RCNN.
Region Proposal Network: As an alternative to external region proposals, the RPN takes the image feature map as its input and generates rectangular object proposals, each with an associated objectness score. This is achieved by applying a sliding-window approach, as shown in Fig. 1, where each n × n window (here n = 3) is mapped to a lower-dimensional feature. When this window is projected back onto the original input (i.e., when the feature-map coordinates are mapped to the original input size), the resulting receptive field is substantially large. To compute the intermediate features, a 3 × 3 × 256 convolutional kernel is applied to the feature map, from which a 256-dimensional intermediate layer is obtained. The intermediate layer is then passed through two separate branches: one for object classification and the other for box regression. Using a pyramid of anchors, k anchor boxes with varying sizes and aspect ratios are centered at each sliding-window position [10], resulting in W × H × k total anchors, where W is the width of the feature map, H is its height, and k is the number of anchors per position. This method is cost-effective and enables efficient collection of region proposals.
Fig. 1 In this context, multiple anchor box references with ratios of 1:1, 1:2, and 2:1 were applied to the image, with a sliding window of 3 × 3 moving across the image
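The anchor mechanism can be illustrated with a short sketch that enumerates W × H × k anchors over a feature map; the stride, scales, and ratio convention used here are assumptions for illustration, not values reported by the authors.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Generate W x H x k anchors (k = len(scales) * len(ratios)) as
    (x1, y1, x2, y2) boxes centred at every feature-map position;
    `stride` maps feature-map cells back to image pixels."""
    base = []
    for s in scales:
        for r in ratios:                # ratio interpreted here as width / height
            w = s * np.sqrt(r)
            h = s / np.sqrt(r)
            base.append([-w / 2, -h / 2, w / 2, h / 2])
    base = np.array(base)               # (k, 4) anchors centred at the origin

    shift_x = (np.arange(feat_w) + 0.5) * stride
    shift_y = (np.arange(feat_h) + 0.5) * stride
    sx, sy = np.meshgrid(shift_x, shift_y)
    shifts = np.stack([sx.ravel(), sy.ravel(), sx.ravel(), sy.ravel()], axis=1)

    # Broadcast: every centre gets all k base anchors -> (feat_h*feat_w*k, 4)
    return (shifts[:, None, :] + base[None, :, :]).reshape(-1, 4)

anchors = generate_anchors(feat_h=38, feat_w=50)   # e.g. a 600x800 input with stride 16
print(anchors.shape)                               # (38*50*9, 4) = (17100, 4)
```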
5 Implementation

5.1 Training

To train the RPN network, it is necessary to adjust the weights and minimize the loss function.
5.2 Weight Initialization

Before training the algorithm, the actual values of the weights are unknown. However, assuming symmetry and proper data normalization, it may be anticipated that, on average, roughly fifty percent of the weights will be negative and the other fifty percent positive. Setting all weights to zero might seem like a reasonable starting point, but it leaves no imbalance between neurons for training to exploit. To break the symmetry, the weights of the neurons are instead initialized to small random values. This ensures that each neuron behaves uniquely at the outset of training; as training progresses, the weights update themselves to fit their respective roles in the overall network.
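A minimal NumPy sketch of the idea: an all-zero initialization leaves every neuron computing the same output and receiving the same gradient, whereas small random values break the symmetry. The layer dimensions and the 0.01 scaling factor below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 256, 128          # illustrative layer dimensions

# Symmetric (problematic) initialization: every neuron starts identical,
# so every neuron receives the same update and never differentiates.
w_zero = np.zeros((fan_in, fan_out))

# Symmetry-breaking initialization: small random values around zero,
# roughly half negative and half positive on average.
w_random = 0.01 * rng.standard_normal((fan_in, fan_out))

print(np.mean(w_random < 0))        # ~0.5, as anticipated in the text
```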
5.3 Loss Function

RPN's loss function estimates the agreement between the predicted values and the target labels. A positive label is assigned based on the overlap between anchor and ground-truth bounding boxes, calculated through the Intersection over Union (IoU) [9] metric. Specifically, if the IoU overlap between an anchor and a ground-truth box exceeds 0.7, a positive label is assigned to that anchor. Conversely, if the IoU overlap with all ground-truth boxes is below 0.3, the anchor is assigned a negative label, as it is classified as non-positive. The loss function is then calculated accordingly: in Eq. (1) below, i represents the index of the anchor, p_i represents the predicted likelihood that the anchor contains an object, and p_i* represents the corresponding target label, which is set to 1 if the anchor contains an object (i.e., positive label) or 0 if it does not (i.e., negative label); t_i represents the predicted bounding box coordinates, and t_i* represents the corresponding target box coordinates. L_cls, N_cls, and N_reg denote the log loss, the normalization value for classification (256), and the normalization value for regression (2400), respectively. By weighting the regression component by a factor of ten, it is feasible to balance the tasks of regression and classification. During RPN training [10], only 256 anchors with balanced negative and positive labels are used to compute the loss function. If there are not enough positive samples, additional negative samples are included.
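The anchor-labeling rule described above can be sketched as follows: anchors whose IoU with some ground-truth box exceeds 0.7 receive a positive label, and anchors whose IoU with every ground-truth box is below 0.3 receive a negative label. Anchors in between are marked as ignored here, which is a common convention rather than something stated in the text; the (x1, y1, x2, y2) box format is likewise an assumption for illustration.

```python
import numpy as np

def iou(anchor, gt):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(anchor[0], gt[0]), max(anchor[1], gt[1])
    ix2, iy2 = min(anchor[2], gt[2]), min(anchor[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (anchor[2] - anchor[0]) * (anchor[3] - anchor[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / (area_a + area_g - inter)

def label_anchors(anchors, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return +1 (object), 0 (background) or -1 (ignored) per anchor."""
    labels = np.full(len(anchors), -1, dtype=int)
    for i, a in enumerate(anchors):
        best = max(iou(a, g) for g in gt_boxes)
        if best > pos_thr:
            labels[i] = 1
        elif best < neg_thr:
            labels[i] = 0
    return labels

anchors = [(5, 5, 105, 105), (0, 0, 100, 100), (300, 300, 340, 340)]
gt_boxes = [(10, 10, 110, 110)]
print(label_anchors(anchors, gt_boxes))   # [ 1 -1  0]
```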
Fig. 2 Summary of the Faster RCNN architecture, which integrates the RPN module within the overall network flow
Feature sharing between the RPN and RCNN networks is achieved by alternating feature training, as shown in Fig. 2. To evaluate the performance of the RPN, we examined three anchor ratios, namely 1:1, 1:2, and 2:1, during implementation. During the training phase, around 8000 anchors are evaluated, but using an IoU threshold of 0.7 results in only 2000 anchor proposals being chosen for further analysis. The RCNN network is trained using the top 2000 proposals generated by the ROI pooling layer.

L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \gamma \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)    (1)
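A NumPy sketch of Eq. (1) follows, assuming log loss (binary cross-entropy) for L_cls and the smooth-L1 loss commonly paired with it for L_reg, which the text does not specify; N_cls = 256, N_reg = 2400, and the regression weight γ = 10 follow the values given above.

```python
import numpy as np

def smooth_l1(x):
    """Smooth-L1 (robust) regression loss, applied element-wise."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x ** 2, ax - 0.5)

def rpn_loss(p, p_star, t, t_star, n_cls=256, n_reg=2400, gamma=10.0):
    """Eq. (1): classification term plus gamma-weighted regression term."""
    p = np.clip(p, 1e-7, 1 - 1e-7)                       # numerical safety
    l_cls = -(p_star * np.log(p) + (1 - p_star) * np.log(1 - p))
    l_reg = smooth_l1(t - t_star).sum(axis=1)            # per-anchor box loss
    # The regression term only counts anchors with a positive label (p* = 1).
    return l_cls.sum() / n_cls + gamma * (p_star * l_reg).sum() / n_reg

# Tiny example with 3 anchors and 4 box coordinates each.
p      = np.array([0.9, 0.2, 0.6])          # predicted objectness scores
p_star = np.array([1.0, 0.0, 1.0])          # target labels
t      = np.random.randn(3, 4) * 0.1        # predicted box offsets
t_star = np.zeros((3, 4))                   # target box offsets
print(rpn_loss(p, p_star, t, t_star))
```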
6 Dataset

To carry out the training and testing stages, two datasets were utilized: PASCAL VOC 2012 and MS COCO.
6.1 Pascal VOC 2012

The PASCAL VOC 2012 dataset comprises 20 object categories, with approximately 5000 training/validation images and 5000 test images. To train the model, VGGnet, which includes 13 convolutional layers and 3 fully connected layers, was used. For detection, the method is evaluated with varying numbers of region proposals, ranging from 300 to 1500.
6.2 MS COCO

The MS COCO dataset holds around 80,000 images used for training, testing, and validation purposes, along with 40,000 images for model evaluation. The images within the MS COCO dataset encompass a variety of objects, including but not limited to eggs, dining tables, humans, airplanes, zebras, birds, pizza, and bowls, as shown in Fig. 3. VGGnet training is employed on this dataset, and the model's performance is evaluated using the metrics specified on the official MS COCO website. The output data required for computing the object detection benchmarks adhere to the format specified on the dataset website. The metrics used for evaluation include Average Precision (AP), Average Recall (AR), Average Precision at IoU = 0.5 (AP.50), and AP.75. Table 1 reports the performance evaluation on the MS COCO dataset. The bar chart presented below illustrates the mean Average Precision (mAP) values of VGGnet models, using 1800 region proposals on the PASCAL VOC 2012 dataset, for individual object categories (Fig. 4).
Fig. 3 Images within the MS COCO dataset encompass a variety of objects, including but not limited to eggs, dining tables, humans, airplanes, zebras, birds, pizza, and bowls
Table 1 Performance evaluation of MS COCO dataset

Metrics for analysis   Coco VGG 1800 proposals   Coco VGG 900 proposals   Coco VGG 300 proposals
AP.50                  0.439                     0.431                    0.456
AP.75                  0.249                     0.278                    0.236
APS                    0.073                     0.068                    0.071
APM                    0.265                     0.247                    0.260
APL                    0.384                     0.379                    0.357
AR1                    0.250                     0.245                    0.235
AR10                   0.360                     0.341                    0.354
AR100                  0.31                      0.329                    0.333
ARS                    0.141                     0.119                    0.114
ARM                    0.378                     0.338                    0.367
ARL                    0.545                     0.529                    0.548
AP                     0.261                     0.239                    0.235

AP.75 average precision at IoU = 0.75, APS average precision over small objects, APM average precision over medium objects, APL average precision over large objects, and proposals* (values from Faster RCNN on MS COCO)
Fig. 4 Bar chart of average precision (AP) values of the VGGnet model on the PASCAL VOC dataset
7 Conclusion

Examining the number of region proposals, a trade-off between detection time and accuracy is observed. In other words, better accuracy comes at the cost of longer detection times relative to fewer region proposals. However, even with the increase in region
proposals, it was not possible to achieve real-time frame rates. Nonetheless, the visual accuracy remains unchanged when the number of region proposals is varied, for both single-object and multi-object images of the same class.
References 1. Girshick R et al (2013) Rich feature hierarchies for accurate object detection and semantic segmentation. https://doi.org/10.48550/ARXIV.1311.2524. https://arxiv.org/abs/1311.2524 2. Eggert C et al (2017) A closer look: small object detection in faster R-CNN. In: 2017 IEEE International conference on multimedia and expo (ICME). IEEE, July 2017. https://doi.org/10. 1109/icme.2017.8019550 3. Girshick R et al (2016) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142–158. https://doi.org/10.1109/ tpami.2015.2437384 4. Abbas Sm, Singh SN (2018) Region-based object detection and classification using faster R-CNN. In: 2018 4th International conference on computational intelligence & communication technology (CICT). IEEE, Feb 2018. https://doi.org/10.1109/ciact.2018.8480413 5. Erhan D et al (2014) Scalable object detection using deep neural networks. In: 2014 IEEE Conference on computer vision and pattern recognition. IEEE, June 2014. https://doi.org/10. 1109/cvpr.2014.276; Girshick R (2015) Fast R-CNN. https://doi.org/10.48550/ARXIV.1504. 08083. https://arxiv.org/abs/1504.08083 6. Jia Y et al (2014) Caffe: convolutional architecture for fast feature embedding. https://doi.org/ 10.48550/ARXIV.1408.5093. https://arxiv.org/abs/1408.5093 7. Hosang J et al (2016) What makes for effective detection proposals? IEEE Trans Pattern Anal Mach Intell 38(4):814–830. https://doi.org/10.1109/tpami.2015.2465908 8. Mohandas P et al (2021) Object detection and movement tracking using tubelets and faster RCNN algorithm with anchor generation. Wirel Commun Mob Comput 1–16. (Zhang Y (ed)). https://doi.org/10.1155/2021/8665891 9. Nguyen D-K, Tseng W-L, Shuai H-H (2020) Domain-adaptive object detection via uncertaintyaware distribution alignment. In: Proceedings of the 28th ACM international conference on multimedia. ACM, Oct 2020. https://doi.org/10.1145/3394171.3413553 10. Indhraom Prabha M, Umarani Srikanth G (2019) Survey of sentiment analysis using deep learning techniques. In: 2019 1st International conference on innovations in information and communication technology (ICIICT). IEEE, Apr 2019. https://doi.org/10.1109/iciict1.2019. 8741438
An Image Processing Approach for Weed Detection Using Deep Convolutional Neural Network Yerrolla Aparna, Nuthanakanti Bhaskar, K. Srujan Raju, G. Divya, G. F. Ali Ahammed, and Reshma Banu
Abstract Because of their increased competition with crops, weeds are held responsible for 45% of crop losses in the agricultural industry. This percentage can be decreased with an effective method of weed detection. Conventional weeding techniques take a long time and largely require manual labor, so the process has to be automated. As a result, an image processing approach for weed detection implementing a Deep Convolutional Neural Network is provided in this analysis. The purpose of this study is to examine the possibilities of classifying and detecting weeds from Unmanned Aerial Vehicle (UAV) images using deep learning techniques. The presented method achieves high accuracy when compared to state-of-the-art algorithms.

Keywords Weed detection · Deep learning algorithms · Deep Convolutional Neural Network (DCNN) · Unmanned Aerial Vehicle (UAV) images
1 Introduction

Weeds are plants that grow spontaneously on agricultural soils where they are not wanted. Through their growth and competition with commercial crops like soybeans, these plants cause damage, making it difficult to operate harvesting machines and increasing the impurity and moisture of the grains. Because weeds compete with crops for the same resources, they cause production costs to increase,

Y. Aparna (B) · N. Bhaskar · K. Srujan Raju Department of CSE, CMR Technical Campus, Hyderabad, Telangana, India e-mail: [email protected] G. Divya Department of IT, CMR Technical Campus, Hyderabad, Telangana, India G. F. Ali Ahammed Department of CSE, VTU Centre for Post Graduate Studies, Mysuru, Karnataka, India R. Banu Department of CSE, VVIET, Mysuru, Karnataka, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_25
harvesting to become more challenging, product quality to decrease, the risk of pests and diseases to increase, and the economic value of cultivated land to decrease. Weeds are categorized based on their leaves and can be divided into grass and broadleaf groups. Because some herbicides are selective for a particular class of weeds, the distinction between these two classifications is sufficient, since grass and broadleaf weeds are treated differently [1]. Weed control is an important procedure that can increase crop production. In poly-houses, weed is generally considered a consistent challenge that reduces the apparent quality of the crops being produced. However, such control measures have disadvantages, with a negative impact on both human well-being and the environment: chemical treatments and cultural control could cause a harmful effect on the environment if they are not properly regulated. Considering that various countries are already experiencing high labor costs and a shortage of labor, the mechanization of weed control systems is considered essential. Highly accurate weed management is crucial in enclosed environments like poly-houses because chemical vapors can get trapped, harming the health of agricultural workers and reducing crop productivity [2]. Weeds are an issue because they compete with desirable crops for space, water, and nutrients. Moreover, some weeds get entangled in equipment and prevent productive harvesting. Therefore, weed control systems are essential. Accurate weed identification is necessary for the development of an effective weed removal system [3, 4]. The transformation of traditional agricultural techniques into smart farming techniques is made possible by recent technical advancements in the fields of cloud computing (CC), the Internet of Things (IoT), Artificial Intelligence (AI), computer vision, etc. [4, 5]. The most essential farming activity is weed identification, which has to be automated and digitalized. Digital technology has historically had difficulty resolving the issue of weed identification in crops, especially when traditional image processing methods are used. With notable advancements in fields including cancer prognosis, image analysis, speech recognition, self-driving cars, natural language processing, and disaster prediction, among other things, there has been a lot of interest in using deep learning (DL) to resolve real-world problems. Owing to big data analysis, automatic feature extraction, reduced testing times, and testing effectiveness, DL is clearly preferred over other conventional machine learning algorithms [6]. The losses caused by weeds, together with the expenditure invested in managing them, include the light, nutrients, and water used in farming. Weeds typically produce thorns and burrs, can be poisonous, make crop management difficult, and contaminate agricultural harvests. In order to prevent poor weed control and decreased crop production, farmers spend billions of dollars on weed management, most often without adequate technical support. As a result, controlling weeds is an essential part of crop management in horticulture, because failing to do so results in decreased yields and product quality. If not managed correctly, the utilization of chemical and cultural control measures might have harmful consequences for the environment.
The development of more efficient, sustainable weed management techniques will benefit from a low-cost tool for weed identification and mapping at the early stages of growth [7].
Agricultural productivity, human health, and the environment are all seriously threatened by weeds. Weeds harm the ecological balance of communities and reduce species diversity. Among all plant pests, weeds represent the greatest threat to agricultural productivity in the food production sector. As the prevalence of herbicide resistance increases, they also present a yield risk and a societal risk due to the rising demand for food from a growing population [8]. In crops, manual weeding is still the most common way to control weeds. However, it requires time and effort and is inefficient for larger-scale crops. Today, the agricultural industry relies mainly on chemical weeding systems and, to a lesser extent, on mechanical weeding systems; even so, 75% of the produced vegetables, such as lettuces, require manual weeding, making production far more inefficient and costly. Furthermore, weeding has a very high margin for error, and these techniques present the possibility of damaging the plants [9]. As a result, this analysis presents an image processing approach for weed detection utilizing a Deep Convolutional Neural Network. The remaining work is structured as follows: the literature survey is described in Sect. 2, Sect. 3 demonstrates the image processing approach for weed detection using DCNN, the result analysis is evaluated in Sect. 4, and Sect. 5 finally concludes the analysis.
2 Literature Survey

Dasgupta et al. [10] describe the Wireless-Based AI Crop Predictor and Weed Detector: A Smart Application for Farmers. Their analysis combines AI techniques, IoT devices, and Wireless Sensor Networks (WSNs) to quickly and effectively recommend suitable crops to farmers depending on a list of factors. An accuracy of 89.29% was achieved using the Naive Bayes algorithm for crop recommendation based on several WSN sensor node-detected factors. This accuracy is better than several other algorithms discussed in that analysis, including regression and support vector machines. Espejo-Garcia et al. [11] explain a repository of pre-trained agricultural deep neural networks that can improve weed identification. Their study explores the possibility of further enhancing these successes by fine-tuning neural networks that have already been trained on agricultural datasets rather than ImageNet. The results of the experiments showed that the suggested strategy can increase overall performance. The Xception and Inception-ResNet architectures represent improvements of 0.51% and 1.89%, respectively, while the total number of epochs decreased by 13.67%. Sabzi et al. [12] describe a metaheuristic-based expert system for quickly and precisely identifying potato crop weeds. The use of two metaheuristic algorithms to improve a neural network classifier's performance is the presented method's primary contribution. This approach has been contrasted with a statistical approach based on linear discriminant analysis. According to the testing results, the proposed expert system achieves excellent recognition accuracy and runs on an average PC in less than 0.8 s.
Table 1 Table of authors, year, model used, and the accuracy obtained

Author name        Year   Model            Accuracy (%)
Siddhesh Badhan    2021   ResNet-50 model  90
Nahina Islam       2021   KNN              63
Dasgupta           2020   SVM              87
Adel Bakhshipour   2017   ANN              93.33
Umamaheswari       2018   CNN              91.11
Bakhshipour and Jafari [13] presented weed detection methods in which shape features are evaluated by support vector machines and artificial neural networks. The vision system utilized support vector machines and artificial neural networks to assist in the pattern-based identification of weeds. Four prevalent weed species were examined in sugar beet fields. The shape feature sets contain moment-invariant features and Fourier descriptors. The results showed that 92.50% of weeds were correctly classified, and the ANN has an overall classification accuracy of 92.92%. Forero et al. [14] present Color Classification Techniques for the Detection of Perennial Weeds in Cereal Crops. This approach is focused on identifying the plants, especially by the color green, in clear contrast to certain other crops. Images collected from two different heights, 10 and 50 m, were used to compare the methodology with six well-known machine learning techniques. With execution times of more than 2 min and sub-images of 200 × 200 pixels, this method achieved high accuracy using machine learning techniques. Table 1 shows the authors, year, model used, and the accuracy obtained for weed detection using a Deep Convolutional Neural Network.
3 Proposed System

3.1 Weed Detection Using Deep Convolutional Neural Network

This section presents an image processing approach for weed detection using a Deep Convolutional Neural Network [15]. Figure 1 represents the block diagram of the presented technique. The Signal Processing group at Aarhus University provided the dataset for this method, which is posed as a standard image recognition problem. They have provided a collection of images that includes images of plants at various growth stages. Every photograph has its unique id and filename. Nine hundred and sixty unique plants from 12 plant species are included in the dataset. The final goal is to develop a classifier that can identify a photo of a plant by its species. The species are: Black-grass; Charlock; Cleavers; Common Chickweed; Common Wheat; Fat Hen; Loose Silky-bent; Maize; Scentless Mayweed; Shepherds Purse; Small-flowered Cranesbill; Sugar beet [16–18].
Fig. 1 Block diagram of weed detection using DCNN
For this purpose, images of field crops can be captured using sensors and camera-mounted UAVs. Several sorts of images, such as RGB, thermal, multispectral, hyperspectral, 3D, and chlorophyll fluorescence, can be obtained depending on the camera. RGB cameras mounted on a drone were utilized to capture the images used in this work. In contrast to a regular image link, which connects the complete image to a single location, an image map is a list of coordinates related to a particular image that
is used to hyperlink specific areas of the image to different destinations. The term "data preprocessing" describes a subset of data preparation and includes any type of processing applied to raw data to prepare it for additional data processing techniques. One of the difficult parts of preprocessing an image is cleaning. In order to clean the images, the following processes were included: (i) convert the RGB pictures into HSV; (ii) blur the images to remove noise; and (iii) create a mask to remove the background. By removing the background, the image can be cleaned up. The Hue Saturation Value (HSV) color scale provides a numerical representation of the image that corresponds to the color names it contains. The range of hue is from 0 to 360 degrees. For instance, magenta occupies the range of 301–360°, while cyan is between 181 and 240°. The data augmentation, outlier detection, standardization, and normalization phases make up this method's data preprocessing phase.
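A minimal OpenCV sketch of the cleaning steps listed above (blurring, RGB-to-HSV conversion, and masking out the background); the green hue range used for the mask is an illustrative assumption rather than a value taken from the paper, and the file name is hypothetical.

```python
import cv2
import numpy as np

def clean_image(path):
    """Blur, convert to HSV, and mask out the non-green background."""
    bgr = cv2.imread(path)                          # OpenCV loads images as BGR
    blurred = cv2.GaussianBlur(bgr, (5, 5), 0)      # (ii) blur to remove noise
    hsv = cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)  # (i) convert to HSV

    # (iii) mask: keep only greenish hues (assumed range for plant material)
    lower, upper = np.array([25, 40, 40]), np.array([95, 255, 255])
    mask = cv2.inRange(hsv, lower, upper)
    cleaned = cv2.bitwise_and(bgr, bgr, mask=mask)  # background removed
    return cleaned

# Example usage with a hypothetical file name:
# result = clean_image("seedling_001.png")
```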
3.2 Implementation

The image dataset needs to be artificially expanded in order to reduce the overfitting issue. This enables us to expand the dataset and add more data. Only small changes are applied to the training data in order to reproduce the variations that arise when the images are captured. In order to supplement the data, the following possibilities are selected: (i) rotate some training images randomly by 180°; (ii) zoom in on training images at random by 10%; (iii) randomly shift images by 10% of their width horizontally; (iv) randomly flip images vertically and horizontally, and shift images vertically by 10% of their height. The identification of dataset elements that significantly differ from the majority of the features is known as outlier detection. These elements are referred to as outliers, and depending on the case's domain, there are different incentives for identifying them. A sample of data after standardization has a mean of 0 and a standard deviation of 1. The mean and standard deviation are calculated either per image or per dataset. Normalization is a technique in image processing that alters the range of pixel intensity values; for instance, it is applied to photographs with low contrast due to glare. Contrast stretching and histogram stretching are other names for normalization. MATLAB has been used to simulate the deep learning-based algorithms and extract features from the preprocessed images. Semantic segmentation is used for weed identification and mapping. The encoding and decoding blocks are the two main components of semantic segmentation based on deep learning. Decoding blocks up-sample the feature space to the image dimensions, while encoding blocks down-sample the feature space from the images to extract features. In contrast to typical CNNs, semantic segmentation networks like SegNet and UNET (U-shaped encoder–decoder architecture) utilize fully convolutional networks with convolutional top layers [14, 19–21]. For the weed dataset, a special four-layer CNN is created to address the design challenges in terms of speed and accuracy. Consequently, applications running on
the Raspberry Pi will require less memory and be less complex. The model is made up of four conv 3 × 3 layers whose output sizes are (108, 98, 32), (52, 47, 16), (24, 21, 8), and (10, 8, 4). Based on the shape of the resized photos, the input shape is selected as (110, 100, 3). After each of the conv 3 × 3 layers there is a ReLU activation and a (2, 2) non-overlapping max pooling layer that extracts the maximum value per patch. The first pooling layer reduces the output size to (54, 49, 32), which is half the size of its input while maintaining the 32-channel depth. As a result, the model learns fewer parameters because the max pooling layers reduce the feature map dimensions [22]. After video decoding, postprocessing is commonly utilized to reduce the visual impact of coding artifacts and to improve the general quality of the reconstructed frames. Finally, the CNN output layer detects and classifies the weed as Scentless Mayweed, Maize, Loose Silky-bent, Fat hen, Common Wheat, or Common Chickweed.
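A hedged Keras sketch combining the augmentation options of Sect. 3.2 with the lightweight conv 3 × 3 / max-pooling stack described above. The filter counts (32, 16, 8, 4) are read off the quoted output shapes and the input shape (110, 100, 3) and six output classes follow the text, but the dense layer size, optimizer, loss, and exact augmentation parameters are assumptions for illustration (the paper's experiments were run in MATLAB).

```python
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation roughly matching the options listed in Sect. 3.2.
augmenter = ImageDataGenerator(
    rotation_range=180,         # random rotation
    zoom_range=0.1,             # random 10% zoom
    width_shift_range=0.1,      # 10% horizontal shift
    height_shift_range=0.1,     # 10% vertical shift
    horizontal_flip=True,
    vertical_flip=True,
    rescale=1.0 / 255,          # normalization of pixel intensities
)

# Lightweight CNN: four conv3x3 + max-pooling blocks, then a classifier head.
model = models.Sequential([
    layers.Input(shape=(110, 100, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),   # -> (108, 98, 32)
    layers.MaxPooling2D((2, 2)),                    # -> (54, 49, 32)
    layers.Conv2D(16, (3, 3), activation="relu"),   # -> (52, 47, 16)
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(8, (3, 3), activation="relu"),    # -> (24, 21, 8)
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(4, (3, 3), activation="relu"),    # -> (10, 8, 4)
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),            # assumed classifier head
    layers.Dense(6, activation="softmax"),          # six classes named in the text
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```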
4 Result Analysis

This section evaluates the image processing approach for weed detection utilizing a Deep Convolutional Neural Network. In this analysis, the dataset is collected from the Aarhus University Signal Processing group. Figure 2 shows the visualization of plant seedlings. The result analysis of the presented image processing approach for weed detection using DCNN is evaluated here. The plant species evaluated are Scentless Mayweed, Shepherds Purse, Small-flowered Cranesbill, and Sugar beet. Figure 3 shows the images before and after HSV, i.e., the original image, the blurred image, and the HSV image. In the figure, the identification of pests in individual plants and weed detection is done using HSV image classification. Figure 4 shows the visualization of predicted plant seedlings and weed detection done using HSV; by removing the background, the image can be cleaned up. A numerical representation of the output image, corresponding to the color names it contains and provided by the Hue Saturation Value (HSV) color scale, is displayed.
Fig. 2 Visualization of plant seedlings
Fig. 3 Plant images before and after HSV
Fig. 4 Visualization of predicted plant seedlings and weed detection
The performance of the presented approach is evaluated in terms of accuracy, defined as the ratio of correctly classified instances to all classified instances:

\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100    (1)
where True Positive (TP) and True Negative (TN) are the instances that are classified correctly, whereas False Negative (FN) and False Positive (FP) are incorrectly classified instances. Figure 5 shows the accuracy performance comparison. In Fig. 5, the x-axis represents the accuracy performance values in terms of percentage and y-axis represents different algorithms used for weed detection and classification. The presented DCNN outperforms the machine learning algorithms, such as artificial neural network (ANN) and support vector machine (SVM), in terms of the identification and classification of weeds.
Fig. 5 Comparative graph for accuracy
5 Conclusion

An image processing approach for weed detection utilizing a Deep Convolutional Neural Network is presented in this analysis. The plant images are collected through sensors and camera-mounted UAVs, and the Aarhus University Signal Processing group's dataset is used. The images are preprocessed, and a feature extractor is used to extract the relevant features. Semantic segmentation is used for weed detection and mapping. The DCNN is used to classify the weed as Scentless Mayweed, Maize, Loose Silky-bent, Fat hen, Common Wheat, or Common Chickweed. Compared to the most popular ML algorithms, the presented approach has achieved high accuracy for weed detection and classification. Hence, this approach will be a better solution for weed detection and classification, and as a result, crop growth will be improved.
References 1. Razfar N, True J, Bassiouny R, Venkatesh V, Kashef R (2022) Weed detection in soybean crops using custom lightweight deep learning models. J Agric Food Res 8:100308. https://doi.org/ 10.1016/j.jafr.2022.100308 2. Alrowais F, Asiri MM, Alabdan R, Marzouk R, Anwer, Hilal M, Alkhayyat A, Gupta D (2022) Hybrid leader based optimization with deep learning driven weed detection on internet of things enabled smart agriculture environment. Comput Electr Eng 104(Part A):108411, https://doi. org/10.1016/j.compeleceng.2022.108411 3. Badhan S, Desai K, Dsilva M, Sonkusare R, Weakey S (2021) Real-time weed detection using machine learning and stereo-vision. In: 2021 6th International conference for convergence in technology (I2CT). https://doi.org/10.1109/I2CT51068.2021.9417989
4. Bhaskar N, Ganashree TS (2022) Pulmonary nodule detection using Laplacian of Gaussian and deep convolutional neural network. In: Bhateja V, Satapathy SC, Travieso-Gonzalez CM, Adilakshmi T (eds) Smart intelligent computing and applications, vol 1. Smart innovation, systems and technologies, vol 282. Springer, Singapore. https://doi.org/10.1007/978-981-169669-5_58 5. Etienne A, Ahmad A, Aggarwal V, Saraswat D (2021) Deep learning-based object detection system for identifying weeds using UAS imagery. Remote Sens 13:5182. https://doi.org/10. 3390/rs13245182,MDPIJournal 6. Ukaegbu UF, Tartibu LK, Okwu MO, Olayode IO (2021) Deep learning application in diverse fields with plant weed detection as a case study. Association for Computing Machinery, ACM. https://doi.org/10.1145/3487923.3487926. ISBN 978-1-4503-8575-6/21/12 7. Islam N, Rashid MdM, Wibowo S, Xu C-Y, Morshed A, Wasimi SA, Moore S, Rahman SkM (2021) Early weed detection using image processing and machine learning techniques in an Australian Chilli Farm. MDPI J Agric 11:387. https://doi.org/10.3390/agriculture11050387 8. Hu K, Coleman G, Zeng S, Wang Z, Walsh M (2020) Graph weeds net: a graph-based deep learning method for weed recognition. Comput Electron Agric 174:105520. https://doi.org/10. 1016/j.compag.2020.105520 9. Osorio K, Puerto A, Pedraza C, Jamaica D, Rodríguez L (2020) A deep learning approach for weed detection in lettuce crops using multispectral images. AgriEngineering 2:471–488. https://doi.org/10.3390/agriengineering2030032 10. Dasgupta I, Saha J, Venkatasubbu P, Ramasubramanian P (2020) AI crop predictor and weed detector using wireless technologies: a smart application for farmers. Arab J Sci Eng 45:11115– 11127 11. Espejo-Garcia B, Mylonas N, Athanasakos L, Fountas S (2020) Improving weeds identification with a repository of agricultural pre-trained deep neural networks. Comput Electron Agric 175:105593. https://doi.org/10.1016/j.compag.2020.105593 12. Sabzi S, Abbaspour-Gilandeh Y, García-Mateos G (2018) A fast and accurate expert system for weed identification in potato crops using metaheuristic algorithms. Comput Ind 98:80–89. https://doi.org/10.1016/j.compind.2018.03.001 13. Bakhshipour A, Jafari A (2018) Evaluation of support vector machine and artificial neural networks in weed detection using shape features. Comput Electron Agric 145:153–160. https:// doi.org/10.1016/j.compag.2017.12.032 14. Forero MG, Herrera-Rivera S, Ávila-Navarro J, Franco CA, Rasmussen J, Nielsen J (2018) Color classification methods for perennial weed detection in cereal crops. In: CIARP 2018, LNCS 11401. Springer Nature, Switzerland, pp 117–123 15. Jin X, Che J, Chen Y (2021) Weed identification using deep learning and image processing in vegetable plantation. IEEE Access 9. https://doi.org/10.1109/ACCESS.2021.3050296 16. Bhaskar N, Ganashree TS; Patra RK (2023) Pulmonary lung nodule detection and classification through image enhancement and deep learning. Int J Biometrics 15(3/4):291–313. https://doi. org/10.1504/IJBM.2023.10044525 17. Yu J, Sharpe SM, Schumann AW, Boyd NS (2019) Deep learning for image-based weed detection in turfgrass. Eur J Agron 104:78–84. https://doi.org/10.1016/j.eja.2019.01.004 18. Sarvini T, Sneha T, Sukanya Gowthami GS, Sushmitha S, Kumaraswamy R (2019) Performance comparison of weed detection algorithms. In: International conference on communication and signal processing, 4–6 Apr 2019, India. 978-1-5386-7595-3/19 19. 
Zhang W, Hansen MF, Timothy N, Wilson J, Ralston G, Broadbent L, Wright G (2018) Broadleaf weed detection in pasture. In: 2018 3rd IEEE international conference on image, vision and computing. 978-1-5386-4991-6/18 20. Umamaheswari S, Arjun R, Meganathan D (2018) Weed detection in farm crops using parallel image processing. In: 2018 Conference on information and communication technology (CICT’18). 978-1-5386-8215-9/18
21. Sohail R, Nawaz Q, Hamid I, Gilani SMM, Mumtaz I, Mateen A, Chauhdary JN (2021) An analysis on machine vision and image processing techniques for weed detection in agricultural crops. Pak J Agric Sci 58(1):187–204. https://doi.org/10.21162/PAKJAS/21.305. ISSN (print) 0552-9034 22. Espejo-Garciaa B, Mylonasa N, Athanasakosa L, Fountasa S, Vasilakogloub I (2020) Towards weeds identification assistance through transfer learning. Comput Electron Agric 171:105306. https://doi.org/10.1016/j.compag.2020.105306
Detection of Leaf Black Sigatoka Disease in Enset Using Convolutional Neural Network A. Senthil Kumar , Meseret Ademe, K. S. Ananda Kumar, Srikrishna Adusumalli , M. Venkata Subbarao, and K. Sudhakar
Abstract Enset is mostly grown in central and southwest Africa and is used for a variety of purposes, including food, medicine, shelter, and even animal feed. It is also called false banana (Enset ventricosum). Viruses, bacteria, and fungi are frequently capable of causing harm to Enset. Black spot disease, or leaf sigatoka disease, a fungal infection of Enset leaves, reduces agricultural productivity and quality. Without a suitable system, sigatoka disease detection takes longer, requires more work, and is more expensive. The major goal of this project is to create an automated computer vision system that can recognize Enset leaf black sigatoka disease and suggest a course of therapy using a convolutional neural network (CNN). There are four phases in the suggested system. Dataset gathering is the initial stage, during which photographs of both healthy and diseased Enset leaves are collected from farms in the East African zone. In the second phase, useful characteristics are extracted from the images, after which the system is trained and the model is created using the training datasets. The third phase focuses on classifying photos according to whether they are healthy or unhealthy, utilizing attributes that were extracted during training. The classification accuracy of the proposed model is tested using the testing datasets, and the experiment was conducted on two CNN models. The first is the proposed model and the second is a pre-trained model (Resnet50). Both models are trained using the same image size (112 × 112) with RGB color channels. The A. Senthil Kumar (B) · S. Adusumalli · M. Venkata Subbarao Shri Vishnu Engineering College for Women, Bhimavaram, India e-mail: [email protected] S. Adusumalli e-mail: [email protected] M. Ademe Dilla University, Dilla, Ethiopia K. S. Ananda Kumar Atria Institute of Technology, Bengaluru, India e-mail: [email protected] K. Sudhakar Sri Vishnu Institute of Technology, Kovvada, AP, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_26
data augmentation technique is used to increase the number of training samples during the training phase. The pre-trained Resnet50 model is also trained on this dataset. This system achieves 96.4% overall accuracy. Finally, the model's performance is evaluated using a confusion matrix (precision, recall, and accuracy). Python is used for implementing the model. Overall, the system achieves 97.6%, 95.2%, and 96.4% precision, recall, and accuracy, respectively.

Keywords Leaf black sigatoka · CNN model · Enset disease · Resnet50
1 Introduction

1.1 Sigatoka Leaf Streak

Sigatoka leaf streak is the most widely seen fungal disease of bananas and plantains in Africa. The disease is not well documented so far, even though it is disastrous in East and Central Africa [1–4]. It causes extensive damage, with pronounced incidence (> 75%) and severity (> 50%) in southwest Ethiopia [5]. This issue causes huge losses for subsistence farmers who entirely depend on the crop, leading to food insecurity [6]. Nearly 85% of the African population relies on agriculture as the main source of livelihood. This dependency on agricultural production is reflected in the stability and growth of the African economy. In recent decades, farming production has become much more significant than it was in the past, when crops were mainly used to feed humans and animals. It is also a significant source of raw materials for many farming-based industries. In Africa, a total of 302,143 ha of land is cultivated with the Enset crop, which is used as human food, animal forage, fiber, construction material, a means of earning cash income, insurance against hunger, and medicine for 20% of the country's population [7]. Other names for Enset (Enset ventricosum) include false banana, African banana, Abyssinian banana, and Ensete. Enset is mostly cultivated in the central, southern, and southwestern parts of Africa. The domesticated form of the plant is only cultivated in Africa. Enset is a multi-purpose root crop, and nearly every part of the plant is usable in some way. The physical look of Enset is similar to that of the banana, but Enset is taller in stature and plumper, and, most importantly, its fruit is not edible. Disease is brought on by bacteria, fungi, viruses, and nematodes, among other biotic and abiotic factors that influence Enset development [8]. Leaf black sigatoka of Enset is one of the main problems for Enset production. Black sigatoka, caused by the fungus Mycosphaerella fijiensis, is a leaf disease of Enset and banana. The indications are small light yellow or brownish-green narrow streaks spread over the whole leaf. Leaves dry rapidly and defoliate, and the fungus causes dark black or brown spots in the center of the leaves [5, 6, 8–10]. This disease also affects the Enset leaf; it destroys leaf tissue, inducing visual patterns,
Fig. 1 Examples of diseased Enset image
which can be classified using computer vision techniques that represent new methods of disease detection. Visual representation systems such as HSV, TSL, LAB, and YCbCr commonly use Gaussian filters that soften images; then, methods including histogram analysis and OTSU thresholding, among others, can be used to threshold the images [11]. However, the level of leaf disease development is confirmed by lab testing. Some more robust methodologies use classic machine learning techniques to perform feature extraction and classification of the objects of study.
2 Proposed System

The model is developed using convolutional neural networks to detect Enset leaf black sigatoka disease and, after identification, to recommend a treatment or pesticide that prevents the disease, using different CNN layers such as Flatten, Dense, Activation, Dropout, MaxPooling2D, and Conv2D [7, 11–15].
2.1 Datasets

The dataset is prepared by collecting images of both diseased and healthy Enset leaves from different Enset farmlands located in the East African zone, as shown in Figs. 1 and 2. In total, 3000 Enset images were collected from different Enset farmland areas. The images were captured with a high-quality camera under varying conditions, such as brightness level, distance, and angle, and with fixed settings, such as image format and image size.
2.2 Training and Testing Datasets

The training dataset consists of a set of healthy and diseased image examples used to fit the parameters or weight connections. Out of 3000 images, 80% of the images were
Fig. 2 Examples of healthy leaf image
used for training and the remaining 20% are used for testing. Both sets of images are chosen randomly.
2.3 Preprocessing

The preprocessing phase involves resizing the input images to 112 × 112. The acquired images have different dimensions, which makes them difficult to feed into the system directly, so they are resized to a height and width of 112 × 112. Preprocessing includes not only image resizing but also cropping and reshaping, which remove unnecessary features and image distortion. Each image is collected under the same environmental condition, namely sunny weather, to increase visibility and control the darkness of the data. After collecting the dataset, each image is cropped to isolate the exact target, and reshaping is used to remove image distortion. The data are collected from different Enset farming areas of the Gedeo zone. The initial images have a pixel size of 2080 × 4160 (height, width), and after resizing, the image size becomes 112 × 112, as shown in Fig. 3.
Fig. 3 After resizing the original image
2.4 Developing the Proposed Model

The proposed model is designed with five convolutional layers, four max pooling layers, and two fully connected layers. To develop this model, we used three main types of CNN layers. Figure 4 shows how the model is developed using CNN layers; the first two types of layers, convolution and pooling, are used to extract different useful features from the input image.

i. Description of the proposed model layers

We can compute the output volume as a function of the input volume size (N), the filter size of the convolution layer (F), the stride with which it is applied (S), and the zero padding used on the border (p). From these we can derive the formula for calculating how many neurons fit [12]:

O = \frac{N - F + 2p}{S} + 1    (1)

In our proposed system, the image volume size is N = 112, the filter size is F = 5, p = 0, and S = 2, so the output is O = ⌊(112 − 5 + 2 × 0)/2⌋ + 1 = 54. Input layer: We have a preprocessed image that initially has size 112 × 112 × 3 (height, width) with RGB color, i.e., Red, Green, and Blue channels.
Fig. 4 Steps in CNN-based model development
Layer 1 contains the first convolution layer with ReLU activation function. In the first convolution, the model filters the preprocessed input image of size 112 × 112 × 3 (width 112, height 112, and three color channels), which is convolved with 32 kernels, each having a filter size of 5 × 5, with a stride of 2. We get 54 × 54 × 32 = 93,312 neurons in total, and each has 5 × 5 × 3 weight connections plus one bias parameter. Layer 2 is the pooling/sampling layer; there are no parameters to learn in the pooling layer, which is used only to reduce the image dimension. This max pooling layer takes the input of size 54 × 54 × 32 from the previous layer; the pooling size is 2 × 2, the padding is 0, and the stride is 2. After this max pooling operation, we get feature maps of size 52 × 52 × 32. Layer 3 is the second convolutional layer with ReLU activation function. Every neuron in this convolutional layer has a total of 2 × 2 × 32 = 128 connections to the input volume. Layer 1 already learned 32 kernels, so in conv2 the number of trainable parameters is (5 × 5 × 32 + 1) × 32 = 2432. Layer 4, the fourth CNN layer, is max pooling with a pooling size of 2 × 2, padding of zero, and stride of one; we then get a feature map of size 46 × 46 × 64. Layer 5 is the third convolution layer, on which pooling can be performed and which has trainable parameters; the number of trainable parameters is (5 × 5 × 32 + 1) × 64 = 51,264, and it is convolved with stride one. In conv4 and conv5, the number of strides is two. Layer 6 is a pooling layer. Layer 7 is the fourth convolution layer, whose filters are 5 × 5 and are convolved with 64 kernels; each neuron in this conv layer has weights to a 3 × 3 × 64 region of the input volume. Layer 8 is a max pooling layer, and Layer 9 is the fifth convolution layer. Layer 10 is the first fully connected layer, whose input is the flattened feature map. In this system, dropout is used to remove some weight connections. We get 3136 total weights in the first fully connected layer, but by using dropout, we reduce the total weights to 64 in FC2.

ii. Experimental Setup

a. Training a Model

During training of the model, the backpropagation algorithm is used for training the network, where the loss function and the gradient descent optimization algorithm play vital roles in assessing the loss. Table 1 shows the description of the hyperparameter configuration that we consider.
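As a quick check of the output-size formula in Eq. (1), the small helper below reproduces the 112 → 54 computation for the first convolution layer (N = 112, F = 5, p = 0, S = 2); the floor operation reflects the usual convention when the division is not exact.

```python
def conv_output_size(n, f, p, s):
    """Spatial output size of a convolution: O = floor((N - F + 2p) / S) + 1."""
    return (n - f + 2 * p) // s + 1

print(conv_output_size(n=112, f=5, p=0, s=2))   # 54, as used for the first conv layer
```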
3 Implementation and Results

This section shows the implementation details of the proposed system that classifies Enset leaf black sigatoka disease using the CNN algorithm.
Table 1 Description of hyperparameter configuration

Parameters                Value
Optimization algorithm    SGD
Learning rate             0.0001
Activation function       Sigmoid
Batch size                20
Iteration                 30
Epoch                     100
Metric                    Accuracy
Loss function             Binary cross-entropy
3.1 Experimental Result

The CNN model is implemented with Python and TensorFlow. The model is trained with and without an interface. The developed model correctly classifies the given images with a training accuracy of 98.7% and a validation accuracy of 98.51%, using a 0.0001 learning rate, a sigmoid activation function, and a training-to-testing ratio of 80:20. The loss function was binary cross-entropy [13–15].
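A hedged TensorFlow/Keras sketch of this training setup, using the Table 1 hyperparameters (SGD with learning rate 0.0001, sigmoid output, binary cross-entropy, batch size 20, 100 epochs, 80:20 split). The exact convolution/pooling stack, filter counts, and the directory name are assumptions for illustration, loosely following the five-conv, four-pool, two-FC design of Sect. 2.4.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Assumed 5-conv / 4-pool / 2-FC stack with 5x5 kernels (Sect. 2.4);
# filter counts are illustrative, not taken from the paper.
model = models.Sequential([
    layers.Input(shape=(112, 112, 3)),
    layers.Conv2D(32, (5, 5), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (5, 5), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (5, 5), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (5, 5), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (5, 5), padding="same", activation="relu"),
    layers.Flatten(),
    layers.Dropout(0.5),                       # dropout before the FC layers
    layers.Dense(64, activation="relu"),       # FC1
    layers.Dense(1, activation="sigmoid"),     # FC2: healthy vs. diseased
])

# Hyperparameters from Table 1: SGD, lr = 0.0001, binary cross-entropy, accuracy.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4),
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Assumed directory-based loaders with an 80:20 split, batch size 20, 100 epochs.
# train_ds = tf.keras.utils.image_dataset_from_directory(
#     "enset_leaves", validation_split=0.2, subset="training",
#     seed=1, image_size=(112, 112), batch_size=20, label_mode="binary")
# val_ds = tf.keras.utils.image_dataset_from_directory(
#     "enset_leaves", validation_split=0.2, subset="validation",
#     seed=1, image_size=(112, 112), batch_size=20, label_mode="binary")
# model.fit(train_ds, validation_data=val_ds, epochs=100)
```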
3.2 Result Analysis of the Proposed Model

As the results indicate, at the starting point of training, the training accuracy is about 90%. After epoch 20, the training accuracy becomes 95% and the validation accuracy is 91.3%. The training and validation accuracies then gradually increase: from epoch 40 onward, the training accuracy reaches 98.7% and the validation accuracy becomes 98.5%. We observe that the training loss likewise progressively drops from 0.25 to 0.018. As can be seen in Fig. 5, the network is trained for 100 epochs; we achieve a high training accuracy of 98.7% at epoch 40 and a low validation loss that follows the training loss. The training loss decreases from the first epoch (0.25) to the last epoch (0.018).
3.3 Performance Analysis of the Proposed Model

We evaluate the performance and accuracy of the classifier by means of a confusion matrix in Python. From a test dataset totaling 938 healthy and diseased images, the proposed system correctly predicted 447 diseased and 457 healthy images. The number of false positive (FP) images is 11 and the number of false negative (FN) images is 23. Overall, the proposed system correctly predicts 95.2% of diseased Enset images, while 4.9% are misclassified as healthy. About 97.6% are correctly classified as healthy
Fig. 5 Training loss and validation loss
Table 2 Performance evaluation of the Resnet 50

                  Diseased Enset   Healthy Enset
Diseased Enset    418 (46.4%)      9 (1.9%)        97.9% (2.1%)
Healthy Enset     32 (3.6%)        441 (49.0%)     93.2% (6.8%)
                  92.9% (7.1%)     98.0% (2.1%)    95.4% (4.6%)
Enset images, and 2.4% are classified incorrectly as diseased. Overall, 96.4% of the predictions are correct and 3.6% are wrongly classified.
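The precision, recall, and accuracy figures above can be reproduced from the confusion-matrix counts with a few lines of Python; the counts below are those reported in this subsection (447 true diseased detections, 457 true healthy, 11 false positives, 23 false negatives).

```python
def metrics_from_counts(tp, tn, fp, fn):
    """Precision, recall and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, accuracy

# Counts reported in this subsection for the proposed model.
tp, tn, fp, fn = 447, 457, 11, 23
precision, recall, accuracy = metrics_from_counts(tp, tn, fp, fn)
print(f"precision={precision:.3f}, recall={recall:.3f}, accuracy={accuracy:.3f}")
# precision~0.976, recall~0.951, accuracy~0.964, in line with the reported values
```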
3.4 Performance Evaluation of the Resnet50

Table 2 shows the confusion matrix containing the results of the performance evaluation of the Resnet50. As the table indicates, from a total of 901 healthy and diseased Enset images in the test dataset, the Resnet50 correctly predicts 860 Enset images; forty-one of the healthy and diseased images are incorrectly predicted, which is known as the test loss. The table thus shows the test accuracy and test loss of the Resnet50. The first two diagonal cells show the number of correct classifications by the trained network. The system correctly classifies 418 diseased images out of 427 total diseased images, corresponding to 46.4% of all images. Similarly, 441 images are correctly classified as healthy out of 473 total healthy Enset images, corresponding to 49% of all images. Nine of the diseased images are incorrectly classified as healthy, corresponding to 1%, and 32 of the healthy images are classified incorrectly as diseased [14, 15].
3.4.1 Discussion
As shown in Table 2, the training accuracy achieved in the proposed system was 98.7% and validation accuracy was 98.5%. The training accuracy in Resnet50 was
95%. Based on consecutive assessments, we achieved a promising result of 97.5%, 95.2%, and 96.4% in precision, recall, and accuracy, respectively, whereas Resnet50 achieved 97.9%, 93.2%, and 95.4% in precision, recall, and accuracy. Regarding incorrect classification, or classification loss, the proposed model correctly classifies 96.4% of the given inputs and incorrectly classifies 3.6%. As the results show, Resnet50 correctly predicts 95.4% and incorrectly predicts 4.6%. In general, the proposed model's classification ability is better, since its incorrect classification rate is lower. In both models, the CNN performs well for classification because the classification loss is much smaller than the proportion of correct classifications.
4 Conclusion

Enset is a main source of food and economic value. Detection of the disease has traditionally been done by humans through visual evaluation and experiment, which leads to inaccurate detection and incorrect use of pesticides and requires more experts and time. In order to achieve the best control mechanism, this research helps to identify the disease in its early stages with the help of an automatic system using a CNN. Our system is trained with the collected data and achieves 96.4% accuracy on the test dataset. The general performance of our model is evaluated by a confusion matrix and gives 97.6%, 95.2%, and 96.4% in precision, recall, and accuracy, respectively. During implementation, a pre-trained model, Resnet50, is also trained on our dataset and achieves 97.9%, 92.9%, and 95.4% in precision, recall, and accuracy. As the implementation results indicate, the CNN is the best method for identifying and classifying Enset leaf black sigatoka disease from images.
References 1. Escudero CA, Calvo AF, Bejarano A (2021) Black sigatoka classification using convolutional neural networks. Int J Mach Learn Comput 11(4). https://doi.org/10.18178/ijmlc.2021.11.4. 1055 2. Selvaraj MG, Vergara A, Ruiz H, Safari N, Elayabalan S (2019) Walter Ocimati5 and Guy Blomme6AI-powered banana diseases and pest detection. Plant Methods 15:92. https://doi. org/10.1186/s13007-019-0475-z 3. Criollo A, Mendoza M, Saavedra E, Vargas G (2020) Design and evaluation of a convolutional neural network for banana leaf diseases classification, pp 1–4. https://doi.org/10.1109/EIRCON 51178.2020.9254072 4. Afework YK, Debelee TG (2020) Detection of bacterial wilt on enset crop using deep learning approach. Int J Eng Res Afr 51:131–146 5. Gurmu TA (2017) Black sigatoka leaf streaks of banana (Musa spp.) caused by Mycosphaerella fijiensis in Ethiopia. J Plant Dis Prot 124:245–253 6. Barekye A et al (2011) Appraisal of methods for assessing black sigatoka. Afr J Plant Sci 15(5):901–908
7. Report I. S. (2020) 2019 Article IV consultation and requests for three-year arrangement under the extended credit facility and an arrangement under the extended fund facility-press release and staff report. IMF 8. Fanta SW, Neela S (2019) A review on nutritional profile of the food from enset: a staple diet for more than 25 percent population in Ethiopia. Nutr Food Sci 49(5):824–843 9. Camargo A, Smith JS (2009) Image pattern classification for the identification of disease causing agents in plants. Comput Electron Agric 66(2):121–125 10. Zhang K, Zhao J, Zhu Y (2018) MPC case study on a selective catalytic reduction in a power plant. J Process Control 62:1–10 11. Misra VS (2017) Detection of plant leaf diseases using image segmentation and soft computing techniques. Inf Process Agric 323–326 12. Pujari JD (2013) Automatic fungal disease detection based on wavelet feature extraction and PCA analysis in commercial crops. Int J Image Graph Sig Process 24–31 13. Salazar PO (2016) Artificial vision system using mobile devices for detection of Fusarium fungus in corn. Res Comput Sci 121:95–104 14. Umit Atila MU (2021) Plant leaf disease classification using EfficientNet deep learning model. Ecol Inf 15. Lakshmi Patibandla RSM, Tarakeswara Rao B, Ramakrishna Murthy M, Hyma J (2023) Smarter e-agro system utilizing block chain technology. Telematique 21(1). ISSN: 1856-4194
Detecting Communities Using Network Embedding and Graph Clustering Approach Riju Bhattacharya , Naresh Kumar Nagwani , and Sarsij Tripathi
Abstract Complex network community structure has been shown effective in many fields, including biology, social media, health, and more. Researchers have explored many different techniques for studying complex networks and discovering communities within them. Most of them, however, lack the expressiveness necessary to learn the node and edge representations of complicated networks. This study has aimed to improve the performance of the Hierarchical Clustering algorithm by analyzing a framework for learning continuous feature representations using node embedding methods for nodes in networks. The proposed method improves upon previous methods by training a map from nodes to a feature space with fewer dimensions. The proposed approach employs the Hierarchical Clustering method to accomplish community identification in the benchmark networks by determining the level of similarity between any pair of node embeddings. Substantial exploratory research on a variety of real-world social networks has shown the efficacy of the proposed approach compared to existing state-of-the-art community discovery techniques. The proposed method has provided high-accuracy results when applied to graph datasets. Keywords Community detection · Node embedding · Graphs clustering · Social networks
R. Bhattacharya (B) · N. K. Nagwani Department of Computer Science and Engineering, National Institute of Technology Raipur, Raipur, India e-mail: [email protected] N. K. Nagwani e-mail: [email protected] S. Tripathi Department of Computer Science and Engineering, Motilal Nehru National Institute of Technology Allahabad, Allahabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_27
1 Introduction Communities are social groups that share common interests and values. Whether it is at work, at home, or among pals, humans are social beings. Communities form when a network’s participants exhibit unusually high levels of shared hobbies, content consumption, or interaction [1, 2]. Data insights into the network’s dynamics and overall status can be retrieved from communities since they provide a summary of the network’s structure by highlighting its primary attributes at a macro level. Understanding the structure of a network and extracting useful information from it requires first locating its communities. This information has a wide range of potential uses, from the biological domain to social groups, health care, and marketing. Numerous methods have been proposed recently to address the issue of community detection. Over the past few decades, researchers have paid a great deal of focus to standard community identification techniques [3], such as Hierarchical Clustering and spectral clustering [4], modularity optimization [5], and the stochastic block model [3]. According to the conventional definition of a community, there are more ties within a group than there are between groups. These techniques aim to use measures to classify the closeness of edges to group together highly connected nodes [6]. The network characteristics (such as node homophily and proximity metrics) that may be extracted from an analysis of node and structural similarities can be used to fine-tune the community identification process [7]. Numerous approaches have been presented to aid in discovering communities inside the graph network. However, many of them emphasize edge topological metrics (such as edge closeness), which can produce an unbalanced and misleading representation of the network [8]. To address this issue, researchers have recently turned to network embedding, a novel approach to network analysis [9]. The core idea underlying network embedding is to map the interactions between nodes, such as the node and structural resemblances, into a single low-dimensional vector space [10]. To better comprehend the relationships between nodes and the network as a whole, we can “embed” each node into a lower-dimensional environment. The vector space allows for the consolidation of inherent information (such as node and structural similarities) and the elimination of sparsity (redundant and noisy data). Several clustering approaches enable the direct discovery of community structures inside networks when applied to such a low-dimensional vector space [9]. Node embedding is one of the most prominent technologies used to achieve network embedding because of its capacity to collect the global information of networks, which has led to its widespread adoption as a solution to node categorization and node clustering challenges [11, 12]. For instance, similar nodes can be discovered and placed in the same community independent of their topological proximity, which can aid in finding an appropriate balance between the exploitation of node similarity and global information in networks [13]. Additionally, many prior works have demonstrated that node embedding’s additivity and interpretability are significant benefits, as it can constantly learn the feature representations for nodes in networks. It also has the ability to learn a mapping of
nodes to a low-dimensional set of characteristics, which helps to preserve node neighborhoods in a network more effectively. When analyzing graph-based social networks, similar categories of nodes can be grouped into a community based on their features, whereas past studies focused solely on structural properties to identify a network's communities. This study also examines various embedding algorithms and how they perform on different datasets [14]. This paper provides an alternative method for identifying community structures within networks and applies it to real-world graph network datasets. To find these communities, this study proposes the nodeEmbed-HC algorithm, which incorporates Node2Vec into the Hierarchical Clustering methodology. The remainder of the paper is organized as follows: Sect. 2 presents the prior art of node embedding and clustering approaches. Section 3 presents the methodological specifics of using node embedding for community detection. Different node embedding methods for community detection are simulated and compared in Sect. 4. Section 5 concludes the study by summarizing its findings and discussing where node embedding is headed in the future.
2 Related Works 2.1 Community Detection Techniques Several approaches have been developed for determining the community structure of a network, which has significant practical applications in revealing the network's organization, as explained in Sect. 1. In a nutshell, these algorithms can be divided into heuristic-based and optimization-based varieties [15]. The former operates under the assumption that the process formed on a network is dynamic and is founded on heuristic principles or intuitive assumptions; a generative model, such as a Markov chain, can be used to explain the generation process [16]. In contrast to heuristic-based algorithms, the latter uses stochastic models of well-known metrics, such as modularity [17] and the likelihood function, to search for optimal solutions with respect to these metrics [18]. However, both of these traditional community recognition approaches rely largely on edge topology information (such as the closeness of edges), which leads to two problems: (1) it is not possible to reliably remove duplicate or noisy data from a network's original adjacency matrix [19]; (2) it is challenging to keep track of the relationships between nodes, the nodes themselves, and other fundamentals of a network's structure [20]. By resolving these issues, community discovery algorithms will be able to expose more nuanced features of networks. For instance, a network's intrinsic topology tells us a lot about the connections between its nodes (e.g., structural similarity). As a result of these findings, many researchers have begun applying network embedding technology to transform the original network into a low-dimensional vector space that can characterize many different aspects of the
network [21, 22]. In order to address these two problems in traditional community detection, a variety of clustering techniques have been applied directly to this vector space [20, 23].
2.2 Algorithms for Network Embeddings Network embedding is proving to be a useful tool in the field of network analysis, particularly for the detection of communities [24, 25]. Community discovery strategies that rely on network embeddings perform better than their predecessors. Traditional approaches to community detection focus on edge topology information (such as the closeness of edges), which can lead to data duplication or the loss of important structure-related data (such as structural similarity) [19]. In contrast, the primary objective of network embedding-based methods is to learn a low-dimensional distributed vector representation of the nodes. These vectors can be projected into a lower-dimensional space that preserves the underlying structure's valuable information while eliminating redundant data [20, 26]. For structural analysis, it is crucial that network embedding approaches maintain node and structural similarities in networks: related nodes that are close to one another should remain close when embedded into the vector space [5]. Based on this idea, several techniques have been developed to accurately record the distances between pairs of nodes, preserving first-order, second-order, and higher-order proximity [12, 27]. In addition, several strategies are provided to maintain the first- and second-order proximity information; for example, MNMF employs NMF technology to maximize both of these proximities at once [28]. Other techniques, such as DeepWalk [9], Node2Vec [12], and Walklets [29], use a random walk strategy to preserve node proximity beyond the second order. The proposed approach combines a clustering strategy with the Node2Vec node embedding strategy in order to locate the communities [30, 31].
3 Frameworks of the Proposed Model The Node2Vec technique and the Hierarchical Clustering (HC) algorithm have been introduced in the previous sections. This study develops a new approach, nodeEmbed-HC, that integrates Node2Vec with the HC approach to identify communities in large networks. The Node2Vec technique is used to convert a graph G into an embedding space. The number of dimensions of the embedding space is typically much smaller than the number of nodes in the original graph G. The technique attempts to keep the original graph's structure intact: graph nodes having close relationships to one another will have comparable embeddings in the embedding space. Each network node is represented by a vector in this embedding space.
Both Node2Vec [12] and DeepWalk [9] rely on the co-occurrence of nodes in random walks to determine similarity between nodes. A random walk is a form of stochastic process and can be explained most simply as walking: at each time index, the walker progresses along a path determined by a set of probabilities, with each step chosen statistically. This method investigates the connection between the distance from the starting point and each potential next step. Node2Vec introduces the following equation to compute the probability of moving to node x from node v:

P(c_i = x \mid c_{i-1} = v) = \begin{cases} \pi_{vx}/Z, & \text{if } (v, x) \in E \\ 0, & \text{otherwise} \end{cases}  (1)
where \pi_{vx} is the unnormalized transition probability between vertices v and x, and Z is the normalization constant. The probability is 0 if there is no edge between v and x; if such an edge exists, the normalized probability of moving from v to x can be calculated. The Node2Vec model employs a guided random walk to generate biased random walks for both weighted and unweighted networks. The guided walk is controlled by two variables, p and q: the return parameter p governs the likelihood that a walk revisits the node it has just come from, while the in-out parameter q governs whether the walk explores parts of the network that have not been visited before. The nodeEmbed-HC technique employs this second-order random walk and therefore has two parameters, namely the breadth weight p and the depth weight q. Additionally, it investigates the hidden relationships that exist between nodes. The probability of reaching the destination node from the starting node in an unweighted and undirected network graph is calculated by following the edges and nodes in the graph. The time complexity of computing this transition probability is significant, namely O(E + β), where E is the number of edges in the network and β is the number of neighbors of the target node. The random walking path between each pair of nodes is built using this transition probability rule. The node embedding is then learned by training on the walk paths with a stochastic gradient descent approach and a skip-gram model. To assist in discovering the community structure in both real-world and synthetic datasets, the degree of similarity between every pair of node embeddings is first determined to build the weight (similarity) matrix W. After normalization, the eigenvectors are combined to produce an eigenmatrix F of dimension N × K, where K < N and N is the total number of nodes in the network. The hierarchical method is then applied to the clusters formed by F, the eigenmatrix, to obtain the various communities. An outline of the proposed model's structural components is illustrated in Fig. 1, and Table 1 shows the definitions of the important symbols. The effectiveness of the nodeEmbed-HC algorithm is highly sensitive to the values that are selected for its many parameters. Multiple simulation trials led us to the conclusion that the optimal settings for node embedding training are as follows:
Fig. 1 Architecture of the proposed community detection method
Table 1 Definitions of the important symbols

Symbols | Description
G | Graph network
G = (V, E) | Input graph dataset
D | Degree matrix
NC | Scaling strategy
F | Eigenmatrix
L | Normalized graph Laplacian
L | Laplacian matrix
p = 1, q = 1, r = 10, d = 1, and the number of walks in each network is 10. The HC algorithm requires that m, the number of communities, be provided as a fixed value corresponding to the actual number of communities present in each network. The number of smallest eigenvalues k for each network should be set to m, as this value corresponds to the final number of communities. The underlying procedure for the designed nodeEmbed-HC model is described in Algorithm 1.
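To make the biased transition rule of Eq. (1) concrete, the short Python sketch below computes the normalized probabilities of one walk step given the previous node. It is only an illustration of the standard Node2Vec-style bias; the function name, example graph, and default values are ours rather than the paper's.

```python
import networkx as nx

def transition_probs(G, t, v, p=1.0, q=1.0):
    """Normalized probabilities of stepping from v to each neighbor x,
    given that the walk arrived at v from node t (Node2Vec-style bias)."""
    weights = {}
    for x in G.neighbors(v):
        w = G[v][x].get("weight", 1.0)   # 1.0 for unweighted graphs
        if x == t:                        # returning to the previous node
            weights[x] = w / p
        elif G.has_edge(t, x):            # x is also a neighbor of t
            weights[x] = w
        else:                             # moving further away from t
            weights[x] = w / q
    Z = sum(weights.values())             # normalization constant of Eq. (1)
    return {x: wt / Z for x, wt in weights.items()}

G = nx.karate_club_graph()
print(transition_probs(G, t=0, v=1, p=1.0, q=1.0))
```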
4 Experiments As an illustration of the proposed method, node embedding combined with clustering was used to identify communities in social networks, a co-author citation network, and entity classification in a multi-view graph network. The experimental design, the baseline approaches, and the evaluation metrics are also outlined here.
4.1 Dataset Description Experiments have been conducted with two social network datasets (Karate and Dolphin) and an Email communication network dataset (Email) to demonstrate the utility of nodeEmbed-HC. The aforementioned datasets were chosen as benchmarks for the reference methods since they are all freely accessible to the public. The statistical characteristics of the datasets are summarized in Table 2.
Table 2 Summary of benchmark datasets

Datasets | Types of network | Vertices | Edges | Classes
Karate | Social | 34 | 78 | 2
Dolphin | Social | 62 | 159 | 2
Email | Email | 1133 | 10,903 | 23

Fig. 2 Various communities constructed using the nodeEmbed-HC method in graph datasets: a Karate network, b Dolphin network, c Email network

Zachary's Karate Club consisted of 34 people at an American university in the 1970s, all of whom knew each other through the club. Figure 2a illustrates the relationship between the red community, represented by node v34 (Instructor), and the green community, represented by node v1 (President) [32]. The 62 dolphins in the colony in Doubtful Sound, New Zealand, are connected in an undirected social network of regular associations [33]. SN4 is the central node in the left community (shown in red) and Web is the central node in the right community (shown in green) in Fig. 2b. In the Email network, each student at Universitat Rovira i Virgili (URV) is a node, and if student X sends an Email to student Y and student Y responds, then students X and Y are linked [34]. The proposed algorithm divides this Email network into 23 communities, as shown in Fig. 2c. The resulting partition can be an accurate reflection of the interactions between the various Email account holders.
4.2 Summary of Evaluation Metrics The effectiveness of the community detection results has been evaluated using the following evaluation metric indices. • Modularity (MOD): The appropriateness and effectiveness of the community organization can be measured in a variety of ways. The modularity (Q) metric [17] was suggested by Newman and Girvan in 2004. It reflects the deviation between the observed connection counts and those expected under purely random circumstances. The scalar Q is determined using the following equation and returns a value in the
range [−1, 1]; a greater value is preferred. Its purpose is to evaluate the clustering capability of a network. The equation is given below:

\text{Modularity}(Q) = \frac{1}{4m} \sum_{i,j=1}^{n} \left( A_{ij} - \frac{K_i K_j}{2m} \right) S_i S_j  (2)
where A_{ij} represents the adjacency matrix, K_i K_j / (2m) is the expected number of links between nodes i and j when the links are randomly arranged, m is the number of edges in the network, K_i denotes the degree of vertex i, and S_i, S_j indicate whether a pair of nodes shares a community. Further, Q lies in the range [−1, 1]: a value of 0 indicates that the nodes do not belong to the same community, whereas 1 indicates that they do and that the algorithm is functioning well. • Normalized Mutual Information (NMI): NMI is a normalization of the mutual information score that measures how well an algorithm identifies the community structure [35]. It takes values between 0 and 1, with higher numbers being better.

\text{NMI}(C_i, C_j) = \frac{-2 \sum_{i=1}^{C_i} \sum_{j=1}^{C_j} N_{ij} \log \frac{N_{ij} \cdot n}{N_{i\cdot} N_{\cdot j}}}{\sum_{i=1}^{C_i} N_{i\cdot} \log \frac{N_{i\cdot}}{n} + \sum_{j=1}^{C_j} N_{\cdot j} \log \frac{N_{\cdot j}}{n}}  (3)
where the parameter N signifies the community matrix, N_{i\cdot} is the sum of the i-th row of the community matrix, and N_{\cdot j} is the sum of the values in the j-th column of the community matrix. Further, the ground-truth community assignment is denoted by C_i, and the algorithm's community result is represented by C_j. • Accuracy (ACC): Accuracy measures how correctly the communities are divided with respect to the ground truth, using the following criterion [36]:

\text{ACC}(C_i, C_j) = \frac{1}{|C|} \sum_{i=1}^{|C_i|} \delta(C_i, C_j)  (4)
Here, the Kronecker delta δ(·) equals 1 when the ground-truth label and the identified community label are identical, and 0 otherwise.
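For concreteness, the sketch below shows one way these three metrics can be computed with networkx and scikit-learn. The Karate graph and the identity prediction are placeholders rather than the paper's experimental setup, and the accuracy call assumes the predicted labels are already aligned with the ground truth.

```python
import networkx as nx
from networkx.algorithms.community import modularity
from sklearn.metrics import normalized_mutual_info_score, accuracy_score

G = nx.karate_club_graph()
true_labels = [0 if G.nodes[v]["club"] == "Mr. Hi" else 1 for v in G.nodes]
pred_labels = list(true_labels)          # stand-in for an algorithm's output

# networkx expects communities as collections of node sets
communities = [{v for v in G.nodes if pred_labels[v] == c}
               for c in sorted(set(pred_labels))]
q = modularity(G, communities)

nmi = normalized_mutual_info_score(true_labels, pred_labels)
acc = accuracy_score(true_labels, pred_labels)   # assumes label alignment
print(f"MOD = {q:.3f}, NMI = {nmi:.3f}, ACC = {acc:.3f}")
```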
4.3 Baseline Methods' Description The proposed method nodeEmbed-HC has been compared with different types of baseline methods, consisting of clustering methods, viz. k-means and Hierarchical Clustering, and a graph embedding approach.
• Hierarchical Clustering (HC): This algorithm creates a structured hierarchy of groups. At the outset, each observation forms its own cluster. Each succeeding stage merges the two "nearest" clusters into a single, larger cluster. With the "average" linkage, the "distance" between two clusters is the average of the distances between the points of one cluster and the points of the other. Graph clusters were created for this analysis of similarities and differences using the average-link Hierarchical Clustering method [37]. • Node Embedding using Node2Vec (Node2Vec-SC): This approach maps nodes to a low-dimensional space of characteristics for better representation learning. Community detection in the provided networks is completed by further applying the spectral clustering technique to the similarity between any two node embeddings. Experimental results on a wide variety of real-world and synthetic networks demonstrate that this technique outperforms earlier state-of-the-art community discovery algorithms [38]. • K-Means: A traditional clustering technique for partitioning data [39].
4.4 Implementation Process of the Proposed Approach The process for implementing the proposed approach is as follows; a minimal code sketch of this pipeline is given after the list. • The procedure begins with the input of a graph, from which a collection of random walks is extracted. • The walks can then be represented as an ordered series of words, with each node standing in for a word. The generated random walks are fed to the skip-gram model. • Every node in a random walk may be thought of as a word and the whole walk as a sentence, as described above, which is why the skip-gram model works on walks just as it does on sentences. • Each node is assigned an embedding based on the results of the skip-gram model, generating node embeddings associated with each node. The embedding vectors are passed to a clustering algorithm such as Hierarchical Clustering to divide the nodes into communities, as illustrated in Fig. 1. • Using node embedding (Node2Vec) in association with a clustering technique allows the use of labeled, weighted, and unweighted network datasets.
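The following minimal sketch illustrates the pipeline described above using uniform random walks, gensim's skip-gram Word2Vec, and scikit-learn's agglomerative clustering. The embedding dimension, walk length, and number of clusters are illustrative assumptions rather than the tuned settings reported earlier, and a full implementation would bias the walks with p and q.

```python
import random
import networkx as nx
from gensim.models import Word2Vec
from sklearn.cluster import AgglomerativeClustering

def uniform_walks(G, num_walks=10, walk_len=10):
    """Plain random walks; Node2Vec would bias each step with p and q."""
    walks = []
    for _ in range(num_walks):
        for start in G.nodes:
            walk = [start]
            while len(walk) < walk_len:
                walk.append(random.choice(list(G.neighbors(walk[-1]))))
            walks.append([str(v) for v in walk])   # treat nodes as "words"
    return walks

G = nx.karate_club_graph()
sentences = uniform_walks(G)
w2v = Word2Vec(sentences, vector_size=16, window=5, sg=1, min_count=0, seed=42)
X = [w2v.wv[str(v)] for v in G.nodes]              # one embedding per node

hc = AgglomerativeClustering(n_clusters=2, linkage="average")
labels = hc.fit_predict(X)
print(dict(zip(G.nodes, labels)))
```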
5 Results and Discussion The benchmark techniques K-means [39], HC [37], and Node2Vec-SC [38] have been used for comparison with the proposed nodeEmbed-HC. These algorithms have been evaluated on real-world networks to validate the suggested method. In
order to evaluate the precision and efficacy of the proposed nodeEmbed-HC algorithm, we compared the modularity Q [17], NMI [35], and ACC [36] for every real-world network. The experimental findings show that the suggested method outperforms the baselines in a number of respects and provides benefits on real-world networks. The evaluation metrics (accuracy, NMI, and modularity) are summarized and displayed in percentage form in Table 3. All three datasets have yielded outstanding results. This demonstrates that the proposed method is superior to the baselines on practically all three metrics, showing that the nodeEmbed-HC strategy produces better results and that graph data are well suited to the presented nodeEmbed-HC paradigm. The nodeEmbed-HC outperforms the baselines in accuracy on everything but the Email data and also outperforms the baselines in terms of modularity performance. Specifically, HC beats the other baseline algorithms on the Email data, where its modularity score ranks at the top among the baselines. The modularity Q-value of every method on the graph-based datasets is presented in Table 3. The data in Table 3 show that nodeEmbed-HC outperformed the other three algorithms in both the Dolphin and Email networks in terms of Q values. In the Karate network, the algorithm did not provide the highest modularity compared to K-means and Node2Vec-SC. In most circumstances, the Node2Vec-SC baseline has a higher modularity Q-value than the other baselines. Values in the range [0, 1] of the normalized mutual information (NMI) index score are appropriate for evaluating the relevant part of the identified community structure. Additionally, a similar pattern can be observed in the accuracy values in Table 3 across the various datasets. The nodeEmbed-HC model outperformed the other baseline techniques on the Karate Club (94.8% accuracy), Dolphin (93.2% accuracy), and Email (92.4% accuracy) datasets, demonstrating the positive effect of combining node embedding with hierarchical clustering. Table 3 Average performance of the nodeEmbed-HC approach and several baseline methods over three datasets
Datasets | Evaluation metric | K-means | HC | N2V-SC | NE-HC
Karate Club | NMI | 58.2 | 88.8 | 57.4 | 89.4
Karate Club | MOD | 83.7 | 75.1 | 88.9* | 79.3
Karate Club | ACC | 61.5 | 90.1 | 92.3* | 97.8
Dolphin | NMI | 36.7 | 82.3 | 91.2* | 93.4
Dolphin | MOD | 54.6 | 71.2 | 72.5* | 88.2
Dolphin | ACC | 47.4 | 89.6 | 91.3* | 95.2
Email | NMI | 67 | 86.1 | 96.6* | 91.5
Email | MOD | 18.7 | 84.5* | 75.1 | 87.2
Email | ACC | 71.3 | 81.5 | 84.5 | 92.4

* Bold numbers indicate the best results, whereas numbers followed by an asterisk indicate the best results from the baseline methodologies. The abbreviations N2V-SC (Node2Vec-SC), NE-HC (nodeEmbed-HC), and HC (Hierarchical Clustering) are used
Table 4 Comparison (mean) of nodeEmbed-HC and baseline modularity performance across three datasets

Datasets | K-Means | HC | N2V-SC | NE-HC
Karate | 0.026 | 0.414 | 0.598 | 0.718
Dolphin | 0.211 | 0.402 | 0.621 | 0.743
Email | 0.079 | 0.624* | 0.559 | 0.505

* The significant values of modularity (Q), which lies in the range [−1, 1], are indicated in bold and with an asterisk, which represents the best result found using the baseline methods
5.1 Modularity Evaluation Metric Compared with Standard and Proposed Approaches Modularity is a crucial metric for measuring the efficiency of a community discovery approach. The experimental findings for each dataset are shown in Table 4, where the proposed model is assessed against the benchmark community detection methods using the modularity evaluation metric. The maximum modularity value on the Dolphin dataset improved by 12.2% with nodeEmbed-HC compared to the baseline methods. The suggested model's modularity score meets the required level on all benchmark datasets, with the exception of the Email dataset.
5.2 The Proposed Model's Computational Cost The proposed approach combines the benefits of node embedding (a guided random walk with skip-gram) with clustering methods for weighted and unweighted networks to find underlying community groups utilizing attributed information. Every test was conducted on a system with 16 GB of RAM and a 2.7 GHz Intel Core i7 processor. Pandas and other Python modules were used to create the node embedding models. The Gephi [40] tool was used to generate features for nodes in the graph networks. Table 5 displays the execution duration for each model. K-Means had the worst running time, 5.78 s. NodeEmbed-HC takes 0.96 and 0.91 s for the Karate and Email datasets, respectively, while the other methods fall in between. The duration of each iteration for each dataset is measured in seconds, considering the distinct feature sizes of each dataset. The nodeEmbed-HC outperforms the benchmark methods.
Table 5 Computational time on the Karate and Email datasets

Techniques | Time (in s) Karate | Time (in s) Email
K-Means | 5.78 | 4.82
HC | 2.12 | 2.34
N2V-SC | 1.19 | 1.04
NodeEmbed-HC | 0.96 | 0.91

The bold numbers represent the proposed model's best running time over the other methods
6 Conclusion This research used node embedding with Hierarchical Clustering to develop a novel community discovery algorithm. When applied to graph datasets, the nodeEmbed-HC technique outperformed state-of-the-art methods by a wide margin. The effectiveness of a community detection algorithm can be measured in three ways: modularity, NMI, and accuracy. Our suggested nodeEmbed-HC technique has been experimentally tested and evaluated using these metrics. We put three networks with their respective ground-truth communities to the test. The nodeEmbed-HC performed admirably across a variety of real-world network evaluation parameters, including modularity, NMI, and accuracy. Based on our experiments, it can be concluded that the proposed nodeEmbed-HC algorithm performed well on a number of modestly sized real-world networks. Experiments show that our strategy is more effective than earlier approaches at producing high-quality clusters. Furthermore, the nodeEmbed-HC technique can develop continuous feature representations for nodes and preserve their relationships.
References 1. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci U S A 99:7821–7826. https://doi.org/10.1073/pnas.122653799 2. Fani H, Bagheri E (2017) Community detection in social networks. Encycl Semant Comput Robot Intell 01:1630001. https://doi.org/10.1142/s2425038416300019 3. Fortunato S, Hric D (2016) Community detection in networks: a user guide. Phys Rep 659:1–44. https://doi.org/10.1016/j.physrep.2016.09.002 4. Newman MEJ (2013) Spectral methods for community detection and graph partitioning. Phys Rev E Stat Nonlinear Soft Matter Phys 88. https://doi.org/10.1103/PhysRevE.88.042822 5. Liang Y, Cao X, He D, Chuan W, Xiao W, Weixiong Z (2016) Modularity based community detection with deep learning. In: Proceedings of IJCAI, pp 2252–2258. IJCAI, New York, USA 6. Xie Y, Gong M, Wang S, Yu B (2018) Community discovery in networks with deep sparse filtering. Pattern Recognit 81:50–59. https://doi.org/10.1016/j.patcog.2018.03.026 7. Lyu T, Zhang Y, Zhang Y (2017) Enhancing the network embedding quality with structural similarity. In: International conference on information and knowledge management, proceedings, Part F1318, pp 147–156. https://doi.org/10.1145/3132847.3132900
8. Zhang D, Yin J, Zhu X, Zhang C (2018) Network representation learning: a survey. IEEE Trans Big Data 6:3–28. https://doi.org/10.1109/tbdata.2018.2850013 9. Perozzi B, Al-Rfou R, Skiena S (2014) DeepWalk: online learning of social representations. In: Proceedings of ACM SIGKDD. ACM, New York, USA, pp 701–710. https://doi.org/10. 1145/2623330.2623732 10. Shi B, Zhou C, Qiu H, Xu X, Liu J (2019) Unifying structural proximity and equivalence for network embedding. IEEE Access 7:106124–106138. https://doi.org/10.1109/ACCESS.2019. 2932396 11. Bhagat S, Cormode G, Muthukrishnan S (2011) Node classification in social networks. In: Social network data analytics, pp 115–148. https://doi.org/10.1007/978-1-4419-8462-3_5 12. Grover A, Leskovec J (2016) Node2vec: scalable feature learning for networks. In: Proceedings of ACM SIGKDD. International conference on knowledge discovery and data mining, 13–17 Aug, pp 855–864. https://doi.org/10.1145/2939672.2939754 13. Qiu J, Dong Y, Ma H, Li J, Wang K, Tang J (2018) Network embedding as matrix factorization: unifying DeepWalk, LINE, PTE, and node2vec. In: WSDM 2018—Proceedings of the 11th ACM international conference on web search and data mining, pp 459–467. https://doi.org/10. 1145/3159652.3159706 14. Agrawal R, Arquam M, Singh A (2020) Community detection in networks using graph embedding. Procedia Comput Sci 173:372–381. https://doi.org/10.1016/j.procs.2020.06.044 15. Gao C, Liang M, Li X, Zhang Z, Wang Z, Zhou Z (2018) Network community detection based on the Physarum-inspired computational framework. IEEE/ACM Trans Comput Biol Bioinf 15:1916–1928. https://doi.org/10.1109/TCBB.2016.2638824 16. Satuluri V, Parthasarathy S (2009) Scalable graph clustering using stochastic flows: applications to community discovery. In: Proceedings of ACM SIGKDD. International conference on knowledge discovery and data mining, pp 737–745. https://doi.org/10.1145/1557019.1557101 17. Newman MEJ (2006) Modularity and community structure in networks. Proc Natl Acad Sci U S A 103:8577–8582. https://doi.org/10.1073/pnas.0601602103 18. Karrer B, Newman MEJ (2011) Stochastic blockmodels and community structure in networks. Phys Rev E Stat Nonlinear Soft Matter Phys 83. https://doi.org/10.1103/PhysRevE.83.016107 19. Jin D, You X, Li W, He D, Cui P, Fogelman-Soulié F, Chakraborty T (2019) Incorporating network embedding into Markov random field for better community detection. https://doi.org/ 10.1609/aaai.v33i01.3301160 20. Zhang D, Yin J, Zhu X, Zhang C (2020) Network representation learning: a survey. IEEE Trans Big Data 6:3–28. https://doi.org/10.1109/tbdata.2018.2850013 21. Karatas A, Sahin S (2019) Application areas of community detection: a review. In: International congress on big data, deep learning and fighting cyber terrorism. IBIGDELFT 2018—Proceedings, pp 65–70. https://doi.org/10.1109/IBIGDELFT.2018.8625349 22. Kumar P, Jain R, Chaudhary S, Kumar S (2021) Solving community detection in social networks: a comprehensive study. In: Proceedings of the 5th international conference on computing methodologies and communication, ICCMC 2021, pp 239–245. https://doi.org/ 10.1109/ICCMC51019.2021.9418412 23. Bhattacharya R, Nagwani NK, Tripathi S (2023) CommunityGCN: community detection using node classification with graph convolution network. Data Technol Appl. https://doi.org/10. 1108/DTA-02-2022-0056 24. Tu C, Zeng X, Wang H, Zhang Z, Liu Z, Sun M, Zhang B, Lin L (2019) A unified framework for community detection and network representation learning. 
IEEE Trans Knowl Data Eng 31:1051–1065. https://doi.org/10.1109/TKDE.2018.2852958 25. Béres F, Kelen DM, Pálovics R, Benczúr AA (2019) Node embeddings in dynamic graphs. Appl Netw Sci 4. https://doi.org/10.1007/s41109-019-0169-5 26. Li M, Lu S, Zhang L, Zhang Y, Zhang B (2021) A community detection method for social network based on community embedding. IEEE Trans Comput Soc Syst 8:308–318. https:// doi.org/10.1109/TCSS.2021.3050397 27. Zhu J, Wang C, Gao C, Zhang F, Wang Z, Li X (2022) Community detection in graph: an embedding method. IEEE Trans Netw Sci Eng 9:689–702. https://doi.org/10.1109/TNSE.2021. 3130321
28. Mohammadi M, Moradi P, Jalili M (2018) AN NMF-based community detection method regularized with local and global information. In: 26th Iranian conference on electrical engineering, ICEE 2018, pp 1687–1692. https://doi.org/10.1109/ICEE.2018.8472453 29. Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E Stat Nonlinear Soft Matter Phys 78:1–5. https://doi.org/10. 1103/PhysRevE.78.046110 30. Chen F, Wang YC, Wang B, Kuo CCJ (2020) Graph representation learning: a survey. APSIPA Trans Signal Inf Process 9. https://doi.org/10.1017/ATSIP.2020.13 31. Zhou H, Liu S, Koutra D, Shen H, Cheng X (2022) Learning node embeddings via summary graphs: a brief theoretical analysis 32. Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33:452–473. https://doi.org/10.1086/jar.33.4.3629752 33. Lusseau D, Schneider K, Boisseau OJ, Haase P, Slooten E, Dawson SM (2003) The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations: can geographic isolation explain this unique trait? Behav Ecol Sociobiol 54:396–405. https:// doi.org/10.1007/s00265-003-0651-y 34. Guimerà R, Danon L, Díaz-Guilera A, Giralt F, Arenas A (2006) The real communication network behind the formal chart: community structure in organizations. J Econ Behav Organ 61:653–667. https://doi.org/10.1016/j.jebo.2004.07.021 35. Fred ALN, Jain AK (2003) Robust data clustering. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition 2, 0–5. https://doi.org/10.1109/ cvpr.2003.1211462 36. Su X, Xue S, Liu F, Wu J, Yang J, Zhou C, Hu W, Paris C, Nepal S, Jin D, Sheng QZ, Yu PS (2021) A comprehensive survey on community detection with deep learning. arXiv Prepr. arXiv2105.12584 37. Schaeffer SE (2007) Graph clustering. Comput Sci Rev 1:27–64. https://doi.org/10.1016/j.cos rev.2007.05.001 38. Hu F, Liu J, Li L, Liang J (2020) Community detection in complex networks using Node2vec with spectral clustering. Phys A Stat Mech Appl 545. https://doi.org/10.1016/j.physa.2019. 123633 39. Hartigan JA, Wong MA (1979) Algorithm AS 136: a K-means clustering algorithm. Appl Stat 28:100. https://doi.org/10.2307/2346830 40. Bastian M, Heymann S, Jacomy M (2009) Gephi: an open source software for exploring and manipulating networks visualization and exploration of large graphs. In: Proceedings of the international AAAI conference on web and social media. AAAI Publications, San Jose, California, USA, pp 361–362
Deep Learning Approaches-Based Brain Tumor Detection Using MRI Images—A Comprehensive Review S. Santhana Prabha
and D. Shanthi
Abstract In recent times, deep learning has transformed the world in every domain. Deep learning is the subdivision of machine learning that has shown extraordinary results in almost every application, particularly in the biomedical area, owing to its capability of managing enormous quantities of data. Its latent capacity has also been exploited in brain tumor detection through MRI scan images for effective prediction, where it has shown outstanding performance. Deep learning models for brain MRI datasets are evaluated through performance metrics such as precision, recall, accuracy, and F-score. This study aims to deliver an interpretative examination of existing research work that deals mainly with the mechanisms of brain tumor detection and classification. The structure of this review article is formulated by comparing the results of current studies of deep learning techniques in brain tumor prediction and detection, including the efficiency of different deep learning approaches and a discussion of the large, collectively and synthetically generated datasets used in their experiments. Keywords Deep learning · Brain tumor detection · Classification · Benign · Malignant
S. Santhana Prabha (B) · D. Shanthi PSNA College of Engineering and Technology, Dindigul, Tamil Nadu, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_28
1 Introduction The abrupt and uncontrolled development of brain tissue inside the skull can culminate in a brain tumor, one of the deadliest diseases. It can be malignant or benign. In contrast to benign tumors, which typically grow slowly, malignant tumors can spread rapidly and engulf the brain's surrounding tissue. Nevertheless, benign tumors can also be harmful because their growth may damage nearby brain tissue. About 20% of tumors are malignant, while 70% are benign [1]. Meningioma, glioma, and pituitary tumors, which are the more prominent ones, are among the more than 140 distinct kinds of brain tumors currently identified and recognized. Meningioma tumors
affecting the brain and spinal cord are among these three types and may be some of the most common primary brain tumors [2]. Because a brain tumor is potentially fatal, early detection is crucial. A radiation oncologist uses techniques such as biopsy, cerebrospinal fluid (CSF) analysis, and X-ray evaluation to diagnose brain tumors. However, hazards are associated with the biopsy procedure, such as inflammation and severe bleeding, and its precision is only about 55% [3]. CSF is a colorless fluid that circulates within and around the brain and can be analyzed to look for brain tumors. Comparable to a biopsy, several hazards are associated with this procedure, such as the possibility of an allergic reaction and leakage into the circulatory system at the location of the wound [4]. Therefore, numerous imaging-based processes exist for detecting brain tumors. CT, MRI, and skull X-ray are the most advanced scans for detecting brain tumors; compared to both skull X-ray and CT, MRI captures the best scan images for diagnosing the presence of a tumor. A brain magnetic resonance imaging (MRI) scan, sometimes known as a head MRI, is a harmless method that gives clear images of the structures inside the head, mostly the brain [5]. These images are created by MRI using a strong magnetic field, radio waves, and an electronic device; ionizing radiation is not used. Among CT, skull X-ray, and MRI, MRI is the most widely used scan, so this study concentrates on MRI scan images as the dataset for brain tumor detection. Numerous deep learning (DL) techniques have been implemented for brain tumor detection and classification from MRI images, so in this review article the authors concentrate on studying deep learning techniques for brain tumor detection on MRI images [6]. The framework of the article is organized as follows: Sect. 2 is a detailed description of brain tumor detection, including brain tumors, their types, and their symptoms. Section 3 explains the difference between medical brain tumor detection methods and answers why MRI is best. Section 4 describes the MRI image datasets. Section 5 details the different deep learning methods for detecting brain tumors. Section 6 presents the article's conclusion and future work.
2 Brain Tumor Detection Brain Tumor. A cerebral tumor, also called a brain tumor, is an abnormal mass of tissue in which cells proliferate and reproduce uncontrollably, apparently unaffected by the mechanisms that regulate normal cells. The two crucial types of brain tumors are primary and metastatic. Primary brain tumors originate in the tissues of the central nervous system or its immediate surroundings and are classified as benign or malignant [7]. Tumors that develop in other parts of the human body, such as the breast or lungs, and spread to the brain, typically through the bloodstream, are referred to as metastatic brain tumors; a malignant tumor with metastases is regarded as cancer [8]. Table 1 summarizes the MRI datasets commonly used in studies of these tumor types.
Table 1 Datasets and their descriptions

Dataset | Description
The Whole Brain Atlas | Johnson and Alex Becker developed 3-D MRI images of the brain under the neuroimaging primer at Harvard University [11]
Brain Tumor Image Segmentation (BRATS) | For brain tumor segmentation in four steps. Step 1: identification of intrinsic brain tumors, i.e., gliomas. Step 2: prediction of patient overall survival. Step 3: distinguishing pseudo-progression from true tumor presence. Step 4: method for evaluating the tumor segmentation [12]
Internet Brain Segmentation Repository (ISBR) | Delivers image segmentation outputs with MRI images [13]
Open Access Series of Imaging Studies (OASIS) | Brain MRI datasets accessible without payment; open access to all the neuroimaging databases is provided through central.xnat.org [14]
RIDER | Consists of a neuroimaging dataset of 19 patients affected by glioblastoma; the neuroimages were captured over two days for each patient [15]
The Cancer Imaging Archive (TCIA) | Consists of brain MRI image datasets of patients with cancer; it holds cancer imaging data not only for the brain but also for every other part of the body [16]
RADIOPEDIA | An imaging dataset used mainly for deep learning and machine learning algorithms; around 1000 images are used for image detection and segmentation [17]
FIGSHARECJDATA | Open-access repository consisting of images, videos, and audio; it is intended purely for researchers [18]
Symptoms. The symptoms depend on the location of the tumor and differ from person to person, although some common symptoms also occur. The primary symptoms of a brain tumor are headache, dizziness, giddiness, difficulty with eyesight, difficulty with hearing, dizziness with nausea, paralysis, and difficulty balancing [9].
Fig. 1 a Deep learning in all applications b deep learning in Brain tumor detection
3 Why MRI is Best in Brain Tumor Detection? There are several reasons for stating why MRI is best. The primary reasons discussed here are: 1. In children and in patients who need many imaging tests, MRI is favored over CT because it does not employ ionizing radiation. 2. With MRI contrast agents, there is a significantly lower chance of an allergic reaction that could be fatal. 3. MRI can evaluate features that bone artifacts on CT images might hide. 4. MRI scanning can be done in any imaging plane without moving the patient. So far, we have discussed brain tumors and their symptoms, causes, and risk factors. Our study concentrates on different MRI datasets, different deep learning models, and the performance they achieve [10]. Deep learning-based brain tumor detection. Figure 1 compares the yearly development of deep learning approaches across all applications from 2015 to 2023 with the corresponding trend for deep learning in brain tumor detection over the same period. Currently, deep learning models are used heavily in research, mainly for prediction and detection tasks in image processing.
4 MRI Dataset for Research in Deep Learning Some of the standard benchmark datasets are listed in Table 1.
5 Mechanism of Deep Learning for Brain Tumor Detection The process of detecting a brain tumor using deep learning approaches follows five fundamental steps: data pre-processing, feature extraction, application of a deep learning method to detect the tumor on the training set, testing after the completion of training, and finally evaluation of the performance metrics suggested for the model. The mechanism of brain tumor detection using deep learning is represented in Fig. 2. Input: MRI image dataset of the brain. Pre-processing: image-based pre-processing removes noise and unwanted errors using filters. Feature extraction: an image-based feature extractor through which dimensionality reduction is achieved. Brain tumor detection: deep learning approaches are applied with an 80% training set and a 20% testing set. Performance metric: the model is evaluated through metrics such as precision, recall, accuracy, and F1-score. Figure 3a–c shows the importance of feature extraction, the importance of pre-processing, and deep learning in brain tumor detection.

Fig. 2 Mechanism of brain tumor detection using deep learning

Accuracy = (TP + TN)/(TP + TN + FP + FN)  (1)
Fig. 3 a Importance of feature extraction b importance of pre-processing c deep learning in brain tumor
Precision = TP/(TP + FP)  (2)

Recall = TP/(TP + FN)  (3)

F1-score = (2 × Precision × Recall)/(Precision + Recall)  (4)
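As a hedged illustration of the mechanism described above (and not a model from any of the reviewed studies), the following sketch builds a small Keras CNN for binary tumor/no-tumor MRI classification; the architecture, input size, and the commented-out data loading are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(input_shape=(128, 128, 1)):
    """A small CNN for binary tumor / no-tumor MRI classification."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy",
                           tf.keras.metrics.Precision(),
                           tf.keras.metrics.Recall()])
    return model

model = build_cnn()
model.summary()
# Hypothetical usage with an 80/20 split prepared elsewhere:
# model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))
```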
Study of deep learning in brain tumor detection. Table 2 summarizes representative deep learning studies for brain tumor detection.
Table 2 Study of deep learning in brain tumor detection

Authors and year | Dataset | Technique used | Performance metric | Limitations
Anbarasa Pandian and Balasubramanian [19] | Synthetically generated 1000 MRI scan images of patients from a private hospital | Deep neural network and extreme learning machine | 88% accuracy (DNN); 96% accuracy (DNN + ELM) | Feature extraction is complicated; no basic comparison is made to validate the dataset [19]
Banerjee et al. [20] | REMBRANDT image repository consisting of 68,500 MRI images of 100 patients | 8-layered CNN architecture | 99.68% accuracy; 99.46% F1-score | The proposed model's contribution does not accurately match the dataset, and the authors do not compare with state-of-the-art methods [20]
Seetha and Raja [21] | REMBRANDT, 100 patients with 65,000 MRI images | 10-layer CNN architecture | 97% accuracy | Experimentation time span is not detailed, and no comparison between the dataset and the proposed model is carried out [21]
Deepak and Ameer [22] | Figshare (3064 images) and REMBRANDT (516 images) | Dual CNN: one for classification, one for grading | 97% accuracy; 96% precision; 98% recall | Complicated framework that takes more time to process the images [22]
Han et al. [23] | BRATS dataset | 3D-CNN | 84% accuracy; 82% precision; 99% recall | Explanation of the 3D CNN is not clear; weak experimentation and absence of novelty [23]
El Boustani et al. [24] | TCIA dataset | CNN | 98% accuracy; speedy execution | Proposed model is not suited to 3D MRI images [24]
Begum and Lakshmi [25] | Publicly available dataset with 100 MRI images | RNN | 96% accuracy; 98% precision; 97% recall | Complex feature extraction; model not suited to 3D images [25]
Naseer et al. [26] | BR35H benchmark dataset | Deep CNN | 98% accuracy; 99% precision; 98% recall | No novelty; the combination of CAD and CNN does not offer much improvement in time management [26]
Alanazi et al. [27] | Synthetically generated 2969 MRI scan image dataset | 22-layered CNN architecture | 97% accuracy | Model fails to satisfy characteristics such as robustness, adaptability, generalization capability, and high accuracy [27]
Pedada et al. [28] | BRATS (2017, 2018) | ResNet-34 and U-Net | 93% accuracy (BRATS 2017); 92% accuracy (BRATS 2018) | Issues with sub-pixel convolution; the comparison is made only between the models [28]
6 Conclusion This article presents a study of research work on detecting and classifying brain tumors using patients' MRI images. The detection process determines whether a tumor is present or not, while the classification process states whether the tumor is malignant, normal, or benign. Both detection and classification are carried out through deep learning approaches and assessed with performance metrics. This study will help medical imaging researchers to move their deep learning-based work forward, and it will help the authors to provide a novel deep learning model for brain tumor detection in the future.
References 1. Behin A, Hoang-Xuan K, Carpentier AF, Delattre J-Y (2003) Primary brain tumours in adults. Lancet 361(9354):323–331 2. Louis DN, Perry A, Reifenberger G, Von Deimling A, Figarella-Branger D, Cavenee WK, Ohgaki H, Wiestler OD, Kleihues P, Ellison DW (2016) The 2016 world health organization classification of tumors of the central nervous system: a summary. Acta Neuropathol 131(6):803–820
3. Kasraeian S, Allison DC, Ahlmann ER, Fedenko AN, Menendez LR (2010) A comparison of fine-needle aspiration, core biopsy, and surgical biopsy in the diagnosis of extremity soft tissue masses. Clin Orthop Relat Res 468(11):2992–3002 4. Bruzzone MG, D’Incerti L, Farina LL, Cuccarini V, Finocchiaro G (2012) CT and MRI of brain tumors. Q J Nucl Med Mol Imaging 56(2):112–137 5. Geibprasert S, Gallucci M, Krings T (2010) Alcohol-induced changes in the brain as assessed by MRI and CT. Eur Radiol 20:1492–1501 6. Yokoi K, Kamiya N, Matsuguma H, Machida S, Hirose T, Mori K, Tominaga K (1999) Detection of brain metastasis in potentially operable non-small cell lung cancer: a comparison of CT and MRI. Chest 115(3):714–719 7. Das S, Nayak GK, Saba L, Kalra M, Suri JS, Saxena S (2022) An artificial intelligence framework and its bias for brain tumor segmentation: a narrative review. Comput Biol Med 105273 8. Nayak DR, Padhy N, Mallick PK, Zymbler M, Kumar S (2022) Brain tumor classification using dense efficient-net. Axioms 11(1):34 9. Noll K, King AL, Dirven L, Armstrong TS, Taphoorn MJB, Wefel JS (2022) Neurocognition and health-related quality of life among patients with brain tumors. Hematol Oncol Clin 36(1):269–282 10. Razzaghi P, Abbasi K, Shirazi M, Rashidi S (2022) Multimodal brain tumor detection using multimodal deep transfer learning. Appl Soft Comput 129:109631 11. Johnson KA, Alex Becker J. The whole brain atlas. [Online]. Available: http://www.med.har vard.edu/AANLIB/ 12. Sahoo L, Sarangi L, Dash BR, Palo HK (2020) Detection and classification of brain tumor using magnetic resonance images. Springer, Singapore, pp 429–441 13. Ye F, Pu J, Wang J, Li Y, Zha H (2017) Glioma grading based on 3D multimodal convolutional neural network and privileged learning. In: Proceedings—2017 IEEE International conference on bioinformatics and biomedicine BIBM 2017, vol 2017-Jan, pp 759–763. https://doi.org/10. 1109/BIBM.2017.8217751 14. Anilkumar B, Rajesh Kumar P (2020) Tumor classification using block wise fine tuning and transfer learning of deep neural network and KNN classifier on MR brain images. Int J Emerg Trends Eng Res 8(2):574–583. https://doi.org/10.30534/ijeter/2020/48822020 15. RiderneuroMRI. [Online]. Available: https://wiki.cancerimagingarchive.net/display/Public/ RIDER+NEURO+MRI 16. THE cancer imaging archive (TCIA). [Online]. Available: https://www.cancerimagingarchive. net/collections/ 17. Radiopedia, cases. [Online]. Available: https://radiopaedia.org/encyclopaedia/cases/all?lan g=us 18. Brain tumor dataset. [Online]. Available: https://figshare.com/articles/brain_tumor_dataset/ 1512427 19. Anbarasa Pandian A, Balasubramanian R (2016) Fusion of contourlet transform and zernike moments using content based image retrieval for MRI brain tumor images. Indian J Sci Technol 9(29):1–8. https://doi.org/10.17485/ijst/2016/v9i29/93837 20. Banerjee S, Masulli F, Sushmita M (2017) Brain tumor detection and classification from multichannel MRIs using deep learning and transfer learning. IEEE Access 1–9 21. Seetha J, Raja SS (2018) Brain tumor classification using convolutional neural networks. Biomed Pharmacol J 11(3):1457–1461. https://doi.org/10.13005/bpj/1511 22. Deepak S, Ameer PM (2019) Brain tumor classification using deep CNN features via transfer learning. Comput Biol Med 111 23. Han C et al (2019) Combining noise-to-image and image-to-image GANs: brain MR image augmentation for tumor detection. IEEE Access 7:156966–156977. https://doi.org/10.1109/ ACCESS.2019.2947606 24. 
El Boustani A, Aatila M, El Bachari E, El Oirrak A (2020) MRI brain images classification using convolutional neural networks’. Springer, Cham, pp 308–320
25. Begum SS, Lakshmi DR (2020) Combining optimal wavelet statistical texture and recurrent neural network for tumour detection and classification over MRI’. Multimed Tools Appl 79(19– 20):14009–14030. https://doi.org/10.1007/s11042-020-08643-w 26. Naseer A, Yasir T, Azhar A, Shakeel T, Zafar K (2021) Computer-aided brain tumor diagnosis: performance evaluation of deep learner CNN using augmented brain MRI. Int J Biomed Imaging 27. Alanazi MF, Ali MU, Hussain SJ, Zafar A, Mohatram M, Irfan M, Al Ruwaili R, Alruwaili M, Ali NH, Albarrak AM (2022) Brain tumor/mass classification framework using magneticresonance-imaging-based isolated and developed transfer deep-learning model. Sensors 22(1):372 28. Pedada KR, Rao B, Patro KK, Allam JP, Jamjoom MM, Samee NA (2023) A novel approach for brain tumour detection using deep learning based technique. Biomed Signal Process Control 82:104549
Predicting Crop Yield with AI—A Comparative Study of DL and ML Approaches M. Jayanthi
and D. Shanthi
Abstract Crop yield prediction plays a vital role in the agricultural industry, as it directly impacts global food security. It is also essential for making informed decisions about crop selection (what to grow and when to grow it), production, marketing, risk management, increasing productivity, ensuring food security, and promoting environmental sustainability. Several machine learning and deep learning algorithms are widely applied to predict crop yield. In this paper, we conducted a systematic literature review of various machine learning and deep learning-based studies to examine the methods and features used in crop yield prediction. Climate variables such as temperature (minimum and maximum), humidity, wind velocity, rainfall, sunlight hours, water level, soil content, and soil type have a great impact on crop yield. This study highlights that convolutional neural networks (CNN), followed by long short-term memory (LSTM) and deep neural networks (DNN), are the most commonly used DL algorithms for crop yield prediction. Deep learning algorithms achieved low prediction errors compared to other models. Keywords Convolutional neural networks · Deep learning · Machine learning · LSTM · DNN
1 Introduction The economic transformation of a developing country like India depends highly upon the agricultural sector and its allied sectors. Sustained growth in these sectors can lead to increased household income, globalization, and improved food security. As a developing country like India continues to globalize, there is rapid growth in demand for high-value agricultural commodities. This paves the way for Indian farmers to focus on crops with higher demand in the global market. M. Jayanthi (B) · D. Shanthi PSNA College of Engineering and Technology, Dindigul, Tamil Nadu, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_29
The agriculture sector presents an opportunity for researchers to emphasize research in areas like crop disease detection and management, crop selection, crop yield improvement, crop yield prediction, irrigation, and soil health management, which can support the development of the sector, help farmers increase their productivity, and improve the overall sustainability of agriculture. Research in the agricultural area can also help address broader challenges, such as crop yield, climate change, and food security. The objectives of this study are as follows: (1) study and identify various algorithms for estimating crop yield using biological data, satellite images, and statistical data; (2) compare various generic frameworks for crop yield prediction; (3) validate the results with evaluation metrics. Agricultural crop yield has become highly unpredictable in recent times due to the large number of factors that have a wide impact on it. In precision agriculture [1], crop yield forecasts are important for the development of policies on crop procurement, crop selection, structuring crop prices, export/import decisions, etc. In this study, we compared various mathematical models used to forecast crop output. Crop yield, commonly referred to as agricultural production, is a crucial factor in agriculture that supports decisions about how to manage and reduce risks within the farming system and hence supports farming and planning to avoid yield failure and food shortage [2].
2 Background and Related Work Crop yield forecasting is one of the more difficult agricultural tasks, and a variety of methods and techniques for forecasting crop yield have been introduced and evaluated so far. Due to the impact of a vast number of factors, including soil content, atmospheric conditions, rainfall, and other biological factors, predicting crop yield is a challenging task that requires the use of multiple datasets. Several crop prediction models are available, but efficient prediction of crop yield is still necessary. Comparing different techniques that can accurately predict outcomes is an important aspect of developing a good prediction methodology. The comparison of several yield prediction experiments carried out by different researchers provides valuable insights into selecting a good set of methodologies for estimating crop yield using multi-source data. Some of the methods for predicting crop yield are discussed below.
2.1 Implementation of Machine Learning in Crop Yield Prediction Many studies have explored machine learning techniques, which have resulted in significant improvements in performance [3] thanks to advances in technology, data collection, and processing, compared to traditional models that rely on
manual surveys and prior knowledge of historical data and are useful only for small-scale fields. Figure 1 shows the generic framework for a machine learning model. The rapid development in the field of agriculture requires novel processing methods that can analyse complex data. Machine learning (ML) models have demonstrated significant performance with large amounts of information, identifying patterns and relationships among various factors, and assist in crop classification, crop disease detection, and crop yield forecasting [4, 5]. Many machine learning methodologies have been implemented in the discipline of agriculture for the classification of crops, crop disease detection, crop yield forecasting, and crop selection. Certain machine learning techniques are applied to prediction, while others are applied to classification. Regression [6] is commonly used to predict the dependent (output) variable. Regression algorithms are applied to identify the relation between the independent variables, i.e., rainfall, humidity, soil condition, temperature, and area, and the dependent variable, i.e., crop yield [7]. Plant yield prediction is carried out by implementing ML linear regression and other machine learning techniques [8]. In one study, five machine learning algorithms were used to train models, and their execution was evaluated using two years of data from a Mexican irrigation zone; K-NN and M5-Prime were reported to provide better results than the other ML algorithms. In another study, Gandhi et al. [9] investigated a Multilayer Perceptron Neural Network on datasets from multiple areas of Maharashtra to forecast rice production, considering all the important characteristics for the prediction of agricultural yield per season over four years. Various validation indicators such as RMSE, MAE, and MSE were used to validate the results. For predicting crop yield, ANN was considered to offer an alternative to conventional linear regression techniques. Fig. 1 Generic framework for machine learning model
Maya Gopal and Bhargavi [10] proposed a novel hybrid design, an artificial neural network with multiple linear regression (MLR-ANN), for crop yield prediction. Performance measures are used to evaluate the hybrid model's forecasting accuracy in comparison with multiple machine learning models, and the computational times of both the hybrid MLR-ANN and the traditional ANN were reported. The findings demonstrate that, compared to traditional models, the suggested hybrid MLR-ANN model provides higher accuracy. Hammer et al. [11] predicted sugarcane yield by implementing support vector machine, random forest and boosting. By analysing a dataset acquired from sugar mills in Brazil, they aimed to identify the variable with the highest impact on sugarcane production; the study suggested that the number of cuts in sugarcane was the most significant variable in yield prediction. Researchers used back propagation [12] and an MLFF neural network framework to construct a prediction model for crop forecasting. The dataset (Thailand Meteorological Station) consists of six climate-based parameters over a period of ten years (2002–2012). The findings showed that the introduced ANN model performed adequately, and the estimated values were very close to the fitted values of the model.
2.2 Application of Remote Sensing in Crop Yield Forecast
Remote sensing is one of the predominant methods for monitoring crop growth because it covers large areas with real-time spatial and temporal data, as shown in Fig. 2. Remote sensing data are typically generated from the electromagnetic radiation reflected or emitted by target objects such as buildings, trees, plants and soil. Satellites produce large volumes of real-time data [7] with multi-spectral and temporal features, which makes them well suited for forecasting crop yield efficiently. In one study [13], soil data and satellite sensor data were used to extract NDVI as the input feature vector to predict wheat crop yield, achieving a 91% accuracy rate, which is considered efficient for such a complex dataset. Many studies have used remote sensing data to predict crop yield [14]. The Normalized Difference Vegetation Index (NDVI), which ranges from −1 to 1, is used as an input feature for yield prediction. NDVI (N) is defined as the difference between the near-infrared (NIR) and visible red (RED) reflectance normalized by their sum:

N = (NIR − RED)/(NIR + RED).  (1)
The Enhanced Vegetation Index (EVI), which also ranges from −1 to 1, was developed as an alternative for improving crop yield forecasting and analysing crop growth patterns, overcoming the shortcoming of NDVI, which saturates at high amounts of green biomass. The EVI corrects for soil background and atmospheric influences through a canopy background adjustment factor (L) and the coefficients C1 and C2, and it is more accurate in densely vegetated regions [15]. It is calculated as:

EVI = G ∗ (NIR − RED)/(NIR + C1 ∗ RED − C2 ∗ BLUE + L).  (2)

Fig. 2 Working of remote sensing
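To make Eqs. (1) and (2) concrete, the indices can be computed directly from surface-reflectance bands. The sketch below is illustrative only: the coefficient values G = 2.5, C1 = 6, C2 = 7.5 and L = 1 are the commonly used MODIS defaults and are an assumption, not values taken from the studies reviewed here.

```python
import numpy as np

def ndvi(nir, red):
    # Eq. (1): N = (NIR - RED) / (NIR + RED); values fall in [-1, 1]
    return (nir - red) / (nir + red)

def evi(nir, red, blue, G=2.5, C1=6.0, C2=7.5, L=1.0):
    # Eq. (2): EVI with gain G, aerosol coefficients C1/C2 and
    # canopy background adjustment L (MODIS defaults assumed here)
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)

# Toy reflectance values for a single pixel
nir, red, blue = np.array([0.45]), np.array([0.12]), np.array([0.06])
print(ndvi(nir, red))        # ~0.58
print(evi(nir, red, blue))   # ~0.48
```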
Here NIR, RED and BLUE denote surface reflectance and G is a gain factor. One study applied NDVI derived from MODIS satellite data to rice crop yield prediction; the predicted results were compared with the government's statistical data and showed good model performance, with a 7.1% root mean square error.
2.3 Implementation of Deep Learning in Crop Yield Prediction
In recent years, deep learning algorithms have achieved better outcomes in the field of agriculture. Deep learning approaches can handle huge volumes of data faster than classical ML algorithms, and DL has become the preferred approach because of its ability to handle complex data together with its advanced computational and storage capabilities. Common deep learning algorithms [6] include convolutional neural networks (CNN), recurrent neural networks (RNN) and deep belief networks (DBN), as shown in Fig. 3. DL models stack a number of layers instead of a single hidden layer between the input and output layers. The size of the output feature map is obtained from the input size (in), padding (p), kernel size (k) and stride (s):

out = (in + 2p − k)/s + 1.  (3)
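Equation (3) can be wrapped in a one-line helper; the example values (input size 224, kernel 3, padding 1, stride 2) are illustrative only.

```python
def conv_output_size(n_in: int, k: int, p: int, s: int) -> int:
    # Eq. (3): out = (in + 2p - k) / s + 1 (integer division for whole pixels)
    return (n_in + 2 * p - k) // s + 1

print(conv_output_size(224, k=3, p=1, s=2))  # 112
```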
Fig. 3 Neural convolutional network
Simple neural networks are artificial neural networks that mimic the human brain. An ANN is made up of interconnected nodes arranged in three layers: an input layer, a hidden layer that receives the input and applies a function to it, and an output layer that provides the result. To begin processing, initial weights are assigned. Compared with traditional approaches such as support vector machines, random forests and decision trees, the convolutional neural network (CNN) model performs better over large data for crop yield prediction; CNN can forecast crop yield with very high accuracy, but it fails to deal with temporal dependencies in sequential data [16]. CNNs have been successfully used for crop classification, disease identification and weed detection, and deep learning strategies, especially CNN, give better feature extraction performance. A convolutional network architecture comprises three main layers: the convolutional layer (CONV), the pooling layer (POOL) and the fully connected (FC) layer [17]. The recurrent neural network (RNN) is a suitable approach for processing sequential data, but it has poorer feature extraction than CNN. To overcome these issues, a deep learning model combining convolutional layers with an RNN has been proposed for better efficiency, but it suffers from exploding and vanishing gradients; LSTM has been proposed to address this issue, solving the memory-loss problem with the help of its internal memory [13]. A GNN-RNN model that integrates spatial and temporal data for crop yield prediction by considering neighbourhood information has also been proposed and outperforms traditional RNN and CNN models.

Generic pseudocode for crop yield prediction
Input: All features for CYP including soil information, historical data, nutrients, etc.
Output: Crop yield prediction results.
Select optimized feature vectors using a feature selection algorithm.
For all instances of the dataset:
    Optimize the data by pre-processing.
End.
Create training and testing datasets by splitting the given dataset.
Train the model using an ML/DL algorithm (learning phase).
For all unknown instances in the testing dataset:
    Perform prediction using the DL/ML model.
    Return the class label.
End.
Evaluate the performance of the model using metrics such as Accuracy, F1-score, RMSE, MSE and MAE.
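The generic pseudocode above maps naturally onto a standard ML pipeline. The following is a minimal sketch assuming a tabular dataset with a numeric yield target; the file name, column names, number of selected features and the random forest model are illustrative assumptions, not the implementation of any of the reviewed studies.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("crop_data.csv")              # hypothetical file with numeric columns
X, y = df.drop(columns=["yield"]), df["yield"]

# Feature selection: keep the k features most correlated with yield
X_sel = SelectKBest(f_regression, k=5).fit_transform(X, y)

# Split into training and testing sets
X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, test_size=0.2, random_state=42)

# Learning phase
model = RandomForestRegressor(n_estimators=200, random_state=42).fit(X_tr, y_tr)

# Prediction and evaluation
pred = model.predict(X_te)
print("MAE :", mean_absolute_error(y_te, pred))
print("RMSE:", mean_squared_error(y_te, pred) ** 0.5)
```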
3 Comparison of Various Crop Yield Prediction Techniques
This section provides an overview of research carried out during the last decade on crop yield prediction and outlines the various methodologies, such as deep learning, remote sensing, machine learning, IoT and transfer learning, used by researchers to forecast agricultural production. To develop a novel automatic, efficient, more accurate and faster yield prediction system, several limitations and gaps still have to be addressed. Table 1 summarises the research work carried out on crop yield prediction.
4 Comparative Analysis
The productivity or yield of agricultural crops relies on various factors, for example weather conditions, soil condition, water, temperature, rainfall and the selected algorithm. Many existing studies use different approaches, some of which are briefly explored in this section. The performance of the various models implemented for crop yield prediction is evaluated using metrics such as accuracy, F1-score, precision and recall, which are computed from the confusion matrix indices. The confusion matrix used for classification is given in Table 2. Accuracy is the proportion of correct outcomes (both true positives and true negatives) in the total data:

Accuracy = (A + C)/(A + B + C + D),  (4)

Precision = A/(A + B),  (5)
Table 1 Research work carried on crop yield prediction

Ref. No. | Methodology | Advantages | Disadvantages
[13] | LSTM deep neural network | Vanishing gradient problem was addressed | Spatial–temporal data have not been taken into consideration
[7] | Graph neural network (GNN) and recurrent neural network | Use of geographical (spatial structure) and temporal knowledge in crop yield prediction | They did not consider predicting yield from climate data or biological data
[2] | Transfer learning approach | Uses CNN for a small amount of training data | Sometimes the pre-trained model will not have the required labels
[16] | Convolutional neural network, recurrent neural network | Eliminates the need for feature extraction and provides good accuracy | Exploding and vanishing gradient problem
[17] | Convolutional neural network | CNN performs better with images; used ReLU to achieve better performance and avoid overfitting | CNN works better with RGB images compared with NDVI
[10] | Multiple linear regression-ANN | The initial weights and bias of the input layer are computed using the MLR equation | Requires more computational resources than traditional ANN and MLR
[7] | Artificial neural network, multiple linear regression | Soil properties are taken into consideration, as tillage has an impact on cultivation | Weather-based statistical data have not been taken into consideration
[14] | Neural networks | Fast and accurate compared with conventional regression techniques | ANN tends to converge on a solution to the problem with nonlinear parameters
[3] | Random forest | Classifies a large dataset with better accuracy | Prone to overfitting; performs poorly on new data
[6] | Support vector machine, random forest, multivariate polynomial regression | The SVM outperforms other classifiers, as it is efficient in high-dimensional spaces | Only yield and climatic factors are taken into consideration
[8] | Extreme learning machine | Better generalization and faster learning | High dependency on training data
Table 2 Confusion matrix

Actual \ Predicted | Positive | Negative
Positive | True positive (A) | False negative (D)
Negative | False positive (B) | True negative (C)
Recall = A/(A + D),  (6)

F1-score = 2 ∗ (Precision ∗ Recall)/(Precision + Recall).  (7)
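Using the notation of Table 2 (A = true positive, B = false positive, C = true negative, D = false negative), Eqs. (4)–(7) can be computed directly. The counts in the sketch below are made-up values for illustration.

```python
def classification_metrics(A, B, C, D):
    # A=TP, B=FP, C=TN, D=FN, following Table 2 and Eqs. (4)-(7)
    accuracy  = (A + C) / (A + B + C + D)
    precision = A / (A + B)
    recall    = A / (A + D)
    f1        = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

print(classification_metrics(A=90, B=10, C=85, D=15))
```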
For crop yield prediction, a dataset from India's agriculture webpage was used [12]. The dataset of thirty years of rice data, collected from the Department of Economics and Statistics, Government of Tamil Nadu, includes seven attributes, including season, state name, district name, crop name, area and yield range. The data were pre-processed, and the F1-score metric was used to measure the accuracy of the various models; compared with the other algorithms, Conv1D performs better. Figure 4 shows the analysis of the F1-score. Almost every study on crop yield prediction employed RMSE to gauge the quality of the model. To evaluate the efficiency of the implemented models, we gauge the effectiveness of the various methodologies using Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and MSE.

RMSE or Root Mean Square Error It measures the average difference between the target variable's actual values and its predicted values:

RMSE = sqrt(mean((t_true − t_pred)²)).  (8)
MSE or Mean Square Error This technique is used to calculate the average squared difference between the estimated and real values:

MSE = mean((t_true − t_pred)²).  (9)

Fig. 4 Analysis of F1-score
Mean Absolute Error (MAE) It is a measure of error that gives the average absolute difference between the actual and predicted values of the target variable:

MAE = mean(|t_true − t_pred|).  (10)
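The three error measures of Eqs. (8)–(10) reduce to a few lines of NumPy; the arrays below are illustrative values only.

```python
import numpy as np

t_true = np.array([3.1, 2.8, 4.0, 3.5])   # observed yields (illustrative)
t_pred = np.array([3.0, 3.0, 3.7, 3.6])   # model predictions (illustrative)

mse  = np.mean((t_true - t_pred) ** 2)        # Eq. (9)
rmse = np.sqrt(mse)                           # Eq. (8)
mae  = np.mean(np.abs(t_true - t_pred))       # Eq. (10)
print(rmse, mse, mae)
```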
Input features such as area, the number of tanks, the length of the canals, the number of open wells and the highest temperature were used to train the machine learning models, and the accuracy of the models is compared using the RMSE and MAE evaluation metrics [10]. Figure 5 shows the RMSE and MAE values for various classifiers, and Table 3 compares various classifiers for different crops using the RMSE evaluation metric.
Fig. 5 RMSE, MAE value for various classifiers
Table 3 Comparison of various classifier for different crop using RMSE evaluation metric

S.No. | Algorithm used | Type of crop | Region | Prediction accuracy using RMSE
1 | Deep neural network | Maize | Canada and US | 12.17
2 | Analysis by principal components (PCA) | Corn and maize | Vietnam | 5–12%
3 | Multilayer perceptron | Rice, maize, potatoes, wheat, and sorghum | Saudi Arabia | 0.04498
4 | Recurrent neural network (RNN) and temporal convolutional network (TCN) | Tomato | UK | 10.41
5 | Hybrid CNN-RNN | Corn and soybean | USA | 8–9%
5 Conclusion
Deep learning algorithms achieved lower prediction errors than the other methodologies. This study observed that CNN, LSTM and DNN are the most preferred deep learning algorithms for crop yield prediction. Most of the models produced high values for their evaluation parameters, which means that they made largely correct predictions.
References 1. Saranya T et al (2023) A comparative study of deep learning and Internet of Things for precision agriculture. Eng Appl Artif Intell 122:106034 2. AgroClimaticZones. https://agriculture.rajasthan.gov.in/content/agriculture/en/AgricultureDepartment-dep/Departmental-Introduction/AgroClimatic-Zones.html. Accessed 02 Aug 2022 3. Filippi P et al (2019) An approach to forecast grain crop yield using multi-layered, multi-farm data sets and machine learning. Precis Agric 20:1015–1029 4. Sethy A, Kumar P et al (2020) Deep feature-based rice leaf disease identification using support vector machine. Comput Electron Agric 175:105527 5. Cai Y et al (2018) A high-performance and in-season classification system of field-level crop types using time-series Landsat data and a machine learning approach. Remote Sens Environ 210:35–47 6. Sellam V, Poovammal E (2016) Prediction of crop yield using regression analysis. Indian J Sci Technol 9(38):1–5 7. Schwalbert RA et al (2020) Satellite-based soybean yield forecast: Integrating machine learning and weather data for improving crop yield prediction in southern Brazil. Agric Forest Meteorol 284:107886 8. Taherei Ghazvinei P et al (2018) Sugarcane growth prediction based on meteorological parameters using extreme learning machine and artificial neural network. Eng Appl Comput Fluid Mech 12(1):738–749 9. Gandhi N et al (2016) Rice crop yield prediction in India using support vector machines. In: 2016 13th International joint conference on computer science and software engineering (JCSSE). IEEE 10. Maya Gopal PS, Bhargavi R (2019) A novel approach for efficient crop yield prediction. Comput Electron Agric 165:104968 11. Hammer RG, Sentelhas PC, Mariano JCQ (2020) Sugarcane yield prediction through data mining and crop simulation models. Sugar Tech 22(2):216–225 12. Son N-T et al (2020) Machine learning approaches for rice crop yield predictions using timeseries satellite data in Taiwan. Int J Remote Sens 41(20):7868–7888 13. Zhou W et al (2022) Integrating climate and satellite remote sensing data for predicting countylevel wheat yield in China using machine learning methods. Int J Appl Earth Obs Geoinf 111:102861 14. Tiwari P, Shukla P (2019) Artificial neural network-based crop yield prediction using NDVI, SPI, VCI feature vectors. In: Information and communication technology for sustainable development: proceedings of ICT4SD 2018. Springer, Singapore, pp 585–594 15. Xue J, Su B (2017) Significant remote sensing vegetation indices: a review of developments and applications. J Sens 1–17
16. Khaki S, Wang L (2019) Crop yield prediction using deep neural networks. Front Plant Sci 10:621 17. Van Klompenburg T, Kassahun A, Catal C (2020) Crop yield prediction using machine learning: a systematic literature review. Comput Electron Agric 177:105709
Trust and Secured Routing in Mobile Ad Hoc Network Using Block Chain E. Gurumoorthi , Chinta Gouri Sainath, U. Hema Latha, and G. Anudeep Goud
Abstract Mobile ad hoc networks (MANETs) are dispersed wireless networks without any centralized control. The most prevalent attacks and threats, such as wormhole attacks, gray hole attacks and evil twin attacks, can be mounted against MANETs, and the biggest threat to sensor devices is distributed denial of service (DDoS). The privacy of data transmitted through MANETs, as well as the network's robustness, has been among the key subjects of discussion and research in recent years. In the MANET environment, both active and passive attacks are frequent. If neglected, such attacks have a catastrophic effect on MANET nodes and could ultimately lead to the collapse of the network. Attacks frequently consume massive amounts of energy, far more than the permitted limit for energy consumption per node, which reduces longevity. Therefore, the goal of this paper's research is to prevent attacks and to distinguish malicious nodes from trustworthy nodes. Keywords MANET · Vulnerable attack · DDoS attack · IoT · Security in MANET
E. Gurumoorthi (B) · C. G. Sainath · U. Hema Latha · G. Anudeep Goud
CMR College of Engineering & Technology, Hyderabad, Telangana, India
e-mail: [email protected]
C. G. Sainath e-mail: [email protected]
U. Hema Latha e-mail: [email protected]
G. Anudeep Goud e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_30

1 Introduction
The mobile ad hoc network (MANET)'s fundamental functionality is primarily concerned with security [1]. By ensuring security issues have been addressed, network services can be made available, confidential, and maintain the integrity
of their data. MANET is regularly the subject of different security attacks due to its open medium, dynamic topology changes, cooperative algorithms, lack of central management and monitoring, and lack of a clear defense mechanism. The battle between security threats and the MANET has changed because of these factors. For the current WSN environment, cybersecurity must be a crucial part of the information management system, since attackers aim to overwhelm the targeted sensor networks through malicious behavior. A MANET facilitates communication activities using wireless connections independent of any fixed topology; its wireless nodes build an infrastructure-less ad hoc network in which nodes communicate across many hops. The self-organizing and distributed structure of the MANET enables it to carry out the desired network functionality through node participation, cooperation and communication, all of which are essential for successful communication. Routing, security, access control, dependability and energy use are problems in MANETs [2, 3]. By implementing a secure routing protocol that can identify hostile nodes and neighboring nodes attacking other nodes, these problems are addressed and attacks are prevented. In MANETs, data communication security is crucial. Security breaches typically include packet flooding, either active or passive, which uses up more energy and causes congestion that can turn into a denial of service (DoS) breach. Trusted routing serves to reduce the risk of untrustworthy nodes communicating [4, 5]. A blockchain network is organized into clusters through the process of cluster formation, which improves overhead control in scenarios with a high node density. The overall node count, node distances, node directions, velocities and mobility all influence how well clusters are created; through clustering, a large network is divided into several smaller networks. The Internet of Things interacts intelligently with mobile ad hoc networks (MANETs), improving their usability and strengthening their commercial feasibility. By combining mobile ad hoc networks, blockchain and wireless sensor networks, new MANET systems can be developed. Such a strategy lowers network deployment costs while enhancing user mobility [6], although it also poses fresh, challenging networking issues.
2 Related Work
Many routing protocols have been proposed in the literature to deal with security issues in routing. In article [7], the authors present a thorough assessment of research on multi-hop MANETs with blockchain-based node trust management. They put the problem of security in MANETs caused by a lack of trust between the participating nodes into its proper perspective, address the limitations of current blockchain use in MANETs, and propose blockchain concepts for them. The authors of article [8] propose and optimize a unique blockchain-based approach against selfish attacks in ad hoc networks. To that end, they investigate
the presence of MPR nodes in MANETs in order to adapt and integrate the blockchain, and they conduct a security analysis to evaluate the model by examining its ability to counter major security concerns such as privacy, integrity, non-repudiation, and vulnerabilities.
The system suggested in [9] uses blockchain to establish a distributed, tamper-resistant trust framework for MANET routing nodes. The blockchain idea is integrated into MANETs using the optimized link state routing protocol (OLSR) as a model protocol. The majority of security difficulties in OLSR, where each node performs security actions separately and repetitively, are resolved by blockchain as a securely distributed and trusted platform. The routing nodes in the suggested design can also work together to defend themselves against network intruders using specified principles.
The blockchain-based mobile network in [10] applies an ensemble technique described in supplementary papers. The recommended procedure for MANET routing makes use of the Byzantine Fault Tolerance (BFT) protocol. Blockchain can be integrated into an IoT-based MANET (BATMAN) by means of advanced mobile ad hoc networking; the Extended-BATMAN (E-BATMAN) process incorporates blockchain technology into the BATMAN protocol via IoT-based MANETs. With each node enforcing its own security standards, blockchain provides a secure, decentralized and dependable network.
For relay node selection with security-assisted data transmission, the authors of [11] propose a Quantum Atom Search Optimization combined with Blockchain aided Data Transmission (QASO-BDT) technique, consisting of registration, clustering and transmission phases. During the node registration phase, every sensor node is initially registered in the blockchain system through a Capillary Gateway (CG). Following the selection of a CH, the nodes are grouped into several clusters using an improved multi-view clustering model. The multi-hop transmission phase then helps to choose the appropriate relay node for multi-hop transmission using QASO. To maintain system security, a blockchain-based transaction is then carried out.
The coupling of blockchain and MANET is recommended in [12], where the authors suggest blockchain-based security enrichment in MANETs with the goal of enhancing protection and privacy of the distributed nodes. A blockchain-based trust (truthfulness) system is therefore suggested as a solution; it effectively improves network trustworthiness by binding node attributes to blockchain addresses. A trust-rating mechanism is afterward implemented to enhance block trust management in order to promote the scalability and dependability of the blockchain process.
The authors of [13, 14] suggested a method for spotting rogue nodes and swiftly warning the network. Additionally, if a malicious node resumes normal behavior or there was a classification error, the same process is used to re-establish its trust. The mechanism can be adjusted as needed to strike a balance between the sensitivity of the reaction and the prompt identification of a status change.
In [15], the authors formulate a dynamic and enhanced neighbor selection (NS) protocol named DONS, employing an innovative privacy-aware leader election on a public blockchain called AnoLE, in which the leader solves the network's Minimum Spanning Tree (MST) problem in polynomial time while remaining anonymous. As a result, the optimal NS for the present network architecture is disclosed to miners. The complexity, privacy and security of the suggested protocols are quantitatively compared to state-of-the-art MST solutions for DLs and to well-known threats.
In order to improve MANET speed and manage keys securely, the study in [16] implements Elliptic Curve Cryptography with the Diffie–Hellman key exchange mechanism (ECC-DH) using Modified Montgomery Modular Arithmetic (MMECC-DH). Processors based on the Haswell and Sandy Bridge architectures are measured for metrics such as key calculation time, packet delivery ratio, average power consumption per node, operations per second, number of cycles per key exchange, security levels, and the processing time of the ECC-DH, Montgomery Elliptic Curve Cryptography with Diffie–Hellman key exchange (MECC-DH) and MMECC-DH methods.
Optimized Link State Routing (OLSR), Ad hoc On-Demand Distance Vector (AODV) and Dynamic Source Routing (DSR) protocol simulations were carried out in [17] to determine which of these performed best; throughput, end-to-end delay and packet delivery ratio (PDR) were used in the analysis of the findings.
The authors of [18] concentrate on computing a node's confidence factor based on network constraints and node behavior to address the difficulty of ensuring secure transmission. Based on three tiers of observations, the suggested method, STBA, computes a node's secure trust [19]. The effectiveness of the proposed secure trust mechanism STBA is assessed by comparing it with routing that involves no trust computation, with the existing Belief-Based Trust Evaluation Mechanism (BTEM), and with the novel extended trust-based mechanism (NETM); in all cases, routing uses direct and indirect trust computation for node distribution.
The goal in [20] was to assess the effect of a man-in-the-middle (MIM) attack on the MANET environment and to suggest a security solution for such a scenario. This is done by (1) determining the fraction of adversaries needed to effectively execute a MIM attack in a certain MANET and (2) recommending a security process based on the well-known Diffie–Hellman protocol.
To improve upon past work, the authors of [21] propose a group key management strategy for MANETs based on the identity-based authenticated dynamic contributory broadcast encryption (IBADConBE) protocol. The scheme does away with certificate management and eliminates the need for a trusted supplier to provide each node with a secret key; the secret keys can be exchanged by a group of wireless nodes in a single round. Additionally, because the scheme is receiver-unrestricted, each sender can freely choose any suitable nodes in a group as the receivers. The scheme simultaneously satisfies the requirements for authentication, message secrecy, known security, forward security, and backward security.
The authors of [22] introduced a novel metaheuristic quantum swarm optimization-based clustering with a secure routing protocol for MANETs, called QGSOC-SRP. The optimal CH selection and route selection processes are the first two steps of the QGSOC-SRP technique. The QGSO approach begins by calculating a fitness function based on four variables: energy, distance, node degree, and trust factor. The SRP then employs the oppositional gravitational search algorithm (OGSA) to locate the optimum path to the BS. The traditional GSA is inspired by the law of gravity and the interactions between masses; the OGSA builds on the opposition-based learning concept for population initialization and generation jumping in order to improve the GSA's effectiveness [5, 19].
3 Proposed System
The new trust and security-based routing protocol is proposed to guard against various attacks. In the proposed system, the source and destination nodes are first fixed. If the source and destination nodes are within transmission range, the data are encrypted and forwarded directly to the destination. If the destination is too far away to communicate, a neighbor node has to be chosen for further forwarding of the data. While choosing the neighbor node, the system must analyze whether any security issue can arise from that neighbor, so proper authentication is needed; after this analysis the nodes are added into the blockchain. The blockchain then monitors the activity of the nodes registered in it. If a node acts as a malicious or attacking node, the blockchain automatically excludes the node from communication and removes it from the blockchain. A blockchain is made up of a collection of records. Each block has a timestamp, a list of communications, and a hash pointer to the block before it. Blocks are used to store all of the records, and the chain is formed as every block is tied to the previous block using the hash of the preceding block. A blockchain design is resistant to data alteration: once information has been stored in a block, it cannot be changed retroactively without also changing all blocks that come after it. A distributed ledger that records communications between two parties can therefore be created using a blockchain. Because it allows digital information to be distributed while preventing copying, the blockchain system has become the foundation of a new dimension of the Internet. Blockchain was first designed for digital currencies such as Bitcoin, but the IT industry has since discovered other possible applications for the technology. Recent years have seen the emergence of many blockchain-based applications across a variety of industries, including real-time IoT operating systems, reputation management and financial services. Any transaction results in the broadcast of the information to every peer in the network. Figure 1 shows the overall architecture of the proposed system.
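To make the block structure described above concrete, the sketch below builds a minimal hash-linked chain of records. It illustrates only the tamper-evidence property; the field names and hashing scheme are assumptions and not the implementation proposed in this paper.

```python
import hashlib, json, time

def make_block(transactions, prev_hash):
    # Each block stores a timestamp, a list of communications and
    # a hash pointer to the previous block (as described in the text).
    block = {"timestamp": time.time(),
             "transactions": transactions,
             "prev_hash": prev_hash}
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

genesis = make_block(["node A registered"], prev_hash="0" * 64)
block1  = make_block(["A -> B: encrypted payload"], prev_hash=genesis["hash"])

# Any retroactive change to the genesis block breaks the link to block1
genesis["transactions"].append("tampered entry")
recomputed = hashlib.sha256(json.dumps(
    {k: genesis[k] for k in ("timestamp", "transactions", "prev_hash")},
    sort_keys=True).encode()).hexdigest()
print(recomputed == block1["prev_hash"])  # False: tampering is detectable
```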
Fig. 1 System architecture
In Fig. 1, stage 1 represents authenticating every new node via the blockchain, after which the new node is added into the system. In the next stage, data handling through the MANET-IoT is carried out. In the analysis phase, the data are analyzed using a machine learning model and blocks are created for further transmission. In the following phase, the data are shared with the nodes available in the blockchain. After choosing the neighbor node, the source node forwards the encrypted data (bytes of information) to that neighbor. Once the data reach the neighbor node, a private key is used to decrypt them; the encryption and decryption are established through the blockchain. Communications from the operation pool that satisfy a cryptographic hash function are locked by a certain class of participating nodes known as miners. The selected neighbor node then acts as the source for the next transaction, and route discovery happens again; this continues until the destination is reachable. Figure 2 represents the overall data flow of the proposed system. A separate key is managed for every encryption and decryption. Once the destination is reached, the destination node repeats the same procedure to forward the ACK packet. The distance x is modelled by the density

f(x) = 2πλx e^(−πλx²).  (1)
Fig. 2 Overall data flow

Equation (1) gives the estimated distance from the source to the next-hop node. Assume there is no direct connection between the source and destination nodes, but communication can be created via two hops through at least one path; R to 2R is the range used for calculating the two-hop count. The likelihood of discovering the two hops is

P2 = P(R < x < 2R) = ∫_R^{2R} 2πλx e^(−πλx²) dx · (1 − e^(−n)).  (2)
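As a quick numerical check, Eq. (2) can be evaluated by integrating the density of Eq. (1) from R to 2R. The node density λ, range R and value of n used below are illustrative assumptions, not values from the paper.

```python
import math

def p_two_hop(lam, R, n, steps=10_000):
    # Midpoint-rule integration of 2*pi*lam*x*exp(-pi*lam*x^2) over [R, 2R] (Eq. 2)
    dx = R / steps
    integral = sum(2 * math.pi * lam * (R + (i + 0.5) * dx)
                   * math.exp(-math.pi * lam * (R + (i + 0.5) * dx) ** 2) * dx
                   for i in range(steps))
    return integral * (1 - math.exp(-n))

print(p_two_hop(lam=1e-4, R=100.0, n=10))  # probability of finding a two-hop path
```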
Equation (2) gives the probability of discovering the two-hop path among the multi-hop paths. The algorithm below summarises the functionality of the proposed system.
Algorithm 1: Proposed system
The algorithm is currently at the implementation stage, and its performance will be calculated after implementation. The proposed algorithm works in three phases: first find the destination, then choose the neighbor node, and finally encrypt and decrypt the data with the proper public and private keys.
(a) Simulation and result analysis
The experimental analysis is done with the NS-2.35 tool in order to evaluate the effectiveness of the suggested security protocol. The parameters used for the network simulation are listed in Table 1.
Packet Delivery Ratio (Node Basis) PDR is defined as the proportion of the total packets forwarded by the sender that are successfully received by the receiver. Here, the number of nodes is varied in steps of 10. Since neighbor nodes are authorized at the initial stage, hazardous nodes are eliminated; because of this, the PDR shows better results. Figure 3 shows the packet delivery ratio of the proposed system.
Packet Delivery Ratio (Attack Node Basis) Figure 4 shows the packet delivery ratio of the proposed system with respect to attack nodes. Initially, the PDR seems good; as the number of nodes increases, the attacks also increase, yet even when an attack happens, the PDR remains good.
Table 1 Network simulation parameter values

Parameter | Value
Network simulation | NS-2.35
MAC protocol | IEEE 802.11
Transmission range | 100 m
Simulation area | 1500 m × 1500 m
Transmission rate | 10 kbps
Channel type | Wireless
Mobility model | Random way point
Time for simulation | 1200 s
Size of packet | 512 byte
Traffic type | Constant bit rate
Number of nodes | 10, 20, 30, 40, 50, 60, 70, 80, 90, 100
Pause time | 10 s
Queue size | 50 packets
Routing protocol | LAR
Fig. 3 Impact of node density on average delivery ratio
Packet Loss Ratio (Node Basis) The results in Fig. 5 show the packet loss ratio of the proposed system on a node basis. At the initial stage the PLR is low; as the number of nodes increases, the PLR also increases, but between 20 and 50 nodes it remains nominal.
Fig. 4 Impact of attack node density on average delivery ratio
Fig. 5 Impact of node density on packet lost ratio
4 Conclusion
Security concerns in MANET systems are examined in this article, which highlights various common and harmful vulnerabilities in mobile ad hoc networks. The purpose of this work is therefore to prevent attacks and to distinguish hazardous nodes from trustworthy nodes. The proposed system is designed to analyze nodes, choose the neighbor, and perform proper encryption and decryption. The implementation is still in progress, and final results will be reported after assessing the performance of the secure routing protocol.
References 1. Khalfaoui H, Farchane A, Safi S (2022) Review in authentication for mobile ad hoc network. In: Advances on smart and soft computing. Springer, Singapore, pp 379–386 2. Gurumoorthi E, Ayyasamy A (2020) Cache agent based location aided routing using distance and direction for performance enhancement in VANET. Telecommun Syst 73(3):419–432 3. Gurumoorthi E, Ayyasamy A, Archana M, Barathy JV (2017) Performance enhancement for QoS in VoIP applications over MANET. J Adv Comput Electron Eng 2(5):47–54 4. Abdallah EE, Otoom AF (2022) Intrusion detection systems using supervised machine learning techniques: a survey. Procedia Comput Sci 201:205–212 5. Krishnan RS, Julie EG, Robinson YH, Kumar R, Son LH, Tuan TA, Long HV (2020) Modified zone based intrusion detection system for security enhancement in mobile ad hoc networks. Wirel Netw 26(2):1275–1289 6. Islabudeen M, Kavitha Devi MK (2020) A smart approach for intrusion detection and prevention system in mobile ad hoc networks against security attacks. Wirel Pers Commun 112(1):193–224 7. Abdel-Sattar AS, Azer MA (2022, May) Using blockchain technology in MANETs security. In: 2022 2nd International mobile, intelligent, and ubiquitous computing conference (MIUCC). IEEE, pp 489–494 8. Mouchfiq N, Habbani A, Benjbara C, Berradi H (2021, Dec) Blockchain-based model against selfish attacks in mobile ad hoc networks. In: 2021 4th International conference on advanced communication technologies and networking (CommNet). IEEE, pp 1–9 9. Lwin MT, Yim J, Ko YB (2020) Blockchain-based lightweight trust management in mobile ad-hoc networks. Sensors 20(3):698 10. Singh U, Sharma SK, Shukla M, Jha P (2021) Blockchain-based BATMAN protocol using mobile ad-hoc network (MANET) with an ensemble algorithm 11. Mahapatra SN, Singh BK, Kumar V (2022) A secure multi-hop relay node selection scheme based data transmission in wireless ad-hoc network via block chain. Multimedia Tools Appl 81(13):18343–18373 12. Nikhade JR, Thakare VM (2022) Block chain based security enhancement in MANET with the improvisation of QoS elicited from network integrity and reliance management. Ad Hoc Sens Wirel Netw 52 13. Chatzidakis M, Hadjiefthymiades S (2022) A trust change detection mechanism in mobile ad-hoc networks. Comput Commun 187:155–163 14. Mishra R, Kaur I, Sharma V, Bharti A (2022) Computational intelligence and blockchainbased security for wireless sensor networks. In: Handbook of research on technical, privacy, and security challenges in a modern world. IGI Global, pp 324–336 15. Baniata H, Anaqreh A, Kertesz A (2022) DONS: dynamic optimized neighbor selection for smart blockchain networks. Futur Gener Comput Syst 130:75–90 16. Janani VS, Manikandan MSK (2022, Mar) A secured key management scheme for mobile ad hoc networks with modified montgomery modular arithmetic. In: 2022 IEEE International conference on signal processing, informatics, communication and energy systems (SPICES), vol 1. IEEE, pp 1–4 17. Kalichurn S. The effects of black hole attacks on the performance of AODV, DSR, and OLSR in mobile ad-hoc networks. Doctoral dissertation 18. Reddy M, Srinivas PVS, Mohan MC (2022) Enhancing the routing security through node trustworthiness using secure trust based approach in mobile ad hoc networks. Int J Interact Mobile Technol 17(14) 19. Abass R, Habyarimana A, Tamine K (2022) Securing a mobile ad hoc NETwork against the man in the middle attack. Int J Artif Intell Inform 3:53–62 20. 
Khandelwal N, Gupta S (2022) Secure IoT architecture in mobile ad-hoc network against malicious attacks using blockchain-based BATMAN. Int Trans J Eng Manag Appl Sci Technol 13(6):1–15
21. Han W, Zhang R, Zhang L, Wang L (2022, Apr) A secure and receiver-unrestricted group key management scheme for mobile ad-hoc networks. In: 2022 IEEE wireless communications and networking conference (WCNC). IEEE, pp 986–991 22. Srinivas M, Patnaik MR (2022) Clustering with a high-performance secure routing protocol for mobile ad hoc networks. J Supercomput 78(6):8830–8851
Predicting Mortality in COVID-19 Patients Based on Symptom Data Using Hybrid Neural Networks Naveen Chandra Paladugu, Ancha Bhavana, M. V. P. Chandra Sekhara Rao, and Anudeep Peddi
Abstract Accurate prediction of COVID-19 cases is crucial for effective public health planning and resource allocation. In this study, we evaluated the performance of different neural network architectures, including LSTM, RNN, ANN, and hybrid models for COVID-19 case prediction. Our results demonstrate that the hybrid models, combining the strengths of LSTM, RNN, and ANN architectures, outperform the individual models in terms of accuracy, precision, recall, and F1-score. The best-performing model, LAR, achieves an accuracy of 95.25%, precision of 97%, recall of 96.68%, and F1-score of 96.84%. Our findings suggest that the hybrid approach can significantly improve the accuracy of COVID-19 case prediction, which could have important implications for public health policy and decision-making. Future work could focus on developing more sophisticated hybrid models to further improve prediction accuracy. Keywords Long short-term memory (LSTM) · Recurrent neural networks (RNNs) · Artificial neural networks (ANNs) · Rectified linear unit (ReLU)
N. C. Paladugu · A. Bhavana (B) · M. V. P. Chandra Sekhara Rao · A. Peddi
Department of CSBS, R.V.R. & J.C. College of Engineering, Guntur, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_31

1 Introduction
The global health landscape has been significantly affected by the COVID-19 pandemic, leading to a substantial number of deaths on a global scale, reaching millions. One of the key challenges in managing the pandemic has been identifying individuals at high risk of mortality and providing appropriate care and treatment. Machine learning algorithms have shown promise in predicting COVID-19 mortality risk based on symptoms, but more research is needed to identify the most effective models for this task [6–9]. In this study, we investigate the performance of hybrid deep learning models for predicting COVID-19 mortality risk based on symptoms. The study is motivated by the need to identify effective tools for risk stratification that can help clinicians
prioritize resources and interventions and ultimately improve patient outcomes. The models we evaluated include LAL (LSTM-ANN-LSTM), RAR (RNN-ANN-RNN), RAL (RNN-ANN-LSTM), and LAR (LSTM-ANN-RNN), as well as the individual algorithms LSTM, RNN, and ANN. We use the Patients dataset, which contains information on the symptoms and outcomes of COVID-19 patients, to train and test our models. We use a stratified random sampling technique to split the dataset into training and testing sets, with an 80/20 split. We evaluated the performance of our models using several metrics, including accuracy, precision, recall, and F1-score. Our results show that the hybrid models outperform the individual algorithms, with the RAL and LAR models achieving the highest F1-scores of 96.68 and 96.84, respectively. The ANN algorithm also performs well, with an F1-score of 95.72. These findings suggest that hybrid machine learning can be effective in predicting COVID-19 mortality risk and could have implications for clinical decision-making and public health policy. However, our study has several limitations that should be considered. First, our dataset is limited in size and may not be representative of all COVID-19 patients. Second, our models are based on symptom data only and do not consider other important factors such as comorbidities and laboratory results. Third, our models were evaluated on a single dataset and may not generalize to other populations or contexts. Despite these limitations, our study provides valuable insights into the performance of hybrid deep learning models for predicting COVID-19 mortality risk and could inform the development of more accurate and interpretable models for this important task.
2 Literature Review A hybrid model integrating LSTM and CNN was proposed [2] for COVID-19 outbreak prediction. By employing eight max-pooling and convolution layers along with four LSTM layers, the model achieved high accuracy. In a study [1] comparing ML algorithms, logistic regression (LR) outperformed SVM. Interestingly, resampling techniques led to a decrease in classifier performance. The performance of various ML models, including Naïve Bayes, RUS-Boosted Tree, Medium Tree, Boosted Tree, Coarse Tree, SVM, and Bagged Trees, was evaluated for COVID-19 prediction [3]. Naïve Bayes and Bagged Trees exhibited the best performance rates, highlighting the effectiveness of ML models in predicting COVID-19 severity. Researchers in another study [4] identified mortality factors by analyzing 850 health records. MLP, SVM, random forest, decision tree, and KNN models were employed, with the random forest model demonstrating superior performance. Dyspnea was identified as the most effective factor in predicting patient death.
Predicting Mortality in COVID-19 Patients Based on Symptom Data …
363
In a study [5], an ML model utilizing two distinct patient datasets accurately predicted the risk of mortality from severe COVID-19. Key factors in the model included elevated blood urea nitrogen (BUN), decreased albumin levels, increased creatinine levels, elevated international normalized ratio (INR), and high red cell distribution width (RDW). A hybrid ML model [6] combining CNN and XGBoost was proposed for COVID19 prediction using X-ray images.
3 Methodology Figure 1 shows the workflow of the system.
3.1 Dataset The dataset utilized in this study was the patients’ dataset, comprising various features related to symptoms. These features included patient type, presence of pneumonia, pregnancy status, age, diabetes, asthma, immunosuppression, hypertension, cardiovascular check, smoking habits, obesity, chronic kidney disorder, ICU admission, presence of other diseases, etc. To ensure balanced representation, a total of 8000 records were included in the dataset consisting of cases of death and cases of patients who survived.
Fig. 1 Workflow of the system
3.2 Preprocessing
Cleaning Data cleaning involves removing duplicate or irrelevant records and correcting erroneous or inconsistent data. In the dataset used here, we removed all the duplicate records found. By cleaning the data, we reduced the likelihood of errors and biases and improved the accuracy of the models.
Data Balancing The unbalanced data problem is one of the main obstacles for ML algorithms; it happens when the classes are not distributed evenly. The amount of data in the outcome classes of the chosen dataset is noticeably unbalanced, with more samples belonging to the death class and considerably fewer to the alive class. Trained models are therefore significantly more likely to assign fresh observations to the dominant class and frequently give findings biased in favor of the dominant class.
Feature Selection By eliminating pointless and redundant characteristics from the dataset, feature selection is frequently used in forecasting, pattern recognition and classification modeling to reduce the size and complexity of the dataset. This study used feature selection to build the model and to rank the input characteristics according to how important they were to the target classes. We identified the relevant features for predicting the outcome variable (i.e., whether or not a patient with COVID-19 symptoms dies) based on domain knowledge and feature importance analysis.
Data Splitting We used an 80/20 split to divide the data into training records and testing records. The model was then developed using the training records and its performance gauged using the testing records. This is an essential step that helps to ensure the accuracy and reliability of the model.
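A minimal sketch of the cleaning, feature selection and 80/20 split described above, assuming a pandas DataFrame with numeric symptom columns and a binary death column; the file name, column name and the chi-square selector are assumptions rather than the exact pipeline used in this study.

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split

df = pd.read_csv("patients.csv").drop_duplicates()   # data cleaning

X, y = df.drop(columns=["death"]), df["death"]

# Rank symptom features by relevance to the outcome and keep the top 10
# (chi2 requires non-negative values, true for binary symptom flags)
X_sel = SelectKBest(chi2, k=10).fit_transform(X, y)

# Stratified 80/20 split keeps the death/alive proportion in both sets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_sel, y, test_size=0.2, stratify=y, random_state=42)
```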
3.3 Model Development We developed three individual models—LSTM, RNN, and ANN—and four hybrid models. LAL (LSTM-ANN-LSTM), RAR (RNN-ANN-RNN), RAL (RNN-ANNLSTM), and LAR (LSTM-ANN-RNN). The models were trained on the training data using appropriate hyperparameters [10].
3.4 Model Evaluation
We evaluated the models' performance on the validation and testing sets using the evaluation metrics accuracy, precision, recall and F1-score.
Accuracy—Accuracy is a performance metric for classification models. It is calculated by dividing the sum of true positives and true negatives by the total number of positive and negative cases in the dataset. Here, true positives are patients who died with COVID symptoms and true negatives are patients who are alive.

Accuracy Score = (TP + TN)/(TP + FN + TN + FP).  (1)
Recall—Recall measures the model's ability to correctly identify positive instances among the total number of actual positive cases, reflecting how well it distinguishes positive outcomes. In this case, it captures how many of the individuals who actually died from COVID-19 were also predicted to succumb to the virus; a high recall implies that the model excels at recognizing positive instances.

Recall Score = TP/(FN + TP).  (2)
Precision—The model precision score is a metric that measures the proportion of positively predicted labels that are correct. This states like: “Of all the patients predicted to be dead, how many actually are died.” Precision Score = TP/(FP + TP).
(3)
F1-score—The F1-score combines the precision and recall scores to provide an overall measure of model accuracy. It serves as an alternative to accuracy metrics, as it does not require knowledge of the total number of observations. F1 Score = 2 ∗ Precision Score ∗ Recall Score/(Precision Score + Recall Score). (4)
3.5 Algorithm
1. Input: patients .csv file.
2. Output: predicting whether a person may die or not.
3. Initialization: no. of classes = 2.
4. Add 45 units of either (RNN, ANN, LSTM) as the 1st hidden layer.
5. Add 45 units of either (RNN, ANN, LSTM) as the 2nd hidden layer.
6. Add 45 units of either (RNN, ANN, LSTM) as the 3rd hidden layer.
7. Add an output layer of 1 unit (Dense).
8. Compile the model using the Adam optimizer and mean squared error loss function.
9. Fit the model and evaluate it with performance metrics.
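The steps above can be expressed as a Keras model. The sketch below builds the LAR variant (LSTM–ANN–RNN) with 45 units per hidden layer; treating each symptom vector as a single-timestep sequence, the dropout placement and the plain linear output unit are assumptions rather than the authors' exact configuration.

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM, SimpleRNN

n_features = 20  # number of symptom columns (illustrative)

# LAR variant: LSTM -> Dense (ANN) -> SimpleRNN, 45 units each (steps 4-6)
model = Sequential([
    LSTM(45, return_sequences=True, input_shape=(1, n_features)),
    Dropout(0.2),
    Dense(45, activation="relu"),   # ANN block, applied per time step
    Dropout(0.2),
    SimpleRNN(45),
    Dropout(0.2),
    Dense(1),                       # output layer of 1 unit (step 7)
])
model.compile(optimizer="adam", loss="mse")   # step 8

# Symptom vectors treated as single-timestep sequences (an assumption)
X = np.random.rand(8, 1, n_features)
y = np.random.randint(0, 2, size=(8, 1)).astype("float32")
model.fit(X, y, epochs=1, verbose=0)          # step 9 (toy data)
```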
4 Methodologies
4.1 Long Short-Term Memory (LSTM)
The LSTM, shown in Fig. 2, can effectively model sequential data, particularly when long-term dependencies are present, and can be employed to forecast the severity of COVID-19 cases, including the likelihood of hospitalization or death, using clinical data. In the context of our research on predicting mortality based on symptoms, we employed the LSTM algorithm to learn patterns and relationships between symptoms over time. In each of the three hidden layers of the model, an LSTM layer with 45 units and a dropout layer with a rate of 0.2 were incorporated; lastly, a dense layer with 1 unit was added. The model uses the forget gate, input gate, and candidate value equations to determine the relevant information for updating the current cell state from the previous cell state and the current input, while the output gate equation controls how the updated cell state is used to compute the hidden state.
Fig. 2 LSTM model
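For reference, the gates mentioned above follow the standard (textbook) LSTM formulation; these equations are generic and not specific to this paper:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```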
Fig. 3 RNN model
4.2 Recurrent Neural Networks (RNNs) Recurrent neural networks are a specialized type of neural network architecture that excel at modeling sequential data. They can be effectively employed to predict COVID-19 death rate based on patient symptoms. At the core of an RNN is its hidden state, serving as memory that updates with each input time step (Fig. 3). The hidden state of the network at any given time captures valuable information about the previous inputs it has processed. This mechanism enables the network to retain knowledge about the sequence and make informed predictions regarding COVID-19 cases. In all the three layers of the model, RNN of 45 units along with a dropout layer was added. Finally, the dense layer with 1 unit was added and compiled using Adam optimizer.
4.3 Artificial Neural Networks (ANNs)
The ANN takes symptom data as input and processes it through the network as shown in Fig. 4. Each input symptom is represented as a feature or node in the input layer. The network then propagates this information forward through the hidden layers, where computations are performed on the input data using weighted connections between neurons. The output layer of the ANN predicts the death rate based on the learned patterns and relationships in the input symptom data. In each of the three hidden layers of the model, a dense (ANN) layer of 45 units and a dropout layer with a dropout rate of 0.2 were added. The activation function used was the Rectified Linear Unit (ReLU), a function that introduces nonlinearity into a neural network, enabling it to learn intricate patterns and improve its prediction capabilities. ReLU returns the input value if it is greater than or equal to zero; otherwise, it returns
zero. Finally, the dense layer with one unit was added and compiled using Adam optimizer.

Fig. 4 ANN model
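In symbols, the ReLU activation described above is simply:

```latex
\mathrm{ReLU}(x) = \max(0, x)
```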
4.4 RNN–ANN–RNN (RAR) This approach combines RNN, ANN, and RNN models. In the first layer, a RNN with 45 units is employed, enabling the model to effectively capture and model temporal dependencies within the data as shown in Fig. 5. Additionally, a rate of 0.2 for the dropout layer is incorporated to enhance generalization by randomly deactivating 20% of the units during training, reducing overfitting. Moving to the second layer, an ANN is introduced, leveraging its ability to extract high-level features and nonlinear relationships from the input data. The ReLU activation function is chosen for its effectiveness in capturing nonlinearity, enhancing
the model's ability to learn complex patterns. The third layer mirrors the first layer, utilizing an RNN architecture to further capture long-term dependencies and refine predictions. This additional RNN layer allows for a more comprehensive understanding of the data's underlying structure, enabling improved modeling of sequential information.

Fig. 5 RAR model

Fig. 6 LAL model
4.5 LSTM–ANN–LSTM (LAL)
This approach combines LSTM, ANN, and LSTM layers. The first LSTM layer is employed to effectively capture dependencies in the data. In the second layer, an ANN is utilized to capture high-level features from the input. To further capture long-term dependencies and refine the predictions, a second LSTM layer is introduced. This layer operates similarly to the first layer, allowing for a more comprehensive understanding of the underlying data structure as shown in Fig. 6.
4.6 RNN–ANN–LSTM (RAL)
RAL is another approach, shown in Fig. 7, that combines the RNN, ANN, and LSTM models to improve the accuracy of predictions. In this approach, the RNN is used in the first layer, the ANN in the second layer, and the LSTM in the third layer. By integrating these models, the RAL approach can leverage their complementary capabilities to improve prediction accuracy: the RNN effectively captures dependencies in the COVID-19 data, the ANN extracts high-level features, and the LSTM effectively models long-term dependencies.
Fig. 7 RAL model
4.7 LSTM–ANN–RNN (LAR)
This approach combines the LSTM, ANN, and RNN models. In the first layer, an LSTM model is employed; LSTMs excel at capturing and modeling long-term dependencies in sequential data, and by utilizing memory cells and gates they effectively retain important information over extended time steps, making them ideal for comprehending complex patterns. The second layer incorporates an ANN, which specializes in extracting high-level features and capturing intricate nonlinear relationships within the input data. The third layer utilizes an RNN model, which captures dependencies in the COVID-19 symptom data by maintaining an internal memory that recurrently processes inputs. This synergistic integration provides a comprehensive and robust framework for capturing complex patterns and making accurate predictions, as shown in Fig. 8.
5 Results Table 1 shows the results obtained for ANN, RNN, LSTM, LAL, RAR, RAL, and LAR models in terms of accuracy, precision, recall, and F1-score. Figure 9 is the graph showing the accuracy of each algorithm. The graph is drawn based on the result obtained from Table 1. The highest accuracy (95.25%) was achieved by the LAR model.
Fig. 8 LAR model
Table 1 Comparison of performance of different models

Algorithms   Accuracy   Precision   Recall   F1-score
ANN          93.56      95.92       95.52    95.72
RNN          94.75      96.82       96.18    96.5
LSTM         94.93      97.46       95.77    96.61
LAL          94.5       97.21       95.43    96.31
RAR          94.56      96.35       96.43    96.39
RAL          95.06      97.95       95.43    96.68
LAR          95.25      97          96.68    96.84
Fig. 9 Graph showing accuracy of each algorithm
6 Conclusion The study evaluated the performance of different neural network models, including individual models and hybrid approaches, in predicting COVID-19 cases. The results indicate that the hybrid approaches outperformed the individual models in terms of accuracy, precision, recall, and F1-score, suggesting that combining multiple neural network architectures can improve the prediction performance and robustness of COVID-19 prediction models. The specific hybrid algorithm that achieved the best performance may depend on the available data, research question, and other factors. However, the study demonstrates the potential of using hybrid approaches to improve COVID-19 prediction models and support public health decision-making.
7 Future Work Furthermore, incorporating additional data sources, such as social media data, could provide additional insights into COVID-19 transmission dynamics and improve the accuracy of prediction models. Social media data can provide real-time information on public sentiment and behavior, which may impact COVID-19 transmission. Future research can evaluate the effectiveness of different public health interventions, such as mask mandates or lockdowns, using machine learning techniques. These models could help policymakers evaluate the potential impact of different interventions and make data-driven decisions about when and how to implement them. Finally, it will be important to evaluate the generalizability of the models to different geographic locations and populations. This can help ensure that the models are useful in a variety of contexts and can be used to support public health decision-making on a global scale.
8 Novelty of Work Our study contributes to the ongoing efforts to develop effective predictive models for COVID-19 mortality rates. Early identification of patients at high risk of mortality is crucial for providing timely and appropriate care and for allocating resources effectively. By utilizing clinical data, our models provide a non-invasive and accessible method for identifying patients at high risk of mortality, which could help guide clinical decision-making and improve patient outcomes. Overall, the study's use of hybrid approaches for COVID-19 prediction modeling represents a novel and promising avenue for future research in this field.
Underwater Image Quality Assessment and Enhancement Using Active Inference Radha SenthilKumar , M. N. Abinaya, Divya Darshini Kannan, K. N. Kamalnath, and P. Jayanthi
Abstract The global attention toward the marine environment has been increasing due to the abundance of debris found in shallow and open seas, coasts, and even the seabed. The use of submersibles with debris detection systems can help in surveying and collecting the same. However, images obtained for this purpose are plagued with numerous issues in the field of image processing, such as heavy light distortion and scattering, and the disappearance of red hue as we descend into deeper waters. To address this, the existing NRIQA-GAN model with Active Inference constraints is used to evaluate the value of underwater pictures, and UGAN-P model is employed to enhance the images. Image segmentation has also been performed as a preliminary step toward future work on deep-sea debris detection using object detection methods. Keywords UGAN-P model · Quality enhancement · Sea surface and beach · Deep-sea debris
1 Introduction 1.1 Overview The quality of images is evaluated through a distinct approach involving the generation of the primary content of the image and two Active Inference constraints with a blend of Generative Adversarial Network (GAN) [1] and Convolutional Neural Network (CNN). Traditional methods for assessing image quality [2] are not accurate in measuring quality in underwater images with light and color distortion. A Generative Network is employed to enhance images and reduce the noise parameters specific to underwater images, and the quality is reevaluated and compared after the enhancement process. The objective is to provide enhanced images with higher quality scores, determined by a more precise method than standard quality R. SenthilKumar · M. N. Abinaya · D. D. Kannan · K. N. Kamalnath · P. Jayanthi (B) Anna University, Madras Institute of Technology, Chennai, Tamil Nadu, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_32
predictors, which can be utilized by a model to recognize and categorize deep-sea debris [3]. The primary challenge faced in processing underwater images is due to scattering and attenuation. This is expounded in underwater media, as there is heavy scattering caused by sunlight in these regions. Moreover, the colors with the longest wavelengths and lowest energies are absorbed first in deep-sea conditions, even at 5ft depth. Red is absorbed first, followed by orange and yellow, disappearing in the same order as on the color spectrum, causing color distortion in underwater images. Existing systems find it challenging to precisely and accurately capture detail and assess quality of underwater images due to challenges posed by above conditions. In the domain of debris collection from deep-sea environments, unmanned autonomous underwater vehicles (AUV) and remotely operated vehicles (ROV) generate continuous snapshots/frames of real time under water video, in order to detect debris/trash and classify them for the purpose of collection and segregation. However, images obtained in underwater settings prove to be poor fodder for detection and classification due to heavy light scattering and red hue distortion. Hence, performing image processing on these images to evaluate or improve their quality proves to be a tedious and imprecise task (due to the above conditions). With this in mind, this paper attempts to put forth a consolidated image processing model for better Image Quality Assessment and Enhancement in the underwater domain. Image segmentation has also been performed on the images to obtain comparative results in detecting/segmenting trash in the original versus (accurately) enhanced images [6–10]. To enhance underwater images and generate more effective images than those in the existing dataset, the author utilized both Generative Adversarial Network (GAN) and Natural Image Quality Evaluation Index (NIQE) [11]. The goal was to increase contrast and align the images with human perception, making them more visually appealing and useful. The algorithm’s real-time performance was evaluated to demonstrate its viability for engineering applications, but it was limited to standard GAN usage. Enhancing Underwater Imagery [12] using Generative Adversarial Networks [13, 14] suggests a technique that utilizes Generative Adversarial Networks (GANs) to enhance the quality of underwater scenes visually. The purpose is to enhance the input for vision-driven behaviors down the autonomy pipeline. Additionally, the article demonstrates how newly proposed methods can generate datasets that allow for underwater image restoration. For underwater robots that rely on visual guidance, this enhancement can result in improved safety and reliability due to robust visual perception. Underwater Image Descattering and Quality Assessment [15] suggests a method that employs color correction to reduce the negative effects of strong light scattering and image distortion in underwater settings. The team has also performed a thorough analysis of how light influences the way we perceive images of underwater scenes, especially in deep-sea conditions. Recent Progress in Semantic Image Segmentation [16] reviews the advancements made in semantic image segmentation, starting with an analysis of traditional methods, and then moving on to a discussion of the latest developments in Deep Neural Networks (DNNs). 
The article offers a detailed examination of various techniques used for semantic segmentation, including convolutional networks, upsampling methods, fully
convolutional network joint with conditional random field (FCN joint with CRF), dilated convolution approaches, backbone network progresses, pyramid methods, multiple feature methods, and multiple stage methods. The survey also covers several types of methods, such as supervised, unsupervised, and weakly supervised methods, and draws a conclusion about the effectiveness of these techniques based on extensive research.
1.2 Objectives The main aim is to create a model that can efficiently handle underwater images for the purpose of detection. This involves precise evaluation and enhancement of image quality using a new Active Inference technique implemented with Generative Adversarial and convolutional networks. To accomplish this, a sandwich model comprising of a GAN–CNN–GAN is utilized for image assessment and improvement.
1.3 Contributions The proposed system aims to improve the detection of deep-sea debris by utilizing high-quality images instead of the low-quality images commonly used in existing detection models. The system employs generative models for both quality evaluation and No Reference Image Quality Assessment (NRIQA) techniques [4]. This approach enhances the efficiency of the system and improves the accuracy of deepsea debris detection. The Trash Can Dataset was used as the dataset in this study [5]. Presently, existing image processing techniques such as quality assessment and enhancement face a host of accuracy and precision issues when applied to the underwater domain, due to heavy light distortion and scattering, as well as the disappearance of red hue as we descend into its depths. Comprehensive techniques are required to be implemented in the field of deep-sea trash detection and collection by Rover systems. Here is an attempt to put forth such a model. Using the proposed NRIQA-GAN model coupled with Active Inference constraints, the quality of the underwater images is assessed. Upon these images, a UGAN-P model is applied to perform image enhancement. Image segmentation is performed to understand the difference in detecting/segmenting trash in original and accurately enhanced images.
Fig. 1 Debris detection module
2 System Architecture and Design 2.1 System Design The proposed work consists of three modules in its system design: the Quality Evaluation module, the Enhancement module, and the Debris Detection module. The main emphasis of the proposed work is on enhancing the quality of underwater images that are used in deep-sea debris detection. The system design, for debris detection modules is illustrated in Figs. 1 and 2 shows how these modules are interconnected.
2.2 System Architecture The WGAN_GP generative model is fed with the Underwater Trash Can dataset [5] and produces primary content based on Active Inference using semantic similarity and structure dissimilarity constraints. The remaining content is obtained by removing the primary content from the original image, resulting in an image with disordered information. The original image, primary content image, remaining content image, and semantic similarity map are evaluated for quality using a Multistream convolutional neural network based on the VGG-19 model. The UGAN-P
Fig. 2 Deep-sea debris detection model
generative model is then used to enhance the image quality by adding a new loss term to the WGAN_GP model to capture low-frequency details. To address the issue of blurry images often produced by generative models, the model penalizes the differences of image gradient predictions to sharpen the images. By incorporating these two terms into the WGAN_GP model, the UGAN_P model is created to enhance the image quality, which is then fed into a deep-sea debris detection model for efficient detection. The detailed architecture of this system is illustrated in Fig. 3.
3 Algorithm Design and Implementation 3.1 Primary Content Generation The algorithm outlined in Algorithm 1 uses a Wasserstein-Generative Adversarial Network [17, 18] with Penalty Gradient and two Active Inference constraints based on Internal Generative Mechanism to create the Primary Content of the input original (distorted) images. Along with generating similarity maps and remaining content between the original and primary content images, the GAN framework constructs an Active Inference model with a generator G and a discriminator D.
Fig. 3 SSNR and PSNR values of enhanced images
Algorithm 1 Primary Content Generation
Input: distorted underwater images I_d
Compute the Gaussian noise of the images
do
  // The generator produces the primary content I_g from the original distorted image I_d
  G* = argmin(μ1 L_adv + μ2 L_pix + μ3 L_contentloss + μ4 L_ss + μ5 L_sd)
  // where μ1 = μ2 = 1.0, μ3 = 0.01, μ4 = 0.01, μ5 = 1.0,
  // L_ss = -R_semantics(I_g, I_d), L_sd = R_structure(I_g, I_u),
  // L_pix = MSE(I_g, I_r), L_contentloss = MSE(ϕ_k(I_g) - ϕ_k(I_r)),
  // L_adv = E_{I_r ~ P_r}[D(I_r)] - E_{I_g ~ P_g}[D(I_g)]
  // The generator G uses the content, pixel, and adversarial losses to generate the primary
  // content; additionally, the two IGM constraints, the semantic similarity loss L_ss and the
  // structural dissimilarity loss L_sd, are used for optimization.
  R_structure(I_g, I_u) = (1 / WH) ||SSIM(I_g, I_u)||_2^2
  // where the minimum value of the structural dissimilarity is chosen as the final
  // constraint, and W, H are the width and height of I_g, respectively
  R_semantics(I_g, I_d) = -MSE(ϕ_k(I_g) - ϕ_k(I_d))
  // From this, the maximum value is chosen as the final constraint
  I_u = |I_d - I_g|, where I_u is the prediction error
  SSIM(x, y) = (2 μ_x μ_y + C_1)(2 σ_xy + C_2) / ((μ_x^2 + μ_y^2 + C_1)(σ_x^2 + σ_y^2 + C_2))
  // where SSIM(x, y) is the similarity between x and y; I_g and I_u are both in grayscale form
  // The discriminator computes the loss between the original and primary content images each
  // time the primary content is generated.
  D* = argmin(-L_adv + L_GP)
  // where L_adv is the adversarial loss and L_GP is the gradient penalty; P̂_x is the sampling
  // distribution that samples uniformly along straight lines between P_r and P_g, and λ is the
  // penalty coefficient
until the difference between I_g and I_d is negligible [19]
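For concreteness, the gradient penalty term L_GP referenced in Algorithm 1 is the standard WGAN-GP penalty; the PyTorch sketch below illustrates how such a term is typically computed (the function and variable names are ours, not taken from the paper).

```python
# Generic WGAN-GP gradient penalty (illustrative; not the authors' code).
import torch

def gradient_penalty(discriminator, real, fake, lam=10.0):
    # Sample points uniformly along straight lines between real and generated images.
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    scores = discriminator(interp)
    grads = torch.autograd.grad(outputs=scores, inputs=interp,
                                grad_outputs=torch.ones_like(scores),
                                create_graph=True)[0]
    grads = grads.view(grads.size(0), -1)
    # Penalize deviation of the gradient norm from 1, scaled by the penalty coefficient λ.
    return lam * ((grads.norm(2, dim=1) - 1.0) ** 2).mean()
```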
3.2 Underwater Image Enhancement The system uses the Unsupervised Image Enhancement Generative Adversarial Network (UEGAN) [20] for image enhancement. Unlike traditional methods that rely on a large set of paired images to learn, UEGAN learns the corresponding image-to-image mapping in an unsupervised way from a set of images with desired characteristics. It uses a single deep GAN that incorporates modulation and attention mechanisms to capture richer global and local features. To apply the gradient differential loss, the system employs UGAN-P and adds a penalty term denoted by ‘P’. In order to capture low-level frequencies in the image and provide the GAN with a sense of ground truth, the L1 loss and gradient loss are considered.
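As a rough illustration of the two additional terms described above (the L1 ground-truth term and the image-gradient penalty used to sharpen predictions), the following PyTorch snippet combines them; the weighting factors are placeholder assumptions, not values taken from the UGAN-P formulation.

```python
# Illustrative L1 + gradient-difference loss for the enhancement generator
# (the weights lambda_l1 and lambda_gdl are placeholder assumptions).
import torch
import torch.nn.functional as F

def enhancement_loss(generated, target, lambda_l1=1.0, lambda_gdl=1.0):
    l1 = F.l1_loss(generated, target)            # low-frequency "ground truth" term

    def grads(img):
        # Horizontal and vertical image gradients via finite differences.
        dx = img[:, :, :, 1:] - img[:, :, :, :-1]
        dy = img[:, :, 1:, :] - img[:, :, :-1, :]
        return dx, dy

    gdx, gdy = grads(generated)
    tdx, tdy = grads(target)
    gdl = F.l1_loss(gdx, tdx) + F.l1_loss(gdy, tdy)   # penalize blurred edges
    return lambda_l1 * l1 + lambda_gdl * gdl
```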
3.3 Image Segmentation The aim of performing image segmentation was to assess whether the enhanced dataset could serve as a better foundation for detection tasks. The garbage in the improved underwater images was distinguished and emphasized by utilizing RCNN instance segmentation techniques. To explain segmentation in simple terms, it involves assigning labels to pixels, where all pixels of the same category are given a common label. Instance segmentation isolates the objects or regions of interest and marks them, clearly separating them from the surrounding areas to highlight
each individual object in the image. It identifies the boundaries of objects with high precision at the pixel level, providing preliminary outcomes. To achieve productive results, instance segmentation was performed on the enhanced dataset for only 50 epochs, while the original dataset required 85–100 epochs.
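A minimal example of this kind of instance-segmentation inference, using the off-the-shelf Mask R-CNN from torchvision as a stand-in (the paper does not specify its exact implementation), might look like this:

```python
# Illustrative Mask R-CNN inference on an enhanced underwater image
# (the pretrained torchvision model and the file name are stand-ins).
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = to_tensor(Image.open("enhanced_underwater.jpg").convert("RGB"))
with torch.no_grad():
    output = model([image])[0]          # dict with boxes, labels, scores, masks

# Keep confident detections and binarize their masks for overlaying on the image.
keep = output["scores"] > 0.5
masks = output["masks"][keep, 0] > 0.5
print(f"{int(keep.sum())} objects segmented")
```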
4 Implementation and Results 4.1 Implementation Environment The study was conducted using a Windows 10 operating system and 8 GB of RAM. The training and testing of the primary content generation approach, quality assessment of images using multi-stream CNN, and enhancement using the UGAN-P model were carried out on Kaggle kernel with NVIDIA Tesla P100 PCI-based 16 GB GPUs. Python programming language was utilized to deploy the model, with libraries such as Slim, Keras, and Pytorch, and TensorFlow. The TensorFlow-slim library was utilized for generating primary content and evaluating the quality of underwater images, while the Pytorch enhancement module was utilized for implementing the enhancement.
4.2 Performance Metrics for Quality Assessment The Spearman Rank-Order Correlation Coefficient (SROCC) is used to evaluate the degree of monotonicity between the predicted score and the ground truth, while the Pearson Linear Correlation Coefficient (PLCC) is used to assess the linear correlation between the predicted score and the ground truth. A higher value for both measures indicates better algorithm performance. The Kendall Rank-Order Correlation Coefficient (KROCC) and PLCC are used to calculate the difference between the predicted image quality scores and the actual quality scores. A higher value for any of these three criteria indicates better algorithm performance, and Table 1 shows the results of the evaluations. Table 1 Evaluation measures to assess performance
Method   Values
SROCC    0.75345
KROCC    0.64653
PLCC     0.70545
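These three coefficients can be computed directly with SciPy; the snippet below uses made-up score arrays purely for illustration, not the study's data.

```python
# Generic computation of SROCC, KROCC, and PLCC between predicted and
# ground-truth quality scores (the arrays here are made-up examples).
from scipy import stats

predicted = [0.61, 0.72, 0.55, 0.90, 0.48]
ground_truth = [0.58, 0.70, 0.60, 0.88, 0.45]

srocc, _ = stats.spearmanr(predicted, ground_truth)
krocc, _ = stats.kendalltau(predicted, ground_truth)
plcc, _ = stats.pearsonr(predicted, ground_truth)
print(f"SROCC={srocc:.5f}  KROCC={krocc:.5f}  PLCC={plcc:.5f}")
```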
4.3 Performance Metric for Quality Enhancement The PSNR metric calculates the difference in signal-to-noise ratio, expressed in decibels, between two images. This ratio is used to evaluate the quality of an enhanced image in comparison to the original image. A higher PSNR value indicates better quality of the enhanced image. The Structural Similarity Index (SSIM) is utilized as a criterion to determine the resemblance between two images. SSIM score ranges from 0 to 1, with 1 indicating a perfect match between the reconstructed and original images. The SSIM computation includes x and y as the signals to compare, while μx and μy are the mean intensity, σ x and σ y are the standard deviation, and the constants C 1 and C 2 are added to prevent instability when the denominator is too small.
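Both metrics are available in scikit-image; the sketch below shows how they would be evaluated for an original/enhanced image pair (the file names are placeholders).

```python
# Evaluating PSNR and SSIM for an enhanced image against its reference
# (the file names are placeholders).
import numpy as np
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

original = np.asarray(Image.open("original.png").convert("RGB"))
enhanced = np.asarray(Image.open("enhanced.png").convert("RGB"))

psnr = peak_signal_noise_ratio(original, enhanced, data_range=255)
ssim = structural_similarity(original, enhanced, channel_axis=-1, data_range=255)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```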
4.4 Outcomes The WGAN_GP model produces the primary content of the underwater images; Fig. 4 shows sample original images, and Fig. 5 displays samples of the generated primary content. The residual content and the semantic similarity between the primary content image and the original image are then calculated. The semantic similarity images, which are grayscale images, are presented in Fig. 6. The residual content, i.e., the image that remains after removing the primary content from the original image, is shown in Fig. 7. As discussed earlier, the UGAN_P model was used to improve the quality of the underwater images. The outcomes of the enhancement module are presented in Fig. 8,
Fig. 4 Original images
Fig. 5 Primary content images
Fig. 6 Semantic similarity images
where the upper-half shows the original or distorted images, and the lower-half shows the enhanced images.
Fig. 7 Remaining content
Fig. 8 Results of enhancement module
5 Conclusion and Future Work 5.1 Conclusion The system introduces a GAN-CNN-GAN sandwich architecture with Active Inference constraints that can generate primary content of underwater images for quality assessment and enhancement of images obtained from the TrashCan 1.0 dataset. It also proposes an IGM-inspired NR IQA model for predicting image quality, which benefits from the two proposed IGM-inspired constraints. The GAN-based Active
Inference module effectively applies Active Inference theory to predict primary content of distorted images. The multi-stream quality evaluator integrates primary content information and leverages properties of IGM for NRIQA, resulting in a more efficient quality assessment. The use of primary content and multi-stream CNN for image quality assessment enables more effective evaluation. A generative model is employed once again for image enhancement, and image segmentation is performed to compare trash detection in the original and enhanced image sets.
5.2 Future Work In the future, the focus of the research will be on using quality-enhanced images to improve the accuracy and efficiency of deep-sea debris detection algorithms. Additionally, to enhance the efficiency of primary content generation for underwater images, other constraints could be applied, as the current work uses only two constraints. Furthermore, it may be possible to expand the research to real-time video-based quality enhancement, which could facilitate various practical applications, such as automated vehicles.
References 1. Ni Z, Yang W, Wang S, Ma L, Kwong S Towards unsupervised deep image enhancement with generative adversarial network. [Online]. Available: https://github.com/eezkni/UEGAN 2. Xiang T, Yang Y, Guo S (2020) Blind night-time image quality assessment: subjective and objective approaches. IEEE Trans Multimedia 22(5):1259–1272. https://doi.org/10.1109/ TMM.2019.2938612 3. Xue B, Huang B, Chen G, Li H, Wei W (2021) Deep-sea debris identification using deep convolutional neural networks. IEEE J Sel Top Appl Earth Obs Rem Sens 14:8909–8921. https://doi.org/10.1109/JSTARS.2021.3107853 4. Gu K, Zhai G, Lin W, Yang X, Zhang W (2015) No-reference image sharpness assessment in autoregressive parameter space. IEEE Trans Image Process 24(10):3218–3231. https://doi.org/ 10.1109/TIP.2015.2439035 5. Hong J, Fulton M, Sattar J (2020) TrashCan: a semantically-segmented dataset towards visual detection of marine debris, July 2020. [Online]. Available: http://arxiv.org/abs/2007.08097 6. Xue W, Mou X, Zhang L, Bovik AC, Feng X (2014) Blind image quality assessment using joint statistics of gradient magnitude and Laplacian features. IEEE Trans Image Process 23(11):4850–4862. https://doi.org/10.1109/TIP.2014.2355716 7. Ma J et al (2021) Blind image quality assessment with active inference. IEEE Trans Image Process 30:3650–3663. https://doi.org/10.1109/TIP.2021.3064195 8. Wu Q, Wang Z, Li H (2015) A highly efficient method for blind image quality assessment. In: Proceedings of international conference on image processing, ICIP, Dec 2015, pp 339–343. https://doi.org/10.1109/ICIP.2015.7350816 9. Moorthy AK, Bovik AC (2010) Blind image quality assessment: from natural scene statistics to perceptual quality. [Online]. Available: http://live.ece.utexas.edu/research/quality/DIIVINE 10. Yang H, Shi P, Zhong D, Pan D, Ying Z (2019) Blind image quality assessment of natural distorted image based on generative adversarial networks. IEEE Access 7:179290–179303. https://doi.org/10.1109/ACCESS.2019.2957235
11. Hu K, Zhang Y, Weng C, Wang P, Deng Z, Liu Y (2021) An underwater image enhancement algorithm based on generative adversarial network and natural image quality evaluation index. J Mar Sci Eng 9(7). https://doi.org/10.3390/jmse9070691 12. Fabbri C, Islam MJ, Sattar J (2018) Enhancing underwater imagery using generative adversarial networks. In: Proceedings of IEEE international conference on robotics and automation, Sep 2018, pp 7159–7165. https://doi.org/10.1109/ICRA.2018.8460552 13. Wu J, Ma J, Liang F, Dong W, Shi G, Lin W (2020) End-to-end blind image quality prediction with cascaded deep neural network. IEEE Trans Image Process 29:7414–7426. https://doi.org/ 10.1109/TIP.2020.3002478 14. Han R, Guan Y, Yu Z, Liu P, Zheng H (2020) Underwater image enhancement based on a spiral generative adversarial framework. IEEE Access 8:218838–218852. https://doi.org/10. 1109/ACCESS.2020.3041280 15. Lu H et al (2016) Underwater image descattering and quality assessment. In: Proceedings of international conference on image processing, ICIP, Aug 2016, pp 1998–2002. https://doi.org/ 10.1109/ICIP.2016.7532708 16. Liu X, Deng Z, Yang Y (2019) Recent progress in semantic image segmentation. Artif Intell Rev 52(2):1089–1106. https://doi.org/10.1007/s10462-018-9641-3 17. Arjovsky M, Chintala S, Bottou L (2017) Wasserstein GAN, Jan 2017. [Online]. Available: http://arxiv.org/abs/1701.07875 18. Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville A (2017) Improved training of wasserstein GANs, Mar 2017. [Online]. Available: http://arxiv.org/abs/1704.00028 19. Li Q, Lin W, Xu J, Fang Y (2016) Blind image quality assessment using statistical structural and luminance features. IEEE Trans Multimedia 18(12):2457–2469. https://doi.org/10.1109/ TMM.2016.2601028 20. Ni Z, Yang W, Wang S, Ma L, Kwong S (2020) Towards unsupervised deep image enhancement with generative adversarial network. IEEE Trans Image Process 29:9140–9151. https://doi.org/ 10.1109/TIP.2020.3023615
A Machine Learning and Deep Learning-Based Web Application for Crop and Fertilizer Recommendation and Crop Disease Prediction Amuri Srinidhi, Veeramachinani Jahnavi, and Mohan Dholvan
Abstract India is a country that is majorly dependent on agriculture and farming for livelihood. The country’s diverse soil lets farmers produce various crops throughout the year. Depending on the characteristics of soil such as the composition of nitrogen, phosphorus, potassium and the pH of the soil and environmental factors such as rain, humidity, and temperature, it is important to know which crops should be grown and what fertilizers to use, to maximize the yield. It is also important to detect any crop diseases at an early stage to prevent major loss. This paper presents an all-in-one platform web application that performs all three tasks, and it uses ML algorithms such as KNN, Decision Tree, Gaussian Naive Bayes, SVM, Logistic Regression, and Random Forest to recommend crops and fertilizers by analyzing soil and environmental factors. For crop recommendation, Random Forest is chosen as it provided the highest accuracy of 99.09%. Random Forest is deployed for fertilizer recommendation as well, as it provided the highest accuracy of 100%. The Deep Learning model ResNet50 is used to detect crop diseases, and the accuracy for training and validation was 95.52% and 87.36%, respectively. Flask framework is used to build the backend of the web application. The web application is enabled to send an SMS of the results to user’s mobile phone using the SMS API, Twilio. The goal of the project is to provide an ideal solution to farmers or the users of the web application, via a text message, which can be accessed even in remote areas and can be understood by the majority of people. Keywords Web application · Crop recommendation · Fertilizer recommendation · Crop disease prediction · Random Forest · ResNet50
A. Srinidhi (B) · V. Jahnavi · M. Dholvan Department of Electronics and Computer Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, India e-mail: [email protected] M. Dholvan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_33
1 Introduction For generations, agriculture has served as the foundation of India's economy, contributing a sizable portion of the country's GDP and supporting the livelihoods of millions of people. India's agricultural sector is extremely diversified and varies widely by area and climate. For the fiscal year 2020–2021, agriculture and allied sectors contributed a major share of 20.2% toward the GDP [1]. Apart from being a major contributor to the Indian economy, agriculture is the source of food for every Indian household. India is also the world's largest exporter of rice; hence, it is crucial for Indian farmers to continue to provide a constant supply of rice and other grains to ensure global food security. To meet the increasing demand for food, practices such as deforestation and the heavy use of pesticides, synthetic manure, and fertilizers are often adopted. These farming practices have caused more damage to the soil, water, and air than ever before; it is found that they degrade soil 100 times faster than new soil is formed [2]. Global warming and climate change also affect agriculture in numerous ways. Rising temperatures will restrict certain crops from growing, and some crops are washed away by untimely rains. This raises serious concern, because if current farming practices continue, global food security will be compromised, leaving thousands of people malnourished, hungry, or dead. The World Bank [3] forecasts that the number of people requiring urgent humanitarian assistance is going to reach 205 million in 45 countries/territories. This situation makes it hard for farmers to decide which crops to grow that would give them a substantial yield without degrading the soil any further. The farmer needs to make an informed decision about which crops would provide good yields while maintaining balance with the environment. Sustainable soil management plays a key role in preventing soil degradation; the soil must be enriched with the nutrients required to grow numerous types of crops. Another major concern for farmers is crop disease. Crop diseases can cause extreme losses to farmers and, in some cases, may destroy entire crop production. To curb the spread of crop diseases, the disease must be diagnosed at an early stage. This paper introduces AgriGenie, a one-stop platform web application that allows farmers, or anyone who wants to grow anything, to make informed decisions about which crops to grow considering the soil and environmental factors, and which fertilizers to use depending on the existing soil composition, environmental factors, and the crops they want to grow. The web application is also equipped to predict crop diseases. It can additionally send the results of the analysis as an SMS, making it easy for farmers to remember and access the results from anywhere.
2 Related Work In the paper [4], the proposed system is used to recommend crops and identify crop diseases. It utilizes the CNN algorithm for identifying crop diseases. It uses a custom dataset that is made using a pre-existing plant village dataset. For the crop recommendation, the proposed system uses various algorithms and the best one is chosen. Algorithms such as KNN, SVC, Random Forest, Decision Tree, and Logistic Regression are used. Out of the stated algorithms, SVC provided the highest accuracy. In the paper [5], the research methodology discusses a mobile application for fertilizer recommendation. The mobile application is designed to provide fertilizer recommendations based on the soil and environmental characteristics as well as name of the crop that the user wants to grow. The fertilizer recommendation is done using machine learning algorithms such as KNN, Random Forest, and Decision Tree. Among the three algorithms, Random Forest was chosen for the mobile application as it provided the highest accuracy of 90%. In the paper [6], the proposed system discusses a web application which is made for a college where students, faculty, and alumni can interact. This responsive web application uses NLP for text analysis and uses Python’s Django framework to build the backend of the web application and React Js for the frontend. In the paper [7], a system is discussed that will recommend nutrients that are required to keep the soil enriched and maintain yearly yield. It utilizes improved genetic algorithm which uses time-series sensor data to recommend nutrients to the soil. The recommendation is done by comparing patterns with time-series sensor data. In the paper [8], the proposed system discusses the need of precision farming and how it can be achieved. It proposes a web application that is built using Django framework and provides functionalities such as crop recommendation, weed identification, pesticide recommendation, and crop cost estimation. In the paper [9], the proposed methodology for crop recommendation considers only Karnataka region, India. It utilizes a custom dataset made using the data obtained from IMD Pune containing information regarding 20 major crops. In the paper [10], a research methodology is adopted to optimize soil nutrient parameters such as phosphorus, potassium, organic carbon, boron, and pH. Extreme learning machine with different activation functions are used. The Gaussian radial basis attained the highest performance compared to other EL activation functions. For pH parameter, hyperbolic tangent gave the highest performance of 90%. In the paper [11], the authors proposed a methodology to detect crop diseases, severity of the disease, and crop loss estimation. It uses KNN with an acquired accuracy of 98.50%. The authors used GRAD-CAM for visualization. In the paper [12], the authors proposed a new sequential image classification model to detect crop diseases which combines RNN and CNN known as GatedRecurrent Convolutional Neural Networks (G-RecConNN). The dataset is collected for plantain and banana crops from the state of Tamil Nadu, India.
In the paper [13], the authors proposed a system to build a web application using Flask framework that can predict diabetes. Machine learning algorithms such as Decision Tree, Naïve Bayes, KNN, Random Forest, Gradient Boosting, Logistic Regression, and SVM are deployed. Out of the stated, the best performing model is chosen to be deployed on the web application. In the paper [14], the author proposed a methodology to provide an accurate diagnosis of plant diseases using CNN. A model is developed with seven convolutional layers, two densely connected layers, and four pooling layers. It uses PlantVillage dataset from Kaggle. All the existing systems that are available now are studies and papers that do not have a user interface and, in some cases, if the user interface is available, they are limited by the functionality. As there is no proper user interface, or the limited functionality makes it tough for farmers and any user to navigate, thus hampering their effort to make an informed decision. There are several papers that discuss crop recommendation, fertilizer recommendation, and crop disease detection, but there are none on having the three functionalities on one single platform and be able to give quick and accurate results to the user on the web and through an SMS.
3 Objectives • The web application is designed to make an all-in-one platform that is easy to use and provides the user with adequate information to make informed decisions about farming. The web application is designed to provide results with good accuracy that would help farmers and anyone who wants to grow plants. • Terrace gardens have become quite popular as people are shifting toward organic and homegrown food. The proposed web application can be an amazing guide to them to grow plants that are suitable to the soil and environmental factors they have. It can also help them identify any crop diseases as many do not have such expertise. They can also get suggestions on fertilizers that will help them grow better crops. • For farmers, this web application can provide them with good yield and curb the usage of unnecessary amounts of synthetic fertilizers and grow crops that are suitable to the soil conditions. This web application also helps them to stop the spread of infectious plant diseases that might cause an alarming damage to the farm, yield and even contaminate other crops and water.
4 Research Methodology The proposed methodology of the paper is to utilize the powerful computing algorithms of machine learning and deep learning and combine them with web development to provide a web application that helps farmers make informed decisions on
crops to grow, fertilizers to use and predict crop diseases. The result is displayed on the web application, and a text message is sent to user via an SMS API.
4.1 Crop Recommendation System For crop recommendation, the dataset from Kaggle [15] is used and methodology is shown in Fig. 1. It is a dataset that is prepared by merging crop and fertilizer data. The crop data held attributes such as temperature, humidity, pH, rainfall, and labels of various crops. The fertilizer data held attributes for NPK values and pH values with labels. The intersection of the two datasets is taken from labels. The initial crop dataset had 3099 data records for 32 crop varieties and the initial fertilizer dataset had 1842 data records for 97 crop varieties. The new merged dataset contained 2200 data records for 22 crop varieties. The new dataset is trained on machine learning algorithms such as Decision Tree, Naïve Bayes, Support Vector Machine (SVM), Logistic Regression, and Random Forest. Figure 2 shows the fertilizer dataset before processing, Fig. 3 shows the crop dataset before processing and Fig. 4 shows the merged crop and fertilizer dataset.
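As a rough sketch of this training step, assuming the merged dataset is exported as a CSV with the seven soil and weather attributes plus a crop label column (the path and column names are assumptions), the Random Forest variant could be trained as follows:

```python
# Illustrative training of the crop-recommendation model on the merged dataset
# (the CSV path and column names are assumptions, not necessarily the paper's).
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

data = pd.read_csv("crop_recommendation.csv")
X = data[["N", "P", "K", "temperature", "humidity", "ph", "rainfall"]]
y = data["label"]                      # one of the 22 crop varieties

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))
```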
Fig. 1 Block diagram for crop recommendation
Fig. 2 Fertilizer dataset before processing
Fig. 3 Crop dataset before processing
Fig. 4 Merged crop and fertilizer dataset
4.2 Fertilizer Recommendation System For fertilizer recommendation as shown in Fig. 5, the dataset is taken from Kaggle [16]. The dataset had 99 data records for seven types of fertilizers, i.e., urea, DAP, 14-35-14, 28-28, 17-17-17, 20-20, 10-26-26. The dataset attributes included temperature, humidity, moisture, soil type, crop type, NPK values, and labels. Initially, the dataset was unbalanced; hence, we up sampled the dataset and increased the number of data records to 154. The new dataset is trained on K-Nearest Neighbors algorithm (KNN), Support Vector Machine (SVM), and Random Forest. Figure 6 shows the fertilizer dataset before upsampling. Figure 7 shows the missing values graph before upsampling, and Fig. 8 shows the fertilizer dataset after upsampling.
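One way to balance such a small, skewed dataset is per-class resampling with scikit-learn, as sketched below; the target column name and the per-class target count are assumptions made for illustration.

```python
# Illustrative upsampling of minority fertilizer classes before training
# (the column name "Fertilizer Name" and the target count are assumptions).
import pandas as pd
from sklearn.utils import resample

data = pd.read_csv("fertilizer.csv")
target_per_class = 22                  # e.g., to reach ~154 records over 7 classes

balanced_parts = []
for label, group in data.groupby("Fertilizer Name"):
    if len(group) < target_per_class:
        group = resample(group, replace=True, n_samples=target_per_class, random_state=42)
    balanced_parts.append(group)

balanced = pd.concat(balanced_parts).sample(frac=1, random_state=42).reset_index(drop=True)
print(balanced["Fertilizer Name"].value_counts())
```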
Fig. 5 Block diagram for fertilizer recommendation
Fig. 6 Fertilizer dataset before upsampling
4.3 Crop Disease Prediction For crop disease prediction, the dataset is taken from Kaggle [17]. Figure 9 shows the ResNet50 architecture, and Fig. 10 shows the block diagram for crop disease prediction. The dataset contains 38 classes of images covering 26 types of plant diseases across 14 different plants, with a total of 71,877 images: 43,456 training images, 17,572 validation images, and 10,849 test images. Python's widely used deep learning libraries Keras and TensorFlow are used. The dataset is trained on the ResNet50 [18] model, a CNN pretrained on the ImageNet database that mitigates the vanishing gradient problem.
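A compact transfer-learning setup along these lines is sketched below; the image size, batch size, dataset path, and the choice to freeze the base are our assumptions, and only the 38-class output follows the dataset description.

```python
# Illustrative ResNet50 transfer-learning head for the 38 disease classes
# (image size, batch size, dataset path, and freezing the base are assumptions).
import tensorflow as tf

base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                      input_shape=(224, 224, 3))
base.trainable = False                    # reuse ImageNet features

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(38, activation="softmax"),   # 38 plant-disease classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# In practice, ResNet50's preprocess_input should be applied to the images first.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "new-plant-diseases/train", image_size=(224, 224), batch_size=32,
    label_mode="categorical")
model.fit(train_ds, epochs=3)
```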
4.4 Web Development Web development is a crucial part of the project as all the prediction and recommendation algorithms are processed using requests such as GET and POST. These requests make up the web development and integrate our algorithms with the frontend of the web application. The Flask micro framework is used to build the backend
Fig. 7 Missing values graph before upsampling
Fig. 8 Fertilizer dataset after upsampling
Fig. 9 ResNet50 architecture
Fig. 10 Block diagram for crop disease prediction
of the web application to process all the requests made by the user. Flask was chosen because it is a Python framework and provides technologies, tools, and modules to develop real functionalities that make web development simpler. In our project, the frontend is built with HTML and CSS and a strong backend is built with Flask to support our frontend which is used to render pages, get requests, and post requests.
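A stripped-down Flask route for the crop-recommendation page could look like the sketch below; the template names, form fields, and pickle file are placeholders rather than the application's actual code.

```python
# Minimal Flask backend sketch for the crop-recommendation page
# (template names, form fields, and the pickle file are placeholders).
import pickle
from flask import Flask, render_template, request

app = Flask(__name__)
crop_model = pickle.load(open("crop_model.pkl", "rb"))

@app.route("/crop", methods=["GET", "POST"])
def crop_recommendation():
    if request.method == "POST":
        # Collect the seven soil/weather inputs from the submitted form.
        features = [[float(request.form[f])
                     for f in ("N", "P", "K", "temperature", "humidity", "ph", "rainfall")]]
        prediction = crop_model.predict(features)[0]
        return render_template("crop_result.html", crop=prediction)
    return render_template("crop_form.html")

if __name__ == "__main__":
    app.run(debug=True)
```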
5 Testing and Analysis Before the web application is built, the recommendation algorithms and prediction algorithm are executed on Jupyter Notebook and the prediction models are saved using the Python’s pickle library. The Python’s library “pickle” is utilized to serialize Python objects into a byte stream format that can be stored in a file. Here, we use it in our proposed system to save the models that run the services—crop recommendation, fertilizer recommendation and crop disease prediction provided by the web application. The Twilio library is used to make use of the SMS API to send an SMS to the user, when the user enters their phone number. The accuracy of the models is as follows.
5.1 Crop Recommendation The following are the accuracies for the crop recommendation system when trained on various machine learning algorithms such as Decision Tree, Naïve Bayes, Support
Fig. 11 Accuracy versus algorithm graph for crop recommendation
Vector Machine (SVM), Logistic Regression, and Random Forest providing an accuracy of 90.68%, 99.09%, 97.5%, 95.45%, and 99.09%, respectively. Figure 11 shows the accuracy versus algorithm graph for crop recommendation.
5.2 Fertilizer Recommendation The following are the accuracies for the fertilizer recommendation system when trained on various machine learning algorithms such as on K-Nearest Neighbors algorithm (KNN), Support Vector Machine (SVM), and Random Forest algorithms for the upsampled data and are providing an accuracy of 74.19%, 100%, and 100%, respectively. Figure 12 shows accuracy versus algorithm graph for fertilizer recommendation.
5.3 Crop Disease Prediction The following is an accuracy graph for the crop disease prediction system which is trained on the ResNet50 algorithm, and it provided a training accuracy of 95.52% and validation accuracy of 87.36% for 3 epochs. Figure 13 shows no. of epochs versus accuracy graph for crop disease prediction trained on ResNet50.
Fig. 12 Accuracy versus algorithm graph for fertilizer recommendation
Fig. 13 No. of epochs versus accuracy graph for crop disease prediction trained on ResNet50
5.4 Web Application—Homepage Figure 14 shows the web application homepage.
Fig. 14 Web application homepage
6 Results and Discussion The web application is built using Python’s Flask framework, that does all the backend operations of the web application. It provides all the necessary modules, tools, and technologies used to provide functionality to web application. The web application is launched on the localhost server, and the user can navigate to three web pages from there, i.e., the user can choose to opt different services provided by the web application—crop recommendation, fertilizer recommendation or crop disease prediction. This will navigate user to the respective pages, where the user can enter values of the soil and environmental factors for the crop and fertilizer recommendation systems and for crop disease prediction, the user must enter the name of the crop and upload the image of infected leaf; in each case, the user will be navigated to the results page and the precise results will be displayed on the web application. Additionally, the user can enter their phone number and the results will be sent to the user’s mobile phone as an SMS.
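The SMS delivery step could be implemented with Twilio's Python client roughly as follows; the account credentials and phone numbers are placeholders.

```python
# Sending the recommendation result to the user's phone via the Twilio SMS API
# (account credentials and phone numbers are placeholders).
from twilio.rest import Client

def send_result_sms(user_number, result_text):
    client = Client("ACCOUNT_SID", "AUTH_TOKEN")   # Twilio console credentials
    client.messages.create(
        body=f"AgriGenie result: {result_text}",
        from_="+10000000000",                      # Twilio-provisioned number
        to=user_number,
    )

send_result_sms("+910000000000", "Recommended crop: rice")
```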
7 Conclusion and Future Scope The paper presents an ideal solution to the farming needs of farmers and home growers alike. It provides precise and accurate results that will help them to make informed decisions about the crops that they’d grow. This will promote farming techniques that will strike a balance between the current demand and the supply of food. This is also meant to be a step toward sustainable farming as the soil is managed by avoiding the use of unnecessary amounts of synthetic fertilizers. It will also help the farmers prevent crop loss due to crop diseases.
The project can be further developed by building an IoT device, using various sensors that could read the soil and environmental values directly from the soil and transmit the results over the cloud to the web application, this way the farmers will no longer depend on soil testing agencies.
References 1. Press Information Bureau, PIB Delhi, Aug. 2021. https://www.pib.gov.in/PressReleasePage. aspx?PRID=1741942 2. International Monetary Fund, Nicoletta Batini, Reaping what we sow, Dec. 2019. https://www. imf.org/en/Publications/fandd/issues/2019/12/farming-food-and-climate-change-batini 3. The Work Bank. https://www.worldbank.org/en/topic/food-security/brief/countries-catalyzenew-preparedness-plans-to-more-effectively-respond-to-emerging-major-food-and-nutritioncrises 4. Kumar R, Shukla N, Princee (2022) Plant disease detection and crop recommendation using CNN and machine learning. In: 2022 international mobile and embedded technology conference (MECON). IEEE Xplore 5. Raviraja S, Raghavender KV, Sunagar P, Ragavapriya RK, Kumar MJ, Bharath VG (2022) Machine learning based mobile applications for autonomous fertilizer suggestion. In: Proceedings of the international conference on inventive research in computing applications (ICIRCA 2022). IEEE Xplore 6. Verma A, Kapoor C, Sharma A, Mishra B (2021) Web application implementation with machine learning. In: 2021 2nd international conference on intelligent engineering and management (ICIEM). IEEE Xplore 7. Ahmed U, Lin JC-W, Srivastava G, Djenouri Y (2021) A nutrient recommendation system for soil fertilization based on evolutionary computation. Comput Electron Agric 189:106407. ISSN 0168-1699, https://doi.org/10.1016/j.compag.2021.106407 8. Durai SKS, Shamili MD (2022) Smart farming using machine learning and deep learning techniques. Decis Anal J 3:100041. ISSN 2772-6622, https://doi.org/10.1016/j.dajour.2022. 100041 9. Anjana, Kedlaya KA, Sana A, Bhat BA, Kumar S, Bhat N (2021) An efficient algorithm for predicting crop using historical data and pattern matching technique. Glob Transit Proc 2(2):294–298. ISSN 2666-285X, https://doi.org/10.1016/j.gltp.2021.08.060 10. Suchithra MS, Pai ML (2020) Improving the prediction accuracy of soil nutrient classification by optimizing extreme learning machine parameters. Inform Process Agric 7(1):72–82. ISSN 2214-3173, https://doi.org/10.1016/j.inpa.2019.05.003 11. Kundu N, Rani G, Dhaka VS, Gupta K, Nayaka SC, Vocaturo E, Zumpano E (2022) Disease detection, severity prediction, and crop loss estimation in MaizeCrop using deep learning. Artif Intell Agric 6:276–291. ISSN 2589-7217, https://doi.org/10.1016/j.aiia.2022.11.002 12. Nandhini M, Kala KU, Thangadarshini M, Verma SM (2022) Deep learning model of sequential image classifier for crop disease detection in plantain tree cultivation. Comput Electron Agric 197:106915. ISSN 0168-1699, https://doi.org/10.1016/j.compag.2022.106915 13. Ahmed N, Ahammed R, Islam MdM, Uddin MdA, Akhter A, Talukder MdA, Paul BK (2021) Machine learning based diabetes prediction and development of smart web application. Int J Cogn Comput Eng 2:229–241. ISSN 2666-3074, https://doi.org/10.1016/j.ijcce.2021.12.001 14. Anwarul S, Mohan M, Agarwal R (2023) An unprecedented approach for deep learning assisted web application to diagnose plant disease. Procedia Comput Sci 218:1444–1453. ISSN 18770509, https://doi.org/10.1016/j.procs.2023.01.123 15. Crop Recommendation Dataset, Atharva Ingle. https://www.kaggle.com/datasets/atharvaingle/ crop-recommendation-dataset
16. Fertilizer Prediction Dataset, GD Abhishek. https://www.kaggle.com/datasets/gdabhishek/fer tilizer-prediction 17. New Plant Diseases Dataset, Samir Bhattarai. https://www.kaggle.com/datasets/vipoooool/ new-plant-diseases-dataset 18. He K, Zhang X, Ren S, Sun J Deep residual learning for image recognition. https://doi.org/10. 48550/arXiv.1512.03385
The Hybrid Model of LSB—Technique in Image Steganography Using AES and RSA Algorithms Srinivas Talasila, Gurrala Vijaya Kumar, E Vijaya Babu, K Nainika, M Veda Sahithi, and Pranay Mohan
Abstract The process of embedding a data file or information inside another data file or picture is known as steganography. The goal of steganography is to conceal the presence of a message or data so as to avoid it being seen by an observer who does not know how or what to look for. Steganalysis techniques are capable of helping hide details in electronic data such as pictures, audio files, and video files. For instance, a message might be hidden in the unimportant portions of a picture or in the sparse components of an audio transmission. Data encryption can additionally be employed to safeguard the data on printed sheets by using magic markers or other approaches. The least significant bit (LSB) approach, an edge detection technique, has the disadvantage of poorer security and imperceptibility compared to the classic steganography techniques already in use. We propose a paradigm that integrates steganography and cryptography to increase security. To add a cryptographic flavour, we use the Advance Encryption Standard (AES) algorithm, which protects the security of the secret message bits. In this study, we compare the performance of the LSB approach using AES and the LSB technique using AES and RSA (Rivest– Shamir–Adleman) algorithms to that of the Pixel Locator Sequence (PLS) technique. To test the performance of the suggested approaches, we compute the metrics Peak Signal-to-Noise Ratio (PSNR) and Mean Square Error (MSE) and compare the findings with previously published PLS techniques that show good enough values for our proposed model. Keywords Image steganography · Least significant bit technique (LSB) · Advance encryption standard (AES) · Rivest–Shamir–Adleman algorithm (RSA) · Peak Signal-to-Noise Ratio (PSNR) · Mean Square Error (MSE)
S. Talasila (B) · G. Vijaya Kumar · E. Vijaya Babu · K. Nainika · M. Veda Sahithi · P. Mohan VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, India e-mail: [email protected] G. Vijaya Kumar e-mail: [email protected] E. Vijaya Babu e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_34
1 Introduction In today’s society, everything is digital communication via computers, smartphones, and the internet, including data interchange. Computer vision has brought about a revolutionary transformation across diverse industries by empowering machines to interpret and comprehend visual data effectively. This technological advancement has paved the way for numerous applications that have significantly impacted our lives [1, 2]. The fast rise in data interchange is primarily concerned with data security. Third parties who try to modify the user’s data for vengeful intentions are always a threat to the data. To circumvent this, we conceal the data under a cover file to protect its security. Several ways were employed to obscure data in ancient times. Steganography, which allows crucial information to be buried within seemingly harmless files, has become a robust approach to data security in the digital era. The cover file can be sent over a network, and unauthorised parties will be ignorant of the concealed message’s presence. The hidden message may only be retrieved by the intended recipient, who possesses the relevant key. Steganography techniques have changed over time, and there are now various ways for concealing messages, such as least significant Bit (LSB) insertion, which hides the message within the least significant bit of each pixel in a picture, and Spread Spectrum. Cryptography is one approach that focuses on concealing the content of a concealed message in situations where a third party is aware of the presence of a hidden message. It involves data encryption and decryption using secret keys that only authorised users can access. Steganography is an approach for concealing the existence of secret communication in which data is disguised so that no eavesdropper can identify the existence of hidden data except the sender and recipient. Individual uses of both cryptography and steganography exist in traditional procedures, with cryptography having the disadvantage of alerting a foreign power to hidden communications and steganography having the disadvantage of maintaining safety for concealed data. We present a paradigm combining steganographic approaches with cryptographic algorithms to improve the performance of steganography. To put our strategy into action, we will need a cover document, which is a picture object that hides the confidential text message, the hidden text, and the stego keys to inject and extract the content. The resultant file is a stego file when the cover file is integrated with concealed text using stego keys. With the same stego key, the receivers who are permitted users can extract the hidden content from the stego file. An anchoring algorithm, as well as the stego keys, are employed to encapsulate the exclusive text data in a cover object. The resulting stego object is transferred via a channel to the target device, where it is assessed by the process of feature extraction using the stego key. Unauthenticated viewers can observe the stego text as it is transferred, but they cannot detect the presence of a concealed message because they only have access to the innocent text being communicated. The process of steganography is shown in Fig. 1.
Fig. 1 Process of steganography
2 Related Works In this article, the least significant bit (LSB) technique with Advance Encryption Standard (AES) and Rivest, Shamir, and Adleman (RSA) algorithms was used to hide text messages in the cover image. Many professionals have worked on implementing image steganography and introduced several techniques in the spatial and frequency domains. The author [3] presented a detailed description of different approaches and encryption techniques used in image steganography with a computational analysis, where the authors of [4] explained the performance analysis of various spatial domain techniques in image steganography. The most widely used LSB technique has the drawback of having less immunity to noise and compression. Mid-Position Value (MPV) approach was employed by the authors [5] for image steganography. This technique follows a conditional strategy that endorses the overall security of secret data bits. This method is used to embed high-capacity secret messages, but it has the disadvantage of being less imperceptible. The multifaceted world of steganography, encompassing its historical foundations, contemporary techniques, applications, challenges, and the integration of cutting-edge technologies can be found in [6–10]. The authors of [11] have given a brief explanation of the performance metrics used to measure the performance of image steganography. The authors of [12, 13] went into great detail about recent advances in image steganography and data hiding using steganography and cryptography. The authors in [14] proposed an improved LSB approach combined with asymmetric cryptography. The author adapted the previous LSB approach and applied a mapping function to conceal the ciphered data within the host image, resulting in a secure and private image. In [15], the author implanted the LSB technique using the random distribution technique Pixel Locator Sequence (PLS) with the AES algorithm, which we are using as a reference for our proposed model. In [16], the author had a brief explanation of implanting the LSB technique with the AES technique, focusing on improved security of data. The authors of [17] have introduced a technique based on the combination of AES and RSA techniques
Fig. 2 Change of pixel value using LSB
for improved data security, which we have used as one of the reference techniques in our implementation. In this paper, we compare the proposed model, the LSB technique combined with the AES and RSA algorithms, with the performance of the techniques used in [15] (the LSB technique using PLS and AES) and [16] (the LSB technique with AES); the proposed model exhibits better performance metrics than these existing techniques. The least significant bit technique is a spatial domain technique that is applied directly to the bits of the cover image. It is simple to implement and highly robust. The secret message bits replace the least significant bits in each pixel of the RGB cover image, so that there is no discernible difference in the image to human eye perception, as depicted in Fig. 2. The embedded secret message is then extracted with the same algorithm by reading the modified least significant bits in the cover image. Although this technique is robust and simple to use, it is less immune to noise, leading to a low PSNR value, and it has low imperceptibility. This motivates the use of more advanced techniques for implementing image steganography, in either the spatial or the transform domain. One random data distribution method is the Pixel Locator Sequence (PLS) technique. Third parties find it difficult to locate the secret data due to the random distribution, which increases security and robustness against steganalysis. In this method, pixels are selected in a random manner and their LSB values are changed accordingly, hiding the secret message in a dispersed way and making it difficult for third parties to discover it. Due to the random distribution pattern, this technique consumes more space, making it inefficient. This motivates the use of modified versions and advanced techniques for implementing image steganography. The above steganographic techniques, which are used as the base techniques in our implementation, have their own limitations; we compare them with the performance of our proposed model, which exhibits good metrics.
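As a concrete illustration of the base technique, the following is a minimal sketch (not the authors' code) of plain sequential LSB embedding and extraction using Pillow and NumPy; the 32-bit length header is an assumption added so the message can be recovered from the stego image.

```python
import numpy as np
from PIL import Image

def lsb_embed(cover_path, message, stego_path):
    """Hide `message` in the least significant bits of an RGB cover image."""
    img = np.array(Image.open(cover_path).convert("RGB"))
    flat = img.flatten()
    # Prefix the payload with a 32-bit length header so it can be extracted later.
    payload = len(message).to_bytes(4, "big") + message.encode("utf-8")
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    if bits.size > flat.size:
        raise ValueError("message too large for this cover image")
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits  # replace the LSBs
    # Save losslessly (PNG); lossy formats would destroy the hidden bits.
    Image.fromarray(flat.reshape(img.shape)).save(stego_path, format="PNG")

def lsb_extract(stego_path):
    """Recover the hidden message by reading the LSBs back."""
    flat = np.array(Image.open(stego_path).convert("RGB")).flatten()
    n = int.from_bytes(np.packbits(flat[:32] & 1).tobytes(), "big")
    bits = flat[32: 32 + 8 * n] & 1
    return np.packbits(bits).tobytes().decode("utf-8")
```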
3 Proposed Model

In our proposed method, as shown in Fig. 3, we combine the cryptographic techniques AES (symmetric encryption) and RSA (asymmetric encryption) with the steganographic LSB technique. Though asymmetric encryption is more
Fig. 3 Flowchart of the proposed method
secure than symmetric encryption, it consumes more computational time, making it infeasible to use on its own in our application. To hide the data, we therefore combine the AES technique (a symmetric encryption technique), the RSA algorithm (an asymmetric encryption technique), and the LSB steganographic technique. As a result, both computing speed and data security are improved. The secret data is first protected using the AES method; the AES private key is then encrypted with the RSA algorithm, and the encrypted content is embedded in the cover image with the LSB algorithm. This approach contributes to increased security and eliminates the key transfer problem. The dataset used for this work is taken from a Kaggle source, and a few samples are shown in Fig. 4.
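The paper does not give implementation code; the following is a minimal sketch of this hybrid flow using the PyCryptodome library, under the assumption that the AES session key is RSA-wrapped and embedded alongside the ciphertext. The packing format and key sizes are illustrative, not the authors' exact scheme.

```python
import base64
from Crypto.Cipher import AES, PKCS1_OAEP
from Crypto.PublicKey import RSA
from Crypto.Random import get_random_bytes

def hybrid_encrypt(secret_text, rsa_public_key):
    """AES-encrypt the message, then RSA-encrypt the AES session key."""
    session_key = get_random_bytes(16)                  # 128-bit AES key
    cipher_aes = AES.new(session_key, AES.MODE_EAX)
    ciphertext, tag = cipher_aes.encrypt_and_digest(secret_text.encode())
    wrapped_key = PKCS1_OAEP.new(rsa_public_key).encrypt(session_key)
    # Pack everything into one printable payload for LSB embedding.
    parts = [wrapped_key, cipher_aes.nonce, tag, ciphertext]
    return "|".join(base64.b64encode(p).decode() for p in parts)

def hybrid_decrypt(payload, rsa_private_key):
    """Reverse the process after the payload is extracted from the stego image."""
    wrapped_key, nonce, tag, ciphertext = (base64.b64decode(p) for p in payload.split("|"))
    session_key = PKCS1_OAEP.new(rsa_private_key).decrypt(wrapped_key)
    cipher_aes = AES.new(session_key, AES.MODE_EAX, nonce=nonce)
    return cipher_aes.decrypt_and_verify(ciphertext, tag).decode()

# Example usage, reusing the lsb_embed/lsb_extract sketch from Sect. 2:
# key = RSA.generate(2048)
# payload = hybrid_encrypt("This secret message has to be embedded into the image", key.publickey())
# lsb_embed("cover.png", payload, "stego.png")
# print(hybrid_decrypt(lsb_extract("stego.png"), key))
```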
3.1 Techniques Used in the Proposed Model

AES Technique: AES is an abbreviation for Advanced Encryption Standard, a symmetric block cipher encryption method commonly used to safeguard data in a variety of applications. AES has a block size of 128 bits and a key size of either 128, 192, or 256 bits; it was standardised to replace the earlier DES cipher. The AES round function acts on a block of plaintext and consists of multiple rounds of substitution and permutation operations. It creates a ciphertext block by performing a sequence of substitution and permutation operations on the plaintext block and the encryption key. Because of the length of its keys, the number of rounds in the encryption algorithm, and the sophistication
Fig. 4 Images used for steganography
of its internal processes, AES is regarded as a highly safe encryption method. It is also resistant to various attacks, particularly differential and linear cryptanalysis. Ultimately, AES is a highly secure encryption technique that delivers robust encryption in various applications to protect sensitive data. Figure 5 shows the steps involved in AES encryption and decryption. RSA Technique: RSA is an abbreviation for Rivest–Shamir–Adleman, a public-key cryptosystem commonly used for the safe transmission of information, digital signatures, and key exchange. RSA is based on the mathematical difficulty of factoring the product of large prime numbers: while multiplying two large prime numbers together is straightforward, identifying the factors of the product is highly difficult. The RSA algorithm employs a pair of mathematically linked keys, a private key and a public key. In RSA encryption, the public key is utilised to encrypt messages, and the private key serves as the decryption key. The sender encrypts the message using the receiver's public key; the receiver then decrypts the message using their private key, which ensures secrecy and authenticity by allowing only the party holding the private key to decrypt the communication. Because of its security qualities, RSA is commonly used for secure communication. It offers safe encryption and decryption and is resistant to various attacks, such as brute-force and known-plaintext attacks. Overall, RSA is a popular public-key cryptosystem for secure data transfer, digital signatures, and key exchange. It is a pillar of modern cryptography and is extensively used in various
Fig. 5 Flowchart of the AES encryption and decryption process
applications, including e-commerce, online banking, and secure communications. Figure 6 shows the steps involved in the RSA algorithm.
Fig. 6 Flowchart of the RSA encryption and decryption process
LSB Technique: The least significant bit method is a spatial domain approach applied directly to the bits of the cover picture. To encode a
Fig. 7 Flowchart of the image steganography using LSB algorithm
concealed message, LSB steganography modifies the least significant bit of pixels or samples in a digital media file. The change in the least significant bit is generally unnoticeable to human eyes or ears, and the original file appears unchanged. In LSB steganography, the ciphertext is first converted to binary form, after which every bit of the binary information is incorporated, one by one, into the least significant bits of the media file's pixels or samples. The pixels or samples used to carry the message are generally chosen at random or based on a preset pattern. The concealed message can then be extracted by reading the least significant bits of the file's pixels or samples and reassembling the binary message. LSB steganography is commonly employed because it is a straightforward and effective method for concealing messages within digital content files. It is also hard to detect, since the modifications made to the file are subtle and not clearly visible to the eye or audible to the ear. Nevertheless, LSB steganography is susceptible to attacks, because statistical analysis or other approaches can discover changes in the least significant bits of pixels or samples. Generally, LSB steganography is a popular method for concealing messages within digital media files; however, when employing it for sensitive or essential applications, it is crucial to be mindful of its limits and risks. Figure 7 explains the steps involved in the LSB algorithm.
4 Results and Discussion

Steganography is the technique of embedding hidden data into a cover file without sacrificing its original quality. The Peak Signal-to-Noise Ratio (PSNR) and Mean Square Error (MSE) were determined, and the results are displayed in Tables 1 and 2. PSNR is the ratio between the maximum possible power of a signal and the power of the corrupting noise that affects the fidelity of its representation. Equation (1) is the formula to calculate the value of PSNR.
\mathrm{PSNR} = 20\cdot\log_{10}\!\left(\frac{\mathrm{MAX}_1}{\sqrt{\mathrm{MSE}}}\right) \quad (1)

where
• MAX_1 is the maximum pixel value of the image.
• MSE is the Mean Square Error.

MSE is the metric obtained by averaging the squared differences between the original and the stego image. An image's quality is inversely correlated with its MSE value, meaning that the lower the MSE value, the higher the image's quality. Equation (2) is the formula to calculate the value of MSE.

\mathrm{MSE} = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\bigl[I(i,j) - K(i,j)\bigr]^2 \quad (2)

where
• I is the m × n monochrome image.
• K is its noisy approximation (the stego image).

Table 1 shows the PSNR and MSE values of the five images we used for steganography. In this experiment, we encoded a text file that contains the private information: “This secret message has to be embedded into the image”. In Table 2, we have compared the PSNR and MSE values of the proposed and existing models.
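For reference, the metrics of Eqs. (1) and (2) can be computed from a cover/stego image pair as in the following generic sketch, assuming 8-bit images (MAX_1 = 255); this is not the authors' evaluation script.

```python
import numpy as np

def mse(cover: np.ndarray, stego: np.ndarray) -> float:
    """Mean squared error between two images of the same shape (Eq. 2)."""
    diff = cover.astype(np.float64) - stego.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(cover: np.ndarray, stego: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB (Eq. 1)."""
    err = mse(cover, stego)
    if err == 0:
        return float("inf")  # identical images
    return 20.0 * np.log10(max_value / np.sqrt(err))
```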
Table 1 PSNR and MSE values of stego images

Images    Resolution    PSNR        MSE
Image 1   225 × 225     52.50649    0.36511
Image 2   493 × 356     52.47420    0.36784
Image 3   512 × 512     52.68692    0.35025
Image 4   512 × 384     52.55508    0.36105
Image 5   512 × 480     52.59133    0.35805
Table 2 Comparison of different hiding techniques

          LSB + PLS + AES        LSB + AES + RSA (proposed)
Images    PSNR      MSE          PSNR      MSE
Image 1   45.86     1.6846       52.50     0.3651
Image 2   44.56     2.2721       52.47     0.3678
Image 3   48.34     0.9518       52.68     0.3502
Image 4   47.34     1.1988       52.55     0.3610
Image 5   44.62     2.2432       52.59     0.3580
5 Conclusion

This study recommends integrating LSB steganography with the AES and RSA algorithms to improve information security. The LSB technique is used to embed and extract the secret information in this methodology, while the AES and RSA methods are used to encrypt and decrypt the content of the stego image. Alone, AES or RSA already provides good security; combining them gives even greater security, making it almost impossible to break the keys and read the message. Using LSB steganography in conjunction with the AES and RSA algorithms gives a more secure means of concealing and sending hidden messages inside digital media files. The LSB approach hides the message within the digital media file's pixels or samples, whereas the AES and RSA techniques encrypt the message and secure it from unwanted access. The Advanced Encryption Standard (AES) algorithm is a well-known symmetric key encryption technique that provides robust data protection. It employs a private key to encrypt and decrypt information and is resistant to multiple attacks, such as brute force and cryptanalysis. The Rivest–Shamir–Adleman (RSA) technique, on the other hand, is an extensively adopted asymmetric encryption algorithm that guarantees a substantial degree of security for information transmission. It uses a public key to encrypt data and a private key to decrypt data, and it is also resistant to various attacks, including brute force and cryptanalysis. Using LSB steganography with the AES and RSA algorithms improves data security by making it nearly impossible to break the keys and read the message. The hidden information is first encrypted using the AES approach, and then the encrypted information is placed in the pixels or samples of the digital media file using the LSB technique [10]. The stego file is further protected using the RSA technique to give an extra degree of protection. To decode the hidden message, the recipient first uses the private key to decrypt the RSA-protected stego file, then uses the AES key to decrypt the embedded message, and finally extracts the message from the LSBs of the pixels or samples of the media file. From the experimental results, the combination of LSB steganography with the AES and RSA approaches provides a very safe way of concealing and delivering secret messages within digital media files, making it a suitable choice for applications requiring high degrees of security and anonymity.
References

1. Talasila S, Rawal K, Sethi G (2023) Black gram disease classification using a novel deep convolutional neural network. Multimed Tools Appl 82:44309–44333. https://doi.org/10.1007/s11042-023-15220-4
2. Esteva A, Chou K, Yeung S et al (2021) Deep learning-enabled medical computer vision. NPJ Digit Med 4:5. https://doi.org/10.1038/s41746-020-00376-2
3. Kaur S, Singh S, Kaur M et al (2022) A systematic review of computational image steganography approaches. Arch Comput Methods Eng 29:4775–4797. https://doi.org/10.1007/s11831-022-09749-0
4. Vithayathil AJ, Sreekumar A (2023) Pixel-based image encryption approaches: a review. In: Mathur G, Bundele M, Tripathi A, Paprzycki M (eds) Proceedings of 3rd international conference on artificial intelligence: advances and applications. Algorithms for intelligent systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-7041-2_11
5. Mukherjee S, Roy S, Sanyal G (2018) Image steganography using mid position value technique. Procedia Comput Sci 132:461–468. https://doi.org/10.1016/j.procs.2018.05.160
6. Joshi R, Bairwa AK, Soni V, Joshi S (2022) Data security using multiple image steganography and hybrid data encryption techniques. In: 2022 international conference for advancement in technology (ICONAT). https://doi.org/10.1109/iconat53423.2022.9725949
7. Raja KB, Chowdary C, Venugopal KR, Patnaik LM (2005) A secure image steganography using LSB, DCT and compression techniques on raw images. https://doi.org/10.1109/icisip.2005.1619431
8. Al-Ataby A, Al-Naima FM (2010) A modified high capacity image steganography technique based on wavelet transform. Int Arab J Inform Technol 7:358–364. https://dblp.uni-trier.de/db/journals/iajit/iajit7.html#Al-AtabyA10
9. Aljazaery IA, ALRikabi HTS, Aziz MKA (2020) Combination of hiding and encryption for data security. Int J Interac Mob Technol 14(09):34. https://doi.org/10.3991/ijim.v14i09.14173
10. Manohar N, Kumar PV (2020) Data encryption and decryption using steganography. https://doi.org/10.1109/iciccs48265.2020.9120935
11. Pradhan A, Sahu AK, Swain G, Sekhar KR (2016) Performance evaluation parameters of image steganography techniques. https://doi.org/10.1109/rains.2016.7764399
12. Subramanian N, Elharrouss O, Al-Maadeed S, Bouridane A (2021) Image steganography: a review of the recent advances. IEEE Access 9:23409–23423. https://doi.org/10.1109/access.2021.3053998
13. Cheltha CJN, Rakhra M, Kumar R, Walia H (2021) A review on data hiding using steganography and cryptography. https://doi.org/10.1109/icrito51393.2021.9596531
14. Pramanik S, Samanta D, Dutta S, Ghosh R, Ghonge MM, Pandey D (2020) Steganography using improved LSB approach and asymmetric cryptography. https://doi.org/10.1109/icatmri51801.2020.9398408
15. Tiwari K, Gangurde SJ (2021) LSB steganography using pixel locator sequence with AES. arXiv (Cornell University). https://doi.org/10.1109/icsccc51823.2021.9478162
16. Negi LM, Negi L (2021) Image steganography using Steg with AES and LSB. https://doi.org/10.1109/icced53389.2021.9664834
17. Kumar BN, Nair AM, Raj VKR (2017) Hybridization of RSA and AES algorithms for authentication and confidentiality of medical images. https://doi.org/10.1109/iccsp.2017.8286536
An Effective Online Failure Prediction in DC-to-DC Converter Using XGBoost Algorithm and LabVIEW B. Aravind Balaji, S. Sasikumar, Naga Prasanth Kumar Reddy Puli, Velicherla Chandra Obula Reddy, and V. R. Prakash
Abstract In this paper, an online failure detection method for a DC-to-DC converter using a machine learning algorithm is presented. The key feature of the proposed method is to detect multiple catastrophic failures by observing abnormal conditions in the circuit parameters during operation. The independent features are obtained online by a data acquisition instrument using the LabVIEW tool, and a tree-based ensemble learning algorithm is used for failure prediction and classification. The conventional method of using a single prediction model often leads to false predictions and belated failure detection. To enhance the effectiveness of the failure prediction algorithm, a group of different data-driven ensemble models is trained, and using the Receiver Operating Characteristic (ROC) curve, the best prediction model is identified. The extreme gradient boosting-based ensemble algorithm achieves the highest accuracy of 98.9% in multiclass classification among the ensemble models. Keywords Fault prediction · Ensemble model · Receiver Operating Characteristics (ROCs) · Multiclass classification · Extreme gradient boost algorithm
B. Aravind Balaji (B) · S. Sasikumar · N. P. K. R. Puli · V. C. Obula Reddy · V. R. Prakash
Hindustan Institute of Technology and Science, Chennai, India
e-mail: [email protected]
S. Sasikumar e-mail: [email protected]
V. R. Prakash e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_35

1 Introduction

Failure prediction in power converter circuits is a challenging task due to the arbitrary nature of the semiconductor devices present in the circuit. Nowadays, highly efficient power converters are deployed to handle a wide range of power applications at higher
gain and reduced cost, thereby affecting the components' stability due to electrical, thermal, and ageing stress factors. Power transistors and electrolytic capacitors are the two major components that get stressed during operation, resulting in component failure [1]. Generally, failures take two forms: parametric failure and catastrophic failure. Parametric failures are due to deviations in the parametric value of a component, whereas catastrophic failures are due to functional failure of the component [1–3]. This paper focuses on catastrophic failures due to the switch and the electrolytic capacitor in the DC-to-DC converter. Switch failures are categorized as open-circuit failure (OCF) and short-circuit failure (SCF) [1]. Similarly, the electrolytic capacitor fails due to leakage or evaporation of the electrolyte, thereby reducing the charge storage [3]. The failure signature of a component is diagnosed by monitoring the electrical parameters across the component in the circuit. In a DC-to-DC converter circuit, the switch OCF and SCF are identified by monitoring the sign of the inductor current in the circuit [4]. Similarly, the degradation of the capacitor is identified using the Equivalent Series Resistance (ESR) [4]. Khan and Wen [1] presented a comprehensive review of the different failure cases and fault identification methods for DC-to-DC converters; the common fault identification methods are hardware-driven and data-driven methods. Farjah et al. [6] proposed a hardware-driven failure analysis method to identify multiple catastrophic failures due to the power switch and electrolytic capacitor in a DC-to-DC converter using a single current sensor fixed across the capacitor; the failure is identified using a threshold by comparison with a reference signal. Su et al. [3] developed a sliding mode-based observer for failure identification in a boost converter. The failure identification system uses a DSP-based hardware-in-the-loop simulator for extracting the failure signature parameters and classifying the failure using the residual signal (error signal) with respect to a fixed threshold limit. Li et al. [7] proposed a hardware-driven method using the hardware-in-the-loop technique, in which the component's open-circuit (OC) failures are identified from a residual by comparing failure signature parameters, such as the switching frequency of the output voltage, with a threshold using an observer. Xu et al. [8] discussed switch open-circuit failure analysis in a DC-to-DC converter using a residual based on the immersion and invariance observer technique; the residual signal with respect to the threshold of the input voltage is utilized as a failure signature parameter to determine the switch failure. Laadjal et al. [9] developed an online failure diagnosis method for electrolytic capacitor degradation in LED applications using a hardware-driven technique: by measuring the ESR value of the capacitor, the degradation factor is determined using a data acquisition card and the LabVIEW tool. A residual (threshold) based approach is a common methodology used in existing hardware-driven fault analysis techniques. However, the major drawback of the above methods is that the threshold value keeps changing as per the circuit characteristics, resulting in poor compatibility and scalability. On the other hand, the data-driven technique eliminates such drawbacks by determining the threshold using mathematical models built from the historical dataset collected from the application under test [10].
Hence, the data-driven approach is the focus of more researchers for fault detection and classification. Machine learning (ML)
concepts are a common data-driven technique used in failure detection and classification. There are two classifier approaches in ML, namely single classifiers and multiple (ensemble) classifiers. León-Ruiz et al. [10] discussed single classifier-based fault diagnosis of a power switch in a photovoltaic (PV) system, where the Support Vector Machine (SVM) was chosen as an effective algorithm, achieving 100% prediction accuracy; the paper discussed binary classification, so multiclass classification is not possible. Bindi et al. [11] proposed a multi-layer neural network-based single classifier model for predicting failures in the DC-to-DC converter of a PV application, with a prediction accuracy of up to 92% over a few validations. The paper focused only on parametric failures in converters and compared the proposed model with the conventional SVM model for evaluation. The model supports multiclass classification but can predict faults only in a single component at a time. Furthermore, the paper discussed the importance of the false positive ratio in the prediction output for fault detection, and by using a dataset covering various environmental conditions, the ratio is improved. Sun et al. [12] and Kou et al. [13] proposed a feature enhancement method to identify the relationship between the features and the target class for detecting power converter switch open-circuit failures, using the wavelet transform to enhance the features before training the model; this results in a prediction accuracy of 97%, a gain of 1% in prediction efficiency for a single classifier model compared to the conventional method. Zhang et al. [14] and Chen et al. [15] discussed binary classification for health monitoring of the electrolytic capacitor using a single prediction model, an Artificial Neural Network (ANN), with an average prediction accuracy of 98.05%; the degradation of a capacitor is determined using the ESR and capacitance values. Chakraborty et al. [16] discussed the drawbacks of existing single classifiers, which result in false alarms and belated prediction, and further discussed the need for a tree-based ensemble classifier; the classifier was implemented for fault prediction of an HVAC system using the XGBoost (gradient boosting) algorithm. Yu et al. [17] discussed incipient faults (slow faults that develop over time) in DC-to-DC converters; the paper presents multiclass classification of parametric failures in the inductor, capacitor, and switch using a Support Vector Data Description model in simulation, achieving an accuracy score of 99.75% in ideal conditions and 98.83% in noisy conditions. Kapucu et al. [18] stated the advantage of using ensemble learning models in fault classification: a collection of ensemble models is implemented to predict short-circuit failures in a PV system, and using the voting classifier concept, the best ensemble model is chosen with a maximum accuracy of 97.46%. Chen et al. [19] discussed generalized hardware fault detection in an Uninterruptible Power Supply using the XGBoost algorithm; initially, the relationship between the failure and normal states is examined using logarithmic summation of probabilities, and the failure is classified using the XGBoost algorithm with a classification accuracy of 94%. Furthermore, the paper discusses the advantage of using the XGBoost algorithm for datasets that have limited dimensions, unbalanced samples, and minimal sample counts. When it comes to online data extraction for online prediction, Sumathi et al.
[20] discussed instrumentation control using the LabVIEW automation tool to control a
Mixed Signal Oscilloscope (MSO) for real-time data extraction; the paper states that LabVIEW plays a vital role in testing and measurement applications. Laadjal et al. [21] discussed online extraction of the capacitance and ESR values of the electrolytic capacitors present in the power converter of an LED driver using LabVIEW and a data acquisition card. This paper provides a detailed explanation of implementing an online failure detection model for a DC-to-DC converter. Initially, the failure signature is identified and the features are extracted as a dataset using data acquisition instruments and LabVIEW. The dataset is then used to train a group of multiclass classifier models, and by comparative analysis using the ROC curve and confusion matrix, the XGBoost algorithm is selected as the best-fitting model. The implementation of the XGBoost algorithm is discussed in the sections below.
2 Implementing Failure Detection Model Using XGBoost Algorithm

2.1 Description of Dataset and Fault Classes

Failure in DC-to-DC converters is mostly due to catastrophic failure of the power semiconductor devices present in the circuit; the MOSFET and the electrolytic capacitor are the two major components due to which the power converter fails [1]. A DC-to-DC converter scales up the output power using a power switch, an inductor, and an electrolytic capacitor. Generally, a component failure can be determined by identifying abnormal conditions of the electrical parameters across the components in a circuit. For instance, switch failure in a power converter circuit is identified by monitoring abnormal behaviour of the inductor current slope and output voltage, while capacitor degradation is directly proportional to the increase in the ESR value; similarly, the capacitance of the electrolytic capacitor is inversely proportional to the ESR value [4]. Considering these electrical parameters as the fault features, the dataset is generated. The samples are generated by simulating the faults in a real-time circuit, and the response is collected using real-time instruments with the help of the LabVIEW tool [4]. About 4582 samples of multiple failure scenarios with eight dimensions (independent variables) are recorded and tabulated as a dataset with six failure classes (dependent variable) assigned. Single and multiple failure hypotheses are assumed, where a maximum of two simultaneous failures is detected at a time. The normal condition of the DC-to-DC converter is labeled as "Class 0", and the other failure classes are summarized in Table 1.
Table 1 Failure class in DC-to-DC converter

Failure class    Description
0                Fault free
1                Degradation of capacitor
2                Short circuit
3                Short circuit and degradation of capacitor
4                Open circuit
5                Open circuit and degradation
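The paper does not publish the dataset schema; the following sketch only illustrates how such a labelled dataset could be loaded and split, with a hypothetical file name and label column.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Human-readable names for the six failure classes in Table 1.
CLASS_LABELS = {
    0: "Fault free",
    1: "Degradation of capacitor",
    2: "Short circuit",
    3: "Short circuit and degradation of capacitor",
    4: "Open circuit",
    5: "Open circuit and degradation",
}

df = pd.read_csv("dcdc_failure_dataset.csv")     # hypothetical file: 4582 rows, 8 features + label
X = df.drop(columns=["failure_class"])           # hypothetical label column name
y = df["failure_class"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
```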
2.2 XGBoost Algorithm

The selection of the appropriate machine learning algorithm is a key factor for effective failure detection and classification. From the recent literature review, ensemble classifiers have higher accuracy than single classifier models [16–18]. Mienye et al. [22] presented a detailed survey of the concepts, algorithms, and applications related to ensemble-based prediction, and further stated the advantages of using the XGBoost algorithm. Boosting is a sequential ensemble learning approach in which each model is trained on the residual of the previous model. Furthermore, some advantages of the XGBoost algorithm are that it requires negligible feature engineering and can handle datasets with few samples and few feature dimensions. The model can also handle feature selection with a built-in feature importance attribute that helps recognise the features better. In mathematical terms, gradient boosting optimises a loss function from a base model in order to build the subsequent models. The objective function of the XGBoost model combines the training loss, which determines how well the model fits the training data, with a regularization term that controls tree complexity, as represented in Eq. (1):

\mathrm{obj} = \sum_{j=1}^{n} l(\hat{y}_j, y_j) + \sum_{k=1}^{n} \Omega(f_k), \quad (1)

where l is the training loss between the expected response y_j and the predicted response \hat{y}_j, and \Omega is the regularization term that determines the complexity of the trees. The algorithm focuses on optimising the training loss to increase prediction accuracy and on optimising the regularization to reduce the complexity of the model. The XGBoost algorithm uses the squared loss, the sum of the squared differences between the predicted value and the actual value, as represented in Eq. (2):

L(\theta) = \sum_{j} (\hat{y}_j - y_j)^2. \quad (2)
The regularization parameters, the so-called hyperparameters, determine the complexity of the tree and its growth rate; the regularization term combines the number of leaves in a tree and the L2 norm of the leaf scores (the leaf weights). It is represented as Eq. (3):

\Omega(f_t) = \gamma T + \frac{1}{2}\lambda \sum_{k=1}^{T} w_k^2, \quad (3)

where \gamma determines how far the tree should grow, T represents the number of leaves, \lambda is used for controlling overfitting by generalizing or smoothing the data points, and w_k is the weight of leaf k. The XGBoost algorithm uses OpenMP support, which shares CPU cores using a multiprocessing Application Program Interface (API), making it faster.
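A minimal sketch of training the multiclass XGBoost model with the hyperparameters discussed above is shown next; the specific values of γ, λ, tree depth, and number of trees are illustrative assumptions, not the values tuned in the paper. It continues the X_train/X_test split from the dataset sketch.

```python
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score

clf = XGBClassifier(
    objective="multi:softprob",  # multiclass output over the six failure classes
    n_estimators=200,
    max_depth=6,
    learning_rate=0.1,
    gamma=0.1,        # minimum loss reduction to grow a leaf (gamma in Eq. 3)
    reg_lambda=1.0,   # L2 penalty on leaf weights (lambda in Eq. 3)
    n_jobs=-1,        # OpenMP parallelism across all CPU cores
)
clf.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```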
2.3 Feature Selection

The objective of training a model is to identify the association between the features and the classes. It is essential to choose appropriate features to obtain an accurate prediction model. The XGBoost algorithm supports a built-in feature selection technique: the algorithm is initially trained with the dataset, and the significance of each feature is identified using the feature importance attribute built into XGBoost. The features with minimal importance are removed from the dataset, and the process is repeated until an optimal feature set is extracted. Figure 1 shows the feature importance (F) score for failure detection and classification of the DC-to-DC converter. Out of eight features, the capacitor value has the lowest F-score and is discarded from the dataset. The final optimised features are used in the next stage for training the prediction model.
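Continuing the training sketch above, the built-in importance attribute can be used to rank and prune features roughly as follows; the column names come from the hypothetical dataset sketch, and the paper reports the capacitor value as the lowest-scoring feature.

```python
import pandas as pd

# Rank the eight features by the fitted booster's importance scores.
scores = pd.Series(clf.feature_importances_, index=X_train.columns)
print(scores.sort_values(ascending=False))

# Drop the weakest feature (the capacitor-value column in the paper's data)
# and retrain; repeat until the ranking stabilises.
weakest = scores.idxmin()
X_train_opt = X_train.drop(columns=[weakest])
X_test_opt = X_test.drop(columns=[weakest])
```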
2.4 Proposed Failure Detection and Classification System Description

The failure detection system is explained in detail in this section. Python is used to run the prediction algorithm. The failure signature data for prediction are extracted from the Circuit Under Test (CUT) via the data acquisition instruments. Meanwhile, LabVIEW handles the process of controlling the data extraction from the CUT, providing the data to the prediction model, and presenting the graphical user interface for the prediction results. LabVIEW is a common platform used in the automation of data extraction, providing efficient system analysis and easily accessible interactive tools in the form of a graphical user interface [23]. LabVIEW is a one-stop solution for data acquisition, instrumentation control, and automated testing systems
Fig. 1 Feature importance F-score assigned by XGBoost model
[24, 25]. This paper proposes the use of a special-purpose command-line application interface that supports Windows, command-line, and script-based applications. Python plays a vital role in implementing machine learning concepts thanks to its simplicity and vast community support; the various mathematical algorithms are implemented using the Scikit-learn module. Figure 2 shows the system framework of the DC-to-DC converter failure detection and classification. Initially, failure signatures such as the inductor current slope, output voltage, ESR value, and capacitor value are extracted from the CUT using the data acquisition system by probing the desired test points [4]. Figure 3 shows the online data acquisition system. The description of the system's hardware and software implementation is briefly explained in [5].
Fig. 2 Online failure detection and classification system framework
Fig. 3 Online data acquisition system
The system is a combination of hardware and software: the instruments that extract the data from the CUT act as the hardware, and here a digital oscilloscope and a multifunction meter are used to acquire the signature features, while LabVIEW acts as the software that controls the hardware. The software sends a set of commands to the instruments through several layers to perform the measurements; these layers process the commands at each stage. There are four layers through which the commands are passed back and forth between the hardware and the software. The first is the bus layer, which physically connects the instrument with the software via Ethernet, USB, etc. The second layer is the hardware abstraction layer; it acts as an instrument-independent communication driver that lets the instruments connect to the software irrespective of the communication protocol, using the Virtual Instrument Software Architecture (VISA). The third layer is the instrument driver layer, which deals with encapsulating and parsing the Standard Commands for Programmable Instruments (SCPI); these commands are a universal format for controlling and querying instruments remotely. The final layer is the application layer, which provides a graphical user interface for the user to control and monitor the application using LabVIEW [5]. The signature features are extracted online from the CUT by the instruments with LabVIEW via SCPI commands, and these data are pushed to the Python application to predict the failure and classify the fault.
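The paper drives this stack from LabVIEW; purely as an illustration of the SCPI query pattern used at the instrument driver layer, the same idea can be sketched in Python with PyVISA. The VISA resource address and the measurement command below are placeholders, not the authors' instrument setup.

```python
import pyvisa

rm = pyvisa.ResourceManager()
# Hypothetical LAN-connected oscilloscope address.
scope = rm.open_resource("TCPIP0::192.168.1.10::INSTR")
print(scope.query("*IDN?"))                      # standard SCPI identification query
# Placeholder measurement command; real instruments use their own SCPI subsets.
vpp = float(scope.query(":MEASure:VPP? CHANnel1"))
print("peak-to-peak voltage:", vpp)
scope.close()
```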
2.5 Model Training and Performance Analysis

The failure prediction for the DC-to-DC converter is implemented using a multiclass classification model. To perform a comparative analysis and find the best suitable model, a group of four multiclass classification models is chosen, namely Naïve Bayes, decision tree, random forest, and the XGBoost algorithm. To evaluate the performance of each model, the confusion matrix and ROC curve are used. To validate the significance of each feature for the target class, the feature importance technique of the XGBoost algorithm is applied and the feature with the least importance is discarded. The optimised dataset is then used to train each multiclass classification model in the group, and their performance is compared using the confusion matrix. Figure 4 illustrates the confusion matrix of each model, in which the decision tree scores the lowest accuracy of 95.5% and XGBoost scores the highest accuracy of 98.9%. Moreover, XGBoost is capable of handling small data samples with few dimensions, and it reduces the false-positive failure ratio, which makes it the preferred choice. Figure 5 illustrates the ROC and AUC curves, which show that the random forest and XGBoost models produce curves toward the top-left corner of the graph, making them well-fitting models for circuit fault classification. However, with a higher AUC score for each class, the XGBoost algorithm is chosen as the best prediction model. Table 2 shows the comparative accuracy scores of the models.
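A sketch of this comparative evaluation with scikit-learn and XGBoost is shown below; it reuses the train/test split from the earlier sketches and reports accuracy, the confusion matrix, and a one-vs-rest macro AUC. The hyperparameters are illustrative assumptions.

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

models = {
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "XGBoost": XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)
    print(name,
          "accuracy:", round(accuracy_score(y_test, pred), 4),
          "macro AUC:", round(roc_auc_score(y_test, proba, multi_class="ovr"), 4))
    print(confusion_matrix(y_test, pred))
```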
3 Conclusion

The failure detection for the DC-to-DC converter is implemented using a data-driven model approach, and a high accuracy of 98.9% is obtained in multiclass classification using the XGBoost ensemble algorithm. Random forest provides the nearest accuracy score; however, the advantage of the XGBoost algorithm in handling minimal features with few dimensions puts it a step ahead of random forest as far as this dataset is concerned. The model can also handle feature selection with its built-in feature importance attribute, which identifies the significance of each feature. In addition, a considerable reduction in false alarms is evident for the XGBoost algorithm from the AUC score. On the whole, the comparison shows that the ensemble learning-based approach provides a better accuracy score in multiclass classification than a single prediction model. The work can be extended further by optimizing the hyperparameters of the XGBoost model to increase the accuracy score.
Fig. 4 Confusion matrix a decision tree b Naïve Bayes c random forest d XGBoost
Fig. 5 ROC curve a decision tree b Naïve Bayes c random forest d XGBoost
Fig. 5 (continued)
Table 2 Comparative accuracy score in DC-to-DC converter failure classification

Algorithm             Accuracy score (%)
Decision tree         95.5
Naïve Bayes           97.1
Random forest         98.1
XGBoost (proposed)    98.9
References

1. Khan SS, Wen HA (2021) Comprehensive review of fault diagnosis and tolerant control in DC–DC converters for DC microgrids. IEEE Access 9:80100–80127
2. Elangovan D, Kumar GK (2020) A review on fault-diagnosis and fault-tolerance for DC–DC converters. IET Power Electron 13(1)
3. Su Q, Wang Z, Xu J, Li C, Li J (2022) Fault detection for DC–DC converters using adaptive parameter identification. J Franklin Inst 359(11):5778–5797
4. Aravind Balaji B, Sasikumar S, Rathy GA (2023) Failure identification of power converter circuit using LabVIEW myRIO. SSRG Int J Electr Electron Eng 10(1):106–116
5. Aravind Balaji B, Sasikumar S, Ramesh K (2021) Development of test automation framework for printed circuit board assembly. J Phys: Conf Ser 2070
6. Farjah E, Ghanbari T, Givi H (2016) Switch fault diagnosis and capacitor lifetime monitoring technique for DC–DC converters using a single sensor. IET Sci Meas Technol 513–527
7. Li P, Li X, Zeng T (2021) A fast and simple fault diagnosis method for interleaved DC–DC converters based on output voltage analysis. Electronics 10(12)
8. Xu L, Ma R, Xie R, Xu J, Huangfu Y, Gao F (2021) Open-circuit switch fault diagnosis and fault-tolerant control for output-series interleaved boost DC–DC converter. IEEE Trans Transp Electr 7(4):2054–2066
9. Laadjal K, Bento F, Cardoso AJM (2022) On-line diagnostics of electrolytic capacitors in fault-tolerant LED lighting systems. Electronics 11(9):1444
10. León-Ruiz Y, González-García M, Alvarez-Salas R, Cuevas-Tello J, Cárdenas V (2021) Fault diagnosis based on machine learning for the high frequency link of a grid-tied photovoltaic converter for a wide range of irradiance conditions. IEEE Access 9:151209–151220
11. Bindi M, Corti F, Aizenberg I, Grasso F, Lozito GM, Luchetta A, Piccirilli MC, Reatti A (2022) Machine learning-based monitoring of DC–DC converters in photovoltaic applications. Algorithms 15(3)
12. Sun Q (2022) Fault detection for power electronic converters based on continuous wavelet transform and convolution neural network. J Intell Fuzzy Syst 42(4):3537–3549
13. Kou L, Liu C, Cai G, Zhang Z (2020) Fault diagnosis for power electronics converters based on deep feedforward network and wavelet compression. Electr Power Syst Res 185
14. Zhang C, Ni J, Zhang X, Lei T (2021) Data driven remaining life prediction of electrolytic capacitor in DC/DC converter. J Phys: Conf Ser 1754(1)
15. Chen X, Yang X, Zhang Y (2022) Investigation on C and ESR estimation of DC-link capacitor in Maglev choppers using artificial neural network. Energies 15(22):8564
16. Chakraborty D, Elzarka H (2019) Early detection of faults in HVAC systems using an XGBoost model with a dynamic threshold. Energy Build 185:326–344
17. Yu Y, Jiang Y, Liu Y, Peng X (2020) Incipient fault diagnosis method for DC–DC converters based on sensitive fault features. IET Power Electron 13(19):4646–4658
18. Kapucu C, Cubukcu M (2021) A supervised ensemble learning method for fault diagnosis in photovoltaic strings. Energy 227
19. Chen H, Peng Y, Yang Q, Yan L (2020) Fault diagnosis of uninterruptible power system based on Gaussian mixed model and XGBoost. In: 15th international conference on computer science & education (ICCSE), Delft, Netherlands, pp 627–634
20. Sumathi P, Peter D (2019) Instrument control through GPIB-USB communication with LabVIEW. In: IEEE 28th international symposium on industrial electronics (ISIE), pp 1583–1588
21. Laadjal K, Bento F, Antonio J, Cardoso M (2022) On-line diagnostics of electrolytic capacitors in fault-tolerant LED lighting systems. Electronics 11(9):1444
22. Mienye ID, Sun Y (2022) A survey of ensemble learning: concepts, algorithms, applications, and prospects. IEEE Access 10:99129–99149
23. Guenounou A, Aillerie M, Mahrane A, Bouzaki M, Boulouma S, Charles JP (2021) Human home daily living activities recognition based on a LabVIEW implemented hidden Markov model. Multimed Tools Appl 80(16):24419–24435
24. Siddiqui A, Zia MYI, Otero P (2021) A universal machine-learning-based automated testing system for consumer electronic products. Electronics (Switzerland) 10(2):1–26
25. Yao K-C, Huang W-T, Wu C-C, Chen T-Y (2021) Establishing an AI model on data sensing and prediction for smart home environment control based on LabVIEW. Math Probl Eng 2021
Machine Learning-Based Stroke Disease Detection System Using Biosignals (ECG and PPG) S. Neha Reddy, Adla Neha, S. P. V. Subba Rao, and T. Ramaswamy
Abstract Strong primary prevention and early recognition of prognostic symptoms are critical for stroke, a disease that often results in death or severe disability; to treat ischemic or hemorrhagic strokes, thrombolytic or coagulant drugs must be delivered as quickly as feasible. The key to obtaining competent treatment from a specialist facility within the right time window is to watch for slowly developing stroke precursor symptoms, and these symptoms vary from person to person. However, prior research has mainly focused on the verification of individual stroke symptoms and on recommendations for care or rehabilitation plans after a stroke. Computed tomography (CT) and magnetic resonance imaging (MRI) image analysis procedures have been widely used in recent studies to identify strokes and to seek prognostic indicators; despite their usefulness, these methods have limitations such as long examination times and high examination costs. In this paper, we propose an artificial intelligence-based method for predicting stroke in the elderly using multi-modal biosignals, electrocardiogram (ECG) and photoplethysmography (PPG), acquired simultaneously. We designed and implemented an ensemble architecture that combines CNN and LSTM networks in order to predict stroke in patients while they are walking. In the proposed approach, which takes into account the practicality of elderly subjects wearing biosignal sensors, the biosignals were recorded while walking at a sampling rate of 1000 samples per second from the three electrodes of the ECG and the PPG sensor. The accuracy of the real-time predictions for elderly stroke patients was satisfactory.
S. Neha Reddy (B) · A. Neha · S. P. V. Subba Rao · T. Ramaswamy Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India e-mail: [email protected] S. P. V. Subba Rao e-mail: [email protected] T. Ramaswamy e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_36
Keywords Electrocardiogram (ECG) · Photoplethysmography (PPG) · Multi-modal biosignal · Real-time stroke prediction · Stroke disease analysis · Deep learning · Machine learning
1 Introduction

A stroke, which may be ischemic or hemorrhagic, occurs when a blood vessel ruptures or the blood path that supplies a region of the brain is blocked. It is a disorder of the nervous system caused by injury to part of the brain. Stroke is among the most dangerous conditions in modern medicine, since it can cause both severe physical and mental impairments, such as hemiparesis, speech impairment (aphasia), confusion, cognitive difficulty, reduced awareness, and weakness, in addition to fatalities in acute cases. According to the World Health Organization's (WHO) 2019 causes-of-death report, released in December 2020, the top ten causes of death accounted for 55% of the roughly 55.4 million officially recorded deaths in 2019. Cerebrovascular disease, the second leading cause of death, was responsible for about six million of them. According to the United Nations (UN), an ageing society is one where 7% of the population is older than 65; an aged society is one where 14% or more of the population is older than 65; and a super-aged society is one where 20% of the population is older than 65. To forecast future trends, it is necessary to understand the health issues that an ageing population faces. In 2013, Moody's, a global credit rating agency, carried out an ageing assessment and found that countries such as Japan, Germany, and Italy already had super-aged societies with more than 20% senior citizens; according to these studies, 34 countries will have become super-aged societies by 2030. The prognosis and the patient's health status are primarily influenced by the patient's age and the location of the stroke. According to a previous study on the frequency of strokes, 66% of all strokes occur in people aged 65 and older. Under these circumstances, stroke incidence and mortality are expected to become major social and financial problems [1–5]. The diagnosis of a stroke, also known as cerebrovascular illness, is made using the neurological diagnostic and severity data of a clinical team [6–9], as shown in Fig. 1. Though brain MRI and CT are the most well-known imaging procedures used to detect strokes, various studies have shown that biosignals such as the ECG and skin-surface pulse waves can also be used to diagnose and treat stroke-related problems [10–12]. Additionally, single-photon emission computed tomography (SPECT), echocardiography, cerebral angiography, and ultrasonography are used to identify the primary causes of stroke.
Fig. 1 Example figure
However, imaging approaches such as CT and MRI have limitations in the diagnosis and prognosis phase, owing to factors such as sensitivity reactions to contrast agents, limited accessibility, and the anxiety caused by spending long periods alone inside the scanner. Clinical staff remain indispensable, since professional judgement is required and test findings can contain errors.
2 Literature Review

Autonomic dysfunction in acute ischemic stroke: an underexplored therapeutic area? Impaired autonomic function, typically showing a shift toward sympathetic activity, is common in people with acute ischemic stroke. This review describes how autonomic dysfunction can be quantified in stroke patients. It examines the connection between ischemic stroke-related autonomic imbalance and factors associated with additional adverse outcomes, such as cognitive problems, changes in blood pressure variability, hyperglycaemia, immunological depression, sleep-disordered breathing, thrombotic events, and malignant oedema. Despite evidence suggesting that a specific cortical region plays a significant role in sympathovagal imbalance, very little is known about the key details of this or other brain regions that might be relevant. However, because sympathetic overactivity is a negative prognostic factor in ischemic stroke, the research suggests that parasympathetic augmentation or drugs that lower sympathetic activity are likely to be useful.
Diagnosis and management of acute ischemic stroke: speed is critical. Stroke is a leading cause of death worldwide. (1) The estimated 62,000 strokes that occur annually in Canada have a large impact, and incidence rates rise with ageing. By the age of 80, an overt stroke is a likely event over a lifetime, whereas silent stroke, often known as covert stroke, is anticipated to be far more frequent, for example closer to one in a hundred. Stroke has enormous personal and financial repercussions for society and costs Canada about $3 billion in annual indirect costs. (2) There are several similarities between acute stroke and acute coronary disease. The authors compare the management of acute ischemic stroke with that of acute coronary disease to illustrate how quickly clearing a blocked vessel can restore normal blood flow. In this review, an assessment of the available evidence takes the place of formal guidelines [13–15] (Box 1).
Long sleep duration and risk of ischemic stroke and hemorrhagic stroke: the Kailuan prospective study. The purpose of this investigation was to examine the association between sleep duration and ischemic and hemorrhagic stroke in a community population. A total of 95,023 Chinese participants free of stroke at baseline were enrolled in the study between 2006 and 2007. Hazard ratios (HRs) and confidence intervals (CIs) for stroke were estimated using Cox proportional hazards models. Over a mean follow-up of 7.9 years, about 3135 participants experienced a stroke, comprising 631 hemorrhagic and 2504 ischemic strokes. The multivariable-adjusted hazard ratio (95% CI) for total stroke was 1.29 (1.01–1.64) for participants who reported sleeping more than 8 h per night compared with 6 to 8 h. Long sleep duration also remained inextricably linked to stroke as a whole (HR, 1.47; 95% CI, 1.05–2.07). Sleeping for more than 8 h per night was linked to hemorrhagic stroke only in women (HR, 3.58; 95% CI, 1.28–10.06), compared with those sleeping 6 to 8 h. According to this study, long sleep duration may be a significant indicator of stroke, particularly in its early stages, and appears to increase the risk of hemorrhagic stroke in women.
devastating effects since they control fighting over food in personal and financial activities. We propose a useful habit to handle learning and expecting the seriousness about stroke in more established developed-boosts north about 65 using the National Institutes about Health Stroke Scale (NIHSS). Additionally, we employ the C4.5 resolution forest estimation, a design for gathering data and gauging the proximity about PC-located information. ML predictions known as C4.5 choice trees provide more comprehensive assessments about the underlying principles relating to syntax understanding and killing. To sum up, this work demonstrates that the C4.5 choice timber approach for estimating stroke gravity, forecasting stroke risk, and obtaining additional NIHSS advantages combines less advantageous outcomes. The planned model only handles 13 about the 18 stroke scale parts, including age, all the while original scheme alteration, in order to present help support, that is to say two together faster and more precisely. The approach, which uses the C4.5 choice sapling computation, has an overall precision about 91.11%, improving administration and shortening the patient’s NIH stroke scale amount period. Effective anti-ageing strategies in an era about super-ageing Countries with their own governments around the world are focusing on challenges related to a maturing population as a result about declines in both the rate about elation and the average human lifespan. A group about five Koreans and 65 other people try to classify a well-informed organisation like the United Nations. Men with more experience are dwarfed by women because women live longer. In order for women to reach old age in a healthy way, this study aims to provide practical methods for employing isoflavones, which act artificially as oestrogen and can be used to develop potent anti-ageing medications [5, 6, 13, 14].
3 Methodology

Early identification of prognostic markers and vigorous primary prevention are essential for stroke, because it usually results in death or serious disability; thrombolytic or coagulant therapy for ischemic or hemorrhagic strokes should be given as quickly as feasible. The most pivotal step in obtaining skilled treatment from a specialist facility within the right treatment window is recognising stroke precursor symptoms as they develop, and these symptoms appear differently in each individual. Nevertheless, past research has concentrated on making treatment or rehabilitation plans for after a stroke rather than on overt stroke indicators. To detect strokes and derive prognostic information, image analysis procedures such as computed tomography (CT) and magnetic resonance imaging (MRI) have been widely used in recent assessments, as shown in Fig. 2. These methods have limitations, for example lengthy examination times and high examination costs, and they are not well suited to immediate or continuous use.
Fig. 2 System architecture
Disadvantages
1. Existing imaging-based approaches are not only difficult to apply immediately, but they also suffer from long examination times and high costs.

We provide an ML-based technique for predicting stroke prognosis in the elderly using dynamically acquired multi-modal electrocardiogram (ECG) and photoplethysmography (PPG) biosignals. To anticipate stroke disease while the subject is walking, we built and implemented an ensemble architecture that combines CNN and LSTM. The proposed approach considers the practicality of elderly subjects wearing biometric sensors: the biosignals were recorded while walking at a sampling rate of 1000 samples per second from the three electrodes of the ECG and the PPG sensor.

Advantages
1. The real-time assessments of senior stroke patients performed well in terms of execution and accuracy.
2. It has been shown that the prognosis of stroke patients can be predicted with over 90% accuracy using only an ECG and PPG collected while walking.

Modules
To complete the project, we provide the modules listed below.
• Data ingestion: this module is used to load data into the system.
• Data processing: this module handles pre-processing of the data.
• Data splitting: the data is divided into training and test sets by this module.
• Model training: the models are evaluated by the accuracy of the trained algorithms.
• User authentication: users must first register and log in to use the system.
• Prediction: this module provides the forecast results.
• Visualisation: a final visualisation of the prognosis is displayed.
4 Implementation
Algorithms
Random Forest: Random forest is a widely used ML algorithm for classification and regression problems. It builds many decision trees on different subsamples of the data and combines them, using the majority vote for classification and the average prediction for regression.
Decision Tree: Decision trees use a variety of criteria to decide how to split a node into two or more child nodes. The homogeneity of the resulting child nodes increases with each split; in general, the purity of a node increases with respect to the target variable.
Naive Bayes: The Naive Bayes classifier is a probabilistic classification method. It assumes conditional independence between features, which rarely holds exactly in practice; for this reason the models are regarded as "naive".
AdaBoost: An AdaBoost classifier first fits a base classifier on the original dataset. It then fits additional copies of the classifier on the same dataset, re-weighting incorrectly classified samples at each iteration so that subsequent classifiers focus on the difficult cases.
Logistic Regression: Logistic regression is a statistical method that estimates a binary outcome from labelled data points. A logistic regression model uses the relationship between one or more independent variables to predict a dependent variable.
MLP-ANN: A multi-layer perceptron (MLP) is a fully connected feedforward artificial neural network (ANN). The term "MLP" is used loosely: it sometimes refers to any feedforward ANN and sometimes strictly to networks composed of multiple layers of perceptrons with threshold activations. Multi-layer perceptrons with a single hidden layer are sometimes called "vanilla" networks.
SVM: The support vector machine (SVM) is a supervised machine learning approach that can address both classification and regression problems, although it is mostly used for classification. The SVM seeks a hyperplane in an N-dimensional space that clearly separates the data points into classes.
Voting classifier: Voting classifiers are estimators that combine several base models and predict an outcome by aggregating the outputs of those models, for example by majority vote.
Tree BFS: The breadth-first search (BFS) strategy searches a tree or graph data structure for nodes that satisfy given requirements. Before moving on to nodes at the next depth level, it examines every node at the current level, starting from the root of the tree or graph.
Bayesian Network: A Bayesian network is a kind of probabilistic graphical model that can be used to build models from data or from expert knowledge. Typical applications include prediction, anomaly detection, diagnostics, automated insight, reasoning, time-series prediction, and decision-making under uncertainty.
CNN: A CNN is a kind of deep learning network that is frequently used for image recognition and related tasks. Although deep learning employs a variety of neural network architectures, CNNs remain the most frequently used architecture for object detection and recognition.
LSTM: Long short-term memory (LSTM) is a well-known ANN architecture used in deep learning and artificial intelligence. Unlike typical feedforward models, the LSTM contains recurrent connections, so recurrent neural networks (RNNs) of this kind can process entire data sequences (such as speech or video) as well as individual data points (such as images).
BiLSTM: BiLSTM stands for bidirectional long short-term memory. A standard LSTM processes a sequence without taking future context into account, whereas a BiLSTM processes the sequence both forward and backward using two hidden layers based on LSTM.
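To make the CNN-LSTM ensemble concrete, a minimal sketch of a classifier for windowed ECG/PPG segments is given below. The window length, layer sizes, and two-class output are illustrative assumptions, not the authors' exact configuration, and the random arrays stand in for real biosignal data.

```python
# Minimal CNN-LSTM sketch for two-channel (ECG + PPG) biosignal windows.
# Window length, layer sizes, and the binary output are illustrative assumptions.
import numpy as np
from tensorflow.keras import layers, models

WINDOW = 1000   # samples per segment (assumed 1 s at 1000 Hz)
CHANNELS = 2    # ECG and PPG

def build_cnn_lstm():
    model = models.Sequential([
        layers.Input(shape=(WINDOW, CHANNELS)),
        layers.Conv1D(32, kernel_size=7, activation="relu"),  # local waveform features
        layers.MaxPooling1D(4),
        layers.Conv1D(64, kernel_size=5, activation="relu"),
        layers.MaxPooling1D(4),
        layers.LSTM(64),                                       # temporal dependencies
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),                 # stroke vs. normal
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

if __name__ == "__main__":
    X = np.random.randn(128, WINDOW, CHANNELS).astype("float32")  # placeholder signals
    y = np.random.randint(0, 2, size=(128,))                      # placeholder labels
    build_cnn_lstm().fit(X, y, epochs=1, batch_size=32, verbose=0)
```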
5 Experimental Results
Figures 3, 4, 5, 6, 7, and 8 show the implementation of the machine learning-based stroke disease detection system using ECG and PPG biosignals.
Fig. 3 Home screen
Fig. 4 User registration
Fig. 5 User login
Fig. 6 Main screen
Fig. 7 User input
Fig. 8 Prediction result
6 Conclusion
We offer a method for the semantic analysis of diseases in the elderly by using a large number of spontaneous ECG and PPG signals recorded while the elderly were walking during their everyday activities. The suggested method may be able to recognise and assess prognostic signs of an elderly stroke condition by continually capturing a wide range of ECG and PPG biosignals. By segmenting the signal waveform using a large amount of biosignal data, an ML-based prediction model evaluation was performed; this process produced precise prediction results and semantic interpretations. This investigation demonstrated that the prognostic properties of stroke patients may be predicted with more than 90% accuracy using only the features obtained when an ECG and PPG are recorded while the patient is moving. We explained how estimation over 10-fold cross-validation datasets and stroke separation allowed us to reach accuracies of 91.56% for the C4.5 decision tree, 97.51% for the random forest, and 99.15% for the CNN-LSTM deep learning model. The system presented in this case study can predict prognostic properties and the cause of stroke by analysing ECG and PPG at a reasonable cost and with minimal discomfort throughout daily life. Various routine biosignal data have a high probability of furnishing stroke patients or clinical experts with verified translational information. The findings suggest that this advance may be efficiently applied in therapeutic settings, such as lowering the risk of a stroke and preventing emergencies through regular observation. In addition to information from MRI images and electronic medical records (EMRs), a range of biosignals, including EEG, EMG, foot pressure, and motion data, will be considered while our estimates and prospective future sources of stroke deterioration are being identified.
References
1. De Raedt S, De Vos A, De Keyser J (2015) Autonomic dysfunction in acute ischemic stroke: an underexplored therapeutic area? J Neurol Sci 348(1–2):24–34
2. Warlow CP (1998) Epidemiology of stroke. Lancet 352:1–4
3. Seo K-D, Kang MJ, Kim GS, Lee JH, Suh SH, Lee K-Y (2020) National trends in clinical outcomes of endovascular therapy for ischemic stroke in South Korea between 2008 and 2016. J Stroke 22(3):412–415
4. Musuka TD, Wilton SB, Traboulsi M, Hill MD (2015) Diagnosis and management of acute ischemic stroke: speed is critical. Can Med Assoc J 187(12):887–893
5. Song Q, Liu X, Zhou W, Wang L, Zheng X, Wang X, Wu S (2016) Long sleep duration and risk of ischemic stroke and hemorrhagic stroke: the Kailuan prospective study. Sci Rep 6(1):1–9
6. Yu J, Park S, Lee H, Pyo CS, Lee YS (2020) An elderly health monitoring system using machine learning and in-depth analysis techniques on the NIH stroke scale. Mathematics 8(7):1–16
7. Meyer BC, Lyden PD (2009) The modified National Institutes of Health stroke scale: its time has come. Int J Stroke 4(4):267–273
8. Schlegel D, Kolb SJ, Luciano JM, Tovar JM, Cucchiara BL, Liebeskind DS, Kasner SE (2003) Utility of the NIH stroke scale as a predictor of hospital disposition. Stroke 34(1):134–137
9. Yu J, Kim D, Park H, Chon S, Cho KH, Kim S, Yu S, Park S, Hong S (2019) Semantic analysis of NIH stroke scale using machine learning techniques. In: Proceedings—PlatCon, Jeju, South Korea, pp 82–86
10. Yu J, Park S, Kwon SH, Ho CMB, Pyo CS, Lee H (2020) AI-based stroke disease prediction system using real-time electromyography signals. Appl Sci 10(19):1–19
11. Choi YA, Park S, Jun JA, Ho CMB, Pyo CS, Lee H, Yu J (2021) Machine-learning-based elderly stroke monitoring system using electroencephalography vital signals. Appl Sci 11(4):1–18
12. Lee M, Ryu J, Kim D (2020) Automated epileptic seizure waveform detection method based on the feature of the mean slope of wavelet coefficient counts using a hidden Markov model and EEG signals. ETRI J 42(2):217–229
13. World Health Organization. The top 10 causes of death [Online]. Available at: https://www.who.int/newsroom/fact-sheets/detail/the-top-10-causes-of-death. Accessed 22 Apr 2022
14. UNDESA World Social Report 2020 [Online]. Available at: https://www.un.org/development/desa/dspd/world-socialreport/2020-2.html. Accessed 22 Apr 2022
15. Park S, Yang MJ, Ha SN, Lee JS (2014) Effective anti-aging strategies in an era of super-aging. J Menopausal Med 20(3):85–89
Secure Trust-Based Attribute Access Control Mechanism Using FK-MFCMC and MOEHO-XGBOOST Techniques Padala Vanitha and Banda Srikanth
Abstract The Internet of things (IoT) possesses various devices, which are switching data continuously and are acquired throughout via lossy networks. This promotes the necessity for a lightweight, flexible along with adaptive access control (AC) method to handle the insidious behavior of such global ecological units and also guarantees the reliability betwixt the trusted devices. To provide users with adaptive authentication centered on dynamic trust estimation, a secure trust along with attribute-based AC (ABAC) approach is invented in this research. First, Trust Evaluation (TE) is executed utilizing Fisher score Kernel-based Minkowski fuzzy C-Means Clustering (FK-MFCMC) centered on trusted data balance by means of instructive features being extricated. A successful association is kept up by TE betwixt the nodes installed in the network. And, it guarantees that the entire nodes work in a reliable form. Subsequently, to ingress the request, the TE makes a decision utilizing the multi-objective elephant herding optimization based on an extreme gradient boosting (MOEHO-XGBoost) methodology. In this methodology, the ABAC request is converted into a permission decision vector. Then, it is converted into a binary classification problem that checks whether to permit or reject access. On the whole, the proposed methodology attains higher trust accuracy and TE for several nodes in IoT together with obtains a higher security level in correlation with the prevailing novel methodologies. Keywords Internet of things (IoT) · Trust · Attributes based access control · Security · Fisher score Kernel-based Minkowski Fuzzy C-Means Clustering (FK-MFCMC) · Multi-objective elephant herding optimization based on an extreme gradient boosting (MOEHO-XGBOOST)
P. Vanitha (B) Department of ECE, Malla Reddy College of Engineering and Technology, Hyderabad, Telangana State 500100, India e-mail: [email protected] B. Srikanth University College of Engineering, Kakatiya University, Kothagudem, Telangana State 507118, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_37
1 Introduction IoT is a developing technology, which possesses an integrated conventional network mechanism, intelligence machine technique, and wireless sensor network along with it switch over information as of objects to objects [1]. In recent days, IoT has gained attention that highly enhances the informatization together with computerization in industry, transportation, agriculture, et cetera [2]; in addition, it progresses the network scalability with expediency. Furthermore, in certain appliances, the development possesses highly secured necessities, and the IoT’s node could actively alter and does not trust others [3]. Hence, in IoT devices, to conquer the complication of security together with controllability, a highly secured AC method should be proffered [4]. In the security world, AC has a major role, which aims to protect digital along with physical accesses by restricting and imposing the control of access to what along with in which criteria [5]. AC is a crucial data security mechanism that ensures the data’s privacy together with and reliability [6, 7]. A solution is provided by AC to discard unauthorized access meant for multimedia appliances in IoT devices. Nevertheless, mostly the solutions provided in the research regard the IoT as a single block, which is mostly featured by restricted storage along with processing capabilities [8, 9]. Conversely, unremitting growth of new distributed together with open computing architectures like big data in conjunction with the IoT has augmented the enhancement in numerous entities together with the policy scale in the prevailing AC devices, in that way decreasing their proficiency [10, 11]. Additionally, the prevailing AC permission decision technology openly acquired the user’s AC policy data that involves the threat of private policy data revelation [12, 13]. The research has invented a secured trust and ABAC in IoT utilizing FK-MFCM and MOEHO-XGBOOST methodologies to conquer the aforementioned challenges. The paper’s remaining part is structured as, the related works centered on several access methodologies are reviewed in Sect. 2, the proposed methodology is explicated in Sect. 3, the performance evaluation is displayed in Sect. 4, and lastly, the work is winded up in Sect. 5.
2 Literature Survey Ding et al. [14] developed an ABAC methodology meant for IoT devices that rationalized access management. In this, to state the attribute distribution, the blockchain mechanism is utilized to avert single-point failure along with data tampering. The AC procedure had progressed to gratify the requirements for higher efficiency together with lightweight computation for IoT systems. Security along with performance evaluation displayed that the process could efficiently defy numerous attacks; in addition, effectively conducted in IoT devices, however defenseless to ambiguous attacks.
Zhang et al. [15] produced an ABAC approach that offered flexible, decentralized, along with fine-grained authentication meant for IoT devices. To provide genuine along with reliable qualifications, the blockchain mechanism is utilized. Significantly, a verifiable collaboration methodology was structured to gratify the desires of controlled access authorization in urgent situations. Authority nodes were designed to perform computation tasks, in addition, to communicate with the blockchain. The security evaluation displayed that the methodology could assure the authorized access’s security. However, the possibilities of the domestic attack were high. Khilar et al. [16] described a TE strategy regarding the machine learning (ML) technique that forecasted the users’ trust values along with resources. The ML methodologies like K-nearest neighbor (KNN), decision tree (DT), logistic regression (LR), together with Naive Bayes were deemed as the significant techniques to appraise the trust control system in the approach being developed. It was executed in the Jupyter Notebook simulator equipment; in addition, discovered better outcomes regarding the prediction time, efficiency, along with error rate. However, the adversarial attack was higher. Bernabe et al. [17] produced a trust-based AC mechanism intended for IoT (TACIoT), which offered an end-to-end conjunction with a reliable security system meant for IoT devices, regarding a lightweight authentication system accompanied by a state-of-the-art trust model that had been structured for IoT environments. TACIoT was an extended conventional AC procedure that regarded trust values that were centered on security considerations, reputation, devices’ social relationships, along with Quality of Service (QoS). TACIoT had been executed as well as analyzed successfully in a real testbed for constrained together with non-constrained IoT systems. However, the methodology was not capable to deal with distributed data paradigm.
3 Proposed Secure Trust-Based Attribute Access Control in IoT
The emerging IoT technology offers industrial units numerous significant solutions, such as joint-venture virtual production mechanisms. Nevertheless, the wide-ranging association of general devices such as smart homes and industrial equipment with corporate systems in the IoT leaves the corresponding network vulnerable to cyber-threats. The traditional security scheme is unsuccessful in preventing cyber-threats against IoT networks owing to numerous proprietary multilevel prototypes, restricted upgrade opportunities, an extremely large trust boundary, heterogeneous communication infrastructures, and a poor AC permission decision methodology. A secured trust-based ABAC in IoT utilizing the FK-MFCMC and MOEHO-XGBOOST methodologies is established in this research to conquer the aforementioned issues, as displayed in Fig. 1.
Fig. 1 Proposed secure trust-based attribute access control in IoT
3.1 Authentication Module
The entry of intruders into the IoT is prevented by a node user authorisation service utilising an identity authentication methodology. The user is registered with the IoT by $P_{u_i}^{n}$ together with the data regarding the IoT platforms, providing personal details ($IoT_i^{n}$). Once the information is entered, the user authentication is provided by the system by creating a user ID ($US_i^{ID}$), a password ($Pw_i^{n}$), and a captcha ($CA_i^{n}$):

$Y(AU_i) = \left| US_i^{ID} \,\|\, Pw_i^{n} \,\|\, CA_i^{n} \right|.$

If the user authentication is successful, that is, if $Y(AU_i) = 1$, then the system forwards an authentication credential; otherwise it suggests a password change. In the IoT device, the shared authentication is completed by the identity authentication with the single-sign-in methodology.
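The registration and verification step can be illustrated with a short sketch. The concatenation mirrors the credential expression above, while the hashing step, field names, and sample values are assumptions added for the example rather than part of the proposed scheme.

```python
# Illustrative check of the credential concatenation in the authentication module.
# A real system would salt and hash the password; names here are placeholders.
import hashlib

def register(user_id: str, password: str, captcha: str) -> str:
    credential = f"{user_id}|{password}|{captcha}"          # Y(AU_i) = US_ID || Pw || CA
    return hashlib.sha256(credential.encode()).hexdigest()

def authenticate(stored: str, user_id: str, password: str, captcha: str) -> bool:
    return register(user_id, password, captcha) == stored   # Y(AU_i) = 1 on success

token = register("node_17", "s3cret", "X4b9")
print(authenticate(token, "node_17", "s3cret", "X4b9"))     # True
```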
3.2 Trust Evaluation Module The scheme assists the trust attribute, which handles the node user TE, together with updated as well as altered trust level automatically in accordance with the result evaluation following every single transaction. To retain secure data transmission, the model utilizes trust-based access management in IoT. The TE conquers domestic attacks that enter as of the nodes with legal identification like Denial of Service (DoS), impersonation, modification, fabrication, inoculation of a vast range of unnecessary packets, et cetera. The domestic attacks’ intention is to damage the network tools in
conjunction with corrupt the data, which produces an attack node that is varied as of other nodes of the anomalous action. Consequently, TE centered on node behavior identification is highly significant to network security. Initially, the TE is conducted by the balancing of the trust data, then the extrication of trust attributes, along with FK-MFCMC for TE.
3.2.1 Trust Data Balancing
The actual trust data of the nodes may form an unbalanced database, meaning that there can be a large difference between the number of trusted nodes and the number of non-trusted nodes. For instance, the ratio of trusted to non-trusted nodes may be approximately 16:1, and on such unbalanced databases the model's performance deteriorates. Consequently, to create balanced databases, the random over-sampler (ROS) methodology is utilised in this work. It is computed as

$T(AU_i) = I(k : [M, m]),$  (1)

wherein the dataset containing the collection of trust records of the IoT nodes is represented as $T$, the ROS is illustrated as $I$, the sampling strategy is denoted as $k$, and the majority and minority sampling strategies are stated as $M$ and $m$, respectively.
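As a rough illustration of the balancing step in Eq. (1), the snippet below over-samples the minority (non-trusted) class with imblearn's RandomOverSampler; the feature names, class ratio, and random values are placeholders, not the authors' dataset.

```python
# Illustrative random over-sampling of an imbalanced trust dataset (Eq. 1).
# Column names and the roughly 16:1 imbalance are placeholders.
import numpy as np
import pandas as pd
from imblearn.over_sampling import RandomOverSampler

rng = np.random.default_rng(0)
n_trusted, n_untrusted = 1600, 100                   # roughly the 16:1 ratio noted above
X = pd.DataFrame({
    "forwarding_capacity": rng.random(n_trusted + n_untrusted),
    "repetition_rate": rng.random(n_trusted + n_untrusted),
})
y = np.array([1] * n_trusted + [0] * n_untrusted)    # 1 = trusted, 0 = non-trusted

ros = RandomOverSampler(sampling_strategy="minority", random_state=0)
X_bal, y_bal = ros.fit_resample(X, y)
print(np.bincount(y_bal))                            # both classes now have 1600 samples
```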
3.2.2 Trust Attributes' Extraction
The trust attributes capture the aspects of direct trust evaluation, which provide awareness about a trustee prior to a communication. They give a quantitative evaluation that ranges over certain numerical values. Some knowledge-based attributes extracted between the user and the nodes are: (i) the packet forwarding capacity ($F_i$), (ii) the repetition rate ($R_i$), (iii) the consistency of the packet contents ($C_i$), (iv) the delay ($D_i$), and (v) the integrity ($I_i$), et cetera:

$E(T(AU_i)) = \{E_1(F_i), E_2(R_i), E_3(C_i), E_4(D_i), E_5(I_i), \ldots, E_n(F_i)\},$  (2)

wherein the extracted features up to the $n$th term are indicated as $E_1, E_2, E_3, E_4, E_5, \ldots, E_n$.

3.2.3 Evaluation of Trust
The features being extricated are inputted to the FK-MFCMC. The previous FCMC methodologies are not effective toward noise together with it which does not retain any connection with the earlier iteration values. The methodology has utilized a Fisher
score kernel amalgamated with FCMC to resolve this problem. The step-by-step procedure of the FK-MFCMC is as follows:

Step 1: Consider two clusters into which the data are to be separated. The two clusters $\Theta_1$ and $\Theta_2$ are initialised randomly.

Step 2: The cluster centroid is computed as

$\Phi_{ij} = \left( \sum_{k=1}^{n} \forall_{ik}^{a} E_k \right) \Big/ \sum_{k=1}^{n} \forall_{ik}^{a},$  (3)

where the Gaussian membership function is calculated as

$\forall_{ik}(E, \mu, \sigma, a, k) = \exp\!\left[ -\frac{1}{2} \left[ \frac{E_k - \mu}{\sigma} \right]^{a} \right]^{a},$  (4)

where the mean and standard deviation are illustrated as $\mu$ and $\sigma$, the fuzzification factor is specified as $a$, and the kernel factor is signified as $k$.

Step 3: Next, the distance of every data point from the centroids is analysed utilising the Minkowski distance:

$d_{ij} = \sqrt[\lambda]{\sum_{k=1}^{n} \left| \Phi_{ik} - E_{ik} \right|^{\lambda}}.$  (5)

Step 4: Using the membership values, the clusters are updated with the Fisher score kernel, which is expressed as

$k(t + 1) = \nabla_{\theta} \log P(E \mid \theta),$  (6)

where the set of parameters is denoted as $\theta$ and the log-likelihood of the probabilistic model is represented as $\log P(E \mid \theta)$.

Step 5: The stopping condition is examined after updating the cluster centres and the membership degrees. The cluster centres are retained if the stopping condition is fulfilled; otherwise, the iteration is repeated from Step 2.

Step 6: The best cluster centre provides the final solution. Clustering with the FK-MFCMC overcomes the drawbacks of the prevailing KFCMC.

Step 7: Lastly, when the trust of a node resource is tested against the cluster centres, the proposed methodology categorises the input as a trusted or a non-trusted node. If the node is detected as non-trusted, it is forwarded to the log files; otherwise, the access request is passed to the access evaluation.
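A minimal sketch of the clustering loop is given below. It implements a generic fuzzy c-means iteration with the Minkowski distance of Eq. (5) and a centroid update in the spirit of Eq. (3); the Gaussian membership and Fisher-score kernel updates of Eqs. (4) and (6) are not reproduced, and the fuzzifier, λ, and the random trust features are assumptions.

```python
# Generic fuzzy c-means with a Minkowski distance (not the exact FK-MFCMC rules).
import numpy as np

def minkowski(a, b, lam=3.0):
    return np.power(np.sum(np.abs(a - b) ** lam, axis=-1), 1.0 / lam)

def fuzzy_c_means(E, c=2, m=2.0, lam=3.0, iters=100, tol=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=E.shape[0])       # random membership matrix
    for _ in range(iters):
        centers = (U.T ** m @ E) / (U.T ** m).sum(axis=1, keepdims=True)   # centroid update
        d = np.stack([minkowski(E, ctr, lam) for ctr in centers], axis=1)  # Minkowski distances
        d = np.fmax(d, 1e-12)
        U_new = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1)), axis=2)
        if np.abs(U_new - U).max() < tol:                # stopping condition (Step 5)
            U = U_new
            break
        U = U_new
    return centers, U

if __name__ == "__main__":
    E = np.random.default_rng(1).random((200, 5))        # placeholder trust features
    centers, U = fuzzy_c_means(E)
    labels = U.argmax(axis=1)                            # trusted vs. non-trusted cluster
```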
3.3 Access Evaluation Module The AC permission decision engine model’s design is centered on the XGBOOST methodology. The model contains the policy balancing, model training, feature extraction (FE), together with the testing module.
3.3.1 Policy Balancing
Policy balancing addresses the case where the policy database is unbalanced, that is, where there is a large difference between the number of permitted policies and the number of rejected policies in the policy set. Such imbalance can lead to inaccurate decision-making against attacks, which degrades the model's performance. The work balances the policy data utilising ROS to avert the imbalanced database. The policy balancing is calculated as

$A = \Psi\left( Y : \left[ M^{+}, M^{-} \right] \right),$  (7)

wherein the dataset containing the compilation of access policies is specified as $A$, the ROS is signified as $\Psi$, the sampling strategy is indicated as $Y$, and the majority and minority sampling strategies are stated as $M^{+}$ and $M^{-}$.
3.3.2 Feature Extraction
Feature extraction (FE) derives the relevant aspects from the data, which helps to mitigate the difficulty of decision-making. The data are provided as a quantitative analysis that ranges over certain numerical values. Some knowledge-based features extracted are: (1) the subject attribute ($SA_j$), (2) the resource attribute ($RA_j$), (3) the operation attribute ($OA_j$), (4) the environment attribute ($EA_j$), (5) the interaction frequency ($IF_j$), and (6) the transmission time ($TT_j$), et cetera:

$X(A) = \{X_1(SA_j), X_2(RA_j), X_3(OA_j), X_4(EA_j), X_5(IF_j), X_6(TT_j), \ldots, X_n(F_j)\},$  (8)

wherein the extracted features up to the $N$th term are signified as $X_1, X_2, X_3, X_4, X_5, X_6, \ldots, X_N$.

3.3.3 Decision-Making
Decision-making centered on the policy provides an accurate access response to the user by preventing illegal users or else malicious attacks. The work has produced a MOEHO-XGBOOST methodology for decision-making, which utilizes a gradient
boosting approach on a known database and then categorises the data accordingly. The aim of the proposed methodology is to fit the residual, where the residual is defined as the difference between the real policy value and the predicted policy value. It is formulated as

$\Gamma(X, w) = \sum_{j=0}^{J} \beta_k h_k\left( X; w_j \right) = \sum_{j=0}^{J} \Gamma_k\left( X; w_j \right),$  (9)

where the input data is specified as $X$, the attained model is signified as $\Gamma(X, w)$, a single tree is represented as $h_k$, the tree's parameters are denoted as $w$, and the tree's weight is indicated as $\beta_k$. The optimal model $\Gamma(X, w)$ is attained by reducing the loss function, which is proffered as

$\Gamma^{*} = \arg\min_{\Gamma} \sum_{j=0}^{J} T\left( O_j, \Gamma\left( X_j; w_k \right) \right),$  (10)

$e(l) = \sum_{j} l\left( \hat{O}_j, O_j \right) + \sum_{j} N\left( \Gamma_j \right), \qquad
N\left( \Gamma_j \right) = \chi \lambda_{\text{leaf}} + \frac{1}{2} \wp \left\| w_j \right\|^{2}, \qquad
l\left( \hat{O}_j, O_j \right) = \left( \hat{O}_j - O_j \right)^{2},$  (11)

where the number of leaf nodes in the DT is specified as $\lambda_{\text{leaf}}$, the real value is signified as $O_j$, the predicted value is denoted as $\hat{O}_j$, and the parameters are represented as $\chi$ and $\lambda$. The research utilises multi-objective elephant herding optimisation to accomplish an accurate prediction of the decision. It helps to recognise the finest parameter values to attain an accurate solution. The parameters $\lambda_{\text{leaf}}$, $\chi$, $\lambda$, $\beta_k$, and $w$ are chosen as the targets of the optimisation algorithm:

$\zeta_{\text{objective}} = \zeta_1(\lambda_{\text{leaf}}) + \zeta_2(\chi) + \zeta_3(\lambda) + \zeta_4(\beta_k) + \zeta_5(w).$  (12)

The optimal values of the parameters are acquired via the matriarch elephant, which leads the elephant family in every clan and is regarded as attaining the finest solution. Accordingly, every elephant's position is altered in consequence of the alteration in the matriarch elephant's position. The elephant's position is computed as

$\zeta_{\text{new},ci,j} = \zeta_{ci,j} + l \times \left( \zeta_{\text{best},ci} - \zeta_{ci,j} \right) \times a,$  (13)

where $\zeta_{\text{new},ci,j}$ and $\zeta_{ci,j}$ specify the updated and the older position of elephant $j$ in clan $ci$, $\zeta_{\text{best},ci}$ signifies the matriarch elephant of the clan, which provides the best value, and $l$ and $a$ are values between 0 and 1 that act as scaling factors. The best elephant in the clan is measured as

$\zeta_{\text{new},ci,j} = \alpha \times \zeta_{\text{center},ci},$  (14)

where $\alpha \in [0, 1]$ signifies a factor that estimates the effect of $\zeta_{\text{center},ci}$ on $\zeta_{\text{new},ci,j}$, $\zeta_{\text{new},ci,j}$ illustrates the new individual, and $\zeta_{\text{center},ci}$ is the centre individual of clan $ci$. It is computed for the $d$th dimension utilising

$\zeta_{\text{center},ci,d} = \frac{1}{n_{ci}} \times \sum_{i=1}^{n_{ci}} \zeta_{ci,j,d},$  (15)

where the number of elephants in the clan is specified as $n_{ci}$ with $1 \le d \le D$, and the $d$th dimension of an individual elephant $\zeta_{ci,j}$ is signified as $\zeta_{ci,j,d}$. To cope with multi-objective values, the EHO is combined with a sine cosine approach to generate the final outcome, which is formulated as

$\zeta_{\text{new},ci,j} = \begin{cases} \zeta_{ci,j} + l_1 \times \sin(l_2) \left( l_3 \zeta_{\text{best},ci} - \zeta_{ci,j} \right), & l_4 < 0.5 \\ \zeta_{ci,j} + l_1 \times \cos(l_2) \left( l_3 \zeta_{\text{best},ci} - \zeta_{ci,j} \right), & l_4 \ge 0.5 \end{cases}$  (16)

wherein the random values between 0 and 1 are denoted as $l_1, l_2, l_3, l_4$. Hence, attaining optimal values for all the parameters yields a higher decision-making accuracy, supplying secure AC to the user in the IoT.
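To make the permission-decision step concrete, the sketch below one-hot encodes a toy set of ABAC requests into decision vectors and trains an XGBoost binary classifier to permit or deny access. The attribute names, sample requests, labels, and hyper-parameters are illustrative assumptions, and the MOEHO hyper-parameter search described above is not included.

```python
# Sketch: ABAC request -> permission decision vector -> permit/deny with XGBoost.
import numpy as np
import pandas as pd
from xgboost import XGBClassifier

# Placeholder policy log: subject, resource, operation, environment attributes.
requests = pd.DataFrame({
    "subject_role":  ["nurse", "admin", "guest", "admin", "guest", "nurse"],
    "resource_type": ["ecg",   "log",   "ecg",   "ecg",   "log",   "log"],
    "operation":     ["read",  "write", "write", "read",  "read",  "write"],
    "environment":   ["ward",  "remote","remote","ward",  "ward",  "remote"],
})
decisions = np.array([1, 1, 0, 1, 0, 0])          # 1 = permit, 0 = deny

X = pd.get_dummies(requests).astype(int)          # permission decision vectors
clf = XGBClassifier(n_estimators=50, max_depth=3, learning_rate=0.1,
                    eval_metric="logloss")
clf.fit(X, decisions)

# Classify one encoded request (here a row of the toy log) as permit or deny.
print("permit" if clf.predict(X.iloc[[2]])[0] == 1 else "deny")
```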
4 Results and Discussion The proposed work centered on trust-based and ABAC in IoT is appraised regarding several performance metrics. The analysis is noticed for the proposed TE methodology that is to say FK-MFCMC, together with the decision-making methodology to be precise MOEHO-XGBoost.
4.1 Performance Analysis of Trust Evaluation Technique The FK-MFCMC for TE is appraised regarding the performance metrics: (1) Accuracy, (2) Specificity, (3) Sensitivity, (4) Precision, (5) F-Measures, (6) FPR, (7) FNR; in addition, correlated with (i) Fuzzy C-Means Clustering (FCMC), (ii) Adaptive Neuro-Fuzzy Interface System (ANFIS), (iii) Support Vector Machine (SVM), and (iv) Artificial Neural Network (ANN). In Fig. 2, the proposed model’s evaluation is represented graphically. A methodology must possess the capability to attain higher accuracy, sensitivity, precision, specificity, and f -measure to perform effectively, along with that it should acquire lower FNR and FPR values. Accordingly, the proposed mechanism obtains an accuracy, specificity, sensitivity, precision, and f -measure of 95.35%,
92.64%, 93.44%, 94.84%, and 95.74%, respectively, which range between 92.64 and 95.74%, while the prevailing methodologies range between 80.12 and 86.95%, which is considerably lower than the proposed model. The proposed methodology averts the false prediction of attacks and attains an FPR and FNR of 14.52 and 10.23%, which are lower than those of the prevailing methodologies, which range between 25.69 and 46.58%. Finally, Fig. 2 confirms that the TE of the IoT nodes is stronger and more secure than that of the prevailing methodologies.

Fig. 2 Graphical analysis of proposed Trust Evaluation technique
4.2 Performance Analysis of Access Evaluation Technique The MOEHO-XGBoost for decision-making is evaluated regarding the performance metrics: (1) accuracy, (2) specificity, (3) sensitivity, (4) precision, (5) f-measures, (6) FPR, (7) FNR, (8) security level; in addition, it is correlated with (i) LR, (ii) KNN, (iii) SVM, and (iv) DT. In Table 1, the proposed model’s evaluation regarding several metrics is displayed. The proposed methodology’s numerical evaluation together with the prevailing methodologies considering the performance metrics is displayed in Table 1. The consideration of dynamic, uncertain, unstructured information for a policy’s decision-making is a difficult task for some ML technology. To retain a lower error rate, accurate decision-making should possess lower FPR and FNR values. Accordingly, the proposed methodology obtains an FPR and FNR of 13.69 and 11.20%, while the prevailing methodologies attain in the range betwixt 41.25 and 66.51%, which comprised a higher error rate than the proposed work. The proposed method’s classification scores are enhanced by the lowest FPR together with attains the accuracy, specificity, sensitivity, precision, and f -measure of 95.99%, 93.65%, 94.89%, 94.88%, and 94.89%, respectively, which is efficient than the prevailing methodologies that range betwixt 78.95 and 85.69%. In Fig. 3, it is described that the
Table 1 Evaluation of proposed decision-making technique for the access request

Performance metrics/techniques   LR      KNN     SVM     DT      Proposed MOEHO-XGBoost
Accuracy                         78.95   84.98   82.56   83.69   95.99
Specificity                      79.99   85.46   81.99   84.55   93.65
Sensitivity                      80.12   85.55   82.54   84.99   94.89
Precision                        81.11   85.69   82.46   84.78   94.88
F-measures                       81.45   85.65   82.98   84.92   94.89
FPR                              65.45   44.68   50.12   42.51   13.69
FNR                              66.51   45.65   50.11   41.25   11.2
proposed mechanism is highly secured along with attains a higher security level against numerous sensor nodes. The figure illustrates that the MOEHO-XGBoost attains a higher security level to secure the data as of the illegal users or else malicious attacks. For a sensor node of 200, the security level attained by the present mechanism is 90.15%, while the previous methods attained security level which ranges betwixt 75.68 and 80.99%, which is lower than the proposed methodology. Hence, the present methodology obtains a superior security level together with retains a secured trust-base AC as contrasted to the prevailing methodologies.
Fig. 3 Graphical analysis of proposed decision-making technique based on security level
5 Conclusion
To maintain secure information exchange between the nodes and users, a trust-based ABAC has been employed in the IoT field by this work. A TE model along with an access evaluation module is offered by the work, particularly intended for IoT devices. It retains the QoS, security features, social relationships, and reputation to secure data communication among IoT models. To avert degradation in performance owing to unbalanced data, the trust-based analysis model is initiated via trust data balancing. Subsequently, the useful data are extracted from the balanced data. At last, the extracted data are analysed with the FK-MFCMC methodology to discover the trusted and non-trusted nodes. The access request advances to decision-making following the TE. This transforms the ABAC request into a permission decision vector; in addition, the AC permission decision problem is converted into a binary classification problem, which permits or rejects access. By utilising the MOEHO-XGBoost methodology, the decision-making is made for the balanced policy with the essential FE. The experimental outcome displays that the present mechanism acquires an accuracy of 95.65% for the sensor nodes' TE and attains a decision-making accuracy of 95.99%, comprising a higher security level of 90.15% as compared to the prevailing methodologies.
References 1. Wang J, Wang H, Zhang H, Cao N (2017) Trust and attribute-based dynamic access control model for internet of things. In: 2017 international conference on cyber-enabled distributed computing and knowledge discovery, Nanjing, China, 12–14 Oct 2017 2. Sun K, Yin L (2014) Attribute-role-based hybrid access control in the internet of things. AsiaPacific Web Conf. https://doi.org/10.1007/978-3-319-11119-3_31 3. Hassan MM, Huda S, Sharmeen S, Abawajy J, Fortino G (2020) An adaptive trust boundary protection for IIoT networks using deep learning feature extraction based semi-supervised model. IEEE Trans Ind Inform 17(4):2860–2870 4. Liu A, Du X, Wang N (2021) Efficient access control permission decision engine based on machine learning. Secur Commun Netw. https://doi.org/10.1155/2021/3970485 5. Jayasinghe U, Lee GM, Um T-W, Shi Q (2018) Machine learning based trust computational model for IoT services. IEEE Trans Sustain Comput 4(1):39–52 6. Yu Y, Jia Z, Tao W, Xue B, Lee C (2016) An efficient trust evaluation scheme for node behavior detection in the internet of things. Wirel Pers Commun. https://doi.org/10.1007/s11277-0163802-y 7. Sultana T, Ghaffar A, Azeem M, Abubaker Z (2019) Data sharing system integrating access control based on smart contracts for IoT. In: 14th international conference on P2P parallel grid cloud and internet computing, Antwerp, Belgium, Aug 2019 8. Putra GD, Dedeoglu V, Kanhere SS, Jurdak R (2020) Trust management in decentralized IoT access control system. In: 2020 IEEE international conference on blockchain and cryptocurrency (ICBC), Toronto, ON, Canada, 2–6 May 2020 9. Riad K, Huang T, Ke L (2020) A dynamic and hierarchical access control for IoT in multiauthority cloud storage. J Netw Comput Appl 160:102633
10. Al-Halabi Y, Raeq N, Abu-Dabaseh F (2017) Study on access control approaches in the context of internet of things a survey. In: 2017 international conference on engineering and technology (ICET), Antalya, Turkey, 21–23 Aug 2017 11. Dramé-Maigné S, Laurent M, Castillo L (2019) Distributed access control solution for the IoT based on multi-endorsed attributes and smart contracts. In: 2019 15th international wireless communications & mobile computing conference (IWCMC), Tangier, Morocco, 24–28 June 2019 12. Ouechtati H, Azzouna NB (2017) Trust-ABAC towards an access control system for the internet of things. In: International conference on green, pervasive, and cloud computing (GPC 2017), Cetara, Amalfi Coast, Italy, May 2017 13. Chen H-C (2019) Collaboration IoT-based RBAC with trust evaluation algorithm model for massive IoT integrated application. Mob Netw Appl 24(3):839–852 14. Ding S, Cao J, Li C, Fan K, Li H (2019) A novel attribute-based access control scheme using blockchain for IoT. IEEE Access 7:38431–38441 15. Zhang Y, Li B, Liu B, Wu J (2020) An attribute-based collaborative access control scheme using blockchain for IoT devices. Electronics 9(2):1–22 16. Khilar PM, Chaudhari V, Swain RR (2019) Trust-based access control in cloud computing using machine learning intelligent edge, fog and mist computing. Cloud computing for geospatial big data analytics. https://doi.org/10.1007/978-3-030-03359-0_3 17. Bernabe JB, Hernández-Ramos JL, Skarmeta A (2015) TACIoT multidimensional trust-aware access control system for the internet of things. Soft Comput 20(5):1763–1779
Designing Multiband Patch Antenna for 5G Communication System Hampika Gorla, N. Venkat Ram, and L. V. Narasimha Prasad
Abstract This article explores the importance of multiband patch antennas for Wi-Fi, WiMAX, and 5G mobile applications. To meet the needs of 5G mobile services, these small antennas have been designed to operate at multiple frequencies. The proposed antenna measures 62 mm × 50 mm × 1.6 mm and has been designed to work effectively at 2.6 GHz, 14.3 GHz, and 37.7 GHz for Wi-Fi, WiMAX, and 5G communications. Each operational frequency has been designed with a directional radiation pattern that provides high gain and directivity. The antenna also has a low voltage standing wave ratio (VSWR), which accounts for its efficiency in transmitting and receiving signals with minimal loss. This ensures improved signal quality and extended coverage. Keywords 5G communication systems · WiMAX systems · Patch antenna · Wi-Fi · Multiband
H. Gorla (B) · N. Venkat Ram, Department of ECSE, KLEF, Vaddeswaram, Guntur, Andrapradesh 522502, India, e-mail: [email protected]; H. Gorla · L. V. Narasimha Prasad, Department of ECE, Institute of Aeronautical Engineering, Hyderabad, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_38
1 Introduction
The increasing need for 5G mobile services has created a demand for small, multiband antennas that can function over a wide range of frequency bands. The 5G network requires a higher frequency band than previous generations of mobile networks, resulting in a higher data rate and a shorter wavelength. Small antennas with a wide frequency range are needed to support these high-frequency bands. Multiband patch antennas come in handy here because they can operate across multiple frequency bands and can be integrated into small mobile devices. The multiband patch antenna is a microstrip antenna that can operate in multiple frequency bands. A thin metal sheet is mounted on a dielectric material and shielded
by a ground plane to form an antenna. The patch and ground plane form a resonant space that contributes to the radiation properties of the antenna. Patches can be shaped and arranged in different ways to create different radiation patterns and frequency responses. This paper proposes a multiband patch antenna that can function well at 37.7 GHz for 5G communication, 14.3 GHz for WiMAX, and 2.6 GHz for Wi-Fi. This allows the antenna to support multiple communication standards, making it suitable for a wide range of mobile devices. The proposed antenna arrays include directional radiation patterns that offer each operational frequency significant gain and directivity. The VSWR of the antenna arrays is really low, which tells us how efficient the antennas are. This guarantees that the proposed antenna can send and receive signals with little loss, improving signal quality and expanding the coverage area. By combining various patch shapes and placements, the proposed multiband patch antenna is intended to function over many frequency bands. For example, at 2.6 GHz, the patch is formed like a rectangle, while at 14.3 GHz, the patch is shaped like a circular ring. The radiation pattern and frequency response of the patch may be customized by adjusting its position and form. The suggested antenna arrays also have a low-profile design that enables their incorporation into small mobile devices like smart phones and tablets. The suggested multiband patch antenna is a viable option for addressing the rising demand for 5G mobile services due to its capability to operate over multiple frequency bands and its compact size. This antenna also offers directional radiation patterns, low VSWR, and high gain which will deliver enhanced signal quality and increased coverage area. Thus, it is an efficient and effective choice for mobile devices that must support multiple communication standards. The multiband patch antenna is a viable option for addressing the increasing demand for 5G mobile services. It is able to work across multiple frequency bands, and its small size is beneficial for mobile devices. Furthermore, its directional radiation patterns, low VSWR, and high gain provide improved signal quality and increased coverage area. In light of the advances in technology and the need for faster and more reliable communication systems, this antenna could be a crucial part of the development of 5G mobile services.
2 Literature Survey
The design of a modern mobile communications antenna for Bluetooth, WiMAX, WLAN, and 5G systems that can operate at 2.4, 3.5, 5.8, and 6 GHz is covered in this article. In contrast to conventional reconfigurable antennas, which only support one frequency band, the design is exceptional in that it can handle several frequency bands concurrently. The pattern in the upper hemisphere is quite similar to a dumbbell shape, and the maximum radiation is oriented normal to the patch geometry. The results of antenna design, modeling, and measurement are presented, and there is good
agreement between the simulated and measured results. This reconfigurable patch antenna was created specifically for 5G mobile communication networks [1]. This section outlines the construction of a highly efficient patch antenna with a simple structure created and modeled with CST Microwave Studio software. This antenna is capable of operating in Wi-Fi, WiMAX, and 5G bands, with bandwidths of 152 MHz, 235 MHz, and 4.5 GHz, respectively. Its low voltage standing wave ratio (VSWR) and strong directivity make it a suitable choice for next-generation wireless communication [7–10]. CST Microwave Studio is a 3D electromagnetic (EM) analysis software designed to develop and refine components such as antennas, filters, couplers, and planar circuits. The software is highly valued for its accuracy in simulating electromagnetic fields, resulting in its broad use in industry and academia. The VSWR measures how well the antenna is matched to the transmission line or system it is connected to and is defined as the highest-to-lowest voltage ratio along the line or system. Though the antenna is effective, the authors believe that it could be improved in the future. This could include reducing the antenna's size and weight, adding new frequency bands to increase its efficiency, and improving its directivity to cover a bigger area [2]. This antenna is designed to handle both GPS and WLAN frequency bands. It is composed of two radiating components placed together, one for GPS and the other for WLAN. The GPS element has a circularly polarized broadside radiation pattern, and the WLAN element emits a conically shaped, linearly polarized radiation pattern. Both components can be activated by the same power source. The GPS impedance bandwidth, with a 10 dB return loss, ranges from 1515 to 1630 MHz, and the antenna's maximum gain is roughly 7.5 dB. The WLAN impedance bandwidth, also with a 10 dB return loss, ranges from 2360 to 2560 MHz and includes two linked modes. This antenna is suitable for wireless devices that connect to both satellites and terrestrial networks, such as GPS and WLAN. We draw the conclusion that stacked patch topologies may successfully overcome the effect of constrained bandwidth and provide good performance in multiband antennas based on our review of the literature. As a result, we have presented an antenna that can operate in the GPS L1 band, GSM band, and WLAN frequency bands, enabling triple-band operation. We will also work to maintain the excellent performance attributes that make it suitable for GPS, GSM, and WLAN applications, such as high gain, broad bandwidth, increased beamwidth, and low cross polarization [3].
3 Theory and Radiation Mechanism The fundamental principles of microstrip antenna construction serve as the foundation for the multiband patch antenna. An antenna made of a small metal patch placed on a dielectric substrate, and covered by a ground plane, is referred to as a microstrip antenna. This structure of patch and ground plane creates a resonant cavity which is responsible for the antenna’s ability to radiate.
The multiband patch antenna is meant to function over various frequency bands by employing a mix of different patch shapes and placements. At 2.6 GHz, for example, the patch is formed like a rectangle, whereas at 14.3 GHz, the patch is shaped like a circular ring. The placement and structure of the patch can be shifted to acquire the desired radiation pattern and frequency response. The multiband patch antenna works by utilizing microstrip antenna design concepts. It is comprised a small metallic patch placed on a dielectric substrate, with a ground plane covering the antenna. The patch and the ground plane create a resonant cavity, which is responsible for the antenna’s radiative capabilities. When an electromagnetic wave interacts with the patch, it induces an electric current to traverse its surface. This current then emits an electromagnetic wave, which generates the antenna’s radiated field. The radiated field is determined by the current distribution on the patch’s surface, which is dictated by the shape and position of the patch. This multiband patch antenna has been designed to work on multiple frequencies by incorporating a range of various patch forms and placements. Each frequency band has a specific patch shape and position that is designed to achieve the desired radiation pattern and frequency response. This design enables the antenna to successfully work over various frequency bands, which is the foundation of multiband patch antennas [10].
4 Design and Simulation
In this work, we construct a multiband patch antenna using the High-Frequency Structure Simulator (HFSS). The initial steps of producing a multiband patch antenna involve the use of an FR-4 substrate, which has a dielectric constant of 4.4. To maximize effectiveness and minimize losses, the substrate material should have a low dielectric constant and a high quality factor. To reach the necessary resonance frequencies, the substrate thickness should be 1.6 mm. The next stage is to calculate the antenna geometry as in Table 1, which comprises the size and form of the patch, as well as the position and shape of the feed point. The frequency of operation is influenced by the shape and size of the patch, and the feed point position and shape determine the impedance matching and bandwidth of the antenna [11–15]. After determining the antenna shape, the antenna is simulated using HFSS. The simulation entails modeling the antenna geometry as well as its interactions with the substrate material and its surroundings. The simulation results are then evaluated over a frequency range of 0–40 GHz to identify the antenna's resonance frequencies, impedance matching, and radiation pattern. The design is shown in Fig. 1. The three critical parameters involved in the antenna design technique are:
Fig. 1 Top view of antenna design
Frequency of operation (F_r): The antenna's resonant frequency must be properly chosen. For Wi-Fi, WiMAX, and 5G communications, the resonant frequencies used in this design are 2.6 GHz, 14.3 GHz, and 37.7 GHz.
Dielectric constant of the substrate (ε_r): The substrate's dielectric constant is a key factor in the design of patch antennas. A high substrate dielectric constant affects both antenna performance and antenna size, so patch antenna size and performance are traded off [4].
Thickness of the dielectric substrate (h): Compactness is a requirement for a microstrip patch antenna used in wireless communication systems. As a result, the thickness of the dielectric substrate had to be reduced [5].
The patch width and length for the operating frequencies are formulated as below. Finally, the obtained values are optimised to determine the required antenna characteristics.

$\varepsilon_{re} = \frac{\varepsilon_r + 1}{2} + \frac{\varepsilon_r - 1}{2}\left[ 1 + 12 \frac{h}{w} \right]^{-1/2},$  (1)

where $\varepsilon_{re}$ = effective dielectric constant, $\varepsilon_r$ = dielectric constant of the substrate, $h$ = height of the dielectric substrate, and $W$ = width of the patch. The patch length is extended electrically by a distance $\Delta L$, given by

$\Delta L = 0.412\,h \, \frac{\left( \varepsilon_{re} + 0.3 \right)\left( \frac{w}{h} + 0.264 \right)}{\left( \varepsilon_{re} - 0.258 \right)\left( \frac{w}{h} + 0.8 \right)}.$  (2)

The effective length of the patch is

$L_e = \frac{c}{2 f_o \sqrt{\varepsilon_{re}}}.$  (3)

The final patch length L is specified as

$L = L_e - 2 \Delta L.$  (4)

The width W for effective radiation is given as

$W = \frac{c}{2 f_o} \sqrt{\frac{2}{\varepsilon_r + 1}}.$  (5)

Dimensions of the ground plane: For practical purposes, a finite ground plane is used instead of an infinite one, but the transmission line model still applies. To achieve results similar to those of an infinite ground plane, the ground plane must extend at least six times the thickness of the substrate on all sides. The ground plane dimensions are computed as follows [6]:

$w_g = 6h + W, \qquad L_g = 6h + L.$  (6)
Calculation
Substrate selection: The FR-4 substrate with a thickness (h) of 1.6 mm, a relative permittivity of 4.4, and a loss tangent of 0.030 was chosen for the antenna design [4].
Calculation of the patch width (W): From the width equation, a patch width of W = 25 mm is determined for the resonating frequencies.
Calculation of the actual patch length (L): For the resonating frequency, L = 30.8 mm was obtained using the length equation.
Calculation of the ground plane dimensions (Wg and Lg): The ground plane dimensions may be calculated using the formulae as Wg = 50 mm and Lg = 60 mm.
An antenna was created using the measurements found in Table 1 and implemented in the HFSS software. The antenna was then simulated in order to confirm the results; Fig. 1 shows the design.

Table 1 Parameter values

S. No.   Parameter              Values (in mm)
1        Ground plane length    60
2        Ground plane width     50
3        Patch length           30.8
4        Patch width            25
5        Feed line (1) length   12.8
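For a quick numerical check of the closed-form equations above, the snippet below evaluates them for the 2.6 GHz band on the stated FR-4 substrate. The computed values are only analytical starting points; since the text notes that the obtained values are subsequently optimised, they need not match the final dimensions in Table 1.

```python
# Numerical evaluation of the patch-design equations for one resonant frequency.
import math

c = 3e8         # speed of light (m/s)
f0 = 2.6e9      # resonant frequency (Hz), example band
eps_r = 4.4     # FR-4 dielectric constant
h = 1.6e-3      # substrate thickness (m)

W = (c / (2 * f0)) * math.sqrt(2 / (eps_r + 1))                        # patch width
eps_re = (eps_r + 1) / 2 + (eps_r - 1) / 2 * (1 + 12 * h / W) ** -0.5  # effective permittivity
dL = 0.412 * h * ((eps_re + 0.3) * (W / h + 0.264)) / \
     ((eps_re - 0.258) * (W / h + 0.8))                                # length extension
Le = c / (2 * f0 * math.sqrt(eps_re))                                  # effective length
L = Le - 2 * dL                                                        # physical length
Wg, Lg = 6 * h + W, 6 * h + L                                          # ground plane

print(f"W = {W*1e3:.1f} mm, L = {L*1e3:.1f} mm, "
      f"Wg = {Wg*1e3:.1f} mm, Lg = {Lg*1e3:.1f} mm")
```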
5 Results and Discussion The rectangular microstrip patch antenna simulation results are shown below, including S-parameters, VSWR, and 3D radiation patterns. The graph is mentioned in Fig. 2. The S-parameter (Scattering parameters), which indicate the microstrip patch slot antenna’s return loss and resonant frequency, is shown in Fig. 2. We achieved roughly − 20 dB return loss and 2.6 GHz for Wi-Fi, 14.3 GHz for WiMAX, and 37.7 GHz for 5G communication purposes with our solution.
Fig. 2 S-parameter graph
In Fig. 3, we show the simulated voltage standing wave ratio (VSWR) for the inset-fed microstrip patch antenna; we obtained VSWR values of 5.5, 1.7, and 2.8 at 2.6 GHz, 14.3 GHz, and 37.7 GHz, respectively. The corresponding 3D polar plot is shown in Fig. 4. The polar plot and radiation pattern both display the direction of the antenna's electromagnetic wave emissions.
Fig. 3 VSWR graph
Fig. 4 Three-dimensional radiation pattern
6 Conclusion In the modern era of communication technology, there has been an ever-increasing demand for wireless services that can support high data rates and high capacity. With the introduction of 5G technology, this demand has become even more pronounced. The use of multiple frequency bands to support different communication standards has become a popular approach in designing communication systems. It is necessary to have antennas that can work across various frequencies, and the proposed multiband patch antenna is designed for this purpose. A multiband patch antenna appropriate for usage in mobile Wi-Fi, WiMAX, and 5G systems was reported in this study. The antenna is perfect for inclusion into small mobile devices due to its compact size and low profile. The directional radiation pattern of the antenna offers excellent gain and directivity for each operational frequency, resulting in better signal quality and extended coverage area. According to the findings of this study, the suggested antenna is efficient in terms of transmitting and receiving signals across numerous frequency bands. The suggested antenna is a feasible alternative for supporting numerous communication protocols in current communication systems.
References 1. Thulasi Bai V (2017) Design of 5G multiband antenna 2. Mahabub A, Mostafizur Rahman M, Al-Amin M, Sayedur Rahman M (2018) Design of a multiband patch antenna for 5G communication systems 3. Vaghela N, Tevar N (2018) Design and construction of micro strip antenna 4. Mazen KK, Emran A, Shalaby AS, Yahya A (2021) Design of multi-band microstrip patch antennas for mid-band 5G wireless communication 5. Bhukya R, Hampika G, Guduri M (2020) Resonant tunneling diodes: working and applications 6. Rohit V, Hampika G, Tenneti A, Guduri M (2020) An architectural overview of unmanned aerial vehicle with 5G technology 7. Venkata Sai Kasturi Babu CH, Athuluri M, Venkatram N (2018) Android based real time advance security for parking system in IoT 8. Chen ZN, Qing X, See TSP, Toh WK (2012) Antennas for WiFi connectivity. Proc IEEE 100:2322–2329 9. Mahabub A, Islam MN, Rahman MM (2017) An advanced design of pattern reconfigurable antenna for Wi-Fi and WiMAX base station. In: Proceeding of the 2017 4th international conference on advances in electrical engineering (ICAEE), Dhaka, 28–30 Sept 2017, pp 74–79 10. Maddio S (2017) A circularly polarized switched beam antenna. IEEE Antennas Wirel Propag Lett 16:125–128 11. Thomas KS (2009) Compact triple band antenna for WLAN/WiMAX applications. Electron Lett 45(16) 12. Pan C, Horng T, Chen WS, Huang CH (2007) Dual wideband printed monopole antenna for WLAN/WiMAX applications. IEEE Antennas Propag Lett 6:149–151 13. Hu L, Hua W (2011) Wide dual-band CPW-fed slot antenna. Electron Lett 47:789–790
14. Yang K, Wang H, Lei Z, Xie Y, Lai H (2011) CPW-fed slot antenna with triangular SRR terminated feedline for WLAN/WiMAX applications. Electron Lett 47:685–686 15. Wang P, Wen GJ, Huang YJ, Sun YH (2012) Compact CPW-fed planar monopole antenna. Electron Lett 48:357–359
Air Quality Prediction Using Machine Learning Algorithms G. Shreya, B. Tharun Reddy, and V. S. G. N. Raju
Abstract Since air quality is becoming a serious health issue, the government should take decisive action, and accurate prediction is an important part of that. Air quality is evaluated by the air quality index (AQI). The combustion of natural gas, coal, and wood, as well as automobiles and industry, produces carbon dioxide, nitrogen dioxide, carbon monoxide, and other pollutants that contribute to air pollution. Lung cancer, brain damage, and even death can result from air pollution. The air quality index can be calculated with the help of machine learning methods. Although numerous studies have been carried out in this field, the outcomes remain inaccurate. The Kaggle-provided dataset is split into training and testing sets. Machine learning algorithms such as Linear Regression, Random Forest, and the C4.5 Decision Tree are utilized in this work. Keywords Machine learning · Linear regression · Random forest · C 4.5 decision tree
1 Introduction In agricultural nations such as India, rapid population growth and urban economic development have led to various environmental concerns, including air pollution, water pollution, and noise pollution. Air pollution directly affects human health, and public awareness of the issue has grown in our country. Global warming, acid rain, and an increase in the number of asthmatics are some of the long-term effects of air pollution. The harm that pollution causes to people and the biosphere can be mitigated by accurate air quality forecasting; improving air quality prediction is therefore one of the essential goals of society. The burning G. Shreya (B) · B. Tharun Reddy · V. S. G. N. Raju Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India e-mail: [email protected] V. S. G. N. Raju e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_39
of fossil fuels, the release of toxic gases, and the emission of solids by vehicles and industries are the primary contributors to air pollution. Sulfur oxides, nitrogen dioxides, particulate matter, and carbon monoxide are examples of such components. Monitoring and analysing air quality is therefore an essential and urgent part of leading a healthy life. Examining air pollution with the help of data mining techniques can help determine the best ways to reduce it. Data mining is the extraction of purposeful information from a raw dataset; it can also be used to examine the set of patterns that appear most often in the data. Its primary objective is to extract information from large amounts of data and transform it into an understandable structure that can be used later for tasks such as prediction, recognition, classification, and optimization. Data mining can be described as a systematic technique for filtering through huge amounts of data to find significant information, with the goal of discovering new facts and patterns that were not previously known; data mining and knowledge discovery are two distinct concepts. Using data mining methods, it is possible to analyse and predict air pollution and its pollutants, and the sources of pollution can be identified. Optimization algorithms can be used to select the best features for effective classification and prediction. The field of machine learning (ML) focuses on the creation of programs that "learn" from their environment and adapt accordingly, performing a predefined task more effectively as they gain experience. Air pollution forecasting models can therefore benefit greatly from ML techniques [1]. Choosing the best ML strategy in light of natural and environmental factors is essential, because the choice of ML technique is problem specific, as shown in Fig. 1. Air pollution is hazardous to human well-being and must be reduced quickly in urban and rural districts, so accurate air quality predictions are essential. Water, air, and soil pollution are all forms of pollution, but air pollution is the most pressing because people breathe the surrounding air. While the state government conducts various activities to reduce air pollution, some are carried out by the local government. This paper presents our scientific contribution towards resolving this issue. The first important step is to estimate the air quality index to help improve the situation. Four distinct classifiers based on particular algorithms were created as part of this project. A system with artificial intelligence learns how to act through machine learning and gathers data from the sensors in its environment; this flexibility of ML algorithms was one reason why we chose ML to predict the air quality index. Controlling air pollution levels is rapidly becoming one of the main obligations.
It is essential for individuals to be aware of how much pollution is present in their surroundings and to take action to fight it.
Fig. 1 Example figure
2 Literature Review Air Quality Index Prediction Using Simple Machine Learning Algorithms. Over recent decades, air pollution prevention has been a constant scientific challenge, yet it remains a major global issue. Air pollution increases population mortality and illness risk by affecting the human respiratory and cardiovascular systems. In an effort to improve public health, both local and state governments make various attempts to understand and predict the air quality index, and this article is one scientific contribution to resolving the issue. Support vector machine, neural network, decision tree, and k-nearest neighbour machine learning algorithms are all analysed. The air pollution dataset includes data from measurement stations in the capital of the Republic of Macedonia for every day of 2017. It was demonstrated that using these algorithms to predict the air quality index can be very effective. A Deep Recurrent Neural Network for Air Quality Classification. Air pollution, considered to adversely affect human well-being, has gathered worldwide attention; it is therefore essential for people's welfare to forecast air quality. Using deep learning, the authors predict the Air Quality Classification (AQC) for three distinct industrial areas in the US. A Recurrent Neural Network (RNN) is used to build the prediction model, since an RNN can analyse and retain sequential data, such as regular air quality measurements over a particular period of time. The Support Vector Machine, Random Forest, and Recurrent Neural Network (RNN)
models' effectiveness is shown by the investigation's results: the two other ML techniques are inferior to the proposed RNN model, and the RNN with memory outperforms models without memory when using sequential air quality data [2–5]. Detection and Prediction of Air Pollution Using Machine Learning Models. In both developed and densely populated countries, air regulation is seen as a critical obligation of the state. Meteorological and traffic factors, the burning of petroleum products, and modern industrial sources such as emissions from power plants all play large parts in air pollution. Of all the pollutants that determine air quality, particulate matter (PM2.5) requires the most attention; people face significant health risks when its concentration in the air is high. To maintain control, it is therefore essential to monitor its concentration in the air continuously. This work uses logistic regression to decide whether a data sample is polluted, and autoregression on past PM2.5 measurements to estimate future values. Knowing the level of PM2.5 for the coming day, week, or month enables us to keep it below the harmful range. The system attempts to predict PM2.5 levels and identify air quality using a dataset of daily atmospheric conditions in a particular city [6, 7].
3 Methodology In recent decades, air pollution and how to reduce it have been ongoing scientific concerns, yet they remain significant global issues. Because pollutants affect the human respiratory and cardiovascular systems, they increase mortality and illness risk. Both local and state governments make many efforts to understand and predict air quality records to improve public health [8–10]. The system architecture is depicted in Fig. 2. • Our research aims to help find a solution to this problem. The suggested method will assist ordinary people as well as meteorologists in determining pollution levels, forecasting them, and taking appropriate action. Additionally, it would help establish a data source for smaller towns, which are frequently overlooked in comparison with major metropolitan areas. • Machine learning algorithms such as Linear Regression, Random Forest, and the C4.5 Decision Tree are utilized in this work.
Fig. 2 System architecture
Modules In this project, we have designed the following modules; a minimal sketch of how they fit together is given after the list.
1. Data Collection
2. Data Preprocessing
3. Model Selection
4. Predict the results.
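The paper does not reproduce its code, so the following is only a rough sketch of how these four modules could be wired together with pandas and scikit-learn; the file name air_quality.csv and the label column AQI_Bucket are illustrative assumptions, not details taken from the paper.

# 1. Data Collection: file name and column names are assumed for illustration.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("air_quality.csv")

# 2. Data Preprocessing: keep numeric pollutant columns and drop rows with missing values.
df = df.dropna()
y = df["AQI_Bucket"]                                      # categorical air quality class (assumed column)
X = df.drop(columns=["AQI_Bucket"]).select_dtypes("number")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)                   # fit the scaler on the training split only
X_test = scaler.transform(X_test)

# 3. Model Selection: fit one candidate model (other candidates are compared later).
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# 4. Predict the results on the held-out test set.
y_pred = model.predict(X_test)
print("Test accuracy:", accuracy_score(y_test, y_pred))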
4 Implementation Support Vector Machine (SVM) Support Vector Machine, or SVM, is one of the most widely used algorithms for classification and regression problems in supervised learning, and it is most commonly applied to classification problems in ML. The SVM algorithm seeks the best line or decision boundary that partitions the n-dimensional feature space into classes, making it easier to classify future data points. Random Forest Algorithm Random Forest is a well-known supervised ML algorithm, applicable to both classification and regression problems. It is based on ensemble learning, in
which multiple classifiers are combined to tackle a complex problem and improve the model's performance. Naive Bayes Classifier The Naive Bayes classifier is a supervised ML algorithm used for tasks such as text classification. It belongs to the family of generative learning algorithms, which model the input distribution of a particular class or category; unlike discriminative classifiers such as logistic regression, it does not identify which features are most important for separating the classes. Decision Tree Classifier The decision tree is a simple and widely used classification and prediction tool. Each internal node represents a test on an attribute, each branch a test outcome, and each leaf (terminal) node a class label. Boosting Boosting is a machine learning (ML) technique used to reduce error in predictive data analysis. Data scientists train ML models to make inferences about unlabeled data based on labelled data, and an ML model may make erroneous predictions depending on the accuracy of the training dataset. Gradient Boosting Algorithm Gradient boosting is a strong boosting algorithm that trains each new model, through gradient descent, to minimise the loss function, such as mean squared error or cross-entropy, of the previous model, thereby turning several weak learners into a strong learner. In each iteration, the algorithm computes the gradient of the loss function with respect to the current ensemble's predictions and then trains another weak model to reduce this gradient. Logistic Regression Logistic regression is a predictive analysis. It is a statistical method used to describe and explain the relationship between a single dependent binary variable and one or more independent variables measured at the nominal, ordinal, interval, or ratio level [6–10].
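The classifiers described in this section could be compared roughly as follows. This is only an illustrative sketch, not the authors' code: X_train and y_train are assumed to come from the earlier pipeline sketch, and DecisionTreeClassifier with the entropy criterion is used as a stand-in for C4.5, which scikit-learn does not implement directly.

from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Candidate models matching the algorithms described above.
models = {
    "SVM": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(criterion="entropy", random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# 5-fold cross-validated accuracy on the training data prepared in the earlier sketch.
for name, clf in models.items():
    scores = cross_val_score(clf, X_train, y_train, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")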
5 Experimental Results Figures 3, 4, 5, 6 and 7 show the implementation results, comprising the home screen, user registration, user login, user input, and the predicted output.
Fig. 3 Home screen
Fig. 4 User registration
6 Conclusion Controlling air pollution levels is rapidly becoming one of the main tasks. People should be aware of the degree of pollution in their surroundings and take action to reduce it. The outcomes show that logistic regression and decision tree ML models can be used effectively to identify air quality and predict future PM2.5 levels. The proposed strategy will help ordinary people as well as meteorologists in determining pollution levels, forecasting them, and taking suitable action. Moreover, it would help establish a data source for smaller towns, which are often ignored in comparison with major metropolitan areas.
Fig. 5 User login
Fig. 6 User input
7 Future Scope In future work, we want to integrate the web server and the application, and the algorithms will be upgraded to achieve significantly higher accuracy.
Fig. 7 Prediction result
References 1. Veljanovska K, Dimoski A (2018) Air quality index prediction using simple machine learning algorithms. Int J Emerg Trends Technol Comput Sci (IJETTCS) 2. Zhao X, Zhang R, Wu J-L, Chang P-C (2018) A deep recurrent neural network for air quality classification. J Inform Hiding Multimedia Sig Process 3. Mohurle SV, Purohit R, Patil M (2018) A study of fuzzy clustering concept for measuring air pollution index. Int J Adv Sci Res 4. Aditya CR, Deshmukh CR, Nayana DK, Vidyavastu PG (2018) Detection and prediction of air pollution using machine learning models. Int J Eng Trends Technol (IJETT) 5. Zhang S, Li X, Li Y, Mei J (2018) Prediction of urban PM2.5 concentration based on wavelet neural network. IEEE 6. Amado TM, Dela Cruz JC (2018) Development of machine learning-based predictive models for air quality monitoring and characterization. IEEE 7. Shawabkeh A, Al-Beqain F, Rodan A, Salem M (2018) Benzene air pollution monitoring model using ANN and SVM. IEEE 8. Kang GK, Gao JZ, Chiao S, Lu S, Xie G (2018) Air quality prediction: big data and machine learning approaches. Int J Environ Sci Dev 9. Martínez NM, Montes LM, Mura I, Franco JF (2018) Machine learning techniques for PM10 levels forecast in Bogotá. IEEE 10. Ochando LC, Julian CIF, Ochando FC, Ferri C (2015) Airvlc: an application for real-time forecasting urban air pollution. In: Proceedings of the 2nd international workshop on mining urban data, Lille, France
Disease Detection in Potato Crop Using Deep Learning S. P. V. Subba Rao, T. Ramaswamy, Samrat Tirukkovalluri, and Wasim Akram
Abstract One of the main crops in India is the potato. In India, potato farming has gained enormous popularity in recent years. However, a number of diseases are making it more expensive for farmers to grow potatoes while also affecting their personal lives. Our primary objective is to use cutting-edge machine learning technology to identify potato disease from leaf images. This paper presents an image processing and deep learning-based automated approach that identifies and categorizes potato leaf diseases. Image processing is the best method for identifying and studying these disorders. Around 2152 images of diseased and healthy potato leaves were divided into groups for this project. Our results demonstrate that machine learning outperforms existing approaches in potato disease identification. Keywords Artificial intelligence · Convolutional neural network · Deep learning · Image processing · Python
1 Introduction Agriculture is a key component of any nation's economic development. In India, the potato is a temperate crop that is produced in a subtropical climate. The potato has traditionally been the "poor man's friend". Over 300 years have passed since the cultivation of potatoes began in our nation, and it has become one of the most important vegetable crops grown here. S. P. V. Subba Rao · T. Ramaswamy · S. Tirukkovalluri (B) · W. Akram Department of E.C.E, Sreenidhi Institute of Science and Technology, Hyderabad, India e-mail: [email protected] S. P. V. Subba Rao e-mail: [email protected] T. Ramaswamy e-mail: [email protected] W. Akram e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_40
Ensuring agricultural security is a fundamental challenge for virtually all developing nations, as they often encounter malnourishment, which is closely linked to agricultural security. Potato crops are susceptible to a number of diseases, each of which manifests itself in distinct ways across different parts of the plant's foliage. The most prevalent diseases are late blight and early blight: early blight is caused by the fungal pathogen Alternaria solani, while late blight in potato leaves is caused by the oomycete Phytophthora infestans. The need to identify and diagnose these diseases in crucial crops has spurred the development of an automated approach aimed at improving crop yields, boosting farmers' profits, and making a noteworthy contribution to the nation's economy. In the past, numerous researchers in the field of image processing have suggested conventional techniques such as K-means clustering and LBP to identify these leaf diseases. Because machine learning and deep learning models excel at mapping functions and generating superior features, however, this work offers a deep learning model that uses multiple classifiers to detect potato leaf disease. The research methodology proposed in this study employs machine learning to classify and distinguish between healthy and diseased leaves; specifically, the architecture utilized is a classical Convolutional Neural Network.
2 Literature Survey The use of machine learning technology allows computers to learn and develop programs without requiring human intervention. Over the last decade, its popularity has increased considerably, and it is now utilized in various applications, ranging from web search, spam filtering, and product recommendations to advertising placement, credit assessment, fraud detection, stock trading, and even drug development. A research paper named "Krishi Mitra: Using Machine Learning to Identify Diseases in Plants" described the implementation of this strategy using the CNN model methodology and the TensorFlow framework. Because it only examined the leaf, this model had the obvious advantage of allowing for the detection of fungal infections in sugarcane; its drawback was the high computational complexity required. The authors of [5] reviewed and acknowledged the necessity of developing an economical, speedy, and dependable health-monitoring sensor to facilitate agricultural advancements. They discussed existing technologies for detecting plant diseases, including spectroscopic and imaging-based approaches, with the objective of devising ground-based sensor systems capable of monitoring plant illness under field conditions. After carefully examining the results of [5] and related research [1–4], the choice was made to employ an image processing-based strategy for disease recognition over other techniques typically used for diagnosing plant diseases, such as double-stranded RNA analysis and microscopy.
A related study [6] was conducted to address the same problem, which involved testing five different CNN architectures, including AlexNet, AlexNetOWTBn, VGG, OverFeat, and GoogLeNet. The VGG architecture topped the others with a 99.53% accuracy rating for 58 different classes. Instead of segmentation, the authors broke with tradition and used a CNN. However, it should be emphasized that the training, validation, and testing sets in this study were drawn from the same dataset, and that if the sources had been different, the outcomes would not have been as good [6, 7]. Fuzzy c-means clustering and neural networks were used in the research article named "Severity Identification of Potato Late Blight Disease from Crop Images Captured under Uncontrolled Environment" to develop their model. The model's primary benefit was that it did not require special training for farmers, since the dataset included images captured from various angles. The only issue was that untrained farmers took pictures that were not appropriately oriented and had background-visible leaf clusters in a few parts [10]. The research paper named "Potato Disease Detection Using Machine Learning" employed image processing as the primary technology. The primary advantage of that project was its CNN model, which achieved a validation accuracy of 90%; however, the major drawback of the model was that it required a large training model [8, 9]. In 2017, Pooja V. and colleagues suggested the use of machine learning methods along with image processing tools for disease detection and classification. The process involves identifying the affected region in the image, performing image processing, generating segments in the image, recognizing areas of interest, extracting features, and ultimately using SVM for disease classification. The proposed methodology produced better results than previously utilized techniques. Object recognition relies heavily on machine learning techniques to train models that can identify various objects present in an image. To enhance the performance of these models, one can either expand the size of the dataset for training and validation by obtaining more labeled images of the objects or use more sophisticated models such as convolutional neural networks (CNNs) to capture intricate relationships within the data. Despite the beneficial features of CNNs, such as their local architecture which makes them relatively efficient, they have been too costly to apply to large-scale applications involving high-resolution images. But with modern Graphics Processing Units (GPUs) and an optimized implementation of 2D convolution, training large CNNs is now feasible. Additionally, datasets like ImageNet have a sufficient number of labeled examples to prevent severe overfitting. Early detection of illnesses in plant leaves was recognized as essential for the economy as far back as 1993. In a study by Islam et al. [8], the support vector machine (SVM) algorithm was applied to features retrieved from potato leaf images in the dataset. Therefore, the majority of current techniques rely solely on the available "The Plant Village" dataset.
3 Proposed Methodology As can be seen from Fig. 1, this project's study has gone through several stages in the shape of an analytical framework. There are four steps in the suggested research framework, which are as follows: Dataset Collection: This project utilized a dataset consisting of pictures of potato leaves, which were categorized into three groups: healthy, late blight, and early blight. The dataset was obtained from the Kaggle website, as shown in Fig. 2 and Table 1, and is known as the "Plant Village Dataset". Preprocessing Data: The current phase of the project utilizes a total of 2152 potato leaf pictures, which are categorized into three classes: healthy, late blight, and early blight (Tables 1 and 2).
Fig. 1 Proposed research framework (Dataset Collection → Preprocessing Data → Classification → Evaluation)
Fig. 2 Image data
Table 1 Dataset details
Samples        Number
Healthy        152
Early blight   1000
Late blight    1000
Total          2152

Table 2 Train–validation–test dataset details (80:10:10 split)
Dataset        Train    Val.    Test
Healthy        121      16      16
Early blight   800      100     100
Late blight    800      100     100
Total          1721     216     216
The data have been sourced from the Kaggle website and are split into training, validation, and testing datasets, as shown in Table 2, with ratios of 80:10:10, respectively. The accuracy of each ratio will be compared to determine which one is better for data division. Tables 1 and 2 provide details about the distribution of data for every sample class. The use of data augmentation increases the amount of data. To artificially increase the size of the training dataset, this approach entails creating many plausible variants of each training sample, which lessens overfitting. With this technique, every picture in the training set is slightly rotated, scaled, and shifted before the new pictures are added to the training set. This increases the model's ability to adjust to variations in the size, position, and orientation of objects in the image. Additionally, the images' contrast settings can be altered, and they can be flipped both vertically and horizontally. By combining all of these modifications, the size of the training set can be expanded. The data are then divided into batches of 32 images each, with three channels, and trained for 25 epochs. Classification: The project's next stage is to categorize pictures using the CNN architecture, a supervised learning method that uses an existing dataset to train a set of variables so that an image can be recognized. The CNN's convolutional layers aid the neural network's recognition of potato leaves based on those leaves' characteristics. For this project, 256*256 images with three channels representing red, green, and blue are used. To lower the resolution of the picture while maintaining its quality, the leaf image is convolved with a filter and then pooled; the generated image is processed using MaxPooling. A CNN with MaxPooling and Conv layers, as shown in Fig. 3, is built in this project. The hidden layers utilize the Rectified Linear Unit (ReLU), while the SoftMax activation function is used by the output layer.
Fig. 3 CNN architecture (Input image → Convolution → MaxPooling)
This layer will be flattened in the next stage, turning the feature map produced by pooling into vector form. Using four convolutional layers and four MaxPooling layers, the suggested CNN architecture in this project detects disorders in potato leaves. Evaluation: To assess the effectiveness of the method, we randomly split the Plant Village dataset into training, validation, and testing sets, and also included custom potato leaf images for training. The experiments were conducted using the Keras v0.1.1 library on the Python framework, with an image size of 256*256. The performance of the model was examined by plotting the training and validation accuracies over the number of epochs, as shown in Fig. 4. The outcome indicates that the accuracy of the model increases as the number of epochs increases. We tested different CNN architectures such as GoogLeNet, ResNet, and VGG on the aforementioned dataset to determine the most effective neural network for predicting potato illness based on the images provided at the input. GoogLeNet is a deep CNN architecture consisting of 22 layers and is a variant of the Inception Network developed by Google researchers; it was designed to solve computer vision problems such as image classification and object detection. A pretrained version of the 16-layer CNN VGG-16, which was trained using more than a million pictures from the ImageNet collection, is available; this network can classify pictures into 1000 different categories, including mice, keyboards, and many animals. ResNet, on the other hand, addresses problems such as vanishing gradients by using skip connections to optimize deep layer processing, leading to high and consistent accuracy over epochs. While GoogLeNet had the highest accuracy over the ImageNet database, it only showed very good accuracy in the final epochs and not at the beginning.
Fig. 4 Training and validation graphs
VGG-16, on the other hand, had the worst accuracy and took the longest processing time, possibly due to the repeated use of the same convolution filters in each layer. The plain CNN worked well with less processing time, but for larger datasets like ours it may not be satisfactory for real-time use. CNN Architecture In this study, we employed a method based on convolutional neural networks (CNNs), a type of deep learning technique. CNNs can analyse images and effectively distinguish between, and prioritize, the various objects present within them. Unlike other classification algorithms, CNNs require minimal preprocessing effort: they learn filters and features through training, eliminating the need for manual filter engineering. The architecture of our model primarily consists of the layers listed below, and an illustrative training sketch is given after the list:
1. Input layer
2. Convolution layer
3. Pooling layer
4. Fully connected layer
5. Output layer
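The chapter does not include its training code; the following tf.keras sketch only approximates the setup described above (256*256 RGB inputs, batches of 32, four Conv/MaxPooling stages, ReLU hidden layers, a softmax output, 25 epochs, and augmentation by small rotations, shifts, zooms, flips, and contrast changes). The directory name, layer widths, and the omission of a separate test split are assumptions made for brevity.

import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SIZE, BATCH, CLASSES = 256, 32, 3  # healthy, early blight, late blight

# Augmentation corresponding to the transformations described in the text.
augment = models.Sequential([
    layers.RandomFlip("horizontal_and_vertical"),
    layers.RandomRotation(0.1),
    layers.RandomTranslation(0.1, 0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.1),
])

# Four Conv + MaxPooling stages, then Flatten and Dense layers (widths are assumed).
model = models.Sequential([
    layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3)),
    layers.Rescaling(1.0 / 255),
    augment,
    layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Directory layout "PlantVillage/<class>/*.jpg" is an assumption for illustration.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "PlantVillage", validation_split=0.1, subset="training", seed=42,
    image_size=(IMG_SIZE, IMG_SIZE), batch_size=BATCH)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "PlantVillage", validation_split=0.1, subset="validation", seed=42,
    image_size=(IMG_SIZE, IMG_SIZE), batch_size=BATCH)

history = model.fit(train_ds, validation_data=val_ds, epochs=25)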
4 Experimental Results Machine learning algorithms for detecting plant leaf disease have shown great promise in enhancing crop productivity and quality by managing biotic factors that can lead to large crop yield losses. In our study, we present a simple and effective multi-level machine learning model for recognizing potato leaf diseases. At the first level, we extract potato leaves from the input picture and then train a convolutional neural network to detect late blight and early blight infections from the leaf images of the potato plant. Furthermore, our model observes the impact of environmental conditions on potato leaf illnesses. The effectiveness of the suggested PDDCNN methodology was examined on a different dataset and found to outperform other methods; the method was compared with other existing methodologies and studies for detecting potato leaf disease, as shown in Fig. 5. On the PLD dataset, the proposed method was trained with data augmentation strategies, obtaining a good accuracy of 97.27%. Furthermore, the proposed approach had fewer parameters and was simpler than the state-of-the-art methods, resulting in significant cost savings and faster processing speed [8, 9].
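For reference, the training and validation accuracy curves of Fig. 4 can be plotted from the history object returned by model.fit in the earlier sketch; this fragment is illustrative only and assumes that sketch has been run.

import matplotlib.pyplot as plt

# 'history' is assumed to come from the model.fit call in the earlier CNN sketch.
plt.plot(history.history["accuracy"], label="training accuracy")
plt.plot(history.history["val_accuracy"], label="validation accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.show()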
5 Conclusion Using deep learning techniques and a convolutional neural network classification strategy, this study presents a method for recognizing late blight, early blight, and healthy leaf pictures of the potato plant. According to the study, CNN is the best approach for this type of categorization, with a validation accuracy of 97.27%. This project can be highly beneficial for the agriculture sector, especially for farmers in India who may not be literate and may be unaware of the diseases that affect their crops. The experiments were conducted on both healthy and diseased leaf images, and the proposed method was successful in recognizing the three different types of potato leaf conditions. Overall, this study has the potential to bring about positive changes in the potato growing industry in India.
Fig. 5 Actual images and predicted images with confidence (probability)
6 Future Scope We can use Generative Adversarial Networks (GANs) for data creation and Transfer Learning to improve the accuracy of the model and enhance its resilience. GANs can make the model more tolerant of variations in image size, orientation, and position, while transfer learning can help us create a model that is both robust and exact. Our future goal is to develop an Android application that can accurately detect crop diseases and provide appropriate solutions. We plan to increase the accuracy of our model by expanding our database and utilizing techniques such as GANs and Transfer Learning. Our aim is to create a system that provides instant service and advice to farmers in India by detecting the diseases of their crops. The initiative will be available online, allowing people to diagnose the condition and apply the necessary disinfectant from the comfort of their own homes. We also want to apply the proposed methodologies to other leaf plant recognition applications. To ensure smooth operation, the application interface will be linked to the internet and a database.
References 1. Reddy PR, Divya SN, Vijayalakshmi R (2015) Plant disease detection technique tool a theoretical approach. Int J Innov Technol Res 9193 2. Mahlein A-K, Rumpf T, Welke P et al (2013) Development of spectral indices for detecting and identifying plant diseases. Rem Sens Environ 128:21–30 3. Xiuqing W, Haiyan W, Shifeng Y (2014) Plant disease detection based on near-field acoustic holography. Trans Chin Soc Agric Mach 2:43 4. Mahlein A-K, Oerke E-C, Steiner U, Dehne H-W (2012) Recent advances in sensing plant diseases for precision crop protection. Eur J Plant Pathol 133(1):197–209 5. Sankaran S, Mishra A, Ehsani R, Davis C (2010) A review of advanced techniques for detecting plant diseases. Comput Electron Agric 72(1):1–13 6. Ferentinos KP (2018) Deep learning models for plant disease detection and diagnosis. Comput Electron Agric 145:311–318 7. Sharma P, Berwal YPS, Ghai W (2018) KrishiMitr (farmer's friend): using machine learning to identify diseases in plants. In: 2018 IEEE international conference on internet of things and intelligence system (IoTaIS). IEEE, pp 29–34 8. Islam M, Dinh A, Wahid K, Bhowmik P (2017) Detection of potato diseases using image segmentation and multiclass support vector machines. In: 2017 IEEE 30th Canadian conference on electrical and computer engineering (CCECE). IEEE, pp 1–4 9. Suttapakti U, Bunpeng A (2019) Potato leaf disease classification based on distinct color and texture feature extraction. In: 2019 19th international symposium on communications and information technologies (ISCIT). IEEE, pp 82–85 10. Li Y, Wu H (2012) A clustering method based on K-means algorithm. Phys Procedia 25:1104–1109. https://doi.org/10.1016/j.phpro.2012.03.206
Applications of AI Techniques in Health Care and Well-Being Systems Pankaj Kumar, Rohit, Satyabrata Jena, and Rajeev Shrivastava
Abstract It is becoming increasingly clear that bioinformatics, genomics, and image analysis are three areas of health care where Artificial Intelligence (AI) has been utilised to analyse complex and large volumes of data and provide outputs without requiring human input. Due to numerous safety issues, there may be challenges and hazards in the use of this technology, even though there are also opportunities to improve diagnosis and treatment procedures. This article explores the potential and difficulties of AI in health care and its effects on patient safety. This study shows that, in order to produce safer technology through AI, the key techniques are safe design, safety reserves, fail safes, and procedural safeguards, whilst risk, uncertainty, and cost should be considered for all conceivable technical systems. In order to create and use safer AI applications in the healthcare setting, it is also recommended that certain guidelines and standards be created and communicated to all stakeholders. Keywords Safety · Patient safety · Quality · Artificial Intelligence · Well-being systems
P. Kumar Department of Pharmacology, Adesh Institute of Pharmacy and Biomedical Sciences, Bathinda 151001, India Rohit Department of Pharmacy Practice, I.K. Gujral Punjab Technical University, Kapurthala, Punjab 144603, India S. Jena Department of Pharmaceutics, Bhaskar Pharmacy College, Hyderabad, Telangana 500075, India R. Shrivastava (B) Princeton Institute of Engineering and Technology for Women, Hyderabad, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_41
1 Introduction The medical industry is being transformed by Artificial Intelligence (AI). Analysing the relationships between prevention or treatment plans and patient outcomes is the core goal of AI applications in the healthcare industry. Applications of AI can reduce the cost and time needed for disease diagnosis and treatment, improving the efficacy and efficiency of health care [1]. Large datasets can be completely and swiftly examined with the aid of AI, facilitating swift and accurate decision-making. Usually, Artificial Intelligence is split into two categories: virtual and physical. To assist physicians in the diagnosis and management of disease, virtual Artificial Intelligence leverages informatics from deep learning applications, such as image processing and electronic health records. Mechanical innovations like physical treatment and robotic surgery are examples of physical Artificial Intelligence. In order to process data accurately, algorithms have been created and trained on datasets for statistical applications. Machine learning (ML), which enables computers to precisely predict the future using past experiences, is built on these ideas. Both AI and ML have the ability to advance significantly, but they also run the risk of posing safety concerns that can be detrimental to patients and other healthcare system stakeholders. In addition, ML systems frequently exploit data, including frequently private and highly sensitive data, to learn from and better themselves. As a result, they are more vulnerable to grave problems like identity theft and data breaches. Safety issues are also raised because AI may be linked to poor forecast accuracy [2]. For example, clinical datasets are used to train and evaluate convolutional neural networks (CNNs), which might not be suitable for use on a larger population. According to one study, the general population may have a wider range of skin lesions than the population under surveillance for skin cancer detection; as a result, such an AI system might make forecasts that are unreliable or misleading. The study team presents an overview of the implications of Artificial Intelligence and machine learning for healthcare safety in order to address potential difficulties. The group also discusses the opportunities and difficulties for innovation.
2 Literature Review (i) Artificial Intelligence in Health Care In the healthcare sector, Artificial Intelligence (AI) is defined as machines mimicking human cognitive processes. To build devices that can operate at least as well as, or better than, humans, AI integrates the principles of sensing, recognising, and object identification; AI has also been influenced by how natural neurons work [21]. AI cannot, however, take the place of doctors in the delivery of health care, owing to its inherent limitations in articulation and insight production. As there are no standards that apply to everyone in the healthcare industry, AI frequently needs to be combined with doctor discretion [3]. Each illness state must be diagnosed or monitored using a thorough linkage of
the patient's medical history and clinical observations. Associative and lateral thinking are used to direct the doctor–patient relationship, which has the potential to affect management choices. Moreover, AI does not address the impact of many elements (such as psychosocial and emotional ones) on the course of disease. Although robots are less likely to be biassed and can be more accurate, dependable, and thorough, they cannot replace trust and empathy. The possibility that AI systems, which train continuously and learn via experience, may one day outperform humanity is a growing source of worry. AI will play a significant role in health care, but only if it is used wisely and appropriately. (ii) AI and Safety The boundaries of safety are changing as a result of the application of AI in health care. AI and ML have been used to boost safety by keeping the likelihood of anticipated and unforeseen harms low; risk management is therefore crucial for AI-based systems. With respect to safety and risk reduction, ML applications are often divided into type A (such as medical diagnosis) and type B (such as speech transcription systems) applications [4]. Applications of type A put safety first, whilst applications of type B put risk mitigation first; in type B applications, the epistemic and scientific uncertainty in the model is substantially smaller, and safety is less important since faults occur less frequently. The prevention, diagnosis, and therapy of numerous illness disorders have all benefited from the use of ML, and the safety of the proposed techniques has been characterised in abstract terms according to the illness area and the anticipated results [5]. A machine learning (ML) system was created in a study by Swaminathan et al. to forecast flare-ups and offer at-home decision support for patients with chronic obstructive pulmonary disease. In a validation trial with 101 patients, it triaged patients accurately and in favour of patient safety. Despite 14% of emergency room visits, the algorithm never undertriaged a patient who required clinical attention, whereas doctors undertriaged patients in 22 and 30% of cases for the same reasons. The model was trained using datasets that included physician labelling, and its efficacy was confirmed by comparing its findings with the consensus recommendations of a panel of doctors on an out-of-sample representative patient group. (iii) Possibilities and Difficulties of AI in the Context of Healthcare Safety AI is essential for increasing knowledge and improving results in the healthcare industry. It is used in medicine for a variety of purposes, including disease detection and prediction and the handling of vast volumes of data, improving efficiency and results in the treatment of illness through the analysis of data and the distillation of insights; the diagnosis and classification of retinal abnormalities, malignant lesions, and pneumonia, and the prognosis of sepsis in intensive care, have all benefited from the use of AI [6]. AI ideas have been applied to precision medicine to create precise, secure, and personalised drugs. AI has various advantages in the healthcare industry and the potential to greatly improve ordinary medical practice and research: some of its main advantages are wider outreach, quicker and easier information access, and fewer errors in disease diagnosis and treatment. Focused therapy delivery, precision medicine,
and predictive diagnosis are a few significant fields where AI has made significant strides [7]. In terms of both time and money, virtual consultations and follow-ups are effective. AI-powered telemedicine applications, for instance, provide high-quality care, reduce patient wait times, and reduce the risk of infection when patients visit the hospital; the end effect is high patient satisfaction during treatment. (iv) AI's Effect on Healthcare Quality There are several restrictions that make it difficult for AI-driven solutions to improve health care. Table 1 lists the main concerns regarding the safety of AI in health care and measures to lessen them. These issues will undoubtedly surface at various phases of the application of AI. The relationship between AI-based applications and safety worries is depicted in Fig. 1, whilst Table 1 lists various hazards and mitigation tactics [8, 9]. Each safety concern is further discussed in the sections that follow. Distributional Shift. ML can produce unsafe, out-of-sample predictions. This can happen as a result of changes in illness patterns and traits, application to other populations, and differences between the training and deployment datasets. (v) Methods for Ensuring Safety in AI Applications are made safe for use in health care by factors such as prediction accuracy, the causality of models, the time and effort people invest in classifying out-of-sample examples, and system learning and reinforcement [10].

Table 1 Artificial Intelligence (AI) safety concerns in health care
Safety issue | Elements of hazard | Key steps of mitigation
Distributional shift | Out-of-sample predictions | Training of AI systems with large and diverse datasets
Quality of datasets | Poor definitions of outcomes; nonrepresentative datasets | Build more inclusive training algorithms using balanced datasets, correctly labelled for outcomes of interest
Oblivious impact | High rates of false-positive and false-negative outcomes | Include outliers in training datasets; enable systems to adjust for confidence levels
Confidence of prediction | Uncertainty of predictions; automation complacency | Sustained and repeated use of AI algorithms; transparent and easily accessible AI algorithms
Unexpected behaviours | Calibration drifts | Design and train systems to learn and unlearn and have more predictable behaviours
Privacy and anonymity | Identification of patient data | Define layers of security and rules for data privacy; anonymize data before sharing
Ethics and regulations | Poor ethical standards and regulatory control for development of AI | Build regulatory reforms to support integration of AI in health care
Fig. 1 Worries about safety at various stages of the use of Artificial Intelligence
AI security in health care is addressed through the four key safety engineering solutions of design safety, safety reserves, fail safes, and procedural safeguards. Inherently safe design means that possible risks are eliminated rather than merely mitigated in systems; by ensuring that no samples from the test datasets are used in the training datasets, systems used in the healthcare industry can be made safer. Shifts in the data, however, continue to pose a difficulty, despite the fact that this practice can increase system accuracy [11]. AI applications should have safety reserves that account for the average and maximum test errors and for the uncertainty of the training and testing systems. Systems need to be designed to fail safely, which means that they must remain safe even if they fail to perform as intended: when the model is unable to achieve the desired prediction with confidence, it should be taught to reject it, and predictions can then be made by incorporating human involvement for such rejections. User experience design is one of the procedural safeguards that helps users set up and utilise an application safely.
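As a purely illustrative sketch of the fail-safe principle described above, and not something taken from the cited works, a classifier can be wrapped so that low-confidence predictions are rejected and routed to a clinician. The 0.8 threshold, the logistic regression model, and the variable names X_train, y_train, and X_new are assumptions, and integer class labels are assumed.

import numpy as np
from sklearn.linear_model import LogisticRegression

def predict_with_rejection(model, X, threshold=0.8):
    # Return class predictions, or -1 where the model is not confident enough.
    proba = model.predict_proba(X)
    best = proba.max(axis=1)                         # confidence of the top class
    preds = model.classes_[proba.argmax(axis=1)]     # top class for each sample (integer labels assumed)
    return np.where(best >= threshold, preds, -1)    # -1 marks "defer to a clinician"

# Usage sketch: the model is assumed to be fitted on labelled clinical features.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
decisions = predict_with_rejection(model, X_new)
needs_review = decisions == -1   # these cases are routed to a human reviewer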
3 Artificial Intelligence AI has several subdomains, including machine learning and neural networks. Machine learning enables a machine to learn and improve from experience without being explicitly programmed. Figure 2 shows the several ways in which a model or algorithm can be trained: • Supervised learning. • Unsupervised learning.
Fig. 2 Different types of machine learning
• Reinforcement learning. • Semi-supervised learning. • Supervised Learning This calls for prior knowledge of the algorithm's outputs as well as appropriate labelling of the training data for the model. The algorithm then compares its actual output with the correct outputs, learns from its mistakes, and becomes more efficient [12]. • Unsupervised Learning Unsupervised learning makes use of historical, label-free data. No correlation between the input and the output, or more precisely, no right answer, is provided to the model; the algorithm must be able to learn on its own. Because of its complexity, this method of learning is applied less often than supervised learning, but it allows the user to perform more difficult processing tasks than supervised learning does [13]. Unsupervised methods differ from conventional teaching methods in that they are less predictable. Among the algorithms for unsupervised learning are neural networks, anomaly detection, and clustering; the most common type of exploratory data analysis, which uses cluster analysis to find hidden patterns or groupings in data, relies on unsupervised learning. • Semi-supervised Learning Semi-supervised learning falls midway between supervised and unsupervised learning. It is employed in situations that call for a balance between the two: whereas supervised learning uses labelled data and unsupervised learning uses unlabelled data, semi-supervised learning makes use of both. The model picks up knowledge and patterns from the labelled data and applies them to the unlabelled data. • Reinforcement Learning This kind of learning trains its algorithm using a system of rewards and penalties: the model learns to maximise reward and minimise penalty by being rewarded for performing well and penalised for performing poorly. Artificial Intelligence is computer intelligence as opposed to human intellect [14]. Artificial Intelligence is the practise of having machines mimic human minds as they analyse and learn, and machine learning is the term used to describe this kind of intelligence. The creation of Artificial Intelligence involves both software and hardware; on the software side, AI is an area of algorithms.
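A small illustrative sketch, not drawn from this chapter, of how the first two paradigms differ in practice: a supervised classifier is trained with labels, while an unsupervised clustering algorithm groups the same points without them.

from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Toy data: 300 points in 3 groups; y holds the "true" labels.
X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised learning: the labels y are given to the model during training.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("Supervised predictions:", clf.predict(X[:5]))

# Unsupervised learning: only X is given; the algorithm discovers groupings itself.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("Unsupervised cluster ids:", km.labels_[:5])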
4 Applications of Artificial Intelligence in Health Care Some of the applications of AI in medicine used in the healthcare sector are listed below: (i) AI for Drug Discovery Pharmaceutical companies have benefited from AI technology by being able to conduct drug development more quickly, and it automates the process of target identification. By analysing off-target compounds, AI in health care also aids in drug repurposing. AI-driven drug discovery speeds up the process and reduces repetitive work, and leading biopharmaceutical companies have created a variety of medicines with it. Pfizer is utilising IBM Watson, a machine learning technology, to help it find immuno-oncology treatments [15, 22]. Sanofi has decided to employ Exscientia's AI platform to develop medications for metabolic illnesses, whilst Roche subsidiary Genentech is using an Artificial Intelligence system from Cambridge, Massachusetts-based GNS Healthcare to aid in its search for cancer treatments. Nearly all large biopharmaceutical businesses have such internal or external collaborations. (ii) AI in Clinical Research Newly developed drugs are given to participants in a clinical trial to evaluate how effectively they work; this has traditionally required a lot of time and money, yet the success rate is quite low. Hence, automation of clinical trials has proven helpful for AI and the healthcare sector. AI also helps reduce time-consuming data monitoring, and clinical trials aided by AI manage massive volumes of data and produce highly accurate outcomes [16]. A few of the clinical trial AI applications in health care that are most frequently employed are given here. (iii) Patient Care Patient outcomes in health care are impacted by AI. Medical Artificial Intelligence businesses create systems that are advantageous to patients on all fronts, and clinical intelligence analyses patient medical data and offers insights to help them live better. The systems listed below are important examples of clinical intelligence that enhance patient care [23]. (iv) Stethoscope with AI The ability to obtain readings even in noisy environments, in contrast to traditional stethoscopes, allows for more accurate diagnosis. Anyone can acquire the recordings and send them to the doctor because no training is required to use the digital device (Prabu et al. 2021). This also lowers the risk of contracting COVID-19 and makes it easier to provide better medical care to patients with chronic illnesses in inhospitable regions [17]. Computers can now recognise patterns and anomalies in clinical data related to disease through the use of ML and AI: blood flowing through regular arteries differs from blood flowing around a clot in a blood vessel, and the same logic applies here (Agrawal 2018). (v) Medical Data-Driven Genetics AI The modern healthcare consumer is taking a more active role in their own health care, from genome sequencing to creating a personalised health status using data from fitness/activity monitors. We are accumulating and linking all of these massive data to generate a more
accurate picture of our health or medical state. Data-driven medicine has the ability to accelerate and improve the detection of genetic diseases and open the door to more individualised medical treatment [18].
5 Health Care Nowadays, AI is used in a variety of healthcare applications. It has been used specifically for signal and image analysis, as well as for making predictions about changes in function, including bladder control, epileptic seizures, and strokes. Bladder volume forecasting and epileptic seizure prediction are two examples of typical case studies that are described below. (i) Prediction of bladder capacity Various complications in the patient's health arise when the bladder's functions of storage and urination fail as a result of a spinal cord injury or other neurological diseases, health state, or ageing. Implantable nerve stimulators can now be used to partially restore bladder function in drug-resistant patients. By using conditional neurostimulation, it is possible to increase the effectiveness and security of neuroprostheses. To do this, a bladder sensor that can detect the stored urine is needed as a feedback device, so that electrical stimulation is applied only when necessary [19]. When the bladder needs to be emptied, or when an unusually high residual post-micturition volume remains after an incomplete voiding, the sensor can alert patients with impaired sensation in a timely way. The monitoring algorithms used for the qualitative and quantitative techniques first carry out an offline learning phase, also known as the training phase, followed by real-time monitoring. During the learning phase, the sensor learns or recognises the parameters meant for real-time monitoring. Because the learning phase's algorithms are run offline on a computer linked to the implant via the external unit, we can pick the best ones, irrespective of their complexity and execution time. The learning phase enables us to shift the complexity and hardware burden to offline processing, enabling the real-time monitoring phase to be implemented with less complex but still efficient prediction algorithms and optimised power usage [20]. In the learning period, there are: (1) Digital data processing with non-causal, linear-phase bandpass finite impulse response (FIR) filters.
(2) Finding the afferent neural activity that most closely correlates with bladder capacity and/or pressure. Instead of using the Pearson (linear) correlation coefficient, the Spearman's rank correlation coefficient (ρ) was chosen because it evaluates a monotonic dependence that is not necessarily linear and strengthens the robustness of our estimation method [21]. The Spearman's rank correlation value is calculated using Eq. (1):

\rho_k = \frac{\sum_{i=1}^{n} \left( FR_{i,k} - \overline{FR}_k \right) \left( V_{i,k} - \overline{V}_k \right)}{\sqrt{\sum_{i=1}^{n} \left( FR_{i,k} - \overline{FR}_k \right)^2 \, \sum_{i=1}^{n} \left( V_{i,k} - \overline{V}_k \right)^2}}, \quad (1)
where \rho_k is the Spearman's correlation of the class-k unit; K is the number of classes found; i is the period (bin) counter; n is the total number of timeframes, or bins, used to capture the signals; FR_{i,k} is the firing rate (FR) of unit (i, k) in spikes per second [22]; \overline{FR}_k and \overline{V}_k are the means of all FR and volume bins with regard to class k; and V_{i,k} is the volume for the same bin.

E_{qual} = 1 - OSR = 1 - \frac{\sum_{i=1}^{n} B_i}{n}. \quad (2)
The best (i.e. lowest) qualitative estimation error (E_{qual}) was obtained by scanning the BW at various intervals, and the length was selected using Eq. (2). In order to determine E_{qual} in Eq. (2), we calculated the overall success rate (OSR), which is the ratio of all accurate state classifications to all attempted classifications. The OSR is calculated by dividing B_i, the number of bins for which the predicted state agrees with the true one, by the total number of bins (n). The quantitative volume and pressure estimation was implemented utilising a regression technique model with a small amount of hardware resources, as stated in Eq. (3):

\hat{V} = \sum_{i=0}^{N} (c_i \times BIR_i). \quad (3)
By adding, Eq. (3) can also be used to calculate the pressure V by P . During the learning period, the parameters required for pressure estimation can also be computed. In the phase of real-time monitoring, the following steps are completed: the first step is digital non-causal filtering, which is then followed by on-the-fly spike classification, digital non-causal filtering [24], computing the BIR using the optimal BW [25], comparing the BIR to the baseline, setting the volume to 0 for lower values, and [23] calculating the bladder volume or pressure using Eq. (3) if the BIR is higher. Numerous test runs were conducted to assess and validate our methods, as shown in the depiction in Figs. 3 and 4.
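To make the learning-phase computations above more concrete, the following is a minimal sketch of Eqs. (1) and (3): selecting the neural unit whose firing rate correlates most strongly (in the Spearman sense) with bladder volume, and fitting the linear coefficients of the regression model. The data are synthetic and all variable names are illustrative, not taken from the cited study; the FIR filtering and spike-sorting steps are omitted.

# Minimal sketch of the learning-phase computations of Eqs. (1) and (3).
# Synthetic data; variable names are illustrative only.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_bins = 200                                 # number of time bins (n)
volume = np.linspace(0, 500, n_bins)         # simulated bladder volume per bin (ml)
firing_rates = np.vstack([
    rng.normal(20, 5, n_bins),                    # unrelated unit
    0.1 * volume + rng.normal(0, 3, n_bins),      # volume-correlated unit
    rng.normal(50, 10, n_bins),                   # unrelated unit
])

# Eq. (1): pick the unit with the strongest monotonic (Spearman) dependence on volume.
rhos = [spearmanr(fr, volume)[0] for fr in firing_rates]
best = int(np.argmax(np.abs(rhos)))
print(f"Spearman rho per unit: {np.round(rhos, 3)}, selected unit: {best}")

# Eq. (3): fit V_hat = sum_i c_i * BIR_i by least squares (one unit plus an intercept).
bir = firing_rates[best]                     # bin integration rate (illustrative)
A = np.column_stack([np.ones(n_bins), bir])
coeffs, *_ = np.linalg.lstsq(A, volume, rcond=None)
v_hat = A @ coeffs
print(f"coefficients: {np.round(coeffs, 3)}, "
      f"RMS volume error: {np.sqrt(np.mean((v_hat - volume) ** 2)):.1f} ml")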
Fig. 3 Bladder afferent neural activity recordings (ENG) during a slow filling
Fig. 4 Quantitative volume estimation in simulated real-time data-processing experiments
6 Conclusion Healthcare AI and ML safety measures are still in the early stages of development. Predictions and results based on forecasts have been the main focus of safety of AI in health care thus far. High and low safety systems and applications should be handled in accordance with the necessary procedure. Highly calibrated and effective models for estimating individual risks should be utilised, as well as effective update procedures. All potential applications have definitions for cost, risk, and uncertainty. Algorithms and automated systems should be able to account for uncertainty and unpredictability and respond accordingly. The goal of efforts should be to reduce epistemic ambiguity. A typical difficulty with AI-based learning systems is the need
for a certain quantity of test samples before deployment. Test samples do not always accurately reflect training samples. The "frame problem" can be mitigated by including a human component in AI applications, such as ongoing system calibration based on user feedback, clinician review of atypical datasets, and the inclusion of diverse demographics in training sets. Training is necessary for AI-based systems as well as for doctors, who can receive information-specialist training to advance their careers and help create accurate and dependable AI solutions. It is unclear how much these systems will cost and how they will be distributed. The application of artificial intelligence in the healthcare sector might be encouraged through extensive feasibility studies and cost-effectiveness evaluations. The privacy, sharing, and disclosure of safety data related to AI applications must all be improved, and applications of AI and ML in the healthcare sector should be subject to rigorous validation requirements. It is important to build techniques, policies, and procedures that make it easier for AI and ML to be developed and used in health care. For the full and effective integration of AI into medical research and practice, trust and training will be required.
References 1. Hamet P, Tremblay J (2017) Artificial intelligence in medicine. Metabolism 69S:S36–S40 2. Ba¸stanlar Y, Ozuysal M (2014) Introduction to machine learning. Methods Mol Biol 1107:105– 128 3. Deo RC (2015) Machine learning in medicine. Circulation 132:1920–1930 4. Haenssle HA, Fink C, Schneiderbauer R, Toberer F, Buhl T, Blum A (2018) Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann Oncol 29:1836–1842 5. Jha S, Topol EJ (2016) Adapting to artificial intelligence: radiologists and pathologists as information specialists. JAMA 316:2353–2354 6. Shah NR (2019) Health care in 2030: will artificial intelligence replace physicians? Ann Intern Med 170:407–408 7. Goldhahn J, Rampton V, Spinas GA (2018) Could artificial intelligence make doctors obsolete? BMJ 363:k4563 8. Varshney KR (2019) Engineering safety in machine learning, 4 Sept 2019 9. Varshney KR, Alemzadeh H (2017) On the safety of machine learning: cyber-physical systems, decision sciences, and data products. Big Data 5:246–255 10. Swaminathan S, Qirko K, Smith T et al (2017) A machine learning approach to triaging patients with chronic obstructive pulmonary disease. PLoS ONE 12(11):e0188532 11. Duggal R, Brindle I, Bagenal J (2018) Digital healthcare: regulating the revolution. BMJ 360:k6 12. McCoy A, Das R (2017) Reducing patient mortality, length of stay and readmissions through machine learning-based sepsis prediction in the emergency department, intensive care unit and hospital floor units. BMJ Open Qual 6(2):e000158 13. Bae S-H, Yoon K-J (2015) Polyp detection via imbalanced learning and discriminative feature learning. IEEE Trans Med Imaging 34:2379–2393 14. Rajpurkar P, Irvin J, Zhu K, Yang B, Mehta H, Duan T (2019) CheXNet: radiologist-level pneumonia detection on chest X-rays with deep learning, 4 Sept 2019 15. De Fauw J, Ledsam JR, Romera-Paredes B et al (2018) Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med 24:1342–1350
16. Esteva A, Kuprel B, Novoa RA et al (2017) Dermatologist-level classification of skin cancer with deep neural networks. Nature 542:115–118 17. Seyhan AA, Carini C (2019) Are innovation and new technologies in precision medicine paving a new era in patients centric care? J Transl Med 17(1):114 18. Yellowlees PM, Chorba K, Burke Parish M, Wynn-Jones H, Nafiz N (2010) Telemedicine can make healthcare greener. Telemed J E Health 16:229–232 19. Young K, Gupta A, Palacios R (2018) Impact of telemedicine in pediatric postoperative care [published online 5 Dec 2018]. Telemed J E Health. https://doi.org/10.1089/tmj.2018.0246 20. Phillips-Wren G, Jain L (2006) Artificial intelligence for decision making. In: Gabrys B, Howlett RJ, Jain LC (eds) Knowledge-based intelligent information and engineering systems. Lecture notes in computer science, vol 4252. Springer, Berlin, Heidelberg 21. Begum S, Siddique FA, Tiwari R (2021) A study for predicting heart disease using machine learning. Turk J Comput Math Educ 12(10):4584–4592. e-ISSN 1309-4653 22. Tiwari R et al (2022) An artificial intelligence-based reactive health care system for emotion detections. Comput Intell Neurosci 2022. Article ID 8787023. https://doi.org/10.1155/2022/ 8787023 23. Awantika PM, Tiwari R (2020) A novel based AI approach for real time driver drowsiness identification system using Viola Jones algorithm in MATLAB platform. Solid State Technol 63(05):3293–3303. ISSN 0038-111X 24. Recht M, Bryan RN (2017) Artificial intelligence: threat or boon to radiologists? J Am Coll Radiol 14:1476–1480 25. Jain M, Mohan R, Sachi S, Nigam A, Shrivastava R (2022) A transformative impact on media markets based on media and artificial intelligence. NeuroQuantology 20(10):7570–7576. https://doi.org/10.14704/nq.2022.20.10.NQ5574
Smart Air Pollution Monitoring System Using Arduino Based on Wireless Sensor Networks S. Thaiyalnayaki, Rakoth Kandan Sambandam, M. K.Vidhyalakshmi, S. Shanthi, J. Jenefa, and Divya Vetriveeran
Abstract Impurity levels in air have risen over time as a result of several factors, such as population expansion, increased automobile use, industry, and urbanization. All of these elements harm the health of individuals who are exposed to them, which has a detrimental effect on human well-being. We create an IoT-based air pollution monitoring system that uses an Internet server to track the air quality online. An alert sounds when the level of harmful gases such as CO2, smoke, alcohol, benzene, and NH3 is high enough, or when the air quality drops below a specified threshold, and the air quality is displayed on the LCD in PPM. Keywords Population growth · Air pollution · IoT · LCD
1 Introduction The IoT air and sound monitoring system’s main objective is to combat the rising levels of air and sound pollution in the world today. Air quality must be monitored and preserved for a better future and healthier lifestyles for everyone. The versatility and S. Thaiyalnayaki (B) Department of CSE, Bharath Institute of Higher Education and Research (Deemed to Be University), Chennai, Tamilnadu, India e-mail: [email protected] R. K. Sambandam · J. Jenefa · D. Vetriveeran Department of CSE, SOET, CHRIST (Deemed to Be University), Kengeri Campus, Bengaluru, Karnataka, India M. K.Vidhyalakshmi Department of Computing Technologies, Faculty of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu District, Tamil Nadu 603203, India S. Shanthi Department of CSE, Malla Reddy College of Engineering and Technology, Hyderabad, Telangana, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_42
affordability of the Internet of Things (IoT) are helping it gain popularity. Urbanization and the increase in vehicles on the road have had a big impact on the atmosphere. Pollution can cause everything from minor allergic reactions like throat, eye, and nose irritation to more serious issues including bronchitis, heart disease, pneumonia, lung illness, and aggravated asthma. Concentrations of air pollutants and noise pollution are obtained by monitoring, which can then be evaluated, interpreted, and reported. Then, there are numerous applications for these data. We can assess the level of air and noise pollution on a daily basis by reviewing monitoring data. In big cities, air pollution affects both the environment and people. India’s environmental predicament is rapidly getting worse. Vehicles and industry are the main sources of air pollution, which aggravates respiratory conditions including asthma and sinusitis. Due to the massive amounts of CO2 and other dangerous chemical substance produced from automobiles and industries, the air quality in cities like Kolkata, Delhi, and Mumbai is dreadful [1–4].
2 Literature Review The literature contains extensive records of studies utilizing low-cost air pollution monitoring tools that can be delivered in a variety of vehicles or carried by individuals. The authors of two studies [5, 6] employed an environmental sensing strategy to rekindle public concern and sympathy for pollution. Users of the mobile participatory sensing platform Exposure Sense can monitor their daily activities. In a different piece of writing, most of the authors suggest a cloud-supported system that searches for real-time air quality data using knowledge-based discovery. The information is gathered from monitoring stations dispersed over numerous geographic areas. For monitoring, this system makes use of mobile clients. The Re et al. team created an Android app that allows users to access information on the air quality. This program creates a ubiquitous and unobtrusive monitoring system that is prepared to provide users with recommendations on their daily exposure to air pollution by combining user area data with metropolitan air quality data provided by monitoring stations. Reshi et al.’s VehNode WSN platform allows vehicles to track the amount of pollutants in the smoke they emit. A wireless sensor network (WSN)-based air pollution contamination measurement system was developed by Mujawar et al. for use in Solapur. Microsensor nodes employ the sensing layer’s electrical conductivity to find the target gas. As they come into contact with the sensor, gases decompose on its surface, changing the conductivity. The microcontroller receives data from a semiconductor sensor placed close to the vehicle’s exhaust outlet that measures the amount of pollutants. De Nazelle et al. demonstrated in a different study how environmental sensing instruments may reawaken people’s empathy with and concern for pollution. We expect that this study will be able to convert the cost of early deaths into economic terms that decision-makers can comprehend, enabling more resources to be allocated to enhancing air quality. Physical and ecological capitals are at stake,
Table 1 AQI measuring table

AQI        Health apprehension
0–50       Good
51–100     Moderate
101–150    Unhealthy for sensitive groups
151–200    Unhealthy
201–300    Extremely unhealthy
301–500    Hazardous
as well as basic human well-being, and air pollution impedes economic growth; as one report puts it, "air pollution is a problem that imperils fundamental human well-being, degrades physical and natural capital, and restrains economic growth." By putting the price of early deaths in terms that policymakers can understand, such research may help them commit more funds to improving air quality. According to Laura Tuck, vice president for sustainable development at the World Bank, "We can reduce harmful emissions, halt climate change, and, most importantly, save lives" by promoting healthier cities and making investments in cleaner energy sources. Crop productivity can also be increased by using a wireless sensor network (WSN) made up of many sensors, including temperature and moisture sensors. The Institute for Health Metrics and Evaluation's director, Dr. Chris Murray, asserts that "the report on air pollution is a burden of disease with an urgent call to the government for vital action." The air we breathe is one of the risk factors for early death over which people have the least control, and decision-makers and business executives in the environmental and health sectors are under increasing pressure to address this issue. Table 1 describes the Air Quality Index with reference to health apprehension.
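As a small illustration of the bands in Table 1, the following sketch maps an AQI value to its health-apprehension category. The band edges follow the table above; the function name and range check are purely illustrative.

# Map an Air Quality Index value to the health-apprehension category of Table 1.
AQI_BANDS = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for sensitive groups"),
    (200, "Unhealthy"),
    (300, "Extremely unhealthy"),
    (500, "Hazardous"),
]

def aqi_category(aqi: float) -> str:
    if not 0 <= aqi <= 500:
        raise ValueError("AQI must be between 0 and 500")
    for upper, label in AQI_BANDS:
        if aqi <= upper:
            return label
    return "Hazardous"  # unreachable; kept for clarity

print(aqi_category(42))   # Good
print(aqi_category(180))  # Unhealthy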
3 Proposed Work

• The IoT-Mobair App for pollution management was developed for businesses; it integrates and streamlines environmental activities such as waste reduction, water and energy management, and air emissions analysis. While giving users, especially industries, visibility into the danger of accidents such as chemical spills, oil spills, and inappropriate disposal of hazardous materials, such apps ensure compliance with environmental standards and regulations. Development outsourcing support in fields like IoT is available for these apps, and the overall pipeline is shown in Fig. 1.
• Such applications provide a centralized platform for managing enterprise environmental initiatives.
• They maintain a real-time audit of environmental processes, incidents, and outcomes in accordance with ISO 14001 and other sector-specific standards.
• An alarm goes off when certain thresholds are reached.
• The Air Quality Monitoring app highlights the risks of crossing a threshold, allows outdoor activities for different generations to be timed accordingly, and tracks the application of ISO 14001 and other environmental laws in the user's industry. Its remaining features are listed below.
• Real-time monitoring of environmental processes, as well as auditing of events and results.
• Day-to-day air quality prediction and practical real-time air-purity indicators for a specific city.
• A connection between health issues, geographic areas, and other factors and a reduction in air quality.
• Making air-purity maps.

Fig. 1 The proposed air pollution monitoring system (block diagram: data sensing → pollution-level processing (high/medium/low) → analytical module with historical prediction → cloud → Android application)
The architecture of the proposed system comprises three phases of quality monitoring:

Phase 1: Use sensors to measure the air pollution concentration at the target location.
Phase 2: Create an Android application with a user-friendly interface that lets people assess how polluted their area is.
Phase 3: Use the analytical module to forecast air quality.

Today there are many high-end devices available for weather monitoring 24 h a day, seven days a week, but such systems are designed to monitor the weather in real time for a whole city or state; the cost of maintaining them makes it impractical to set one up for a small region. In the proposed system, the Arduino microcontroller processes the values read from the sensors and saves them in a text file that may then be processed further for analysis, and the readings are also shown on an integrated LCD for ease of viewing. The weather features of a specific area and the weather trend can both be tracked using these measurements [7–9]. These characteristics, which differ from one place to the next, are listed in the database and their values are kept over time; they can be used as input to build a time-series weather chart for a specific place. Predetermined operations are carried out according to the current weather and established criteria, such as turning on the heating system when the temperature is below a set value or turning on the cooling system when the temperature is above a set value and it is hot or humid. The values obtained from the sensors can be entered into a database using the serial output of the Arduino microcontroller (a logging sketch is given below), and the database can then serve as a data source to display values in a stand-alone application or on a website. The sensors used in the weather monitoring system have been carefully chosen for accuracy and work with the Arduino microcontroller. The modules that make up the system are:

• Humidity sensor: reports how humid the air currently is.
• Temperature sensor: measures the temperature and uses it to determine the dew point and heat index.
• Light sensor: measures the amount of light reaching the sensor.
• 16 × 2 LCD panel: presents the readings in real time and serves as the system's human interface.
• Height (altitude) sensor: determines the local altitude and helps establish the atmospheric pressure.
• Atmospheric pressure sensor: reports the air pressure at the moment of measurement [10–13].
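The host-side logging path mentioned above (Arduino prints readings over its serial output, a computer stores them in a database) could look roughly like the sketch below. The port name, baud rate, and the comma-separated line format ("temperature,humidity,ppm") are assumptions, not part of the described system.

# Read the Arduino's serial output and store each reading in a local SQLite database.
# Port name, baud rate, and line format are assumptions.
import sqlite3
import time

import serial  # pyserial

PORT, BAUD = "/dev/ttyUSB0", 9600

db = sqlite3.connect("weather_log.db")
db.execute("""CREATE TABLE IF NOT EXISTS readings (
                  ts REAL, temperature REAL, humidity REAL, gas_ppm REAL)""")

with serial.Serial(PORT, BAUD, timeout=2) as link:
    while True:
        line = link.readline().decode("ascii", errors="ignore").strip()
        if not line:
            continue
        try:
            temperature, humidity, gas_ppm = (float(x) for x in line.split(","))
        except ValueError:
            continue  # skip malformed lines
        db.execute("INSERT INTO readings VALUES (?, ?, ?, ?)",
                   (time.time(), temperature, humidity, gas_ppm))
        db.commit()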
4 Results and Discussion

In this section, the IoT-based air pollution detector kit is built in two steps. In the first step, gas sensors provide the data and are connected to an Arduino board, which sends the collected data to a cloud platform (such as Ubidots) for storage; in the second step, these data are accessed from the Android operating system. For this purpose, methane, carbon dioxide, and carbon monoxide sensors collect data and monitor gas concentrations. The proposed system is then contrasted with existing systems. Although air quality monitoring systems already exist, the suggested approach has a number of advantages: Table 2 shows that the proposed system monitors air pollution better than the current systems at a lower cost. The proposed system is also modular; one of the sensors can be replaced without altering the entire device. Figure 2 shows that the measured air quality level was noticeably lower than previously assumed for these contaminants; after only a few days of measurements it is apparent that the air quality level dropped quickly, because the monitored gases are serious air pollutants.

Table 2 Comparison between existing and proposed systems
Device name | Air pollutant measurement method | Price
Indoor air quality meter—CO2, temperature, and relative humidity | Electrochemical sensor | $129.00
Mini CO2 monitor | IR absorption with gas filter correlation wheel | $109.00
PYLPCMM05-PYLE PCMM05 carbon monoxide meter | IR absorption with gas filter correlation wheel | $227.24
The proposed system | Solid-state sensor; IR absorption with gas filter correlation wheel | $47.23
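Returning to the Phase 1 pipeline described above (sensor readings pushed from the gateway to a cloud platform), a minimal upload step could be sketched as follows. The endpoint URL and token are placeholders and do not correspond to any real cloud API; only the general pattern of posting a JSON reading is illustrated.

# Push one gas reading to a cloud endpoint as JSON. URL and token are placeholders.
import time

import requests

ENDPOINT = "https://example-cloud-platform.invalid/api/v1/devices/air-monitor"
TOKEN = "YOUR-DEVICE-TOKEN"  # placeholder

def push_reading(co2_ppm: float, co_ppm: float, ch4_ppm: float) -> bool:
    payload = {"timestamp": time.time(),
               "co2": co2_ppm, "co": co_ppm, "ch4": ch4_ppm}
    resp = requests.post(ENDPOINT, json=payload,
                         headers={"X-Auth-Token": TOKEN}, timeout=5)
    return resp.ok

if __name__ == "__main__":
    print("upload ok:", push_reading(412.0, 1.3, 2.1))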
Fig. 2 Comparison between air quality smoke and biogas pollutant
A system that monitors the environment's air quality using an Arduino microcontroller and IoT technologies is demonstrated in order to enhance air quality. The IoT improves the monitoring of many environmental factors, including the air quality issue addressed by this proposed system. The MQ135 gas sensor is used to detect a range of hazardous substances, and the Arduino at the centre of the proposed model controls the entire process. The setup is connected to the Internet through a Wi-Fi module, and an LCD displays the results visually. The Automatic Air and Sound Management System is a substantial advancement in addressing the pressing problem of heavily polluted areas. It emphasizes the importance of maintaining a healthy lifestyle while promoting the use of new technologies, and it includes elements that let users check pollution levels via an app on their phones.
5 Conclusion

A system that monitors the environment's air quality using an Arduino microcontroller and IoT technologies has been demonstrated in order to enhance air quality. The IoT improves the monitoring of many environmental factors, including the air quality issue addressed by this proposed system. The MQ135 gas sensor detects a range of hazardous substances, and the Arduino at the centre of the proposed model controls the entire process; the setup is connected to the Internet through a Wi-Fi module, and an LCD displays the results visually. The Automatic Air and Sound Management System is a substantial advancement in addressing the pressing problem of heavily polluted areas. It emphasizes the importance of maintaining a healthy lifestyle while promoting the use of new technologies, and it includes elements that let users check pollution levels via an app on their phones.

Acknowledgements The authors gratefully acknowledge the authorities of Bharath Institute of Higher Education and Research (Deemed to be University), Chennai, Tamil Nadu, and CHRIST (Deemed to be University), Bengaluru, Karnataka, for the facilities offered to carry out this work.
References 1. Sathiyapriya K, Thaiyalnayaki S, Sambandam RK, Anuranjani K, Vetriveeran D (2022) Smart precision irrigation techniques using wireless underground sensors in wireless sensors. In: 8th international conference on advanced computing and communication systems, ICACCS 2022, pp 2072–2077 2. Kaur N, Mahajanand R, Bagai D (2016) Air quality monitoring system based on Arduino microcontroller. 5(6) 3. Sai PY (2017) An IoT based automated noise and air pollution monitoring system. 6(3) 4. Blum J Exploring Arduino: tools and techniques for engineering wizardry, 1st ed. 5. Deshmukh S, Surendran S, Sardey MP Air and sound pollution monitoring system using IoT. 5(6) 6. https://www.tinkercad.com/ 7. https://circuits.io/ 8. https://www.arduino.cc/ 9. https://circuitdigest.com/microcontroller-projects/iot-air-pollution-monitoring-using-arduino 10. Nayak R, Panigrahy MR, Raiand VK, Rao TA (2017) IOT based air pollution monitoring system. 3(4) 11. Guanochanga B, Cachipuendo R, Fuertes W, Salvador S, Benítez DS, Toulkeridis T, Torres J, Villacís C, Tapia F, Meneses F (2018) Real-time air pollution monitoring systems using wireless sensor networks connected in a cloud-computing, wrapped up web services. In: Proceedings of the future technologies conference (FTC) 2018, 18 Oct 2018 12. Hapsari AA, Hajamydeen AI, Vresdian DJ, Manfaluthy M, Prameswono L, Yusuf E (2019) Real time indoor air quality monitoring system based on IoT using MQTT and wireless sensor network. In: 2019 IEEE 6th international conference on engineering technologies and applied sciences (ICETAS), 20–21 Dec 2019. https://doi.org/10.1109/ICETAS48360.2019 13. Chaturvedi A, Shrivastava L (2020) IOT based wireless sensor network for air pollution monitoring. In: International conference on communication systems and network technologies (CSNT), 10–12 Apr 2020. https://doi.org/10.1109/CSNT48778.2020
Design of Voltage Comparator with High Voltage to Time Gain for ADC Applications Ashwith Kumar Reddy Penubadi and Srividya Pasupathy
Abstract In this paper, a two-stage comparator with high voltage-to-time gain, low input-referred noise and offset, and low delay is introduced. The effect of noise is greatly reduced by increasing the gain of the comparator and reducing the load at the input MOS pair. Adding an NMOS input driven from the first-stage output reduces the load at the input pair; a clock-enabled NMOS breaks the supply-to-ground path when the clock is disabled, and a clock-enabled PMOS handles the reset phase, increasing the transconductance of the second stage. For fair comparison, the conventional and proposed comparators are designed in 90-nm CMOS technology with a 1.2 V supply voltage. Simulated results show that the proposed comparator has an input-referred noise of 71.2 µV/√Hz, an input-referred offset of 7.299 mV at Vcm of 600 mV, a delay of 91 ps, and a gain of 94.63 V²/s at a 1 GHz clock frequency. The output is 30% faster than conventional comparators; this improvement suits the criteria for high-precision ADCs, as the noise and offset remain low at the designed frequency across all corners. Keywords Voltage-time gain · Input-referred noise · Input-referred offset · Analog to digital converter · Double tail comparator · Three-stage comparator
1 Introduction In recent technologies, the need for low-power blocks in analog to digital converters with less signal-to-noise ratio has been predominantly high. However, the utilization of any ADC depends on the designed voltage comparator and performance metrics of it. The comparator tends to instability with the wide range of input common mode range. Dynamic comparators are two types: A. K. R. Penubadi (B) · S. Pasupathy Department of Electronics and Communication Engineering, R.V College of Engineering, Mysore Rd, Bengaluru 560059, Karnataka, India e-mail: [email protected] S. Pasupathy e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_43
Fig. 1 Double tail comparator (DTC)
1. Single-stage comparator, where the preamplifier is stacked on the latch.
2. Two-stage comparator, where the preamplifier and the latch are separated into two stages.

High-gain, low-latency comparators suffer from static power dissipation, which is a major drawback. To overcome it, the StrongARM latch and many of its modifications were designed in the late 1990s, but their disadvantage is that the current through the input pair is limited, which leads to lower transconductance, and the common-mode voltage must be at least half of the maximum potential. To optimize the current through the input pair, N parallel current sources, i.e., N parallel-connected NMOS pairs, can be attached to the input MOS pair [1]. The current through the tail source remains constant, but the transconductance of all branches adds up and the current mismatch is amplified by √N times, which changes the offset of the comparator; N = 5 gives the best result in the comparison. In Monte Carlo simulations, the modified version has a 2.28 mV offset, which is 40% lower than the conventional offset, and the reduced delay increases the frequency of operation for smaller differential input voltages.
Fig. 2 Miyahara comparator
Two-stage comparator uses less stacking of transistors and reduces the loading effect on the input pair, the design of the double tail comparator [2] reduces the transistors clumped to the same node shown in Fig. 1. To overcome the effect of noise the input for the second stage is given to the PMOS pair (M5 .∼ M6) in Miyahara’s two-stage comparator shown in Fig. 2 thus making the second stage PMOS transistors M5-M6 in saturation region. Although the optimized Miyahara comparator [3] has better tradeoffs, the mobility of PMOS is less and limits the comparator’s regeneration speed. To improve it further the three-stage comparator [4] uses a different preamplifier to the Miyahara’s two-stage comparator which acts as an inverter, the delay caused by it is significantly less and allows the NMOS input pair to take the input from the second stage instead of PMOS input pair in conventional Miyahara comparator. The modified three-stage comparator uses an extra preamplifier which reduces the offset and increases the regeneration speed this design amplifies the speed by 32% compared to the conventional design. At the corners tt, ss, sf, fs, ff the simulated delay is worst at ss corner for all the cases. Considering the process variation it is understood that the NMOS and PMOS performance will degrade. This design has the best tradeoff between the power consumption, area for lesser delay, lesser offset, and least input inferred noise.
[Transient output waveforms DTC_N_OUT, DTC_P_OUT, MIY_P_OUT, and MIY_N_OUT: voltage (V) versus time (s)]
Fig. 3 Delay calculation of the DTC and Miyahara comparators
Compared with the Miyahara two-stage comparator, the double tail comparator [5] has an additional clock applied to the Vdd tail (M5) to control the supply voltage. There are three phases: evaluation, reset, and regeneration. In the evaluation phase (CLK is 1, CLKB is 0), the input difference is amplified and sent to the second-stage NMOS devices (M10–M11), which drive OUTN and OUTP; in the regeneration phase, even for a small voltage difference, the cross-coupled inverters (M6–M9) settle the outputs to either Vdd or GND. In the reset phase (CLK is 0, CLKB is 1), the supply voltage is cut off in the second stage and M3–M4 turn on, driving M10 and M11 and discharging OUTP and OUTN to reset the comparator. The architectures in [6–8] were considered for low-power applications, but a trade-off in power dissipation is necessary to meet the required specifications. For fair comparison, all suitable comparators from the literature survey [9–13] are designed in 90-nm CMOS technology. The transient simulations of the DTC and Miyahara comparators are plotted in Fig. 3. The delay is measured as the time at which the voltage difference between OUTN and OUTP reaches Vdd/2. The delays are almost identical, but the power dissipated in the Miyahara comparator is 150 µW whereas the DTC dissipates 117 µW, so the power-delay product is lower for the DTC.
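The delay metric just described (the time at which |OUTP − OUTN| first reaches Vdd/2) can be extracted directly from exported transient waveforms. The sketch below uses synthetic waveforms as stand-ins for simulator output; the waveform shapes and the power value are illustrative only.

# Extract the comparator delay from sampled output waveforms.
import numpy as np

VDD = 1.2
t = np.linspace(0, 200e-12, 2001)                 # 0-200 ps time axis
outp = VDD / (1 + np.exp(-(t - 90e-12) / 5e-12))  # rising output (synthetic)
outn = VDD - outp                                  # falling output (synthetic)

def comparator_delay(time_s, v_p, v_n, vdd=VDD):
    """Return the first time where |v_p - v_n| >= vdd/2, or None if never reached."""
    mask = np.abs(v_p - v_n) >= vdd / 2
    idx = np.argmax(mask)
    return time_s[idx] if mask[idx] else None

delay = comparator_delay(t, outp, outn)
power = 117e-6                                     # reported DTC power (W)
print(f"delay = {delay * 1e12:.1f} ps, power-delay product = {power * delay:.3e} J")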
2 Proposed Design 2.1 Circuit Structure The proposed two-stage comparator is a modification of Miyahara comparator and double tail comparator shown in Fig. 4. There are two phases namely evaluation phase and reset phase. The load at input pair MOSFETs (M1.∼M2) are M12 and
Fig. 4 Proposed design
M11 at both phases. In the evaluation phase, CLK is 1 and Vn greater Vp, M13 is turned ON which discharges the drain of M1 at a faster rate than at M2 through M13, M11.∼M12 will reach the cutoff region over the time period. M11 will reach cutoff faster than M12 due to the fact that the M1 transistor will consume more current through M13 as VGS is greater than the M2 transistor. As a result, M5 transistor reaches saturation faster than M6 which increases the voltage at OUTN. M7 and M8 transistors act as switches to discharge the nodes OUTN and OUTP. These switches also cut off the power supply to the ground path in reset phase. As OUTN is at a higher potential M10 turns ON which drops OUTP to GND, and OUTN amplifies and reaches .Vdd as M9 is turned OFF through OUTP. In reset phase, (CLK is 0) M13 is turned OFF and (M3 .∼ M4) turns ON to pass . Vdd to M11 .∼ M12 gate terminal, as soon as M11 .∼ M12 cross the cutoff threshold the gate of M6 .∼ M5 is brought down to 0 which acts as a switch to pass .Vdd , M14.∼M15 turns ON with CLK 0 resets OUTN and OUTP at a faster rate to .Vdd . M7 .∼ M8 cuts off the supply to GND through M5 .∼ M6 for lesser power dissipation.
2.2 Design Consideration The current design is optimized to improve the effect of noise on the comparator and lesser input-referred offset compromising the power dissipation. The noise is improved drastically for the ICMR with improved gain and lesser load at the input
pair. It is necessary to reset OUTN and OUTP to .Vdd to sample the next input faster. The driving strength of the MOS devices is doubled (.W/L is .240 nm/.100 nm) to improve the transconductance of the input pair.
2.3 Equations

Voltage-to-time gain is defined here in terms of the time taken by the output to reach Vdd/2 from the lowest potential. For instance, when Vdd = 1.2 V with a 100 mV differential input and a common-mode voltage (VCM) of Vdd/2, the delay of the proposed design is 42.34 ps, so the gain is 94.63 V²/s, calculated from Eq. (1):

\text{Gain} = 10 \log\!\left(\frac{V_{dd}\, V_d}{\text{Delay}}\right) \quad (1)

where Vdd is the supply voltage (V) and Vd is the differential input voltage (V).
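As a quick worked example of Eq. (1), the helper below evaluates the gain for the quoted operating point. A base-10 logarithm is assumed; with Vdd = 1.2 V, Vd = 100 mV and a delay of roughly 42 ps it comes out near the reported 94.63 V²/s, with the last digits depending on how the delay is rounded.

# Voltage-to-time gain of Eq. (1); base-10 logarithm assumed.
import math

def voltage_time_gain(vdd: float, vd: float, delay: float) -> float:
    """Gain = 10 * log10(Vdd * Vd / delay)."""
    return 10 * math.log10(vdd * vd / delay)

print(f"{voltage_time_gain(1.2, 0.1, 42.34e-12):.2f} V^2/s")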
3 Results and Discussion

This section compares the designed comparator with the existing architectures and their modified versions. Figure 5 depicts the comparison of the existing comparators with the proposed design. The design has two stages, similar to the Miyahara and double tail comparators (DTC), but differs in how the second stage is driven from the first stage. The proposed design has the least delay, a 30% improvement over the double tail comparator (DTC) and 19.52% over the Miyahara comparator. Although the three-stage comparator has almost the same delay as the proposed design, its power dissipation is 543.81 µW, whereas the power dissipation of the proposed design is 437 µW. Across the SS, SF, FS, and FF corners and over the differential input voltage, the delay is the least for the proposed comparator. The delay is maximum in the slow-slow corner because the threshold voltage of the MOSFETs is high; there the power dissipation is reduced by 23% with an increase in delay of 25%. The full comparison is shown in Fig. 6. Figure 7 shows the change in input-referred noise with respect to the common-mode voltage. At Vcm = Vdd/2 the proposed design has the least noise, 77.46 µV/√Hz, an 83% improvement compared with the DTC. The proposed design has a higher AC gain in the second stage than the other two-stage comparators, so the input-referred noise is the least across the ICMR.
[Plot: delay (ps) versus differential input (mV) for the Miyahara, DTC, Proposed, and Three Stage comparators]
Fig. 5 Delay (ps) @ VCM = 600 mV, Vdd = 1.2 V across differential inputs

[Plot: delay (ps) versus differential input voltage (mV) for the Miyahara, DTC, Proposed, and Three Stage comparators at the SS, SF, FS, and FF corners]
Fig. 6 Delay (ps) @ VCM = 600 mV, Vdd = 1.2 V versus differential input (mV) across corners
[Plot: input-referred noise (µV/√Hz) versus common-mode voltage (mV) for the Miyahara, DTC, Proposed, and Three Stage comparators]
Fig. 7 Input-referred noise @ Vdd = 1.2 V across input common-mode voltage Vcm

[Bar chart: input-referred offset (mV) for the Miyahara (9.076), DTC, Proposed (7.299), and Three Stage comparators]
Fig. 8 Input-referred offset @ Vdd = 1.2 V, Vcm = 600 mV
Input-referred offset is defined as the maximum differential voltage that must be applied to the comparator's input terminals to obtain the desired output level. To calculate it, Vref is applied to one terminal and a slow ramp at a maximum slew rate to the other until the desired output is obtained. For instance, at Vdd = 1.2 V, Vcm = 600 mV, and Vd = 1 mV, with a slow ramp of slew rate 58.75 V/ns, the output of the proposed design is obtained after 124.23 ps. The output should settle at the beginning of the clock cycle with Vin = 599 mV and Vref = 600 mV, but due to the delay it settles at 606.299 mV, so the input-referred offset of the proposed design is 606.299 mV − 599 mV = 7.299 mV. The offset is calculated in the same way at Vcm = 600 mV for the existing comparators, as depicted in Fig. 8. Voltage-to-time gain depends
[Bar chart: voltage-to-time gain (V²/s) for the Miyahara, DTC, Proposed, and Three Stage comparators]
Fig. 9 Voltage-to-time gain comparison

Table 1 Performance comparison with recent publications
Performance metrics                    [4]    [9]    M      DTC    3STC   Proposed
Technology (nm)                        65     90     90     90     90     90
Supply voltage (V)                     1.2    1      1.2    1.2    1.2    1.2
Delay (ps) @ Vcm = Vdd/2, Vd = 1 mV    1240   200    105    116    88     91
Input-referred offset (mV)             –      5.22   9.076  7.45   9      7.299
Input-referred noise (µV/√Hz)          400    –      277    125.9  83.71  71.2
majorly on the delay of the comparator calculated from the Eq. 1, the proposed design has a gain of 94.63 .V2 /s compared with existing comparators shown in Fig. 9. Table 1 shows the performance metric comparison of all the existing and modified comparators with respect to the proposed design.
4 Conclusion This paper presents a low noise low offset comparator with the least delay across corners which is well suited for high-precision ADC applications. The additional MOSFETs in the second stage provide more gain which in turn reduces the effect of noise at the input MOS pair. The proposed design provides input-referred noise
514
A. K. R. Penubadi and S. Pasupathy
√ of 71.2 .µV/ Hz with input-referred offset of 7.299 .mV at .Vcm = 600 mV with a delay of 91 .pS at 1 GHz clock frequency. Thus, the designed comparator meets the design specifications for low noise and offset with lesser delay.
References 1. Siddharth RK, Satyanarayana YJ, Kumar YBN, Vasantha MH, Bonizzoni E (2020) A 1-V, 3-GHz strong-arm latch voltage comparator for high speed applications. IEEE Trans Circ Syst II Exp Briefs 67(12):2918–2922. https://doi.org/10.1109/TCSII.2020.2993064 2. Varshney V, Nagaria RK (2020) Design and analysis of ultra high-speed low-power double tail dynamic comparator using charge sharing scheme. AEU Int J Electron Commun 116. Elsevier GmbH. https://doi.org/10.1016/j.aeue.2020.153068 3. Zhuang H, Tang H, Liu X (2020) Voltage comparator with 60% faster speed by using charge pump. IEEE Trans Circ Syst II Exp Briefs 67(12):2923–2927. https://doi.org/10.1109/TCSII. 2020.2983928 4. Zhuang H, Cao W, Peng X, Tang H (2021) A three-stage comparator and its modified version with fast speed and low kickback. IEEE Trans Very Large Scale Integr (VLSI) Syst 29(7):1485– 1489. https://doi.org/10.1109/TVLSI.2021.3077624 5. Bindra HS, Lokin CE, Schinkel D, Annema AJ, Nauta B (2018) A 1.2-V dynamic bias latch-type comparator in 65-Nm CMOS with 0.4-MV input noise. IEEE J Solid-State Circ 53(7):1902– 12. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/JSSC.2018. 2820147 6. Khorami A, Sharifkhani M (2018) A low-power high-speed comparator for precise applications. IEEE Trans Very Large Scale Integr (VLSI) Syst 26(10):2038–2049. https://doi.org/10.1109/ TVLSI.2018.2833037 7. Yeom S, Sim T, Han J (2023) An analysis of CMOS latched comparators. In: 2023 international conference on electronics, information, and communication (ICEIC), Singapore, pp 1–4. https://doi.org/10.1109/ICEIC57457.2023.10049873 8. Chevella S, O’Hare D, O’Connell I (2020) A low-power 1-V supply dynamic comparator. IEEE Solid-State Circ Lett 3:154–157. https://doi.org/10.1109/LSSC.2020.3009437 9. Rezapour A, Shamsi H, Abbasizadeh H, Lee KY (2018) Low power high speed dynamic comparator. In: Proceedings of IEEE international symposium on circuits and systems, May 2018. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISCAS. 2018.8351548 10. Sreya D, Kumar AS, Kalyani P (2022) Dynamic comparator design for high speed ADCs. In: 2022 first international conference on electrical, electronics, information and communication technologies (ICEEICT), Trichy, India, pp 1–5. https://doi.org/10.1109/ICEEICT53079.2022. 9768408 11. Ramamurthy C, Parikh CD, Sen S (2021) Deterministic digital calibration technique for 1.5 bits/stage pipelined and algorithmic ADCs with finite Op-Amp gain and large capacitance mismatches. Circ Syst Sig Process 40(8):3684–3702. Birkhauser. https://doi.org/10.1007/s00034021-01652-6 12. Maciel N, Marques EC, Naviner LAB, Cai H (2018) Single-event transient effects on dynamic comparator in 28 nm FDSOI CMOS technology. Microelectron Reliab 88–90:965–68. Elsevier Ltd. https://doi.org/10.1016/j.microrel.2018.07.114 13. Bandla K, Harikrishnan A, Pal D (2020) Design of low power, high speed, low offset and area efficient dynamic-latch comparator for SAR-ADC. In: Proceedings of 2020 international conference on innovative trends in communication and computer engineering, ITCE 2020. Institute of Electrical and Electronics Engineers Inc., pp 299–302. https://doi.org/10.1109/ ITCE48509.2020.9047792
Intelligent One Step Authentication Using Machine Learning Model for Secure Interaction with Electronic Devices Tharuni Gelli, Rajesh Mandala, Challa Sri Gouri, S. P. V. Subba Rao, and D. Ajitha
Abstract With the enhanced technology every process is being made very simple, especially human interaction with devices is being constantly changed. Many technologies like artificial intelligence, machine learning, blockchain, and deep learning algorithms are helping us in developing secured interactions with the devices. They are also automating the process of authentication with the help of touch less technologies. Through this paper a methodology is proposed which is user friendly to illiterates and which can be proved efficient to overcome the challenges faced by the people because of the least digital literacy. This not only provides a solution for the present but can also help our future generations in handling complex transactions in a simpler manner. In the proposed methodology, real-time machine learning model is implemented which will make the process of one step authentication secure and reliable. With the increase in IoT technologies in future the number of devices interacting with each other will be very high. In that situation this model provides the best solution in terms of implementation as well as simplicity. The results obtained during real-time implementation are presented in this paper. Keywords Artificial intelligence · Machine learning · Automation · Authentication · Reliable · Real-time implementation · Touch less technologies
T. Gelli (B) · R. Mandala · C. S. Gouri · S. P. V. Subba Rao Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Hyderabad, Telangana, India e-mail: [email protected] S. P. V. Subba Rao e-mail: [email protected] D. Ajitha Department of Software Systems, School of Computer Science and Engineering (SCOPE), Vellore Institute of Technology, Vellore, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_44
1 Introduction As authentication is a process used by any device to check who is accessing the information. In this process the computer or the user who tries to use the information must prove their identity. This is to make sure the information is securely handled and it is not accessed by the external people. Authentication and authorisation play a significant role in information management. Encryption is another technique used to safeguard the information. There are basically five common types of authentication, they are: Password-based, multi factor, single factor, biometric, and token-based authentication. Let us focus more on single factor authentication. In single factor authentication methods, a person will try to match any one credential to himself. A common example for this is accessing information through username and password. This password security completely depends on the user who creates his account. He/She must ensure they create a password which cannot be accessed by anyone. This is the reason most of the times system recommends people to create strong passwords. There are many threats to single factor authentication. Among them the major one is social engineering attack called phishing. Another threat to be considered when using touch to interact is smudge attack. The fingerprint oil traces can be used by external people for exploitation of our resources [1]. One must be aware of false emails and fake websites to make sure that they are away from cyberattacks. To ensure high level of data protection isolation is also is used along with encryption [2]. To make this authentication simpler and secure we are implementing a real-time machine learning-based model. For the purpose of face detection and recognition different kinds of algorithms are developed. It is claimed that they are highly accurate. The algorithms can be stated as follows: Principle component analysis, linear discriminant analysis, skin colour-based algorithm, wavelet-based algorithm, and artificial neural network-based algorithm. The limitations of these algorithms are the size and type of database requirement, variations in facial expressions, ability to tolerate illuminations, and variations in poses. Considering these parameters the judgement regarding the best algorithm cannot be made, instead of depending on these algorithms new methods can be implemented which can be a hybrid integration of these algorithms. A. Key Contributions: After considering the challenges faced in implementation of various algorithms used for face recognition the following machine learning model is proposed: (1) Face and iris recognition is being used for the authentication using LBPH algorithm. (2) Automatic user registration is performed using this machine learning model, whenever a new user comes dataset samples are collected in real-time and an ID is assigned to user. (3) No duplication of data is allowed and management of dataset becomes simpler. (4) It is a hybrid model which can handle huge dataset.
2 Literature Review The various authentication methods used in different applications, such as password, biometric, can be compared to understand and analyse the efficiency in each case. On comparing the various existing technologies it is clear that voice-based authentication, password and PIN-based authentication have high level of security. The process of using PIN code or passwords for authentication is the traditional methods [3]. Whereas coming to the usability iris recognition is configured to be prominent along with password and PIN. There are various access management techniques employed to enhance the security. Whenever security systems are designed there is always trade-off between security and user concern [4, 5]. The most commonly used techniques are identity centralisation, access control based on your role in the organisation, multiple checkpoints for identity authentication, limited privileges, automated on boarding process, KBA method of authentication. When we implement identification and authentication, acceptance from the user is considered to be very important. During the implementation there will be many challenges such as data protection, durability, incorporation of techniques and compatibility. There are certain traditional methods used for the process of data protection and security. They are as follows: Authentication with the help of thermal image, position-based authentication, face recognition, iris recognition, and usage of pass codes for protection, implementation of tokens for authentication, biometric, and recognition with the help of hands and veins. Data protection is considered to very important for any organisation. TUI-based fiduciary tags are used for the evolution of authentication based on the criteria of possession [6]. Compared with single factor authentication, multifactor authentication is considered to be safer and secure as it adds another layer of safety. Knowledgebased authentication is considered as a component which enhances the security and addresses the privacy issue. There are certain risks associated with single factor authentication. There is only one identifier in SFA which makes it more vulnerable to attacks. To address this issue and make it more safe and secure we have come up with the real-time implementation of SFA machine learning model. The modern methods used for authentication are majorly classified into three types, they are: Knowledgebased authentication using PIN’s and passwords, inheritance-based authentication using biometrics, and possession-based authentication using tokens. Statistics show that hacker started finding special methods to access the passwords as they became more common forms of security. So many organisations started looking up for new ways for protecting the password-based authentication. To ensure this CAPTCHA is used. It is not a direct form of authentication but it provide means to safeguard the data. There are fifteen different authentication techniques which can be used either as a single factor authentication or a combination of single factor and multifactor authentication. Even though we used this authentication technique they are associated with many threats like key logger and social engineering [7]. They are as follow: Under the criteria of user possession, the techniques such as smart card, cell phone,
password, and tokens are being used. In users knowledge cognitive passwords, PIN, and personal questions are being used. Using facial recognition-based algorithms various process are being automated such as attendance management [8]. Whereas in the criteria of user characteristics, the techniques employed are fingerprint, retina, facial features, and the geometry of hand. In the further sections of this paper proposed methodology is elaborately discussed. In Sect. 3, block diagram and working of model are proposed; results are clearly presented in Sect. 4. In Sects. 5 and 6, conclusion and future scope are clearly depicted.
3 Workflow and Technical Architecture Even though facial recognition is gaining its prominence in many fields to acquire more efficiency certain issues must be addressed [9]. Considering the various challenges faced by the user related to the security aspects of single factor authentication, the below described model is proposed. This model can be termed as intelligent model as it uses a part of artificial intelligence and face recognition using LBPH algorithm. LBPH stands for local binary pattern histogram, it is widely used for recognising a person using his facial characteristics which includes both his/her front and side face. It is considered to be one of the easiest techniques for facial recognition. The main advantage of this algorithm is its ability to show local features in the images and its accuracy is 98%. LBPH is a machine learning algorithm so it can also be considered as an application of AI. As a part of implementation we used Haar cascade classifier in combination with LBPH algorithm. Haar cascade architecture explained briefly as follows. The Haar cascade classifier is a machine learning-based approach used for object detection, particularly for real-time face detection. The architecture of the Haar cascade classifier consists of several stages, each comprising multiple weak classifiers. Here is a generalized overview of the architecture: 1. Integral Image Calculation: The Haar cascade classifier operates on integral images, which are pre-calculated representations of the input image. Integral images enable efficient computation of rectangular features used in the detection process. 2. Haar-like Features: The Haar cascade classifier uses Haar-like features to capture the visual characteristics of objects. Haar-like features are rectangular patterns that represent variations in intensity within the image. 3. Adaboost Training: The classifier is trained using the Adaboost algorithm, which combines multiple weak classifiers into a strong classifier. During training, thousands of positive and negative samples are used to train the classifier iteratively. 4. Cascade of Classifiers: The trained Haar cascade classifier consists of a cascade of stages, each consisting of multiple weak classifiers. Each stage applies a subset
of the total weak classifiers and progressively filters out non-face regions in the image.
5. Integral Image Evaluation: At each stage, the integral image is evaluated using the selected weak classifiers. The weak classifiers compare the Haar-like features' values with predefined thresholds to determine if a region is a face or a non-face.
6. Stage-by-Stage Filtering: The cascade of stages filters out non-face regions efficiently by applying increasingly complex classifiers. Regions that do not meet the criteria in a particular stage are discarded early in the process, reducing the computational load.
7. Scaling and Sliding Window: The classifier operates on multiple scales of the image to detect faces of different sizes. A sliding window technique is employed to scan the image at various positions and scales, evaluating each window using the cascade of classifiers.
8. Detection Decision: If a region passes through all stages of the cascade without being rejected, it is considered a face detection. The classifier can also provide the region's coordinates and possibly additional information, such as the confidence score or face orientation.
The Haar cascade classifier's architecture enables efficient, real-time face detection by combining simple, computationally inexpensive features with a cascade structure that allows early rejection of non-face regions. Compared with other algorithms for face detection, LBP is not widely used [10]. The aim is to make interaction with electronic devices simpler, safer, more comfortable, and user friendly, which also enhances digital literacy. This facial recognition technique can also be implemented in libraries [11]. Facial recognition techniques often produce errors because they also consider common characteristics. In the proposed model, the region from the forehead to the chin is used for authentication, which makes it more reliable. Even though this model uses single-factor authentication, the combination of iris and facial recognition makes the authentication more secure. The proposed methodology is clarified through a block diagram. As shown in Fig. 1, when the user arrives, the camera captures the image and both iris and face recognition are performed using the machine learning trained model. If the user is already registered, the face and iris features are analysed and, if they match the existing data, the user information is fetched. If the user is not registered, the captured image is used to train the machine learning model. Once the model is trained, it analyses the data and fetches the information. After the information is fetched, the user data is displayed and access is provided. The use of a machine learning trained model on both iris and facial data makes this model more reliable and secure than existing single-factor authentication.
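The paper does not list its implementation code; the following minimal Python sketch only illustrates how a Haar cascade detector and an LBPH recogniser are typically combined in OpenCV. It assumes the opencv-contrib-python package and the bundled frontal-face cascade file; the confidence threshold and helper names are illustrative, not taken from the paper.

```python
# Minimal sketch of a Haar-cascade + LBPH pipeline (assumes opencv-contrib-python).
import cv2
import numpy as np

# Haar cascade bundled with OpenCV for frontal-face detection.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
recognizer = cv2.face.LBPHFaceRecognizer_create()  # LBPH recogniser (contrib module)

def detect_face(gray):
    """Return the first detected face region, or None."""
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        return gray[y:y + h, x:x + w]
    return None

def enroll(user_images, user_id):
    """Train the LBPH model with one user's face crops (user_id must be an integer)."""
    samples = [detect_face(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)) for img in user_images]
    samples = [s for s in samples if s is not None]
    recognizer.train(samples, np.array([user_id] * len(samples)))

def authenticate(frame, threshold=60.0):  # threshold is an assumed placeholder
    """Return the recognised user id, or None when detection/recognition fails."""
    face = detect_face(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    if face is None:
        return None                      # wrong positioning: access denied
    user_id, confidence = recognizer.predict(face)
    return user_id if confidence < threshold else None  # lower distance = better LBPH match
```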
Fig. 1 Flowchart of the entire system (user arrives → camera detects and takes the input → check whether the user is already registered → if not, the model is trained and the user is registered; if yes, the system authenticates and retrieves the information → details are displayed → access is given)
4 Modelling and Analysis
The primary objective of this project is to design and implement a machine learning-based authentication model that simplifies the authentication process while enhancing security. The model should enable users to authenticate themselves reliably and seamlessly with electronic devices in a single step, eliminating the need for complex passwords or PINs.
1. Data Collection: A diverse and representative dataset will be collected, consisting of biometric information or behavioural patterns that are unique to each user. This data will serve as the training set for the machine learning model.
2. Feature Extraction: Relevant features will be extracted from the collected data. These features may include fingerprint patterns, keystroke dynamics, facial recognition, or voice characteristics. The choice of features will depend on the specific requirements of the target electronic device.
3. Model Training: A machine learning model will be developed and trained using the extracted features. Various algorithms such as support vector machines (SVM), random forests, or deep learning techniques like convolutional neural networks (CNN) may be explored to create an accurate and robust authentication model.
4. Model Evaluation: The trained model will be evaluated using appropriate performance metrics, such as accuracy, precision, recall, and F1-score. The evaluation will help assess the model's effectiveness and determine if any fine-tuning or adjustments are required.
5. Implementation: The trained authentication model will be integrated into the electronic device's authentication system. The implementation may involve modifying the device's firmware or software to accommodate the machine learning model.
6. Testing and Validation: Extensive testing will be conducted to ensure the effectiveness, reliability, and security of the developed authentication system. Different scenarios and use cases will be simulated to validate the model's performance under various conditions.
7. Performance Analysis: The performance of the authentication system will be analysed based on factors such as speed, accuracy, user experience, and vulnerability to attacks. This analysis will help identify any potential weaknesses and areas for improvement.
8. Security Assessment: A thorough security assessment will be conducted to identify and address potential vulnerabilities or risks associated with the authentication system. Measures will be taken to ensure the system's resistance against various attacks, such as spoofing, replay attacks, or machine learning-based attacks.
The accuracy of the Haar cascade classifier is evaluated by calculating metrics such as true positive rate, false positive rate, precision, recall, and F1-score. These metrics provide insights into the classifier's ability to correctly detect faces while minimising false positives and negatives. The robustness of the classifier is assessed by testing its performance on various challenging scenarios, including variations in lighting conditions, pose, occlusions, and scale. Evaluating the classifier's ability to handle these variations helps identify potential limitations and areas for improvement. Analysing the occurrences of false positives (incorrectly detected faces) and false negatives (missed detections) provides insights into the classifier's strengths and weaknesses. Fine-tuning of the classifier or adjusting detection thresholds may be necessary to reduce false detections and improve overall performance. The scalability of the Haar cascade classifier is examined by evaluating its performance on different input resolutions or in scenarios with a large number of faces. Scalability analysis helps determine if the classifier can handle varying computational requirements and adapt to different hardware constraints. These metrics can be computed from raw detection counts, as shown in the sketch below.
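As a hedged illustration of how these detection metrics could be computed once detections have been matched against ground-truth annotations, the small helper below derives precision, recall, F1-score, and the true/false positive rates from raw counts; the function name and the example numbers are hypothetical.

```python
def detection_metrics(tp, fp, fn, tn=0):
    """Compute the evaluation metrics discussed above from raw detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0     # fraction of detections that are real faces
    recall    = tp / (tp + fn) if (tp + fn) else 0.0     # fraction of real faces that were detected
    f1        = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    tpr = recall                                          # true positive rate
    fpr = fp / (fp + tn) if (fp + tn) else 0.0            # false positive rate (needs negatives)
    return {"precision": precision, "recall": recall, "f1": f1, "tpr": tpr, "fpr": fpr}

# Example: 95 correct detections, 3 false alarms, 5 missed faces, 200 true negatives.
print(detection_metrics(tp=95, fp=3, fn=5, tn=200))
```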
5 Results and Discussion
In this section, the results of the proposed model are discussed in detail, with each image representing a step of the workflow.
In Fig. 2, we can observe how the image is captured by considering facial and iris features; the number of captures can be varied as required to obtain accurate results. We took 100 captures of each individual, which gave us the best accuracy. In the next step, as shown in Fig. 3, the captured image is analysed by the machine learning trained model, and the details of the registered user are fetched and displayed. Figure 4 shows the importance of positioning for authentication. When the user is not positioned properly in front of the camera, it becomes difficult for the trained model to analyse the facial and iris features of the registered user, which results in authentication failure. As a result, access is denied.
Fig. 2 Capturing the image in real time
Fig. 3 Analysing and retrieving the information of registered user
Intelligent One Step Authentication Using Machine Learning Model …
523
Fig. 4 Wrong positioning affects the authentication
After analysing these different scenarios, we examined what happens when multiple users are in the same frame. As shown in Fig. 5, when multiple registered users are in the same frame, the trained model assigns different IDs to them as per their registration, and the details of the user closest to the camera are fetched. Figure 6 shows the display of user details after recognition. When compared with the other existing algorithms, LBPH exhibits an accuracy of 98%, which makes it more reliable than the other algorithms. From the above discussion, it is clear that "intelligent one step authentication using the LBPH algorithm" is more accurate.
Fig. 5 Differentiating multiple users
Fig. 6 Displaying user details after recognition
Mobile video surveillance also uses facial recognition, which is significant in military applications [12]; this underlines the importance of security in this domain [13]. The ticket-checking process can also be automated using digitalised face recognition [14]. Although the LBPH algorithm can also be used for two-factor authentication, the complexity of the system then increases [15].
6 Conclusion
The various methods used for authentication have been discussed, and the advantages and limitations of each method analysed. During COVID-19, facial recognition played a crucial role even in the medical field. This model overcomes the limitations of existing single-factor authentication by including a machine learning trained model that considers both facial and iris features and makes intelligent decisions by employing AI methods. We would further like to extend this authentication process to the financial domain to make it more accessible for people in rural areas. Using the LBPH algorithm, high accuracy can be obtained in single-factor authentication without increasing system complexity.
Authors' Contribution Tharuni Gelli carried out the research work on image processing with the Haar cascade classifier and worked on the accuracy of the system using the Python language; S. P. V. Subba Rao and Dr. D. Ajitha provided the required guidance for the idea implementation. This is just a part of the final implementation of our idea related to the digitalization of financial services.
References 1. Aviv AJ et al (2010) Smudge attacks on smartphone touch screens. In: 4th USENIX workshop on offensive technologies (WOOT 10) 2. Gomez-Barrero M et al (2016) Unlinkable and irreversible biometric template protection based on bloom filters. Inform Sci 370:18–32
3. Kun AL, Royer T, Leone A (2013) Using tap sequences to authenticate drivers. In: Proceedings of the 5th international conference on automotive user interfaces and interactive vehicular applications—AutomotiveUI’13, pp 228–231 4. Braz C, Jean-Marc R (2006) Security and usability: the case of the user authentication methods. In: Proceedings of IHM. ACM 5. Gutmann P, Grigg I (2005) Security usability. Sec Privacy 3(4):56–58 6. Marquardt N, Kiemer J, Greenberg S (2010) What caused that touch? Expressive interaction with a surface through fiduciary-tagged gloves. In: Proceedings of ITS. ACM 7. Sabzevar AP, Stavros A (2008) Universal multi-factor authentication using graphical passwords. In: IEEE international conference on signal image technology and internet based systems (SITIS) 8. Chaudhari C, Raj R, Shirnath S, Sali T (2018) Automatic attendance monitoring system using face recognition techniques. Int J Innov Eng Technol (IJIET) 10(1). ISSN 2319-1058 9. Kutty NM, Mathai S (2017) Face recognition—a tool for automated attendance system. Int J Adv Res Comput Sci Softw Eng 7(6):334–336. ISSN 2277-128X 10. Chanchal AK, Dutta M (2016) Face detection and recognition using local binary patterns. Int J Adv Res Electr Electron Instrum Eng 5(10):7923–7929 11. Upala M, Wong WK (2019) IoT solution for smart library using facial recognition. IOP Conf Ser Mater Sci Eng 495(1). IOP Publishing 12. Khan M et al (2019) Face detection and recognition using OpenCV. In: 2019 international conference on computing, communication, and intelligent systems (ICCCIS). IEEE 13. Hussain T et al (2022) Internet of things with deep learning-based face recognition approach for authentication in control medical systems. Comput Math Methods Med 2022 14. Patil S et al (2021) Digitized railway ticket verification using facial recognition. In: 2021 5th international conference on intelligent computing and control systems (ICICCS). IEEE 15. Halim MAA et al (2021) Face recognition-based door locking system with two-factor authentication using OpenCV. In: 2021 sixth international conference on informatics and computing (ICIC). IEEE
Development of Automated Vending Machine for the Application of Dispensing of Medicines P. Deepak, S. Rohith , D. Niharika, K. Harshith Kumar, and Ram Bhupal
Abstract This paper focuses on the development of an automated vending machine for dispensing medicines, enabling minimal human intervention in accessing medicines. The machine offers a range of medical supplies including prescription medications, first-aid kits, and hygiene products, among others. The prototype uses Arduino microcontrollers, IR sensors, servomotors, and a GPS system to dispense medications to patients securely, effectively, and efficiently. The use of an automatic medical vending machine offers a cost-effective and efficient approach to medicine delivery, reducing the burden on healthcare providers and enhancing patient experience and satisfaction. The prototype is tested for different test cases, and it is observed that the method is a quick and cost-effective solution for the dispensing of medicines. Keywords Arduino microcontroller · Automated vending machine
1 Introduction
Recent advancements in the health sector have attracted researchers to develop automated medicine vending machines that make medicines available in all regions without human intervention. Such a machine is a novel solution that allows patients to purchase medical supplies and over-the-counter medications easily and quickly. It operates similarly to a conventional vending machine, dispensing products based on a selection and payment process [4]. Medical vending machines can be installed in various locations, such as hospitals, clinics, and public areas, and provide patients with easy access to medical supplies. The traditional methods of procuring medicines involve visiting a pharmacy or requesting them from a healthcare provider, which can be time-consuming and inconvenient. Medical vending machines offer an alternative that saves valuable patient time by reducing waiting time, making them an attractive option for individuals with busy
P. Deepak · S. Rohith (B) · D. Niharika · K. Harshith Kumar · R. Bhupal Department of Electronics and Communication Engineering, Nagarjuna College of Engineering and Technology, Bengaluru, Karnataka, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_45
schedules. Furthermore, medical vending machines reduce the workload on healthcare providers, allowing them to focus on more critical tasks. Automated dispensing of medicine by a vending machine is a decentralized medicine-disbursing system that provides dispensing, storage, and tracking of stocked medicine; it is recommended as a potential mechanism to improve efficiency and patient safety and is now widely used in many hospitals [8]. Medical vending machines are equipped with advanced technology, including inventory management systems, which ensure that the machine is always stocked with the required products. They also feature secure payment systems that allow patients to pay for their medical supplies using cash, credit or debit cards, or even mobile payment methods. Overall, medical vending machines are an innovative solution that improves access to medical supplies and enhances patient satisfaction, making them a valuable addition to healthcare facilities. The rest of the paper is organized as follows. In Sect. 2, the literature survey is discussed. The block diagram of the proposed system is discussed in Sect. 3. Section 4 discusses the flowchart of the proposed method. In Sect. 5, results and discussion are given. The conclusion is provided in Sect. 6.
2 Literature Survey
Many researchers have proposed methods to dispense medicines through an automated process. In [4], an Any Time Medicine vending machine was developed to access medicines. A microcontroller was used to develop the vending machine; it supported cashless payment by scanning an RFID card and dispensed over-the-counter medicines. In [1], an automated machine for dispensing medicine was developed using distribution systems that provide computer-controlled storage and dispensing. A PIC microcontroller was used for processing and a motor for dispensing the medicine, while RFID was used for authentication and payment. In [2], to reduce manpower and time, an All Time Medicine (ATM) counter for medicine self-dispensing was developed. The method works similarly to an ATM and dispenses tablets, for example for blood pressure, diabetes, cold, fever, and headache; RFID is used for authentication. In [3], a medicine-dispensing method based on a Raspberry Pi and an Arduino microcontroller was discussed. The authors developed a system to dispense a limited number of medicines that do not require any recommendation from a doctor or medical practitioner, using a 16-bit PIC microcontroller for processing. The microcontroller, with the assistance of the motor drivers, drives the cabinet holding the medication that the user desires. In [6], a microcontroller- and motor-based system to dispense medicines was developed. The availability of medicine was monitored remotely, and based on that information the machine could be refilled proactively.
In [10], a PIC microcontroller-based automated medicine dispensing machine was developed. A dispenser box with drawers was used to store the medicine, and the drawers were moved with the help of a stepper motor. An NFC technique was used for authentication and data storage. Initially, the user had to place the tag in front of the RFID reader, and after successful authentication the user was asked to enter the number of tablets; based on the user inputs, the medicine would be dispensed. However, this machine was suitable only for tablet dispensing. In [9], a medicine vending machine system was developed to obtain over-the-counter (OTC) medicine at any time; the system provided access to first-aid medicine and tablets that do not require any prescription. In [11], the design and implementation of a smart medicine dispensing machine for home care were discussed. The authors developed a medicine dispenser system for home care to improve medication adherence among patients with chronic conditions. In this work, an automated vending machine for dispensing medicines is developed. The method offers storage, tracking, and dispensing of medicines in a user-friendly manner. The medicines include first aid, hygiene products, tablets, etc. The prototype uses Arduino microcontrollers, IR sensors, motors, and a GPS system to dispense medications to patients securely, effectively, and efficiently. In addition, an automated sanitizer, which helps sanitize the hands to avoid the spread of diseases, is included. Further, the system helps provide ambulance service in emergency situations. The prototype is tested for different test cases and found to be quick and cost-effective.
3 Proposed System
The block diagram of the proposed system is discussed in this section. The entire system is divided into two sections: an automatic medical dispensing system with GSM technology and an automatic hand sanitizer. Figure 1 shows the block diagram of the proposed automated medicine vending machine. It includes an Arduino microcontroller, LCD, keypad, motor driver, three motors, RFID module, GSM module, and IR sensor. The main idea of the proposed work is automation of medicine dispensing. The patient has an RFID tag for the prototype, which acts like an ATM card or Aadhaar card. First, the user places the card in front of the RFID reader; the reader then reads the tag and deducts the balance from the card. If the balance is insufficient, the LCD displays that a sufficient amount is not available and asks the user to show a valid card. If the user shows a valid card, the amount is deducted from the tag, and the patient or user is allowed to select the desired medicine or "Mask" from the keypad by pressing the corresponding switch. The keypad is provided for user interaction, so based on the requirement the end user presses the switch, and an LCD display shows the masks and drugs available for a particular kind of ailment. This input is then used to coordinate the dispensing mechanisms.
Fig. 1 Block diagram of the proposed automated medicine vending machine
Based on the keypad input, the corresponding dosage and the corresponding medicine or mask are vended out, and the voice recorder and playback circuit module gives the voice command "please collect the medicine". If the user wants medicine for an illness such as a headache that does not need detection, it is vended out based on the input through the keypad/touchpad. After confirmation, the system processes the request and sends the information to the motor driver. The motor driver in turn drives the motor of the cabinet containing the requested medicine. The motor drivers play an important role in controlling the motor rotation to dispense from the cabinet containing the medicine: the motor rotates the disk attached to the cabinet, and the medicine arrives at the outlet. Hence, motor drivers play an important role in medicine dispensing. An option is also provided to call an ambulance operator near the area: if the patient presses that particular switch on the keypad, the ambulance operator receives a notification message with the pre-defined location of the vending machine. Apart from this, there is a separate unit in the module which consists of an automatic hand sanitizer with a refilling mechanism. Before any interaction with the module, the user needs to first sanitize their hands, and the unit can also be used as a normal sanitizer. Moreover, if the level of the liquid is low, it is automatically refilled with the help of a water motor.
4 Flowchart of the Proposed Model
The prototype consists of an Arduino microcontroller, a 16 × 4 LCD, a 4 × 4 keypad matrix, DC motors, etc. An MFRC522 RFID reader mounted on the top of the module acts as an ATM card or Aadhaar card reader. An RFID smart card is used as input to read the user information and for the payment process. The keypad helps the user select from the medicine list, so the user provides the list of items they require through the keypad. The microcontroller then checks the payment and the authenticity of the RFID card. Figure 2 shows the flow diagram of the proposed method. The user has to enter the data manually through the keypad. The medicine dispensing is fully automated and controlled by the motor drivers; based on the user input, the medicine is dispensed. The process starts with "START" and involves the following steps. First, the user either scans their identification card or manually enters their information into the machine to proceed to the next step, "Enter information". Next, the machine checks the database to verify the user's identity and eligibility to purchase medical products ("Check database for verification"). After the user is verified, they can select the desired medical product from the machine's available options ("Choose product"). Once the user has selected the desired product, the machine checks its inventory to ensure that the selected product is in stock and available for dispensing ("Check stock availability"). If the selected product is available, the machine dispenses it ("Dispense product"). After the product is dispensed, the user confirms that they have received the correct product before leaving the machine ("Confirm dispensing"). Lastly, the process ends with "END".
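The prototype firmware runs on the Arduino and is not reproduced in the paper; purely as an illustration of the decision flow just described (card verification, balance check, stock check, dispensing, confirmation), the following Python mock-up walks through the same steps. All tag IDs, item names, prices, and stock levels are hypothetical placeholders.

```python
# Illustrative Python mock-up of the dispensing flow (not the Arduino firmware).
STOCK  = {"1": ("Paracetamol", 20), "2": ("Mask", 50), "3": ("First-aid kit", 5)}  # hypothetical
PRICES = {"1": 10, "2": 5, "3": 40}                                                # hypothetical
CARDS  = {"0xA1B2": 100}                                                           # RFID tag -> balance

def dispense(tag, choice):
    if tag not in CARDS:                         # verification against the database
        return "Show a valid card"
    name, qty = STOCK.get(choice, (None, 0))
    if qty == 0:                                 # stock availability check
        return f"{name or 'Item'} out of stock"
    if CARDS[tag] < PRICES.get(choice, 0):       # balance check before dispensing
        return "Sufficient amount is not there"
    CARDS[tag] -= PRICES[choice]                 # deduct the amount from the tag
    STOCK[choice] = (name, qty - 1)              # rotate the motor / update inventory
    return f"Please collect the medicine: {name}"

print(dispense("0xA1B2", "1"))   # -> "Please collect the medicine: Paracetamol"
```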
5 Results and Discussions
This proposed work has presented, to our knowledge, the machinery and technology involved in the most common vending machines present all over the world. It helps to increase efficiency by lowering dependence on manpower. The desired outcome is achieved as per the user's requirements in the form of medicines dispensed by the machine. We also learn about the functioning of various components, including the RFID module, microcontroller, motor drivers, etc. All of these have contributed greatly to improving our knowledge about the functioning and performance of a vending machine. Figure 3 shows the hardware connections of the proposed model. In this work, the implementation of an automated medicine vending machine using an Arduino microcontroller for applications in remote areas is discussed. It improves efficiency by reducing labour cost. The machine is tested for different test medicines
Fig. 2 Flow diagram of proposed method
Fig. 3 Hardware connections of the model
and it is found that they can be dispensed by the machine based on the user's choice. The prototype comprises RFID, microcontrollers, motor drivers, motors, and a keypad interface. Figures 3 and 4 depict the hardware connections of the proposed model, and Fig. 5 shows the final proposed model. This work has given good exposure to the automation process of the vending machine. The proposed implementation is tested for different test medicines, and it is observed that the method is efficient in lowering manpower. The interfacing of the RFID module, Arduino microcontroller, and motor drivers is depicted in Fig. 4. In this proposed work, a GSM module and automatic hand sanitizer are also included, which provide easy access to medical supplies, particularly in remote or underserved areas where medical facilities may be limited. The machine could also reduce the need for face-to-face interactions between people and medical professionals, potentially reducing the spread of infectious diseases. The automatic hand sanitizer feature could also help to prevent the spread of germs and bacteria, promoting overall hygiene in the medical setting. However, there are also potential drawbacks to consider. For example, such a machine may not be able to provide the level of personalized care and attention that patients may receive from a healthcare professional. Additionally, the use of a GSM module may pose security and privacy risks, as personal information is transmitted wirelessly.
Fig. 4 Hardware connections of the proposed model
Fig. 5 Final proposed model
Overall, an automatic medical vending machine using a GSM module and automatic hand sanitizer may provide convenience and accessibility for patients, but it should be implemented with careful consideration of potential risks and limitations.
Case 1: Successful implementation of the vending machine. In this case, the automated vending machine is successfully implemented and effectively dispenses medicines to patients. This would likely result in increased convenience and accessibility of medications for individuals with chronic conditions, as they would be able to obtain their prescribed medications easily and quickly.
Case 2: Technical issues with the vending machine. In this scenario, technical issues with the vending machine could arise, such as malfunctioning or errors in dispensing the correct medication. This would likely result in decreased patient satisfaction and trust in the vending machine as a reliable source for obtaining medications.
Case 3: Regulatory and legal challenges. Regulatory and legal challenges could arise in the development of an automated vending machine for dispensing medicines, for example issues related to the safe storage and dispensing of medications, as well as concerns around patient privacy and data security. Overcoming these challenges could result in the successful implementation of the vending machine, but failure to address them could lead to legal and ethical concerns.
6 Conclusion
In this paper, a novel automatic medicine dispensing machine with an automatic hand sanitizer is implemented. The proposed system helps to improve the availability of basic medicines for fever, headache, and so on in remote areas. In this work, an Arduino microcontroller is used with a keypad and motor interface. The proposed design is tested for different test medicines. The results show that the prototype is efficient and can set a future trend for improving the health of remote areas and addressing pandemic diseases such as COVID-19.
References 1. Sangeetha M, Rao TVJ, Gowri ChSR (2016) Automatic medicine vending system-medical ATM. Int J Sci Eng Dev Res 1(10):14–17 2. Malashree G et al (2017) ATM (All Time Medicine) counter for medicine self-dispensing. Int J Latest Technol Eng Manage Appl Sci 1(4):40–41. ISSN 2278-2540 3. Tank V, Warrier S, Jakhiya N (2017) Medicine dispensing using raspberry pi and Arduino controller. In: Published at proceedings of IEEE conference on emerging devices and smart systems (ICEDSS 2017), 3–4 Mar 2017, Mahendra Engineering College, Tamilnadu, India 4. Goel P, Bansal S (201) Health ATM any time medical-help. In: International conference on life science and technology, vol 3, pp 105–108 5. Fitriyah H, Widasari ER, Setiawan E, Kusuma BA (2018) Interaction design of automatic faucet for standard hand wash. In: MATEC web of conferences 6. Penna M, Gowda DV, Jijesh JJ, Shivashankar (2017) Design and implementation of automatic medicine dispensing machine. In: 2017 2nd IEEE international conference on recent trends in electronics, information and communication technology (RTEICT) 7. Agrawal K, Jain J, Shah V, Chauhan G, Surani P (2021) Design and functioning of automated medicine dispensing module. Int J Eng Adv Technol 10(4). ISSN 2249-8958 (Online) 8. Oundhakar S (2017) Automatic medicine vending machine. IJETSR 4(12) 9. Paruvathavardhini J, Bhuvaneswari S, Kavitha C, Mythily A (2021) Automatic vending machine. Int J Eng Res Technol 9(10) 10. Kadam S, Kale A, Nimase P, Padwal S, Khandare S (2016) Automated medicine dispensing machine. IJETSR 4(3) 11. Kwon SH, Kim JS, Cho HK (2013) A design and implementation of smart medicine dispensing machine for home care. In: 2013 IEEE international conference on consumer electronics (ICCE), Las Vegas, NV, USA, pp 143–144 12. Supreeth S, Patil K, Patil SD, Rohith S, Vishwanath Y, Prasad KSV (2022) An efficient policybased scheduling and allocation of virtual machines in cloud computing environment. J Electr Comput Eng 2022:1–12. Hindawi Limited, 24 Sep 2022. https://doi.org/10.1155/2022/588 9948 13. Supreeth S, Patil KK (2019) Virtual machine scheduling strategies in cloud computing—a review. Zenodo, Sep. 2019. https://doi.org/10.5281/ZENODO.6144561
Design of a Low-Profile Meander Antenna for Wireless and Wearable Applications S. Rekha, G. Shine Let, and E. John Alex
Abstract This proposed work discusses the design of a printed meander antenna for Radar, mobile, and Wireless Local Area Network (LAN) applications. The meander antenna operates in a single narrow band from 4.29 to 4.31 GHz. The designed antenna has a compact profile, with a total dimension of 4 mm × 4 mm × 0.05 mm. It is built on an RT/Duroid 6010 substrate having a relative permittivity of 10.2 and a dielectric loss tangent of 0.0023. The proposed antenna is designed and simulated with the help of a 3D full-wave simulator, namely Ansys HFSS. The performance parameters of the proposed antenna, i.e., reflection coefficient, VSWR, and radiation pattern, are plotted. The antenna shows a reflection coefficient of − 21.8 dB at 4.3 GHz. Moreover, the size of the antenna is compact and it satisfies the SAR limits. Hence, the proposed antenna can be employed in wearable devices. Keywords Meander · Compact · Wireless applications · SAR · Wearable devices
1 Introduction An antenna is a significant part of any wireless communication system. As recent electronic devices are compact and mobile, there is a huge demand for miniaturized antennas [1, 2]. The researchers have developed many techniques and methods to achieve miniaturization. These miniaturized antennas require a small area to be S. Rekha (B) Department of ECE, Nalla Narasimha Reddy Group—School of Engineering, Secunderabad, Telangana, India e-mail: [email protected] G. Shine Let Department of ECE, Karunya Institute of Technology and Science, Coimbatore, Tamil Nadu, India e-mail: [email protected] E. John Alex Department of ECE, CMR Institute of Technology, Secunderabad, Telangana, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_46
placed in an electronic device. Miniaturization techniques are divided into two categories [3], i.e., topology-based and material-based miniaturization. A few techniques used to implement miniaturized antennas are meander lines, fractals, engineered ground planes, metamaterials, and substrates of high dielectric constant. The conductors in a meander line antenna are folded back and forth [4–6] to create a compact antenna with a reduced length; with this method, the antenna can be shrunk by 25–50% from its original size. Printed antennas have gained popularity due to their ability to reduce the space, weight, and production costs needed, and the development of printed antenna features during the last 20 years has been explored. The meander line approach has been used in commercial products and telemetry applications, although the high quality factor Q limits electrically small antennas in typical situations. It is known from the literature that a reduction in antenna size leads to deteriorated antenna performance in terms of bandwidth, gain, and efficiency [7], although enhancement techniques can be deployed to improve gain and other parameters [8]. Such antennas are therefore applicable in environments in which only a narrow bandwidth is required. These compact antennas are widely used in biomedical applications such as capsule endoscopy and implantable antennas, provided the substrate is flexible or biodegradable [9]. In this paper, a printed meander line antenna operating in the 4.3 GHz band [10] is discussed. According to FCC standards, the 4.3 GHz frequency band is applicable to Radar, mobile, and Wireless LAN systems. The antenna is developed on an RT/Duroid 6010 substrate having a relative permittivity [11] of 10.2 and a dielectric loss tangent of 0.0023. Section 2 discusses the antenna design and equivalent circuit. Section 3 investigates the performance of the antenna. Section 4 concludes the paper.
2 Design of Antenna
2.1 Schematic of the Proposed Antenna Design
A meander line is used as the radiator to develop the proposed antenna for the 4.3 GHz band. Multiple folded strips are employed to tune the frequency of operation. The proposed antenna resonates at 4.3 GHz (operating range 4.29–4.31 GHz); this frequency finds uses in Radar, mobile, and WLAN systems. The top view, side view, and color indicator of the proposed antenna are given in Fig. 1a–c, respectively. The overall dimension of the antenna is 4 mm × 4 mm × 0.05 mm. A full ground plane is used on the bottom surface. The top radiator and bottom ground are made of a thin sheet of copper with a thickness of 0.035 mm. The substrate is a Rogers material, namely RT/Duroid 6010, and its thickness is significantly small, i.e., 0.05 mm. This material has a high permittivity of 10.2 and a low dielectric loss tangent of 0.0023, which is one of the key factors in achieving miniaturization. The dimensions of the antenna are given in Table 1. Printed meander structures of width (d2) 0.4 mm are used for the
entire antenna design. A minimum spacing of 0.4 mm is maintained between any two meander lines. Three horizontal strips (d3) of width 0.2 mm are inserted in the meander lines to fine-tune the operating frequency. The antenna dimensions are good enough to avoid mutual coupling between the meander lines. This antenna can be fabricated using PCB technology. The radiator is excited using a 50-Ω SMA connector in real time. However, a 50-Ω lumped port excitation is assigned from the ground to the top radiator during simulation.
2.2 Equivalent Circuit of the Proposed Antenna
An electrical equivalent circuit is drawn for the proposed antenna design using AWR Microwave Office v15. The passive elements, namely resistor (R), inductor (L), and capacitor (C), are used to represent the equivalent circuit as shown in Fig. 2. The excitation is given through a microstrip during simulation, which is represented by Rf in the equivalent circuit. Meander block 1 is indicated by the parallel combination of R1, L1, and C1. Meander block 2 is equivalent to the series combination of R2 and L2. Meander block 3 is represented by the parallel combination of C3, L4, and R4. Meander block 4 is equivalent to the parallel combination of R3, L3, and C2. The equivalent electrical circuit is simulated by assigning suitable values for R, L, and C. The resultant S11 plot is compared with the reflection coefficient plot of the proposed design as indicated in Fig. 3.
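As a rough numerical illustration of how such an equivalent circuit can be evaluated, the sketch below computes |S11| of a single parallel RLC resonator fed through a series resistance against a 50-Ω reference. The component values are placeholders chosen only to resonate near 4.3 GHz; they are not the values fitted in AWR, and the actual equivalent circuit contains four meander blocks rather than one.

```python
# Sketch: |S11| of a simplified parallel-RLC equivalent circuit (placeholder values).
import numpy as np

Z0 = 50.0                          # reference impedance of the lumped port
Rf = 5.0                           # feed resistance (placeholder)
R, L, C = 45.0, 1e-9, 1.37e-12     # parallel RLC chosen to resonate near 4.3 GHz

f = np.linspace(4.0e9, 4.6e9, 601)            # frequency sweep in Hz
w = 2 * np.pi * f
Y = 1 / R + 1 / (1j * w * L) + 1j * w * C     # admittance of the parallel RLC block
Zin = Rf + 1 / Y                              # feed resistance in series with the resonator
S11 = (Zin - Z0) / (Zin + Z0)

fr = f[np.argmin(np.abs(S11))]
print(f"Resonance ~ {fr/1e9:.2f} GHz, min |S11| = {20*np.log10(np.abs(S11).min()):.1f} dB")
```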
3 Result and Analysis 3.1 Performance Analysis of Meander Antenna It is noted that the antenna is resonating at 4.3 GHz with a − 10 dB impedance bandwidth of 20 MHz (operating frequency 4.29–4.31 GHz) as represented in Fig. 4a. The antenna shows a minimum bandwidth, and it is useful in most applications to avoid interference with other frequencies. The VSWR of the antenna is plotted and the operating range (4.29–4.31 GHz) falls between 1 and 2 as indicated in Fig. 4b. The antenna is analyzed in the far-field conditions to find parameters such as radiation pattern and gain. The radiation/elevation pattern (all values of θ ) is plotted as shown in Fig. 5a. The radiation pattern is like the figure of eight at φ = 0° and 90°. It is noted that there is an angle deviation of 60° between the azimuth planes φ = 0° and 90°. Similarly, the azimuth radiation pattern is plotted for all angles of φ at θ = 0° and θ = 90° as indicated in Fig. 5b. At θ = 0°, the pattern is omnidirectional whereas at θ = 90°, the pattern is directional.
Fig. 1 Schematic of the proposed design: a top face, b side face, c color indicator

Table 1 Parameters of the proposed design
S. No | Design parameters | Dimensions (in mm)
1     | L                 | 4
2     | W                 | 4
3     | d1                | 0.1
4     | d2                | 0.4
5     | d3                | 0.2
Fig. 2 Proposed antenna—equivalent circuit model
Fig. 3 Reflection coefficient comparison using HFSS and AWR
Fig. 4 a Simulated reflection coefficient and b simulated VSWR
3.2 SAR Analysis for Wearable Applications
The antenna is compact in structure, and hence SAR testing is performed to assess its suitability for wearable applications. A wearable antenna is one that operates in proximity to the human body. Human body tissues possess high permittivity and conductivity, which allows electromagnetic radiation to penetrate the body. Hence, SAR testing is necessary before real-time implementation.
Fig. 5 a Simulated antenna elevation pattern ( indicates ϕ = 0°, indicates ϕ = 90°), b simulated antenna azimuth pattern ( indicates θ = 0°, indicates θ = 90°)
Specific Absorption Rate (SAR) determines the amount of electromagnetic wave penetration in the tissues of the human body. It is determined using the standard formula (1) given below.

$$\text{SAR} = \frac{\sigma |E|^2}{\rho} \;\; \left(\frac{\text{W}}{\text{kg}}\right) \quad (1)$$
where σ is the conductivity of the body tissue in S/m, E is the electric field in V/m, and ρ is the density of the tissue in kg/m³. The human body layer is modelled in Ansys HFSS by assigning the average properties of the different tissues, as shown in Figs. 6 and 7. The material characteristics, namely relative permittivity, conductivity, density, and thickness, are displayed in Table 2. According to the Federal Communications Commission (FCC), the SAR value should not exceed 1.6 W/kg averaged over 1 g of tissue. The proposed antenna is tested by placing it over the modelled human body at a distance of 8 mm. It is observed that the maximum average SAR is 0.00047 W/kg, which is much lower than the standard limit.
Fig. 6 Human body layer—a modeling in simulation
Fig. 7 SAR analysis of the proposed meander antenna
Table 2 Characteristics of different layers of the human body [12]
Material characteristics | Skin  | Fat  | Muscle | Bone
Relative permittivity    | 37.95 | 5.27 | 52.67  | 18.49
Conductivity (S/m)       | 1.49  | 0.11 | 1.77   | 0.82
Density (kg/m³)          | 1001  | 900  | 1006   | 100,813
Thickness (mm)           | 2     | 5    | 20     | 13
Hence, the proposed antenna is suitable for wearable applications.
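As a quick sanity check of Eq. (1), the snippet below evaluates the point SAR for the skin layer of Table 2; the electric-field magnitude used is an assumed example value, not a field exported from HFSS.

```python
# Point-SAR check using Eq. (1): SAR = sigma * |E|^2 / rho  (W/kg).
sigma = 1.49      # skin conductivity from Table 2 (S/m)
rho   = 1001.0    # skin density from Table 2 (kg/m^3)
E     = 1.0       # assumed local E-field magnitude (V/m), illustrative only

sar = sigma * E**2 / rho
print(f"SAR = {sar:.5f} W/kg")   # ~0.00149 W/kg per (V/m)^2 of field in skin
```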
4 Conclusion A compact printed meander antenna is discussed in this research paper. The proposed antenna resonates at 4.3 GHz having a − 10 dB impedance bandwidth of 20 MHz. The antenna is simulated on RT/Duroid 6010 substrate having a high dielectric constant. It is a narrow-band antenna that finds its applications in Radar, mobile, and WLAN. As the antenna shows good SAR performance, it is also suitable for wearable applications.
References 1. Feng Y, Li Z, Qi L et al (2022) A compact and miniaturized implantable antenna for ISM band in wireless cardiac pacemaker system. Sci Rep 12:238 2. Wang G, Xuan X, Jiang D, Li K, Wang W (2022) A miniaturized implantable antenna sensor for wireless capsule endoscopy system. AEU Int J Electron Commun 143:154022. ISSN 1434-8411 3. Fallahpour M, Zoughi R (2018) Antenna miniaturization techniques: a review of topology- and material-based methods. IEEE Antennas Propag Mag 60(1):38–50. https://doi.org/10.1109/ MAP.2017.2774138 4. Best SR (2003) A comparison of the resonant properties of small space-filling fractal antennas. IEEE Antennas Wirel Propag Lett 2:197–200 5. Galehdar A, Thiel DV, O’Keefe SG, Kingsley SP (2007) Efficiency variations in electrically small, meander line RFID antennas. In: Proceedings of the IEEE antennas and propagation society international symposium (AP-S’07), June 2007, pp 2273–2276 6. Gunamony SL, Rekha S, Chandran BP (2022) Asymmetric microstrip fed meander line slot antenna for 5.6 GHz applications. Mater Today Proc 58:91–95 7. Sten JC-E, Hujanen A, Koivisto PK (2001) Quality factor of an electrically small antenna radiating close to a conducting plane. IEEE Trans Antennas Propag 49(5):829–837 8. Kazuki I, Satoshi I, Takeshi F (2010) Gain enhancement of low-profile, electrically small capacitive feed antennas using stacked meander lines. Int J Antennas Propag 2010. https://doi. org/10.1109/ICCCAS.2010.5581891 9. Shawkey H, Elsheakh D (2020) Multiband dual-meander line antenna for body-centric networks’ biomedical applications by using UMC 180 nm. Electronics 9(9):1350 10. Rekha S, Jino Ramson SR (2022) Parasitically isolated 4-element MIMO antenna for 5G/ WLAN applications. Arab J Sci Eng 47:14711–14720 11. Roy AK, Basu S (2017) Miniaturized broadband dielectric resonator antenna for X and Ku band. In: 2017 devices for integrated circuit (DevIC), Kalyani, India, pp 407–409. https://doi. org/10.1109/DEVIC.2017.8073980 12. Rekha S, Let GS (2022) Design and SAR analysis of wearable UWB MIMO antenna with enhanced isolation using a parasitic structure. Iran J Sci Technol Trans Electr Eng 46:291–301
Machine Learning-Based Prediction of Cardiovascular Diseases Using Flask V. Sagar Reddy, Boddula Supraja, M. Vamshi Kumar, and Ch. Krishna Chaitanya
Abstract Health care is an inescapable task in a human's life. Cardiovascular diseases (CVDs) are amongst the most prevalent causes of death globally, costing the lives of around 17.9 million people each year. CVDs include heart and blood vessel abnormalities that encompass problems such as coronary heart disease, brain disease, rheumatic heart disease, and more. These deaths can be reduced by early detection and treatment of cardiac problems. The present study compares the performance of various machine learning methodologies such as SVM, KNN, and decision tree in terms of accuracy. For the prediction of cardiovascular diseases, we take inputs such as BP, cholesterol, glucose, current smoking, cigarette count, BMI, and age. After implementing these methods, KNN gave the best accuracy in the detection of cardiovascular diseases (CVDs), achieving about 80%. We then developed a user-friendly website where users can check whether they are at risk of heart-related diseases by entering parameters such as glucose, cholesterol, BP, and so on. Keywords Machine learning · Cardiovascular diseases · Flask · Real time · Algorithms · Accuracy · Prediction
1 Introduction
Machine learning has many applications and is used in many different industries, and this is also true in the healthcare industry. It can play a crucial role in predicting the existence or absence of cardiovascular illnesses and other conditions. Such information, if predicted in advance, can give clinicians valuable insight, allowing them to tailor their diagnosis and therapy accordingly. The heart is one of the body's primary organs. It propels the flow of blood through the
V. Sagar Reddy · B. Supraja (B) · M. Vamshi Kumar · Ch. Krishna Chaitanya Department of Electronics and Communication Engineering, VNR VJIET, Hyderabad, Telangana, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_47
body's arteries and veins. The circulatory system plays a vital role in transporting blood, oxygen, and other chemicals to all of the body's cells and tissues. It is safe to say that the heart is the most vital organ in the cardiovascular system, and problems with the heart's function can have serious consequences, even death. Since the healthcare industry generates huge amounts of health information, machine learning methods have become essential for accurately predicting heart disease. Recent studies have concentrated on combining these approaches to create hybrid machine learning algorithms. In this paper, data preprocessing measures include the removal of noisy information, the removal of insufficient information, the replacement of default values where suitable, and the categorization of characteristics for the purposes of making predictions and decisions at various tiers. The efficacy of a diagnostic model may be evaluated in several ways. Results from the support vector machine, decision tree, logistic regression, and k-nearest neighbour models are presented, along with an accurate cardiovascular disease prediction model.
Cardiovascular Disease Types
Heart disorders are a collection of illnesses that affect the heart and blood vessels and are generally referred to as cardiovascular diseases (CVDs). Coronary artery disease (CAD), which includes symptoms such as chest pain and heart attacks, is a leading form of cardiovascular illness. The excessive accumulation of a waxy substance known as plaque in the coronary artery walls causes a related form of heart disease known as coronary heart disease (CHD). These arteries carry the oxygen-rich blood that the heart's muscles require. Atherosclerosis occurs when plaque builds up in the arteries; it might take years for plaque to develop. Over time this plaque can harden or rupture (break open). Coronary arteries gradually become more constricted because of plaque calcification, decreasing the amount of oxygenated blood reaching the heart. If the plaque ruptures, a blood clot may form on its surface, and a large clot can completely block normal blood flow through a coronary artery. Ruptured plaque also contributes to the hardening and narrowing of the coronary arteries. If the blood supply to the heart is disrupted, the cardiac muscle begins to die unless flow is quickly restored. A heart attack may lead to serious consequences or even death if it is not treated immediately; worldwide, heart attacks are a leading killer.
2 Literature Review
In [1], Bora et al. discussed how well support vector machines, decision trees, and linear regression work in predicting cardiac disease. The authors calculated the accuracy of these strategies using the UCI repository dataset for training and testing. The accuracy obtained is high because the dataset contains only 303 patients' records. Kumar et al. [2] suggested machine learning approaches for cardiac disease prediction, such as random forest, logistic regression, support vector machine (SVM), and
k-nearest neighbours (KNNs). Their maximum accuracy was 85% when utilising the random forest algorithm, 74% when using logistic regression, and 77% when using SVM. Using k-nearest neighbours (KNNs), they obtained the lowest accuracy of 68%. They had to use sampling strategies because the dataset they used was unbalanced; however, they did not filter the dataset before applying machine learning methods. A grid search method created by Budholiya et al. [3] is used to optimise the suggested diagnostic system. In their testing, the Cleveland dataset, a database which contains information on heart failure, was used. Using just seven characteristics, accuracy improves by 3.3%, and the suggested method produces a result that is more effective and simpler than the typical random forest method. Al Bataineh and Manacek [4] proposed a heart disease prediction system using the multi-layered perceptron (MLP) algorithm. This method provides people with a prediction of how much they are at risk of developing CAD. The machine learning algorithms available have expanded significantly as a result of recent technical advances, and the authors selected MLP for their proposed model because of its efficiency and accuracy. Bindhika et al. [5] developed an approach for using machine learning to identify critical characteristics that improve the precision of cardiovascular disease prediction. To demonstrate the prediction model, an extensive variety of feature combinations and widely recognised classification algorithms are implemented. Sarker et al. discussed how machine learning is useful for making findings and predictions from the large amount of information gathered throughout the years by the healthcare industry. Random forests (RFs), artificial neural networks, support vector machines, decision trees, k-nearest neighbour, and the Naive Bayes approach are amongst the machine learning methods utilised in the prediction of cardiovascular disease.
3 Proposed Method
The suggested system uses data that, based on its characteristics, classify people as having cardiac disease or not. The system can utilise these data to develop a model that attempts to predict (via data reading and data exploration) whether a patient has this ailment. Depending on the functionality of the patient's heart and its significant characteristics, the information is organised and grouped into numerous structured datasets. The information is first extracted from the data library and imported. Feature engineering follows exploratory data analysis: various methods are utilised to evaluate the data, and repeated characteristics are eliminated. Multiple computational methodologies are then applied to the preprocessed data to make predictions with high efficiency and precision. For data processing, we employed machine learning methods, including support vector
Fig. 1 Flowchart of the proposed method
machines and regression models [2, 3, 6–9]. Figure 1 depicts the flow of the algorithm, from taking a dataset to determining whether the patient has cardiovascular disease.
3.1 Dataset
A dataset consisting of data from more than 4000 individuals was utilised for training the model. The dataset has 14 attributes, and each attribute carries a potential risk factor. Attributes are grouped into three categories: demographic, behavioural, and medical history. Demographic attributes include sex, age, and education. Behavioural attributes include current smoking status and cigarettes per day. Medical history includes BP medications, diabetes, total cholesterol, prevalent stroke, prevalent hypertension, systolic blood pressure, diastolic blood pressure, BMI, heart rate, and glucose. Table 1 lists the attributes taken into account for predicting the development of cardiovascular disease [2, 3, 6–9].
Table 1 Dataset for cardiac diseases
S. No. | Attributes                 | Description
1      | Sex                        | Male or female
2      | Age                        | Patient's age
3      | Current smoker             | Yes/no (1 or 0)
4      | Cigarettes per day         | Number of cigarettes consumed each day
5      | Blood pressure medications | BP medications
6      | Prevalent stroke           | Whether the person had ever had a stroke
7      | Prevalent hypertensive     | Whether or not the person is hypertensive
8      | Diabetes                   | Whether person ever had diabetes or not
9      | Total cholesterol          | Level of overall cholesterol
10     | Systolic blood pressure    | Systolic blood pressure (in mmHg)
11     | Diastolic BP               | Diastolic blood pressure (in mmHg)
12     | Glucose                    | Glucose level
13     | BMI                        | Body Mass Index
14     | Heart rate                 | The patient's pulse
3.2 Preprocessing
Preprocessing is vital in enhancing data quality. It helps maintain data consistency by eliminating duplicates and abnormalities, normalising the data for comparison, and increasing the accuracy of the outcomes. Machines understand numbers, specifically binary ones and zeros, so in preprocessing the raw, unstructured data are converted into structured data suitable for building and training a machine learning model. As a consequence, the model relies heavily on the data preprocessing step. In this step, unprocessed or unorganised information from the cardiovascular disease dataset is extracted and cleaned, and metadata is handled by eliminating unnecessary numerical transformations; as an outcome, the data help to improve training. The dataset is then passed on, anything not needed is removed from the list of components, and the data are split into training and testing sets. Preprocessed data can benefit from scikit-learn's train_test_split function, which divides the input information into training and test segments according to the proportions provided in the source code. The split between testing and training data is 20% and 80%, respectively. A minimal sketch of this step is shown below.
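The following minimal preprocessing sketch uses pandas and scikit-learn; the CSV file name and the target-column name are assumptions made for illustration, not details given in the paper.

```python
# Sketch: load, clean and split the dataset (file and column names are assumed).
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("heart_data.csv")          # assumed file name for the 4000+ records
df = df.drop_duplicates().dropna()          # remove duplicates and incomplete records

X = df.drop(columns=["TenYearCHD"])         # assumed name of the target column
y = df["TenYearCHD"]

# 80% training / 20% testing, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
print(X_train.shape, X_test.shape)
```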
3.3 Feature Selection
The feature elimination procedure begins after Exploratory Data Analysis (EDA) and the data preprocessing described above. Feature selection strategies involve limiting the number of input variables used in model building. This approach is used to construct the best prediction model by removing repetitive characteristics from the preprocessed data and selecting only the key attributes. Some predictive modelling problems involve many factors, which can cause significant delay, and building and training a model then takes up a lot of memory. Using fewer input variables simplifies modelling and improves model performance. We examined multiple models corresponding to a variety of selected attributes using various statistical parameters and found that the suggested model performed best in disease prediction.
3.4 Training the Classifiers
Python 3.7 was used to analyse the data in a Jupyter Notebook for further classification. Machine learning techniques are then employed to classify the preprocessed data, and the classification algorithms are compared to determine which gives the best overall accuracy.
3.5 Classification
Classification is a predictive modelling problem in which a category label is predicted for a particular instance of input data: each label value is assigned to a particular class, and a given example is identified as one type or the other. The classification step makes use of algorithms such as SVM, KNN, logistic regression, and decision tree.
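Continuing the sketch above, the snippet below trains and compares the four classifiers and saves the best one for the Flask front end described in Sect. 3.6; the hyperparameters are scikit-learn defaults rather than the paper's tuned settings, and X_train, X_test, y_train, y_test come from the earlier split.

```python
# Sketch: train and compare the four classifiers on the prepared split.
import joblib
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNN":                 make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM":                 make_pipeline(StandardScaler(), SVC()),
    "Decision Tree":       DecisionTreeClassifier(random_state=42),
}

best_name, best_model, best_acc = None, None, 0.0
for name, model in models.items():
    model.fit(X_train, y_train)                        # X_train/y_train from the split above
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {acc:.2%}")
    if acc > best_acc:
        best_name, best_model, best_acc = name, model, acc

joblib.dump(best_model, "best_model.pkl")              # saved for the Flask app (Sect. 3.6)
print("Saved best classifier:", best_name)
```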
3.6 Saving the Best Classifier The best classifier is determined after the testing and model evaluation process. The best model is saved for later use, and a real-time web application is built around it using Flask. A key aspect of developing machine learning systems is sharing the model with others: regardless of how many models we produce, if they stay offline only a few people will ever see what we have built. That is why the models should be made available so that anybody can interact with them through a user-friendly User Interface (UI). We developed
a single-page web-based application utilising Flask as the user interface for this system. It takes the user's inputs and predicts whether the user faces a risk of chronic cardiac disease within 10 years.
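A minimal way to persist the chosen classifier so that the web application can reuse it (joblib is one common choice, continuing the earlier sketch; the file name is arbitrary):

```python
import joblib

# Save the best-performing classifier (here assumed to be the KNN model) to disk.
joblib.dump(models[best_name], "best_heart_model.joblib")

# Later, inside the web application, it can be reloaded with:
loaded_model = joblib.load("best_heart_model.joblib")
```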
3.7 Deployment in Flask Flask is a small web framework written in Python. Since it does not need any additional frameworks or libraries, it is commonly termed a microframework. It has no database abstraction layer, form validation, or other functionality for which pre-existing third-party libraries already provide common features. Deployment using Flask involves the steps shown in Fig. 2. Deploying a machine learning model means integrating it into a pre-existing production environment that can receive input from users and provide an output. This article discusses the development and deployment of a system for detecting the onset of heart disease using machine learning methods such as logistic regression, SVM, KNN, and decision tree [10–12]. It is worth noting that the algorithm (KNN) and the web framework (Flask) utilised in this project have the potential to greatly aid in decreasing the death rate amongst individuals with cardiovascular illnesses.
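The paper does not list its Flask routes, so the following is only a schematic single-page app in the same spirit; the form fields (only two are shown for brevity), their order, and the file names are assumptions rather than the authors' implementation:

```python
from flask import Flask, request, render_template_string
import joblib
import numpy as np

app = Flask(__name__)
model = joblib.load("best_heart_model.joblib")   # classifier saved earlier

FORM = """
<form method="post">
  Age: <input name="age"> Systolic BP: <input name="sysBP">
  <input type="submit" value="Predict">
</form>
<p>{{ result }}</p>
"""

@app.route("/", methods=["GET", "POST"])
def predict():
    result = ""
    if request.method == "POST":
        # Collect the attribute values in the same order used during training.
        features = np.array([[float(request.form["age"]),
                              float(request.form["sysBP"])]])
        risk = model.predict(features)[0]
        result = "At risk within 10 years" if risk == 1 else "Not at risk"
    return render_template_string(FORM, result=result)

if __name__ == "__main__":
    app.run(debug=True)
```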
Fig. 2 Deployment by using flask
4 Results This study concentrates on applying a few classification algorithms and contrasting their results. The dataset was split into training and testing sections in an 80/20 ratio. To predict cardiovascular disease, the classification methods SVM, decision tree, logistic regression, and KNN were employed. The confusion matrix is utilised to pinpoint labelling or prediction errors. To compare estimated and observed data, it has four elements: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN); a False Positive corresponds to a Type-I error and a False Negative to a Type-II error. The confusion matrix is used to calculate Precision, Recall, F1-score, and Accuracy.

Accuracy = (TP + TN) / Total Predictions   (1)

Accuracy indicates the proportion of correctly predicted values.

Precision = TP / (TP + FP)   (2)

Precision shows the share of actual positive instances among all the positive forecasts.

Recall = TP / (TP + FN)   (3)

Out of all the actual positive cases, recall describes the values that were accurately identified.

F1-score = 2 * Precision * Recall / (Precision + Recall)   (4)

The F1-score is the harmonic mean of precision and recall and is used alongside them to assess test accuracy. Confusion matrices of all the classifiers are shown in Fig. 3. The performance of each of the four algorithms is listed in Table 2. The comparative analysis in Fig. 4 makes it clear that, with 80% accuracy, the KNN algorithm provided the best forecast. Deployment Results For the real-time deployment, we used Flask. After deployment, as shown in Fig. 5, the application asks for the values of the attributes; the machine learning model then predicts whether a person is at risk of developing a cardiovascular disease.
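These quantities can be read off a confusion matrix directly; a short hedged sketch with scikit-learn, reusing the predictions from the training sketch above:

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

y_pred = models[best_name].predict(X_test_sel)

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)    # Eq. (1)
precision = precision_score(y_test, y_pred)   # Eq. (2): TP / (TP + FP)
recall = recall_score(y_test, y_pred)         # Eq. (3): TP / (TP + FN)
f1 = f1_score(y_test, y_pred)                 # Eq. (4): harmonic mean of precision and recall
print(accuracy, precision, recall, f1)
```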
Fig. 3 Confusion matrices of SVM, logistic regression, decision tree, and KNN classifiers
Table 2 Accuracy of various algorithms
Algorithm name        Accuracy (%)   Precision (%)   F1-score (%)   Recall (%)
SVM                   65             26.6            37.7           64.5
KNN                   80             42              50.3           62.8
Decision tree         71.7           36.3            53             100
Logistic regression   66.7           26.9            37.5           62.0
5 Conclusion Early recognition can help to prevent the disease and slow its development. Machine learning technology can assist with prompt diagnosis and the identification of crucial causative factors. The suggested approach creates a machine learning model capable of predicting cardiovascular diseases and cardiac events, and the KNN algorithm is the optimum answer for this task.
Fig. 4 Comparison of accuracy
Fig. 5 Deployment results
According to the model, the latest machine learning algorithms typically result in greater accuracy in forecasting. Existing strategies are analysed and compared in order to find effective and precise techniques. Machine learning helps to improve the accuracy of heart attack prediction significantly, permitting individuals to be identified at the beginning of the disease process so that they can receive preventative treatment. It can be claimed that there is a great chance for advancement in the application of machine learning algorithms to forecast cardiovascular disease and other heart-related disorders. All the strategies performed well under all test conditions. We were able to reach an accuracy level of 80% while simultaneously lowering the processing time by using the K-nearest neighbour technique. In the future, the efficiency needs to be evaluated on a range of datasets, and additional state-of-the-art AI algorithms will be necessary to validate the prediction accuracy.
References 1. Bora N, Gutta S, Hadaegh A (2022) Using machine learning to predict heart disease. WSEAS Trans Biol Biomed 19:1–9 2. Kumar NK, Sindhu GS, Prashanthi DK, Sulthana AS (2020) Analysis and prediction of cardio vascular disease using machine learning classifiers. In: 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 2020, pp 15–21, doi: https://doi.org/10.1109/ICACCS48705.2020.9074183 3. Budholiya K, Shrivastava SK, Sharma V (2020) An optimized XGBoost based diagnostic system for effective prediction of heart disease. J King Saud Univ Comput Inform Sci 34:4514– 4523. https://doi.org/10.1016/j.jksuci.2020.10.013 4. Al Bataineh A, Manacek S (2022) MLP-PSO hybrid algorithm for heart disease prediction. J Pers Med 12(8):1208. https://doi.org/10.3390/jpm12081208 5. Bindhika GSS, Meghana M, Reddy M, Dharmadurai R (2020) Heart disease prediction using machine learning techniques. 2395-0056 6. Azhar MA, Thomas PA (2019) Comparative review of feature selection and classification modeling. In: 2019 International Conference on Advances in Computing Communication and Control (ICAC3), pp 1–9 7. Gavhane A, Kokkula G, Pandya I, Devadkar K (2018) Prediction of Heart Disease using Machine Learning. In: 2018 SecondInternational Conference on Electronics Communication and Aerospace Technology (ICECA), pp 1275–1278 8. Katarya R, Srinivas P (2020) Predicting heart disease at early stages using machine learning: a survey. In: 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC) 9. Patel J, Upadhyay P, Patel D (2016) Heart disease prediction using machine learning and data mining technique. J Comput Sci Electron 7:129–137 10. Mohan S, Thirumalai C, Srivastava G (2019) Effective heart disease prediction using hybrid machine learning techniques. IEEE Access 7:81542–81554 11. Borkar S, Annadate MN (2018) Supervised machine learning algorithm for detection of cardiac disorders. In: 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA). IEEE 12. Dinesh KG et al (2018) Prediction of cardiovascular disease using machine learning algorithms. In: 2018 International Conference on Current Trends towards Converging Technologies (ICCTCT). IEEE
Network Traffic Analysis using Feature-Based Trojan Detection Method R. Lakshman Naik, Sourabh Jain, and B. Manjula
Abstract Malware attack cases are steadily rising in both the private and public sectors. Existing approaches analyze the behavior features of Trojans based on host characteristics and code semantics, but these techniques have several drawbacks when examining the network behavior, characteristics, and network traffic of numerous common Trojans. In this study, the Random Forest, Naïve Bayes, and Decision Tree algorithms are used to train the Trojan detection model. The classification and performance evaluation are carried out using WEKA. A model is used to extract the features of Trojan behavior and communication and then to locate and capture Trojan traffic. The accuracy is up to 98%. When the proposed Trojan detection model is compared with different machine learning algorithms, the experiments show that it is beneficial and effective for detecting Trojans. Keywords Traffic analysis · Trojan detection · Network behavior analysis · Machine learning
1 Introduction Trojans are more covert and more difficult for users to notice when running in the background than other security threat programs. The robust concealment of a Trojan also increases its risk. It differs from many other virus-type programs that damage the user's computer system or files: instead, it compromises the user's privacy by gathering and exfiltrating the user's private information. Some personal computers turn into "bot hosts"
R. L. Naik (B) · S. Jain Department of CSE, IIIT, Sonepat, Haryana, India e-mail: [email protected] B. Manjula Department of CS, Kakatiya University, Warangal, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_48
while under the influence of a botnet Trojan, and such computer devices can then be used to launch comparable minor attacks, posing additional safety risks [1]. The Trojan problem has drawn a lot of attention from researchers as a result of the proliferation of Trojans and the prevalence of Trojan attacks. Alongside the emergence and replacement of Trojans, Internet technology is rapidly updating and developing. Since the Internet is now interconnected on a worldwide scale, we should pay great attention to Trojan attacks and carry out research to promote the Internet's healthy growth. Cybercriminals carrying out cyberattacks can be either internal or external to the organization; accordingly, cybercrime is classified into the following two categories. Insider attack: An insider attack is committed by a user with authorized system access who attacks the computer system and network. Mostly, insider attacks are committed by contractors or by dissatisfied internal employees, and the main motive is greed or retaliation. If the insider attacker is well aware of the IT architecture, procedures, security rules, and their flaws, then the attacker can easily carry out a cyber-attack; because of this, an insider attacker can easily bring down the network and steal important data. External attack: This type of attacker may be employed by a party inside or outside the organization. The target of this kind of cyber-attack is not only reputational damage but also financial loss. As external attackers are outsiders to the organization, they often scan the network and steal data.
1.1 Types of Cyberattack A cyber-attack is an attack initiated from one or more computers against a network, a computer, or a group of computers. Cyberattacks are divided into two types: the first aims to access the computer's data by stealing administrator rights, while the second aims to take down the target computer. To fulfill their objectives, such as getting access to computers, data, or networks, cybercriminals use a variety of newly emerging technical techniques, and some of them may overlap. Malware: Malware is malicious software which, regardless of how it is executed, is designed to damage a server, client computer, network, or any type of infrastructure without the knowledge of the user. Trojans, viruses, and worms are types of malware; they differ from one another in how they spread and reproduce. This attack grants root access to the attacker, through which the attacker can take remote control of a system and may disable the machine or network. Phishing: Through this technique, the hacker sends fraudulent emails or communications that appear to come from a reputable source to a target victim to trigger some hazardous action. The receiver is tricked into entering confidential information such as bank transaction OTPs, telephone numbers, usernames, and passwords by being led to a phony website, or is tricked into clicking a link that downloads malware disguised as an important document. Some phishing emails are written specifically for high-value targets in an attempt to persuade them to give up valuable information, but many are instead basic and are sent to thousands of potential victims [2].
Ransomware: Ransomware is a type of malware that encrypts a victim's files. The attacker demands a ransom, typically hundreds to thousands of dollars in cryptocurrency, and promises the victim access to the data again once it is paid; the hacker also provides payment instructions for obtaining the decryption key. Denial-of-Service (DoS): This type of attack attempts to bombard a website with fictitious requests in an effort to force it to respond to them, preventing legitimate users from accessing it. By preventing civilians, military, security professionals, or research organizations from accessing crucial websites, this kind of attack has the potential to interfere with crucial activities and systems. Man-in-the-middle attack: This is a technique used by attackers to secretly intercept the traffic between the user and a web application and to access the conversation. For example, the attacker creates a fake login page, captures the login credentials once a user logs in, and replays them on the original site. Crypto jacking: This is a specialized assault in which the hacker installs malware on the victim's device or runs JavaScript code in the victim's browser to mine cryptocurrency. Its motive is profit, and it is designed so that the hacker remains completely hidden from the victim. SQL injection: This is a hacking technique in which the attacker executes malicious SQL code to gain access to the victim's database; SQL commands are injected into a web application to extract confidential information from victims. In this way, the hacker can add, modify, and delete records in the database and may gain unauthorized access to the victim's sensitive data. This type of attack affects web applications that use SQL databases. Zero-day exploits: Unpatched software attacks are referred to as zero-days; the name comes from the number of days the software developer has known about the problem. It is a software-related attack that exploits a weakness unknown to the developer or vendor, and a software patch is the fix for a zero-day vulnerability. Zero-day exploits are traded on white, gray, and black markets. Government agencies have discovered that many hackers use such exploits for their own hacking purposes rather than for the common benefit.
2 Related Work Very few research studies, such as Reverse Engineering Improvement [3, 4], Classification Approaches [2, 6, 7], Golden Model-Free [8], and Real-Time Detection [5], have been carried out on Trojan horse detection techniques. Classification approaches have been selected in this study as the machine learning method for identifying Trojan horses on the network. In this part, three methods connected to the classification strategy are explored. Thimbleby et al. [9] gave a formal definition of the Trojan model and explained the host-dependent behavior characteristics of Trojans. To cope with
signature-detection evasion technologies such as encryption and deformation, Seshia et al. [10] created a malware detection model based on program semantics. Wang et al.'s [11] research looked at the distinctive execution paths taken by Trojans in comparison to regular applications and attempted to use this information to scan for Trojans in memory. To find remote-control Trojan programs, Wu [12] monitored and analyzed all of the traffic and used the Borderline-SMOTE algorithm to process the unbalanced dataset; because the Trojan control terminal and the infected process are in constant communication, the asymmetry of the content is the feature used to identify unusual traffic. Yang et al. [13] suggested a Trojan detection approach based on an attack tree model that incorporates the traits of the Trojan assault; however, due to the complexity of the overall procedure, traffic detection with it is impractical. Huge amounts of flow information have to be categorized, researched, and analyzed in order to study flow patterns and propose solutions for point-to-point flow, web flow, and predictable application flows [1], and Li [14] put forth a detection model based on behavior analysis and a knowledge base. Deep packet inspection (DPI) technology was utilized by Wang et al. [15] to detect Trojan communication traffic using a discovery method based on packet load, and it was suggested that this technology can analyze the penetrating data present in the communication load [1]. Network assaults and associated malicious software can be discovered by determining whether or not such information is present. However, this approach cannot screen crucial information in a timely way and has a limited ability to detect encrypted data.
3 Methodology The process used to build the Trojan horse detection system is shown in general steps in Fig. 1. It starts with data collection from the website; the chosen features are then identified after preprocessing the received data. Lastly, machine learning classification methods are employed and the performance of the outcome is assessed.
3.1 Collecting the Data The publicly available dataset 'Trojan Detection' is gathered from the Kaggle website. It was created by a user named "Cyber Cop", is GNU 3.0 licensed, and is based on data from the Canadian Institute for Cybersecurity (CIC). It was made available on September 18, 2021. The dataset is a network-traffic Comma-Separated Values (CSV) file consisting of 95 attributes and 177,482 rows of observations labelled Trojan Horse or Benign [16]; from this, a final dataset needs to be created for training the machine learning algorithm models. For this dataset, the regular Trojan traffic is first captured and the known Trojan traffic data packets are separated from it [1], and the two are then matched in an exact ratio, as illustrated in Fig. 1.
Fig. 1 Framework of detection model (data collection, data extraction, data selection, training and testing data, detection model, results)
3.2 Extracting the Data To create the feature training set for machine learning modeling, we must first acquire the related regular application traffic and the related Trojan data traffic. Preliminary feature extraction is done using the free and open-source feature extraction program CICFlowmeter. Its feature extraction framework enables it to extract the important aspects of an interaction between hosts, such as the total session time, the data flow, the destination addresses, and the originator address of the information being transferred. The size of the data packets sent back and forth by the two parties per second is examined, and other factors are also considered. Through our analysis, it is observed that the data packets are consistent with the features of Trojan traffic. In order to increase the accuracy of the model, more features need to be considered. As shown in Fig. 2, only the feature information recovered from the training sets is required for filtering, after which the training set of Trojan traffic features is available when it is eventually required.
3.3 Selecting the Data Using feature selection techniques, unnecessary and redundant properties were found and eliminated from the data, along with features that did not add much to the prediction model's accuracy. As a result, the number of features is reduced from 95 to 45. When there are many features, machine learning performance can suffer, so scoring techniques are employed to pick out the important qualities and exclude the unimportant ones. Machine learning classifier algorithms will be trained and tested against those chosen features.
Fig. 2 Features of Trojan traffic
3.4 Training the Model This study builds and trains models using three different machine learning algorithms: decision tree, Naive Bayes, and random forest. Through the Weka platform's API, the related machine learning algorithms can be invoked and operated. Each algorithm can be called to carry out a single model-training job, and it can be modified and optimized to produce varied model training outcomes.
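The authors call these algorithms through the WEKA API; as a language-neutral illustration, the same training step (with the 70/30 split described in Sect. 4) can be sketched in Python with scikit-learn instead of WEKA. The CSV file name and the "Class"/"Trojan" label names below are assumptions, not taken from the dataset documentation:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

flows = pd.read_csv("trojan_detection_flows.csv")   # 45 selected flow features + label
X = flows.drop(columns=["Class"])                   # "Class" is an assumed label column name
y = flows["Class"]

# 70% of the flows for model training, 30% for detection (testing).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1, stratify=y)

for name, clf in [("Random forest", RandomForestClassifier()),
                  ("Naive Bayes", GaussianNB()),
                  ("Decision tree", DecisionTreeClassifier())]:
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(name,
          accuracy_score(y_te, pred),
          precision_score(y_te, pred, pos_label="Trojan"),
          recall_score(y_te, pred, pos_label="Trojan"),
          f1_score(y_te, pred, pos_label="Trojan"))
```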
3.5 Detection Model A machine learning classification model is employed and the performance of the outcomes is assessed. Trojan datasets can be detected and classified using machine learning algorithms. The classification models employed in this study are as follows:
• Random forest.
• Decision tree.
• Naïve Bayes.
4 Experimental Result The WEKA environment is used in this study to carry out the classification process. It incorporates a variety of machine learning classification approaches for identifying and categorizing the Trojan dataset using the collection of data that was obtained and processed. In this experiment's model evaluation process, the machine learning algorithms are evaluated using the percentage division approach, which helps to some extent to ensure the fairness of the data: the dataset is made up of two sections, with 70% used for model training and 30% used for detection. The standard classification metrics of accuracy, precision, and recall [17] are the evaluation criteria used to detect and rate Trojan traffic. The F1-score is another tool used in this paper to compare the various machine learning algorithm models; the accuracy of a binary or multi-class classification model is typically assessed using the F1-score, which also takes the model's recall into consideration. According to Table 1, the best classifiers to use for identifying Trojan horses are random forest and decision tree, which reach 98.24% and 97.34% accuracy, respectively, in spotting Trojan horses. Compared to random forest and decision tree, Naive Bayes has a lower accuracy. Figure 3 shows the graphical representation of the accuracy, precision, recall, and F1-score. The random forest approach performs better at detecting Trojan traffic than the Naïve Bayes and decision tree machine learning algorithms. The estimation results of the models developed with each method are depicted in Fig. 3, and the random forest algorithm model is superior in terms of precision, accuracy, and F1-score.
Table 1 Model performance
Model           Accuracy   Precision   Recall   F1-score
Random forest   98.24      0.975       0.982    0.982
Naïve Bayes     93.43      0.932       0.923    0.923
Decision tree   97.34      0.943       0.935    0.935
Fig. 3 Performance of the model. a Accuracy, b precision, c recall, and d F1-score
5 Conclusion For the three types of Trojan traffic, the models built using the Naive Bayes method perform comparatively worse than those built with the other two machine learning techniques. The decision tree models show rather consistent performance, with the critical values varying only slightly. However, the model trained with the random forest algorithm has higher evaluation accuracy and precision than the other two machine learning methods, and the F1-score analysis of each model shows that the random forest algorithm is superior in every aspect. In terms of model detection, the random forest method performs better than the other two machine learning techniques. It is clear that the model built using the random forest approach performs better at detecting Trojan traffic and is, therefore, better suited for this kind of work.
References 1. Z Ma, Huang Y, Lu J (2020) Trojan traffic detection based on machine learning. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). https://ieeexplore.ieee.org/document/9317515, https://waveletlab. cn/booklet.pdf. 2. Lakshman Naik R, Jain S (2023) URL-based technique to detect phishing websites, automation and computation. CRC Press, London. https://doi.org/10.1201/9781003333500 3. Bao C, Forte D, Srivastava A (2016) On reverse engineering-based hardware Trojan detection. IEEE Trans Comput Aided Des Integr Circuits Syst 35(1):49–57. https://doi.org/10.1109/ TCAD.2015.2488495 4. Ab Razak MF, Jaya MI, Ismail Z, Firdaus A (2022) Trojan detection system using machine learning approach. Indonesian J Inf Syst 5(1):38–47. https://doi.org/10.24002/ijis.v5i1.5673 5. Chayal NM, Patel NP (2021) Review of machine learning and data mining methods to predict different cyberattacks. Lecture Notes Data Eng Commun Technol 52:43–51. https://doi.org/ 10.1007/978-981-15-4474-3_5 6. Dushyant K, Muskan G, Annu, Gupta A, Pramanik S (2022) Utilizing machine learning and deep learning in cybesecurity: an innovative approach. In: Cyber Security and Digital Forensics, pp 271–293. https://doi.org/10.1002/9781119795667.CH12 7. Salmani H (2022) Gradual-N-justification (GNJ) to reduce false-positive hardware Trojan detection in gate-level netlist. IEEE Trans Very Large Scale Integration (VLSI) Syst. https:// doi.org/10.1109/TVLSI.2022.3143349 8. Plusquellic J, Saqib F (2018) Detecting hardware Trojans using delay analysis. In: The Hardware Trojan war: attacks, myths, and defenses, pp 219–267. https://doi.org/10.1007/978-3319-68511-3_10 9. Thimbleby H, Anderson S, Cairns P (1998) A framework for modelling Trojans and computer virus infection. Comput J 41(7):444–458 10. Christodorescu M, Jha S, Seshia SA et al (2005) Semantics—aware malware detection. Security and privacy. Oakland, IEEE, pp 32–46 11. Wang R, Wang W, Gong X, Que X, Ma J (2010) A real-time video stream key frame identification algorithm for QoS. In: 2010 Second International Conference on Multimedia and Information Technology (MMIT), vol 1, pp 115–118. IEEE 12. Wu X (2018) Remote control Trojan detection model based on abnormal network behavior. Beijing University of Technology, Beijing 13. Yang W, Zhang S, Hu G (2011) Trojan detection method based on attack tree model. Inform Netw Secur 2011(09):170–172 14. Li J (2014) Trojan detection technology based on traffic. University of Electronic Science and Technology of China 15. Wang Z (2007) Research on Trojan attack and prevention technology. Shanghai Jiaotong University, Shangi 16. Data set (2021) https://www.kaggle.com/datasets/subhajournal/trojan-detection 17. Aqlan AAQ, Manjula B (2020) Extraction and analyze text in twitter using naive Bayes technique. Int J Innov Technol Explor Eng (IJITEE) 9(4):2278–3075
Multi-band Micro-strip Patch Antenna for C/X/Ku/K-Band Applications Karunesh Srivastava, Mayuri Kulshreshtha, Sanskar Gupta, and Shrasti Sanjay Shukla
Abstract In this paper, a multi-band E-shaped antenna is presented for C, X, Ku, and K-band applications. The proposed antenna consists of an E-shaped radiating patch and a defected ground plane. Introduction of a partial ground plane in the design causes a decrement in surface waves and an increment in antenna gain. The proposed multi-band micro-strip patch antenna has been simulated and optimized using ANSYS HFSS version 15. The proposed antenna uses an FR4 substrate with thickness 1.6 mm, dielectric constant (εr) 4.4, and loss tangent (tan δ) 0.02. A micro-strip feed line (λ0/4) with 50 Ω characteristic impedance is used for excitation of the proposed antenna. Keywords Linearly polarized · Micro-strip-fed slot · C · X · Ku · K-band
K. Srivastava (B) · S. Gupta · S. S. Shukla Department of Electronics and Communication Engineering, Ajay Kumar Garg Engineering College, Ghaziabad, UP 201009, India e-mail: [email protected] M. Kulshreshtha Department of Computer Science & Engineering, IMS Engineering College, Ghaziabad, Uttar Pradesh 201009, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_49
1 Introduction In the era of technological advancements, microwave and wireless communication systems play a crucial role in exchanging information. The evolution of communication systems with integrated technology and the rapid growth of wireless communication demand, respectively, enhancement in the operation of an antenna and reduction in its size. The utilization of multi-band antennas is crucial in wireless communication systems due to their ability to operate at various frequency ranges [1]. The needs of this modern wireless communication system are catered to by the micro-strip antenna. In the present era, individuals require wireless phones capable of transmitting audio, video, and data simultaneously, and a wireless phone can support multiple networks which operate on different frequencies. In recent times, the need for wide-band communication has emerged due to the introduction of new standards and compact devices. These requirements have been met through various methods such as stacked patches [2], defected ground [3], shorting pins, and cutting slots or notches in the patch or ground plane [4]. Furthermore, modern wireless communication systems necessitate antennas that are lightweight and low profile and that offer high gain and efficiency with excellent mobility characteristics [5]. Numerous researchers have developed various configurations to enable the use of multiple frequency bands [6–9], such as the H slot [6], double U slot [7], double PIFA [8], and U slot [9]. The design of a micro-strip array capable of controlling three frequency bands has been investigated to operate in the X-band, Ku-band, and K-band, with resonant frequencies of 11.5, 16, and 21 GHz, respectively [10]. A multi-band antenna capable of supporting the C-band, X-band, and Ku-band with resonant frequencies of 6.5, 8.25, 9.96, 10.52, and 14.99 GHz has been reported in [11]. A symmetrical E-shaped patch antenna [12] and a wide-band E-shaped micro-strip patch antenna [13] have been reported to operate at a dual-frequency band and a single-frequency band, respectively. A design for a dual-frequency probe-fed micro-strip antenna with an E-shaped radiating patch for S-band and C-band applications has been introduced in [14]. An antenna capable of providing coverage for the C/X/Ku/K-bands has not been reported yet. A new design for C/X/Ku/K-band applications is proposed in this article. The geometrical structure of the suggested design with its dimensions is displayed in Fig. 1(c). The antenna is designed and simulated using ANSYS HFSS version 15.
2 Proposed Antenna Design and Evolution In order to examine the behavior of the antenna, the performance evaluation of the proposed antenna is done in three steps (A1–A3) in terms of reflection coefficient and gain. Antenna 1 (cf. Fig. 1a) consists of an uneven E-shaped patch with a full ground. Antenna 2 (cf. Fig. 1b) is obtained by creating a defect in the ground of Antenna 1. Introducing identical vertical slots between the arms of the E-shaped radiating patch of Antenna 2 results in Antenna 3 (cf. Fig. 1c). It is observed from Fig. 2 that Antenna 1 and Antenna 2 resonate in dual and single bands, respectively, with positive gain. It can also be observed from Fig. 3 that the defected ground structure improves the gain of Antenna 1, but the number of resonating bands in Antennas 1 and 2 is the same. The introduction of the vertical slots is responsible for the improvement in gain of Antenna 3. Seven resonating bands useful for C/X/Ku/K-band applications are observed with sufficient gain and return loss in Antenna 3. Therefore, we have chosen Antenna 3 as the proposed antenna. The proposed antenna uses an FR4 substrate with thickness 1.6 mm, dielectric constant (εr) 4.4, and loss tangent (tan δ) 0.02. A micro-strip feed line (λ0/4) with 50 Ω characteristic impedance is used for excitation of the proposed antenna. The
Fig. 1 Geometrical top (red color) and bottom (yellow color) view: (a) Antenna 1, (b) Antenna 2, (c) Antenna 3 (proposed antenna)
Fig. 2 Parametric analysis of antenna in terms of simulated return loss
Fig. 3 Parametric analysis of antenna in terms of simulated gain
substrate thickness and relative permittivity are crucial parameters in determining the antenna’s performance. The optimized dimensions of the substrate are 53.5 mm in width and 50 mm in length. Initially, a rectangular-shaped patch antenna was created. In order to expand the frequency range of the antenna, additional attempts are made by adding two equal slots [14, 15] to the original rectangular patch, thus forming an E-shaped patch [15]. This patch configuration can enhance the gain, reflection coefficient, and bandwidth of the micro-strip antenna. Coupling between the patches may be the reason of enhancement in gain, return loss, and bandwidth. The E-shape micro-strip patch antenna is a modified version of the rectangular micro-strip patch antenna, where additional slots are introduced to form the E-shape. This design enhances the radiating properties of the antenna and provides a higher gain and bandwidth. The dimensions of the slot are carefully optimized to achieve the desired radiation characteristics.
3 Results and Discussion The suggested antenna is simulated in terms of reflection coefficient, gain, input impedance, and group delay. The simulated reflection coefficient and gain of the proposed antenna are displayed in Fig. 4. It is clear from Fig. 4 that resonating frequencies are 6.9, 9.95, 14.28, 16.71, 21.36, 22.93, and 25.15 GHz. Proposed antenna possesses sufficient return loss and gain at these resonating frequencies. Maximum return loss of −37.24 dB and maximum gain of 2.62 dBi are observed at a frequency of 21.36 GHz. Simulated input impedance (real and imaginary components) and group delay of the proposed structure are displayed in Fig. 5. It is noted from Fig. 5 that real impedance is normalized to 50 Ω and is ranging between 30 Ω and 60 Ω in the complete frequency range. Polarity of the imaginary impedance curve is used to decide the inductive and capacitive nature of proposed antenna within entire frequency range. Group delay (preferred below 1 ns) is an important antenna parameter which measures the distortion between transmitted and received signals. It is observed from Fig. 5 that simulated group delay is below 0.5 ns, which is desired. Fig. 4 Simulated return loss and gain of proposed antenna
Fig. 5 Simulated impedance and group delay of proposed antenna
4 Conclusions In this paper, the evolution, simulation, and optimization of a multi-band E-shaped microstrip patch antenna have been discussed. The simulation results demonstrate that the proposed antenna provides sufficient gain and return loss with linearly polarized characteristics, which makes it suitable for C/X/Ku/K-band applications. In the C-band, this antenna can find use in cordless telephones, Wi-Fi devices, and satellite communications, whereas in the X-band the proposed antenna is suitable for radar applications and for space and terrestrial communication. Direct broadcasting (in the Ku-band) and military applications and short-range communication (in the K-band) are effective applications of the proposed design.
References 1. Prasad L, Ramesh B, Kumar KSR, Vinay KP (2018) Design and implementation of multiband micro strip patch antenna for wireless applications. Adv Electromagn 7(3):104–107 2. Liu WC, Wu CM, Dai Y (2011) Design of triple frequency micro strip-fed monopole antenna using defected ground structure. IEEE Trans Antennas Propag 59(7):2457–2463 3. Matin MA, Sharif BS, Simonides CC (2007) Probe fed stacked patch antenna for wideband applications. IEEE Trans Antennas Propag 55(8):2385–2388 4. Saxena A, Gangwar RPS (2016) A compact UWB antenna with dual band-notched at WiMAX and WLAN for UWB applications. In: International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT), pp 4381–4386 5. Francisco JHM, Vicente GP, Luis EGM, Daniel SV (2008) Multifrequency and dual-mode antennas partially filled with left-handed structures. IEEE Trans Antennas Propag 56(8):2527– 2539 6. Zhou Y, Chen C-C, Volakis, John L (2007) Tri-band miniature GPS array with a single-fed CP antenna element. In: IEEE International Symposium on Antennas and Propagation, pp 3049–3052 7. Niu JX (2010) Dual-band dual-mode patch antenna based on resonant type metamaterial transmission line. Electron Lett 46(4):266–268 8. Hall PS, Lee E, Song CTP (2007) Printed antenna for wireless communications, Edited by R. Waterhouse. Wiley, New York 9. Abu Tarbous HF, Al-Raweshidy HS, Nalavalan R (2008) Triple band U-slot patch antenna for WiMAX mobile application. In: Asia-Pacific Conference on Communications, pp 1–3 10. Motin MA, Hasan MI, Islam MS (2012) Design and simulation of a low cost three band micro strip patch antenna for the X-band, Ku-band and K-band applications. In: International Conference on Electrical and Computer Engineering 11. Palla R, Ketavath KN (2020) Multiband rectangular micro strip patch antenna operating at C, X and Ku bands. In: IEEE Third International Conference on Multimedia Processing, Communication and Information Technology (MPCIT) 12. Yadav A, Chauhan B, Jain A (2012) Micro strip symmetrical E-shape patch antenna for the wireless communication systems. Int J Emerg Technol Adv Eng 2(12):2250–2459 13. Ang BK, Chung BK (2007) A wideband E-shaped micro strip patch antenna for 5–6 GHz wireless communications. Prog Electromagn Res (PIER) 75:397–407
14. Hossain MM, Wahed MA, Motin MA (2014) Design and simulation of a dual frequency E-shaped micro strip patch antenna for wireless communication. In: International Forum on Strategic Technology (IFOST) 15. Fazia Z, Ali MT, Suhabir S, Yusuf AL (2012) Design of reconfigurable dual band E-shaped micro-strip patch antenna. In: International Conference on Computer and Communication Engineering (ICCCE 2012), pp 113–117
Design of 4 × 4 Array Antenna Using Particle Swarm Optimization for High Aperture Gain for Wi-Fi Applications Madhumitha Jayaram, P. K. Santhya Premdharshini, and Rajeswari Krishnasamy
Abstract An innovative microstrip H-shaped slotted patch antenna array design functions as a sub-array within a switchable outdoor antenna. The primary focus is to create an economical and compact solution for an outdoor WLAN communication system. To achieve this, an optimization study has been conducted to meet specific requirements. The two key challenges in the antenna design were related to bandwidth and space occupancy. To address these constraints, the antenna’s design needed to prioritize high aperture efficiency and higher directivity, possibly even incorporating some degree of flexibility. The proposed solution involves designing the antenna as an array, allowing it to switch between several sub-arrays. The microstrip slotted patches within the arrays are tailored for operation within the frequency range around the 2.45 GHz ISM band. The primary design criterion was to strike a balance between bandwidth and aperture efficiency, with the goal of achieving maximum efficiency to minimize the overall occupied space. To accomplish this, a combination of the Method of Moments algorithm and Particle Swarm Optimization (PSO) was employed for the antenna design. The simulation and validation of the proposed design were carried out using CST Microwave Studio (CST MW Studio). Keywords Microstrip array · Aperture efficiency · Particle Swarm Optimization (PSO) · CST MW Studio
M. Jayaram (B) · P. K. Santhya Premdharshini · R. Krishnasamy Department of ECE, College of Engineering Guindy, Anna University, Chennai, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_50
1 Introduction Over the past two decades, WLAN technology has experienced rapid growth, leading to significant advancements in antenna design and fabrication. The increasing demand for dedicated antennas tailored to WLAN applications arises from their deployment in various topological scenarios, often constrained by limited space availability. As a critical design requirement, achieving high aperture efficiency has become essential to maximize antenna performance. To address this challenge, microstrip antenna arrays have emerged as promising candidates due to their cost-effectiveness and ability to attain exceptional aperture efficiency with a small number of elements. In [1], an aperture efficiency of almost 90% was recorded. However, the suggested arrays' bandwidth was lower, and the loss in the feeding network was not taken into account. A low-cost microstrip array is designed with an E-shaped patch to provide broad band and high aperture efficiency in [2, 3].
To save space, an integrated feeding network has been adopted, where the patches are connected in a series using microstrip lines. However, this choice introduces complexity into the antenna design, as the electric length of these microstrip lines varies with frequency. As a result, the main beam position shifts with frequency, adding intricacy to the overall design process. In [4–7], a 4 × 4 E-shaped patch antenna array uses a genetic algorithm for optimization to reduce the return loss and to achieve the larger gain for the operating frequency of 3.2–3.8 GHz. In [8–13], a genetic algorithm optimization is used for determining the structural parameter which has the problem of being computationally intensive. However, this issue is significantly overcome by utilizing the skills of sophisticated global optimizers, such as PSO and the genetic algorithm (GA). Particle Swarm Optimization (PSO) is a computational technique that aims to optimize a given problem by iteratively refining candidate solutions based on a defined measure of quality. This method draws inspiration from the behavior of a flock of birds searching for food. In PSO, each “bird” (particle) adjusts its position based on its own experience and the experiences of other particles in the swarm. PSO is particularly effective for finding maximum or minimum values of functions defined in multidimensional vector spaces. PSO offers several advantages over other methods. It has demonstrated superior performance in terms of speed and cost-effectiveness, making it a favorable choice for optimization tasks. Additionally, PSO can be parallelized, further enhancing its efficiency. An essential feature of PSO is its use of velocity to evolve particle positions and explore potential optimal solutions. The velocity is regulated by multiplying a factor to the particle’s velocity. Each particle maintains its local best position (pbest) and the global best position (Gbest) among all particles during the optimization process. Unlike traditional optimization methods, PSO does not require the problem to be differentiable, as it does not rely on the gradient of the function being optimized. This characteristic allows PSO to handle non-differentiable problems effectively. Comparing PSO to genetic algorithms (GA), PSO stands out with its simplicity and ease of programming. It typically converges faster and often provides better solutions for various applications. Both PSO and GA are based on the same principle of incorporating randomness and the cost of error, but they excel in different contexts, catering to diverse optimization needs. Time domain solver is used to evaluate, and Particle Swarm Optimization is used for optimization. A neural network model technique uses Particle Swarm Optimization algorithm to reduce the computational time in [13]. The paper employs Particle Swarm Optimization (PSO) to optimize the size parameters of patch antennas with inset feed. The objective is to achieve the lowest return loss and achieve optimal size miniaturization for the antennas [14–22]. This work focuses on the design of H-shaped microstrip patch antennas in both 2 × 2 and 4 × 4 array configurations, employing the Particle Swarm Optimization (PSO) algorithm to determine the structural parameters. The selection of PSO over genetic algorithm (GA) aims to achieve high gain and the best goal value. The PSO algorithm is implemented with a swarm size of 30 and 4 iterations in this study. The article is organized as follows: Sect. 
2 presents the design process for the 2 × 2 and 4 × 4 array H-shaped microstrip patch antennas, while Sect. 4 delves into the details of the
Particle Swarm Optimization algorithm. Finally, Sect. 5 provides a comprehensive conclusion summarizing the outcomes of this research and the potential implications for future antenna design and optimization.
2 Antenna Design 2.1 H-shaped Slotted Antenna The initial stage of this study involves designing the ground structure using Perfect Electric Conductor (PEC) to model a lossless metallic surface, ensuring a symmetry type boundary condition. Subsequently, the H-shaped slotted patch is designed using FR4 substrate, a material chosen for its favorable electrical properties, costeffectiveness, and compact nature. Although other materials may yield superior results, FR4 remains widely used, especially for frequencies below 1 GHz, where the permittivity of the substrate remains nearly constant. To construct the coaxial feed, a lossy substance called PTFE is utilized, known for its versatility and non-stick characteristics. Figure 1 provides a visual representation of the dimensions of the antenna, measuring 29.66 mm by 38.036 mm. It is important to note that the selection of materials and substrate plays a crucial role in antenna performance. FR4, despite its advantages, may not be the most optimal choice for certain higher frequency applications, where other materials with different dielectric constants could offer improved performance. However, for the specified frequency range, FR4 has a stable permittivity remains advantageous, ensuring reliable and consistent antenna performance. Additionally, the coaxial feed use of PTFE
Fig. 1 a H-shaped slotted antenna. b 2 × 2 array H-shaped slotted antenna
Table 1 Design dimensions
Parameters
Dimensions
Length of the patch (L p )
29.66157 mm
Length of the ground (L g )
59.32314 mm
Width of the patch (W p )
38.036 mm
Width of the ground (W g )
76.072 mm
Width of the feed (W f )
1 mm
Height (h)
1.6 mm
Thickness (t)
0.035 mm
contributes to overall transmission efficiency and signal integrity, enhancing the capabilities of antenna. The designed H-shaped slotted patch antenna holds promise for various applications within its frequency range. Its compact size, combined with the efficient use of materials, makes it a viable option for wireless communication systems, WLAN, and other radio frequency (RF) applications. As this study progresses, further analyses and performance evaluations will be conducted to ascertain radiation characteristics of the antenna, impedance matching, and radiation patterns. These findings will help validate the effectiveness of the antenna and pave the way for potential future improvements and adaptations to meet specific application requirements. The dimension used for the above design is given in Table 1.
2.2 2 × 2 Array To achieve greater directivity and enhanced performance, the implementation of an antenna array is utilized, offering several advantages such as increased overall gain, interference cancelation, and improved Signal-to-Interference Noise Ratio (SINR). There are two main types of antenna arrays: coupled arrays and uncoupled arrays. Coupled arrays are interconnected once fully formed and maintain their spatial configuration over distance, with considerations for potential obstructions such as walls or other obstacles. In contrast, uncoupled arrays are not interconnected and offer more flexibility in their arrangement. In this study, the spacing between two elements in the array is set at 61.28 mm. The dimensions of the 2 × 2 array, as depicted in Fig. 1b, are 119.8 mm by 100.5 mm. The implementation of such an antenna array has the potential to significantly improve the overall performance of the system, particularly in wireless communication applications, where higher directivity and minimized interference are critical for achieving reliable and efficient data transmission. Further, investigations and analyses will be conducted to assess the array radiation pattern and array factor to validate its suitability for specific practical scenarios and pave the way for potential future optimizations and adaptations.
Fig. 2 4 × 4 array antenna
2.3 4 × 4 Array Using a similar procedure as the 2 × 2 array, the 4 × 4 array is also designed, with a spacing of 61.28 mm between each antenna. The dimensions of the 4 × 4 array are 302.14 mm by 339.5 mm, as depicted in Fig. 2. This array configuration holds the potential to further enhance directivity of the directivity and overall performance, offering improved interference cancelation and increased Signal-toInterference Noise Ratio (SINR). The design and optimization of the 4 × 4 array aim to maximize efficiency of the antenna and suitability for various wireless communication applications, where higher gain and enhanced spatial coverage are essential for achieving seamless and reliable data transmission. Subsequent investigations and performance evaluations will be conducted to validate the 4 × 4 array’s radiation characteristics and ascertain its applicability in real-world scenarios.
2.4 Design Equation The design equations for the proposed microstrip patch antenna are presented as (1)–(5), and the corresponding parameters are summarized in Table 1. These equations and parameters play a crucial role in determining the physical dimensions and characteristics of the antenna, enabling the optimization process to achieve the desired performance metrics such as gain, directivity, and bandwidth. Table 1 provides a concise reference for the key design parameters, facilitating a streamlined approach to implementing and fine-tuning the specifications of the antenna.
Width of the patch:
W = (c0 / (2 fr)) √(2 / (εr + 1)), where c0 is the speed of light   (1)

Effective dielectric constant:
εreff = (εr + 1)/2 + ((εr − 1)/2) [1 + 12 (h/W)]^(−1/2), for W/h > 1   (2)

Effective length:
Leff = c0 / (2 fr √εreff)   (3)

Length extension:
ΔL/h = 0.412 ((εreff + 0.3)(W/h + 0.264)) / ((εreff − 0.258)(W/h + 0.8))   (4)
Length:
L = c0 / (2 fr √εreff) − 2ΔL   (5)
where Length = L eff − 2ΔL.
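As a quick numerical check of Eqs. (1)–(5), the patch dimensions can be computed in a few lines of Python. The resonant frequency and substrate values below (2.45 GHz, εr = 4.4, h = 1.6 mm) are illustrative choices consistent with the FR4 substrate described above, not results quoted from the paper:

```python
import math

c0 = 3e8            # speed of light (m/s)
fr = 2.45e9         # resonant frequency (Hz) - illustrative
eps_r = 4.4         # FR4 relative permittivity - illustrative
h = 1.6e-3          # substrate height (m)

W = c0 / (2 * fr) * math.sqrt(2 / (eps_r + 1))                          # Eq. (1)
eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 * (1 + 12 * h / W) ** -0.5  # Eq. (2)
L_eff = c0 / (2 * fr * math.sqrt(eps_eff))                              # Eq. (3)
dL = 0.412 * h * ((eps_eff + 0.3) * (W / h + 0.264)) / \
     ((eps_eff - 0.258) * (W / h + 0.8))                                # Eq. (4)
L = L_eff - 2 * dL                                                      # Eq. (5)

print(f"W = {W*1e3:.2f} mm, L = {L*1e3:.2f} mm")
```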
3 Simulated Results The design and simulation of the microstrip H-shaped slotted patch array antenna were conducted using CST MW Studio. The antenna elements were arranged with a spacing of 61.28 mm between each antenna, optimizing the overall performance of the array. The design achieved an impressive directivity of 17.10dBi, indicating the antenna ability to focus its radiation pattern in a specific direction, essential for long-range communication applications. Additionally, the antenna demonstrated a considerable gain of 13.29dBi, highlighting its ability to amplify and radiate electromagnetic signals efficiently. The simulated design of the antenna is visually depicted in Fig. 2, providing insights into its physical configuration and structure. Furthermore, the simulated S-Parameter of the designed antenna is illustrated in Fig. 3, representing the scattering parameters of the antenna and characterizing its frequency response, including reflection and transmission properties. The achieved Voltage Standing Wave Ratio (VSWR) of 1.35 indicates a good impedance match between the antenna and
Fig. 3 S-Parameter of the 4 × 4 array antenna
the transmission line, showcasing efficient power transfer and minimized signal reflections. Overall, the comprehensive design and simulation results present a promising microstrip H-shaped slotted patch array antenna with remarkable directivity, gain, and impedance characteristics.
4 Optimization

In recent years, optimization techniques have been widely employed to enhance the performance of antenna arrays by reducing power consumption and improving radiation patterns through the minimization of side lobes. In this study, the chosen optimization technique is Particle Swarm Optimization (PSO). This method offers the advantage of automatically controlling the inertia weight, acceleration coefficients, and other algorithmic parameters in real time, thereby significantly improving the effectiveness and efficiency of the search process. By dynamically adjusting these parameters during the optimization process, PSO effectively explores the solution space, leading to better convergence and more optimal antenna array configurations. Figure 4 illustrates the comprehensive flowchart of the PSO algorithm utilized in this work. It highlights the step-by-step procedure of the optimization process, including initialization, updating the velocity and position of the particles, evaluating the fitness function, and iteratively searching for the best solution. The PSO algorithm's versatility and adaptability make it a powerful tool for achieving high-performing antenna array designs with superior radiation characteristics and reduced power consumption, aligning with the demands of modern wireless communication systems and other applications. The successful implementation of PSO in this study contributes valuable insights to the field of antenna design optimization, paving the way for further advancements in this ever-evolving area of research. The general outline of the implementation of PSO in antenna design is described below; an illustrative sketch of the corresponding update loop is given after Eqs. (6) and (7).
Fig. 4 Flowchart of Particle Swarm Optimization
1. Formulate the Design Problem: Precisely outline the objectives and constraints of the antenna design, encompassing the desired characteristics, including radiation pattern, gain, bandwidth, and impedance matching requirements.
2. Formulate the Fitness Function: Create a fitness function that evaluates the performance of the antenna design. The fitness function quantifies how well a particular set of antenna parameters satisfies the design objectives. It may include metrics such as directivity, voltage standing wave ratio (VSWR), radiation efficiency, or any other relevant performance measure.
3. Swarm Initialization: Create a population of particles, with each particle representing a potential solution within the search space. The position of each particle corresponds to a specific set of antenna design parameters, such as element positions or dimensions.
4. Fitness Evaluation: Assess the fitness of each particle by subjecting its associated antenna design to the fitness function. This evaluation entails simulating
the performance of the antenna using electromagnetic analysis tools or established antenna models.
5. Update Particle Velocity and Position: Modify the velocity and position of each particle, considering its previous velocity, its position, and the best positions discovered by the particle itself and by its neighboring particles. The update equations combine the particle's individual experience with the collective influence of the other particles within the swarm.
6. Iterate: Repeatedly execute the fitness evaluation and particle update steps for a specified number of iterations or until a convergence criterion is satisfied. During each iteration, the particles traverse the search space, adapting their positions and velocities in pursuit of improved antenna designs.
7. Select the Best Solution: After the iterations are complete, select the best solution, i.e., the particle with the highest fitness value, as the final antenna design. This solution should satisfy the design objectives and constraints specified in Step 1.

Enhancing the PSO algorithm's performance entails fine-tuning various algorithmic factors. Modifying the swarm size, i.e., the number of particles in the PSO algorithm, can significantly impact convergence and exploration capabilities. Similarly, adjusting the inertia weight and the cognitive and social components of the particle movement helps strike a balance between exploration and exploitation, facilitating efficient convergence to optimal solutions. Defining appropriate termination criteria ensures that the optimization process halts when the desired level of convergence is achieved, preventing unnecessary computations and saving computational resources.

Overall, PSO empowers antenna engineers with a versatile and powerful tool for optimizing antenna designs. By skillfully navigating the algorithmic parameters and constraints, engineers can effectively explore the vast design space, leading to the discovery of superior antenna solutions with improved performance metrics. The adaptability of PSO makes it an invaluable asset in modern antenna design, offering the potential to advance wireless communication systems, radar applications, and various other technological domains reliant on high-performing antennas. As ongoing research deepens the understanding and utilization of PSO in antenna optimization, the possibilities for achieving cutting-edge antenna designs continue to expand, bolstering the progress of wireless technologies and communication systems.

The swarm size used in this work is 30, and 4 iterations are performed. The number of solver runs is governed by the product of the swarm size and the maximum number of iterations; the time domain solver is used in this work, with 121 solver runs in total. The initial point set is generated by Latin Hypercube Sampling (LHS). The directivity of the optimized array antenna is 16.32 dB, and its gain is 12.62 dB. The S-parameter of the optimized array antenna is shown in Fig. 5. The position and velocity of the particles are updated by the following two equations, (6) and (7):

$$v_i^{k+1} = w\,v_i^{k} + c_1 r_1\big(\text{pbest}_i^{k} - x_i^{k}\big) + c_2 r_2\big(\text{gbest}^{k} - x_i^{k}\big), \tag{6}$$

$$x_i^{k+1} = x_i^{k} + v_i^{k+1}, \tag{7}$$

where $v_i^k$ is the velocity of the $i$th particle at the $k$th iteration and $x_i^k$ is the current solution (or position) of the $i$th particle at the $k$th iteration; $c_1$ and $c_2$ are positive constants, and $r_1$ and $r_2$ are random variables uniformly distributed between 0 and 1.

Fig. 5 Optimized S-Parameter of the array antenna
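To make the PSO workflow described above concrete, the following is a minimal, self-contained Python sketch of the velocity and position updates of Eqs. (6) and (7). The fitness function here is a simple analytic placeholder standing in for a full-wave (e.g., CST) evaluation of the antenna, and the dimensionality, coefficient values, and iteration count are illustrative choices rather than the exact solver settings used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(x):
    """Placeholder objective standing in for an EM-simulation-based goal
    (e.g., minimizing |S11| at the design frequency); lower is better."""
    return np.sum((x - 0.3) ** 2)

dim, n_particles, n_iter = 4, 30, 20        # problem size and budget (illustrative)
w, c1, c2 = 0.7, 1.5, 1.5                   # inertia, cognitive, social coefficients
lo, hi = 0.0, 1.0                           # normalized parameter bounds

x = rng.uniform(lo, hi, (n_particles, dim))  # particle positions (design parameters)
v = np.zeros_like(x)                         # particle velocities
pbest = x.copy()                             # personal best positions
pbest_val = np.array([fitness(p) for p in x])
gbest = pbest[pbest_val.argmin()].copy()     # best solution found so far

for _ in range(n_iter):
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    # Eq. (6): velocity update from inertia, personal best, and global best.
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    # Eq. (7): position update, clipped to the allowed parameter range.
    x = np.clip(x + v, lo, hi)
    vals = np.array([fitness(p) for p in x])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = x[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print("Best parameters:", gbest, "fitness:", pbest_val.min())
```

In an antenna-design setting, each particle position would be decoded into geometric parameters (slot dimensions, feed position, element spacing), and the fitness call would trigger an electromagnetic simulation instead of the analytic placeholder above.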
The first goal value of the optimized 4 × 4 antenna is −26.43 dB, as shown in Fig. 6. The best goal value of the optimized 4 × 4 antenna is 15.95, as shown in Fig. 7.

Fig. 6 First goal value of the optimized antenna

Fig. 7 Best goal value of the optimized antenna

The aperture efficiency (εA) is an essential parameter in antenna analysis, computed as the ratio of the effective aperture (Aeff) to the physical aperture (A). This metric provides valuable insight into the antenna's capability to capture and radiate electromagnetic energy effectively. To evaluate the performance of different array configurations, a comparison is made among the 1 × 1, 2 × 2, and 4 × 4 arrays, and the results are presented in Table 2. The table illustrates how the aperture efficiency varies across these array configurations, offering a comprehensive understanding of their respective capabilities in maximizing the captured and radiated energy. Among the array configurations, the optimized 4 × 4 microstrip H-shaped array antenna demonstrates remarkable performance, exhibiting both high directivity and a well-defined radiation pattern. Figure 8 depicts the directivity, which measures the antenna's ability to concentrate its radiation in a specific direction. The radiation pattern further illustrates the antenna's response in various directions, providing insight into its beamwidth, side lobes, and main-lobe characteristics. The combination of high directivity and a well-shaped radiation pattern indicates the effectiveness of the optimized 4 × 4 array in efficiently directing and transmitting electromagnetic signals, making it a promising candidate for various wireless communication and radar applications, where reliable and precise signal propagation is crucial. Table 2 compares the study parameters, namely VSWR, gain, directivity, aperture efficiency, swarm size, and iteration count, across the different antenna array sizes.
Table 2 Comparison of antenna arrays

Parameters            1 × 1 array    2 × 2 array    4 × 4 array
S11 (dB)              −22.91         −30.07         −26.43
VSWR                  1.35           1.07           1.3
Gain                  3.312 dB       1.672 dB       13.29 dBi
Directivity           7.129 dB       5.892 dB       17.10 dBi
Aperture efficiency   31%            24%            61%
Swarm size            30             30             30
Iteration             3              3              4
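As a rough cross-check of the aperture-efficiency figures in Table 2, the sketch below applies the standard relation A_eff = Dλ²/(4π) using the reported 4 × 4 directivity and the 302.14 mm × 339.5 mm footprint. Treating the whole footprint as the physical aperture and deriving the effective aperture from the directivity (rather than the gain) are assumptions of this example.

```python
import math

f = 2.4e9                      # design frequency (Hz)
lam = 3e8 / f                  # wavelength (m)

D_dBi = 17.10                  # reported directivity of the 4 x 4 array
D = 10 ** (D_dBi / 10)         # linear directivity

A_phys = 0.30214 * 0.3395      # physical footprint of the 4 x 4 array (m^2)
A_eff = D * lam ** 2 / (4 * math.pi)   # effective aperture from directivity

eta_A = A_eff / A_phys
print(f"Aperture efficiency ~ {eta_A * 100:.0f}%")   # close to the reported 61%
```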
Fig. 8 Directivity of 4 × 4 at 2.4 GHz
5 Conclusion

The microstrip H-shaped slotted patch antenna array has undergone a design and optimization process using Particle Swarm Optimization (PSO), resulting in significant improvements over conventional designs reported in the literature. The 4 × 4 antenna array exhibits an impressive directivity of 17.10 dBi and a gain of 13.29 dBi. Notably, the optimization process achieved a best goal value of 15.95 at step 61 of the optimizer. The optimized 4 × 4 array antenna demonstrates a high aperture efficiency of 61%. Moving forward, the next steps involve fabricating the optimized 4 × 4 antenna array design and conducting thorough parameter testing on the prototype. The antenna's success holds promise for various applications, from wireless communication to radar systems, and marks a notable advancement in antenna design and optimization using PSO techniques.
Optimal Energy Restoration in Radial Distribution: A Network Reconfiguration Approach by Kruskal's Minimal Spanning Tree Algorithm

Maitrayee Chakrabarty and Dipu Sarkar
Abstract Power system service restoration using network reconfiguration is a widely explored problem in power systems. In restoration, the aim is to restore the maximum number of loads following a blackout or any other fault by modifying the topological configuration of the power network. In the present work, a Kruskal's minimal spanning tree graph-based algorithm is proposed to find an effective and fast solution for distribution systems. To serve power to loads isolated due to faults on distribution feeder networks, optimal selection of tie-switch and sectionalizing-switch operations has been performed. An appropriate operation of switches in the network reduces the number of switching operations while providing an optimal voltage profile at each node with the least active power losses. The evaluation of the suggested technique has been carried out on a modified IEEE 33-bus system, and the results demonstrate the effect of the switching operations in power system service restoration.

Keywords Power system service restoration · Kruskal's minimal spanning tree algorithm · Voltage level
M. Chakrabarty Department of Electrical Engineering, JIS College of Engineering, Kalyani, India D. Sarkar (B) Department of Electrical and Electronics Engineering, National Institute of Technology Nagaland, Chumukedima, Dimapur 797103, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 H. Zen et al. (eds.), Soft Computing and Signal Processing, Lecture Notes in Networks and Systems 840, https://doi.org/10.1007/978-981-99-8451-0_51
1 Introduction

One of the main concerns of power industries is to fulfill the increasing load demand and to distribute the available power evenly to consumers. Power systems sometimes experience power outages, also known as blackouts. A blackout [1–4] may affect the whole power network or just a portion of it and may be triggered by generation insufficiency, limited transmission line power flow capacity, etc. The Indian power grid experienced a massive blackout on July 30 and 31, 2012, in which around 370 million people were affected due to the shutdown of 35.67 GW of generating capacity. Xue et al. [1] have highlighted the major reasons for this blackout event. In North America, the north-eastern power grid also experienced a blackout on August 14, 2003; some of the reasons behind it have been addressed in [2, 3]. Andersson et al. [3] discussed several causes behind the blackout events in Southern Sweden and Eastern Denmark on September 23, 2003, and on the Italian power grid on September 28, 2003. Lee et al. [4] investigated the reasons behind the severe blackout in Taiwan on July 29, 1999, caused by the collapse of 326 transmission towers, in which over 8.46 million people were affected, as well as another blackout event that occurred on September 21, 1999. The key reasons behind blackouts can be a power mismatch between generation and load, under-excitation of generators, over-voltages, frequency deviation, unplanned load shedding, lack of black-start capability, switching issues, communication failures associated with SCADA devices, problems with the protection system, issues with energy storage, etc.

Adibi et al. [5] addressed several issues in power system service restoration (PSSR), such as identification of the fault location, interconnection assistance, local load shedding, cold load pickup, load-generation coordination, standing phase angle, etc. Lindenmeyer et al. [6] discussed the planning and preparation stage of restoration and its dependence on the capacity of the black-start unit, the magnitude of the cranking power supply, and fault identification; system restoration and its dependence on network reconfiguration and load restoration for restoring isolated loads were also discussed. Adibi et al. [7] suggested restoring power within a minimum time based on the availability of different types and sizes of prime movers and the number of switching actions needed to energize feeder lines, transformers, and loads. Sudhakar et al. [8] solved PSSR planning as a multi-objective problem considering different constraints on voltage, current, load, customer priority, reliability, etc. In [9], network reconfiguration of a distribution network for PSSR was performed by connecting or disconnecting the sectionalizing and tie switches via a heuristic technique. Sarkar et al. [10] proposed a network reconfiguration technique to improve voltage stability using Kruskal's maximum spanning tree algorithm. Sarma et al. [11] proposed a network reconfiguration procedure to restore isolated loads with the help of a network reduction technique using an interested tree algorithm. Huang et al. [12] proposed a fast loop-breaking technique for PSSR in the presence of DG. Li et al. [13] found the minimum switching operations by a spanning tree search algorithm for distribution system restoration of a micro-grid. Gholami et al. [14] used
two heuristic methods to restore maximum power in distribution networks while minimizing the load-shedding area and reducing the voltage drop. Sarkar et al. [15] proposed a fast and efficient graph-theory-based approach to maintain a radial topology in distribution networks.

In the present article, an innovative approach to the service restoration problem is employed. The suggested solution effectively casts the restoration of power after a single fault or multiple faults on the network as a minimal spanning tree problem. During the initial simulation, it is considered that all the sectionalizing and tie switches are virtually closed, and it is also assumed that each feeder line containing a tie switch has an impedance proportional to the distance between the two nodes it connects. Hence, at this initial, virtual stage the network is a mesh structure. If single or multiple faults occur on any of the feeders, the impedance of the corresponding feeder is set much higher than that of a normal active feeder line. In the present article, the Kruskal's minimal spanning tree (KMST) technique is proposed, which selects the paths with lower weight (lower impedance) while maintaining a radial structure. As a result, the faulty feeders are removed, and finally a radial structure serving all the loads is obtained. To examine the cogency of the proposed technique, a distribution load flow has been performed to check the overall losses and the voltage profile of the network after the restorative action.
2 Problem Formulation with Constraints

The restoration problem can be formulated as an optimization problem. The first and foremost objective is to identify the feeders with the least impedance; in other words, the minimum-weighted paths must carry power to restore the maximum number of loads. On the other hand, the line or lines on which a fault occurs are immediately removed from the live system and may therefore be considered an open circuit; hence, their impedance, and consequently the weight of the corresponding edges, is large compared with a normal active feeder line. The second important objective is to find an optimal radial restored network that gives an optimal voltage profile with the least power losses. To fulfill these objectives, the following constraints and limit conditions have to be satisfied.
2.1 Power Balance Constraint

During the restoration process, the load-generation balance of active and reactive power must be maintained at all times to supply power to the isolated loads. The mathematical criteria can be expressed as
$$P_{\text{source}} = \sum_{j} P_{\text{load},j}, \tag{1}$$

$$Q_{\text{source}} = \sum_{j} Q_{\text{load},j}, \tag{2}$$
where $P_{\text{source}}$ and $Q_{\text{source}}$ are the active and reactive power generated by the generating unit in a subsystem, and $P_{\text{load},j}$ and $Q_{\text{load},j}$ are the active and reactive power requirements of the $j$th load.
2.2 Voltage Constraint

During restoration, the feeder voltage should not deviate from its minimum and maximum tolerable limits:

$$V_{\min} \le V_m \le V_{\max}, \tag{3}$$

where $V_{\min}$ is the minimum and $V_{\max}$ is the maximum allowable voltage of each feeder.
2.3 Feeder Capacity Constraint

During restoration, the feeder line current should not exceed the maximum tolerable current-carrying capacity:

$$I_{\text{Lineflow}} \le I_{\text{Capacity}}, \tag{4}$$

where $I_{\text{Lineflow}}$ is the line current and $I_{\text{Capacity}}$ is the feeder line capacity.
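A minimal sketch of how these limit checks might be applied to a candidate restored configuration is given below. The per-unit limits and example operating values are placeholders, since the actual limits come from the test system data and the load-flow results.

```python
# Hypothetical feasibility check for a restored configuration (Eqs. (1)-(4)).
def is_feasible(p_source, q_source, p_loads, q_loads,
                v_nodes, v_min, v_max, i_lines, i_caps, tol=1e-3):
    # Power balance constraints, Eqs. (1)-(2).
    if abs(p_source - sum(p_loads)) > tol or abs(q_source - sum(q_loads)) > tol:
        return False
    # Voltage constraint, Eq. (3), at every restored node.
    if any(v < v_min or v > v_max for v in v_nodes):
        return False
    # Feeder capacity constraint, Eq. (4), on every energized line.
    return all(i <= cap for i, cap in zip(i_lines, i_caps))

# Example values in per unit (purely illustrative).
print(is_feasible(p_source=1.80, q_source=0.90,
                  p_loads=[0.6, 0.7, 0.5], q_loads=[0.3, 0.4, 0.2],
                  v_nodes=[1.0, 0.97, 0.95], v_min=0.95, v_max=1.05,
                  i_lines=[0.8, 0.6], i_caps=[1.0, 1.0]))
```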
3 Methods

A graph can be formulated with $v$ vertices (nodes) and $v - 1$ edges (branches). A network can be represented as a graph in which each edge has a specific weight, and the KMST algorithm converts such a network into a tree. Consider $G$ as the input graph, represented by

$$G = \big((i, j),\; W_{i,j}\big), \tag{5}$$
where $i$ and $j$ are nodes and $W_{i,j}$ is the weight of branch $(i, j)$. Let the edge connection between the corresponding nodes $i$ and $j$ be represented by $C_{ij}$. The probable minimum spanning tree (M-S-T) connection is represented by T; the output T forms an adjacency matrix of size $(v \times v)$. The technique for computing the weight of the M-S-T of a network G by Kruskal [16] can be summarized as follows:

1. Sort the edges of G in ascending order by weight. Let T be the set of edges comprising the minimum-weight spanning tree.
2. Add the first edge to T and set the corresponding $C_{ij}$ value to 1.
3. Add the next edge to T if and only if it does not form a cycle in T. If there are no remaining edges, exit and report G to be disconnected.
4. If T has $v - 1$ edges, build the adjacency matrix of the output T, stop, and print the output T; otherwise, go to Step 3.

The aim of the KMST-based PSSR technique is to find the optimal spanning tree with minimum weight while maintaining the $v - 1$ branches of least weight.

Fig. 1 Seven-node network

The seven-node test network shown in Fig. 1 contains six sectionalizing switches (S1–S6) and four tie switches, called 'TS1,' 'TS2,' 'TS3,' and 'TS4,' connected between nodes 2–4, 3–5, 2–6, and 3–7, respectively. Let the impedances (in p.u.) of Feeders 1, 2, 3, 4, 5, and 6 be W1, W2, W3, W4, W5, and W6, respectively. The weights of the directly connected nodes under normal conditions are: Feeder 1: C12 = C21 = W1; Feeder 2: C23 = C32 = W2; Feeder 3: C14 = C41 = W3; Feeder 4: C45 = C54 = W4; Feeder 5: C16 = C61 = W5; Feeder 6: C67 = C76 = W6. Each feeder of the distribution system has a normally closed sectionalizing switch, and between two feeders there is a normally open tie switch. If a tie switch is closed, the corresponding feeder impedance is presumed to be W7, W8, W9, or W10 for closing tie switches 'TS1,' 'TS2,' 'TS3,' and 'TS4,' respectively. In the presence of the tie switches, the virtual feeder connections are C24 = C42 = W7, C35 = C53 = W8, C26 = C62 = W9, and C37 = C73 = W10. The tie switches usually remain idle until a fault restoration task is assigned.
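The sketch below implements the Kruskal procedure described above for the seven-node network using a union-find structure. The numerical per-unit weights assigned to W1–W10 are hypothetical and only respect the ordering assumed in the paper (normal feeders lighter than tie lines); for illustration, a fault on Feeder 5 (the case analyzed in Sect. 3.1) is modeled by assigning it a very large weight so that the resulting tree excludes it.

```python
# Kruskal's minimal spanning tree for the 7-node test feeder (illustrative weights).
def kruskal(n_nodes, edges):
    """edges: list of (weight, u, v); returns the minimum-weight spanning tree."""
    parent = list(range(n_nodes + 1))          # union-find over nodes 1..n_nodes

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]      # path compression
            a = parent[a]
        return a

    tree = []
    for w, u, v in sorted(edges):              # Step 1: ascending order by weight
        ru, rv = find(u), find(v)
        if ru != rv:                           # Step 3: add only if no cycle forms
            parent[ru] = rv
            tree.append((u, v, w))
        if len(tree) == n_nodes - 1:           # Step 4: v - 1 edges -> done
            break
    return tree

# Hypothetical per-unit weights obeying W1 < ... < W4 < W6 for healthy feeders,
# heavier W7..W10 for tie lines, and a huge weight for the faulted Feeder 5.
edges = [
    (0.10, 1, 2), (0.12, 2, 3), (0.14, 1, 4), (0.16, 4, 5),   # Feeders 1-4
    (1e6, 1, 6),                                              # Feeder 5 (faulted)
    (0.20, 6, 7),                                             # Feeder 6
    (0.30, 2, 4), (0.32, 3, 5), (0.34, 2, 6), (0.36, 3, 7),   # Tie switches TS1-TS4
]
print(kruskal(7, edges))
```

With these weights the returned tree keeps Feeders 1–4 and 6 and closes tie switch TS3 (edge 2–6), so the faulted Feeder 5 is left out while all seven nodes remain connected radially, which is exactly the behavior the restoration scheme relies on.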
3.1 Single Feeder Fault

Let us assume that a feeder fault has occurred on Feeder 5, which is connected between nodes 1 and 6. Due to this fault, the downstream loads at nodes 6 and 7 are disconnected from the source node, as illustrated in Fig. 2a; only one feeder is removed by the fault in this case. Now only four feeders (edges) are physically available, which is not sufficient to supply the isolated nodes (6, 7). Hence, there is a need to operate the tie and sectionalizing switches. Initially, it is assumed that all tie switches are closed, forming loops; this state may be referred to as the virtual state of the network. In the presence of the virtually closed tie lines, the existing radial network is converted into a loop network, as shown in Fig. 2b. Loop 1 is formed by nodes 1–2–4–1, Loop 2 by nodes 2–3–5–4–2, Loop 3 by nodes 1–2–6–1, and Loop 4 by nodes 2–3–7–6–2. The goal during restoration is to restore the maximum amount of load with minimal tie-switch activity, which leads to less power loss and a good voltage profile at the load nodes while maintaining the radial structure. Kruskal's minimum spanning tree yields an optimal radial network consisting of lower-weight edges (lower feeder impedance), which incurs less power loss and voltage drop along the feeder lines. In the proposed algorithm, the weight of a tie line is presumed to be greater than that of a normal feeder, except for the faulty feeder line. Here, it is considered that W1 < W2 < W3 < W4 < W6 (the feeder impedances in p.u. of Feeder 1 to Feeder 6 in ascending order, all less than W5). For tie switches, reactance values are W7 < W8 < W9 < W10 (