Algorithms for Intelligent Systems Series Editors: Jagdish Chand Bansal · Kusum Deep · Atulya K. Nagar
Shikha Agrawal · Kamlesh Kumar Gupta · Jonathan H. Chan · Jitendra Agrawal · Manish Gupta Editors
Machine Intelligence and Smart Systems Proceedings of MISS 2020
Algorithms for Intelligent Systems Series Editors Jagdish Chand Bansal, Department of Mathematics, South Asian University, New Delhi, Delhi, India Kusum Deep, Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India Atulya K. Nagar, School of Mathematics, Computer Science and Engineering, Liverpool Hope University, Liverpool, UK
This book series publishes research on the analysis and development of algorithms for intelligent systems, together with their applications to various real-world problems. It covers research related to autonomous agents, multi-agent systems, behavioral modeling, reinforcement learning, game theory, mechanism design, machine learning, meta-heuristic search, optimization, planning and scheduling, artificial neural networks, evolutionary computation, swarm intelligence and other algorithms for intelligent systems. The series includes recent advancements, modifications and applications of artificial neural networks, evolutionary computation, swarm intelligence, artificial immune systems, fuzzy systems, autonomous and multi-agent systems, machine learning and other related areas of intelligent systems. The material will be beneficial for graduate students, postgraduate students and researchers who want a broader view of advances in algorithms for intelligent systems. The contents will also be useful to researchers from other fields who are unfamiliar with the power of intelligent systems, e.g., researchers in bioinformatics, biochemists, mechanical and chemical engineers, economists, musicians and medical practitioners. The series publishes monographs, edited volumes, advanced textbooks and selected proceedings.
More information about this series at https://link.springer.com/bookseries/16171
Editors

Shikha Agrawal
University Institute of Technology, Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal, Madhya Pradesh, India

Kamlesh Kumar Gupta
Rustamji Institute of Technology, Gwalior, Madhya Pradesh, India

Jonathan H. Chan
King Mongkut's University of Technology Thonburi, Bangkok, Thailand

Jitendra Agrawal
School of Information Technology, Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal, Madhya Pradesh, India

Manish Gupta
Vikrant Institute of Technology and Management, Gwalior, Madhya Pradesh, India
ISSN 2524-7565 ISSN 2524-7573 (electronic)
Algorithms for Intelligent Systems
ISBN 978-981-33-4892-9 ISBN 978-981-33-4893-6 (eBook)
https://doi.org/10.1007/978-981-33-4893-6

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021, corrected publication 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
Nowadays, the world is becoming so complex that it can no longer be comprehended by a single individual: information is growing at a tremendous rate, and software systems are becoming increasingly difficult to control. This has inspired computer scientists to design alternative intelligent systems in which control, pre-programming and centralization are replaced by autonomy, emergence and distributed functioning. The field of research focused on developing such systems and applying them to solve a wide variety of problems is termed "machine intelligence". Machine intelligence is a computing methodology that provides a system with the ability to learn and/or to deal with new situations, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. Since its origin, the number of its successful applications has grown rapidly, and the use of machine intelligence algorithms has increased over the years. Machine intelligence (MI) techniques are ideal as tools of "knowledge discovery from data"—in short, "data to knowledge"—for complex and often apparently intractable systems. There is a need to expose academicians and researchers to MI techniques and their multidisciplinary applications, for better utilization of these techniques and for their future development in building smart systems. This book not only introduces MI techniques along with several of their applications, but also covers novel applications that combine MI techniques and utilize their hybrid forms in practical areas such as engineering systems used in agriculture, military and civilian applications, manufacturing, biomedical and healthcare systems, as well as education. Equally important, this book intends to demonstrate successful case studies, identify challenges and bridge the gap between theory and practice in applying machine intelligence to solving all kinds of real-world problems. Since machine intelligence is a truly interdisciplinary field, scientists, engineers, academicians, technology developers, researchers, students and government officials will find this text useful in handling their complicated real-world issues by using machine intelligence methodologies and in furthering their own research efforts in this field. Moreover, by bringing together representatives of academia and industry, this book
is also a means for identifying new research problems and for disseminating the results of research and practice. The main goal of this book is to provide scientific researchers and engineers with a vehicle in which innovative technologies for developing smart systems through machine intelligence techniques are discussed.

Bhopal, India — Shikha Agrawal
Gwalior, India — Kamlesh Kumar Gupta
Bangkok, Thailand — Jonathan H. Chan
Bhopal, India — Jitendra Agrawal
Gwalior, India — Manish Gupta
Acknowledgments
The International Conference on Machine Intelligence and Smart Systems (MISS-2020) was successfully organized during 24–25 September 2020 at the Rustamji Institute of Technology, Tekanpur, Gwalior, Madhya Pradesh, India. MISS-2020 attracted a great number of academics, scientists, engineers, postgraduates and other professionals from a wide range of countries. We are committed to creating a high-level international forum for researchers and engineers to present and discuss recent advances, new techniques and applications in the field of machine intelligence, smart systems and their associated learning systems and applications. The core committee took utmost care with each and every facet of the conference, especially regarding the quality of the submissions. The papers were evaluated on the basis of their significance, novelty and technical quality. A conference of this magnitude was possible only due to the consistent and concerted efforts of many good souls. We acknowledge the contribution of our advisory body members, technical programme committee, keynote speakers, people from industry and academia, and the media, who have been instrumental in making this conference possible. We take this opportunity to thank the authors of all submitted papers for their hard work, adherence to the deadlines and patience with the review process. The quality of a refereed volume depends mainly on the expertise and dedication of the reviewers. We are indebted to the programme committee members and external reviewers, who not only produced excellent reviews but also did so in short time frames. We would also like to thank Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal, India, for coming forward to organize this conference. Prof. Sunil Kumar, Vice Chancellor, RGPV, Bhopal, deserves kudos for the great support he has extended from day one of the conceptualization of the idea of conducting the conference. We owe special thanks to our conference sponsor, TEQIP-III, RGPV, Bhopal, India, and to our publishing partner, Springer. We would also like to thank the participants of this conference, who placed the conference above all hardships. Finally, we would like to thank all the volunteers who spent tireless efforts in meeting the deadlines and arranging
every detail to make sure that the conference could run smoothly. All these efforts will have been worthwhile if the readers of these proceedings and the participants of this conference find the papers and the conference inspiring and enjoyable.
Contents
1 Synesthetic Learning Pedagogy (SLP)—An Exploratory Investigation — V. Lakshmi Narasimhan, G. Vasistha Bhargavi, C. Lakshmi, and Liza Lee — 1
2 MSB/LSB Prediction Based Reversible Data Hiding in Encrypted Images: A Survey — Rama Singh and Ankita Vaish — 11
3 A Review of Image Enhancement Techniques in Medical Imaging — Anand Jawdekar and Manish Dixit — 25
4 Smartphone User Identification and Authentication Based on Raw Accelerometer Walking Activity Data Using Convolutional Neural Networks — Prabhat Kumar and S. Suresh — 35
5 Elastic Optical Network: A Promising Solution for Future Communication Networks Considering Differentiated CoS — Shruti Dixit, Deepak Batham, and Ravindra Pratap Narwaria — 49
6 Smart Home Energy Prediction with GRU Recurrent Neural Network Model — Dimpal Tomar, Jai Prakash Bhati, and Pradeep Tomar — 61
7 A Survey on Watermarking and Its Techniques — Sanjay Patsariya and Manish Dixit — 71
8 Analysis of Emotion Recognition with Gesture Analysis Through the Machine Learning and Fuzzy Concepts — Samta Jain Goyal and Rajeev Goyal — 79
9 Dyslexia Detection Using Android Application — Pardeep, Jagrit Kalra, Aman Jatain, and Yojan Arora — 87
10 Prediction of Indian River Water Temperature Using Convolutional Neural Network and Reliable Data Transmission Using IoT — K. Sujatha, T. Godhavari, K. Senthil Kumar, and B. Deepa Lakshmi — 97
11 Performance Evaluation of Conventional and Systematic IT Services Automation — Said Masihullah Hashimi, Khushboo Tripathi, and Deepthi Sehrawat — 105
12 Internet of Things (IoT): State-of-the-Art Technologies, Challenges and Applications — Ranjit Rajak and Kritika Selot — 117
13 Analysis on Protocol-Based Intrusion Detection System Using Artificial Intelligence — Savitri Mandal, A. Sai Sabitha, and Deepti Mehrotra — 131
14 Neural AutoML with Convolutional Networks for Diabetic Retinopathy Diagnosis — V. K. Harikrishnan, Meenu, and Ashima Gambhir — 145
15 The Promise of Deep Learning for Indian Roads: A Comparative Evaluation of Architectures — Mriganka Sharma, Jai Sehgal, Joyjit Chatterjee, and Anu Mehra — 159
16 To Reduce Gross NPA and Classify Defaulters Using Shannon Entropy — Nikhil Sonavane, Ambarish Moharil, Chirag Kedia, and Mansimran Singh Anand — 173
17 Long-Distance Optical Communication Network with Linear Chirped Fiber Bragg Grating — Rajkumar Gupta and M. L. Meena — 185
18 Covid-19 Containment: Demystifying the Research Challenges and Contributions Leveraging Digital Intelligence Technologies — Chellammal Surianarayanan and Pethuru Raj Chelliah — 193
19 Power-Efficient Combinational Circuits Using Reversible Gate — Prashant Kumar, Neeraj Gupta, and Rashmi Gupta — 215
20 Extractive Summarization of Text Using Weighted Average of Feature Scores — Shatajbegum Nadaf and Vidyagouri B. Hemadri — 223
21 Sentimental Analysis on Impact of COVID-19 Outbreak — Deepika Chauhan and Chaitanya Singh — 233
22 Design and Implementation of Drone Using Deep Learning — Pooja Batra Nagpal, Sarika Chaudhary, and Aritra Roy — 243
23 Survey on Intelligent Healthcare: An Approach Toward Well-Being — Ugale Pradip and Asmita A. Moghe — 255
24 Enhanced Lightweight Model for Detection of Phishing URL Using Machine Learning — Siddhesh Masurkar and Vipul Dalal — 267
25 Detection of Phishing E-Mails: A Learning-Based Approach — Shivarajakumar Abbigeri and Anand Pashupatimath — 281
26 Exploring Web Scraping with Python — Allan Sasi, Ayush Deep, Krishan Kumar, and Vivek Birla — 287
27 Multi-layer Authentication Technique for Medical Image Authentication in Cloud Computing — Rajesh Yadav and Anand Sharma — 297
28 Quantum Cryptography for Data Science Security — Rashi Sharma and Anand Sharma — 305
29 An Unemployment Prediction Rate for Indian Youth Through Time Series Forecasting — Shivam Sharma and Hemant Kumar Soni — 315
30 Stratification to Improve Systematic Sampling for Big Data Mining Using Approximate Clustering — Kamlesh Kumar Pandey and Diwakar Shukla — 337
31 Smart Face Locker Using OpenCV and Arduino with Mail Transfer — Sarika Chaudhary, Charu Jain, and Ritvika Dhull — 353
32 Analysis of Unsupervised Learning Algorithms for Anomaly Mining with Bitcoin — Gangili Divija Arya, Kothuri Venkata Sai Harika, Deeptimahanti Venkata Rahul, Shanmukhi Narasimhan, and Asha Ashok — 365
33 Design and Fabrication of Pneumatic-Powered Upper Body Exoskeleton — Kiran Somisetti — 375
34 Comparison of MPPT Algorithms in Stand-Alone Photovoltaic (PV) System on Resistive Load — Dilip Yadav, Nidhi Singh, and Vikas Singh Bhadoria — 385
35 Development of a Low-Cost ECG Device — Raghav Bali, Paras Saini, Adeeb Khan, Md Shahbaz Alam, and Brajesh Kumar — 401
36 Detection Analysis of DeepFake Technology by Reverse Engineering Approach (DREA) of Feature Matching — Sonya Burroughs, Kaushik Roy, Balakrishna Gokaraju, and Khoa Luu — 421
37 Efficient Analysis and Classification of Stages Using Single Channel of EEG Through Supervised Learning Techniques — Santosh Kumar Satapathy, Praveena Narayanan, and D. Loganathan — 431
38 TERRABOT—A Multitasking AI — Anmol Kumari, Komal Gupta, and Deepika Rawat — 443
39 Mutation Operator-Based Image Encryption Algorithm for Securing IoT — Rashmi Rajput and Manish Gupta — 457
40 A Novel Method for Corona Virus Detection Based on Directional Emboss and SVM from CT Lung Images — Arun Pratap Singh, Akanksha Soni, and Sanjay Sharma — 463
41 Simulation of Physical Layer of IEEE 802.22 WRAN Under Different Cyclic Prefix — Sharad Jain and Ashwani Kumar Yadav — 477
42 Emotion Detection for Traveling Services Using Rule-Based Fuzzy Inference System — Roop Ranjan and A. K. Daniel — 489
43 Comparative Analysis of Authentication Techniques with Mechanism Implementing HoneyPot Over Cloud — Amit Wadhwa — 501
44 Human Emotion Recognition Through Facial Expressions — Rajasekhar Nannapaneni and Subarna Chatterjee — 513
45 Multi-channel CNN-BiGRU Neural Network for Sentiment Analysis — Bhavik Kanekar and Anand Godbole — 527
46 A Research on Different Type of Possible Attacks on Blockchain: Susceptibilities of the Utmost Secure Technology — Arvind Panwar and Vishal Bhatnagar — 537
47 Comparison of Pre-trained Deep Model Using Tomato Leaf Disease Classification System — Shravankumar Sauda, Loveneet Kaur, and Madan Lal — 553
48 Blockchain-Enabled Secure Electronic Voting System in India — Amber Gautam, Juhi Singh, and Neha Bhateja — 567
49 Haar Features and JEET Optimization for Detecting Human Faces in Images — Navin K. Ipe and Subarna Chatterjee — 579
50 Accurate Detection of Breast Cancer Using GLCM and LBP Features with ANN via Mammography — Ashutosh Kumar Singh, Rakesh Narvey, and Vishal Chaudhary — 593
51 AI-Based Enabled Performances Measurements for MOOCs — Atulkumar Gupta and Surekha Dholay — 605
52 Application of Hidden Markov Model to Analyze the Biometric Signature: A Comprehensive Survey — Neha Rajawat, Bharat Singh Hada, and Soniya Lalwani — 621
53 RETRACTED CHAPTER: Industrial Internet of Things (IIoT) Framework for Real-Time Acoustic Data Analysis — Sathyan Munirathinam — 635
54 Sentential Negation Identification of FMRI Data Using k-NN — Ashish Ranjan, Anil Kumar Singh, Anil Kumar Thakur, Ravi Bhushan Mishra, and Vibhav Prakash Singh — 657
55 Automation Soil Irrigation System Based on Soil Moisture Detection Using Internet of Things — Poonam Dhamal and Shashi Mehrotra — 665
56 Threat Detection on UDP Protocols Using Packet Rates in IoT — T. Subburaj and K. Suthendran — 675
57 Robust Lightweight Image Encryption Technique Using Crossover Operator — Gaurav Mittal and Manish Gupta — 683
58 Peacock Repellant Technique for Crop Protection in the Agricultural Land — S. Anitha, K. Lalitha, and V. Chandrasekaran — 691
Retraction Note to: Industrial Internet of Things (IIoT) Framework for Real-Time Acoustic Data Analysis — Sathyan Munirathinam — C1
Author Index — 699
About the Editors
Dr. Shikha Agrawal is Director of the Centre for University Placement, Training & Corporate Affairs, Rajiv Gandhi Proudyogiki Vishwavidyalaya, and Associate Professor in the Department of Computer Science & Engineering at the University Institute of Technology, RGPV, Bhopal (MP), India. She obtained her B.E., M.Tech. and Ph.D. in Computer Science & Engineering. She has more than seventeen years of teaching experience. Her areas of interest are artificial intelligence, soft computing, computational intelligence and particle swarm optimization. She has published more than 50 research papers in SCI/SCIE/Scopus/UGC-approved reputed international journals and holds three patents. She authored one book, "Advanced Database Management System", and 12 chapters with international publishers. She heads research projects from various government organizations. For her outstanding research work in information technology, she was awarded the title of "Young Scientist" by the Madhya Pradesh Council of Science and Technology, Bhopal. She has been elected an IEEE Senior Member, has participated in numerous (more than 15) conference presentations (including invited and peer-reviewed oral presentations and panel discussions), has chaired technical sessions at various international conferences, and has served on the reviewer committee of the International Journal of Computer Science and Information Security, USA, and of many international conferences of IEEE, Springer, etc.

Dr. Kamlesh Kumar Gupta is Principal of the Rustamji Institute of Technology (RJIT), BSF Academy, Tekanpur, MP, India. He completed his B.E., M.Tech. and Ph.D. in Computer Science & Engineering at RGPV Bhopal, MP, India. He has published more than 40 research papers in various journals and conferences. He organized an international conference on emerging trends in computer science and its applications at RJIT, as well as various training courses on computer literacy at the Border Security Force Academy, Tekanpur (MHA). He has guided many Ph.D. students at universities such as RGPV, Uttarakhand Technical University and Amity University, Gwalior. He is a member of CSI and IETE and a member of the BOG of RGPV University, Bhopal.
Dr. Jonathan H. Chan is an Associate Professor of Computer Science at the School of Technology, King Mongkut's University of Technology Thonburi (KMUTT), Thailand. Jonathan holds B.A.Sc., M.A.Sc. and Ph.D. degrees from the University of Toronto and was a Visiting Professor there until the end of 2019. He was also a Visiting Scientist at The Centre for Applied Genomics at Sick Kids Hospital in Toronto on several occasions. He is Section Editor of Heliyon Computer Science (Cell Press), Action Editor of Neural Networks (Elsevier) and a member of the editorial boards of the International Journal of Machine Intelligence and Sensory Signal Processing (Inderscience), the International Journal of Swarm Intelligence (Inderscience) and Proceedings in Adaptation, Learning and Optimization (Springer). He is a senior member of IEEE, ACM and INNS and a member of the Professional Engineers of Ontario (PEO). His research interests include intelligent systems, cognitive computing, biomedical informatics, and data science and machine learning in general.

Dr. Jitendra Agrawal is Associate Professor in the School of Information Technology at Rajiv Gandhi Proudyogiki Vishwavidyalaya, MP, India. He earned his master's degree from Samrat Ashok Technology Institute, Vidisha (MP), in 1997, and was awarded a Doctor of Philosophy in Computer & Information Technology in 2012. His research interests include databases, data structures, data mining, soft computing and computational intelligence. He has published more than 70 papers in international journals and conferences, authored 4 books and 10 chapters with international publishers, and published 3 patents. He is the recipient of the Best Professor in Information Technology Award from the World Education Congress in 2013. He has delivered numerous invited talks and keynote addresses at different academic forums on emerging issues in the field of information technology and innovations in teaching and learning systems. He is a senior member of the IEEE (USA), a life member of the Computer Society of India (CSI), a life member of the Indian Society for Technical Education (ISTE) and a member of the Youth Hostel Association of India (YHAI) and IAENG.

Mr. Manish Gupta is Assistant Professor in Computer Science & Engineering at the Vikrant Institute of Technology & Management, Gwalior, MP, India. He received his B.E. from MITS Gwalior, MP, India, and his M.Tech. from ABV-IIITM Gwalior, MP, India, and is pursuing a Ph.D. at RGPV Bhopal, MP, India. He has published more than 25 research papers in various journals, IEEE conferences and Springer conferences, and has published two books. He is currently working on a Central Government project, "Unnat Bharat Abhiyan", sanctioned by the Government of India.
Chapter 1
Synesthetic Learning Pedagogy (SLP)—An Exploratory Investigation

V. Lakshmi Narasimhan, G. Vasistha Bhargavi, C. Lakshmi, and Liza Lee
1 Introduction

When you read an article, do you remember anything? Does that include this article too? When you smell jasmine, do you remember any incident in your life? If you answered yes to the latter question, you might have synesthesia. In the mid-nineteenth century, synesthesia intrigued an art movement that sought sensory fusion, according to Cytowic [1]. The various senses that we possess play an important role in writing, listening, speaking, reading, feeling, etc. Thus, multimedia concerts of music and light became popular, particularly with nonclassical performers. Cytowic argues that "such deliberate contrivances are qualitatively different from the involuntary experiences that I am calling synesthesia in this review" (Cytowic 1995) [2]. He defines synesthesia as the involuntary remembrance coupled with physical experiences, and he sharply distinguishes its phenomenology from "metaphor, literary tropes, sound symbolism, and deliberate artistic contrivances that sometimes employ the term 'synesthesia' to describe their multisensory joining" (Cytowic 1995) [2, 3].
V. Lakshmi Narasimhan
University of Botswana, Gaborone, Botswana
e-mail: [email protected]

G. Vasistha Bhargavi (B) · L. Lee
Chaoyang University of Technology, Taichung City, Taiwan
e-mail: [email protected]

L. Lee
e-mail: [email protected]

C. Lakshmi
RGM College of Engineering and Technology, Nandyala, Andhra Pradesh, India
e-mail: [email protected]
One can see the "synesthetic experience" in all forms of fine arts—in poetry, painting, sculpture and music, and in particular in Indian dance forms such as Kuchipudi, Bharathnatyam and Kathak. The "synesthetic experience" serves as a means to unify the arts through a psychological unity of the senses. Synesthesia refers to the transfer of qualities from one sensory domain to another—the translation of texture to tone, or of tone to color, smell or taste. The use of color in music has also been employed in commercial tools and educational aids and has been seen throughout different art movements from the eighteenth to the twentieth century, including the idea of synchromism, the syncing of color to music in painting. The use of color organs goes as far back as the early 1700s, with many different attempts at pairing colors and color wheels to Western musical scales [3]. Rainbow, Music Inc., Piano Inc., X1 (a very recent Japanese start-up) and other modern companies and publishers use colors to teach such topics as ear training, note names, piano key identification, chord identification and basic sight-reading skills. Richard Gregory notes that the most common, natural and well-known mixing of the senses is flavor: "Flavor is usually defined as the overall sensation of taste and smell."
2 Synesthetic Learning Pedagogy (SLP)

The word synesthesia comes from two Greek roots, syn (together) and aesthesia (perception), thereby meaning "combined perception". For example, on seeing the word "rainbow", a synesthete may recall the number "7". The prevalence of synesthesia within a given population seems to be much higher than previously assumed (Cohen et al. 1996) [4]: one out of 23 individuals in the general population seems to have at least one type of synesthesia (Simner et al. 2006) [5]. Psychological studies of synesthesia have their roots in medical research. The physician Georg Sachs published the first study of color hearing (audition colorée) in 1812. At that time, synesthesia was considered a medical pathology, and clinical case studies were often used as a method of research. Sachs described photisms (the visual perception of colored spots in front of the eyes, as in after-images) that were perceived by his sister and himself with numbers, days of the week, letters and musical tones. Synesthesia is real and not a product of imagination (Nunn et al. 2002) [6], and at present many experiments are being carried out on its neurological basis. Brain anatomy studies have shown that synesthesia has a number of variants; e.g., grapheme-color synesthetes may have an increased gray matter volume in specific parts of the brain (Peter et al. 2008) [7]. Using diffusion tensor imaging, it was discovered that grapheme-color synesthetes show increased neuronal connectivity between different brain regions (Rouw and Scholte 2007) [8]. Ramachandran and his team at the University of California, San Diego, have performed a phenomenal amount of work on the neurological bases of synesthesia, and they even present rules for comprehending the esoteric area of esthetics [9–14].
Synesthesia is an important key to understanding the human mind, especially creativity. There is a connection between learning, memorizing and synesthesia. Synesthesia and science pedagogy are also related to physical and intellectual space concepts. A study of Paul Klee [15] showed that neuroesthetics (art as an extension of the functions of the brain) has influenced many artists in various fields and provided a rich background for the development of new (syn-)esthetics, and this phenomenon can be related to teaching pedagogy as well. Modernism has created the conditions for a multidimensional pictorial autonomy, in which perception precedes meaning and medium precedes message. In periods of intense creativity, the search for correspondences and complementarities between the senses increases. The complementary power of the brain and teaching pedagogy is extending our horizon of perception. Accordingly, synesthesia is a neurological condition in which a person responds to stimuli such as smell, color, flavor or music. A synesthete can remember 70–90% of an answer over a period of weeks, whereas a non-synesthete can remember only 20–40% (the works of Ebbinghaus [16, 17] on memory and recall are also worth noting in this context). This paper describes an experiment on the synesthetic learning process and how it can be employed to enhance the learning capabilities of first-year B.Tech. (freshman) level students. The rest of the paper is organized as follows: Sect. 3 details the research questions posed for the study, along with the corresponding hypotheses that form the basis of this study. Section 4 sketches the preliminary experiment performed, along with the indicative results obtained. Section 5 reviews related research in this area, while the conclusion summarizes the paper and provides pointers for further work.
3 Research Questions Posed

The "synesthetic pedagogical ideas" strand was framed with a set of specific research questions, to understand their flexibility in terms of pedagogical innovation and to explore the place of pedagogy in the core purposes of engineering higher education. The research process presented in this paper concerns the following research questions on synesthetic pedagogy:

• Why and to what extent might synesthetic pedagogies be promoted—and in what ways?
• How does synesthetic pedagogy improve the memory of the student?
• How might we connect multisensory inputs to improve memory and recall of entities?
• How can flexible learning pathways support new forms of thinking, debate and action in engineering higher education?
• What are the impacts of time and memory on synesthetes?
• How can we improve students' memory by connecting learning with food, music, images and colors?
Table 1 Mean, SD, and rho values of placebo group compared with IPG group

Test—1                      μ       σ        ρ
Placebo                     6       2.4495   –
IG: Recall with image       8.27    2.38951  0.2333
IG: Recall without image    7.27    3.5737   0.1835

Table 2 Mean, SD, and rho values of placebo group compared with AG group

Test—2                      μ       σ        ρ
Placebo                     3       1.9365   –
AG: Recall with audio       10.27   2.7664   0.3209
AG: Recall without audio    8.27    2.0929   0.2060
4 Details of the Exploratory Experiment

For the purpose of experimentation, we selected first-year ("freshman") Bachelor of Technology (B.Tech.) students and divided them into four groups, namely the placebo group (PG), image/photo group (IPG), audio group (AG) and video group (VG). A lecture from the course on object-oriented analysis using unified modeling—specifically on the topic of developing a conceptual model using UML—was presented to each group of students. The synesthetic triggers were collected from open-source materials: a natural scenery image, an audio recording of traditional classical music, and, for the videos, interviews with Satya Nadella, CEO of Microsoft (available on YouTube). The same topic was taught to all four groups at different times by the same instructor; however, the placebo group (PG) received no cues, while the IPG, AG and VG groups were provided image/photo, audio or video stimuli, respectively, according to the group names. Every student belonged to only one group. All groups were given a one-hour break, followed by an objective exam on the same topic. The exams were evaluated by the same instructor. We nominally assume that all students and instructors have the same capabilities in terms of understanding and teaching, respectively; the latter assumption is nominal under the law of large numbers in statistics. Tables 1, 2, 3 and 4 detail the statistical results from this study, wherein Tables 1, 2 and 3 show the comparison of the mean (μ), standard deviation (σ) and rho values (ρ) of the placebo group (PG) with those of the image/photo group (IPG), audio group (AG) and video group (VG), respectively.
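Although the paper does not list the raw score sheets, the per-group descriptive statistics reported in Tables 1–3 can be reproduced with a few lines of code. The sketch below uses hypothetical per-student exam scores; the group sizes and values are illustrative assumptions, not data from the study.

```python
# Minimal sketch of the per-group descriptive statistics (mu and sigma)
# used in Tables 1-3. The score lists below are hypothetical.
from statistics import mean, stdev

groups = {
    "Placebo": [4, 6, 8, 5, 7, 6, 9, 3, 6, 7, 5],                 # assumed scores
    "IPG (recall with image)": [8, 10, 7, 9, 11, 6, 8, 9, 7, 10, 6],
}

for name, scores in groups.items():
    print(f"{name}: mu = {mean(scores):.2f}, sigma = {stdev(scores):.4f}")
```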
Table 3 Mean, SD, and rho values of placebo group compared with VG group

Test—3                      μ       σ        ρ
Placebo                     10      1        –
VG: Recall with video       3.44    1.8971   0.2747
VG: Recall without video    3.11    1.3699   0.0984

Table 4 Intervals used for Tables 1, 2, and 3

Test—1                      Intervals
Placebo                     [7.6964, 4.3036]
IG: Recall with image       [9.98108, 6.55892]
IG: Recall without image    [9.3822, 5.1578]

Test—2                      Intervals
Placebo                     [4.3410, 1.659]
AG: Recall with audio       [11.905, 8.635]
AG: Recall without audio    [9.5069, 7.0331]

Test—3                      Intervals
Placebo                     [10.6925, 9.3075]
VG: Recall with video       [4.6793, 2.2007]
VG: Recall without video    [4.0049, 2.2151]
It can be inferred from Table 1 that the ρ value of the IPG learners (0.2333) is higher than that of the recall-without-image condition (0.1835), implying that the image/photograph-based synesthetic aid helps the learners recall the taught material better. Similarly, Table 2 shows that the ρ value of the AG learners (0.3209) is higher than that of the recall-without-audio condition (0.2060), implying that the audio-based synesthetic aid helps the learners recall the taught material better. Likewise, Table 3 shows that the ρ value of the VG learners (0.2747) is higher than that of the recall-without-video condition (0.0984), implying that the video-based synesthetic aid also helps recall. From Tables 1, 2 and 3, it can be concluded that the image, audio and video perceptions influenced the respective groups. It can also be inferred from the overall comparison of the intervals that the recall-with-audio group recalls the taught material better than the placebo, IPG and video groups. For completeness' sake, the intervals for Tables 1, 2 and 3 are presented in Table 4. Our prediction, after using synesthetic aids (image, music and video) to teach selected topics on (i) object-oriented analysis and design using UML, (ii) data warehousing and data mining, (iii) preprocessing techniques for data mining, and (iv) cluster analysis and cluster typing of data, is that there are good potential educational benefits in pairing learning with synesthetic aids. In this exploratory research, we found that the students' learning skills showed improvement in memory performance in the various follow-up tests conducted.
Table 1 shows the performance of the placebo group of students, whose retention went on decreasing as time passed. Their performance was comparatively weak against that of the image, audio and video groups (as shown in Table 2), which show a consistent increase in the memory of the students over the same time period. As noted, several tests were conducted using audio and video to determine the possible role that these aids could play in learning technical materials. We present those results, as well as the possible ways in which image, audio (Indian classical) and video (a simple pair talk) could be paired with certain topics relating to computer science engineering, in Table 1. By creating a multisensory, enlivened learning experience in technical courses, under-performing students, and those who are more visual and kinesthetic learners, may benefit. In our study, we have looked at the differences in the memory retention of visual and audio stimuli. We use statistical metrics such as the mean (M), standard deviation (SD), rho (ρ) and the χ² test for our statistical calculations. In each test, we observe a very reasonable improvement in the recall performance of the students (see Table 1). This study shows a high degree of intersubjective agreement about which emotions are triggered by a given musical selection. The results of the experiments also show that lectured information is better remembered in the presence of music; further, films are better recalled by subjects when music is used in an intentional manner. These findings all point to some type of connection between audio and visual stimuli. In a similar vein, another round of the experiment was conducted with another set of students (Test 2), wherein the students were taught the subject of data warehousing and data mining with the following sub-topics: (i) data mining functionalities, (ii) data preparation techniques, (iii) classification and prediction, and (iv) mining frequent patterns, associations and correlations, along with natural, audio and visual synesthetic aids. Despite higher mean scores for the synesthetic groups (IPG, AG and VG), the differences were considerably significant for the final test. The intervals for the second test with another set of learners are given in Table 5, as drawn from the values of mean, SD and rho in Test 2.

Table 5 Results for the final test and for the retention test
Test—2.1                    Intervals
Placebo                     [7.8878

dP/dV > 0   left of MPP
dP/dV < 0   right of MPP

where dP/dV = d(VI)/dV = I + V · (dI/dV)    (1)

The quantity dP/dV is defined as the maximum power point identifier factor. By utilizing this factor, the IC method can effectively track the MPP of the PV array [6].
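The criterion above maps directly onto a duty-cycle update rule. The following is a minimal sketch of one incremental-conductance step in Python, not the authors' Simulink subsystem; the fixed step size and the sign convention (a boost converter whose input voltage rises as the duty cycle falls) are our assumptions for illustration.

```python
def ic_step(v, i, v_prev, i_prev, duty, delta_d=0.005):
    """One incremental-conductance update of the converter duty cycle.

    dP/dV = I + V*(dI/dV) is positive to the left of the MPP and
    negative to the right of it, so its sign tells us which way to move.
    """
    dv, di = v - v_prev, i - i_prev
    if dv == 0:
        # Voltage unchanged: react to a pure irradiance change, if any.
        if di > 0:
            duty -= delta_d
        elif di < 0:
            duty += delta_d
    else:
        dp_dv = i + v * (di / dv)     # the MPP identifier factor
        if dp_dv > 0:                 # left of MPP: raise the PV voltage
            duty -= delta_d
        elif dp_dv < 0:               # right of MPP: lower the PV voltage
            duty += delta_d
    return min(max(duty, 0.0), 1.0)   # keep the duty cycle in [0, 1]
```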
2.6 Fuzzy Logic Controller Method

A Fuzzy Logic Controller (FLC) is connected to the PV module/array output to generate the current MPPT reference signal. The MPPT duty cycle is changed by either adding to or subtracting from the previous duty cycle. The name "fuzzy logic" may suggest half-baked or ambiguous logic; a most important attraction of fuzzy logic schemes is, in fact, the concurrent design of the logic. The major components of an FLC are fuzzification, knowledge-base and membership-function design, decision-making logic, and defuzzification [7]. The fuzzification stage maps the sensor and other inputs onto the appropriate membership functions and truth values. The error signal can then be calculated: the values of E(k) and dE(k) are determined using the following formulas:
E(k) = [Ppv(k) − Ppv(k − 1)] / [Vpv(k) − Vpv(k − 1)]    (2)

dE(k) = E(k) − E(k − 1)    (3)
where Ppv(k) and Vpv(k) represent the instantaneous output power and output voltage of the photovoltaic generator, dE(k) is the global dynamic error, E(k) and E(k − 1) are two consecutive samples of the error, and k is the sampling counter. The fuzzy logic MPPT is realized with a Mamdani system; among the several available defuzzification methods, the center-of-gravity (centroid) method is usually preferred because of its better performance [12, 13, 15]. Figure 2 shows the two inputs (voltage and current) and the single output (duty cycle) with their membership functions under the Mamdani operation: Vn and In represent the inputs, and D indicates the output of the .fis file.
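To make Eqs. (2) and (3) concrete, the sketch below computes E(k) and dE(k) from two consecutive samples and applies a deliberately crude duty-cycle adjustment. It is only a stand-in for the authors' fdl2.fis Mamdani controller, whose membership functions and rule base are not reproduced in the paper.

```python
def fuzzy_controller_inputs(p, v, p_prev, v_prev, e_prev):
    """Return the error E(k) (Eq. 2) and change of error dE(k) (Eq. 3)."""
    dv = v - v_prev
    e = (p - p_prev) / dv if dv != 0 else 0.0
    return e, e - e_prev

def crude_rule_base(e, duty, step=0.01):
    """Illustrative stand-in for a Mamdani rule base with centroid
    defuzzification: drive the operating point toward E = 0 (the MPP)."""
    if e > 0:                 # operating to the left of the MPP
        duty -= step
    elif e < 0:               # operating to the right of the MPP
        duty += step
    return min(max(duty, 0.0), 1.0)
```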
Fig. 2 Structure of the fuzzy logic .fis file
3 Simulation Results and Discussions

The maximum power point tracking control technique is used mainly to extract the maximum available power of the modules, for the solar irradiance and temperature at a particular instant of time, by means of an MPPT controller. A DC-DC converter transfers this maximum power from the solar PV module to the load. The MPP is reached by changing the duty cycle of the converter so that the load impedance is varied and matched to the source at the point of peak power. Considering these points, a simulation model of the PV system was designed in MATLAB/Simulink, consisting of one panel, one boost converter and the MPPT, together with measuring blocks such as a multimeter and scope, for all three MPPT techniques. Temperature and radiation level are variable parameters that can be set according to the environmental conditions, but in this paper they have been kept constant, and other factors such as dust, sand, shading and heating have not been considered in the simulation. During the simulation, the panel output was obtained in the form of current and voltage under a fixed radiation of 1000 W/m2 and 25 °C. The panel specification is Voc = 66 V and Isc = 25 A; the output from the simulation comes out to be 52.84 V and 20.22 A for the given module/array. The parameters for the IC, PO and FLC methods are kept the same: the switching frequency, the repeating sequence (1/10e3), the 500 Ω load, the rating of the panel, the IGBT device, and the values of the inductor (L = 0.01 H) and capacitor (C = 2e−3 F). A snubber circuit is used to suppress the rate of rise of the forward voltage dV/dt across the thyristor and avoid damage in the boost circuit. The current coming from the module/array is based on the general equation of a solar cell. All the environmental factors are the same for all three MPPT techniques, with a fixed resistive load of 500 Ω used to check the performance of PO, IC and FLC. Figure 3 shows the basic IV and PV characteristics of the panel.

Fig. 3 General IV and PV characteristic for the panel
Fig. 4 Incremental conductance subsystem block description
3.1 Incremental Conductance MPPT

In the incremental conductance MPPT model, the voltage and current coming from the panel are given directly to the MPPT block to obtain the maximum output from the PV module/array. The MPPT block tracks the maximum power point of the panel so that the maximum output can be obtained [8]. Figure 4 shows the subsystem model used for the IC-based MPPT. The output of the IC technique is a pulse whose value lies between 0 and 1, as shown in Fig. 5.
3.2 Perturb and Observe method Figure 8 shows the model Diagram that has been designed in MATLAB/Simulation having one panel, one boost converter, and the MPPT, with the measuring blocks like multimeter and scope to check its precise value. During the simulation, the output current and the terminal voltage of the module was measured at 53 V for the specified panel. P&O MPPT tracks the panel’s
392
D. Yadav et al.
Fig. 5 A pulse of IC method
Fig. 6 Simulink model of PV system with IC MPPT
maximum power point to achieve maximum output. The output of the MPPT technique (Perturb and Observe) is in the form of a pulse, whose value is 0–1 as shown in Fig. 9. The DC converter switching action caused some ripple in the output current; the magnitude of the ripple can be reduced by increasing the size of the inductor used in the boost converter. The voltage of the boost converter stabilized at 218.782 V, and current of 0.423 A across the R load of 500 , the settling time of the PO-based
34 Comparison of MPPT Algorithms in Stand-Alone Photovoltaic (PV) …
393
Fig. 7 Output of boost converter in terms of voltage, current, and power for the R load with IC MPPT
Fig. 8 Simulink model of PO-based MPPT with boost converter
Fig. 9 Pulse of PO MPPT
394
D. Yadav et al.
Fig. 10 Output of boost converter in term of voltage, current, and power for R load with PO MPPT
MPPT is 2.42 s. when the model was simulated for the time period of 5 s as shown in Fig. 10.
3.3 Fuzzy Logic Controller Based MPPT The fuzzy logic Controller Block supports the .fis file as shown in Fig. 11. FLCbased MPPT controller work on the fdl2.fis file that has been created in Matlab, where input and output functions are defined with membership functions on which the fuzzy controller work, the fis file has been constructed here with the help of the Membership Functions and Logical Operations. In a yellow box, the inputs are defined in terms of “V n ” and “I n ” and blue box shows the output in terms of “D”. The Simulink model of the FLC-based MPPT is shown in Fig. 12 The output of the PV Arraywas 53 V, which was the voltage of the panel, the output voltage of FLC-based MPPT is 223.575 V and 0.447 A, stability (the value at which the graph or curve get saturated) has been obtained after 0.047 s. which represents the final
Fig. 11 Simulink Model for FLC with. Fis file block
34 Comparison of MPPT Algorithms in Stand-Alone Photovoltaic (PV) …
395
Fig. 12 Simulink model of fuzzy logic controller MPPT technique with boost converter
Fig. 13 Output of MPPT in terms of pulse
output for the model. The graph shown in Fig. 13 represents the magnified pulse which was obtained for the time period of 0.09 s. for FLC–based MPPT. Figure 14 represents the output of FLC-based MPPT whose output is in the form of voltage, current and power obtained at the load end of the model having a load of 500 . It is seen that the voltage profile has been modified after using the boost converter as it has less transients and a higher value of voltage than the PV system. Figure 14a has the time period of 5 s and 14b represents time period (t = 0.5 s) of 0.5 s. this has been done simply to verify its response and settling time. Figure 15 represents the output of all the Three MPPT in a magnified form so that the curves can be analyzed easily.
4 Comparison of PO, IC and FLC Based MPP Algorithms The PO, IC, and FLC MPPT algorithms are compared by keeping the conditions same like the panel specification like its voltage and Current switching frequency,
396
D. Yadav et al.
Fig. 14 Output of boost converter in terms of voltage, current, and power for the R load with FLC MPPT a t = 5 s, b t = 0.5 s
Repeating sequence (1/10e3 ), the load of 500 , IGBT parameter like snubber circuit, the values for L, C is set to be 0.01 H (Inductor) and 2e-3 F (Capacitor) for the load of 500 ohm that is connected at the end of the Boost converter. FLC finds the maximum power point accurately at given temperature and radiation with less settling time and the ripples in it as compared to the IC and PO, less variation in the output voltage can be seen in the case of the FLC, which result in improvement of the accuracy. FLC has two main advantage that makes it better from the other two that is FLC have less transient and the settling time, in comparison with the PO and IC. Settling time for FLC < IC < PO, i.e., 0.047 < 1.439 < 2.42 s in the time period
34 Comparison of MPPT Algorithms in Stand-Alone Photovoltaic (PV) …
(a) Output of IC based MPPT
(b) Output of P&O based MPPT
(c) Output of FLC based MPPT
Fig. 15 Output of all Three MPPT a IC, b P&O, and c FLC based
397
398 Table 1 Performance analysis of PO, IC and fuzzy logic based MPPT algorithms when connected to a resistive load
D. Yadav et al. PO method
I.C method
Fuzzy logic
Maximum output voltage
218.782 V
232.697 Volts
223.575 V
Minimum output voltage
208.60 V
227.908 Volts
223.485 V
Variation in voltage
7.182 V
4.795 Volts
0.09 V
Maximum output current
0.423 A
0.465 Amp
0.447 A
Settling time
2.42 s
1.439 s
0.047 s
Pulsating nature
Pulsating
Less pulsating
Smooth
Accuracy
Less
High
Very high
Transient
High
Low
Very less
of 5 s and the variation in the voltages is very less in FLC, i.e., 0.09 V, for IC is 4.795 volts and for PO is 7.182 volts that can be easily seen in the graph obtained by simulation. Comparison between the three MPPT in terms of performance is shown in Table 1.
5 Conclusion In this paper, all simulations are conducted with the help of MATLAB and Simulation software. The PV Array with MPPT algorithm was modeled in the first section where the solar energy performance has been enhanced, after this, the Voltage profile was boosted by the help of the boost converter and the results show that after using the MPPT algorithm in the PV system the voltage has been increased and it has very few ripples at the output end of the boost converter which is having the Resistance load of 500 . Parameters like load of 500 ohm, the value of L and C, Panel parameters, IGBT specifications, repeating sequence, and switching frequency have been taken same for all the three MPPT techniques (FLC, IC, and PO), and its response was analyzed by keeping the environmental factors constant (Temperature and Solar Radiation). Comparison of these MPPT Techniques was based on Maximum Output Voltage, Settling Time, Ripples, Accuracy Minimum Output Voltage, and variation in Voltage. After comparing the MPPT, it was concluded that FLC has better performance than IC and PO. Incremental conductance and fuzzy logic control approaches are known to have better performance than the PO algorithm. Ripples are very less in Fuzzy Logic Controller and its performance was better in comparison to the other two MPPT techniques, that’s why nowadays Fuzzy Logic is more in trend as it has the memory element which helps in storing the values which can result in self-learning process of the MPPT. The FLC reaches the maximum power point exactly at the specified temperature and radiation with less fixed time and the ripples are very less
34 Comparison of MPPT Algorithms in Stand-Alone Photovoltaic (PV) …
399
as compared to the IC and the PO, and accuracy is high. Setting time for FLC, IC, PO, i.e., 0.047 s, 1.439 s, 2.42 s in the time frame of 5 s and voltage variance is far smaller for FLC, i.e., 0.09 volts, for IC is 4.795 V and for PO is 7.182 volts which can be easily shown in the graph obtained through simulation. In this paper, we have considered some parameters constant like temperature, irradiation, and load, but these parameters can be taken as variables and the analysis can be done to find out the performance of the PV array with different loads, temperatures, and irradiations.
References 1. Faranda R, Leva S, Maugeri V (2008) MPPT techniques for PV systems: energetic and cost comparison. In Power and energy society general meeting-conversion and delivery of electrical energy in the 21st century, pp 1–6, https://doi.org/10.1109/PES.2008.4596156 2. Berrera M, Dolara A, Faranda R, Leva S (2009) Experimental test of seven widely-adopted MPPT algorithms. IEEE Bucharest PowerTech, Bucharest, pp 1–8, https://doi.org/10.1109/ PTC.2009.5282010 3. Yadav D, Singh Pal N, Ansari MA (2019) Analyzing the effects of coal& dust on solar panel and improving the fill factor by using spray cooling. In: International conference on emerging trends in electronics, IT and communication (Electrocon-2019) IILM Great Noida, India, https://doi. org/10.37591/rtecs.v6i3.3604 4. Yadav D, Pandeya S, Varshney S (2014) PV cell modeling in Matlab/Simulink. In: International conference on renewable energy and sustainable development ICRESD 2014/ICRESD-56, Pune, India 5. Kwon J-M, Nam K-H, Kwon B-H (2006) Photovoltaic power conditioning system with line connection. IEEE Trans Ind Electron 53(4), pp 1048–1054 6. Christopher W, Ramesh R (2003) Comparative study of PO and InC conductance MPPT algorithms. Am J Eng Res (AJER) (12):402–408. e-ISSN 2320-0847 7. Yadav D, Singh N, Ansari MA (2019) Fuzzy logic based MPPT controller with bilevel measurements & peak finder panel. IEEE-COER-ICAIA-2019, Rourke, India 8. Yadav D, Singh N (2014) Simulation of PV system with MPPT and boost converter. IEEE sponsored National conference on energy, power and intelligent control systems, Greater Noida, India 9. Sagar R, Singh Pal N, Ansari MA, Singh N, Yadav D (2019) Performance analysis of BIPV solar panel under the effect of external conditions. In: 3rd international conference on microelectronics and telecommunication (ICEMETE 2019), SRM University, Meerut, India, https:// doi.org/10.1007/978-981-15-2329-8_8 10. Lokanadham M (2012) Incremental conductance based maximum power point tracking (MPPT) for photovoltaic system. Int J Eng Res Appl (IJERA) 2(2):1420–1424. ISSN 2248-9622 www.ijera.com 11. Said S, Massoud A, Benammar A, Ahmed S (2012) A Matlab/Simulink-based photovoltaic array model employing SimPowerSystems toolbox. J Energy Power Eng 1965–1975 12. Lamnadi M, Trihi M, Bossoufi B, Boulezhar A (2016) Comparative study of IC, PO and FLC method of MPPT algorithm for grid connected PV module. J Theor Appl Inf Technol 89(1):2005–2016 13. Zainuri MAAM, Radzi MAM, Rahman NFA (2019) Photovoltaic boost DC/DC converter for power led with adaptive P&O-fuzzy maximum power point tracking. In: Proceedings of the 10th international conference on robotics, vision, signal processing and power applications. Springer, Singapore, https://doi.org/10.1007/978-981-13-6447-1
400
D. Yadav et al.
14. Zainudin HN, Mekhilef S (2010) Comparison study of maximum power point tracker techniques for PV systems. In: Proceedings of 14th international MiddleEast power systems conference (MEPCON”10), Cairo University, Egypt, pp 750–755 15. Bendib B, Krim F, Belmili H, Almi MF, Boulouma S (2014) Advanced Fuzzy MPPT controller for a stand-alone PV system. Energy Procedia 50:383–392. ISSN1876-6102
Chapter 35
Development of a Low-Cost ECG Device Raghav Bali, Paras Saini, Adeeb Khan, Md Shahbaz Alam, and Brajesh Kumar
R. Bali (B) · P. Saini · A. Khan · M. S. Alam · B. Kumar
Department of Instrumentation and Control Engineering, Bharati Vidyapeeth's College of Engineering, New Delhi, India
e-mail: [email protected]
P. Saini e-mail: [email protected]
A. Khan e-mail: [email protected]
M. S. Alam e-mail: [email protected]
B. Kumar e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Agrawal et al. (eds.), Machine Intelligence and Smart Systems, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4893-6_35
1 Introduction
Cardiovascular diseases (CVDs) are a major cause of death worldwide, owing to a lack of prerequisite care and early detection and to growing healthcare costs. This problem is accentuated in a largely populated and developing country such as India. An estimated 17.9 million people die from CVDs annually, accounting for one-third of all global deaths [1]. Thus, there is an imperative need for accurate, low-cost, and early diagnosis of major heart diseases [2]. ECG is a standard tool employed by cardiologists for monitoring the electrical activity of the heart, which reflects the cardiac health of the patient. It consists of five consecutive waves called P, Q, R, S, and T, respectively. Figure 1 represents a standard ECG signal marked with its different segments.
Fig. 1 A standard ECG signal
The P wave represents atrial depolarization, the QRS complex signifies ventricular depolarization, and the T wave denotes ventricular repolarization [3]. An ECG is essentially used for detecting various arrhythmias, which are irregular and abnormal heart rhythms. However, being a time-domain signal, it poses several
difficulties in manually detecting and classifying the different morphologies in the signal. Moreover, the analysis is exhausting and prone to human-induced errors. Hence, we explored the literature to address these drawbacks and found machine learning to be a robust tool for developing efficient and accurate diagnostic algorithms [4, 5]. Among the many algorithms, Support Vector Machines, Neural Networks, and Random Forest-based classifiers are the popular ones. These algorithms automatically detect and categorize relevant patterns in the signal by learning from past experience, without any external human aid [6]. Generally, these approaches have a preprocessing stage for conditioning and preparing the ECG signal before classification. It includes filtering and noise removal, QRS complex detection, feature extraction, etc. Subsequently, the extracted features, calculated using statistical methods on signal segments, are used for classification purposes. These features are used to train the neural network, which creates a generalized network that matches the competence of a cardiologist in analyzing the signal [7].
With the aim of developing a low-cost ECG device, we started with the hardware phase. Initially, we used discrete components for building the instrumentation amplifier and filters (low-pass and notch), but faced several challenges such as the noise generated due to the susceptibility of breadboards to power-line frequency, finding components of appropriate specifications, etc. Thus, we moved to the AD8232 heart rate monitor, which compensated for these shortcomings. It is a compact, integrated signal conditioning device for ECG, and we interfaced it with an Arduino Uno (microcontroller) to plot and store the data as a comma-separated values (.csv) file on a computer [8]. The necessary connection between them is shown in Fig. 2. We connected a three-lead ECG electrode system to the device, with lead I on the right arm, lead II on the left arm, and lead III on the right leg, conforming to Einthoven's triangle [9]. The hardware setup of the project is illustrated in Fig. 3. The silver/silver chloride electrodes are attached to the chest as shown in the figure. The AD8232 heart
35 Development of a Low-Cost ECG Device
403
rate monitor requires a voltage of 3.3 V to operate, which is provided by the Arduino Uno board. The three-lead ECG electrode system has a 3.5 mm connector at one end that is plugged into the heart rate monitor [10]. The baud rate of the Arduino Uno microcontroller was set to 38,400 symbols per second, which enabled us to sample the signal at a sampling frequency of 160 Hz. The Arduino Uno has a 10-bit onboard analog-to-digital converter that reads analog signals between 0 and 5 V and outputs digital values ranging from 0 to 1023 [11]. The live ECG signal was plotted using the serial plotter of the Arduino IDE, as shown on the laptop screen in Fig. 3. Free software called PLX-DAQ was used to store the incoming data from all three channels in an Excel sheet in the form of a .csv file.
Fig. 2 AD8232 connected to an Arduino Uno
Fig. 3 AD8232 heart rate monitor (left) and Arduino Uno (right) connected to a three-lead electrode system and a laptop
An ECG contains a lot of information about the heart's functioning and health embedded in it. The key lies in our ability to extract relevant features from it, especially
when it comes to the diagnosis of various heart diseases. An ECG is made up of various signals having different frequencies and time intervals, which vary depending on the patient's health. The normal ECG signal specifications are as follows [12]:
• Required bandwidth for monitoring: 0.05–40 Hz.
• Required bandwidth for anomaly detection: 0.05–100 Hz (up to 150 Hz in a few cases).
• RR-interval: 600–1200 ms.
• P wave: 80 ms.
• PR-interval: 120–200 ms.
• PR segment: 50–120 ms.
• QRS complex: 80–100 ms.
• ST segment: 80–120 ms.
• T wave: 160 ms.
• ST interval: 320 ms.
• QT interval: 420 ms (depends on the heart rate).
In reality, the raw signals are far from comprehensible due to the presence of noise from various sources, which lies in the workable frequency range of the signal. Thus, it becomes imperative to clean the signal and remove any interference. The main sources of noise present in an ECG are as follows [13]:
1. Baseline wander: a low-frequency noise component that causes problems in the detection of peaks; for example, a T-peak may become higher than the R-peak and be falsely detected as an R-peak. It is caused by respiration and/or patient movement.
2. Power-line interference: occurs due to 50 Hz or 60 Hz power-line noise and has a very large amplitude.
3. Muscle artifact: noise generated by muscle movement; it is difficult to remove as it belongs to the same frequency spectrum as the ECG.
4. Other interference: includes noise due to radio-frequency interference generated by nearby medical equipment.
After implementing the hardware phase, we initialized the software phase by preprocessing the recorded ECG signal using Python 3.7. Python offers a vast collection of free and open-source libraries such as NumPy, SciPy, Pandas, Scikit-learn, Keras, etc., essential for research and development. We used the Arduino IDE to establish handshaking between the computer and the Arduino and to plot the live ECG signal. The sampling frequency of our data was 160 Hz. To store it in .csv format, we used PLX-DAQ, a free add-on data acquisition tool for Microsoft Excel. In the preprocessing stage, the recorded data file (.csv) was loaded into the program, and noise was removed from the signal using modules from Python libraries. In an ECG, the QRS complex is of significant importance, since feature extraction is largely dependent on the accurate detection of R-peaks. For this purpose, we used a moving average window as an adaptive intersection threshold to mark the regions of interest where the R-peaks could be present. The peaks are then identified
as the highest point in each region. Using a moving average window coupled with the low sampling frequency led to false negatives where correct peaks were rejected. Hence, up-sampling by a factor of 2 (or a multiple of 2) was done to resolve the false negatives [14]. Features extracted from the detected R-peaks help us in training machine learning models for the classification of ECG. There are many features that can be extracted, but the choice intrinsically depends on the classification problem at hand. Following is the list of various key features [14]:
1. Skewness.
2. Kurtosis.
3. Minimum value of RR-interval.
4. Maximum value of RR-interval.
5. Beats per minute (BPM).
6. Interbeat interval (IBI).
7. Standard deviation of RR-intervals (SDNN).
8. Mean of RR-intervals.
9. Median of RR-intervals.
10. Variance of RR-intervals.
11. Standard deviation of successive differences (SDSD).
12. SD1/SD2 (Poincaré plot).
13. Root mean square of successive differences (RMSSD).
14. Percentage of successive RR-intervals that differ by more than 20 ms (pNN20) and 50 ms (pNN50).
15. Mean absolute deviation of RR-intervals (MAD).
Consequently, time-domain features such as BPM, IBI, SDNN, SDSD, RMSSD, MAD, etc. were statistically extracted from the recorded ECG signal; a sketch of this computation follows below. In this paper, we present a comparative analysis of the performance of FFNN and CNN on the MIT-BIH Arrhythmia Database in classifying different types of cardiac arrhythmias into the Normal Beat class (N), Supraventricular Premature Beat class (S), Premature Ventricular Contraction class (V), Fusion of Ventricular and Normal Beat class (F), and Unclassifiable Beat class (Q), as per the relation between the MIT-BIH database and the Association for the Advancement of Medical Instrumentation (AAMI) EC57 standard [15]. Table 1 summarizes this relation. The performance of both models is examined by comparing the values of accuracy and loss. The rest of the paper is organized as follows. Section II describes the dataset used for the training and testing of classifiers. Section III discusses the methodology for feature extraction and machine learning based classification. Section IV presents the results and discusses the future scope of the project. Finally, Section V summarizes the conclusions.
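As a minimal illustration of how several of the time-domain features listed above can be derived from detected R-peak locations, consider the following Python sketch; the function name, argument layout, and sampling rate default are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch: computing a few of the listed time-domain HRV
# features from detected R-peak sample indices.
import numpy as np
from scipy.stats import skew, kurtosis

def hrv_features(r_peaks, fs=160):
    """r_peaks: sample indices of detected R-peaks; fs: sampling rate in Hz."""
    rr = np.diff(r_peaks) / fs * 1000.0          # RR-intervals in ms
    diff_rr = np.diff(rr)                        # successive differences
    return {
        "bpm": 60000.0 / np.mean(rr),            # beats per minute
        "ibi": np.mean(rr),                      # interbeat interval (ms)
        "sdnn": np.std(rr),                      # std of RR-intervals
        "sdsd": np.std(diff_rr),                 # std of successive differences
        "rmssd": np.sqrt(np.mean(diff_rr ** 2)), # RMS of successive differences
        "pnn20": np.mean(np.abs(diff_rr) > 20),  # fraction differing by >20 ms
        "pnn50": np.mean(np.abs(diff_rr) > 50),  # fraction differing by >50 ms
        "mad": np.mean(np.abs(rr - np.mean(rr))),# mean absolute deviation
        "skewness": skew(rr),
        "kurtosis": kurtosis(rr),
    }
```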
Table 1 AAMI EC57 categories and heartbeat annotations (mapping) [15]
S. No. | Category | Annotations
1. | N | Normal; left/right bundle branch block; atrial escape; nodal escape
2. | S | Atrial premature; aberrant atrial premature; nodal premature; supra-ventricular premature
3. | V | Premature ventricular contraction; ventricular escape
4. | F | Fusion of ventricular and normal
5. | Q | Paced; fusion of paced and normal; unclassifiable
2 Dataset
For our analysis, we take the MIT-BIH Arrhythmia Database from physionet.org [16, 17]. This database provides a large number of labeled heartbeat samples, adequate for training a deep neural network. The database holds 48 files of half-hour recordings acquired from 47 different patients. It is a collection of 109,446 annotated records categorized into five different classes (N, S, V, F, and Q) based on their morphology. In our project, ECG lead II, resampled to a 125 Hz sampling frequency, is used as the input. Refer to Table 1 for the AAMI EC57 categories and their respective beat annotations.
3 Methodology
In this section, we begin with the first stage, i.e., preprocessing, which filters the ECG and computes a feature set. This feature set is then used by the second stage, arrhythmia classification, which applies machine learning algorithms on neural nets to classify the signal. The methodology is described in Fig. 4.
3.1 Preprocessing
An ECG contains several sources of noise which can impede our understanding of the signal and may lead cardiologists to pass a false judgment. Therefore, it is important to clean the signal. The required bandwidth for monitoring is 0.05–40 Hz and
for diagnosis is 0.05–100 Hz. We suggest a cogent method for preprocessing ECG signals. The steps are as follows (see Figs. 5, 6, 7, and 8):
Fig. 4 Workflow of ECG arrhythmia classification
Fig. 5 Raw ECG signal of the patient
Fig. 6 Filtered ECG signal of the patient
Fig. 7 Comparison of the raw and filtered ECG signal
Fig. 8 R-peak detection with heart rate measurement in bpm
(1) Filtering the continuous ECG signal to eliminate noise using digital Butterworth low-pass and notch filters.
(2) Cardiac cycle detection by identifying a QRS complex through a moving average window of 0.75 s with necessary up-sampling.
(3) Determination of significant characteristic points by marking the regions of interest (ROI) where the ECG amplitude is larger than the moving average.
(4) Marking the highest point in each ROI as the R-peak and formulating the characteristic feature set (SDSD, SDNN, RMSSD, etc.).
The obtained feature set is used for testing and learning of the classifiers. In the subsequent subsections, we describe the architecture of the proposed neural networks and explain the machine learning algorithms employed for the purpose of arrhythmia classification.
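A minimal Python sketch of these preprocessing steps is given below; the filter orders, the 40 Hz cutoff, and the 50 Hz notch are assumptions chosen to match the stated monitoring bandwidth, the up-sampling step is omitted, and the helper is not the authors' implementation.

```python
# A minimal sketch of the preprocessing steps for a 160 Hz single-lead
# recording; cutoff and window values follow the text where stated.
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess(ecg, fs=160):
    # (1) Butterworth low-pass at the 40 Hz monitoring bandwidth plus a
    #     50 Hz notch filter against power-line interference.
    b, a = butter(4, 40 / (fs / 2), btype="low")
    ecg = filtfilt(b, a, ecg)
    b, a = iirnotch(50 / (fs / 2), Q=30)
    ecg = filtfilt(b, a, ecg)

    # (2)-(3) Moving average over 0.75 s as an adaptive threshold; samples
    #         above it mark the regions of interest (ROI).
    win = int(0.75 * fs)
    mov_avg = np.convolve(ecg, np.ones(win) / win, mode="same")
    roi = ecg > mov_avg

    # (4) The R-peak is the highest point inside each contiguous ROI.
    peaks, start = [], None
    for i, flag in enumerate(roi):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            peaks.append(start + int(np.argmax(ecg[start:i])))
            start = None
    return ecg, np.array(peaks)
```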
3.2 FFNN-Based Arrhythmia Classification
As the name suggests, a feedforward neural network is a network that allows information to flow only in the forward direction (without feedback), i.e., from the
input neurons, to the hidden neurons (if present), and lastly to the output neurons. For the purpose of classification, the network maps the input feature set to the output classes. The network is trained and tested for an optimal number of iterations until convergence, after which the model can be used for real-world problems [18]. The suggested FFNN architecture is illustrated in Fig. 9. It comprises three layers: an input layer, a hidden layer, and an output layer. The input layer characterizes the feature vector, and the weights are randomly assigned to each neuron, thus allowing symmetry-breaking. The hidden layer computes the weighted sum of the features and passes it through an activation function in the output layer, which fires specific neurons that contribute to the desired output of the model (i.e., the classification).
Fig. 9 Architecture of FFNN-based model
We used the backpropagation algorithm (supervised learning) for training our network, cross-entropy as the loss function, gradient descent for optimizing the loss function, regularization to prevent over-fitting of the model, and sigmoid activation for determining the outcome of the network. We used the maximum number of iterations (MaxIter) and the regularization parameter (λ) as the tunable hyperparameters. After many permutations, a MaxIter of 50 and a λ of 1 were found to fit our training set well.
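As a hedged illustration, a comparable three-layer FFNN can be expressed in Keras as follows; the hidden-layer width and the use of Keras itself are assumptions, since the chapter only specifies backpropagation, cross-entropy, gradient descent, regularization, and sigmoid activation.

```python
# Hedged Keras sketch of a three-layer FFNN for the five AAMI classes;
# layer sizes other than the output are illustrative assumptions.
from tensorflow.keras import Sequential, layers, regularizers

def build_ffnn(n_features, n_classes=5, lam=1.0):
    model = Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(32, activation="sigmoid",
                     kernel_regularizer=regularizers.l2(lam)),  # hidden layer
        layers.Dense(n_classes, activation="softmax"),          # output layer
    ])
    model.compile(optimizer="sgd",                  # gradient descent
                  loss="categorical_crossentropy",  # cross-entropy loss
                  metrics=["accuracy"])
    return model
```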
3.3 CNN-Based Arrhythmia Classification
A convolutional neural network is a type of deep learning network which is popular for its role in image classification, speech recognition, and natural language processing. CNNs provide the ability to automatically identify the relevant features in a signal, thus reducing the effort of extracting features manually. As a result, a significant reduction in the network parameters and improved performance are observed. A 1-dimensional (1-D) convolution is essentially convolution done in one direction only, namely along the time axis. Thus, we employed 1-D convolution on the ECG, which is a time-series signal. The architecture of the CNN-based arrhythmia classifier is illustrated in Fig. 10. It is a deep neural network with 13 layers, of which 11 are convolutional (conv) and 2 are fully connected (FC) with 32 neurons each. Each convolution layer has 32 kernels of size 5 and employs 1-D convolution through time. The network has 5 residual blocks (for solving the degradation problem), each of which has 2 rectified linear units [19] (ReLU activation function) and a residual skip connection [20]. We use max-pooling of size 5 with stride 2 and the necessary border padding in each pooling layer. We also use softmax activation, which outputs the probability distribution required for the multi-class classification of the ECG signal [21]. We used categorical cross-entropy as the loss function and Adam for optimizing the loss function during the training phase. We set the suggested Adam parameter values for the learning rate, beta1, and beta2 to 0.001, 0.9, and 0.999, respectively, and achieved optimum results [22]. The complete training and testing of the model used Keras with a TensorFlow [23] backend implementation on a GeForce GTX 1060 graphics processing unit (GPU).
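The following Keras sketch approximates the described residual 1-D CNN (11 conv layers, 5 residual blocks, 2 FC layers of 32 neurons, max-pooling of size 5 with stride 2); placement details not stated in the text, such as where the pooling sits inside each block, are assumptions.

```python
# Hedged sketch of the residual 1-D CNN described in the text.
from tensorflow.keras import Model, layers

def build_cnn(input_len, n_classes=5):
    inp = layers.Input(shape=(input_len, 1))
    x = layers.Conv1D(32, 5, padding="same")(inp)   # initial conv layer
    for _ in range(5):                              # 5 residual blocks
        y = layers.Conv1D(32, 5, padding="same", activation="relu")(x)
        y = layers.Conv1D(32, 5, padding="same")(y)
        x = layers.ReLU()(layers.Add()([x, y]))     # residual skip connection
        x = layers.MaxPooling1D(pool_size=5, strides=2, padding="same")(x)
    x = layers.Flatten()(x)
    x = layers.Dense(32, activation="relu")(x)      # two FC layers of 32 neurons
    x = layers.Dense(32, activation="relu")(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    model = Model(inp, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```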
3.4 Mathematical Background
The mathematical equations below provide an insight into the working of the machine learning algorithms used in this project.
1. Backpropagation: It is the backward propagation of errors and is an algorithm used to adjust the weights of a network in proportion to their contribution to the
overall loss. It is an application of the chain rule and is essentially the partial derivative of the loss function with respect to the weights in the network [24].
Fig. 10 Architecture of CNN-based model
Notation: $\delta_j^{(l)}$ = error of node $j$ in layer $l$; $a_j^{(l)}$ = activation of node $j$ in layer $l$; $\Delta_{ij}^{(l)}$ = accumulated error for all $l, i, j$; $L$ = number of layers in the network, $l = 1, \ldots, L$; $x$ = training dataset of size $m \times n$; $y$ = labels for the records in $x$; $\Theta_{ij}^{(l)}$ = neural network weight parameters; $J(\Theta)$ = loss function.

For a four-layer network, the output-layer error and its back-propagated counterparts are
$$\delta_j^{(4)} = a_j^{(4)} - y_j, \qquad \delta^{(3)} = (\Theta^{(3)})^T \delta^{(4)} \ast g'(z^{(3)}), \qquad \delta^{(2)} = (\Theta^{(2)})^T \delta^{(3)} \ast g'(z^{(2)}).$$

Given the training set $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$, set $\Delta_{ij}^{(l)} = 0$ for all $l, i, j$. For $i = 1$ to $m$: set $a^{(1)} = x^{(i)}$; perform forward propagation to compute $a^{(l)}$ for $l = 2, 3, \ldots, L$; using $y^{(i)}$, compute $\delta^{(L)} = a^{(L)} - y^{(i)}$; compute $\delta^{(L-1)}, \delta^{(L-2)}, \ldots, \delta^{(2)}$; and accumulate $\Delta_{ij}^{(l)} := \Delta_{ij}^{(l)} + a_j^{(l)} \delta_i^{(l+1)}$. Then

$$D_{ij}^{(l)} := \frac{1}{m}\Delta_{ij}^{(l)} + \lambda \Theta_{ij}^{(l)} \ \text{ if } j \neq 0, \qquad D_{ij}^{(l)} := \frac{1}{m}\Delta_{ij}^{(l)} \ \text{ if } j = 0, \qquad \frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = D_{ij}^{(l)} \tag{1}$$
2. Gradient descent: It is an optimization algorithm that minimizes the loss function by repetitively advancing in the direction of steepest descent, dictated by the negative of the gradient. It uses the gradients calculated by backpropagation [25]. α = learning rate parameter
Repeat until convergence:
$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) \quad \text{(for } j = 0 \text{ and } j = 1\text{)} \tag{2}$$
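A tiny NumPy illustration of update rule (2) on a two-parameter linear model is shown below; the data and learning rate are made up for the example.

```python
# Gradient descent on a squared-error loss for y = theta_0 + theta_1 * x.
import numpy as np

X = np.c_[np.ones(5), np.arange(5)]        # design matrix with bias column
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])    # exactly y = 1 + 2x
theta, alpha = np.zeros(2), 0.05           # parameters and learning rate

for _ in range(500):                        # repeat until (approximate) convergence
    grad = X.T @ (X @ theta - y) / len(y)   # dJ/dtheta for squared-error loss
    theta = theta - alpha * grad            # simultaneous update of theta_0, theta_1

print(theta)                                # approaches [1., 2.]
```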
3. Activation functions: They decide which nodes to fire or activate in the network by calculating the weighted sum of their inputs and adding a bias term, so that they contribute to the model's desired output. They add non-linearity to the network, which is important for multi-class classification.
$$\sigma(x) = \frac{1}{1 + e^{-x}} \quad \text{(Sigmoid)} \tag{3}$$
$$A(x) = \max(0, x) \quad \text{(ReLU)} \tag{4}$$
$$\sigma_i(a) = \frac{e^{a_i}}{\sum_j e^{a_j}} \quad \text{(Softmax)} \tag{5}$$
4. Categorical cross-entropy: It is a loss function which is reliably used for multi-class classification problems. It is a combination of softmax activation and cross-entropy loss, and it produces a probability distribution for each input over all the classifiable classes [26]. Here $h_\theta(x) \in \mathbb{R}^K$, $(h_\theta(x))_i$ is the $i$th output, $h_\theta(x^{(i)})$ is the predicted value, $y_k^{(i)}$ is the true value, and $m$ and $K$ are the numbers of samples and classes.
$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log\!\left((h_\theta(x^{(i)}))_k\right) + \left(1 - y_k^{(i)}\right) \log\!\left(1 - (h_\theta(x^{(i)}))_k\right) \right] \tag{6}$$
5. Adam: It is an optimizer which is based on adaptive learning rates. It computes individual learning rates for every trainable parameter and is very suitable for optimizing deep neural networks [22]. Here $\hat{m}_t$ and $\hat{v}_t$ are the unbiased estimates of the first and second moments; $m_t$ and $v_t$ are the exponential moving averages of the gradient and the squared gradient; $g_t$ is the gradient at time $t$; $\eta$ is the learning rate parameter; $\epsilon$ is the tolerance parameter; $\beta_1$, $\beta_2$ are hyperparameters; and $\theta_t$ denotes all parameters.
$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t \tag{7}$$
$$v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2 \tag{8}$$
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t} \tag{9}$$
$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t} \tag{10}$$
$$\theta_{t+1} = \theta_t - \frac{\eta\,\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} \tag{11}$$
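For concreteness, Eqs. (7)–(11) translate into a single NumPy update step as sketched below, with the suggested defaults ($\eta$ = 0.001, $\beta_1$ = 0.9, $\beta_2$ = 0.999); the function is illustrative, not part of the chapter's code.

```python
# One Adam update following Eqs. (7)-(11).
import numpy as np

def adam_step(theta, g, m, v, t, eta=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * g            # Eq. (7): first-moment average
    v = b2 * v + (1 - b2) * g ** 2       # Eq. (8): second-moment average
    m_hat = m / (1 - b1 ** t)            # Eq. (9): bias-corrected first moment
    v_hat = v / (1 - b2 ** t)            # Eq. (10): bias-corrected second moment
    theta = theta - eta * m_hat / (np.sqrt(v_hat) + eps)  # Eq. (11)
    return theta, m, v
```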
6. 1-D convolution: It is very effective with time-series data and calculates convolution in only one direction, i.e., along the time axis [27]. Here $f$ is a function of length $n$ and $g$ a function of length $m$:
$$(f * g)(i) = \sum_{j=1}^{m} g(j) \cdot f\!\left(i - j + \frac{m}{2}\right) \tag{12}$$
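Eq. (12) corresponds to the standard discrete 1-D convolution, which NumPy provides directly; the toy signal and kernel below are illustrative.

```python
# Discrete 1-D convolution along the time axis, as in Eq. (12).
import numpy as np

f = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])  # signal of length n = 7
g = np.array([0.25, 0.5, 0.25])                     # kernel of length m = 3
print(np.convolve(f, g, mode="same"))               # smoothed signal
```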
4 Results and Future Scope
In this section, we compare the behavior of both models in correctly classifying ECG signals obtained from the MIT-BIH Arrhythmia Database. We describe the total number of cases we have taken and the train-test split used for training and testing the models. Further, we assess the performance based on the values of accuracy and loss. We also visualize the behavior with loss-versus-epoch graphs and a confusion matrix. The training was done several times for a sufficient number of iterations, and after every training cycle the results indicated that the CNN outperformed the FFNN in classifying cardiac arrhythmias. Besides attaining better values for accuracy and loss, the
computational time was far less compared to the FFNN's, since TensorFlow uses GPU acceleration. The MIT-BIH Arrhythmia Database is largely imbalanced, with each class of arrhythmia varying in the number of counts (see Fig. 11). This would lead to a biased form of learning, and the models would not be trained properly. Hence, data augmentation was performed on the training set to bring every arrhythmia class to an equal level; a sketch of this step follows below. For this purpose, we used the resample function of the scikit-learn library, which resamples data matrices in a consistent way. Thus, for both models, the training was performed on a total of 100,000 samples (20,000 from each class of arrhythmia) and validation on a total of 21,892 samples, constituting approximately an 80:20 train-test split. The distribution of samples among the five classes of arrhythmia before and after balancing is shown in Figs. 11 and 12, respectively.
Fig. 11 Distribution of samples among different arrhythmia classes before balancing
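A minimal sketch of this balancing step with scikit-learn's resample function is shown below; the array layout and class labels are assumptions, while the 20,000-per-class target follows the text.

```python
# Hypothetical sketch: oversampling each arrhythmia class to an equal level.
from sklearn.utils import resample
import numpy as np

def balance(X, y, per_class=20000, classes=(0, 1, 2, 3, 4)):
    Xb, yb = [], []
    for c in classes:
        Xc = X[y == c]                     # all training samples of class c
        Xr = resample(Xc, replace=True,    # sample with replacement
                      n_samples=per_class, random_state=42)
        Xb.append(Xr)
        yb.append(np.full(per_class, c))
    return np.vstack(Xb), np.concatenate(yb)
```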
4.1 Performance of FFNN The model was trained for a total of 50 epochs and achieved maximum training and test set accuracy of 89.89% and 89.82% respectively. The minimum value of loss attained was 0.660. The graph of loss versus epoch is shown in Fig. 13.
Fig. 12 Distribution of samples among different arrhythmia classes after balancing
Fig. 13 Loss versus epoch graph for FFNN
4.2 Performance of CNN Similarly, the model was trained for a total of 50 epochs after which it attained an optimal training and validation set accuracy of 99.67% and 94.72%, respectively. The value of training and validation set loss was 0.009 and 0.245, respectively. The plot of loss versus epochs and confusion matrix is illustrated in Figs. 14 and 15, respectively. Table 2 lists the performance metrics of the two models.
Fig. 14 Loss versus epoch graph for CNN
Fig. 15 Confusion matrix for CNN
Table 2 Performance metrics of FFNN and CNN models
S. No. | Model | Training accuracy (%) | Training loss | Testing accuracy (%) | Testing loss
1. | FFNN | 89.89 | – | 89.82 | 0.660
2. | CNN | 99.67 | 0.009 | 94.72 | 0.245
4.3 Future Scope • To improve upon the current methods of classification. • To implement the classifiers on the ECG recorded by our own device and validate the results from a cardiologist.
• To compare the performance of our device with the standard 12-lead ECG device.
• To introduce a third approach for classification to broaden the spectrum of research and analysis.
• To implement the project on a smartphone.
5 Conclusion
Machine learning algorithms are tools that carry the potential to transform current trends in biomedical engineering. They show promising results for cancer diagnosis, medical imaging, detection of cardiovascular diseases, etc. With the help of these tools, we can develop neural network models that use the medical data of patients to predict the probability of a particular disease. In this project, we have presented an approach to develop a low-cost and simple ECG device with a three-lead electrode system. We connected the device to a computer system for monitoring and storing the data. We preprocessed the signal to remove noise, detect R-peaks, and calculate the heart rate. Further, we extracted relevant features from the signal to form our feature set. We have also suggested two different neural network-based machine learning models, namely a feedforward neural network and a convolutional neural network, for the classification of cardiac arrhythmias. We trained and tested the networks on the MIT-BIH Arrhythmia Database and compared their performance. According to the results, the CNN-based arrhythmia classifier achieved a prediction accuracy and loss of 94.72% and 0.245, respectively, while the FFNN-based arrhythmia classifier achieved a prediction accuracy and loss of 89.82% and 0.660, respectively. Furthermore, we visualized the training with a graph of loss versus epoch for both models. Thus, the CNN-based arrhythmia classifier is better at accurately classifying the different ECG signals, with higher accuracy, lower loss, and improved computational time.
References
1. World Health Organization (WHO) Fact sheets. https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds). Last accessed 15 May 2020
2. UpBeat Homepage. https://www.upbeat.org/heart-rhythm-disorders. Last accessed 10 May 2020
3. Diker A, Cömert Z, Avci E (2017) A diagnostic model for identification of myocardial infarction from electrocardiography signals. Bitlis Eren Univ J Sci Technol 7(2):132–139
4. Esmaili A, Kachuee M, Shabany M (2017) Nonlinear cuffless blood pressure estimation of healthy subjects using pulse transit time and arrival time. IEEE Trans Instrum Meas 66(12):3299–3308
5. Dastjerdi AE, Kachuee M, Shabany M (2017) Non-invasive blood pressure estimation using phonocardiogram. In: 2017 IEEE international symposium on circuits and systems (ISCAS), pp 1–4. IEEE
6. Das K, Behera RN (2017) A survey on machine learning: concept, algorithms and applications. Int J Innov Res Comput Commun Eng 5(2):1301–1309
7. Rajpurkar P, Hannun AY, Haghpanahi M, Bourn C, Ng AY (2017) Cardiologist-level arrhythmia detection with convolutional neural networks. arXiv preprint arXiv:1707.01836
8. Analog Devices (AD8232) Data sheet. https://www.analog.com/media/en/technical-documentation/data-sheets/AD8232.pdf. Last accessed 10 May 2020
9. Conover MB (2002) Understanding electrocardiography. Elsevier Health Sciences
10. Antonicelli R, Ripa C, Abbatecola AM, Capparuccia CA, Ferrara L, Spazzafumo L (2012) Validation of the 3-lead tele-ECG versus the 12-lead tele-ECG and the conventional 12-lead ECG method in older people. J Telemed Telecare 18(2):104–108
11. ElectronicWings Arduino. https://www.electronicwings.com/arduino/adc-in-arduino. Last accessed 27 Mar 2020
12. Medscape Homepage. https://emedicine.medscape.com/article/2172196-overview#a1. Last accessed 27 Mar 2020
13. Ravichandran G (2014) An effectual approach to reduce the noise in ECG
14. van Gent P, Farah H, van Nes N, van Arem B (2017) Analysing noisy driver physiology real-time using off-the-shelf sensors: heart rate analysis software from the taking the fast lane project. J Open Res Softw 7(1)
15. AAMI EC57 (1998) Testing and reporting performance results of cardiac rhythm and ST segment measurement algorithms. Association for the Advancement of Medical Instrumentation, Arlington, VA
16. Moody GB, Mark RG (2001) The impact of the MIT-BIH arrhythmia database. IEEE Eng Med Biol Mag 20(3):45–50
17. Moody GB, Mark RG, Goldberger AL (2001) PhysioNet: a web-based resource for the study of physiologic signals. IEEE Eng Med Biol Mag 20(3):70–75
18. Fine TL (2006) Feedforward neural network methodology. Springer Science & Business Media
19. Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 807–814
20. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition 2016, pp 770–778
21. Kachuee M, Fazeli S, Sarrafzadeh M (2018) ECG heartbeat classification: a deep transferable representation. In: 2018 IEEE international conference on healthcare informatics (ICHI), pp 443–444. IEEE
22. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
23. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M, Ghemawat S (2016) TensorFlow: large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467
24. Hecht-Nielsen R (1992) Theory of the backpropagation neural network. In: Neural networks for perception. Academic, pp 65–93
25. Ruder S (2016) An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747
26. Zhang Z, Sabuncu M (2018) Generalized cross entropy loss for training deep neural networks with noisy labels. In: Advances in neural information processing systems 2018, pp 8778–8788
27. MissingLink Resources. https://missinglink.ai/guides/keras/keras-conv1d-working-1d-convolutional-neural-networks-keras/. Last accessed 10 May 2020
Chapter 36
Detection Analysis of DeepFake Technology by Reverse Engineering Approach (DREA) of Feature Matching Sonya Burroughs, Kaushik Roy, Balakrishna Gokaraju, and Khoa Luu
S. Burroughs (B) · K. Roy
Department of Computer Science, College of Engineering, North Carolina Agricultural and Technical State University, Greensboro, North Carolina, USA
e-mail: [email protected]
K. Roy e-mail: [email protected]
B. Gokaraju
Computational Science and Engineering, College of Engineering, North Carolina Agricultural and Technical State University, Greensboro, North Carolina, USA
e-mail: [email protected]
K. Luu
Department of Computer Science and Computer Engineering, College of Engineering, University of Arkansas, Fayetteville, Arkansas, USA
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Agrawal et al. (eds.), Machine Intelligence and Smart Systems, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4893-6_36
1 Introduction
Though DeepFake technology is a phenomenon born of deep learning, this new technology has proved to be a medium for spreading misinformation. A DeepFake is altered media created using generative adversarial networks and is used to depict visual events that never happened, spreading fraudulent media to the public. The name originates from a Reddit user who first showed their work on doctored videos [1, 2]. DeepFake videos can forge not only what is visually seen but even audio, which can be very detrimental to the reputation of others. Interestingly, DeepFakes were originally popularized by amateurs before researchers started using this technology. The process of generating DeepFakes involves training deep neural networks on face images to automatically map one face to the other in order to alter the media [2, 3]. DeepFakes are created by generative adversarial networks (GANs) [2], a machine and deep learning technique commonly used to generate new data based on training datasets. By learning patterns from
the training set, the network can generate new data as output. Studies have made it possible to obtain fake face images through this technique [1, 3], and this technology could be maliciously used to create convincing material for DeepFakes. In this paper, we reverse-engineer the problem by combining deep learning techniques with key points obtained using the Scale-Invariant Feature Transform (SIFT) for feature extraction. We train a convolutional neural network (CNN) to detect existing DeepFakes with frame-by-frame training on real and fake datasets, and we study the performance of these SIFT features for the DeepFake problem with the proposed DeepFake Reverse Engineering Approach (DREA). A DeepFake comparison can be seen in Fig. 1.
Fig. 1 Alteration from the real frame to the fake frame, also referred to as a DeepFake image [4]
2 Background
This technology was created using generative adversarial networks, or GANs, which are used to create data after the model is trained to learn patterns. In the comedic world, GANs were used to create ways of morphing one's face into the features of another, or even for something as cruel as taking the identity of another. However, this did not remain a tool for laughs. Some used this technique to spread misinformation at another's expense, and some are using it to present a new narrative of the actions of others, especially in the political scope. The first known attempt at swapping someone's face was circa 1865, in one of the iconic portraits of U.S. President Abraham Lincoln: his head was swapped with that of another politician [5]. This shows that the concept is older than our time, but with time these DeepFakes are becoming harder to detect because of the high quality achieved now. Lincoln's face-swapping was done by human hand; with the increase in neural networks and computational power, people began to use this technology of misinformation often [6]. This is an example of how doctored videos can show misinformation to the public whether true or not. This is an issue that cyber
officials are looking into so they can better identify these visual and sometimes not-so-visual anomalies and detect fake media. This opened a great scope for cybersecurity administration, and for the authors of this article, to reverse-engineer the deep learning using object detection techniques against the GANs' output. DeepFakes commonly showcase misinformation by morphing features onto the desired person. Detection techniques have been investigated to find DeepFakes practically by comparing the deviation from the real media. Below are previous works that have also investigated the idea of using SIFT or CNN for the classification of DeepFakes. Some have investigated the same dataset, and some have not. However, there is no previous use of the SIFT+CNN combination in the DeepFake detection literature.
3 Related Works
DeepFakes commonly showcase misinformation by morphing features onto the desired person. Detection techniques must be investigated to find practical ways of detecting a DeepFake by comparing the deviation from the real media. Below are previous works that have also investigated the idea of using SIFT or CNN for the classification of DeepFakes. Some have investigated the same dataset, and some have not. However, the SIFT+CNN combination has not yet been used for the detection of DeepFakes.
3.1 Generative Adversarial Networks
A generative adversarial network (GAN) is a deep learning model which learns the data and produces new interpolated or extrapolated information based on the learned patterns [3]. This is how DeepFakes can generate new features on the subject in the video or audio without it looking obvious to others; some DeepFakes, however, are better than others. A GAN contains parts that create fake material and match it against real and fake images to create high-quality DeepFakes [1]. This practice intensifies the need to be able to test whether media has been doctored. These fake videos have already been used in political campaigns, which causes governments to crack down in the worst cases [5].
3.2 DeepFake Techniques
DeepFake detection is often cast as a binary classification problem due to having two classes: original media and fake media. This method requires a large database of real and fake videos for training, so the model can learn the trends, deviations, and variability of the images relative to one another. More and more DeepFake videos are becoming readily
available, giving researchers more material for training the models, but it also means there is a rise in the creation of DeepFakes [1]. DeepFakes are, however, known to have flaws that give away the slightest discontinuity in the integrity of the material, whether it be a higher eyebrow than usual or an unusual mannerism. There is a high need for an intelligent algorithm to fight this malicious technology, and algorithms which would detect mistakes in DeepFakes seem beneficial to explore. Specific movements like head tilt or chin motion can be considered important for the analysis of DeepFakes [2].
3.3 Detection with Scale-Invariant Feature Transform
The Scale-Invariant Feature Transform has been investigated as a beneficial feature detection tool in the detection of doctored images. Feature detection is the process of computing distinctive points of interest, and feature detection and image matching can physically indicate deviations of unique features. An ideal feature detection technique should be robust to the image; this tests the distinctiveness of the points detected within the image, and to obtain a high matching probability the points must be prominent to the detector [7]. Previous research has indicated that SIFT performs the best in most scenarios, as explained later. For some background, SIFT was proposed by Lowe. It is a feature detection algorithm that detects and locates unique features in digital media; SIFT detector locations remain the same when tested on the same image. The SIFT algorithm has four basic steps. The first is to estimate a scale-space extremum using the Difference of Gaussians (DoG). Second is key-point localization, where the key-point candidates are localized and refined by eliminating low-contrast points. Third is a key-point orientation assignment based on the local image gradient, and last a descriptor generator computes the local image descriptor for each key point based on image gradient magnitude and orientation [7]. This explains how the detector finds the unique points and how each point is translated into the image; the points of least significance are removed. SIFT detects points of interest in an image, called key points, which can be considered invariant to scale, rotation, or even illumination [2]. However, SIFT does not focus its key points as much as SURF (Speeded-Up Robust Features) [7]. This can prove to be a good or bad characteristic when testing for the most efficient feature detector for DeepFakes. For images with varying intensity values, SIFT provides the best matching rate among the rest [7]. SIFT has been used in research focused on DeepFakes: in one paper, DeepFake videos were analyzed using SIFT as a detector, and the research showed that SIFT features can be useful in differentiating between original and fake videos, with accuracies over 90% [2].
3.4 Convolutional Neural Networks
A CNN is needed in this research to learn image features with computational power. Previous works have used CNNs to learn such features from large datasets to distinguish between real and fake images. A CNN must have a suitable architecture in order to be effective; with the right architecture, it is natural for a CNN-based method to detect fake face images [3]. Many works have benefited from using a convolutional neural network (CNN) to extract frame-level features [5]. This is beneficial for our research, since SIFT will create a filter over the frames of the DeepFake and real videos before these frames are fed to the CNN. The SIFT features will, theoretically, let the network analyze not only the image but also the deviation of the detected features, to differentiate better.
4 Materials and Methodology
In this section, we describe our DeepFake Reverse Engineering Approach in detail.
4.1 Materials
FaceForensics++ holds a collection of datasets for the detection of face-related anomalies [8]. This dataset consists of over 1000 original video sequences that are doctored with different techniques such as Face2Face, FaceSwap, NeuralTextures, and of course DeepFakes [4]. We used 10 sample videos to populate the dataset for testing with SIFT and CNN. There are two classes for each sample, one real and one fake, so we are analyzing 20 videos in total. The training dataset contains about 180 frames from each video sample, 90 real and 90 fake; the testing dataset contains about 20 frames from each video sample, 10 real and 10 fake. A single video from the widely known FaceForensics++ collection was originally used; after accuracy problems, more videos were used for frame extraction with SIFT. The videos are selected from the category "DeepFake" in this dataset.
4.2 Extract Frames
Most image detection methods cannot be used directly on videos because of the strong degradation of the frame data after video compression [1]. The beneficial way to analyze these videos was to extract the videos into frames (see Fig. 2). This
is a form of preprocessing before implementing the techniques of detecting DeepFakes using SIFT and CNN, and it allows the frames, or images, to make up the dataset. The video lengths vary in time, so we took the first 100 frames from each video; a sketch of this step follows below. These frames are generated as .jpg files, an acceptable format for the implementation of SIFT.
Fig. 2 Workflow before the CNN: a once videos (real and fake) are obtained from FaceForensics++, b video frames are collected, and c SIFT from OpenCV is used to generate distinct features in the video frames. The training and testing datasets are comprised of these "SIFT"ed frames
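A minimal OpenCV sketch of this extraction step is given below; the paths and file-naming scheme are illustrative assumptions, not the authors' code.

```python
# Extract the first 100 frames of a video and write them out as .jpg files.
import cv2
import os

def extract_frames(video_path, out_dir, n_frames=100):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    for i in range(n_frames):
        ok, frame = cap.read()
        if not ok:
            break                               # video shorter than n_frames
        cv2.imwrite(os.path.join(out_dir, f"frame_{i:03d}.jpg"), frame)
    cap.release()
```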
4.3 Feature Detection
To detect distinct features on the images, we used the Scale-Invariant Feature Transform (SIFT). SIFT proved to have better results for the objective of this research when tested against other feature detectors like SURF and ORB (Oriented FAST and Rotated BRIEF). We use SIFT through the OpenCV module, which is available in Python as well as other programming languages. SIFT uses visual tracking of features in the frames, enabling both the comparison of images and simple detection. This process involves using this function to generate feature indicators on the frames extracted in the previous step. The indicator locations vary between frames due to movement. Between the real and DeepFake images, however, the indicators visually show a deviation around the face of the subject: the detectors appear in slightly, if not completely, different locations in the frames. This is a good indication that OpenCV's SIFT would prove a good detection tool for the classification of DeepFakes. To investigate this further, feature matching is used to match the key points of the real and the DeepFake frames. Key points between the two images are matched by identifying their nearest neighbors; if the ratio is greater than 80%, the match is rejected [2]. In this research, SIFT worked in projecting the indicators for the detection of features for the real image as well as the fake image
(see Fig. 3). The points used in the SIFT algorithm can also match features from one image to another (see Fig. 4); a matching sketch follows after Fig. 4. The output is presented as the result in Fig. 4, and the matching performance statistics are shown in Table 1. SURF has the highest matching ratio between DeepFake and real images, though this is subject to change depending on the frames used. However, the SURF detectors were focused on specific areas of the video frame away from the subject, which made SIFT the best choice for our study, since all anomalies in the image must be examined. SIFT proves to be a good indicator in this preliminary step before using CNN.
Fig. 3 SIFT descriptors on the real frame and the DeepFake frame
Fig. 4 Feature mapping between the real and DeepFake frames shown before; the visual shows the deviation of descriptors
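The following OpenCV sketch illustrates the key-point detection and ratio-test matching described above; it assumes OpenCV ≥ 4.4 (earlier builds expose SIFT as cv2.xfeatures2d.SIFT_create) and is not the authors' exact pipeline.

```python
# SIFT key-point detection and nearest-neighbor matching with the ratio
# test: matches whose distance ratio exceeds 0.8 are rejected.
import cv2

def sift_match(real_img, fake_img):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(real_img, None)
    kp2, des2 = sift.detectAndCompute(fake_img, None)
    matcher = cv2.BFMatcher()
    good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
            if m.distance < 0.8 * n.distance]    # Lowe's ratio test
    return kp1, kp2, good
```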
Table 1 Matching statistics of SIFT and other standard feature extractors
Detector | Points (fake) | Points (real) | Matches | Ratio (%)
SIFT | 2447 | 2481 | 2090 | 84.24
ORB | 500 | 500 | 443 | 88.6
SURF | 3520 | 3574 | 2876 | 80.47
4.4 Implement Convolutional Neural Network
For better results, a convolutional neural network (CNN) model is needed as a means of classification. CNNs are a standard technology in image and object recognition due to their ability to learn patterns at a fast rate, which helps in indicating whether an image is an anomaly or real. The convolutional neural network is comprised of layers that can, in this case, extract features from images to learn. Before feeding into the CNN model, we separated the images into 90% training images and 10% testing images; this step is key to training a model. The CNN then goes through training to learn the patterns of the images to better identify them. The training process is set to 50 epochs for better accuracy results but can be set higher if the accuracy and loss are not as desirable; for this study, the epochs remained at 50. The videos will not train at the same rate; the higher the resolution, the longer the training. The images are then validated through the test dataset. From the model's "knowledge" of real versus DeepFake frames, an accuracy is then calculated. The accuracy should be above 80% to be deemed a satisfactory performance. The CNN architecture is comprised of layers arranged to get an accurate indication of the accuracy when using SIFT with the convolutional neural network to detect DeepFakes by image classification (see Fig. 5).
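As a hedged illustration, a small frame classifier of this kind can be written in Keras as below; the layer sizes and input resolution are assumptions, since the exact architecture of Fig. 5 is not reproduced here.

```python
# Illustrative Keras sketch of a small real-vs-DeepFake frame classifier.
from tensorflow.keras import Sequential, layers

def build_frame_cnn(h=128, w=128):
    model = Sequential([
        layers.Input(shape=(h, w, 3)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),   # real vs. DeepFake
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage sketch, matching the 50-epoch setting and 90:10 split in the text:
# model.fit(x_train, y_train, epochs=50, validation_data=(x_test, y_test))
```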
Fig. 5 The proposed architecture of the CNN model
5 Results and Conclusion
When using SIFT with CNN, most runs reached over 90% accuracy at epoch 50, reaching up to 93%. The results of the Scale-Invariant Feature Transform with convolutional neural networks gave accuracies of 90% and higher at around epoch 50, when the training set was 90% and the testing set was 10% of the dataset.
[Fig. 6 data: y-axis accuracy 0–100, x-axis epochs in steps of 10; paired average accuracies (CNN+SIFT vs. CNN) — 56 vs. 49.5, 69 vs. 60, 87 vs. 82, 92 vs. 87, 93.5 vs. 91]
Fig. 6 The chart shows the average training and testing accuracy of the video frames during epochs. The CNN+SIFT result is illustrated in blue. The CNN result is illustrated in red. The average accuracies are displayed as numbers. The accuracies are separated by epochs of 10. The y-axis shows the range of accuracy. The last resulting accuracy displays the average accuracy at epoch 50
5.1 Comparison of Test Cases
To see the significance of the results of the Scale-Invariant Feature Transform with the convolutional neural network, we compare the results (see Fig. 6): the results from the images with the SIFT filter are compared with the results from images without the SIFT filter. This gives a clear indication of whether the SIFT features are able to detect DeepFakes. SIFT with CNN is the first test case, and CNN alone is the second test case in this research.
5.2 Conclusion
There was a slight increase between the test cases with CNN and with SIFT+CNN: CNN+SIFT reached accuracies of up to 93%, while CNN ranged just under. SIFT is a good indicator for DeepFakes, and also a good indicator of visual deviation. Both the SIFT+CNN and CNN test cases were inconclusive on complex DeepFakes at epoch 50, the threshold of testing; this will be further investigated. This research focused on the comparison of the original and the DeepFake to investigate whether SIFT is a beneficial tool for the detection of this false media when used with a convolutional neural network. By using feature matching, SIFT proves to be a decent detection tool for DeepFakes. In the study, we see a slight increase in accuracy when using SIFT with CNN; however, with more difficult DeepFakes, the accuracies vary for both CNN and CNN+SIFT. This research was able to show that SIFT is a feature extraction method for better detection, but there were limitations: accuracies above 90% could not be reached within the 50-epoch threshold
if the DeepFakes were more subtle. This makes such DeepFakes harder to detect, which means more training would be needed. In further research, we will investigate whether other factors can be used to detect more complex DeepFakes.
Acknowledgements We would like to acknowledge the support from the National Science Foundation (Award Number: 1900187).
References
1. Nguyen TT, Nguyen CM, Nguyen DT, Nguyen DT, Nahavandi S (2019) Deep learning for Deepfakes creation and detection. arXiv preprint arXiv:1909.11573
2. Ðorđević M, Milivojević M, Gavrovska A (2019) DeepFake video analysis using SIFT features. In: 2019 27th telecommunications forum (TELFOR), pp 1–4. IEEE
3. Mo H, Chen B, Luo W (2018) Fake faces identification via convolutional neural network. In: Proceedings of the 6th ACM workshop on information hiding and multimedia security, pp 43–47
4. Rössler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2019) FaceForensics++: learning to detect manipulated facial images. In: International conference on computer vision (ICCV)
5. Güera D, Delp EJ (2018) Deepfake video detection using recurrent neural networks. In: 2018 15th IEEE international conference on advanced video and signal based surveillance (AVSS), pp 1–6. IEEE
6. Maksutov AA, Morozov VO, Lavrenov AA, Smirnov AS (2020) Methods of deepfake detection based on machine learning. In: 2020 IEEE conference of Russian young researchers in electrical and electronic engineering (EIConRus), pp 408–411. IEEE
7. Karami E, Prasad S, Shehata M (2017) Image matching using SIFT, SURF, BRIEF and ORB: performance comparison for distorted images. arXiv preprint arXiv:1710.02726
8. Rössler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2018) FaceForensics: a large-scale video dataset for forgery detection in human faces. arXiv preprint arXiv:1803.09179
Chapter 37
Efficient Analysis and Classification of Stages Using Single Channel of EEG Through Supervised Learning Techniques Santosh Kumar Satapathy, Praveena Narayanan, and D. Loganathan
S. K. Satapathy (B) · P. Narayanan · D. Loganathan
Pondicherry Engineering College, Puducherry, India
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
S. Agrawal et al. (eds.), Machine Intelligence and Smart Systems, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4893-6_37
1 Introduction
Healthy sleep is a basic requirement for a human being; proper sleep habits affect our performance in both social and professional life and are directly linked to our physiological activities. Proper sleep determines the quality of our learning ability, physical activity, mental ability, and overall performance [1]. In general, the complete sleep duration passes through different stages of sleep, which are interrelated with our biological system [2]. In the modern digital generation, the lifestyle of human beings has become complicated, with the result that millions of people get poor-quality sleep during the night. This problem is seen across the world in all age groups, and it is a global challenge in the healthcare sector, because different studies have found that poor-quality sleep is a major cause of critical disorders such as bruxism [3], insomnia [4], narcolepsy [5], obstructive sleep apnea syndrome [6], and behavioral disorders [7]. Currently, two important sleep standards are followed during sleep staging analysis. According to both standards, the whole of sleep is divided into three basic categories: wakefulness, non-rapid eye movement (N-REM), and rapid eye movement (REM). The first sleep handbook was edited by Rechtschaffen and Kales (RK) in 1968. According to the RK rules, the NREM sleep stage is further divided into four sub-stages: N-REM1, N-REM2, N-REM3, and N-REM4 [8]. This sleep standard was followed by all clinicians when analyzing the sleep irregularities of patients, but in 2008 the American Academy of Sleep Medicine (AASM) issued a new guideline with small modifications to the RK rules. As per the AASM rules, non-rapid eye movement is further segmented into three sub-stages: N-REM1, N-REM2, and N-REM3 [9]. The sleep cycle
generally repeats at regular intervals of time between the NREM stages and the REM stage, and each cycle continues for around 90–120 min [10]. Sleep staging is normally examined using polysomnographic (PSG) recordings from a subject admitted to the clinic. During a PSG test, several physiological recordings are collected from the subject to measure sleep quality during the night: the electroencephalogram (EEG), electrooculogram (EOG), electromyogram (EMG), and electrocardiogram (ECG) [11]. Among these, most researchers give first priority to analyzing sleep pattern abnormality through the EEG signal, because the EEG provides information on the brain activity and behavior of subjects during sleep; hence, most sleep studies are based on the EEG signal. Currently, an overnight sleep study through polysomnography is one of the standard procedures for measuring sleep irregularities [12].
2 Related Work
Recently, several works related to sleep stage scoring have been carried out by different researchers. Most of the work proposed in the literature was implemented using EEG signals: features were extracted from the representative input signals, some authors then applied feature selection algorithms to select the most impactful features for the classification model, and finally different classification techniques were used to discriminate the EEG signals. In the literature, much research has already been contributed on automatic sleep stage classification based on a single channel of EEG. Oboyya et al. [13] proposed sleep stage scoring based on a single channel of EEG, with the subjects selected for the experiment limited to ages between 35 and 50; features were extracted through wavelet transform techniques, the selected features were fed into a fuzzy c-means algorithm for classification, and the average accuracy was 85%. Aboalayon [14] also developed an EEG-based sleep disorder analysis in which Butterworth band-pass filters were used to reduce irrelevant muscle movements and noise artifacts, and the extracted features were fed into SVM classifiers to distinguish between different sleep stages; the work reported 90% classification accuracy. Hassan [15] proposed a scheme using bootstrap aggregating for classification with the help of two benchmark sleep data resources, SleepEDF and DREAMS, and the accuracy was 92.43%. In [16], the authors proposed a sleep classification system based on the concept of structural graph similarity; the experimental work was based entirely on EEG input, the extracted time-domain features were forwarded to an SVM classifier, and the average classification accuracy was reported as 95.93%. Memar et al. designed a system for the analysis of sleep irregularities in which 25 sleep-suspected subjects and 20 healthy subjects were selected for the experiment; a total of 13 features were extracted from each of eight sub-band epochs (delta, theta, alpha, sigma, beta1, beta2, gamma1, and gamma2), the extracted features were validated through the Kruskal–Wallis test, and the selected features were classified through a random forest classifier. The overall classification accuracy achieved through fivefold and subject-wise cross-validation was 95.31% and 86.64%, respectively [17]. Zhang et al. proposed a novel, simple, and efficient sleep disorder diagnosis system from EEG signals with 30-s epochs as input, obtained entropy features, fed the extracted features into an SVM classification model, and achieved an overall accuracy of 94.15% [18]. Zhu et al. proposed sleep stage classification methods based on time- and frequency-domain properties of a single channel of EEG; the extracted features were put into a graph representation, the selected features were forwarded to SVM classifiers for classifying multiple sleep stages, and the final accuracy was 87.50% for the two-state sleep stage classification problem [19]. Eduardo T. Braun et al. proposed a portable and effective sleep staging classification system, conducting experiments on combinations of features extracted from the EEG signal and classifiers; the system was designed so that it achieved the best classification accuracy while considering fewer frequency-domain features, and the overall accuracy was reported as 97.1% for the two-state sleep stage classification problem [20].
are classified through random forest classifier. The overall classification accuracy achieved through fivefold and subject-wise cross-validation is 95.31% and 86.64%, respectively [17]. Zhang et al. proposed a novel, simple and efficient sleep disorder diagnosis system from EEG signal with 30 s epochs length of input and obtained entropy features and the extracted features fed into SVM classification model and achieved an overall accuracy of 94.15% [18]. Zhu et al. proposed sleep stage classification methods based on time and frequency domain properties from a single channel of EEG signal, extracted features are represented into graph representation, selected features are forwarded into SVM classifiers for classifying multiple stages of sleep stages, and their final accuracy was 87.50% for two-state sleep stage classification problems [19]. Eduardo T. Braun et al. proposed a portable and effective sleep staging classification system, in which he has conducted an experiment on the combination of features extracted from EEG signal and classifiers. He designed the system in such a manner that, the proposed research achieved best classification accuracy by considering fewer frequency domain features and the overall accuracy reported as 97.1% for the two-state sleep stage classification problem [20].
3 Experimental Data

In this study, we have used three different categories of subjects with different medical conditions. All recorded data were collected from a public comprehensive sleep repository named ISRUC-SLEEP, which was specifically designed for studies related to sleep disorders. The dataset contains sleep records from human subjects with various health conditions. All recordings were made by sleep experts in the sleep medicine centre of the Hospital of Coimbra University (CHUC) [21]. The database is organized into subgroups; the first subgroup holds the recorded sleep details of 100 subjects, with one recording session per subject. This dataset has been used by many researchers for sleep stage classification problems. For signal acquisition, subjects attended an 8-9 h full-night PSG test at a sleep laboratory. The recorded signals are sampled at 200 Hz, and each epoch covers a time frame of 30 s according to the AASM standard. The ISRUC-Sleep dataset is briefly described in Table 1, and Table 2 shows the detailed distribution of sleep records of the subjects enrolled in this experimental study (Tables 3, 4 and 5).

Table 1 ISRUC-sleep subgroup-I/II/III dataset structure

Acquired bio-signals      Channels
Electroencephalogram      EEG C3-A2, EEG C4-A1, EEG F3-A2, EEG F4-A1, EEG O1-A2, EEG O2-A1
Electrooculogram          EOG LOC-A2, EOG ROC-A1
Electromyogram            EMG chin, EMG left leg, EMG right leg
Table 2 Detailed information on each subject's sleep dataset records used in this study (database: ISRUC-Sleep)

Subject number/subgroup                 W             N1            N2            N3            R             Total epochs
Subject-16, subgroup-I/one session      128 (17.07%)  125 (16.67%)  280 (37.33%)  120 (16.00%)  97 (12.93%)   750
Subject-2, subgroup-III/one session     89 (40.53%)   120 (18.80%)  274 (26.80%)  149 (7.33%)   118 (6.53%)   750
Table 3 Sleep behavior of the subject with a mild sleep problem (subject-16)
Table 4 Sleep behavior of the healthy control subject (subject-02)
4 Methodology

In this work, we have enrolled subjects with different medical conditions, and we also considered the subjects' different session recordings during the computation of sleep stage scoring.
Table 5 Examples of 30 s epoch sleep behavior in different sleep stages for subject-16 and subject-02
Figure 1 outlines the current research study on identifying sleep disorders. All the steps shown in the block diagram are used during the treatment of sleep-related disorders, and each step is described below.
4.1 Feature Extraction

Feature extraction is one of the most important parts of classification because it directly affects classification performance; if the features are not chosen well, the performance of the classifier degrades. In this study, the EEG signals are segmented into smaller segments called epochs. Following the AASM manual, the interval of each segment is 30 s (6000 sample data points). We extracted 28 features covering both the time and frequency domains: 13 time domain features and 15 frequency domain features (Table 6).
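To make the epoching and feature computation concrete, the sketch below segments a single-channel EEG recording into 30 s epochs and computes a subset of the time domain features of Table 6, including the Hjorth activity, mobility and complexity (FE11-FE13). It is an illustrative reconstruction, not the authors' code; the function names and the exact feature subset are our own.

```python
import numpy as np
from scipy.stats import skew, kurtosis

FS = 200                 # ISRUC-Sleep sampling rate (Hz)
EPOCH = 30 * FS          # 30 s epochs -> 6000 samples

def segment_epochs(signal):
    """Split a 1-D EEG signal into non-overlapping 30 s epochs."""
    n = len(signal) // EPOCH
    return signal[:n * EPOCH].reshape(n, EPOCH)

def hjorth(x):
    """Hjorth activity, mobility and complexity (FE11-FE13)."""
    d1, d2 = np.diff(x), np.diff(x, n=2)
    activity = np.var(x)
    mobility = np.sqrt(np.var(d1) / activity)
    complexity = np.sqrt(np.var(d2) / np.var(d1)) / mobility
    return activity, mobility, complexity

def time_features(x):
    """Per-epoch time domain features (subset of Table 6)."""
    zcr = np.mean(np.abs(np.diff(np.sign(x))) > 0)  # zero-crossing rate
    return np.array([x.mean(), x.max(), x.min(), x.std(),
                     np.median(x), x.var(), zcr,
                     np.percentile(x, 75), skew(x), kurtosis(x),
                     *hjorth(x)])

# Feature matrix for one recording (eeg_c3_a2 is a 1-D array):
# X = np.vstack([time_features(e) for e in segment_epochs(eeg_c3_a2)])
```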
4.2 Feature Selection

To obtain the most appropriate features and reach the best classification result, it is important to remove irrelevant features from the extracted feature set.
Fig. 1 Block diagram of the proposed method: ISRUC-Sleep dataset -> patient data collection (subject-02, healthy control, one session recording; subject-16, suspected mild sleep problem, one session recording) -> extraction of the C3-A2 channel -> 30 s epochs -> filtering -> feature extraction -> SVM/DT classification -> diagnosis of sleep-related diseases
In our proposed sleep study, we employ machine learning techniques for classification, and these require proper input features to predict the correct outputs efficiently. Since not all the features extracted from the input signal are relevant for every subject, we apply feature selection to identify the suitable features for each individual subject. In this study, we use the online streaming feature selection (OSFS) technique to select suitable features for the classification tasks [22].
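OSFS itself evaluates features online, admitting each arriving feature only if it is strongly relevant and non-redundant given the features kept so far. As a rough, offline stand-in, the sketch below ranks the 28 extracted features by mutual information with the stage labels and keeps the top k per subject; this is an illustrative substitute, not the OSFS algorithm of [22].

```python
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def select_features(X, y, k=16):
    """Keep the k features carrying the most information about the
    wake/sleep labels; k=16 mirrors subject-16's selection in Table 7."""
    selector = SelectKBest(mutual_info_classif, k=k).fit(X, y)
    return selector.transform(X), selector.get_support(indices=True)
```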
Table 6 Short explanation of the extracted features for this proposed study

Extracted time domain features
Extracted feature set      Feature No.    Extracted feature set        Feature No.
Mean value                 FE1            Maximum value                FE2
Minimum value              FE3            Standard deviation value     FE4
Median                     FE5            Variance                     FE6
Zero crossing rate         FE7            75th percentile              FE8
Signal skewness            FE9            Signal kurtosis              FE10
Signal activity            FE11           Signal mobility              FE12
Signal complexity          FE13

Extracted frequency domain features
Extracted feature set                                            Feature No.
Relative spectral power in δ, β, α, θ bands                      FE14-FE17
Power ratios δ/β, δ/θ, (θ + α)/(α + β), θ/α, θ/β, α/β, α/δ       FE18-FE24
Band power in δ, θ, α, β bands                                   FE25-FE28
4.3 Classification

The literature shows that most authors use support vector machine (SVM) and decision tree (DT) techniques for sleep stage scoring, and these techniques are widely accepted across classification tasks. Therefore, in the present study, we use SVM and DT classifiers for sleep stage scoring.
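A minimal sketch of this classification stage with scikit-learn follows; the SVM kernel, split ratio and hyperparameters are our assumptions, since the chapter does not specify them.

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# X: (n_epochs, n_selected_features); y: 0 = wake, 1 = sleep
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

for name, clf in [("SVM", SVC(kernel="rbf", C=1.0)),
                  ("DT", DecisionTreeClassifier(random_state=0))]:
    clf.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, clf.predict(X_te)))
```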
5 Experimental Results and Discussion

To diagnose sleep-related irregularities, the different sleep-related disorders must first be detected and classified, which requires classifying the different sleep stages. In this paper, we approach two-state sleep stage classification to monitor abnormality in sleep patterns. The working model of the proposed scheme is illustrated in Fig. 1. The categories of subjects, their medical conditions and their session recordings used to evaluate the proposed work are described in Sect. 3. As described before, only one channel was used for data acquisition from the brain. Next, we filtered out muscle artifacts and removed noisy portions from the recorded signals with a Butterworth bandpass filter. Each epoch spans a 30 s time frame, so EEG epochs of 6000 data points were used for each hypnogram representation. A set of experiments was conducted to extract features from the acquired channel in both the time and frequency domains.
Table 7 Final feature selection list (30 s epochs)

Participant name/gender   Best feature combination
Subject-16, male          FE1, FE2, FE3, FE4, FE5, FE7, FE9, FE10, FE11, FE13, FE14, FE15, FE22, FE25, FE27, FE31 (16 features)
Subject-02, male          FE1, FE2, FE3, FE4, FE5, FE7, FE8, FE9, FE11, FE12, FE13, FE14, FE21, FE22, FE25 (15 features)
Hence, there exist a total of 28 features from both domains. The selected feature combinations are listed in Table 7. We applied the OSFS feature selection technique to find the most suitable properties for the classification task. The tests were conducted separately for the subject suspected of a sleep disorder (one session recording) and for the healthy subject (one session recording). In this proposed study, we evaluated the classification accuracy [23] for both subjects and conducted a comparative analysis of the achieved results. We also considered evaluation parameters such as recall [24], specificity [25], precision [26] and F1-score [26] to measure the performance of the proposed sleep analysis study. From the experimental results, we observe that for subject-16 the SVM achieves an overall classification accuracy of 95.60%, against 91.20% for the DT classifier. For subject-02, the same SVM and DT classifiers reach overall accuracies of 87.46% and 87.06%, respectively. Table 8 shows the confusion matrices for both subjects enrolled in this sleep EEG study. To measure the impact of the classification techniques on sleep stage scoring, we also compute Cohen's kappa coefficient for the proposed two-state sleep stage classification case. It is a robust way of evaluating how well the classification algorithm performs compared with the agreement expected by chance.

Table 8 Confusion matrices obtained for subject-16 and subject-02 (30 s epoch length, C3-A2 channel); rows are actual classes, columns are predicted classes

Subject-16 (ISRUC-Sleep, subgroup I/session 1 recording)
              SVM: predicted         DT: predicted
Actual        Wake    Sleep          Wake    Sleep
Wake          133     30             134     29
Sleep         3       584            37      550

Subject-02 (ISRUC-Sleep, subgroup I/session 1 recording)
              SVM: predicted         DT: predicted
Actual        Wake    Sleep          Wake    Sleep
Wake          20      69             3       86
Sleep         25      636            11      650
Table 9 Overall performance of the C3-A2 channel of the sleep EEG study for subject-16 and subject-02 (epoch length 30 s)

              SVM classifier               DT classifier
Metric        Subject-16   Subject-02      Subject-16   Subject-02
Accuracy      95.60%       87.46%          91.20%       87.06%
Precision     95.11%       90.21%          94.99%       88.32%
Recall        99.49%       96.22%          93.70%       98.34%
Specificity   81.60%       22.47%          82.21%       3.37%
F1-score      97.25%       93.12%          94.34%       93.06%
The kappa coefficient is interpreted on six levels of agreement: excellent agreement ranges from 0.81 to 1.00; 0.61 to 0.80 is considered substantial agreement; 0.41 to 0.60 represents moderate agreement; 0.21 to 0.40 is interpreted as fair agreement; 0 to 0.20 as slight agreement; and values below zero indicate poor agreement [27] (Tables 9, 10 and 11).

Table 10 Classification accuracies and kappa statistics for the subject with a mild sleep problem and the healthy control subject (30 s epoch length, one session recording)

              Subject-16 (30 s epochs)              Subject-2 (30 s epochs)
Classifier    Accuracy rate   Kappa coefficient     Accuracy rate   Kappa coefficient
SVM           95.60%          1.92                  87.46%          0.72
DT            91.20%          1.43                  87.06%          0.19
Table 11 Performance comparison of the proposed study with existing similar contributions

Author                 Year  Signal type                Method                                         Feature number                                          Accuracy
Aboalayon et al. [14]  2014  Single-channel EEG signal  Frequency sub-band feature extraction + SVM    Three features (time and frequency domain)              92.5%
Zhang et al. [18]      2018  Single-channel EEG signal  Entropy features + SVM                         Entropy features                                        94.15%
Proposed work          2020  Single-channel EEG signal  Sleep EEG study + SVM and DT classifiers       Thirteen (time domain) and fifteen (frequency domain)   95.60% (SVM), 91.73% (DT)
6 Conclusion and Future Directions

This research article presents an efficient, automated two-state sleep stage classification from EEG signals using machine learning techniques. Both time and frequency domain features are extracted, which makes the sleep stages easier to distinguish. The strength of this application is that it analyzes the irregularities occurring during sleep hours from a single channel of EEG with 28 features, scoring two-state (wake versus sleep) sleep stages according to the AASM standards; additionally, the application successfully handles subjects with different medical conditions. The OSFS algorithm was used to select suitable features based on strong relevance, and classification was performed with SVM and DT. The present work offers two main benefits: first, only a single channel is used for sleep analysis; second, more than 1500 epochs of 30 s from a public sleep dataset are analyzed, highlighting the effectiveness of the proposed system. The introduced method achieves the best performance for distinguishing wake versus sleep relative to other existing similar contributions. Our future research will focus on the effects of the class imbalance problem and on including multiple PSG signals such as EOG, EMG and ECG. Further, we will consider more clinical sleep data, especially including patients with different sleep problems, to push the performance of the proposed work toward higher accuracy.
References

1. Aboalayon KAI, Faezipour M, Almuhammadi WS et al (2016) Sleep stage classification using EEG signal analysis: a comprehensive survey and new investigation. Entropy 18:272
2. Chung M-H, Kuo TB, Hsu N, Chu H, Chou K-R, Yang CC (2009) Sleep and autonomic nervous system changes: enhanced cardiac sympathetic modulations during sleep in permanent night shift nurses. Scand J Work Environ Health 180-187
3. Heyat MBB, Akhtar F, Azad S (2016) Comparative analysis of original wave & filtered wave of EEG signal used in the detection of bruxism medical sleep syndrome. Int J Trend Sci Res Develop 1(1):7-9
4. Heyat MBB, Akhtar SF, Azad S (2016) Power spectral density are used in the investigation of insomnia neurological disorder. In: Proceedings of Pre-Congress symposium, organized by Indian Academy of Social Sciences (ISSA), King George's Medical University State Takmeel-ut-Tib College Hospital, Lucknow, Uttar Pradesh, pp 45-50
5. Rahman Farook, Siddiqui H (2016) An overview of narcolepsy. Int Adv Res J Sci Eng Technol 3:85-87
6. Kim T, Kim J, Lee K (2018) Detection of sleep disordered breathing severity using acoustic biomarker and machine learning techniques. BioMed Eng OnLine 17:16
7. Siddiqui M, Srivastava G, Saeed S (2016) Diagnosis of insomnia sleep disorder using short time frequency analysis of PSD approach applied on EEG signal using channel ROC-LOC. Sleep Sci 9(3):186-191
8. Rechtschaffen A (1968) A manual for standardized terminology, techniques and scoring system for sleep stages in human subjects. Brain Information Service
9. Iber C, Ancoli-Israel S, Chesson AL, Quan S (2007) The AASM manual for the scoring of sleep and associated events: rules, terminology and technical specifications. American Academy of Sleep Medicine, Westchester, IL
10. Carskadon MA, Dement WC (2017) Normal human sleep: an overview. In: Kryger M, Roth T, Dement WC (eds) Principles and practice of sleep medicine, 6th edn. Elsevier, Amsterdam, The Netherlands, pp 15-24. [Online]. Available https://doi.org/10.1016/B978-0-323-24288-2.00002-7
11. Holland JV, Dement WC, Raynal DM (1974) Polysomnography: a response to a need for improved communication. In: Presented at the 14th Association for the Psychophysiological Study of Sleep [Online]
12. Acharya UR et al (2015) Nonlinear dynamics measures for automated EEG-based sleep stage detection. Eur Neurol 74(5-6):268-287
13. Obayya M, Abou-Chadi F (2014) Automatic classification of sleep stages using EEG records based on fuzzy c-means (FCM) algorithm. In: 2014 31st National Radio Science Conference (NRSC), pp 265-272
14. Aboalayon K, Ocbagabir HT, Faezipour M (2014) Efficient sleep stage classification based on EEG signals. In: 2014 IEEE Long Island Systems, Applications and Technology Conference (LISAT), pp 1-6
15. Hassan AR, Subasi A (2017) A decision support system for automated identification of sleep stages from single-channel EEG signals. Knowl-Based Syst 128:115-124
16. Diykh M, Li Y, Wen P (2016) EEG sleep stages classification based on time domain features and structural graph similarity. IEEE Trans Neural Syst Rehabil Eng 24(11):1159-1168. https://doi.org/10.1109/tnsre.2016.2552539
17. Memar P, Faradji F (2018) A novel multi-class EEG-based sleep stage classification system. IEEE Trans Neural Syst Rehabil Eng 26(1):84-95. https://doi.org/10.1109/tnsre.2017.2776149
18. Pernkopf F, O'Leary P (2001) Feature selection for classification using genetic algorithms with a novel encoding. In: Skarbek W (ed) Computer analysis of images and patterns. CAIP 2001. Lecture notes in computer science, vol 2124. Springer, Berlin, Heidelberg
19. Zhu G, Li Y, Wen PP (2014) Analysis and classification of sleep stages based on difference visibility graphs from a single-channel EEG signal. IEEE J Biomed Health Inform 18(6):1813-1821
20. Braun ET, Kozakevicius ADJ, Da Silveira TLT, Rodrigues CR, Baratto G (2018) Sleep stages classification using spectral based statistical moments as features. Revista de Informática Teórica e Aplicada 25(1):11
21. Khalighi S, Sousa T, Santos JM, Nunes U (2016) ISRUC-Sleep: a comprehensive public dataset for sleep researchers. Comput Methods Programs Biomed 124:180-192
22. Hanaoka M, Kobay M, Haruaki Y (2001, October 25-28) Automated sleep stage scoring by decision tree learning. In: Proceedings of the 23rd annual EMBS international conference, Istanbul, Turkey
23. Sanders TH, McCurry M, Clements MA (2014, August) Sleep stage classification with cross frequency coupling. In: Proceedings of the 36th annual international conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp 4579-4582
24. Bajaj V, Pachori RB (2013) Automatic classification of sleep stages based on the time-frequency image of EEG signals. Comput Methods Programs Biomed 112(3):320-328
25. Hsu Y-L, Yang Y-T, Wang J-S, Hsu C-Y (2013) Automatic sleep stage recurrent neural classifier using energy features of EEG signals. Neurocomputing 104:105-114
26. Powers D (2011) Evaluation: from precision, recall and F-measure to ROC, informedness, markedness & correlation. J Mach Learn Technol 2(1):37-63
27. Liang S-F, Kuo C-E, Hu Y-H, Cheng Y-S (2012) A rule-based automatic sleep staging method. J Neurosci Methods 205(1):169-176
Chapter 38
TERRABOT—A Multitasking AI Anmol Kumari, Komal Gupta, and Deepika Rawat
1 Introduction

TerraBot is a modern-day bot that expands the idea of a bot. It is a smart AI that uses deep neural networks to understand natural language and map it to the associated intent. Such an AI bot not only saves time but also acts as an effective companion in everyday life. TerraBot is a helper bot that makes life easier: a multitasking agent that lets us concentrate on the important aspects of our lives by taking care of small activities, such as sending an email or playing a song, on the basis of a voice command, thereby redirecting our productivity to mainline activities. A multitasking bot that can act as a chatbot and at the same time be task-oriented is the need of the hour in every industry and for personal use. As a chatbot, TerraBot is built using the latest NMT [1-3] model and is trained on a very large dataset. It learns and memorizes using deep neural networks, perceives the statements of the end user, and gives responses and takes actions accordingly. The ability to extract data and identify the user's intent, along with the relevant entities contained in the user's request, is the first requirement and the most crucial step at the core of TerraBot. A failure at correctly understanding the user's
request, won’t help in fetching the desired output. The overall process works in the manner where it firstly focuses to identify which part of the bot is to be invoked. Once the input voice command is converted into text it now focuses on invoking either that chatbot or the multi-agent bot. A chatbot is an artificial intelligence software that can simulate a conversation (or a chat) with a user in natural language through messaging applications, websites, mobile apps, or through the telephone, while multi-agent bot provides various functionalities like an assistant as per users command that can either be typed or through voice input. TerraBot recognizes the user’s intent [4] and corresponds through various APIs integrated into it. It can provide functionalities ranging from basic conversation to recommending nearby places. It has been designed with the ability to play a song for you of a particular movie or a particular artist without fail. It can tell you a joke when you want and also be your personal weather forecast reporter. It can extend to all of it apart from being a conversation builder to you. An assistant as such is often described as one of the most advanced and promising expressions of interaction between humans and machines. However, from a technological point of view, a chatbot only represents the natural evolution of a Question-Answering system leveraging Natural Language Processing (NLP). Formulating responses to questions in natural language is one of the most typical examples of Natural Language Processing applied in various enterprises’ end-use applications. TerraBot is created using the Black box NMT Model and was trained on the corpus that contains Reddit data [5]. It uses the BRNN technique that not just gives memory to it but also extends it to the past and future of input. Sqlite3 was used to create a database that would store 450 GB of data and data is stored in the Table data structure having respective features. The time savings and efficiency derived from AI bot answering recurring questions are attractive to companies looking to increase sales or service productivity. Such a bot can have multiple application basis the industry and computational speed requirement.
2 The Workflow of TerraBot

TerraBot has two strict divisions: one as a chatbot and the other as a multitasking agent with a defined set of functionalities it can perform. The input received as speech is first converted into text, and principles of NLP are applied to it. Each input is broken into tokens. Understanding these tokens is crucial, since this is the point at which we identify whether the command calls for a plain chat (TerraBot as a chatbot) or for an action to be performed (TerraBot as a multi-agent). The multi-agent works with intents and intent classification. An intent is an important aspect of natural language understanding, and calls are made to the respective APIs on the basis of the intent classification, together with the associated parameters. If an input command has no intent under the classification, it is treated as a call for the chatbot, and the NMT chat agent is invoked. The chatbot was made feasible using NMT. For the bot to work accurately as a chatbot, it
Fig. 1 Workflow of TerraBot
needed to be trained well with a persistent memory, which called for the use of neural networks. TerraBot receives voice input, converts it to text using speech-to-text, and processes it further using the NMT model (Fig. 1).
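A minimal sketch of this front end is shown below, using the speech_recognition package as one common speech-to-text choice (it needs a microphone backend such as PyAudio) and plain keyword matching as a stand-in for the trained intent classifier described later; the intent list is an illustrative subset.

```python
import speech_recognition as sr

KNOWN_INTENTS = {"song", "joke", "weather", "news", "email"}  # subset

def listen():
    """Capture one utterance from the microphone and return it as text."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio)  # online STT service

def route(command):
    """If any token matches a known intent, invoke the multi-agent;
    otherwise fall through to the NMT chatbot."""
    for token in command.lower().split():
        if token in KNOWN_INTENTS:
            return "multi-agent", token
    return "chatbot", None
```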
3 The Workflow of TerraBot as a Chatbot

3.1 Data Preparation and Structuring

One of the primary tasks in making TerraBot conversational was to prepare the training data, structuring it into the input and output format required by any machine learning algorithm. The training depended entirely on the data source, and we used the 1.7 billion comment Reddit dataset [5]. Such a dump posed a huge challenge for TerraBot because Reddit data is not linear but tree-like: the top layer is linear, but each comment calls for replies, which in turn have their own layers of replies, whereas any deep learning model requires explicit input-output pairs. To handle this structure, we used the comment with the highest vote as the reply corresponding to a given question, which let us linearly choose one reply for each question. Anything in the form of natural language calls for a neural network; for our network, the input layer is the question and the output layer is the reply. Using this scheme, we stored such pairs in our database with unique IDs, parentId and commentId, to uniquely identify parent and child, that is, question and answer. See the sketch below.
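The sketch below illustrates this pairing logic with sqlite3. The table keeps one row per parent, and a new reply replaces the stored one only if its vote score is higher; the field names mirror the Reddit dump (body, score, parent_id), while the schema itself is our own illustration.

```python
import sqlite3

conn = sqlite3.connect("reddit_pairs.db")
conn.execute("""CREATE TABLE IF NOT EXISTS pairs
                (parent_id TEXT PRIMARY KEY,
                 parent TEXT, reply TEXT, score INT)""")

def insert_reply(parent_id, parent_text, reply_text, score):
    """Keep only the highest-voted reply seen so far for each parent."""
    row = conn.execute("SELECT score FROM pairs WHERE parent_id = ?",
                       (parent_id,)).fetchone()
    if row is None or score > row[0]:
        conn.execute("INSERT OR REPLACE INTO pairs VALUES (?, ?, ?, ?)",
                     (parent_id, parent_text, reply_text, score))
        conn.commit()
```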
3.2 Training Dataset

To train on our dataset, we used a parent file and an associated reply text file, where each line is a sample: for example, line 3 of the parent file is a parent comment, and line 3 of the reply file is the response to that particular comment. To create these files, we simply took the pairs already structured in our database and appended them to the respective training files. The training data was then ready to be used for modelling TerraBot.
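Continuing the database sketch above, exporting the pairs into the two aligned training files can be as simple as the following (the file names are placeholders):

```python
# Line i of train.from is the question; line i of train.to is its answer.
with open("train.from", "w") as f_in, open("train.to", "w") as f_out:
    for parent, reply in conn.execute("SELECT parent, reply FROM pairs"):
        f_in.write(parent.replace("\n", " ") + "\n")
        f_out.write(reply.replace("\n", " ") + "\n")
```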
3.3 Modelling

To fulfil the needs of TerraBot, we devised a Seq2Seq model. Any kind of input can easily be reduced to a sequence mapped against another sequence, and sequence-to-sequence models are well known for machine translation, dialogue generation and language modelling. We needed an architecture that could use an encoder to capture the meaning of the tokens via word vectors and a decoder to produce the corresponding reply; a seq2seq model is exactly such an encoder-decoder architecture. A Recurrent Neural Network (RNN), a well-known deep neural network for Natural Language Processing (NLP) tasks, is used in the Seq2Seq model. In this model, the input sequence, as a vector representation of text, is loaded into the encoder, which generates an intermediate thought vector or information representation [6]. The generated thought vector is then dispatched as input to the decoder, which processes it and produces the target sequence word by word [7]. The basic Seq2Seq model uses an encoder-decoder architecture without an attention mechanism, but we needed our model to remember long sequences of information, which become large for huge datasets and obstruct the flow of information through the RNN network. A traditional neural network assumes that input sequences are independent of each other, which is not what TerraBot needed, because for many tasks the prediction of the next word depends on the past words. This called for a recurrent neural network, whose ability to compute on the basis of the past words held in its memory makes better use of the incoming sequences. A bidirectional LSTM and NMT were further used to avoid bucketing and padding.

Need for NMT

The sequences of inputs to TerraBot are not of fixed length. Sometimes a single-word statement can produce a multi-line response, whereas a multi-line input sometimes
could generate a single response. Further, each input differs in words, characters and more. Words are assigned meaningful IDs to form vectors that give them meaning [1, 2]. The challenge was handling the variable lengths. To overcome this, we first decided to make all word strings a fixed size of 30. This idea failed instantly: inputs with very small lengths needed excessive padding, while inputs longer than 30 had to be truncated. This called for the use of NMT, which works with variable-length inputs and needs no padding or bucketing. It also supports the attention mechanism, which gives the RNN a longer memory. We made use of LSTM and bidirectional RNN [8].
3.4 NMT

Neural Machine Translation is a methodology widely used in machine translation [1, 9, 10]. In the meantime, it has also become a very popular approach to building conversational models [2, 3, 11, 12]. With the enormous size of social conversation data, those models are purely data-driven (i.e., with no manual rules) and learn to generate plausible responses. NMT uses a single encoder-decoder neural network to map a source sequence to a target sequence, and the whole network is jointly trained to maximize the conditional likelihood of the source sequence producing the target sequence [1, 3, 9, 10] (Fig. 2). NMT works on the principle of reading the input to create a thought vector, a sequence of numbers that represents the meaning of a sentence; a decoder then processes this vector to produce the needed translation. This is the architecture an NMT follows, and it addresses the problem efficiently. The architecture varies by need: the basic choice is an RNN for both the encoder and the decoder, and an RNN may differ in directionality (unidirectional or bidirectional), depth, or cell type. TerraBot uses a bidirectional RNN with LSTM (Long Short-Term Memory) cells.
Fig. 2 NMT model
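The sketch below shows the bare encoder-decoder idea in PyTorch: the encoder's final hidden state acts as the thought vector that seeds the decoder. It is deliberately minimal; TerraBot's actual model additionally uses bidirectional LSTM encoding and attention, and the sizes here are placeholders.

```python
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src, tgt):
        # The final (h, c) of the encoder is the "thought vector".
        _, thought = self.encoder(self.embed(src))
        dec_out, _ = self.decoder(self.embed(tgt), thought)
        return self.out(dec_out)      # per-position vocabulary logits
```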
3.5 Long Short Term Memory

A recurrent neural network variant called LSTM was used to build TerraBot because it needed a better memory. LSTM, i.e. Long Short-Term Memory, is a special cell-based variant of the Recurrent Neural Network [13] that has heuristically been shown to work well for language modelling. An LSTM has three types of gates (forget gates, input gates and output gates), which help it retain the more contextual and relevant data and discard the rest of the sequence; this is advantageous in language modelling, where dependencies within a sequence are scattered [14]. Also, rather than using unidirectional cells, bidirectional LSTM cells perform much better [8]. In general, an LSTM can remember 10-20 tokens of the input sequence; performance drops beyond that level as the network starts forgetting the initial tokens to accommodate new ones, so TerraBot found it difficult to retain anything beyond this limit. NMT supports the attention mechanism, which we used to increase this attention span, extending the network's effective memory to up to 80 words.

Attention Mechanism

Neural machine translation by jointly learning to align and translate [15] permits the decoder to selectively look at the input sequence during decoding. This takes the pressure off the encoder to encode every piece of useful information from the input [16]. In the decoder, at every timestamp, instead of a fixed context (the last hidden state of the encoder), a distinct context vector a_i is used for generating word b_i (Fig. 3).
Fig. 3 Attention mechanism
Each hidden state in the encoder encodes information about the local context in that part of the sentence. As the data streams from the 0th word to the nth word, this context information gets diluted, which makes attention essential for the decoder. Different parts of the input sequence contain the essential information for generating different parts of the output sequence; in other words, each word in the output sequence is aligned to different parts of the input sequence. The alignment model gives a measure of how closely the output at position i matches the inputs at each position. On this basis, we use a weighted sum of the hidden states (input contexts) to generate every word in the output sequence, as sketched below.
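In its simplest dot-product form, this weighted sum can be computed as follows; Bahdanau-style attention [15] instead scores each pair with a small feed-forward network, but the context vector is formed the same way.

```python
import torch
import torch.nn.functional as F

def attention_context(dec_state, enc_states):
    """dec_state: (hid,); enc_states: (src_len, hid).
    Returns the context vector and the alignment weights."""
    scores = enc_states @ dec_state     # one score per source position
    weights = F.softmax(scores, dim=0)  # attention distribution
    return weights @ enc_states, weights
```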
3.6 Using a Bidirectional Recurrent Neural Network

TerraBot needed more than just memory; it needed the understanding that data from both the past and the future matter [17, 18]. This called for a memory with that property, and we used bidirectional recurrent neural networks. The RNN has many benefits for processing sequence data, but it is susceptible to gradient explosion and gradient vanishing; the BRNN is developed on the basis of the RNN to overcome the gradient vanishing problem and has an edge over the RNN in handling complex, long-term data [19]. Bidirectional recurrent neural networks are trained to anticipate both directions of time, positive and negative, simultaneously. The neurons of the RNN are split into two directions: one in the positive time direction (forward states) and another in the negative time direction (backward states). There are no connections between these output states and inputs in the opposite directions. By using the two time directions concurrently, input data from the future and the past of the present time frame can be used to compute the same output. Since the two directional sets of neurons do not connect with each other, bidirectional recurrent neural networks can be trained with algorithms similar to those for RNNs [20]. If backpropagation is needed, some additional procedure is required, since the input and output layers cannot both be updated at once. In general training, in the forward pass the forward and backward states are processed first, before the output neurons; in the backward pass the opposite takes place, and the output neurons are processed first. Only when the forward and backward passes are complete are the weights updated.
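In PyTorch this amounts to a single flag: the forward and backward hidden states are concatenated, doubling the output width (all sizes below are placeholders).

```python
import torch
import torch.nn as nn

birnn = nn.LSTM(input_size=128, hidden_size=256,
                batch_first=True, bidirectional=True)
x = torch.randn(8, 30, 128)   # batch of 8 sequences of 30 embeddings
out, _ = birnn(x)
print(out.shape)              # torch.Size([8, 30, 512]) = 2 x 256
```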
4 The Workflow of TerraBot as a Multi-Agent

An intelligent bot can offer innumerable functionalities; we limited TerraBot to those most important in our day-to-
day life. The functionalities offered by TerraBot include playing a song, sending an email, checking weather conditions, telling a joke and recommending nearby places. We did not want the phrasing used to ask for a functionality to be fixed, and no user command should ever go unserved. There were many challenges, but the most difficult one was deciding how TerraBot should identify a particular function and react to it appropriately. We used intent classification to understand the intent behind an input.
4.1 Intent Classification

Intent and intent classification are integral parts of natural language understanding. An intent can be defined as a category of work whose recognition makes TerraBot proceed to action [4]. It can be customised and trained based on the functionalities chosen. For instance, if the command is to play a song, the bot first needs to understand what has to be done; the objective of a command is called its intent, and in this case the intent is "song". The input may differ in its specification: one might ask for a particular song by name, a song by a particular artist, or a song from a particular movie, and all of these have to be handled. TerraBot has eight intents: song, joke, weather, nearby places, covid19, news, ask me anything and email. The next target is to classify the tokens per intent, which requires further training. The challenge is to avoid a fixed template for commands and to cover all the possible ways a command could be phrased [16]. Thus, each input is broken into two major categories: intent and entity. For the song intent, we have three sets of possible entities: song name; song name and artist name; or song name and movie name. Weather prediction has the basic intent "weather" and entities such as location followed by date and time, since it demands real-time weather conditions for a particular location. The joke has both intent and entity "joke". When the user asks TerraBot to send an email, it recognizes the email intent and extracts the entities, the sender email and the body, from the input tokens. To recommend a place, TerraBot uses the required location, longitude and latitude as entities, with nearby places as the intent. We have also provided a small feature where you can ask for anything and TerraBot fetches a result for it: here "ask me anything" is the intent and the message body is taken as the subject searched for. You can use TerraBot to fetch trending news, limited to five items; it is trained with "news" as the intent and "topic" as the entity, and on recognition the corresponding API is called automatically with the needed parameters. Given the current COVID-19 pandemic, we also built a dashboard with current cases, current deaths and the mortality rate, the top 10 countries by cases and deaths across the world, and the increase in cases per day. We trained TerraBot with "covid19" as the intent and, using the request module, linked this web-based module to the associated class.
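As a hedged illustration of such an intent classifier (the chapter does not state which model TerraBot uses), a TF-IDF plus linear SVM pipeline trained on a few phrasings per intent already generalizes beyond fixed command templates; the training phrases below are invented examples.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

commands = ["play despacito", "play a song by coldplay",
            "tell me a joke", "make me laugh",
            "what is the weather in delhi", "will it rain tomorrow",
            "send an email to rahul", "mail the report to my boss"]
intents = ["song", "song", "joke", "joke",
           "weather", "weather", "email", "email"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(commands, intents)
print(clf.predict(["could you play something by arijit"]))  # expect: song
```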
Fig. 4 Output of joke intent
Fig. 5 Output of suggest nearby places
After the primary task of natural language understanding and intent classification, we call the relevant API, passing the intent and the entity. The parameters passed to the call give the algorithm a clear idea of what has to be done and everything needed to put the intent into action. The call is then made, resulting in the fulfilment of the command (Figs. 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13).
5 Conclusion and Future Vision

The overall objective of TerraBot is to provide effective time management and ease of use for its user, and accurate, efficient training helps it serve this purpose best. Building a multitasking bot is a modern-day need, and TerraBot tries to achieve it with niche technologies and better intent classification; however, there is always scope for improving it and making it more accurate. We trained our model for more than 20 h, and with each additional stretch of training we observed TerraBot's replies becoming more relevant; training it on a GPU in the future would give it more efficiency and robustness. Future aims
Fig. 6 Output of weather intent
Fig. 7 Output of song intent
include integration with IoT for home automation and then car automation, to make tasks in these two domains more user-friendly. This can be further extended to build a complete bot with hardware integration, using a microcontroller as the brain of the device; a Raspberry Pi can be used. The bot's ability to send an email can be strengthened with an attachment facility. The scope of recommending nearby places could be widened by adding subdivisions such as food places, restaurants and travel places, where the bot identifies the category, plus one more entity for place recognition through the API. Suggesting or playing a song is currently limited to YouTube; considering the commercial aspect of TerraBot, future versions would add services like Saavn, Gaana and Spotify. The overall encapsulation of the entire bot in hardware would use a microcontroller
Fig. 8 Output of song intent (playing on YouTube)
Fig. 9 Output of email intent
that would work in a programmed way to produce all the same outputs. TerraBot has a vast scope for improvement and innovation, and the future vision can build on advancements in technology and innovative modifications to make it more and more scalable and user-friendly.
Fig. 10 Output of TerraBot working as Chatbot
Fig. 11 Output of ask me anything Intent
Fig. 12 Output of covid19 intent
Fig. 13 Output of news intent
References

1. Cho K, Van Merriënboer B, Gulcehre C et al (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Conference on empirical methods in natural language processing (EMNLP 2014)
2. Shang L, Lu Z, Li H (2015) Neural responding machine for short-text conversation. ACL (1):1577-1586
3. Vinyals O, Le Q (2015) A neural conversational model. CoRR abs/1506.05869
4. Sebastiani F (2002) Machine learning in automated text categorization. ACM Comput Surv 34:1-47
5. Reddit Comments Dataset [Online]. Available https://www.reddit.com/r/datasets/comments/65o7py/updated_reddit_comment_dataset_as_torrents/
6. Jadeja M, Varia N, Shah A (2017) Deep reinforcement learning for conversational AI. In: SCAI'17: search-oriented conversational AI, San Diego
7. Li J, Monroe W, Ritter A, Galley M, Gao J, Jurafsky D (2016) Deep reinforcement learning for dialogue generation. In: Proceedings of the 2016 conference on empirical methods in natural language processing, Austin, Texas
8. Graves A, Schmidhuber J (2005) Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw 18:602-610
9. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Adv Neural Inf Process Syst 3104-3112
10. Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473
11. Sordoni A, Galley M, Auli M et al (2015, June 1) A neural network approach to context-sensitive generation of conversational responses. In: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT)
12. Xing C, Wu W, Wu Y et al (2017) Topic aware neural response generation. AAAI 17:3351-3357
13. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. Adv Neural Inf Process Syst 26:3111-3119
14. Sutskever I, Vinyals O, Le Q (2014) Sequence to sequence learning with neural networks. In: NIPS'14 proceedings of the 27th international conference on neural information processing systems, vol 2, Montreal, Canada
15. Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. Submitted 1 Sep 2014 (v1), last revised 19 May 2016 (v7)
16. Liu J, Li Y, Lin M (2019) Review of intent detection methods in the human-machine dialogue system. J Phys Conf Ser 1267:012059
17. Hakkani-Tür D, Tur G, Celikyilmaz A, Chen Y-N, Gao J, Deng L, Wang Y-Y (2016) Multi-domain joint semantic frame parsing using bi-directional RNN-LSTM. https://doi.org/10.21437/Interspeech.2016-402
18. Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate
19. Setiaji B, Wibowo F (2016) Chatbot using a knowledge in database: human-to-machine conversation modeling. In: 2016 7th international conference on intelligent systems, modelling and simulation (ISMS)
20. Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Advances in neural information processing systems, pp 3104-3112
Chapter 39
Mutation Operator-Based Image Encryption Algorithm for Securing IoT Rashmi Rajput and Manish Gupta
1 Introduction

In the current scenario, the Internet of Things has become a hot research area, and many researchers are working in this field because of the increasing demand for IoT devices in the near future [1]. The use of IoT-enabled devices such as smartphones is increasing day by day, and a large number of images are transmitted regularly to thousands of people via social media websites and apps, so the secure exchange of images over the communication network has become a serious issue [2]. Various traditional encryption algorithms such as RSA, AES, IDEA, and Diffie-Hellman have been developed, but their efficiency for image encryption is low due to the higher redundancy and higher correlation among pixels. Traditional image encryption algorithms based on symmetric key cryptography are generally more expensive due to algorithmic complexity and require more rounds for encryption. The work in [3] therefore proposed a less complex algorithm that uses a 64-bit block cipher with a 64-bit key for image encryption, requiring less memory and fewer rounds than other encryption algorithms. Various image encryption algorithms have been proposed based on a 4-D chaotic map [4], a hyperchaotic system and genetic algorithm [3], hash keys using the mutation operator and chaos [5], chaos using crossover and mutation operators [6], a fractional-order hyperchaotic system [7], hyperchaos [8], a chaotic dynamic S-box with DNA sequence operations [9], and a 3-D logistic map [13].
This work uses a mutation operator in both the image encryption and the key generation phases, which increases the difficulty level for attackers. The rest of this paper is organized as follows: Sect. 2 describes the proposed work, Sect. 3 presents the experimental results on various parameters, and finally Sect. 4 concludes the overall work.
2 Proposed Methodology

This section describes the proposed methodology, which works in three phases: key generation, encryption, and decryption.

(a) Key generation phase: In this phase, a secure unique key is generated using a random function, namely the mutation operator. This mutation operator randomly selects a mutation point in a parent bit string and flips the value (0 to 1 or 1 to 0) at that point. Figure 1 shows the process of the mutation operator, and the following steps describe it:

Step 1: Select the mutation probability (Pm); here Pm = 0.01 is used for the experiments.
Step 2: Find the length of the binary string to which the mutation operator is applied.
Step 3: For k = 1 to the length of the binary string:
  R = rand();
  if R < Pm, then binarystring(k) = ~binarystring(k);
end
Fig. 1 The concept of mutation operator
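A runnable Python equivalent of the per-bit loop in the steps above follows; the example bit string is a placeholder.

```python
import random

def mutate(bits, pm=0.01):
    """Flip each bit with probability pm (rand() < Pm triggers a flip)."""
    return ''.join(('1' if b == '0' else '0') if random.random() < pm
                   else b for b in bits)

key = mutate('1011001010101100')   # e.g. a 16-bit sub-key
```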
(b) Encryption phase: This phase encrypts the image information. The encryption process consists of four rounds and two swap operations; each round contains XOR and XNOR logical functions and one random function in the form of the mutation operator. The first swap operation is performed after the first encryption round, and the second swap is performed after the third encryption round. The following steps show the working of the encryption process (a sketch of one round is given below):

Step 1: Divide the 64 bits of binary data into 4 blocks, each 16 bits in size.
Step 2: Perform an XNOR operation on the first 16-bit block with key K1.
Step 3: Apply the random function (f) to the 16-bit block obtained in step 2, then XOR the f-function output with the third 16-bit block.
Step 4: Perform an XNOR operation on the last 16-bit block with key K1.
Step 5: Apply the random function (f) to the 16-bit block obtained in step 4, then XOR the f-function output with the second 16-bit block.
Step 6: Repeat steps 2 to 5 with keys K2, K3, K4, and K5, each 16 bits in size.

(c) Decryption phase: This phase is the reverse of the encryption phase described above.
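To make the round structure concrete, the sketch below implements one round on four 16-bit blocks held as integers. The XNOR is masked to 16 bits, f stands for the random (mutation) function, and the destination of each XOR result, as well as the omitted inter-round swaps, reflect our reading of the steps above rather than a verified implementation.

```python
MASK = 0xFFFF

def xnor(a, b):
    """Bitwise XNOR restricted to 16 bits."""
    return ~(a ^ b) & MASK

def encrypt_round(b1, b2, b3, b4, key, f):
    """One round: XNOR the outer blocks with the round key, pass the
    results through the random function f, XOR into the inner blocks."""
    b3 ^= f(xnor(b1, key))   # steps 2-3
    b2 ^= f(xnor(b4, key))   # steps 4-5
    return b1, b2, b3, b4
```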
3 Experimental Results

This section presents the experimental results for various parameters: NPCR, entropy, correlation, and histogram analysis. MATLAB 2015 on a system with a Core i3 processor and 2 GB RAM was used to implement the proposed work, and the standard Lenna image was used for all operations. Figure 4 shows the encrypted and decrypted Lenna image.
(a) NPCR: The NPCR (number of pixels change rate) of two images is given by

NPCR = ( Σ_{i,j} D(i, j) / (W × H) ) × 100%

where W and H are the width and height of the input image, and the value of D(i, j) is calculated as
D(i, j) = 0 if C1(i, j) = C2(i, j), and D(i, j) = 1 otherwise.
(b) Entropy: Entropy is a statistical measure of randomness that can be used to characterize the texture of the input image. It is calculated as

E(m) = Σ_{j=0}^{M−1} p(m_j) · log( 1 / p(m_j) )
(c) Correlation: The following formulas are used to calculate the vertical, horizontal, and diagonal correlation:

Cov(x, y) = E[ (x − E(x))(y − E(y)) ]

R_xy = Cov(x, y) / ( √D(x) · √D(y) )
The following three formulas are used in the numerical computations, where x and y are the values of two adjacent pixels in the image (Figs. 2, 3 and 4; Table 1):

E(x) = (1/N) Σ_{i=1}^{N} x_i

D(x) = (1/N) Σ_{i=1}^{N} ( x_i − E(x) )²

Fig. 2 The horizontal, vertical, and diagonal correlation of original and encrypted Lenna image
Fig. 3 The original and encrypted Lenna image histogram
Fig. 4 The encrypted and decrypted Lenna image
Table 1 Comparison between existing techniques and the proposed work on NPCR and entropy

Encryption technique                                    NPCR     Entropy
Enayatifar et al. [10]                                  99.54    7.9339
Enayatifar et al. [11]                                  99.56    7.8753
Talarposhti et al. [12]                                 99.63    7.8702
Proposed work using single-point mutation (max.)        99.62    7.9965
Cov(x, y) = (1/N) Σ_{i=1}^{N} ( x_i − E(x) )( y_i − E(y) )
(d) Histogram: The statistical similarity between the original and cipher images is measured with the help of histogram analysis.
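Both NPCR and entropy follow directly from the formulas above; a numpy sketch for 8-bit grayscale images (illustrative, not the authors' MATLAB code):

```python
import numpy as np

def npcr(c1, c2):
    """Percentage of pixel positions at which two cipher images differ."""
    return 100.0 * np.mean(c1 != c2)

def entropy(img):
    """Shannon entropy, in bits, of an 8-bit grayscale image
    (ideal cipher images approach 8)."""
    p = np.bincount(img.ravel(), minlength=256) / img.size
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))
```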
4 Conclusion

Since most communications from IoT-enabled devices are in the form of images, and the use of mobile phones keeps increasing, there is a need for an image encryption algorithm that is both fast and secure. This work proposed an image encryption technique based on a random function, the mutation operator of the genetic algorithm. To check the efficiency of the proposed algorithm, experiments were performed on various parameters, showing that the proposed algorithm outperforms the existing techniques described in this paper.
References

1. Liu B, Li Y, Zeng B, Lei C (2016) An efficient trust negotiation strategy towards the resource-limited mobile commerce environment. Front Comput Sci 10(3):543-558
2. Usman M, Ahmed I, Aslam M, Khan S, Shah U (2017) SIT: a lightweight encryption algorithm for secure Internet of Things. Int J Adv Comput Sci Appl (IJACSA) 8(1)
3. Zhang X, Zhou H, Zhou Z, Wang L, Li C (2018) An image encryption algorithm based on hyperchaotic system and genetic algorithm. In: Qiao J et al (eds) Bio-inspired computing: theories and applications. BIC-TA 2018. Communications in computer and information science, vol 952. Springer, Singapore
4. Stalin S, Maheshwary P, Shukla PK, Maheshwari M, Gour B, Khare A (2019) Fast and secure medical image encryption based on non linear 4D logistic map and DNA sequences (NL4DLM_DNA). J Med Syst 43(8):267
5. Guesmi R, Farah M, Kachouri A, Samet M (2016) Hash key-based image encryption using mutation operator and chaos. Multimed Tools Appl 75(8):4753-4769
6. Samhita P, Prasad P, Patro K, Acharya B (2016) A secure chaos-based image encryption and decryption using crossover and mutation operator. Int J Control Theor Appl 9(34):17-28
7. Huang X, Sun T, Li Y, Liang J (2015) A color image encryption algorithm based on a fractional-order hyperchaotic system. Entropy 17:28-38 (MDPI)
8. Tong X, Yang L, Zhang M, Xu H, Zhu W (2015) An image encryption scheme based on hyperchaotic Rabinovich and exponential chaos maps. Entropy 17:181-196 (MDPI)
9. Tian Y, Lu Z (2017) Novel permutation-diffusion image encryption algorithm with chaotic dynamic S-box and DNA sequence operation. AIP Adv 1-23
10. Enayatifar R, Abdullah AH, Lee M (2013) A weighted discrete imperialist competitive algorithm (WDICA) combined with chaotic map for image encryption. Opt Lasers Eng 51:1066-1077
11. Enayatifar R, Abdullah AH, Isnin IF (2014) Chaos-based image encryption using a hybrid genetic algorithm and a DNA sequence. Opt Lasers Eng 56:83-93
12. Talarposhti KM, Jamei MK (2016) A secure image encryption method based on dynamic harmony search (DHS) combined with chaotic map. Opt Lasers Eng 81:21-34
13. Ye G, Jiao K, Pan C, Huang X (2018) An effective framework for chaotic image encryption based on 3D logistic map. Secur Commun Netw 1-11
Chapter 40
A Novel Method for Corona Virus Detection Based on Directional Emboss and SVM from CT Lung Images Arun Pratap Singh, Akanksha Soni, and Sanjay Sharma
1 Introduction

COVID-19 is caused by a member of the Severe Acute Respiratory Syndrome (SARS) family of coronaviruses, SARS-CoV-2. As of 1 May 2020, more than 3.26 million people had been reported positive across 187 countries and territories, resulting in more than 233,000 deaths, while more than 1.02 million cases had recovered. The lungs are the organ most affected by COVID-19 because the virus enters host cells via enzyme receptors. The virus also affects the gastrointestinal organs, indirectly affecting the small intestine [1]. It has been named coronavirus because it looks like a halo (corona) when viewed through a microscope. Figure 1 shows the compositional structure of the virion, with spikes forming a "crown" [2]. The virus mainly spreads between people during close contact, often through droplets from coughing or sneezing. People can also be infected by coming into contact with contaminated surfaces and then touching their faces. It is most contagious during the first three days, although symptoms of the disease may appear and spread in later stages. Chest CT imaging is a helpful technique for diagnosis. Figure 2 shows the common symptoms of coronavirus-infected people and the biological changes in the human body.
Fig. 1 Corona virus structure [2]
Fig. 2 Symptoms of COVID-19 [1]
A lung CT scan is helpful for diagnosing COVID-19 in people with high clinical suspicion of infection, but the test is not recommended for routine early checkups; it only works once the lungs have been affected by the virus. Crazy paving, subpleural dominance, and consolidation may appear as the disease evolves [1] (Fig. 3).
1.1 Related Works

Said Nadeem [3] reviewed the literature on COVID-19 and its symptoms. The paper reported the credentials available up to 20 March 2020 and explained the characteristic virus structure and its mechanism of infection.
Fig. 3 a Healthy lung, b COVID-19-affected lung
The development of treatment and vaccination approaches by different organizations and companies has yet to succeed, this report says. In response to the coronavirus pandemic, the Allen Institute has partnered with various leading research groups to prepare and distribute the Open Research Dataset [3]. Shi et al. [4] surveyed how medical imaging such as X-ray and computerized tomography (CT) plays a significant role against COVID-19, with emerging artificial intelligence (AI) techniques helping medical experts through image processing tools. Automated image acquisition can greatly help the scanning process and reorganize workflows with minimal contact with patients, also giving the best protection to the imaging technicians. The review covers the whole imaging and analysis pipeline, including chest image acquisition, segmentation, feature extraction, diagnosis, and follow-up (Fig. 4).
Fig. 4 a CT platform; b patient monitoring example; c positioning and scanning remotely by a technician [4]
Segmentation is a significant phase of image processing for COVID-19 assessment. It delineates regions of interest (ROIs), such as the lungs, lobes, vessels, and infected lesions, in chest X-rays or CT images. The segmented regions can be used to extract self-learned features for diagnosis and other related applications. CT imaging serves high-dimensional 3D images that are useful for diagnosing COVID-19, and deep learning models are the most widely used approach for segmenting ROIs in CT scans. The imaging method genuinely provides information about the patient's disease; combining the imaging information with clinical examinations gives better and more disciplined results [4]. Li et al. [5] provided a review of the ongoing coronavirus disease outbreak, observing that people use social media platforms to obtain and exchange a wide variety of information on a historic and unprecedented scale. It is therefore important to identify such information and understand how it is promoted on social media; the article fills this gap by using Weibo data to classify COVID-19-related information. Narin (see also Zhang; Chen; Zheng; Ying, 2020) [6-10] proposed systems based on either ResNet50 or U-Net. These serve as proofs of concept, demonstrating how residual connections boost the learning process by enhancing the flow of information through a deep network. As an approach to actually detecting the disease, however, the concept is not practical; it remains a simulation. Ghoshal, Wang and Xu (2020) [11-13] proposed CNN-based systems trained on various datasets or models for detecting coronavirus from lung X-ray or CT images, but the weakness of a CNN lies in the amount of data it needs: with too little data, CNNs may perform poorly, and because they have millions of parameters, they run into over-fitting problems unless a massive amount of data quenches their thirst.
2 Proposed Methodology The proposed system is based on embossing differential filtration and a Support Vector Machine (SVM). The system processes CT scan lung images and improves their contrast using histogram equalization, which adjusts brightness and skewness effectively. The contrast-enhanced image is then processed with embossing for better visibility of low-disparity regions. Embossing is also known as a directional difference filter: it enhances edges directionally with a convolution mask. The emboss filter performs the computation by convolving the filter mask over each pixel of the image, comparing each pixel with its neighbors and marking locations where a sharp or deep change in pixel value is detected. The four primary embossing filter masks are as follows:
$$
\begin{bmatrix} 0 & +1 & 0 \\ 0 & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix}
\quad
\begin{bmatrix} +1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}
\quad
\begin{bmatrix} 0 & 0 & 0 \\ +1 & 0 & -1 \\ 0 & 0 & 0 \end{bmatrix}
\quad
\begin{bmatrix} 0 & 0 & +1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix}
$$

Enlarged emboss filter mask for depth edge control over an image:

$$
\begin{bmatrix}
+1 & 0 & 0 & 0 & 0 \\
0 & +1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & -1
\end{bmatrix}
$$
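As an illustration of the directional difference filtering just described, the following minimal sketch applies the first (vertical) 3×3 mask above with SciPy; the rescaling step and the boundary mode are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from scipy.ndimage import convolve

# First 3x3 mask above: a vertical directional difference (emboss) filter.
EMBOSS_VERTICAL = np.array([[0, +1, 0],
                            [0,  0, 0],
                            [0, -1, 0]], dtype=float)

def emboss(image, mask=EMBOSS_VERTICAL):
    """Convolve a grayscale image with an emboss mask and rescale to 0-255."""
    out = convolve(image.astype(float), mask, mode="nearest")
    out -= out.min()                      # shift the signed edge response to >= 0
    if out.max() > 0:
        out *= 255.0 / out.max()
    return out.astype(np.uint8)
```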
Emboss can be applied horizontally, vertically, or as a composition of both [14] (Fig. 5). The visibility of impaired lung cells is enhanced after embossing the image. Once the image is prepared, SVM is applied for classification. SVM separates two different kinds of data into two particular shells by constructing a hyperplane, which can also be useful for regression. A good classification is achieved by the hyperplane that has the largest distance to the nearest training-data points, since in general a larger margin minimizes the generalization error of the classifier [15]. There are many possible hyperplanes that could divide the two sets of data points; the objective of the system is to find a hyperplane with maximum margin, i.e., the highest distance between data points of both classes. The dimension of the hyperplane depends on the number of features [16]. This is the uncomplicated way to divide the data when it is linearly separable; when there is no linear separation, it cannot be applied directly. That is the case for non-linear data classification, where distinct data points exist in lung CT scan images.
Fig. 5 Emboss filtered CT lung image
Fig. 6 Non-linear separation [17]
Calculating a linear decision function with the feature map φ(x) works for linearly separable data, but it is not effective for non-linearly separable data. In that case, the data must be passed through a kernel that maps the low-dimensional data to a higher-dimensional space (Fig. 6). The kernel trick is a technique for building a linear classifier that can classify non-linear data points: it maps non-linear data points to a higher-dimensional space where they become linearly separable. The kernel is the function that actually performs the conversion. There are various types of kernels, such as linear, polynomial, radial, and many more; an optimized kernel that best suits the data can be selected through cross-validation (Fig. 7). (A) SVM for Non-linearly Separable Data (Kernel Method) Step 1: Define data points x = (x_1, x_2, x_3, …, x_n)^T and y = (y_1, y_2, y_3, …, y_n)^T.
Fig. 7 Kernel trick classification [16]
Here x and y are two data points.
Step 2: Compute the higher-dimensional feature space:

$$\phi(x) = (x_1^2,\, x_1 x_2,\, x_1 x_3,\, x_2 x_1,\, x_2^2,\, x_2 x_3,\, x_3 x_1,\, x_3 x_2,\, x_3^2)^T$$
$$\phi(y) = (y_1^2,\, y_1 y_2,\, y_1 y_3,\, y_2 y_1,\, y_2^2,\, y_2 y_3,\, y_3 y_1,\, y_3 y_2,\, y_3^2)^T$$

Here k(x, y) is a kernel function:

$$k(x, y) = (x^T y)^2 = (x_1 y_1 + x_2 y_2 + x_3 y_3)^2 = \sum_{i,j=1}^{3} x_i x_j y_i y_j = \phi(x)^T \phi(y)$$

Step 3: Kernels are similarity functions between vectors of data points in an image,

$$K : X \times X \to \mathbb{R}, \qquad K(x, y) = \langle \phi(x), \phi(y) \rangle,$$

computed for higher-dimensional spaces.
Step 4: Let K be equivalent to φ, with n features and polynomial degree d:

$$k(x, y) = (x^T y + 1)^d$$

Step 5: Compute the kernel matrix to find the decision boundary; it is positive semi-definite:

$$c^T K c = \sum_{i} \sum_{j} c_i c_j k_{i,j} = \sum_{i} \sum_{j} c_i c_j \, \phi(x_i)^T \phi(x_j) = \left( \sum_{i} c_i \phi(x_i) \right)^T \left( \sum_{j} c_j \phi(x_j) \right) = \left\| \sum_{j} c_j \phi(x_j) \right\|^2 \geq 0$$

(A numerical check of the kernel identity from Step 2 is sketched below.)
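As the sanity check referred to above, the following snippet (with arbitrary example vectors) verifies numerically that the quadratic kernel computed in the input space equals the inner product of the explicit feature maps of Step 2.

```python
import numpy as np

def phi(v):
    """Explicit feature map for the homogeneous quadratic kernel in 3-D:
    phi(v) = (v1^2, v1*v2, v1*v3, v2*v1, v2^2, v2*v3, v3*v1, v3*v2, v3^2)."""
    return np.array([v[i] * v[j] for i in range(3) for j in range(3)])

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])

k_direct = (x @ y) ** 2        # (x^T y)^2, computed in the input space
k_feature = phi(x) @ phi(y)    # phi(x)^T phi(y), computed in feature space

assert np.isclose(k_direct, k_feature)   # the kernel trick: both agree
```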
Step 6: Plot the new data. Step 7: Predict the unknown. Step 8: Output. (B) Histogram Equalization Histogram equalization is an image processing technique used to adjust or enhance image contrast for better visibility. It works by effectively spreading out the most frequent intensity values. This method generally increases the global contrast
Fig. 8 8-bit grayscale image and gray levels [18]
when the usable data of the image is represented by close contrast values, allowing areas of lower local contrast to gain a higher contrast. For a grayscale image {x}, let n_i be the number of occurrences of gray level i; then

$$p_x(i) = p(x = i) = \frac{n_i}{n}, \qquad 0 \le i < L$$

where n is the total number of pixels and L the number of gray levels.
… > 0, prompt user to confirm if a face is detected.
8: Store user-classified data of faces and non-faces as training data.
9: Pad the image with zeros on all sides.
10: Train a neural network with the user-classified data.
11: Begin centroid iteration with original template size.
12: Increase template scale by 0.1% and search for face matches by positioning the template eye centers on each centroid.
13: If ANN output ≥ 0.9 for an eye template, a face is detected.
14: If centroid iteration completed, goto step 15, else goto step 12.
15: If template size ≥ 2.05× of original template size, goto step 16, else goto step 12.
16: Display results and errors. End program.
4 Results Initial attempts at classification were performed using only P_f, P_le, P_re and P_n, with the assumption that even for varying template scales, the difference between faces and non-faces would be evident. Surprisingly, the neural network was unable to classify the data even with a large number of hidden nodes and one hidden layer. Graphs (Figs. 7 and 8) revealed poor linear separability. Even with various combinations of P_f, P_le, P_re, P_twf, P_n and the Haar features, it was impossible to obtain linear separability.
4.1 Factors that Made a Difference The sum of Haar values of the cheek, nose bridge and nose combined with the area of white pixels matched on the face gave excellent separability from the non-face areas
Fig. 7 Linear separability trial 1
Fig. 8 Linear separability trial 2 (“bridge” is the nose bridge area)
588
N. K. Ipe and S. Chatterjee
Fig. 9 Linear separability achieved
(Fig. 9). This, combined with a check for areas where the percentage of black pixels in the left eye area was almost the same as in the right eye area (meaning that the pixels in both eyes were almost symmetric), gave another clear differentiator that helped distinguish faces from the background. Additionally, restricting the template scaling to a range within 5% of the maximum and minimum size of the expected face area helps improve linear separability. The neural network's confusion matrices for face detection are shown in Fig. 13. Table 1 lists the two most significant trials, with the true positives detected for each face (shown as "figurenumber.jpg"), followed by two numbers in square brackets, which come from Eq. 10: the first is the degree of distinction obtained in linear separability when the eye template's right eye is placed on the centroid being iterated, and the second is the degree of distinction when the left eye is placed on the centroid. If there is more than one value for an image, the highest value is selected. It was observed that when using a Matlab-based feed-forward neural network with default parameters, a greater number of nodes and hidden layers (Fig. 12) was better able to handle the complexity of linearly separating face and non-face data. Figures 10 and 11 depict the detected faces.
4.2 Plus Points There were runs with no false positives, and a 100% correct detection rate was achieved. In reality, given that the classification probability is lower for some images than for most, there is a chance of getting false positives; however, this could be eliminated by using the right neural network configuration and by tuning the learning-gain and momentum-gain parameters to help the network generalize better.
Table 1 Results of face detection

Trial 1 — Layers: hiddenNodes = 10, hiddenLayers = 5; Detection rate: 81.25%. Highest face detection accuracies for each image: 1.jpg [0.931700, 0.936453], 10.jpg [0.999450, 0.999433], 11.jpg [0.999343, 0.999350], 12.jpg [0.054377], 13.jpg [0.999378, 0.999445, 0.998247], 14.jpg [0.999518, 0.999591, 0.999500], 15.jpg [0], 16.jpg [0.999346, 0.998228, 0.999385], 2.jpg [0.05], 3.jpg [0.999397, 0.999311], 4.jpg [0.999349, 0.999313, 0.999248], 5.jpg [0.999247, 0.999624, 0.999523], 6.jpg [0.999386, 0.999260], 7.jpg [0.997756, 0.997540], 8.jpg [0.999515, 0.999477], 9.jpg [0.999653, 0.999649]

Trial 2 — Layers: hiddenNodes = 20, hiddenLayers = 5; Detection rate: 100%. Highest face detection accuracies for each image: 1.jpg [0.001875], 10.jpg [0.999335, 0.999400], 11.jpg [0.999756, 0.999755], 12.jpg [0.999548, 0.999514], 13.jpg [0.999488, 0.999566, 0.998948], 14.jpg [0.999529, 0.999619, 0.999582], 15.jpg [0.999698, 0.999715], 16.jpg [0.999617, 0.999628], 17.jpg [0.999688, 0.999714, 0.999640], 3.jpg [0.999752, 0.999724], 4.jpg [0.999688, 0.999650, 0.999438], 5.jpg [0.996937, 0.998362, 0.997799], 6.jpg [0.999444, 0.999390], 7.jpg [0.999654, 0.999614], 8.jpg [0.999584, 0.999434], 9.jpg [0.835035]
Fig. 10 Face detection trial 1
Fig. 11 Face detection trial 2
Fig. 12 Neural network used
Fig. 13 Confusion matrices
4.3 JEET's Efficiency Versus Viola-Jones The Viola-Jones method uses a window that begins at the size of the image and scales down with each iteration until it reaches a quarter of the image size, and each window needs to be tried with multiple Haar features of varying sizes. That is s window scalings over an m × n image with h Haar features, giving a complexity of O(s × m × n × h). JEET performs image thresholding and erosion once to pre-process the image and then does not need to convolve over all m × n pixels; it only needs to check the image at each centroid coordinate. In a small image with one face, this might be as low as 20 centroid iterations, far fewer than the 427 × 568 = 242,536 iterations (based on the number of rows and columns of the image) that Viola-Jones would perform for the same image. This is a huge reduction in complexity and computation: O(s × c × h), where c is the number of centroid points.
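A back-of-the-envelope sketch of this comparison is given below; the centroid count of 20 is the text's own illustrative figure, and the per-scale and per-feature factors are assumed to multiply both methods equally.

```python
# Rough iteration-count comparison for the example image above.
rows, cols = 427, 568        # image dimensions from the text
centroids = 20               # illustrative centroid count for one small face image

viola_jones_positions = rows * cols   # pixel positions scanned per scale/feature
jeet_positions = centroids            # centroid positions checked per scale/feature

print(viola_jones_positions)                   # 242536
print(viola_jones_positions / jeet_positions)  # ~1.2e4-fold reduction
```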
5 Conclusion Human face detection was successfully carried out with 100% accuracy in a controlled environment, using morphology and Otsu thresholding for pre-processing, template matching, Haar features for face detection with a neural network, and an eye-position template method to confirm detection of a face. Although there were trials during which false positives were detected and trials during which some faces were missed, only the best results are presented in Sect. 4, since the false positives can be avoided by finding the best network parameters and number of nodes via trial and error. The Haar JEET algorithm reduces iteration computation by an order of 10^4 (for the dataset currently chosen). Moreover, it was noted that the face template created using any dataset could be used to detect faces of the same pose in other images too. As a further improvement, the JEET method can be supplemented with a separate neural network layer in the architecture to identify the right level of Otsu thresholding required for creating centroids. Other ways to improve the algorithm include using the centroids to place the face templates at relevant areas but utilizing the actual image pixels instead of the thresholded pixels to perform Haar JEET score calculations. Also, using a radial basis function for the neural network could improve results, since the current sigmoid functions are not sufficient to classify inputs that are difficult to linearly separate.
References 1. BiPiA: 293 best celebrity caricatures images in 2020: celebrity caricatures, funny caricatures, caricature (2020). https://in.pinterest.com/bipia6886/celebrity-caricatures 2. Gilad-Gutnick S, Harmatz ES, Tsourides K, Yovel G, Sinha P (2018) Recognizing facial slivers. J Cogn Neurosci 30(7):951–962
3. Gibson C (2019) 163 best pareidolia images: everyday objects, funny photos, things with faces. https://www.pinterest.com/catmclamb/pareidolia/ 4. Yang MH, Kriegman DJ, Ahuja N (2002) Detecting faces in images: a survey. IEEE Trans Pattern Anal Mach Intell 24(1):34–58 5. Lu Y, Zhou J, Yu S (2012) A survey of face detection, extraction and recognition. Comput Inform 22(2):163–195 6. Jones M, Viola P (2003) Fast multi-view face detection. Mitsubishi Electric Research Lab TR-20003-96 3(14), 2 7. Vansh V, Chandrasekhar K, Anil C, Sahu SS (2020) Improved face detection using YCbCr and Adaboost. In: Computational intelligence in data mining. Springer, pp 689–699 8. Ganakwar DG, Kadam VK (2019) Face detection using boosted cascade of simple feature. In: 2019 international conference on recent advances in energy-efficient computing and communication (ICRAECC). IEEE, pp 1–5 9. Thai TT, Nguyen DT (2019) Face detection method in surveillance systems using haar feature and deep neural network. In: 2019 6th NAFOSTED conference on information and computer science (NICS). IEEE, pp 434–438 10. Oztopuz A, Karasulu B (2019) The semi-automatic approach to extract the features of human facial region. In: 2019 3rd international symposium on multidisciplinary studies and innovative technologies (ISMSIT). IEEE, pp 1–6 11. Han CC, Liao HYM, Yu GJ, Chen LH (1997) Fast face detection via morphology-based preprocessing. In: International conference on image analysis and processing. Springer, pp 469– 476 12. Tiwari R, Shukla A, Prakash C, Sharma D, Kumar R, Sharma S (2009) Face recognition using morphological method. In: 2009 IEEE international advance computing conference. IEEE, pp 529–534 13. Yuille AL (1991) Deformable templates for face recognition. J Cogn Neurosci 3(1):59–70 14. Jones P, Viola P, Jones M: Rapid object detection using a boosted cascade of simple features. In: University of Rochester. Charles Rich, Citeseer (2001) 15. VisMod: UCSD computer vision. http://vision.ucsd.edu/content/yale-face-database 16. Marszalec EA, Martinkauppi JB, Soriano MN, Pietikaeinen M (2000) Physics-based face database for color research. J Electron Imaging 9(1):32–39 17. Lyons MJ, Akamatsu S, Kamachi M, Gyoba J, Budynek J (1998) The Japanese Female Facial Expression (JAFFE) database. In: Proceedings of third international conference on automatic face and gesture recognition, pp 14–16
Chapter 50
Accurate Detection of Breast Cancer Using GLCM and LBP Features with ANN via Mammography Ashutosh Kumar Singh, Rakesh Narvey, and Vishal Chaudhary
1 Introduction Breast cancer is one of the major categories of cancer, causing a huge number of deaths among women around the globe; after crossing the age of 60 years, the risk of developing it is very high [1]. Nowadays, medical imaging plays a major role in computing fields and can help doctors with diagnosis and medical evaluation [2]. Medical images are considered a main source of medical data, as they can provide rich pathological information [3]. Currently, mammography is an effective imaging technique mainly used for the early diagnosis of breast cancer. It uses a low-dose X-ray to look at the internal tissues and parts of the breast; the images generated by this process are known as mammograms [4]. Different sample mammograms are shown in Fig. 1, where Fig. 1a, b shows healthy mammograms and Fig. 1c, d shows abnormal mammograms [1]. CAD tools can be used to improve the diagnostic detail extracted from these mammograms. CAD is an automatic system that can assist pathologists and doctors in their clinical decisions [5]. The design of a CAD system consists of various steps, i.e., low-level pre-processing, segmentation, feature extraction, feature selection, and classification [6, 7]. In the low-level pre-processing phase, artifacts, labels, and noise are removed from the mammograms. Segmentation is the process of finding a region of interest (ROI). In feature extraction, features are extracted from the mammograms in the form of vectors. After feature extraction, classifier modeling A. K. Singh (B) · R. Narvey · V. Chaudhary Department of Electrical and Engineering, MITS, Gwalior 474005, India e-mail: [email protected] R. Narvey e-mail: [email protected] V. Chaudhary e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 S. Agrawal et al. (eds.), Machine Intelligence and Smart Systems, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4893-6_50
Fig. 1 Sample mammogram patches
and validation of this classifier are done so that newly arriving mammograms or test samples can be classified properly. In this paper, we have designed and implemented an effective CAD system using LBP and GLCM features with an ANN. These features are combined with equal weight, which reduces the effect of the semantic gap and enhances detection performance. The rest of the paper is divided into three sections: Sect. 2 presents the proposed methods and model, results analyses are given in Sect. 3, and finally, Sect. 4 concludes the findings of this paper.
2 Methods and Model The CAD system proposed in this work consists of three basic phases. The first phase is pre-processing, in which cropping and enhancement operations are applied to each mammogram. The second phase is feature extraction, in which GLCM and LBP features are extracted; finally, a trained ANN is used for the detection of cancers in the mammograms. The basic model of our proposed CAD system is shown in Fig. 2.
Fig. 2 Proposed frameworks for the detection of cancer
2.1 Mammogram Cropping and Enhancement The first step in the design of a CAD tool is cropping regions of interest (ROI) to remove artifacts such as scratches, pectoral muscles, and tagged labels, because these artifacts may mislead feature extraction and result in poor classifier performance. Hence, a cropping operation is applied to the mammograms to cut off the unwanted portions of the images. Further, contrast limited adaptive histogram equalization (CLAHE) is applied to the ROI to create a uniform, better-quality image. CLAHE is well suited to improving the local contrast of mammograms by redistributing the lightness values of the mammogram [8]. Figure 3 shows the outcome
Fig. 3 Cropping and filtering of a sample mammogram
results of the image cropping and enhancement step.
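A rough sketch of this pre-processing step is shown below using OpenCV; the ROI bounds, clip limit, and tile size are illustrative assumptions rather than the paper's actual settings.

```python
import cv2

def crop_and_enhance(path, roi):
    """Crop a rectangular ROI from a mammogram and apply CLAHE to it."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    r0, r1, c0, c1 = roi                      # bounds chosen to exclude labels etc.
    patch = img[r0:r1, c0:c1]
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(patch)
```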
2.2 Feature Extraction This section introduces the LBP and GLCM features. LBP is a gray-level, computationally lightweight feature that captures the strong texture characteristics of a mammogram [1, 9]. LBP features are mathematically defined as:

$$\mathrm{LBP}_{P,R} = \sum_{i=0}^{P-1} 2^i \, S(G_i - G_c), \qquad S(x) = \begin{cases} 0 & x < 0 \\ 1 & \text{otherwise} \end{cases} \tag{1}$$
The grayscale values of the center pixel and the neighborhood pixels are denoted G_c and G_i, and P and R are the neighborhood size and radius. GLCM features are likewise used to capture strong texture characteristics, defined as [5, 10]:

$$C_{x,y}(i, j) = \sum_{p=1}^{N} \sum_{q=1}^{M} \begin{cases} 1 & \text{if } I(p, q) = i \text{ and } I(p + x, q + y) = j \\ 0 & \text{otherwise} \end{cases} \tag{3}$$
where I is the input mammogram of dimension N × M, (x, y) are the offsets, and (p, q) are spatial positions in I.
Contrast: Defined in Eq. (4), it indicates the change in intensity between a pixel and its neighbors in the mammogram.

$$\text{Contrast} = \sum_{i,j} |i - j|^2 \, p(i, j) \tag{4}$$

Energy: Defined in Eq. (5), it shows the uniformity of texture in the mammogram.

$$\text{Energy} = \sum_{i,j} p(i, j)^2 \tag{5}$$

Homogeneity: Defined in Eq. (6), it shows the uniformity of pixels in the mammogram.

$$\text{Homogeneity} = \sum_{i,j} \frac{p(i, j)}{1 + |i - j|} \tag{6}$$
Table 1 Tuning parameters

Tuning parameter | Tuned value
No. of hidden layers | 15
Rate of learning | 0.06
Iterations (number) | 1000
Goal of training parameter | 10^-9
Correlation: Defined in Eq. (7), it gives the correlation of neighboring pixels in the mammogram.

$$\text{Correlation} = \sum_{i,j} \frac{(i - \mu_i)(j - \mu_j) \, p(i, j)}{\sigma_i \sigma_j} \tag{7}$$
After the extraction of the LBP and GLCM features from each mammogram, we concatenate both feature sets with the same weight:

$$\text{Hybrid features (HF)} = \text{LBP} \cup \text{GLCM} \tag{8}$$
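A minimal sketch of extracting such a hybrid feature vector with scikit-image (≥0.19 naming) is given below; the LBP parameters (P = 8, R = 1), the GLCM offset, and the use of an LBP histogram are illustrative choices, since the paper does not spell them out. The input patch is assumed to be an 8-bit grayscale image.

```python
import numpy as np
from skimage.feature import local_binary_pattern, graycomatrix, graycoprops

def hybrid_features(patch, P=8, R=1):
    """Concatenate an LBP histogram with four GLCM statistics, cf. Eq. (8)."""
    # LBP, cf. Eq. (1): histogram of uniform patterns over the patch
    lbp = local_binary_pattern(patch, P, R, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)

    # GLCM, cf. Eq. (3), at offset (distance 1, angle 0), then Eqs. (4)-(7)
    glcm = graycomatrix(patch, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    glcm_feats = [graycoprops(glcm, prop)[0, 0]
                  for prop in ("contrast", "energy", "homogeneity", "correlation")]

    return np.concatenate([lbp_hist, glcm_feats])   # equal-weight concatenation
```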
2.3 Classification Cum Detection For the detection of breast cancer, a feed-forward neural network based on the backpropagation algorithm is used. Backpropagation learns by iteratively taking the extracted hybrid features (HF) with their known classes and comparing the network's prediction for each sample with the actual known class value. In the training phase, the weights of the neurons are repeatedly changed to minimize the mean square error (MSE) between the network prediction and the actual class values [11]. Training of the feed-forward neural network stops when the MSE becomes negligible. In an artificial neural network classifier, various tuning parameters are used for training the model: the number of hidden neurons, learning rate, momentum factor, number of iterations, and training parameter goal. The values of these tuning parameters for the proposed work are given in Table 1.
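The following sketch maps the Table 1 settings onto a scikit-learn feed-forward network, assuming that "No. of hidden layers = 15" denotes a single hidden layer of 15 neurons and that a typical momentum value is used; the paper's MATLAB implementation may differ.

```python
from sklearn.neural_network import MLPClassifier

# Tuning values from Table 1, interpreted as one hidden layer of 15 neurons.
ann = MLPClassifier(hidden_layer_sizes=(15,),
                    learning_rate_init=0.06,   # rate of learning
                    max_iter=1000,             # iterations
                    tol=1e-9,                  # training-parameter goal
                    momentum=0.9,              # momentum factor (assumed value)
                    solver="sgd")

# X: hybrid feature vectors (HF); y: normal/abnormal labels
# ann.fit(X_train, y_train); predictions = ann.predict(X_test)
```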
3 Results Analysis In this paper, we have evaluated our results on the MIAS dataset [4]. This dataset consists of 115 mammograms from the abnormal class and 207 mammograms from the normal class.
However, some of the mammograms have more than one abnormality; therefore, after cropping the ROIs, we obtained a total of 327 mammogram patches. For the analysis of the ANN results, a confusion matrix is drawn whose parameters are shown in Table 2, where true positive (TP), true negative (TN), false positive (FP), and false negative (FN) are described in the form of a 2 × 2 matrix. Using this confusion matrix, the other performance measures can be defined as:

$$\text{Precision} = \frac{TP}{TP + FP} \tag{8}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{9}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{10}$$

$$\text{FP Rate} = \frac{FP}{FP + TN} \tag{11}$$

$$\text{Specificity} = \frac{TN}{FP + TN} \tag{12}$$
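For reference, Eqs. (8)–(12) translate directly into the following helper; the counts are assumed to come from the confusion matrix of Table 2.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Performance measures of Eqs. (8)-(12) from confusion-matrix counts."""
    return {
        "precision":   tp / (tp + fp),                    # Eq. (8)
        "recall":      tp / (tp + fn),                    # Eq. (9)
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),   # Eq. (10)
        "fp_rate":     fp / (fp + tn),                    # Eq. (11)
        "specificity": tn / (fp + tn),                    # Eq. (12)
    }
```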
Figure 4 shows the intrinsic working steps for the training, testing, and validation of the model; it reflects that the learning of the model stops after 6 epochs. During the training of the ANN, output plots are drawn with respect to the target, as shown in Fig. 5. These plots are used internally for training, testing, and validation. From this figure, it is clear that the model achieved an output of 95.05% for training, 95.22% for validation, and 97.63% for testing, along with an overall output performance of 95.47%. Figure 6 shows the detection performance in the form of confusion matrices for the training, testing, and validation of the proposed model; the bottom-right matrix reflects the overall classification performance of the ANN. From this figure, it is evident that the proposed detection performance is quite encouraging: out of 327 samples, 311 are correctly detected by our proposed work. The overall accuracy, precision, recall, and specificity of this approach are 95.1%, 97.6%, 94.8%, and 95.6%, respectively.

Table 2 Confusion matrix (rows: actual class; columns: predicted class)
Positive — TP (no. of positive cases that are also predicted as positive) | FN (no. of positive cases that are predicted as negative)
Negative — FP (no. of negative cases that are predicted as positive) | TN (no. of negative cases that are also predicted as negative)
Fig. 4 MSE versus no. of epoch
Further, the detection accuracy of this approach is compared with many existing state-of-the-art methods in Table 3, where it can be seen that the proposed framework performs significantly better than the other state-of-the-art methods.
4 Conclusions This paper introduced an effective approach for the detection of breast cancer using hybrid texture features with an ANN. For the pre-processing of mammograms, artifacts are removed and CLAHE is applied for further enhancement. We then extract LBP and GLCM features and feed them to the ANN for training, testing, and validation. Finally, the experimental analysis shows that the diagnostic performance of this approach for detecting breast cancer in mammograms is significantly more encouraging than that of other state-of-the-art methods.
Fig. 5 Output versus target plot
Fig. 6 Confusion matrixes (performance measures)
Table 3 Comparative analysis (used database: MIAS for all works)

• Proposed work — GLCM + LBP with ANN: 95.4% accuracy, 94.8% recall, 97.6% precision, 95.6% specificity
• Singh et al. [12] — selected GLCM features with random forests: maximum achieved accuracy 93.90%
• Pratiwi et al. [13] — GLCM-based texture feature extraction from different orientations + radial basis function neural network (RBFNN) classifier: highest achieved accuracy, sensitivity, and specificity of 93.98%, 94.44%, and 93.62%, respectively
• Tzikopoulos et al. [14] — combination of fractal texture features and statistical features + support vector machine classifier: accuracy = 84.47–85.3%
• Buciu et al. [15] — Gabor wavelets and principal component analysis (PCA) + support vector machine classifier: AUC = 0.79, sensitivity = 97.56%
• Prathibha et al. [16] — DWT-based texture features + nearest neighbor classifier: AUC = 0.95
• Liu et al. [17] — multiresolution analysis, wavelet and statistical features + binary tree classifier: accuracy = 84.2%
• Subashini et al. [18] — statistical features + support vector machine classifier: accuracy = 86.67%
• Wang et al. [19] — histogram features + backpropagation neural network classifier: accuracy = 71%
• Oliver et al. [20] — textural and morphological features + sequential feature selection + k-NN classifier: specificity = 91%
• Mutaz et al. [21] — GLCM features + ANN classifier: 74%, sensitivity = 91.6%, specificity = 84.17%
• Dhahbi et al. [22] — discrete curvelet transform + t-test ranking for feature selection + k-NN classifier: accuracy = 91.27%
• Jona et al. [23] — GLCM features + SVM classifier: accuracy = 94.0%
• Görgel et al. [24] — DWT features + SVM classifier: accuracy rate = 84.8%
References 1. Singh VP, Srivastava S, Srivastava R (2017) Effective mammogram classification based on center symmetric-LBP features in wavelet domain using random forests. Technol Health Care 25(4):709–727 2. Singh VP, Srivastava R (2018) Automated and effective content-based mammogram retrieval using wavelet based CS-LBP feature and self-organizing map. Biocybern Biomed Eng 38(1):90–105 3. Freer TW, Ulissey MJ (2001) Screening mammography with computer-aided detection: Prospective study of 12,860 patients in a community breast center 1. Radiology 220(3):781–786. https://doi.org/10.1148/radiol.2203001282 4. Suckling J, Parker J, Dance D, Astley S, Hutt I, Boggis C et al (1994) The mammographic image analysis society digital mammogram database. Excerpta Medica Int Congr Ser 1069:375–378 5. Singh VP, Srivastava S, Srivastava R (2018) Automated and effective content-based image retrieval for digital mammography. J X-ray Sci Technol 26(1):29–49 6. Singh VP, Srivastava R (2017) Content-based mammogram retrieval using wavelet based complete-LBP and K-means clustering for the diagnosis of breast cancer. Int J Hybrid Intell Syst 14(1–2):31–39 7. Kulshreshtha D, Singh VP, Shrivastava A, Chaudhary A, Srivastava R (2017) Content-based mammogram retrieval using k-means clustering and local binary pattern. In: 2017 2nd International conference on image, vision and computing (ICIVC), pp 634–638. IEEE 8. Singh VP, Gupta A, Singh S, Srivastava R (2015) An efficient content based image retrieval for normal and abnormal mammograms. In: 2015 IEEE UP section conference on Electrical Computer and Electronics (UPCON), pp 1–6. IEEE. https://doi.org/10.1109/upcon.2015.7456733 9. Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987. https://doi.org/10.1109/TPAMI.2002.1017623 10. Tesař L, Shimizu A, Smutek D, Kobatake H, Nawano S (2008) Medical image analysis of 3D CT images based on extension of Haralick texture features. Comput Med Imaging Graph 32(6):513–520 11. Han J, Pei J, Kamber M (2011) Data mining: concepts and techniques. Elsevier 12. Singh VP, Srivastava A, Kulshreshtha D, Chaudhary A, Srivastava R (2016) Mammogram classification using selected GLCM features and random forest classifier. Int J Comput Sci Inf Secur 14(6):82 13. Pratiwi M, Harefa J, Nanda S (2015) Mammogram's classification using gray-level co-occurrence matrix and radial basis function neural network. Procedia Comput Sci 1(59):83–91. https://doi.org/10.1016/j.procs.2015.07.340
14. Tzikopoulos SD, Mavroforakis ME, Georgiou HV, Dimitropoulos N, Theodoridis S (2011) A fully automated scheme for mammographic segmentation and classification based on breast density and asymmetry. Comput Methods Progr Biomed 102(1):47–63. http://doi.org/10.1016/j.cmpb.2010.11.016 15. Buciu I, Gacsadi A (2011) Directional features for automatic tumor classification of mammogram images. Biomed Signal Process Control 6(4):370–378. https://doi.org/10.1016/j.bspc.2010.10.003 16. Prathibha B, Sadasivam V (2010) Breast tissue characterization using variants of nearest neighbor classifier in multi texture domain. IE (I) J 91:7–13 17. Liu S, Babbs CF, Delp EJ (2001) Multiresolution detection of spiculated lesions in digital mammograms. IEEE Trans Image Process 10(6):874–884. https://doi.org/10.1109/83.923284 18. Subashini TS, Ramalingam V, Palanivel S (2010) Automated assessment of breast tissue density in digital mammograms. Comput Vis Image Underst 114(1):33–43. https://doi.org/10.1016/j.cviu.2009.09.009 19. Wang XH, Good WF, Chapman BE, Chang YH, Poller WR, Chang TS, Hardesty LA (2003) Automated assessment of the composition of breast tissue revealed on tissue-thickness-corrected mammography. Am J Roentgenol 180(1):257–262 20. Oliver A, Freixenet J, Marti R, Pont J, Pérez E, Denton ER, Zwiggelaar R (2008) A novel breast tissue density classification methodology. IEEE Trans Inf Technol Biomed 12(1):55–65. https://doi.org/10.1109/TITB.2007.903514 21. Al Mutaz MA, Dress S, Zaki N (2011) Detection of masses in digital mammogram using second order statistics and artificial neural network. Int J Comput Sci Inf Technol (IJCSIT) 3(3):176–186. https://doi.org/10.5121/ijcsit.2011.3312 22. Dhahbi S, Barhoumi W, Zagrouba E (2015) Breast cancer diagnosis in digitized mammograms using curvelet moments. Comput Biol Med 1(64):79–90. https://doi.org/10.1016/j.compbiomed.2015.06.012 23. Jona J, Nagaveni N (2012) A hybrid swarm optimization approach for feature set reduction in digital mammograms. WSEAS Trans Inf Sci Appl 9:340–349 24. Görgel P, Sertbaş A, Kilic N, Osman N, Osman O (2012) Mammographic mass classification using wavelet based support vector machine. Methods. 10:11
Chapter 51
AI-Based Enabled Performances Measurements for MOOCs Atulkumar Gupta and Surekha Dholay
1 Introduction This project was taken up because I, along with some other students, had taken NPTEL MOOC courses a few months back. The course was current and relevant, but it failed to capture our complete attention. The course was compulsory, and we completed it successfully, but it showed us how much work remains to improve such courses. The class did not look into the psychology of a student, nor did it take into account the sociological factors affecting them [2]. Various research papers have delved into the reasons for the higher drop rates and low completion rates of MOOC courses. These reasons are studied, and other authors have proposed prediction models for the trends in drops and low completion rates. In our system, we try to minimize the reasons mentioned earlier not only by using Artificial Intelligence methodologies but also by bringing psychology and sociology into it [18]. MOOCs change the ways of learning and teaching in web-based education: a learner can learn anything from anywhere at any time. As a result, the number of MOOC learners grows day by day. But every learner has a different level of knowledge and different abilities, and providing one standard learning path for every learner is not a practical approach: it affects the interest of learners and their intention to complete the course, which is one of the primary reasons for increasing dropout rates. We design a model that personalizes the learning path of learners dynamically according to the learner's knowledge level and connectivity through the courses. The model also analyzes each learner's week-wise performance and, according to this performance, the learning path suggests to the learner what to do next. Learners get help and can clear doubts about lectures: we create a discussion
forum among all learners and the course instructor, which allows learners to raise their queries. Using such approaches, we try to design interactive courses in which two-way communication is possible. We use a Fuzzy Cognitive Map to model a student's knowledge-level domain because it allows rules to be defined over the data, whereas other algorithms must generate rules from the data. The primary motivation of this research is the need for a personalized model that can suggest a learning path for learners based on their learning style. Learners should be able to understand courses in their own way, and they get features like doubt-clearing sessions and course paths matched to their knowledge level. This research is motivated by the following: • Currently, many e-learning and MOOC courses have dropout-rate and completion-rate problems: the dropout rate increases and the completion rate goes down. • Most MOOC systems do not have personalized learning; they follow a traditional, one-way-communication approach. • They do not consider learner preferences and learning styles, which causes a lack of interest in the learner. The sections of our paper are as follows: Sect. 2 gives a brief explanation of previous work done by other researchers and their models. We explain our design and proposed model in Sect. 3. In Sect. 4, the implementation and results of our module are illustrated. We conclude our research in Sect. 5.
2 Literature Survey In recent years, various research papers have delved into the reasons for the higher drop rates and low completion rates of MOOC courses. These reasons are studied, and other authors have proposed prediction models for the trends in drops and low completion rates. The different reasons and proposed models are discussed below.
2.1 Reasons for Dropout No genuine intention to finish the course: Various researchers noticed that many candidates enrolled only to find out about MOOCs rather than to become familiar with the subject itself [6, 11, 13]. The absence of prerequisites and open entry encourage casual enrollment. Because of this, many enrollments come from people who do not intend to participate fully, and some people enroll to gather information about how a MOOC works in order to produce their own courses [11]. Lack of time: In MOOCs, learners are students or employees. Some students fully plan to finish the course but do not manage to do so, because they cannot dedicate the basic amount of time to study even when they have a strong aim and motivation
to do that; sometimes the workload stands as a barrier in between. The reason is a lack of time management: the study material suits some learners, while others take more or less time to adapt to it. Students face problems like a lack of support at the moment they need it. Employees cannot give sufficient time to complete the course due to their job and other activities [3]. Absence of advanced expertise or learning ability: For the most part, MOOCs (online learning) require users who are capable of making reasonable and informed choices on their own behalf and who are able to work with the technology platform used. Even people who are well updated and acquainted with various kinds of technology may face trouble while using a new framework or arrangement. Confusion and dissatisfaction cause high dropout rates in MOOCs [3].
2.2 Related Work Much work has been done on web-based learning systems. Many researchers have proposed different adaptive models using various AI approaches, but much less work has been done for MOOCs in terms of adaptation and intelligent systems. A number of the relevant works are presented below. One researcher proposes an intelligent system based on a macro-adaptive approach. It permits adapting to the learning styles and attributes of a student; on the basis of these data, the system statically determines the weaknesses and needs of the learner. The limitation is that it requires a preprocessed data set and works only statically [1]. Another work created a framework to adapt the learning format and learning-process sequences, called WELSA (Web-based Education with Learning Style Adaptation). This methodology depends on both dynamic and static qualities, including learning styles, student activities, history, and interaction with the framework. It also offers services such as sharing and discussion forums [7]. Another proposed an adaptive system that recommends courses to the learner according to their area of interest and the most-enrolled courses on websites; basically, it uses the recommendation techniques that e-commerce websites use to recommend products to customers [19]. Another created a methodology based on a student's knowledge-level model by forming heterogeneous groups of students. At the beginning of a course, it administers examinations that assist in determining a student's knowledge and skills; as a result, the system generates the learning path for that student [12]. Another presented the Felder-Silverman Learning Style Model, which reflects the emotional and social learning of each learner as a new dimension. It benefits the learning style of the student and also gives suggestions on course structure and material efficiency [9].
Researchers have proposed that graph techniques are most often used for knowledge representation; in artificial intelligence terms, such a graph is also known as a semantic network, a graphical representation of concepts and knowledge [6]. In [16], conceptual graphs were introduced as a family of semantic networks, with applications to automated reasoning, software engineering, and cognitive science [20]. In [22], it is explained how a conceptual graph, as a knowledge representation tool, helps the learning process. In [8], an intelligent tutoring model designed on the basis of a conceptual graph was proposed.
3 Proposed Solution Higher dropout rates and lower completion rates are a serious problem for today's world. These factors directly affect the results of MOOC and e-learning courses, which portrays a wrong impression to learners. Presently, before enrolling in any course, people check feedback, reviews, comments, and statistical figures such as the number of students enrolled and how many students completed the course; learners consider such parameters before joining. Learners also face problems finding the desired courses and resources on MOOCs. To overcome these problems, I propose a model that calculates the performance of learners from different parameters using an AI approach. The model has three modules: an E-learning Website (online course), an FCM (Fuzzy Cognitive Map) graph, and an FIS (Fuzzy Inference System). On the e-learning website, we created a course to evaluate the learner parameters. The FCM graph is used to calculate the weight matrix among all the concepts of a course and for knowledge representation of the learner, and the FIS predicts the knowledge level of the learner by applying rules to the data set obtained dynamically from our ongoing courses. Our system calculates the knowledge level of the learner week-wise and, according to this knowledge level, predicts which concept the user has to learn before starting the next. Figure 1 shows the workflow diagram of the proposed model, and every module of that model is described here: Web Application: The web application module provides an interface to learners and instructors. There are two roles: learner and instructor. Learners can enrol in courses they would like to take. Instructors can design courses for learners, and course mapping is one of their most important jobs. A course design contains study materials, videos, presentations, quizzes, and assignments. The web application provides a dynamic data set and, for each learner, the module-wise knowledge level, the number of days taken to complete the course, and the number of instructor-given assignments solved. Fuzzy Cognitive Map: Integrating fuzzy logic with cognitive mapping yields Fuzzy Cognitive Mapping (FCM). Fuzzy logic is used to give truth values for events whose answer is not simply 0 or 1: it gives values
Fig. 1 Workflow diagram of the proposed model
between [0, 1], i.e., it gives the possibility of an event occurring. Cognitive mapping, on the other hand, is a directed graph between concepts that are related to each other. So an FCM is a networked graph in which concepts are mapped according to their dependencies, and the weight between two concepts is just the fuzzy-logic value for those two concepts. FCM is used to represent the knowledge of the system. FCMs were introduced by Kosko, and since then they have gradually emerged as a powerful paradigm for knowledge representation. FCMs are well-suited causal-cognition instruments for modeling and simulating dynamic systems. An FCM is a directed graph with signed values: the connection between two concepts carries an edge weight. Say we have two concepts C_i and C_j, and the directed edge between them is W_ij; this edge W_ij represents the strength of the causal relation between the two concepts [14, 17, 23]. According to the sign of the weight W_ij, there are three possible causal relationships between concept C_i and concept C_j: • If W_ij > 0, then C_i and C_j have positive causality: if the value of concept C_i increases/decreases, the value of concept C_j also increases/decreases. • If W_ij < 0, then C_i and C_j have negative causality: if the value of C_i increases, the value of C_j decreases, and if the value of C_i decreases, the value of C_j increases. • If W_ij = 0, then C_i and C_j have zero causality, meaning there is no relation between them.
Kosko's Inference Rule, where:
• A_i is the value of concept C_i in the FCM graph,
• A_i(k + 1) is the value of concept C_i at simulation step k + 1,
• A_j(k) is the value of concept C_j at simulation step k,
• W_ij is the edge weight between concept C_i and concept C_j:

$$A_i(k + 1) = f\left( \sum_{j=1,\, j \neq i}^{N} W_{ij} \, A_j(k) \right)$$
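A minimal NumPy sketch of this update rule is shown below; the sigmoid threshold function and its steepness parameter are common choices assumed here, not values specified in the paper.

```python
import numpy as np

def sigmoid(x, lam=1.0):
    """Squashing function f; lam controls steepness (assumed value)."""
    return 1.0 / (1.0 + np.exp(-lam * x))

def kosko_step(A, W, f=sigmoid):
    """One step of Kosko's rule: A_i(k+1) = f(sum_{j != i} W_ij * A_j(k)).
    W[i, j] holds the weight W_ij; the diagonal is zeroed to enforce j != i."""
    W = W - np.diag(np.diag(W))
    return f(W @ A)

# Iterate until the concept values stabilize (simple fixed-point loop):
# A = initial_activation_vector
# for _ in range(20):
#     A = kosko_step(A, W)
```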
Fuzzy Inference System: To represent a learner's performance, four fuzzy sets are defined, so that performance is represented in a four-tuple format: (Un, Uk, K, L). The four parameters are explained below: • Unknown (Un): If the learner's performance level is calculated to be between 0 and 60%, the learner does not know the concept well. • Insufficiently Known (Uk): If the learner's performance level is between 55 and 75%, the learner knows the concept, but not very well. • Known (K): If the learner's performance level is between 70 and 90%, the learner knows the concept well but has not fully learned it. • Learned (L): If the learner's performance is between 85 and 100%, the learner knows the concept very well and has learned it.
4 Implementation and Experimental Results This section discusses the end-user interface, the inference rules for the weight-matrix calculation, and the fuzzy inference rules. We test our model on a dynamic real data set generated by our ongoing course.
4.1 Implementation Web Application: In this module, we created one course, called "Machine Learning with Python". It is an e-learning course through which learners can study the subject, and the application supports interaction with learners. The web application provides all services for learners as well as instructors. The website is developed using Laravel on the back end and HTML and CSS on the front end. We use a third-party domain; the website is shown in Fig. 2, and the link is: https://atul-s-school-5f24.thinkific.com/courses/MLwithPy
Fig. 2 Home page website
Learner-Side Interface: From the home page of the website, clicking on the "enroll free" button redirects to the enrollment page for the course, where learners can enrol. After enrolling in the course, the learner's dashboard opens. The user dashboard shows the course structure and a progress bar, and the user can start the course with the start-course button. Every weekly module is locked when the user starts the course: to unlock the week-1 module, he/she has to take a prerequisite quiz, and on this basis course modules unlock. After completing the first week's module, the learner moves to the next module if he/she scored passing marks. Progress and path are shown on the user dashboard. Instructor-Side Interface: Instructors are shown a course management page, where they create all courses for the learners. Course management offers many services, listed below: Create Course: This service is used for designing new courses; instructors can add lessons or chapters, a syllabus, quizzes, assignments, survey forms, study material, lecture videos, and presentations, and can also apply prerequisite conditions to every chapter. Advanced Reporting: Using this service, the instructor can access the data of students enrolled in the course, check the progress of every student, and graphically view student engagement. Through such advanced reporting, instructors can also track the completion and dropout rates for particular courses. FCM Graph for the Machine Learning with Python Course: The steps to create the FCM graph are as follows: • Step 1: Design the dependency matrix, i.e., the activation vector among all concepts, as shown in Fig. 3. • Step 2: Calculate the weight matrix for that activation vector.
Fig. 3 Activation vector
Fig. 4 FCM graph for course
• Step 3: Using Kosko's inference rule, we obtain the value A_i of every concept. Finally, we obtain our FCM graph; in that graph, the edge weights are used for cross-validation of the knowledge level derived from the FIS, as shown in Fig. 4. Implementation of the FIS: The MATLAB Fuzzy Logic Toolbox is used to design the Fuzzy Inference System and evaluate the performance of learners. The following three input variables are the evaluation criteria, and performance is predicted on the range [0–10].
Fig. 5 Fuzzy inference system
Inputs: • Knowledge Level per Concept: each learner's knowledge level for every concept, measured on a scale of 0–10. • Number of Days Studied per Concept: the total number of days required to complete the concept (scale 0–10). • Number of Exercise/Assignment Questions Solved: the total number of exercise and assignment questions solved for the concept, measured on a scale of 0–10. Output: • Performance: the performance of the learner in the range [0, 10], inferred from a set of rules. Figure 5 shows the Fuzzy Inference System for the performance predictor; the FIS has three input variables and one output variable. The membership functions of the three input variables and the output variable are shown in Fig. 6, and the rules designed for the Fuzzy Inference System are shown in Fig. 7. In Fig. 8, the system predicts the performance of a learner using the fuzzy rules.
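The paper's FIS is built with the MATLAB Fuzzy Logic Toolbox; as a hedged illustration of the same three-input, one-output structure, the sketch below uses the scikit-fuzzy control API. The membership partitions and the three sample rules are placeholders, not the actual rule base of Fig. 7.

```python
import numpy as np
from skfuzzy import control as ctrl

universe = np.arange(0, 10.01, 0.1)   # all variables measured on [0, 10]
knowledge = ctrl.Antecedent(universe, "knowledge_level")
days = ctrl.Antecedent(universe, "days_studied")
solved = ctrl.Antecedent(universe, "questions_solved")
perf = ctrl.Consequent(universe, "performance")

# Auto-generate three overlapping membership functions per variable.
for var in (knowledge, days, solved, perf):
    var.automf(3, names=["low", "medium", "high"])

rules = [
    ctrl.Rule(knowledge["high"] & solved["high"], perf["high"]),
    ctrl.Rule(knowledge["medium"] | days["medium"], perf["medium"]),
    ctrl.Rule(knowledge["low"] & solved["low"], perf["low"]),
]

sim = ctrl.ControlSystemSimulation(ctrl.ControlSystem(rules))
sim.input["knowledge_level"] = 7.5
sim.input["days_studied"] = 4.0
sim.input["questions_solved"] = 8.0
sim.compute()
print(sim.output["performance"])   # defuzzified performance in [0, 10]
```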
Fig. 6 Membership functions of variables
Fig. 7 Fuzzy inference rules
4.2 Experimental Results Data: We use a dynamic data set in this project. To collect the data set, we launched an online course named "Machine Learning with Python". Forty-seven learners enrolled in the course, so we have data from 47 learners, and our data sets were created on a weekly basis. We filter three types of data sets with multiple records.
Fig. 8 Performance prediction
• Data set 1: From the learner data, we obtain the prerequisite knowledge level of each learner. Before the course started, the instructor designed a quiz based on the prerequisite requirements; after a learner completes the quiz, the prerequisite knowledge level is calculated using the knowledge-level formula. Figure 9 shows the prerequisite knowledge level of each learner. • Data set 2: This data set contains concept-wise records. Our course has six concepts, so we filter the data set according to concept; it collects the knowledge level of all 47 learners for every concept. • Data set 3: An overall progress report of every learner, in percent, is collected here. Figure 10 shows learner progress at the end of the course. Concept-wise performance predictor: Performance is calculated by the FIS; the system computes the performance of all learners and, according to the performance, shows their status, as in Fig. 11. The input parameters are taken from Data set 2. Here, Un = Unknown; Uk = Insufficiently Known; K = Known; L = Learned. Completion Rate: In our course, 47 learners enrolled, and many of them completed the course. A total of 33 learners have a progress percentage of more than 60%, giving a completion rate of 69.6%, as shown in Figs. 12 and 13. Dropout Rate: Of the 47 learners enrolled, a total of 14 learners have a progress percentage of less than 60%, giving a dropout rate of 30.4%, as shown in Figs. 12 and 13.
Fig. 9 The prerequisite knowledge level of each learner
Fig. 10 Learner progress at the end of course
Fig. 11 Concept wise performance predictor
Fig. 12 Statistical data of course
Fig. 13 Completion rate and dropout rate
5 Conclusion In this project, we proposed AI-based measurements for e-learning or MOOC courses. There are many proposed models for adaptive learning, but they use static data; in our project, we used dynamic data from 47 students. We predicted their knowledge levels, which helps personalize the learning path for learners, and the system provides interactive user interfaces. For this model, we started our own e-learning website, and our course has been live for a little over a month. In this duration, we connected with 47 different learners, and their experience, feedback, and interest in the course gave us excellent results: our model gives a completion rate of 69.6% and a dropout rate of 30.4%. By adding more new courses, increasing the number of enrollments, and integrating the Fuzzy Inference System into the web application, the model will work even more effectively in the future.
References 1. Amado-Salvatierra HR, Rizzardini RH, Chan MM (2018) Unbundling higher education with internationalization. Experiences from two hybrid postgraduate degrees using MOOCs. In: Learning with MOOCS (LWMOOCS), Madrid 2018, pp 9–12 2. Chatterjee P Nath A (2014) Massive open online courses (MOOCs) in education—a case study in Indian context and vision to ubiquitous learning. In: 2014 IEEE international conference on MOOC, innovation and technology in education (MITE), pp 36–41 3. Chauhan J (2017) An overview of MOOC in India. Int J Comput Trends Technol 49:111–120. https://doi.org/10.14445/22312803/IJCTT-V49P117 4. Chrysafiadi K, Virvou M (2015) Fuzzy logic for adaptive instruction in an e-learning environment for computer programming. IEEE Trans Fuzzy Syst 23(1):164–177. https://doi.org/10. 1109/TFUZZ.2014.2310242 5. Ciolacu M, Beer R (2016) Adaptive user interface for higher education based on web technology. In: 2016 IEEE 22nd international symposium for design and technology in electronic packaging (SIITME), 2016, pp 300–303 6. Devlin K (2013) MOOCS and myths of dropout rates and certification. Available at: http:// www.huffingtonpost.com/dr-keith-devlin/moocs-and-myths-of-dr_b_2785808.html 7. El-Bakry HM, Saleh AA, Asfour TT, Mastorakis N (2011) A new adaptive e-learning model based on learner’s styles. In: Proceedings of the 13th WSEAS international conference on mathematical and computational methods in science and engineering, Wisconsin, USA, Stevens Point, pp 440–448 8. Groumpos PP (2010) Fuzzy cognitive maps: basic theories and their application to complex systems. In: Glykas M (ed) Fuzzy cognitive maps. Studies in fuzziness and soft computing, vol 247. Springer, Berlin, Heidelberg 9. Guevara C, Aguilar J, González-Eras A (2017) The model of adaptive learning objects for virtual environments instanced by the competencies. Adv Sci Technol Eng Syst J 2(3):345– 355 10. Hamada M, Hassan M (2017) An enhanced learning style index: implementation and integration into an intelligent and adaptive e-learning system. Eurasia J Math Sci Technol Educ 13(8):4449– 4470 11. Khalil H, Ebner M (2014) MOOCs completion rates and possible methods to improve retention—a literature review. In: Proceedings of world conference on educational multimedia, hypermedia and telecommunications 2014. AACE, Chesapeake, VA, pp 1236–1244
12. Klašnja-Milićević A, Vesin B, Ivanović M, Budimac Z (2011) E-learning personalization based on hybrid recommendation strategy and learning style identification. Comput Educ 56(3):885–899 13. Kolowick S (2013) Coursera takes a nuanced view of MOOC dropout rates. Available at: http://chronicle.com/blogs/wiredcampus/coursera-takes-a-nuanced-view-of-moocdropoutrates/43341 14. Kosko B (1986) Fuzzy cognitive maps. Int J Man-Mach Stud 24:65–75 15. Krafft PM, Macy M, Pentland AS (2017) Bots as virtual confederates: design and ethics. CSCW 183–190 16. Lehmann F, Rodin E (1992) Semantic networks in artificial intelligence. Pergamon Press, Oxford 17. Papageorgiou EI, Groumpos PP (2005) A weight adaption method for fine-tuning Fuzzy Cognitive Map causal links. Soft Comput J 9:846–857 18. Papanikolaou K, Magoulas G, Grigoriadou M (2000) A connectionist approach for supporting personalized learning in a web-based learning environment. In: Brusilovsky P, Stock O, Strapparava C (eds) Adaptive hypermedia and adaptive web-based systems. Lecture notes in computer science, vol 1892. Springer, Berlin, Heidelberg, pp 189–201 19. Popescu E, Badica C, Moraret L (2010) Accommodating learning styles in an adaptive educational system. Informatica 34(4) 20. Sowa JF (1976) Conceptual graphs for a data base interface. IBM J Res Dev Educ 20(4):336–357 21. Sowa JF (1984) Conceptual structures: information processing in mind and machine. Addison-Wesley Longman Publishing Co Inc, Boston, MA, USA 22. Stylios CD, Groumpos PP (2004) Modeling complex systems using fuzzy cognitive maps. IEEE Trans Syst Man Cybern A 34(1):155–162 23. Xirogiannis G, Stefanou J, Glykas M (2004) A fuzzy cognitive map approach to support urban design. J Expert Syst Appl 26(2)
Chapter 52
Application of Hidden Markov Model to Analyze the Biometric Signature: A Comprehensive Survey Neha Rajawat, Bharat Singh Hada, and Soniya Lalwani
N. Rajawat, Career Point University Kota, Kota, India, e-mail: [email protected]
B. S. Hada, Samsung R&D Institute Noida, Noida, India, e-mail: [email protected]
S. Lalwani (B), Bal Krishna Institute of Technology Kota, Kota, India, e-mail: [email protected]

1 Introduction

Signature is a mark of identification of an individual, used to authenticate possession of material resources and to mark one's presence wherever required. It encapsulates the relationship between a document and its signer. Due to its property of uniqueness, the handwritten signature represents a significant biometric trait. People are well aware of the applications of signatures, which is the reason behind the easy and intuitive collection of signature data [1]. There are various uses associated with a signature, such as bank and loan applications, document verification, access control mechanisms, military and intelligence services, consumer verification, etc. As services and resources are remotely situated, verifying the end user requires migrating toward biometric authentication mechanisms, of which the biometric signature is one branch. Biometric verification is a process of identifying a person using unique, personalized biometric features such as DNA, iris, fingerprint, face, and handwritten signature [2]. The biometric authentication system aims to minimize the chances of identification errors in order to develop a secure digital ecosystem. Following the fundamentals of biometric techniques, automated signature verification systems have been developed. They offer an easy and efficient way to identify a person, since signatures are unique and
easily collectible across the world. Data available for signatures is both dynamic and static, which is why signature verification models are divided into two types: online and offline. In an online signature verification system, the signature is obtained on a digital screen using a digital stylus, whereas in an offline system, an image of the signature is processed by the model. Since the online system has a live input stream, it can capture additional features such as writing speed, pen inclination, pressure, and acceleration [3]. The biometric signature can be deployed for both enrollment and verification. In the first phase, the system records the unique signature biometric associated with the user to provide first-time access. During the second phase, the system verifies the input signature data against the biometrics stored in the database. Thus, the signature verification model is used to discriminate between the genuine user and identity theft. These attacks can be classified into three kinds: random, simple, and skilled. When the forger uses their own signature to gain access to the document or resource, it is called random forgery. In a simple forgery attack, the forger knows the name of the qualified user but does not possess the actual signature; this attack can be more dangerous for users who use their name as their signature. The last kind is skilled forgery, where the forger has unhindered access to the signature of the qualified user and the success of the forgery depends on the forger's creative skills [4]. Several researchers have published works in this area since the late 1980s. Some of the recent studies are as follows: Tomar and Singh [5] utilized a directional feature with an energy-based method to solve the offline signature verification problem. Harika and Ready [6] adopted the MCYT database and the GPDS960 gray signature dataset to train their model for offline signature verification. Odeh et al. [7] employed the GPDS300 signature database, which is based on a natural signature recognition method; they used an MLP neural network and reduced the error rate. Anjali and Mathew [8] developed a two-stage model. There are several advanced methods, such as dynamic time warping, support vector machines, artificial neural networks, convolutional neural networks, hidden Markov models, graph-based recognition methods, Gaussian mixture models, etc. This paper discusses the application of various hidden Markov models (HMMs) in signature verification. HMMs have been very popular for modeling both online and offline verification systems. To feed a signature to an HMM, it has to be converted into an array of vectors that can describe the complete geometry and trajectory path, so careful feature engineering over the selection of these vectors is essential to design an efficient HMM-based signature verification system. The organization of this paper is as follows: Sect. 2 discusses available biometric techniques including signature verification. Section 3 explains popular signature verification techniques, and Sect. 4 discusses the HMM followed by a description of the signature verification method using HMMs. Section 5 presents the conclusion and remarks.
2 Biometric Techniques

Biometric techniques have shown remarkable performance in authentication mechanisms compared to the pre-existing technologies, and they have become a matter of competition among electronics manufacturing companies seeking to attract consumers [9]. In the next subsections, some of the major biometric techniques are discussed.
2.1 Fingerprints

Fingerprints are a pictorial representation of the ridges and valleys formed on the surface of a person's fingers. Human fingerprints are unique, unchangeable, and durable throughout life, and thus a perfect attribute for biometric identification. The ridge patterns found in fingerprints are of three types:
1. Arch: every ridge starts on one side of the finger, rises in the middle section, and reaches the opposite side of the finger.
2. Loop: the ridge always starts and ends on the same side of the finger.
3. Whorl (cyclic): a circular pattern is observed around the mid-section of the finger.
Fingerprints are very easy to collect, but the same property makes them highly vulnerable to identity theft, and damaged skin can sometimes make identification difficult [9].
2.2 Iris and Retina Scanning

Iris scanning captures the shape of the iris as a mathematically computed pattern that is unique to every individual and remains intact throughout life; it can even be observed from some distance in video image capture. Retina scanning is often confused with iris scanning; it instead uses the pattern of blood vessels inside the retina. The greatest advantage of deploying an iris scanner is its extreme resistance to incorrect identification, and iris data can also be captured very quickly. Other advantages of these methods are high scalability, ease of capture, durability, and guaranteed randomness [10].
2.3 Facial Recognition

Facial recognition is a very popular identification technique that has seen tremendous application across the digital ecosystem, such as smart homes, airport access, smartphones, entry-gate access, etc. Facial data is collected from a captured image or video frame,
and facial attributes such as the eyes, nose, eyebrows, lips, and chin, together with their topography, are matched against a database of images. To improve the accuracy of the system, one or more image frames can be captured with a high-resolution camera [11].
2.4 Signature Verification

The signature is a type of behavioral biometric that can be collected in a static or dynamic environment. A static signature verification system works when an image of the signature is captured and spatial features are used for authentication. In a dynamic environment, live signature input is captured on a digital notepad using a stylus. A dynamic signature provides more attributes to analyze in an authentication mechanism, such as speed, time, stroke, inclination of the pen, etc. [12].
2.5 Speech Recognition

The speech of an individual contains unique attributes that can serve as a biometric, but human beings are very good at mimicking other voices, which is why these technologies are less popular. Intensity, pitch, speed, and magnitude are some of the parameters associated with voice. In advanced deployed systems, the movement of the lips is also captured [13].
3 Signature Verification Techniques

As technology progresses, several algorithms have been deployed to achieve better performance in signature verification systems. These algorithms can be classified into two categories: writer-dependent and writer-independent. The first kind of algorithm collects genuine signature data from the actual user as well as forgeries from random users of the system. In a writer-independent system, a single classification algorithm is applied to the complete dataset of signatures, and it learns common classification features to discriminate between genuine and forged users [14]. The following sections describe the major methods for signature verification.
3.1 Support Vector Machine (SVM)

Support vector machines have been used for signature verification for a very long time. Genuine signature verification can be framed as a one-class classification problem: the SVM is generally trained only with the genuine user's data, and training on a combination of genuine and forged signatures is still a challenge for performance improvement [15].
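The one-class framing can be sketched with scikit-learn's OneClassSVM. The following is an illustrative toy, not the configuration of any surveyed system; the feature matrix and parameter values are hypothetical:

import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Hypothetical feature vectors extracted from one writer's genuine signatures
genuine_features = rng.normal(0.0, 1.0, size=(40, 10))

# nu bounds the fraction of genuine samples allowed outside the decision boundary
model = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(genuine_features)

query = rng.normal(0.0, 1.0, size=(1, 10))
print(model.predict(query))  # +1 = accepted as genuine, -1 = flagged as possible forgery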
3.2 Hidden Markov Model (HMM)

This method can be used for modeling both online and offline signature verification systems. An HMM processes data as a temporal sequence. For example, in offline signature verification an image of the signature is captured, and the geometric shape in the image can be treated as a temporal drawing of pixels, in which every pixel relates to the previously drawn pixel.
3.3 Gaussian Mixture Model (GMM)

A GMM is a parametric probability density function expressed as a weighted sum of Gaussian component densities. In a signature verification model, spatial and temporal attributes can be mapped to a probabilistic distribution. GMMs are sometimes combined with a hidden Markov model to improve the collective prediction accuracy [16].
3.4 Neural Network and Deep Learning

Deep learning methods can be deployed for both online and offline signature verification systems. These systems work on the principle of minimizing the distance between feature vectors, with the neural network trained by gradient descent to minimize a loss function. Different NN architectures can be used depending on the type of input provided to the network. For example, the signature image captured in an offline verification system can be classified using a convolutional neural network, while recurrent neural networks can take advantage of the temporal dependencies between the pixels of the image aligned along the path trajectory [17].
3.5 Dynamic Time Warping (DTW)

This algorithm is applied to online signature verification systems, combining the spatial and temporal features of input data collected in real time. Niels and Vuurpijl (2005) describe DTW in handwriting examination as a "technique that compares online trajectories of coordinates." One can imagine DTW operating in a variable-rate time fabric, where the rate at which time passes coincides with the speed of writing; in such a scenario, the similarity between two signature samples can be compared even though they were created at different speeds [16].
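As a concrete illustration of the idea, the following minimal sketch computes the classic DTW distance between two one-dimensional feature sequences (for instance, pen coordinates resampled over time); the sequences here are synthetic:

import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best of insertion, deletion, and match steps
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two "signatures" written at different speeds still match closely under DTW:
ref = np.sin(np.linspace(0, 3, 60))
test = np.sin(np.linspace(0, 3, 90))
print(dtw_distance(ref, test))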
4 Hidden Markov Model

The hidden Markov model is a statistical Markov model that contains a finite number of hidden and observational states. A probability distribution is defined over these finite states, which helps to form a sequence of states; each hidden state is associated with an observation state, so a sequence of hidden states forms a sequence of observational states. The HMM works on the principle of the Markov process, which says that every state depends only on its immediate predecessor. Figure 1 shows a Markov chain of five hidden states.

Fig. 1 Hidden states of a hidden Markov model

The HMM is a probabilistic sequence-matching system and works well for modeling offline signature verification. It is flexible enough to consider both inter-personal and intra-personal variation and similarity among signature images. As shown in Fig. 1, the HMM accepts signatures as a series of states coupled with observation vectors; these vectors can be generated using the associated probability distribution. There are two types of transition in an HMM, first between hidden states and second from a hidden state to an observable state, and transition probabilities are defined to govern these transitions. The signature samples are used to train the observation vectors, and during testing a sequence of probabilities is generated to recognize a sample [18]. The HMM system is defined by the elements given below:

1. The model consists of $k$ states, represented as $(Z_1, Z_2, \ldots, Z_k)$. $Q_t$ denotes the current state at time $t$.
2. The number $M$ of observable symbols, represented as $(O_1, O_2, \ldots, O_M)$. $O_t$ denotes the current observable symbol at time $t$.
3. The state transition probability matrix is given by $B = [b_{ij}]$. The expression below defines the probability of a transition from state $Q_t = Z_i$ to $Q_{t+1} = Z_j$:

$$b_{ij} = P(Q_{t+1} = Z_j \mid Q_t = Z_i)$$

4. The output (emission) probability matrix is given by $C$. For each state $j$:

$$c_j(M) = P(O_t = O_M \mid Q_t = Z_j)$$

5. The initial state probability distribution $\pi$ is given by:

$$\pi_i = P(Q_1 = Z_i)$$
The HMM works on the principle of the Markov process, which states that "the future is independent of the past given the present." In simple words, if the present state of a model is known, then the future state can be estimated without considering any past information [19].

The intuition behind HMMs: as said, HMMs are probabilistic models. One can determine the joint probability of a set of hidden states given the set of observable states. Hidden states represent the latent information; the target is to estimate the probability distribution over sequences of hidden states. Once it is known, the sequence with the highest probability is chosen, and that defines the resultant hidden states. To estimate this joint probability, one has to consider the three types of probability distribution defined below:

• Transition data: the probability of a transition between two hidden states.
• Emission data: the probability of a transition from a hidden state to an observed state.
• Initial state information, or prior probability: the initial probability of starting in a hidden state.

Signature verification systems are classified into two types: online and offline. In the case of offline systems, a static image is captured, and geometric and spatial data are analyzed to decide the prediction criteria. A static image can be seen as a fixed pattern to be analyzed, and the hidden Markov model has proved to be an efficient algorithm for modeling a solution [20]. The following subsections start with the definition of the offline signature and its characteristics, then cover the preprocessing of the input data and the organization of the input image into feature vectors, and finally describe the HMM architecture and how it solves the task of identity verification for offline signature systems.
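Given the transition, emission, and prior distributions just described, the likelihood of an observation sequence can be scored with the forward algorithm. The sketch below uses made-up toy matrices purely for illustration; a real verification system would estimate these parameters from signature data, for example with the Baum–Welch algorithm or a library such as hmmlearn:

import numpy as np

# Toy parameters for a 2-state HMM with 2 observable symbols (illustrative only)
B = np.array([[0.7, 0.3],
              [0.4, 0.6]])   # transition matrix: b_ij = P(Q_{t+1}=Z_j | Q_t=Z_i)
C = np.array([[0.9, 0.1],
              [0.2, 0.8]])   # emission matrix: c_j(M) = P(O_t=O_M | Q_t=Z_j)
pi = np.array([0.6, 0.4])    # initial distribution: pi_i = P(Q_1=Z_i)

def forward_likelihood(obs, B, C, pi):
    """Return P(O_1, ..., O_T) by summing over all hidden-state paths."""
    alpha = pi * C[:, obs[0]]            # alpha_1(i) = pi_i * c_i(O_1)
    for o in obs[1:]:
        alpha = (alpha @ B) * C[:, o]    # alpha_{t+1}(j) = sum_i alpha_t(i) b_ij c_j(O_{t+1})
    return float(alpha.sum())

# Higher likelihood means the observation sequence looks more "genuine" to this model
print(forward_likelihood([0, 1, 0], B, C, pi))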
4.1 Offline Signature and Its Characteristics

It is also known as a static signature, since the signature samples are available as static images created with pen and paper; a digital camera or scanner is used to obtain an electronic copy of the offline signature. Basic characteristics of an offline signature are edges, curves, trajectory, angular geometry, and pixel information. To improve the efficiency of a verification system, one can derive new properties from these basic characteristics, such as pixel density or the probabilistic distribution of pixels in a certain region of the image. The dynamic features of online signatures are missing in the case of offline signatures, which is why these systems are very challenging to design. Another drawback is data collection, since very few samples are available from a particular user and live data capture is not possible [17].
4.2 Preprocessing of Offline Signature Image

The preprocessing stage applies geometric and spatial transformations to the image. Figure 2 depicts the major steps involved in preprocessing:
Fig. 2 Flow diagram of offline signature verification system
• Noise reduction and smoothing: removal of outliers and sparse data from the image to highlight its foreground. Smoothing ensures that the path trajectory of the signature is visible and that unnecessary data along the path is removed.
• Conversion into a binary image: the input image is captured in either color or grayscale format and is converted into a black-and-white image.
• Thinning: ensures that the path trajectory of the signature shows only the connectivity between pixels; the final result of thinning is a binary signature image with a stroke width of one pixel.
• Center alignment of the image: the signature image should be aligned at the center of the background frame for a better distribution in two-dimensional space [21]. A minimal code sketch of these steps is given below.
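The sketch assumes OpenCV and scikit-image are available; the file path and canvas size are placeholders, and this is only one possible realization of the pipeline above:

import cv2
import numpy as np
from skimage.morphology import skeletonize

def preprocess_signature(path, size=256):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.GaussianBlur(img, (3, 3), 0)               # noise reduction / smoothing
    # Otsu binarization; THRESH_BINARY_INV makes ink pixels non-zero
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    thin = skeletonize(binary > 0).astype(np.uint8)      # one-pixel-wide trajectory
    ys, xs = np.nonzero(thin)
    crop = thin[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    if max(crop.shape) > size:                           # shrink if larger than the frame
        scale = size / max(crop.shape)
        new_w = max(1, int(crop.shape[1] * scale))
        new_h = max(1, int(crop.shape[0] * scale))
        crop = cv2.resize(crop, (new_w, new_h), interpolation=cv2.INTER_NEAREST)
    canvas = np.zeros((size, size), dtype=np.uint8)      # center on a fixed frame
    y0, x0 = (size - crop.shape[0]) // 2, (size - crop.shape[1]) // 2
    canvas[y0:y0 + crop.shape[0], x0:x0 + crop.shape[1]] = crop
    return canvas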
4.3 Feature Engineering

Offline signature verification systems have been a subject of research for many years, and different kinds of feature classification schemes have been proposed. In a broad view, features can be divided into two types, global and local, as explained in the following:
• Local features: found inside the image frame, for example, the pixel distribution inside the frame and the edges and curves of the geometric path. These features help define a unique signature.
• Global features: provide an overview of the input signature image, such as the height and width of the image frame.
Apart from the above classification, researchers have used several other types of features:
• Geometric features: shape, height, width, loops, area, starting and ending pixel values, etc.
• Directional features: Sabourin and Drouhard extracted a directional probability density function using the slope of the signature path trajectory; these features help describe the signature in terms of direction.
• Mathematical transformations: Nemcek and Lin explained the Hadamard transformation, Pourshahabi et al. used a Contourlet transform, Deng et al. proposed a wavelet transform, and Zouari et al. investigated the use of the fractal transform for the problem.
• Texture features: local binary patterns and the gray-level co-occurrence matrix [15].
4.4 Feature Vector Preparation

The preprocessed input is used to form the input vectors fed into the hidden Markov model. HMMs are designed to work with one-dimensional data, but an image is a two-dimensional matrix. Using matrix transformation techniques, the 2D matrix can be
Fig. 3 Flow diagram of offline signature verification system
mapped to a one-dimensional column matrix. Figure 3 gives an example of this mapping, where the input image is divided into five vertical segments that represent five hidden states of the HMM. Here, $Z_1$, $Z_2$, $Z_3$, $Z_4$, and $Z_5$ are the hidden states of the input data, and the curved shape inside the rectangle represents a sample signature image. Using the pixel information present in the different segments, one can build five unique feature vectors to be fed into the HMM [15].

Daramola and Samuel [22] proposed offline signature verification using an HMM equipped with the discrete cosine transform (DCT) for feature engineering at the sub-image level. The DCT was applied over the segmented image: in the segmentation process, the image was divided into four vertical parts, and each part was further divided into 16 cells around the center of gravity, which is defined by the pixel distribution in the image. The four vertical segments define four states of the HMM, and the 16 cells are used to form the feature vector. Maximum likelihood estimation was chosen for parameter selection and the Baum–Welch algorithm for optimization. Signature data samples were collected at Covenant University Ota, Nigeria, with the participation of 250 students, each of whom provided seven samples. The authors claimed that the HMM improved performance significantly, with an accuracy rate of around 99.2%.

Justino et al. [23] used simple and pseudo-dynamic features for the verification of random, simple, and skilled forgeries. The experiment showed that system performance was similar for random and simple forgeries; thus, the solution can be deployed where simple forgeries represent the major proportion of forgeries. As the advantage of grid segmentation has been demonstrated by several researchers, this work also uses the technique in feature extraction. The feature set contains two static features and one pseudo-dynamic feature: the pixel density of each cell, a pixel distribution feature called the extended shadow code (ESC) previously used by Sabourin, and the signature skeleton image, which is used to determine the stroke trajectory. The experiment yielded high error rates for skilled forgeries, which leaves room for improvement in this direction.

Justino et al. [24] used graphometric features to form better feature vectors that can improve the overall efficiency of the HMM. Graphometric features include caliber (incorporating the height and width of the signature), proportion (defining
the area covered by the signature pixels in the frame), the spacing between blocks of the signature, vertical movement (where a baseline is defined), and horizontal inclination. To extract these graphometric features, the signature image was segmented using horizontal and vertical lines. The segmentation was kept dynamic so that a particular cell captures the same amount of data across different images of the same user's signature. In the experiments, two different datasets containing 40 and 60 writers, respectively, were used. The main purpose of this work was to use simple features for the training and verification of the HMM; using two different datasets also tests the variation in performance to confirm the scalability of the solution. Results show only a 0.4% increase in error rate when the model is migrated from the 40-writer dataset to the 60-writer dataset.

Batista et al. [25] proposed a hybrid generative–discriminative ensemble of classifiers to address the limited amount of data available for training an efficient HMM. They used a group of HMMs in which each model has a different number of hidden states and a different set of training features; with this strategy, a single signature sample is learned differently by each HMM, so the system does not require a large amount of data. Once optimal efficiency is achieved in the ensemble, some of the top-performing HMMs are selected using a K-nearest oracles algorithm. Two different datasets were used by two groups of researchers: one group used the Brazilian SV database containing 7920 samples, and the other used the GPDS database containing 16,200 samples. To form the feature vectors, the grid segmentation technique was used.
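The vertical/grid segmentation recurring in these studies can be sketched as follows. This is an illustrative version only; the state and cell counts are adjustable parameters, not the exact values used by any one paper. It turns a preprocessed binary image into a sequence of observation vectors, one per hidden state:

import numpy as np

def segmentation_observations(binary_img, n_states=5, cells_per_segment=16):
    """Split the image into vertical segments (one per HMM state) and build
    a per-segment observation vector of cell pixel densities."""
    segments = np.array_split(binary_img, n_states, axis=1)   # vertical strips
    observations = []
    for seg in segments:
        cells = np.array_split(seg, cells_per_segment, axis=0)
        observations.append(np.array([c.mean() for c in cells]))  # density per cell
    return observations   # length n_states; each entry is one feature vector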
5 Conclusion and Discussion

This paper surveys biometric techniques, especially offline signature verification, with the main focus on understanding how HMMs work in this field. Several past works have been discussed and their performance analyzed, which provides insight into where the performance of offline signature verification systems can still be improved. Optimization of offline signature verification remains crucial, and the efficient design of features greatly affects accuracy; this leaves future scope in terms of feature engineering. Another challenge is data availability, since data collection is a costly and time-consuming process; one can try to build an efficient HMM verification system that works well with small amounts of data, and artificial data synthesis techniques can help in this process. One further observation from this literature review is that offline signature samples do not provide temporal information; to include it, updated signature samples and continuous retraining of the HMM would be required. Research in this field is ongoing to improve the overall performance, as described above.
References
1. Kanawade MV, Katariya SS (2013) Signature verification & recognition—case study. Int J Electron Commun Instr Eng Res Dev (IJECIERD) 3(1):77–86
2. Jain AK, Ross A, Prabhakar S (2004) An introduction to biometric recognition. IEEE Trans Circuits Syst Video Technol 14(1)
3. Donato I, Giuseppe P (2008) Automatic signature verification: the state of the art. IEEE Trans Syst Man Cybern Part C (Appl Rev) 38(5):609–635
4. Coetzer J, Ben MH, Johan AP (2004) Offline signature verification using the discrete Radon transform and a hidden Markov model. EURASIP J Appl Signal Process 2004:559–571
5. Tomar M, Singh P (2011) A directional feature with energy based offline signature verification network. arXiv:1103.1205
6. Harika K, Ready TCS (2013) A tool for robust offline signature verification. Int J Adv Res Comput Commun Eng 2:3417–3420
7. Odeh S, Khalil M (2011) Apply multi-layer perceptron neural network for off-line signature verification and recognition. Int J Comput Sci 8(6):261–266
8. Anjali R, Mathew MR (2013) An efficient approach to offline signature verification based on neural network. IJREAT Int J Res Eng Adv Technol 1:1–5
9. Kong A, David Z, Mohamed K (2009) A survey of palm-print recognition. Pattern Recogn 42(7):1408–1418
10. Kaur G, Singh G, Kumar V (2014) A review on biometric recognition. Int J Bio-Sci Bio-Technol 6(4):69–76
11. Battaglia F, Iannizzotto G, Bello LL (2014) A biometric authentication system based on face recognition and RFID tags. Mondo Digitale 13(49):340–346
12. José VF, Sánchez A, Moreno AB (2003) Robust off-line signature verification using compression networks and positional cuttings. In: IEEE XIII workshop on neural networks for signal processing
13. Trivedi JA (2014) Voice identification system using neuro-fuzzy approach. Int J Adv Res Comput Sci Technol (IJARCST) 2
14. Mustafa YB, Berrin Y (2016) Score level fusion of classifiers in off-line signature verification. Inf Fusion 32:109–119
15. Hafemann LG, Sabourin R, Luiz SO (2017) Offline handwritten signature verification—literature review. In: Seventh international conference on image processing theory, tools and applications (IPTA)
16. Heidi H (2014) Developments in handwriting and signature identification in the digital age. Routledge, London
17. Wen J et al (2007) Offline signature verification: a new rotation invariant approach. In: IEEE international conference on systems, man and cybernetics. IEEE
18. El-Yacoubi A et al (2000) Off-line signature verification using HMMs and cross-validation. In: Neural networks for signal processing X. The IEEE signal processing society workshop, vol 2
19. Lawrence RR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77(2):257–286
20. Ramanujan KS et al (1997) On-line handwritten signature verification using hidden Markov model features. In: The fourth international conference on document analysis and recognition, vol 1
21. Camino JL et al (1999) Signature classification by hidden Markov model. In: IEEE 33rd annual 1999 international Carnahan conference on security technology (Cat. No. 99CH36303)
22. Daramola SA, Samuel IT (2010) Offline signature recognition using hidden Markov model (HMM). Int J Comput Appl 10(2):17–22
23. Justino EJR, Bortolozzi F, Sabourin R (2001) Off-line signature verification using HMM for random, simple and skilled forgeries. In: Proceedings of sixth IEEE international conference on document analysis and recognition
24. Justino EJR et al (2000) An off-line signature verification system using HMM and graphometric features. In: Proceedings of the 4th international workshop on document analysis systems 25. Batista L, Granger E, Sabourin R (2012) Dynamic selection of generative–discriminative ensembles for off-line signature verification. Pattern Recogn 45(4):1326–1340 26. Jonathan SE (2001) The electronic signatures in global and national commerce act. Berk Tech LJ 16:391
Chapter 53
RETRACTED CHAPTER: Industrial Internet of Things (IIoT) Framework for Real-Time Acoustic Data Analysis
Sathyan Munirathinam
The original version of this chapter was retracted: the retraction note to this chapter is available at https://doi.org/10.1007/978-981-33-4893-6_59

1 Introduction
The market of IIoT, especially in the manufacturing industry, has been growing rapidly. According to a recent Gartner report, over 26 billion physical devices will be connected through the Internet, resulting in $1.9 trillion in global economic value added through sales by 2020. The IoT framework is poised to revolutionize daily living standards by enabling smarter insights from sensor technology. Everyday items, from furniture, appliances, and vehicles to large structures such as buildings, bridges, and aircraft, can now serve as data hubs. The streaming data provides operational and network intelligence for real-time analytics to improve the customer experience. This interconnection between sensors and communication technology creates a flexible architecture that allows objects and devices to communicate with one another to reach common goals. Furthermore, the multidisciplinary nature of IoT opens the gate to new research in computing, business, engineering, and health care. An Internet of Things (IoT) architecture was developed with a complete chain of signal processing, machine learning, and interdiction to target systems. Semiconductor manufacturing is one of the most complicated, technology-evolving, and capital-intensive industries (~$4 billion per fab). With such high costs, and with the interdependence of the equipment fleets within the fab, it is essential that these systems are available, productive, and capable of high-quality output. Even with the latest technology designed into these processing systems, they are still susceptible to faults, mis-processing, and unpredictable parts failures. Effective early fault detection in equipment is essential to prevent abrupt equipment breaking
S. Munirathinam (B), ASML Holding, San Diego, CA, USA, e-mail: [email protected]
down, and it also helps improve productivity, shorten repair time, and reduce costs. Detecting problems in products and machinery is paramount for manufacturing efficiency, effectiveness, and safety. Fault detection and predictive/prescriptive maintenance are key to keeping the equipment up and running and the fabrication line moving. That said, even with thousands of dedicated sensors spread across hundreds of processing systems, the current detection and prediction capabilities do not meet the needs of semiconductor companies. The project proposed here is constructed around a machine learning (ML) approach, which uses large volumes of data together with the theory of statistics and mathematics to build machine learning models that enable computers to adapt their behavior and predict events and failures in the future. By using broadband, nearly omnidirectional acoustic sensors, we have the potential to greatly increase our abilities for both fault detection and predictive maintenance. The acoustic signals collected from microphones installed in the equipment area are used to understand the health of the equipment. The high volume of raw signal data is ingested into Hadoop for malform detection using big data machine learning analytics. There are numerous IoT sensors, such as vibration sensors, temperature sensors, infrared sensors, leak sensors, cameras, GPS terminal sensing equipment, etc. The data recorded from these sensors is then transmitted to the processing layer, typically through a wireless network. It is in general challenging to connect a wireless sensor network (WSN), mobile communication networks, or the Internet with each other because of the lack of uniform standardization in communication protocols. Furthermore, data from a WSN cannot be transmitted over long distances with the current limitations in transmission protocols. In this context, low-power wireless communication techniques such as ZigBee, 6LoWPAN, LoRaWAN, and low-power Wi-Fi prove to be effective transmission options for IoT sensors within the network layer. A typical IoT system architecture is divided into four layers: the collection layer, the ingestion layer, the data processing layer, and the presentation layer. The collection layer aims to acquire, collect, store, and process the data from the physical devices (Fig. 1). Within the data processing layer, big data processing software and storage provide a variety of solutions for IoT. Commercial and public datacenters such as Amazon Web Services and Microsoft Azure provide computing, storage, and software resources as Cloud services, enabled by virtualized software stacks. On the other hand, private datacenters offer basic infrastructure services by combining available software tools and services such as Torque and Oscar, as well as file storage systems such as storage area network/network-attached storage (SAN/NAS). These data storage providers vary in how they store, index, and execute queries. Specifically, there are relational databases that allow the user to link information from different tables or different types of data buckets, while non-relational databases store data without explicit, structured mechanisms to link data from different buckets. The latter are often referred to as NoSQL databases and come in four forms: document-based, graph-based, column-oriented, and key-value store.
Fig. 1 Generic IIoT Edge-Fog-Cloud real-time processing framework. Credit iiconsortium.org
The role of the presentation layer, or inference layer, is to provide further insights based on the transmitted data and to make predictions based on the incoming sensor data. The objective of this investigation is to present the development of an IoT system that can provide real-time diagnostics in equipment health monitoring (EHM) applications related to the semiconductor manufacturing industry. To achieve this goal, the approach presented in this manuscript is based on: (i) using acoustic sensing data, including data obtained by mechanical experimentation and production evaluation, (ii) leveraging an IoT approach to enable real-time usage of acoustic wav signals, and (iii) developing a data-preprocessing method to create diagnostic information that can be visualized at both the Fog and Cloud layers, combined with user interfaces appropriate for EHM applications.
2 Background
2.1 The Sense of Sound: Acoustic Listening Most of the sounds generated by mechanical machines are structure-borne sounds. When the machine is operating as intended, the audio signature stays usually the same, given the operation conditions and environment do not change. When the machine develops a defect, it often changes the vibrations and sounds generated by the machine. For example, a skilled technician can identify some faults from a car engine by just listening to it. Any unusual noise coming from your car can easily related to mechanical problem in the car. This understanding holds true for an abnormal sound in the manufacturing plants that can signify an equipment wearing, deteriorating, or imminent breakdown. In the noisy and loud manufacturing plants, the real equipment sounds get lost in the noise. It is hard for the workers to differentiate and determine
638
S. Munirathinam
the location of a sound well enough to tell what is "normal" and what is not. In industrial manufacturing, equipment is often subject to progressive wear. Monitoring and maintenance procedures are therefore required; however, they are difficult to implement autonomously and in situ. Since they must monitor potentially inaccessible environments without interrupting near-continuous production, semiconductor device manufacturing processes are prime candidates for remote monitoring techniques.
2.2 Signal Processing
Signal processing in this paper refers to the acquisition of audio signals, storage in raw format, feature extraction, re-encoding, and visualization. Signal processing algorithms enable machine learning engineers to separate signals from noise, to decode machine characteristic information, or to compress information for more efficient storage or transmission. If external noise is known to be limited to a certain frequency range in which the signal itself does not carry much relevant information, those frequencies can simply be filtered out. There are also other, more sophisticated ways to remove unwanted noise from a signal, but they usually require more than one measurement of the signal. In this study, only conventional time- and frequency-domain methods are used. The main reasons for using only simple, conventional methods are the ease of interpretation of the results and the small requirements for computational resources: the audio measurements must be analyzed locally in real time, which demands computationally simple signal processing methods. The most straightforward way to gain information from the acquired audio signal is to examine the raw signal, which is in essence a time series of values corresponding to sound pressure levels at the location of the measurement microphone. The examined time series can be described through statistical features, i.e., parameters calculated from it. Different features describe different aspects, so calculating several features from one signal can give a comprehensive description of that signal (Fig. 2). Below are a few feature extractions based on the intensity of the frequency content of the signal as time progresses. These audio features can be used to identify spoken words phonetically and to analyze the various functions of the equipment; perhaps we can call this "EquiPhonics," where equipment is talking (Table 1).
Fig. 2 Three forms of audio signals (time vs. amplitude, frequency vs. amplitude, time vs. freq vs. amplitude)
Table 1 Audio feature extraction signals

1. Average power: the average value of the instantaneous power waveform over time.
2. Skewness: gives a measure of the asymmetry of a distribution around its mean value.
3. Kurtosis: gives a measure of the flatness of the distribution around its mean value.
4. Spectral slope: represents the amount of decrease of the spectral amplitude.
5. Spectral centroid: the balancing point of the spectrum.
6. Spectral flatness: a measure of the noisiness (flat, decorrelated) versus the sinusoidality of a spectrum or a part of it.
7. Spectral roll-off: the frequency such that 95% of the signal energy is contained below it.
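Several of the features in Table 1 can be computed with librosa and scipy. The following is a minimal sketch, not the exact extraction code used in this system; the file name is a placeholder, and the 48 kHz rate matches the microphones described later in this chapter:

import numpy as np
import librosa
from scipy.stats import skew, kurtosis

y, sr = librosa.load("pump_hour_001.wav", sr=48000)   # placeholder recording

S = np.abs(librosa.stft(y)).mean(axis=1)              # mean magnitude spectrum
freqs = librosa.fft_frequencies(sr=sr)                # matching frequency bins

features = {
    "average_power": float(np.mean(y ** 2)),
    "skewness": float(skew(y)),
    "kurtosis": float(kurtosis(y)),
    "spectral_slope": float(np.polyfit(freqs, S, 1)[0]),  # slope of the spectrum
    "spectral_centroid": float(librosa.feature.spectral_centroid(y=y, sr=sr).mean()),
    "spectral_flatness": float(librosa.feature.spectral_flatness(y=y).mean()),
    "spectral_rolloff": float(
        librosa.feature.spectral_rolloff(y=y, sr=sr, roll_percent=0.95).mean()),
}
print(features)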
3 IIoT Framework

3.1 IoT Hardware

Designing and deploying IoT hardware is an important architectural component of a fully deployable IoT solution. This section describes how a physical edge device collects, processes, and communicates data through the intermediary layers to the Cloud (Fig. 3).
Fig. 3 Microphones connected to the mixer, and the mixer connected to the Odroid device (panels: microphone, Odroid device)

3.1.1 Omni-Directional Microphone

The RTA-M (dbx RTA-M Reference Microphone) is an omni-directional, flat-frequency measurement microphone specially designed for the manufacturing sector to pick up all frequencies from 20 Hz to 20 kHz, ensuring accurate signals for real-time analysis of the audio.

3.1.2 Edge Computer—Odroid

The Odroid XU4 is a powerful, small-scale, inexpensive IoT device. This minicomputer is similar to a Raspberry Pi and compatible with many of the same OS libraries. For this project, ARMBIAN Ubuntu was chosen, mainly for its audio libraries; this OS allows easy installation of GPS libraries, Apache MiNiFi, JDK 8, and Python for processing.

3.1.3 Audio Mixer

Digital mixers are more resistant to environmental noise than analog mixers. The "mixer" is essentially a computer programmed to process the audio signals and combine or separate them into different signals. A Focusrite mixer is used in this experimentation.
3.2 IIoT Software

The developed IIoT architecture (Fig. 4) is subdivided into two parts: an On-Premise network and a Cloud network. The On-Premise network hosts the Edge and Fog layers. The Edge consists of the low-level hardware [sensors and data acquisition systems (DAQs)] intended to record data from the structure being monitored. For the purposes of this investigation, the sensors are related to nondestructive testing
Fig. 4 Generic IIoT Edge-Fog-Cloud real-time processing framework
(NDT) techniques, including a triaxial accelerometer (TA), an acoustic sensor (AS), a high-resolution image sensor (HRIS), and infrared thermography (IR). Other sensors (e.g., temperature, humidity, and pressure) can readily be used with the proposed architecture to augment the sensing inputs. In this development of the IoT framework, only acoustic data are used to demonstrate the capabilities of the IIoT system, for clarity as well as conciseness. In the proposed IIoT architecture, the Fog layer is composed of a decentralized computing device and is tasked with aggregating, parsing, filtering, clustering, and classifying data from multiple Edge DAQs. Two design configurations were attempted to assess technical IoT characteristics related to latency, throughput, and computational efficiency.
3.2.1 First Design Iteration
The first design iteration consisted of three Raspberry Pi 3 Model B+ computers powered by ARM Cortex-A53 1.4 GHz processors. One Raspberry Pi is used to filter incoming *.txt files from the Edge layer and store only the identified features of the incoming data. Another Raspberry Pi is used to conduct preliminary analysis on the filtered data and visualize it for local users. The third Raspberry Pi is used to
concurrently run a Data Loader script to upstream data into the Cloud MongoDB database. A specified folder from the second Raspberry Pi is mapped onto each of the Edge nodes, as well as onto the other two Raspberry Pis. This allows *.txt files from the Edge layer to be deposited directly onto the second Raspberry Pi with limited lag. On the second Raspberry Pi, the *.txt files are run through a filtering algorithm to retain only the useful data features. This filtered data is then directly accessible from the other two Raspberry Pis, which run Python scripts to handle the stored data. The first Raspberry Pi uses this data to provide a real-time preliminary analysis showing the current signals coming into the system, while the second works concurrently to ship the filtered data into the Cloud. When a test has been completed, all the analysis done at the Fog level is also transferred into the Cloud.

3.2.2 Second Design Iteration
An NVIDIA Jetson Xavier was used to optimize the data streaming processes, taking advantage of its computational power and faster memory read/write speeds compared to the Edge device. Data analysis scripts are run on the NVIDIA Jetson to parse, filter, and visualize the data. This Fog device further has the ability to retrieve commands from the Cloud to change acquisition parameters, start and stop acquisitions, and alter algorithms running on the Fog device. Alternative single-board computers are shown in Fig. 1 within the system architecture: the NVIDIA Jetson Nano, NVIDIA TX2, Banana Pi, and Odroid-C2 are alternatives to a Raspberry Pi and may be used depending on the application and computational needs (Fig. 5). The Cloud network consists of a WD My Cloud PR2100 NAS server and two other computers. The WD My Cloud runs on a GNU/Linux operating system and hosts a MongoDB server. MongoDB is a NoSQL database that uses a JSON document-style schema to store data; the schema is flexible and provides a useful data structure for sensor data. The incoming data from the onsite network is stored in MongoDB collections on the server. This data can then be accessed by the network computers to perform further analysis, which takes advantage of both the live data coming in from the Fog layer and the historical data already stored in the Cloud. The results of
Fig. 5 Generic IIoT Edge-Fog-Cloud real-time processing framework
the analysis done at the Fog layer can be displayed to a local end user, and they can also be uploaded back to the MongoDB server, thereby making the analysis results accessible from anywhere over an internet connection. In the implementation shown in this investigation, the link between the Cloud layer and the Fog device is achieved using Ethernet. Feedback from the Cloud to the Fog layer enables smart feature selection, which allows for even faster Cloud throughput rates. Grafana is used to visualize the data stored in the Cloud. The database architecture can vary based on the sensor data structure as well; other NoSQL database types can be used depending on the type of sensor data. For example, a document-store database such as CouchDB, a columnar-store database such as Amazon DynamoDB, or a key-value store database such as Cassandra or Apache HBase can be leveraged.

3.2.3 IIoT Data Structure
Omni-directional microphones attached to these manufacturing machines generate raw wave (*.wav) files. The sampling rate of these microphones is 48 kHz. On average, one hour of recording yields a wav file of about 300 MB. The audio waveform data is then preprocessed to retrieve features that serve as inputs to the machine learning algorithms (Fig. 6). The Librosa library provides audio feature extraction functionality for processing audio with Python. The audio data collected in the noisy equipment environment poses a major challenge of a high noise-to-signal ratio, so before processing the raw time-domain data, this noisiness should be addressed first. Audacity uses a "spectral gating" algorithm to suppress the noise in audio; a good implementation of this algorithm can be found in the noisereduce Python library, and the algorithm itself is described in the documentation of
Fig. 6 Data architecture in the IIoT framework
the noisereduce library. Especially in the case of waveform data, such as vibration or acoustic data, the raw signals are usually so large that it is not practical to use the whole raw signal in the diagnosis step. Instead, the output of the data processing step is a set of features that appropriately represent the raw data acquired from the equipment. As the data is transferred through the IIoT system, it is processed to optimize computational efficiency. Starting at the Edge, raw acoustic waveforms (Fig. 2) are recorded in the form of amplitude-versus-time signals. A storage filer is mounted on the Edge device, and the Edge software writes the audio files to this storage filer on an hourly basis. At the Fog layer, the storage filer is visible, low-level filtration is performed, and the resulting dataset is then uploaded to the Cloud MongoDB server. As part of feature extraction, features such as average power, spectral centroid, and zero-crossing rate are extracted into a *.txt format. At the Fog layer, the *.txt file is converted to Feather, a binary file format for storing data frames. Feather is a fast, lightweight file format created by Apache Arrow; it is language agnostic (e.g., R, Python) and has high read and write speeds. The Feather data frame is parsed and structured into BSON format. BSON, short for binary JSON, is a binary-encoded serialization of JSON-like documents; MongoDB stores documents as collections in the BSON format, which adds support for data types such as date and binary that are not supported in JSON. Once the data is in the Cloud, onsite computers can query it, and the queried data is output in the form of data frames. Figure 6 provides an overview of the transformation of data throughout the system, which constitutes the backbone of the overall IIoT approach presented in this article.
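As a concrete illustration of the noise suppression step described above, the following sketch uses the noisereduce library (assuming its 2.x API) together with librosa and soundfile; the file names are placeholders:

import librosa
import noisereduce as nr
import soundfile as sf

# One hourly raw recording from the Edge microphones (placeholder file name)
y, sr = librosa.load("pump_hour_001.wav", sr=None)

# Spectral gating; a noise-only clip (e.g., recorded while the pump is idle)
# can optionally be supplied via the y_noise argument.
clean = nr.reduce_noise(y=y, sr=sr)

sf.write("pump_hour_001_clean.wav", clean, sr)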
4 Experimentation and Results

4.1 Experimentation
Audio is the most apparent signal of mechanical failure; most faults manifest in this domain because the movement of components in the motor pumps creates friction. Clean dry air is a critical utility for semiconductor plants, and any interruption can be disastrous: reliability of the compressed air supply is key to semiconductor manufacturing processes. Monitoring the clean dry air compressor and notifying technicians early of any problem is vital to the health of the dry air supply. Installing microphones is an ideal way to listen and tell whether the bearings in the dry air pumps are degrading. This also happens to be a primary requirement when dealing with semiconductor facilities, as they need assurance that there will be no negative effect on the motors when a device is installed (Fig. 7). Microphones listen for signs of wear, that is, for variances to develop in the noises made by the machines, so that maintenance can be scheduled before anything breaks
Fig. 7 Edwards dry air pump for experimentation in semiconductor manufacturing floor
and causes downtime. Downtime, as you might imagine, is about the worst thing that can happen to a manufacturing facility. The underlying concept that motivated this research effort is that primary motion classes should have unique audio signatures that can be distinguished through pattern analysis algorithms such as machine learning classifiers. Once identified, a process can be tracked and monitored for wear and degradation, providing the basis for in-service remote monitoring (Fig. 8). One of our key IIoT smart manufacturing initiatives monitors wafer tool health by applying acoustic sensors to wafer processing machines to collect signals as wafers are polished. By comparing baseline sound fingerprints from a wafer's acoustic signals to sound fingerprints detected during polishing, we can recognize abnormal sounds and identify potentially problematic tool conditions. Omni-directional microphones pick up sound with equal gain from all sides or directions of the microphone; the idea of using an omni-directional microphone here is to record the signals from all directions with equal gain (Fig. 9).
Fig. 8 Microphone installation in the Edwards dry air pump
Fig. 9 Two microphones installed in the Edwards dry air pump
4.2 Results
We installed two microphones: one close to the dry air pump, and another opposite the pump. We followed a two-microphone technique to reduce noise: the primary microphone is closer to the desired source (here, the dry air pump), while the second microphone receives environmental noise. In a noisy environment, both microphones receive noise at a similar level, but the primary microphone receives the desired sounds more strongly. Thus, if one signal is subtracted from the other (in the simplest sense, by connecting the microphones out of phase), much of the noise is canceled while the desired sound is retained. This technique helps increase the signal-to-noise ratio. Other techniques may be used as well, such as using a unidirectional primary microphone to listen to only a specific portion of the pump's operation.
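A simplified numeric sketch of this subtraction idea follows; the alignment-by-cross-correlation and least-squares gain here are illustrative choices, not the exact processing used in the experiment:

import numpy as np

def two_mic_cancel(primary, secondary):
    """Subtract the aligned, gain-matched noise channel from the primary channel."""
    n = min(len(primary), len(secondary))
    p, s = primary[:n].astype(float), secondary[:n].astype(float)
    # Estimate the inter-channel delay (direct correlation; FFT-based methods
    # would be preferable for long recordings)
    corr = np.correlate(p, s, mode="full")
    lag = int(corr.argmax()) - (n - 1)
    s = np.roll(s, lag)
    g = np.dot(p, s) / (np.dot(s, s) + 1e-12)   # least-squares noise gain
    return p - g * s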
To facilitate large-scale, deep, and automatic data analytics, it is essential to carry out controlled data collection to prove the sensitivity of an acoustic sensor to the specific failures of interest, as well as to study which signal features to use (Fig. 10). Three experiments were carried out to study the sensitivity of the acoustic sensor on dry vacuum pumps. The first experiment studies malfunctioning during the pump
Fig. 10 a Normal production run without any scratches and b scratches with some noise. Y-axis represents the major Mel-frequency cepstral coefficient
startup; the second tests the pump shutdown, and the third tests the pump under normal operating conditions.
4.2.1 Data Analysis
The main objective of the IIoT framework in this investigation is to demonstrate its capability to apply a data mining process to real-time acoustic sensor data, as an example of datasets related to semiconductor EHM, and to apply machine learning methods for noise-robust diagnostics. In this machine learning approach, features of the acoustic sensor data are selected using a down-selection method based on feature correlation, which eliminates highly correlated features, accompanied by a principal component analysis (PCA). Once the number of principal components is identified, an unsupervised learning approach is used to cluster the data. In this investigation, a Gaussian mixture model (GMM) was used iteratively to compute the optimal clustering. The labels and historic data are then used to train a support vector machine (SVM) model, which is uploaded to the Fog and used for real-time data classification. To demonstrate this data mining approach, Fig. 11 shows actual acoustic sensor datasets recorded using microphone sensors, visualized using two of the more than twenty features that can describe the amplitude-versus-time waveforms shown in Fig. 2. More specifically, Fig. 11a, b and Fig. 11c, d portray raw acoustic signals with two selected features (amplitude and peak frequency) plotted for two different datasets, one of which is used for training and the other for testing in the classification approach described in this section (Figs. 11 and 12). Features extracted from the datasets shown in Fig. 11 were first normalized and then compared using a Pearson coefficient correlation matrix, as shown in Fig. 12. Out of the initial twenty features, eighteen were kept after the feature selection process was applied, removing features with greater than 90% correlation. The next step is feature reduction, which decreases the feature space by choosing the optimal number of principal components (PCs), found at the point where the residuals of the next principal component do not provide additional information to the principal component space. A user-defined threshold of 95% variance was used to determine the number of PCs describing the reduced feature space. The cut-off point for the training data is exhibited in Fig. 13: the seventh PC does not vary significantly in the principal component space and is eliminated along with every component after it, giving a reduced feature space of six components. These principal components are then used in the GMM, which was run iteratively over 600 iterations, testing the classification results achieved by grouping the data into one to six clusters. To mathematically determine the appropriate number of clusters, three different criteria were used, and the results are shown in Fig. 14. The clusters are then evaluated using the original feature space to relate classifications to specific acoustic data trends, which in this case are related to damage initiation and progression in the composite specimens used. The comparative criteria
Fig. 11 a Raw acoustic data visualized as amplitude versus time for the training dataset; b raw acoustic data visualized as peak frequency versus time for the training dataset; c raw acoustic data visualized as amplitude versus time for the testing dataset; d raw acoustic data visualized as peak frequency versus time for the testing dataset
Fig. 12 Pearson coefficient correlation matrix
Fig. 13 Variance per principal component

Fig. 14 Criteria used for cluster assessment
The comparative criteria used include the silhouette coefficient, Davies–Bouldin, and Calinski–Harabasz indices, and all indicated that the optimum number of classes in this case is two. Figure 15a and b portrays the clustering results for both the training and testing datasets in terms of the same two features across time as in Fig. 11. It can be seen that the classification visualized in Fig. 15 shows good separation between the two clusters in the amplitude versus time plots, while the two clusters are sufficiently distinct in the peak frequency versus time plots. Once the GMM clusters were defined, they were used to train a support vector machine (SVM) that can associate new signals in a live monitoring case with the clusters. SVM is a supervised learning method that, once trained, seeks the best possible way to separate data. The general idea is to produce hyperplanes that lie between classifications, so that if a new data point falls on one side of the plane, it is associated with a certain class, while the margin around the hyperplane is maximized. In this investigation, the SVM was trained using the training dataset shown in Fig. 11 as well as the optimal GMM clusters. In addition, the model was trained using a radial basis function kernel, ten iterations, and a scale gamma parameter. Fivefold cross-validation was used to validate the SVM results on the testing dataset, and the average accuracy score was 92.68% (Table 2; Fig. 16).
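The clustering and classification stage can be sketched as follows; scikit-learn is again assumed rather than taken from the chapter, the data is a random stand-in for the six-PC feature space, mapping the "600 iterations" to the GMM's max_iter is an interpretation, and the candidate cluster counts start at two because the silhouette coefficient is undefined for a single cluster:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.metrics import (silhouette_score, davies_bouldin_score,
                             calinski_harabasz_score)
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X_pc = rng.normal(size=(500, 6))     # stand-in for the reduced PC space

# Score candidate clusterings with the three criteria from the text
for k in range(2, 7):
    gmm = GaussianMixture(n_components=k, max_iter=600, random_state=0)
    labels = gmm.fit_predict(X_pc)
    print(k,
          silhouette_score(X_pc, labels),        # higher is better
          davies_bouldin_score(X_pc, labels),    # lower is better
          calinski_harabasz_score(X_pc, labels)) # higher is better

# Train an RBF-kernel SVM on the optimal (two-cluster) GMM labels and
# validate it with fivefold cross-validation
labels = GaussianMixture(n_components=2, max_iter=600,
                         random_state=0).fit_predict(X_pc)
svm = SVC(kernel="rbf", gamma="scale")
print(cross_val_score(svm, X_pc, labels, cv=5).mean())
```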
Table 2 Classification accuracy for the support vector machine model

Classification | Iteration 1 | Iteration 2 | Iteration 3 | Iteration 4 | Iteration 5 | Average cross-validation score
Support vector machine (SVM) | 89.17% | 91.39% | 93.84% | 93.37% | 92.68% | 92.68%
Fig. 15 a Clustering results across amplitude versus time for the training set; b clustering results across peak frequency versus time for the training set; c classification results across amplitude versus time for the testing set; d classification results across peak frequency versus time for the testing set
If $\hat{y}_i$ is the predicted value of the $i$th sample and $y_i$ is the corresponding true value, then the fraction of correct predictions over $n_{\text{samples}}$ is defined as

$$\text{Accuracy}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}}-1} 1(\hat{y}_i = y_i)$$
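In code, this metric is a one-liner (a minimal sketch, not tied to the chapter's implementation):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # -> 0.75
```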
4.3 Data Structure

During the experiment, data was ingested through the Edge and sent to the Fog and Cloud layers. At the Cloud, the data structure was finalized in a BSON format per document in the MongoDB server. Each document represents one signal; thus, a collection is built of many documents (Fig. 17).
Fig. 16 Confusion matrix for classifying against fault prediction
A bulk insertion method is then used to upload the rows of the Feather data frame as individual records. Multi-document transactions are used to upload multiple documents to the existing collection in MongoDB. As the type of material or conditions changes, different collections can be created. Other sensor data can also be parsed and stored as collections in the database. This database structure allows querying and retrieval for model training.
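A minimal sketch of this upload path, assuming the pymongo driver and the pandas Feather reader; the URI, database, collection, and file names are hypothetical since the chapter does not give them, and multi-document transactions additionally require MongoDB 4.0+ running as a replica set:

```python
import pandas as pd
from pymongo import MongoClient

# Hypothetical connection details (the chapter does not give the real ones)
client = MongoClient("mongodb://localhost:27017")
collection = client["iiot"]["acoustic_signals"]

# One BSON document per signal, bulk-inserted from a Feather data frame
df = pd.read_feather("signals.feather")          # hypothetical file name
collection.insert_many(df.to_dict(orient="records"), ordered=False)

# Multi-document transaction for appending to an existing collection
with client.start_session() as session:
    with session.start_transaction():
        collection.insert_many([{"signal_id": 1}, {"signal_id": 2}],
                               session=session)
```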
4.4 System Performance
Acoustic time-domain signal (time vs. amplitude) data was acquired at the Edge (using an RTX microphone) and stored on a network shared folder in the IoT device/Odroid cluster. As per our data acquisition settings, two new data files, namely a .DTA file (a Physical Acoustics proprietary format) and a *.wav file, are created every 4 min. The four-minute interval is based on the size of the raw incoming data. Incoming data is appended to the files every second for 4 min until a new batch of files is created.
4.4.1 Edge to Fog Data Throughput
The data transfer from the edge device to the fog system is critical in real-time data analysis. Using the PCI2, we recorded a throughput of 10 MB/s. A stopwatch was used to measure the time taken for live acoustic raw wav data to transfer from the Edge to the Fog. The stopwatch was started and stopped concurrently with the data acquisition procedure. This test was run three different times. Each time, the throughput was calculated by dividing the total size of the produced data (total .DTA file size) by the amount of time taken. The different throughput values were then averaged.

Fig. 17 Database structure used in the IIoT system

4.4.2 Fog to Cloud Data Throughput (Odroid Cluster)
Using the Odroid cluster, an overall average throughput of 13.2 MB/s was achieved (Table 3). For this test, all the data was already present on a shared folder in the Odroid cluster. The cluster was operating as described in Sect. 2.1. A timer was added to the MongoDB upload script and was used to measure the amount of time taken to process all the available data. The throughput was then derived by dividing the total size of the .DTA files by the total processing time. This process was run five separate times, as given in Table 3.
Table 3 Fog device throughput results

Size of data (MB) | Processing time (s) | Throughput (MB/s)
1415 | 103.9 | 13.61
1415 | 115.6 | 12.24
1415 | 104.8 | 13.5
1415 | 112.2 | 12.61
1415 | 101.9 | 13.88
4.4.3 Fog to Cloud Data Throughput (Xavier Jetson)
Using the Xavier Jetson instead of the Raspberry Pis, an overall average throughput of 44.4 MB/s was achieved. For this test, all the data was already present on the Jetson device (1.53 GB of data). The IoT device runs two scripts for this implementation. One script (preprocessing) handled parsing, filtering, and clustering; the other script (cloud portion) was in charge of uploading the data to the MongoDB server. The preprocessing portion produced a throughput of 68.3 MB/s, and the cloud upload portion produced a throughput of 44.4 MB/s. Timers were used in both scripts to measure the amount of time taken to process all the available data. The throughputs were then derived by dividing the total size of the data (.DTA files) by the amount of time taken. Four iterations of this process were performed, and the mean value of the throughput results was computed. Since the overall throughput of the system is throttled by its slowest performing piece, the final throughput was defined to be 44.4 MB/s.
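The throughput bookkeeping described in this section reduces to timing each stage and dividing the data size by the elapsed time; the sketch below illustrates this, with the stage rates taken from the figures reported above and everything else as scaffolding:

```python
import time

def throughput_mb_s(total_bytes, work):
    """Time one processing stage and return its throughput in MB/s."""
    start = time.perf_counter()
    work()                                   # parse/filter/cluster or upload
    elapsed = time.perf_counter() - start
    return total_bytes / elapsed / 1e6

# Stage rates reported in the text for the Xavier Jetson (MB/s)
stages = {"preprocessing": 68.3, "cloud_upload": 44.4}

# The end-to-end rate is throttled by the slowest stage in the pipeline
print(min(stages.values()))                  # -> 44.4
```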
5 Discussion
The investigation presented in this manuscript demonstrates the capability to stream live data through an IIoT framework for intelligent data quality management and data analysis. Each layer of the system (Edge, Fog, and Cloud) plays a role in consolidating, filtering, and structuring live data. The Edge layer is responsible for denoising analog signals. The Fog layer is responsible for performing diagnostics using an SVM model trained with classification labels derived by the GMM. Additionally, the Cloud layer can send feedback parameters based on its data quality analysis to the Fog; the Fog can then use this information to adjust its data processing parameters, thereby creating an intelligent data quality management scheme. The architecture also supports dynamic model improvement. The results presented showcase a machine learning model, an SVM, applied to streaming data, thus enabling smart feature selection and data reduction at the Fog before data is sent to the Cloud. Furthermore, as historic data is stored in the Cloud server, it is structured for simple and flexible retrieval, thus enabling continuous model training. The model can be further trained using high-performance computing (HPC) at the Cloud layer, potentially resulting in improved computing time. The organized data structure is essential to reduce the computational burden on the Cloud and provides an opportunity to integrate multi-sensor data, as NoSQL databases provide hardware and data flexibility. In terms of hardware, NoSQL databases can be partitioned across several servers, therefore providing an opportunity for horizontal scaling. Specifically, the BSON structure provides a custom key-value schema for the structural data collected. It is important to note that this IIoT framework can be applied to a variety of monitoring scenarios.
6 Conclusion
This paper showcases a system framework for live acoustic sensor data to be transmitted from the environment to the Cloud with an intermediary Fog layer. The Fog layer acts as a preprocessing layer that filters, structures, and enriches the incoming data and runs diagnostics on real-time acoustic signals. Although this architecture is presented for equipment health monitoring applications, it can be applied to other IoT applications such as smart cities, advanced manufacturing, and autonomous vehicles. Furthermore, alternative algorithms at the Fog, leveraging streaming algorithms and additional quality metrics within the live framework, can be implemented. The acoustic sensor automatic analytics system with big data machine learning was developed and deployed, and applied in semiconductor manufacturing dry compressor health monitoring. Both experiments and production deployment results demonstrated that the system can detect various types of pump abnormalities. Audio-based condition monitoring often relies on the assumption that the equipment has a characteristic sound signature, which stays fairly constant when the condition of the equipment and its environment stay the same. A MongoDB document store database is implemented to provide real-time data streaming transfer from the Edge to both the Fog and Cloud. Through this database, the algorithms used are capable of executing filtering by classification at the Fog level as live data is recorded. Afterward, the feature-extracted data is automatically uploaded to the Cloud for further data analytics and visualization. The system integration with three layers provides an opportunity to create a paradigm for intelligent real-time data analysis.
Chapter 54
Sentential Negation Identification of FMRI Data Using k-NN Ashish Ranjan, Anil Kumar Singh, Anil Kumar Thakur, Ravi Bhushan Mishra, and Vibhav Prakash Singh
1 Introduction

Scientists have been investigating the brain for decades to answer questions related to the functioning of language in the brain. The neural activation underlying linguistic phenomena has been studied based on neuroimaging data for the last two decades. Different brain areas dedicated to specific linguistic characteristics have been detected using various neuroimaging techniques like fMRI, PET, EEG, and MEG. fMRI and PET measure the change in hemodynamic response corresponding to given stimuli, whereas EEG and MEG measure electrical and magnetic signals for real-time processing. fMRI is the most popular technique to investigate the functional brain for both experimental and clinical purposes; it measures the change in blood oxygenation level in the investigation of brain function using magnetic signals. Syntactic comprehension is a central aspect of human language and differs in nature from other features of language such as morphology and semantics [1, 2].
The study of the representation of sentence structure, i.e., syntax, in the brain based on neuroimaging data has been enriched with new findings corresponding to different aspects of sentence types. The study of the syntactic brain grounded in fMRI data covers five types of sentence arrangements in the literature: (A) syntactic violations, (B) complex versus simple sentences, (C) sentences versus word lists, (D) sentences containing pseudo-words, and (E) sentences having different agreements and types. The analysis of syntax in the brain is the most important aspect of neuro-linguistic studies. Sentential negation is the study of the representation and impact of affirmative and negative sentences in the brain. Affirmative and negative sentences are studied from two viewpoints: one is that negative sentences have a more complex structure than affirmative ones due to negation-marking words, and hence their processing complexity is high. The other view is that the affirmative form is processed first, and the negative counterpart is then built up in the brain. Different types of negative sentences are represented differently in the brain, depending upon whether they contain bipolar or contradictory predicates.
2 Related Work

Negative sentences contain an extra syntactic entity for negation; hence, they are more complex than their affirmative counterparts in terms of syntactic complexity [3]. Processing of negative sentences is assumed to be a two-step process in which the affirmative form is processed first and the negative version is then represented in the brain, whereas affirmative sentences are processed directly [4]. The additional syntactic transformation is reflected in greater cortical activation for negative sentences. The left posterior temporal gyrus as well as the bilateral parietal area shows greater activation for negative compared to affirmative sentences [5]. In a study of a Japanese–English sequential bilingual paradigm with a target-probe matching task, significant activation in the left pre-central gyrus and left temporal gyrus was observed for negative sentences when candidates were tested in English as their second language [6]. Increased activation in the left-hemisphere perisylvian and parietal areas was observed for abstract negative sentences, but a partial deactivation was found for action-related negative sentences in the left pallidum [7]. In [8], for the Danish language, increased activation in the left premotor cortex (BA6) was found for negative sentences. That study proposes that the bilateral inferior parietal area (SMG, BA40) for semantic processing, the left premotor cortex (BA6) for syntax, and the left inferior frontal gyrus for computation of syntactic complexity are the three most prominent cortical areas in the processing of negative sentences. In [9], an fMRI study of sentences having single and double negation in the German language found that the left pars opercularis (BA44), left pars triangularis (BA45), left superior temporal gyrus (STG, BA42), and left supplementary motor area (SMA, BA6) are functionally associated in the processing of main-clause negation. The inferior frontal gyrus (IFG) plays the most important role in the coordination of the cortical areas responsible for logical reasoning and language processing in the interpretation of negative sentences. In [10], the authors
have examined fMRI data for activation patterns of the brain in the processing of negative/affirmative sentences of Hindi and found that the common cortical regions involved in processing include the left parietal cortex (BA40), bilateral inferior frontal gyrus (IFG), bilateral supplementary motor area (SMA, BA6), left fusiform (BA37), bilateral occipital area (BA17/18), and bilateral temporal gyrus (BA21). In addition, the study of fMRI data shows that the anterior temporal pole is dedicated to the processing of negative sentences. In [11], the authors studied the neural activity of affirmative and negative sentences using NeuCube [12], implementing classifiers with proper training to classify the neural activity patterns of negative and affirmative sentences in the brain. Behroozi and Daliri [13] classified voxels from the RDLPFC area and achieved an accuracy of almost 75% for all subjects on the same dataset; they used F-score-based feature selection and SVM classification techniques on the selected attributes. In [14], the authors analyzed sentence polarity exposure using a multilayer perceptron. In this paper, we analyze the Star Plus [15] data, available online, to classify the cognitive state of the brain based on the polarity of sentences. We extract the sentence-processing fMRI data, select the contributing feature values using an info-gain feature selection approach, and then apply a k-NN classifier to classify the brain state; our model recognizes with 90% accuracy whether a person is reading a negative sentence or an affirmative sentence.
3 Dataset Description

The dataset used in this paper is known as the Star Plus dataset, which was collected by Marcel Just and his colleagues at Carnegie Mellon University. The fMRI signals were collected over a grid of 64 × 64 × 8 voxels throughout the experiment. The data was collected from six healthy subjects. Each subject performed a sequence of activities during the entire session of fMRI recording. The activities include showing a picture and a description of it in sequence and then pressing a yes or no button depending on whether the sentence correctly describes the picture. This event-related experiment consists of trials. One trial lasts 27 s, and fMRI data was recorded for the whole brain at 500 ms intervals; hence, a total of 54 images of the brain were recorded in a specific trial of the experiment. At the start of the experiment, a picture or sentence was presented on the screen for 4 s, and then a blank screen was shown for 4 s. After that, again for 4 s, a picture or sentence was shown, during which a yes or no button was to be pressed depending on whether the sentence correctly describes the picture. If the first stimulus shown was a picture, the second item displayed was a sentence, and vice versa. The picture-first, sentence-after data was termed the PS dataset, and the data where the sentence was the first stimulus was termed the SP dataset. In half of the trials, the picture was presented first, and the remaining half presented the sentence as the first stimulus. The pictures presented in this experiment were geometrical arrangements of $, *, and +. However, the sentences
presented were descriptions of the picture; half of the sentences were affirmative, and the remaining half were negative. The sentences were simple descriptions of a picture like "it is true that the dollar is above the star" or "it is not true that the star is above the plus." A total of 40 trials were presented to each subject. Each trial consists of around 5000 voxels per image, so a total of 270,000 voxels for one trial of the experiment. The whole brain is divided into 25–27 regions of interest (ROIs) in which different voxel activity patterns reside.
4 Model and Result Discussion

The proposed model has five functional steps: 1. data extraction for sentence processing, 2. mean calculation for selected tuples, 3. feature selection, 4. classification, and 5. prediction. The work-flow diagram of the proposed model is given in Fig. 1. The Star Plus dataset is used in this paper in the analysis of the sentential negation task. Data for sentence processing is extracted from t = 8 to t = 16 s for the case when the picture was the first stimulus, and from t = 0 to t = 8 s when the sentence was the first stimulus. A total of 16 brain images were extracted for each trial.

$$\text{fMRI-sequence}(t, t+8) \rightarrow \{\text{Affirmative/Negative sentence}\} \tag{1}$$
The extracted fMRI data has dimensions of 16 × 5000 for each trial. To reduce the dimensionality of the data, the mean of the extracted data is calculated in accordance with Eq. 2.

Fig. 1 Work-flow diagram of proposed model: data extraction for sentence processing → mean calculation of selected tuples → info-gain feature selection → k-NN classification → prediction
$$\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}, \quad \forall j \in \text{columns} \tag{2}$$
where $x_{ij}$ is the entry in the $i$th row and $j$th column. After the mean calculation, the prominent feature set is obtained using info-gain feature selection [16, 17], which measures the contribution of each feature in decreasing the overall entropy. The entropy, H(X), is defined as follows:

$$H(X) = -\sum_i P_i \log_2(P_i) \tag{3}$$
where $P_i$ is the probability of class $i$ in the dataset. A good attribute is one that contains the most information, i.e., reduces most of the entropy. The contributing feature set is fed to a k-NN classifier to classify the brain state [18]. k-NN is a non-parametric, lazy learning algorithm which uses feature similarity to predict the values of new data points [19]. A new data point is assigned a value based on its similarity with the data points in the training data. The steps of the k-NN algorithm are as follows:

1. Load the training as well as the test data.
2. Choose the value of K (the number of neighbors) and initialize it.
3. For each data point in the test data:
   3.1. Calculate the Euclidean distance between the test data point and each row of the training data.
   3.2. Add the index and the distance of the test data point to an ordered collection.
4. Sort the ordered collection of indices and distances in ascending order.
5. Pick the top K rows from the sorted collection.
6. Get the labels of the selected K entries.
7. Return the mode of the K labels.
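A compact sketch of the selection and classification steps, using scikit-learn's mutual-information scorer as a stand-in for the info-gain ranking; the data here is a random placeholder shaped like the 40-trial mean-voxel matrix, and the value of k is illustrative since it is tuned per subject:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5000))   # 40 trials x ~5000 mean voxel values
y = rng.integers(0, 2, size=40)   # 0 = affirmative, 1 = negative

# Rank voxels by information gain (mutual information with the label)
# and keep the top twenty, as in the text
X_top = SelectKBest(mutual_info_classif, k=20).fit_transform(X, y)

# Euclidean-distance k-NN; the value of k is tuned per subject
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
print(cross_val_score(knn, X_top, y, cv=5).mean())
```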
During analysis of the results, the data for sentence processing is first extracted, and the mean is then calculated to minimize the matrix size. The feature set is selected using the info-gain attribute evaluator, which selects the most contributing feature voxels. Selecting the top twenty features from the ranked feature set, classification is done using the k-NN classifier. The results obtained from the classifier are summarized in Table 1. The highest classification accuracy is obtained for subject 05710, whereas the least classification accuracy is found for subject 04799. To obtain optimal classification accuracy, the value of k, i.e., the number of neighbors, is tuned properly in the classification. The confusion matrix describes the performance of the classifier; the confusion matrix for the above classification results is given in Table 2. The average results obtained over all subjects are plotted in Fig. 2, and the average values of the different result parameters are given below.
Table 1 Result obtained for all subjects

Subject id | TP rate | FP rate | Precision | Recall | F-measure | ROC area | Accuracy
04799 | 0.85 | 0.15 | 0.85 | 0.85 | 0.85 | 0.873 | 0.85
04820 | 0.9 | 0.1 | 0.9 | 0.9 | 0.9 | 0.946 | 0.90
04847 | 0.875 | 0.125 | 0.884 | 0.875 | 0.874 | 0.889 | 0.875
05675 | 0.925 | 0.075 | 0.935 | 0.925 | 0.925 | 0.925 | 0.925
05680 | 0.9 | 0.1 | 0.917 | 0.9 | 0.899 | 0.943 | 0.90
05710 | 0.95 | 0.05 | 0.955 | 0.95 | 0.95 | 0.99 | 0.95
Table 2 Confusion matrix (rows: actual class; columns: predicted class)

Subject id | Actual class | Predicted Affirmative | Predicted Negative
04799 | Affirmative | 17 | 3
04799 | Negative | 3 | 17
04820 | Affirmative | 18 | 2
04820 | Negative | 2 | 18
04847 | Affirmative | 16 | 4
04847 | Negative | 1 | 19
05675 | Affirmative | 20 | 0
05675 | Negative | 3 | 17
05680 | Affirmative | 20 | 0
05680 | Negative | 4 | 16
05710 | Affirmative | 20 | 0
05710 | Negative | 2 | 18
Fig. 2 Average of results obtained for all subjects
The averages are: TP rate 0.9, FP rate 0.1, precision 0.906833, recall 0.9, F-measure 0.899667, ROC area 0.927667, and accuracy 0.9.
5 Conclusion

In this paper, we have analyzed the processing of affirmative and negative sentences in the brain. The Star Plus dataset, available online, is used for sentence polarity detection. Data for sentence processing was extracted, and the mean of the data thus obtained was calculated for each trial. Using this reduced data, we have proposed a model for detection of sentential negation using k-NN classification and the info-gain feature selection technique. The proposed model obtained an accuracy of 90% in classifying the brain state during sentence processing. The obtained result is significantly higher than that of other methods which use the same dataset for a similar task.
References

1. Kaan E, Swaab TY (2002) The brain circuitry of syntactic comprehension. Trends Cogn Sci 6(8):350–356. https://doi.org/10.1016/s1364-6613(02)01947-2
2. Mayo R, Schul Y, Burnstein E (2004) "I am not guilty" vs "I am innocent": successful negation may depend on the schema used for its encoding. J Exp Soc Psychol 40(4):433–449
3. Haegeman L (1995) The syntax of negation. Cambridge University Press, Cambridge
4. Zwaan RA (2012) The experiential view of language comprehension: how is negation represented. Higher level language processes in the brain: inference and comprehension processes, p 255
5. Carpenter PA, Just MA, Keller TA, Eddy WF, Thulborn KR (1999) Time course of fMRI activation in language and spatial networks during sentence comprehension. Neuroimage 10(2):216–224
6. Hasegawa M, Carpenter PA, Just MA (2002) An fMRI study of bilingual sentence comprehension and workload. Neuroimage 15(3):647–660
7. Tettamanti M, Manenti R, Della Rosa PA, Falini A, Perani D, Cappa SF, Moro A (2008) Negation in the brain: modulating action representations. Neuroimage 43(2):358–367
8. Christensen KR (2009) Negative and affirmative sentences increase activation in different areas in the brain. J Neurolinguist 22(1):1–17
9. Bahlmann J, Mueller JL, Makuuchi M, Friederici AD (2011) Perisylvian functional connectivity during processing of sentential negation. Front Psychol 2:104
10. Kumar U, Padakannaya P, Mishra RK, Khetrapal CL (2013) Distinctive neural signatures for negative sentences in Hindi: an fMRI study. Brain Imaging Behav 7(2):91–101
11. Doborjeh MG, Capecci E, Kasabov N (2014) Classification and segmentation of fMRI spatiotemporal brain data with a NeuCube evolving spiking neural network model. In: 2014 IEEE symposium on evolving and autonomous learning systems (EALS). IEEE, pp 73–80
12. Kasabov NK (2014) NeuCube: a spiking neural network architecture for mapping, learning and understanding of spatio-temporal brain data. Neural Netw 52:62–76
13. Behroozi M, Daliri MR (2015) RDLPFC area of the brain encodes sentence polarity: a study using fMRI. Brain Imaging Behav 9(2):178–189
14. Ranjan A, Singh VP, Singh AK, Thakur AK, Mishra RB (2020) Classifying brain state in sentence polarity exposure: an ANN model for fMRI data. Rev Intell Artif 34(3):361–368. https://doi.org/10.18280/ria.340315
15. http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-81/www/
16. Azhagusundari B, Thanamani AS (2013) Feature selection based on information gain. Int J Innov Technol Expl Eng (IJITEE) 2(2):18–21
17. Lee C, Lee GG (2006) Information gain and divergence-based feature selection for machine learning-based text categorization. Inf Process Manag 42(1):155–165
18. Pernkopf F (2005) Bayesian network classifiers versus selective k-NN classifier. Pattern Recogn 38(1):1–10 19. Saini I, Singh D, Khosla A (2013) QRS detection using K-Nearest Neighbor algorithm (KNN) and evaluation on standard ECG databases. J Adv Res 4(4):331–344
Chapter 55
Automation Soil Irrigation System Based on Soil Moisture Detection Using Internet of Things
Poonam Dhamal and Shashi Mehrotra
1 Introduction

In India, 50–60% of the economy depends on agriculture [1]. Agricultural development is important because a major share of India's income depends on agriculture, which is the primary occupation in the country. Water is a basic requirement of all living creatures, and agriculture is one of the fields where water is required in tremendous quantity. Nowadays, water shortage is becoming one of the most important issues in the world; water is needed in every field and is essential in our day-to-day life. Table 1 [2] presents the annual availability of water year-wise. Wastage of water is a major problem in agriculture. Due to unplanned use of water, the groundwater level is decreasing day by day, and lack of rain and scarcity of surface water further decrease the volume of water on earth. Due to the increase in population and industries, the availability of natural water has been decreasing, which is a matter of serious concern, and understanding the use of water in agriculture is very much required [3]. Most of the time, excess water is given to the fields. There is a great need to modernize conventional agricultural practices for better productivity. In terms of environmental benefits, IoT-based smart farming [4] will bring advantages including more efficient water usage and optimization of inputs and treatments. A smart agriculture solution can be beneficial in various ways, such as crop irrigation: optimal water efficiency minimizes losses due to evaporation, runoff, or subsurface drainage while maximizing production. Soil moisture sensors [5] can be used to check the water content available in the soil and the water requirement of the crop.
Table 1 Annual per capita water availability

Year | Annual per capita water availability (m³)
1947 | 6042
2001 | 1816
2011 | 1545
2025 | 1340
2050 | 1140
The smart irrigation system [6–8] is an IoT-based device that senses the water requirement by analyzing the wetness of the soil. The automatic irrigation system [9] automatically switches the pump based on the moisture level of the soil. Proper use of the irrigation system is extremely necessary owing to the shortage of reserved land water caused by lack of rain, and because unplanned use results in a large amount of water being wasted. An automatic plant watering system is very helpful in all atmospheric conditions and can help farmers realize high profit. Automatic irrigation is the use of a device to operate irrigation structures so that the flow of water can change from one bay, or set of bays, to another in the absence of the irrigator [10]. The smart irrigation system is controlled by hardware, and farmers receive real-time data such as the status of the moisture content in the soil, with the flow controlled using the hardware. This paper presents a smart irrigation monitoring and controlling system that reduces the usage of water in agriculture without affecting production. The proposed system aims to monitor environmental conditions such as temperature, soil moisture content, air humidity, and the water level of agricultural land for controlling the irrigation. The real-time condition data can be sent to the cloud server and stored for decision making and future control actions. The paper is organized as follows: Sect. 1 covers the introduction of the topic. Section 2 discusses the literature survey. Hardware requirements, methodology, and features of the proposed system are discussed in Sect. 3. Section 4 contains the conclusion and future scope.
2 Literature Survey

Mat and Kassim [11] presented the greenhouse management system (GHMS), which is based on wireless sensor network (WSN) technology. GHMS scans the status of the soil within the greenhouse by employing a moisture sensor and a humidity sensor. Based on the sensor readings, GHMS automatically decides to switch ON or OFF devices such as the pump for irrigation and the fan for air circulation.
Rajalakshmi and Mahalakshmi [12] presented IoT-based crop-field monitoring and irrigation automation. They proposed an automated irrigation system which monitors the crop field using soil moisture, temperature, humidity, and light sensors. The information from the sensors is forwarded wirelessly to a Web server and encoded in JSON format in the Web server database. The irrigation is automated if the moisture and temperature of the field fall below a certain level. Farmers can monitor their field ubiquitously because notifications are sent to their registered mobile numbers periodically. Chidambaram and Upadhyaya [13] presented automation in drip irrigation using IoT devices. The sensor data is stored in a database. The Web application is designed to analyze the received data and check it against the threshold values of moisture, humidity, and temperature: if the soil moisture is less than the threshold value, the motor is switched ON, and if the soil moisture exceeds the threshold value, the motor is switched OFF. Kodali and Sahu [14] presented IoT-based soil moisture monitoring on the Losant platform. Rovers are used for measuring the moisture value in the field; a soil moisture sensor is placed at one end of the rover, the rover is actuated with the help of a DC motor, and it is completely powered by a solar-powered battery (16 V). An Android app is designed specifically for this device, allowing the user to control the entire system remotely. The data is uploaded to the cloud by means of either wireless or wired communication connected to a Wi-Fi router. The soil moisture sensor is attached to the rover by wired communication and is used to measure the volumetric water content of the soil; the sensor is actuated together with the rover, simultaneously, at the time of simulation. Villani and Castaldi [15] presented a system in which a NodeMCU board acts as a client and publishes the sensor data to the Losant Message Broker using the MQTT protocol on the topic losant/DEVICE ID/state. To interpret the messages, a defined JSON-based payload must be followed. The published data is automatically stored in Losant and made available in the visualization tool of the Losant platform. The complete hardware and software setup was done to monitor the soil moisture of the field. The SWAMP project develops a platform for precision irrigation that consists of an IoT solution for monitoring farming and irrigation systems. The platform includes drone-based data collection from the fields, a cloud-based data collection tool, and a data analytics solution for analyzing the water needs of plants and irrigation requirements. The toolbox is aimed at supporting irrigation planning and water distribution both at farm and district level. Kim et al. [16] present a system that comprises five in-field sensing stations distributed across the whole field. The components are an irrigation management station and a base station. The in-field sensing stations monitor field conditions of soil moisture, soil temperature, and air temperature, and a nearby weather station monitors micro-meteorological data in the field, i.e., air temperature, relative humidity, precipitation, wind speed, wind direction, and solar radiation.
All in-field sensory data is wirelessly transmitted to the base station.
The base station processes the in-field sensory data through a simple decision-making program and sends control commands to the irrigation management station. The irrigation management station updates and sends geo-referenced locations of the machine, from a differential GPS mounted on the cart, to the base station for real-time monitoring and control of the irrigation system. Based on sprinkler-head GPS locations, the base station feeds control signals back to the irrigation control station to site-specifically operate individual sprinklers to apply a depth of water.
3 Proposed System

This paper proposes a smart irrigation monitoring and controlling system that reduces the usage of water in agriculture without affecting production. The proposed system is developed to monitor different conditions such as the water level of agricultural land, air humidity, temperature, and soil moisture content for controlling the irrigation. The plant watering system includes a soil moisture sensor that checks the moisture level in the soil. The water pump is automatically switched off once the system finds enough moisture in the soil, and if the moisture level is low, the hardware switches on the pump to supply water to the plant. A message is sent to the user via GSM mobile whenever the system switches the pump ON or OFF, along with updated information on the water pump and soil moisture. The plant irrigation system uses soil moisture sensor probes for sensing the soil moisture level. A copper-clad board is cut and etched to prepare the board. An Arduino is used to control the automatic plant watering process. An LED is used in the sensor circuit: if the LED is ON, the presence of moisture in the soil is indicated, and if it is OFF, the absence of moisture is indicated. The system sends SMS to the user via the GSM module, and an optional LCD is also used for displaying status and messages. The plant irrigation system is completely automated and does not require manpower to control it, as the required alert messages are sent to the user's cell phone. The moisture sensor is placed in the soil; it takes the soil as input and gives output as a measure of the moisture content. Soil moisture is continuously monitored by the sensor, and the output values are stored in a database. If moisture is present in the soil, there is conduction between the two probes of the soil moisture sensor. The hardware reads a low signal and sends an SMS to the user stating "soil wetness detected; water pump remains in OFF state." Once there is no moisture in the soil, the transistor output goes high, and the hardware activates the water motor and also sends a message to the user stating "low soil wetness detected; the motor turned ON." The motor shuts down automatically once there is sufficient moisture in the soil.
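The threshold logic just described can be summarized in a short simulation. The sketch below is plain Python rather than the Arduino firmware the system actually runs; the sensor read, relay action, SMS call, and the 40% threshold are all placeholders:

```python
import time

THRESHOLD = 40  # assumed moisture threshold (%); user-defined in the system

def read_moisture_percent():
    """Placeholder for the soil moisture sensor reading (0-100%)."""
    return 35

def send_sms(text):
    """Placeholder for the GSM module; prints instead of texting."""
    print("SMS:", text)

pump_on = False
for _ in range(3):                    # a few polling cycles for illustration
    moisture = read_moisture_percent()
    if moisture < THRESHOLD and not pump_on:
        pump_on = True                # in hardware: drive the relay HIGH
        send_sms("Low soil wetness detected. The motor turned ON.")
    elif moisture >= THRESHOLD and pump_on:
        pump_on = False               # in hardware: drive the relay LOW
        send_sms("Sufficient soil wetness. Water pump turned OFF.")
    time.sleep(1)                     # poll the sensor every second
```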
Fig. 1 Soil moisture detector
3.1 Hardware Requirement

Our proposed model consists of the following components: Arduino Uno, GSM module, BC547 transistors (2), connecting wires, 16 × 2 LCD (optional), 12 V 1 A power supply, 12 V relay, water pump, soil moisture sensor, resistors (1k, 10k), variable resistors (10k, 100k), terminal connector, and LM317 voltage regulator IC.
3.1.1 Soil Moisture Detector
The soil moisture sensor measures the water content in the soil and helps to indicate the moisture level of the soil. When the moisture level is good, plant growth will also be good; the moisture level is vital for any type of soil. Soil moisture sensing is employed for remotely monitoring the water content in agriculture. This sensor is connected to the board and kept in the soil; it checks the soil, and we obtain the readings. Figure 1 presents the soil moisture detector.
3.1.2 Arduino Uno Board
The Arduino Uno is one of the most used microcontrollers in the industry. It is very simple to handle, convenient, and easy to use. Coding this microcontroller is very easy, and the uploaded program is retained thanks to its non-volatile flash memory.
Fig. 2 Arduino Uno board
Fig. 3 GSM
The applications of this microcontroller cover a wide variety of domains like security, home appliances, remote sensors, and industrial automation. This microcontroller can also be connected to the Internet and act as a server. The Arduino Uno board is presented in Fig. 2.
3.1.3 GSM
Global System for Mobile Communications (GSM) services are a standard application for mobile phones. These features are available to phone subscribers all over the world. GSM lets the user view the readings directly on a mobile phone whenever needed (Fig. 3).
3.1.4 Wi-Fi Module
The ESP8266 Wi-Fi module is an inexpensive module used to interface with microprocessors. It has 96 KB of data RAM as well as 64 KB of instruction RAM. The Wi-Fi module is presented in Fig. 4.
Fig. 4 Wi-Fi module
3.2 Methodology

The smart irrigation system automatically detects the soil moisture via the soil moisture sensor, and based on that, the microprocessor gives commands. If the soil moisture level is less than the threshold level set by the user, the microprocessor turns the pump ON automatically. If the soil moisture level is greater than the threshold level set by the user, the microprocessor turns the pump OFF automatically. It sends information about the pump and the moisture percentage to the user via SMS. It also sends the information to a Web site so that the user can see the values and get all the details (motor info, moisture percentage, etc.). The system can be implemented to examine the quality of the soil and the growth of the crop in each soil with the help of moisture. The soil moisture sensor senses the moisture content of the soil. The hardware is implemented by collecting the data and the information that is required for the soil. The connections are made on the board; the sensors required for the moisture check are collected, and their levels are checked. The sensors are verified, and the data is processed through the microcontroller and viewed with the help of GSM. All the data is collected, all the connections are made on the board, and the functionality is checked and analyzed. Figure 5 presents the work flow of our model.
3.3 Features of Proposed System

The proposed model may help farmers improve their livelihoods by permitting a more economical use of inputs, such as:
Fig. 5 Flowchart of proposed system
1. Water: Efficient irrigation permits farmers to (i) use less water to grow the same amount of crops, (ii) farm larger areas of land much more productively using the same amount of water, or (iii) use the same amount of water to grow higher-value, more water-intensive crops.
2. Fertilizer: Efficient irrigation reduces the amount of fertilizer required per plant, as nutrients are dissolved in the irrigation water for uniform application, reduced waste, and lower labor input.
3. Energy: Efficient irrigation reduces energy use because less water is required for a comparable area of irrigation, which in turn needs less energy for pumping. Once automated, farmers are also able to easily and safely irrigate crops during times of fewer power disruptions (i.e., at night).
4. Labor: Efficient irrigation decreases the amount of time needed for providing water to a crop area thanks to the regulated flow of water in the irrigation operation. This indirectly reduces time spent on weeding and applying fertilizer.
4 Conclusion and Future Scope

The proposed system is incredibly useful for farms, gardens, and homes. There is no need for any human intervention because the system is totally automated. On receiving the input, the controller node checks it against the required soil moisture value, and a notification SMS is sent to the registered mobile phone. When the soil moisture in a specific field is not up to the required level, the controller node activates the motor to irrigate the associated field.
In future work, deep learning and nature-inspired algorithms can be used to enhance the efficiency of the model.
References 1. Himani (2014) An analysis of agriculture sector in Indian economy. OSR J Hum Soc Sci (IOSR-JHSS) 19(1):47–54 2. Vincent CJ, Supate AR (2017) Efficient wastewater management for sustainable development: challenges and prospects an Indian scenario. Int J Health Sci Res (IJHSR) 7:276–289 3. Ines AV, Honda K (2006) Combining remote sensing-simulation modeling and genetic algorithm optimization to explore water management options in irrigated agriculture. Agric Water Manag 83:221–232 4. Sandeep M, Nandini C (2018) IOT based smart farming system. Int Res J Eng Technol (IRJET) 5(09):1033–1036 5. Athani S, Tejeshwar CH (2017) Soil moisture monitoring using IoT enabled Arduino sensors with neural networks for improving soil management for farmers and predict seasonal rainfall for planning future harvest in North Karnataka—India. In: 2017 international conference on I-SMAC (IoT in social, mobile, analytics and cloud) (I-SMAC). IEEE, pp 43–48 6. Gori A, Singh M (2017) Smart irrigation system using IOT. Int J Adv Res Comput Commun Eng (IJARCCE) 6(9):213–216 7. Sahu CK, Behera P (2015) A low cost smart irrigation control system. In: 2nd international conference on electronics and communication systems (ICECS), pp 1146–1152 8. Nandhini R, Poovizhi S (2017) Arduino based smart irrigation system using IOT. In: 3rd national conference on intelligent information and computing technologies, IICT 9. Suganya P, Subramanian A (2017) Automatic irrigation system using IoT. Int Res J Eng Technol (IRJET) 4(10):1571–1574 10. http://agriculture.vic.gov.au/agriculture/farm-management/soil-and-water/irrigation 11. Mat I, Kassim MRM (2016) IoT in precision agriculture applications using wireless moisture sensor network. In: IEEE conference on open systems (ICOS), pp 24–29 12. Rajalakshmi P, Mahalakshmi S (2016) IOT based crop-field monitoring and irrigation automation. In: 10th international conference on intelligent systems and control (ISCO), pp 1–6 13. Upadhyaya V, Chidambaram RM (2017) Automation in drip irrigation using IOT devices. In: Fourth international conference on image information processing (ICIIP), pp 1–5 14. Kodali R, Sahu A (2016) An IoT based soil moisture monitoring on Losant platform. In: 2nd international conference on contemporary computing and informatics (IC3I), pp 764–768 15. Villani G, Castaldi P (2018) Soil water balance model CRITERIA-1D in SWAMP project: proof of concept. In: 23rd conference of open innovations association FRUCT, p 54 16. Kim Evans (2008) Remote sensing and control of an irrigation system using a distributed wireless sensor network. IEEE Trans Instrum Meas 57(7):1379–1387
Chapter 56
Threat Detection on UDP Protocols Using Packet Rates in IoT T. Subburaj and K. Suthendran
1 Introduction

These days, a consistent number of attacks are happening worldwide. Figure 1 shows the attack chart for the period Dec 2019 to March 2020 [1]. On March 26, 2020, the following countries were affected by attacks: Mongolia, Indonesia, Nepal, Dominican Republic, and Argentina [1]. The attack trends in India during the last 30 days cover banking Trojans, botnets, cryptominers, mobile malware, and ransomware, as shown in Fig. 2. In India, many people have access to 3G and 4G wireless networks. At present, data traffic is high on 4G technologies as they support high-speed video calls. Wireless or Wi-Fi devices are at risk from a variety of different cyberattacks: peer-to-peer attacks, eavesdropping, encryption cracking, authentication attacks, MAC spoofing, wireless hijacking, and denial of service.
2 Literature Survey

Nair and Gautam [2] proposed an intrusion detection system based on entropy variation and the J48 algorithm. The network flow variation is estimated in terms of entropy, and the classification of normal and abnormal flow is performed
Fig. 1 Attack chart on Dec 2019 to March 2020
Fig. 2 Last 30 days attacks in India
based on the J48 algorithm. The attack detection is impressive in terms of accuracy. Kaur et al. [3] proposed an attack detection approach based on a self-similarity method. The variations in the packet flow are considered as a key measure. Wavelet coefficients are used for flow segregation, and the Hurst exponent and threshold calculation improve its accuracy in attack detection. Shojaei et al. [4] extended entropy estimation as a measure to sense DDoS attacks in WiMAX and ended up with reasonable results. Akyazi and Uyar [5] introduced the DDoS attack types in mobile agents; the SNORT method is used to notify of attacks occurring on mobile devices. Cusack and Tian [6] also proposed
an attack detection approach for mobile phone services. The hop count and ICMP method are deployed to sense and backtrack slow-rate attacks on mobile devices. To sense the tricky threats of wireless sensor networks, Singh et al. [7] proposed a wormhole-resistant hybrid technique; with the help of the Matthews correlation coefficient calculation, the wormhole attacks are identified. Toklu and Simsek [8] explained a unique approach to catch high-rate and low-rate DDoS attacks. The authors experimented with two filtering approaches, viz. detection with average filter (DAF) for high-rate detection and detection with discrete Fourier transform (DDFT) for low-rate attack detection, with zero false positives and false negatives. Udhayan and Hamsapriya [9] investigated a better technique to nullify the false detection of DDoS attacks based on a statistical segregation technique. The segregation technique is deployed to categorize the usual and unusual flows; further, it supports the prediction of low-rate and increasing-rate attacks. Yu et al. [10] examined a flow correlation coefficient approach based on flash crowds to sense the attacks. In general, valid and invalid flash crowds are a close match; the examination ends up with accurate identification of usual and unusual flows. Kim [11] applied supervised learning to seize DDoS attacks: neural network-based supervised learning approaches are used to categorize the genuine flow and the compromised flow. Shiaeles and Papadaki [12] experimented with an IP spoofing detection method based on a fuzzy hybrid spoofing detector (FHSD) to find the attacks. Fuzzy empirical rules and the fuzzy largest-of-maximum operator are applied to distinguish real IPs from wrongdoer IPs in less than 5 s. Liao et al. [13] researched sensing DDoS attacks in the application layer based on an SVD-RM approach and revealed reasonable findings in classifying real and attacking flows. Hoque et al. [14] extended a field-programmable gate array (FPGA) method to seize real-time DDoS attacks based on correlation measures. Subburaj and Suthendran [15–18] experimented with DDoS attack detection based on statistical approaches, viz. entropy, chi-square, g-test, and sequential pattern mining; they found the attacks and also identified the root causes of DDoS, bit-and-piece, low-rate and high-rate attacks, and watering hole attacks.
3 Attacking Model

This work considers the following types of attack on a target machine in wireless networks:

Case 1 The wrongdoer directly hits the target node. In this case, the attacker can be traced easily.
Case 2 The attacker targets the node through proxies or zombies. In this scenario, identification of the wrongdoer is still possible, but with added complexity.

Case 3 The attacker executes a DDoS attack on the target. Here again, tracing the wrongdoer is possible, but with even more complexity.
4 Mobile and Wireless Devices Attack Detection

The proposed packet rate resource management technique is appropriate for detecting attacks in all of the scenarios mentioned above in wireless networks. The technique considers the packet rate and the bandwidth capacity of the incoming packets to catch cyberattacks. Two types of analysis, based on bandwidth capacity and bandwidth availability, are deployed to sense an attack. Valid user communication always stays within the allotted bandwidth capacity, i.e., between the maximum and minimum capacity; when things go wrong, this is reflected in the communication bandwidth, and the offending flows occupy the whole bandwidth capacity of the path. The following equations are applied to detect an attack based on bandwidth capacity and availability:
$$\text{AttackMetric1} = \sum_{i=0}^{n} \frac{\left(C_{\max}(i) - C_{\text{avail}}(i)\right)^{2}}{C_{\text{avail}}(i)} \tag{1}$$

and

$$\text{AttackMetric2} = \sum_{i=0}^{n} \frac{\left(C_{\text{avail}}(i) - C_{\min}(i)\right)^{2}}{C_{\min}(i)} \tag{2}$$
where C_max is the maximum capacity of the bandwidth, C_min is the minimum capacity of the bandwidth, and C_avail is the currently available bandwidth. AttackMetric1 is calculated from the maximum capacity and the availability of the bandwidth; AttackMetric2 is calculated from the minimum capacity and the availability of the bandwidth. Based on these two metric values, the threshold value is fixed, and normal packets are separated from attack packets. The time complexity of calculating the metric values is O(log n).
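As a minimal illustration of Eqs. (1) and (2), the metrics can be computed directly from per-interval bandwidth samples. The sketch below follows the equations as printed; the list-based sample layout and the function name are illustrative assumptions, not part of the chapter.

```python
def attack_metrics(c_max, c_min, c_avail):
    """AttackMetric1 and AttackMetric2 (Eqs. 1 and 2) for one flow,
    given per-interval samples of the maximum, minimum, and currently
    available bandwidth."""
    metric1 = sum((mx - av) ** 2 / av for mx, av in zip(c_max, c_avail))
    metric2 = sum((av - mn) ** 2 / mn for mn, av in zip(c_min, c_avail))
    return metric1, metric2
```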
Fig. 3 Work flow of attack detection process
4.1 Attack Detection Process

1. Set the maximum and minimum capacity of the bandwidth.
2. Monitor all incoming packets every second.
3. Calculate AttackMetric1 from the maximum capacity and the availability of the bandwidth.
4. Calculate AttackMetric2 from the minimum capacity and the availability of the bandwidth.
5. Finally, classify normal flows and attack flows based on the lowest value of AttackMetric1 and the highest value of AttackMetric2.
4.2 Work Flow of Detection Process

This work senses high-rate and low-rate attacks based on AttackMetric1 and AttackMetric2, as shown in Fig. 3. The process observes all inward packets. AttackMetric1 and AttackMetric2 are calculated from the predefined maximum and minimum bandwidth, respectively. The lowest possible value of AttackMetric1 is declared a high-rate attack; the highest possible value of AttackMetric2 is confirmed as a low-rate attack.
5 Experimental Setup

A network with nine wireless devices is considered for this work; it covers mobile phones, iPods, tablets, and laptops, as shown in Fig. 4.
Fig. 4 Simplification network architecture
The packet rates from each device are monitored and analyzed. Initially, bandwidth is allocated to each device based on its usage requirement, and the packet rates of all transactions are continuously monitored. Table 1 lists the packet rates collected from the network; using the NS3 simulator, the packet rates are recorded for a duration of 80 s. The total number of packets sent from each device to the server is calculated and compared against the metrics for further classification. The total traffic size, in terms of the packet rate (packets per second, PPS), is computed from the packet size and the packets transferred from each client to the server; Table 2 gives the total traffic from each device, with the attack metric values evaluated using Eqs. (1) and (2).

Table 1 Packet rates collected from the simplification network

Node      Traffic     Packet size   PPS (packet rate)   Start time (s)   End time (s)   Total size
Client 1  Traffic 1   128           40                  1                80             5120
Client 2  Traffic 2   256           20                  1                80             5120
Client 3  Traffic 3   1024          1000                10               60             1,024,000
Client 4  Traffic 4   512           20                  3                80             10,240
Client 5  Traffic 5   256           30                  4                80             7680
Client 6  Traffic 6   128           50                  5                80             6400
Client 7  Traffic 7   256           40                  5                80             10,240
Client 8  Traffic 8   1024          1000                10               60             1,024,000
Client 9  Traffic 9   256           40                  8                80             10,240
Table 2 Metric1 and Metric2 value calculation

Node      Packet size   PPS (packet rate)   Total size   Metric1    Metric2
Client 1  128           40                  5120         199        39
Client 2  256           20                  5120         199        39
Client 3  1024          1000                1,024,000    0          7999
Client 4  512           20                  10,240       99         79
Client 5  256           30                  7680         132.3333   59
Client 6  128           50                  6400         159        49
Client 7  256           40                  10,240       99         79
Client 8  1024          1000                1,024,000    0          7999
Client 9  256           40                  10,240       99         79
6 Detection Process

During the detection process, the lowest value of AttackMetric1 and the highest value of AttackMetric2 are identified, as mentioned earlier. In Table 2, two AttackMetric1 values are zero, and two AttackMetric2 values are 7999, the highest. For Client 3 and Client 8, AttackMetric1 takes the lowest value and AttackMetric2 the highest, so these flows are declared abnormal: they do not keep their transactions between the minimum and maximum bandwidth. Therefore, Client 3 and Client 8 are reported as attackers.
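For concreteness, the decision rule can be replayed on the metric values reported in Table 2. The following sketch is a hypothetical re-implementation (the chapter gives no code): it flags the flows with the lowest Metric1 and the highest Metric2, recovering Clients 3 and 8.

```python
# (client: (Metric1, Metric2)) pairs taken from Table 2.
table2 = {1: (199, 39), 2: (199, 39), 3: (0, 7999),
          4: (99, 79), 5: (132.3333, 59), 6: (159, 49),
          7: (99, 79), 8: (0, 7999), 9: (99, 79)}

def attackers(metrics):
    """Flag the flows with the lowest Metric1 (high-rate attack) and
    the highest Metric2 (low-rate attack), per the detection workflow."""
    m1_min = min(m1 for m1, _ in metrics.values())
    m2_max = max(m2 for _, m2 in metrics.values())
    return sorted(c for c, (m1, m2) in metrics.items()
                  if m1 == m1_min or m2 == m2_max)

print(attackers(table2))  # -> [3, 8]
```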
7 Conclusion

The use of wireless devices has become inevitable, enabling people to work from anywhere at any time. Attackers look for valuable information on these devices, and a successful attack ends in loss. Since attacks happen without the users' knowledge, monitoring the packet flow of a device makes it easy to infer what is actually happening. The packet rate resource management technique based on AttackMetric1 and AttackMetric2 provides notable detection results with little computation time.
References 1. https://threatmap.checkpoint.com/ 2. Nair S, Gautam N (2014) Entropy variation and J48 algorithm based intrusion detection system for cloud computing. Int J Comput Appl 103(11):8–14 3. Kaur G, Saxena V, Gupta JP (2020) Detection of TCP targeted high bandwidth attacks using self-similarity. J King Saud Univ Comput Inf Sci 32(1):35–49
4. Shojaei M, Movahhedinia N, Ladani BT (2011) An entropy approach for DDoS attack detection in IEEE 802.16 based networks. Lect Notes Comput Sci 7038:129–143 5. Akyazi U, Sima Uyar A (2012) Distributed detection of DDoS attacks during the intermediate phase through mobile agents. Comput Inform 31:759–778 6. Cusack B, Tian Z (2016) Detecting and tracing slow attacks on mobile phone user service. In: Proceedings of the 14th Australian digital forensics conference, pp 4–10 7. Singh R, Singh J, Singh R (2016) WRHT: a hybrid technique for detection of wormhole attack in wireless sensor networks. Mob Inf Syst 2016:1–13. ID: 83549930 8. Toklu S, Simsek M (2018) Two-layer approach for mixed high-rate and low-rate distributed denial of service attack detection and filtering. Arab J Sci Eng 9. Udhayan J, Hamsapriya T (2011) Statistical segregation method to minimize the false detections during DDoS attacks. Int J Netw Secur 13(3):152–160 10. Yu S, Zhou W, Jia W, Guo S, Xiang Y, Tang F (2012) Discriminating DDoS attacks from flash crowds using flow correlation coefficient. IEEE Trans Parallel Distrib Syst 23(6):1073–1080 11. Kim M (2019) Supervised learning based DDoS attacks detection: tuning hyperparameters. Electron Telecommun Res Inst J 41(5):560–573 12. Shiaeles SN, Papadaki M (2014) FHSD: an improved IP spoof detection method for web DDoS attacks. Comput J 58(4):892–903 13. Liao Q, Li H, Kang S, Liu C (2015) Application layer DDoS attack detection using cluster with label based on sparse vector decomposition and rhythm matching. Secur Commun Netw 8(17):3111–3120 14. Hoque N, Kashyap H, Bhattacharyya DK (2017) Real-time DDoS attack detection using FPGA. Comput Commun 110:48–58 15. Subburaj T, Suthendran K (2019) Detection and trace back of low and high volume of DDoS attack based on statistical measures. Concurr Comput: Pract Exp. https://doi.org/10.1002/cpe.5428 16. Subburaj T, Suthendran K (2018) Watering hole attack detects and discovers the source address using the sequential pattern. J Cyber Secur Mobil 7(1):1–12 17. Subburaj T, Suthendran K, Arumugam S (2017) Statistical approach to trace the source of attack based on the variability in data flows. Lect Notes Comput Sci 10398:392–400 18. Subburaj T, Suthendran K (2017) Detection and trace back of DDoS attack based on statistical approach. J Adv Res Dyn Control Syst 9(13):66–74
Chapter 57
Robust Lightweight Image Encryption Technique Using Crossover Operator Gaurav Mittal and Manish Gupta
1 Introduction

In the current scenario, smartphone use is increasing day by day, and a large number of images are transmitted regularly to thousands of people via social media Web sites and apps, so the secure exchange of images over communication networks has become a serious issue [1]. Various traditional encryption algorithms such as RSA, AES, IDEA, and Diffie–Hellman have been developed, but their efficiency for image encryption is low because of the high redundancy and high correlation among pixels. Traditional image encryption algorithms based on symmetric key cryptography are generally expensive due to their algorithmic complexity and require many rounds of encryption. The work in [2] therefore proposed a less complex algorithm that uses a 64-bit block cipher with a 64-bit key for image encryption, requiring less memory and fewer rounds than other encryption algorithms. Various image encryption algorithms have been proposed based on 4D chaotic maps [3], hyperchaotic maps and genetic algorithms [2], hash keys using the crossover operator and chaos [4], chaos using crossover and mutation operators [5], fractional-order hyperchaotic systems [6], hyperchaotic maps [7], chaotic dynamic S-boxes and DNA sequence operations [8], and 3D logistic maps [9]. This work uses a crossover operator in both the image encryption and the key generation phase, which increases the difficulty level for attackers.
The rest of this paper is organized as follows: Sect. 2 describes the proposed methodology, Sect. 3 presents the experimental results on various parameters, and finally, Sect. 4 concludes the overall work.
2 Proposed Methodology

This section describes the proposed methodology, which works in three phases: a key generation phase, an encryption phase, and a decryption phase.

(a) Key Generation Phase: In this phase, a secure unique key is generated using a random function in the form of a single-point crossover operator. In this type of crossover, a single crossover point is selected at random in both parent strings, and the substrings on either side of this point are exchanged to produce two new parent strings. Figure 1 shows the single-point crossover operator, and the following steps describe the process:

Step 1: Find the length of the parent strings (both parent strings must be of the same size).
Step 2: Assign the values of the lower bound (lb) and upper bound (ub), i.e., ub = (length of parent string) − 1; lb = 1.
Step 3: Find the value of the crossover point using the following random function (f): f(cross_p) = round((ub − lb) * rand() + lb).
Step 4: Using the value of the crossover point, divide the parent 1 (P) and parent 2 (Q) strings into two parts each:
p1 = parent1(1: cross_p)
p2 = parent2(cross_p + 1: length of parent string)
and similarly
q1 = parent2(1: cross_p)
q2 = parent1(cross_p + 1: length of parent string)
Step 5: Combine p1, p2 and q1, q2 to make the new parent strings P and Q:
P = combine(p1, p2)
Q = combine(q1, q2)

Fig. 1 Concept of single-point crossover operator

(b) Encryption Phase: This phase encrypts the image information. The encryption process consists of four rounds and two swaps; each round contains XOR and XNOR logical functions and one random function in the form of the crossover operator. The first swap operation is performed after the first encryption round, and the second swap after the third encryption round. The following steps show the working of the encryption process:

Step 1: Divide the 64 bits of binary data into 4 blocks, each 16 bits in size.
Step 2: Perform an XNOR operation on the first 16-bit block with key K1.
Step 3: Apply the random function (f) to the 16-bit block obtained in Step 2, then XOR the resulting 16-bit block with the third 16-bit block.
Step 4: Perform an XNOR operation on the last 16-bit block with key K1.
Step 5: Apply the random function (f) to the 16-bit block obtained in Step 4, then XOR the resulting 16-bit block with the second 16-bit block.
Step 6: Repeat Steps 2 to 5 with keys K2, K3, and K4, each 16 bits in size.

(c) Decryption Phase: This phase is the reverse of the encryption phase described above.
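The key generation steps above can be sketched compactly, assuming keys are handled as equal-length bit strings; the function name and the use of Python's random module are illustrative, while the point selection and recombination follow Steps 2–5.

```python
import random

def single_point_crossover(parent1, parent2):
    """Steps 1-5: choose a random crossover point in [lb, ub] and swap
    the tails of two equal-length bit strings to form P and Q."""
    assert len(parent1) == len(parent2), "parents must be the same size"
    lb, ub = 1, len(parent1) - 1                   # Step 2
    cross_p = random.randint(lb, ub)               # Step 3: f(cross_p)
    new_p = parent1[:cross_p] + parent2[cross_p:]  # P = combine(p1, p2)
    new_q = parent2[:cross_p] + parent1[cross_p:]  # Q = combine(q1, q2)
    return new_p, new_q

# Example: crossing two 16-bit parent strings to derive new key material.
print(single_point_crossover("1010110011110000", "0101001100001111"))
```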
3 Experimental Results

This section presents experimental results on various parameters such as NPCR, entropy, correlation, and histogram analysis. MATLAB 2015 on a system with a Core i3 processor and 2 GB RAM is used to implement the proposed work, and the standard Lenna image is used for all operations. Figure 4 shows the encrypted and decrypted Lenna image.
a. NPCR: The NPCR of the two images is given by

$$\text{NPCR} = \frac{\sum_{i,j} D(i, j)}{W \times H} \times 100\%$$

where W and H are the width and height of the input image, and the value of D(i, j) is calculated as

$$D(i, j) = \begin{cases} 0, & \text{if } C_1(i, j) = C_2(i, j) \\ 1, & \text{otherwise} \end{cases}$$

b. Entropy: Entropy is a statistical measure of randomness that can be used to characterize the texture of the input image. It is calculated with the following formula (Table 1):

$$E(m) = \sum_{j=0}^{M-1} p(m_j) \log \frac{1}{p(m_j)}$$

c. Correlation: The following formulas are used to calculate the vertical, horizontal, and diagonal correlation:

$$\mathrm{Cov}(x, y) = E\left[(x - E(x))(y - E(y))\right]$$

$$R_{xy} = \frac{\mathrm{Cov}(x, y)}{\sqrt{D(x)}\,\sqrt{D(y)}}$$
Table 1 Comparison between existing techniques and the proposed work on the parameters NPCR and entropy

Encryption technique                                 NPCR    Entropy
Enayatifar et al. [10]                               99.54   7.9339
Enayatifar et al. [11]                               99.56   7.8753
Talarposhti et al. [12]                              99.63   7.8702
Proposed work using single-point crossover (Max.)    99.64   7.9969
Fig. 2 Horizontal, vertical, and diagonal correlation of original and encrypted Lenna image
The following three formulas are used in the numerical computations, where x and y are the values of two adjacent pixels in the image:

$$E(x) = \frac{1}{N} \sum_{i=1}^{N} x_i$$

$$D(x) = \frac{1}{N} \sum_{i=1}^{N} \left(x_i - E(x)\right)^2$$

$$\mathrm{Cov}(x, y) = \frac{1}{N} \sum_{i=1}^{N} \left(x_i - E(x)\right)\left(y_i - E(y)\right)$$
d. Histogram: Statistical similarities between the original and cipher image are measured with the help of histogram analysis (Figs. 2, 3 and 4).
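As an illustrative companion to the measures defined above (and not the authors' MATLAB implementation), the following NumPy sketch computes NPCR, entropy, and the adjacent-pixel correlation coefficient, assuming 8-bit grayscale images stored as integer NumPy arrays and a base-2 logarithm for the entropy.

```python
import numpy as np

def npcr(c1, c2):
    """NPCR: percentage of pixel positions where two cipher images
    differ, i.e., where D(i, j) = 1."""
    return float(np.mean(c1 != c2)) * 100.0

def entropy(img):
    """Shannon entropy E(m) of an 8-bit image (base-2 logarithm)."""
    p = np.bincount(img.ravel(), minlength=256) / img.size
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def corr(x, y):
    """Correlation coefficient R_xy = Cov(x, y) / (sqrt(D(x)) sqrt(D(y)))
    for two sequences of adjacent pixel values."""
    x, y = x.astype(float), y.astype(float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())

# Horizontal correlation pairs each pixel with its right neighbour:
#   corr(img[:, :-1].ravel(), img[:, 1:].ravel())
# vertical and diagonal pairings are analogous.
```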
Fig. 3 Original and encrypted Lenna image histogram

Fig. 4 Encrypted and decrypted Lenna image

4 Conclusion

Since most communications are in the form of images and the use of mobile phones keeps increasing, there is a need for a secure image encryption algorithm that is lightweight in nature. This work proposed a robust lightweight image encryption technique based on a random function, namely the crossover operator of the genetic algorithm. To check the efficiency of the proposed algorithm, experiments were performed on various parameters; the results show that the proposed algorithm performs better than the existing techniques discussed in this paper.
References 1. Usman M, Ahmed I, Aslam M, Khan S, Shah U (2017) SIT: a lightweight encryption algorithm for secure internet of things. (IJACSA) Int J Adv Comput Sci Appl 8(1) 2. Zhang X, Zhou H, Zhou Z, Wang L, Li C (2018) An image encryption algorithm based on hyperchaotic system and genetic algorithm. In: Qiao J et al (eds) Bio-inspired computing: theories and applications. BIC-TA 2018. Communications in computer and information science, vol 952. Springer, Singapore
3. Stalin S, Maheshwary P, Shukla P, Maheshwari M, Gour B, Khare A (2019) Fast and secure medical image encryption based on non linear 4D logistic map and DNA sequences (NL4DLM_DNA). J Med Syst (Springer) 4. Guesmi R, Farah M, Kachouri A, Samet M (2016) Hash key-based image encryption using crossover operator and chaos. Multimed Tools Appl 75(8):4753–4769 5. Samhita P, Prasad P, Patro K, Acharya B (2016) A secure chaos-based image encryption and decryption using crossover and mutation operator. IJCTA 9(34):17–28 6. Huang X, Sun T, Li Y, Liang J (2015) A color image encryption algorithm based on a fractional-order hyperchaotic system. Entropy MDPI 17:28–38 7. Tong X, Yang L, Zhang M, Xu H, Zhu W (2015) An image encryption scheme based on hyperchaotic Rabinovich and exponential Chaos maps. Entropy MDPI 17:181–196 8. Tian Y, Lu Z (2017) Novel permutation-diffusion image encryption algorithm with chaotic dynamic S-box and DNA sequence operation. AIP Adv 1–23 9. Ye G, Jiao K, Pan C, Huang X (2018) An effective framework for chaotic image encryption based on 3D logistic map. Hindawi Secur Commun Netw 1–11 10. Enayatifar R, Abdullah AH, Lee M (2013) A weighted discrete imperialist competitive algorithm (WDICA) combined with chaotic map for image encryption. Opt Lasers Eng 51:1066–1077 11. Enayatifar R, Abdullah AH, Isnin IF (2014) Chaos-based image encryption using a hybrid genetic algorithm and a DNA sequence. Opt Lasers Eng 56:83–93 12. Talarposhti KM, Jamei MK (2016) A secure image encryption method based on dynamic harmony search (DHS) combined with chaotic map. Opt Lasers Eng 81:21–34
Chapter 58
Peacock Repellant Technique for Crop Protection in the Agricultural Land S. Anitha, K. Lalitha, and V. Chandrasekaran
1 Introduction

Although agriculturalists have adopted various common control measures such as fencing, bursting crackers, making sounds themselves, wiring, netting, and spraying pepper flakes on the plants, the problem remains uncontrolled: the peacock population keeps growing, the measures burden farmers, harm peacocks, and fail to prevent peacock entry. Constructing fences around crops is ineffective because peacocks fly over them. Peacocks generally damage crops already at the seedling stage, so far fewer crops make it from sowing to harvesting. The crops in cultivated areas mostly fed on by peacocks are groundnut, tomato, paddy, chilly, grains, bananas, and other vegetables and fruits. A survey collected from villagers and priests around our native places indicates that peafowl diets consist of paddy, bajra, and other grain seeds, and the birds are partial to agricultural crops and garden plants; green and black gram crops are particular favorites. All over India, and especially in the southern regions, farmland is affected by peacocks for the reasons above. Forest officials estimate that around 15,000 birds are currently present in the region. As the peacock population increases day by day, landowners incur growing losses; to remedy this, they have formed teams to work on the menace and analyze the root cause of the problem. Farmers are pressing forest officials to take urgent measures and are also demanding compensation for their losses. Farmers report that peacock intrusion is increasing in Thuraiyur, Mannachanallur, Musiri, Marungapuri, Erode, and a few other areas. "It is high time that forest officials took some steps," said Ayilai Siva Suriyan, secretary of Thamilaga Vivasayigal Sangam, Trichy [1]. Admitting to the problem, a forest ranger told the Hindu Express that "even we are wondering as to how to go about the problem. Those affected by the problem quickly throw blame on us. When we capture peacocks or monkeys, there is no proper place to accommodate them" [2]. "The government should take necessary actions considering the complaints posted by affected farmers and construct enclosures for accommodating the captured birds and animals," the ranger added. Finally, farmers say that poor rainfall in certain areas such as Pudukottai district, especially the Viralimalai region, results in the migration of peacocks to adjoining fertile lands.

The objectives of this paper are:

• To study the impact of peacocks on farmlands based on location and crop types,
• To prepare a questionnaire-based survey in order to analyze the crop damage caused by peacocks,
• To develop a prototype model for analyzing the increase in crop yield obtained by repelling peacocks.

The overall aim of this paper is to develop a low-cost automatic detection technique that greatly reduces crop damage due to peacocks without harming the birds, so that farmers can earn better profits.
2 Related Work

Peafowls are active in the morning and evening and rest in the afternoon in the shade of Prosopis juliflora scrub [3]. In the Viralimalai area of Tamil Nadu, the peafowl sex ratio is 1468:1677 (1:1.4) [4]. The Indian peafowl is omnivorous, feeding on insects, small mammals and reptiles, groundnut, tomato, paddy, chilly, and even bananas [5, 6]. It feeds on small snakes but keeps away from larger ones. As reported by villagers and priests, peafowl feed on a wide range of crops such as beans, chilly, capsicum, tomato, maize, paddy, bajra, and other grain seeds [7]. Researchers sought solutions for protecting crops from birds as early as the 1990s; one solution, proposed for African countries, suggested using nets or acrylic fibres to protect both crops and birds [8]. Another study analyzed crop depredation by birds, particularly in the Deccan Plateau, India [9]; it found that crop depredation happens mostly during the harvesting season and that farmers are not satisfied with conventional bird-repelling techniques.
Intensive research has also been carried out to protect sunflower from birds [10]. Several techniques were suggested, but all of them turn out to be harmful to the birds. In recent days, an extensive review has highlighted different repellent techniques that protect birds as well as crops from severe damage. Electronic repellent techniques are among those suggested [11]; they produce loud audio sounds that repel birds and thereby prevent crop damage. The main advantages of the electronic method are that it is eco-friendly, needs minimal maintenance, and uses a simple electronic circuit design. Moreover, the authors suggest that with image processing techniques, birds and animals can be detected very accurately. From the extensive literature review, it is concluded that the existing works are efficient for crop protection but can severely damage the bird community. Hence, the objective of this work is to protect the crops as well as the birds, without severe harm and at reduced cost.
3 System Architecture In GSM communication module, there are different features available that provides inexpensive, reduced size, customized interface, timely alarm, quick response time and large area coverage for the proposed system. Hence, the priests/farmers can interact with the system far from the cropland. The overall block diagram of the proposed system is shown in Fig. 1. In the crop field, various sensors are installed at the several locations. Raspberry Pi interfaces USB camera and PIR sensor which are placed at the crop field border. The image of the peacock is captured when PIR sensor is high or USB camera ON. The image and alert message are sent to the landowner through mail/mobile number. At the middle of the crop field, vibration sensor is positioned, and its value becomes higher than the threshold level when it detects any movement. After verifying through color match, buzzer sound gets activated, and also message is sent to the user mobile through GSM. Fig. 1 Block diagram of proposed method
The flowchart of the proposed system is shown in Fig. 2. After initializing the camera, Raspberry Pi, sensors, and GSM module, the system checks the sensor values. When a sensor value exceeds the threshold, the Raspberry Pi sends an alarm message to the user through GSM. The system then waits for the user's command by message; the user can turn a specific sensor on or off based on the location of the detected movement. The overall experimental setup of the system is shown in Fig. 3. The Raspberry Pi module, PIR and vibration sensors, camera, buzzer, GSM module, and relay driver circuit are the hardware components, mounted on a wooden plank for better working of the system. Any kind of motion is accurately detected by the PIR sensor. The Raspberry Pi interfaces the sensors and produces output in digital form; sensed data are sent directly to the user through SMS/email. When an object moves within the range of the PIR sensor, a signal is sent to the controller, the webcam is triggered to snap a photo, which is stored on the Raspberry Pi's memory card, and finally the stored photo is transmitted to the user through email.
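The control flow of Figs. 1 and 2 can be outlined as a simple polling loop. The GPIO pin numbers, the one-second polling interval, and the capture_and_alert stub below are hypothetical placeholders, since the chapter names the components but not the wiring or the GSM commands; treat this as a sketch of the logic only.

```python
import time
import RPi.GPIO as GPIO

PIR_PIN, VIB_PIN, BUZZER_PIN = 17, 27, 22  # hypothetical wiring

GPIO.setmode(GPIO.BCM)
GPIO.setup([PIR_PIN, VIB_PIN], GPIO.IN)
GPIO.setup(BUZZER_PIN, GPIO.OUT)

def capture_and_alert():
    """Placeholder: snap a webcam photo to the SD card and forward it
    to the owner by email/SMS via the GSM module."""
    print("movement detected: photo captured and alert sent")

try:
    while True:
        if GPIO.input(PIR_PIN):       # motion at the field border
            capture_and_alert()
        if GPIO.input(VIB_PIN):       # movement in the mid-field
            GPIO.output(BUZZER_PIN, GPIO.HIGH)  # repel without harm
            capture_and_alert()
        else:
            GPIO.output(BUZZER_PIN, GPIO.LOW)
        time.sleep(1)                 # poll the sensors every second
finally:
    GPIO.cleanup()
```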
4 Peacock Repellant Technique

This paper focuses on studying current issues in the agricultural field with the help of data analytics. Predictive analytics [12] deals with statistical analysis that extracts information to uncover relationships and patterns within large volumes of data, using various technologies to predict events. It is a good technology for predicting future crop conditions and helping farmers make proactive decisions. Before implementing the technique in the agricultural area, two villages, namely Kummakalipalayam and Vallipurathanpalayam, were selected as study areas. For data collection, a questionnaire-based survey method was adopted: factors such as location, crop variety, peacock growth rate, and entry timings were used to prepare the questionnaire, and responses were collected from various age groups, farmers, and common people in the villages of Perundurai Taluk. The first step is pre-processing the raw data. Two techniques are used: a detection technique to find imperfections in the data sets and a transformation technique to obtain more manageable data sets. Pre-processing transforms the available raw questionnaire data into a suitable format; commonly, regression, sampling, or modeling techniques can be used, and among these, regression is the well-known and suitable method for predicting future trends. Regression analysis indicates that the increasing growth rate of the peacock population will cause drastic economic loss to farmers. The search for a solution to this crop depredation considers factors such as the types of crops damaged by peacocks, the revenue loss per year due to crop damage, and predictions of peacock population growth and its impact. This study reveals the impact of birds on crop devastation, mainly in the southern region, and creates awareness in the agricultural community [13].
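To make the regression step concrete, the sketch below fits a least-squares linear trend to yearly sighting counts and extrapolates it. The numbers are invented placeholders, since the questionnaire data from the two villages is not reproduced in the text; only the method reflects the analysis described.

```python
import numpy as np

# Placeholder aggregates of the questionnaire responses (year, number
# of peacock sightings reported); the real survey data is not published.
years = np.array([2015, 2016, 2017, 2018, 2019])
sightings = np.array([40, 52, 61, 75, 90])

# Least-squares linear trend: sightings ~ slope * year + intercept.
slope, intercept = np.polyfit(years, sightings, 1)
print(f"trend: +{slope:.1f} sightings/year; "
      f"2021 forecast approx. {slope * 2021 + intercept:.0f}")
```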
Fig. 2 Flowchart of the proposed system: start; initialize the Raspberry Pi and sensors; initialize the GSM module in sending and receiving mode; read the sensor values; if a sensor value is above the threshold, send an alert message and email to the user and monitor the commands coming from the user; if the detected color is in the stored database, take a snapshot of the peacock and send it to the user; otherwise, the message "The image is not peacock" is sent
Fig. 3 Experimental setup of the proposed system
Based on the survey, the second step is to develop a prototype for detecting peacocks and to deploy it in the farmland. The effectiveness of the approach is then analyzed in real time by interfacing the components through the Raspberry Pi and installing the setup in the farmland. Finally, using the GSM module, messages are generated that tell the user the count and timing of peacock entries and inform the officials so they can safeguard the peacocks.
5 Conclusion

Crop damage is one of the major threats to the communities' livelihoods, so the government must take remedial measures to safeguard both peacocks and crops by implementing techniques that benefit farmers and society. Understanding the usage and effectiveness of different techniques could also help reduce crop damage, which will have a direct impact on improving the farmer–peacock relationship.
References 1. http://www.newindianexpress.com/states/tamil_nadu/After-Monkey-Menace-Peacocks-Addto-pain-of-Farmers-in-TiruchyDistrict/2014/12/21/article2581330.ece 2. http://www.thehindu.com/news/cities/Tiruchirapalli/peacocks-damage-crops/article65702 26.ece 3. Rathinasabapathi B (1987) Activity patterns with special reference to food and feeding habits of the Indian Peafowl, Pavo cristatus in Viralimalai area, Tamilnadu 4. Rajadurai T (1988) Present distribution and status of Indian peafowl (Pavo cristatus) in Viralimalai area, Tamil Nadu, and South India. M.Sc. dissertation, A.V.C. College 5. Johnsingh AJT (1976) Peacocks and cobra. J Bombay Nat Hist Soc 2–14 6. Johnsingh AJT, Murali S (1978) The ecology and behavior of the Indian peafowl (Pavo cristatus) Linn. of Injar, Tamilnadu. J Bombay Nat Hist Soc 1069–1079
7. Veeramani A (1990) Studies on ecological and behavioural aspects of Indian peafowl Pavo cristatus in Mudumalai Wildlife Sanctuary, Tamil Nadu, South India 8. Bruggers RL, Ruelle P (1982) Efficacy of nets and fibres for protecting crops from grain-eating birds in Africa. Crop Prot 55–65 9. Kale MA, Nandkishor D, Raju K, Prosun B (2014) Crop depredation by birds in Deccan Plateau, India. Int J Biodivers 10. Linz GM, Homan HJ, Werner SJ, Hagy HM, Bleier WJ (2011) Assessment of bird-management strategies to protect sunflowers. BioScience 960–970 11. Baral SS, Swarnkar R, Kothiya AV, Monpara AM, Chavda SK (2019) Bird repeller—a review. Int J Curr Microbiol Appl Sci 1035–1039 12. http://www.sas.com/en_sg/insights/analytics/predictive-analytics.html 13. http://www.research.ibm.com/articles/precision_agriculture.shtml
Retraction Note to: Industrial Internet of Things (IIoT) Framework for Real-Time Acoustic Data Analysis Sathyan Munirathinam
Retraction Note to: Chapter 53 in: S. Agrawal et al. (eds.), Machine Intelligence and Smart Systems, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4893-6_53 The Editors have retracted this conference paper because Figures 1, 4, 5, 6, 7, 8 and 9, Table 2, as well as parts of the text, are duplicated from a previously published article by a different author [1]. The data reported are therefore unreliable. The author, Sathyan Munirathinam, agrees to this retraction. [1] Sarah Malik, Rakeen Rouf, Krzysztof Mazur and Antonios Kontsos. The Industry Internet of Things (IIoT) as a Methodology for Autonomous Diagnostics in Aerospace Structural Health Monitoring. Aerospace 2020, 7(5), 64; https://doi.org/10.3390/aerospace7050064
The retracted version of this chapter can be found at https://doi.org/10.1007/978-981-33-4893-6_53
Author Index
A Abbigeri, Shivarajakumar, 281 Alam, Md Shahbaz, 401 Anand, Mansimran Singh, 173 Anitha, S., 691 Arora, Yojan, 87 Arya, Gangili Divija, 365 Ashok, Asha, 365
B Bali, Raghav, 401 Batham, Deepak, 49 Bhadoria, Vikas Singh, 385 Bhateja, Neha, 567 Bhati, Jai Prakash, 61 Bhatnagar, Vishal, 537 Birla, Vivek, 287 Burroughs, Sonya, 421
C Chandrasekaran, V., 691 Chatterjee, Joyjit, 159 Chatterjee, Subarna, 513, 579 Chaudhary, Sarika, 243, 353 Chaudhary, Vishal, 593 Chauhan, Deepika, 233 Chelliah, Pethuru Raj, 193
D Dalal, Vipul, 267 Daniel, A. K., 489 Deepa Lakshmi, B., 97 Deep, Ayush, 287
Dhamal, Poonam, 665 Dholay, Surekha, 605 Dhull, Ritvika, 353 Dixit, Manish, 25, 71 Dixit, Shruti, 49 G Gambhir, Ashima, 145 Gautam, Amber, 567 Godbole, Anand, 527 Godhavari, T., 97 Gokaraju, Balakrishna, 421 Goyal, Rajeev, 79 Goyal, Samta Jaın, 79 Gupta, Atulkumar, 605 Gupta, Komal, 443 Gupta, Manish, 457, 683 Gupta, Neeraj, 215 Gupta, Rajkumar, 185 Gupta, Rashmi, 215 H Hada, Bharat Singh, 621 Harika, Kothuri Venkata Sai, 365 Harikrishnan, V. K., 145 Hashimi, Said Masihullah, 105 Hemadri, Vidyagouri B., 223 I Ipe, Navin K., 579 J Jain, Charu, 353
Jain, Sharad, 477 Jatain, Aman, 87 Jawdekar, Anand, 25
K Kalra, Jagrit, 87 Kanekar, Bhavik, 527 Kaur, Loveneet, 553 Kedia, Chirag, 173 Khan, Adeeb, 401 Kumar, Brajesh, 401 Kumari, Anmol, 443 Kumar, Krishan, 287 Kumar, Prabhat, 35 Kumar, Prashant, 215
L Lakshmi, C., 1 Lakshmi Narasimhan, V., 1 Lalitha, K., 691 Lal, Madan, 553 Lalwani, Soniya, 621 Lee, Liza, 1 Loganathan, D., 431 Luu, Khoa, 421
M Mandal, Savitri, 131 Masurkar, Siddhesh, 267 Meena, M. L., 185 Meenu, 145 Mehra, Anu, 159 Mehrotra, Deepti, 131 Mehrotra, Shashi, 665 Mishra, Ravi Bhushan, 657 Mittal, Gaurav, 683 Moghe, Asmita A., 255 Moharil, Ambarish, 173 Munirathinam, Sathyan, 635
N Nadaf, Shatajbegum, 223 Nagpal, Pooja Batra, 243 Nannapaneni, Rajasekhar, 513 Narasimhan, Shanmukhi, 365 Narayanan, Praveena, 431 Narvey, Rakesh, 593 Narwaria, Ravindra Pratap, 49
P Pandey, Kamlesh Kumar, 337 Panwar, Arvind, 537 Pardeep, 87 Pashupatimath, Anand, 281 Patsariya, Sanjay, 71
R Rahul, Deeptimahanti Venkata, 365 Rajak, Ranjit, 117 Rajawat, Neha, 621 Rajput, Rashmi, 457 Ranjan, Ashish, 657 Ranjan, Roop, 489 Rawat, Deepika, 443 Roy, Aritra, 243 Roy, Kaushik, 421
S Saini, Paras, 401 Sai Sabitha, A., 131 Sasi, Allan, 287 Satapathy, Santosh Kumar, 431 Sauda, Shravankumar, 553 Sehgal, Jai, 159 Sehrawat, Deepthi, 105 Selot, Kritika, 117 Senthil Kumar, K., 97 Sharma, Anand, 297, 305 Sharma, Mriganka, 159 Sharma, Rashi, 305 Sharma, Sanjay, 463 Sharma, Shivam, 315 Shukla, Diwakar, 337 Singh, Anil Kumar, 657 Singh, Arun Pratap, 463 Singh, Ashutosh Kumar, 593 Singh, Chaitanya, 233 Singh, Juhi, 567 Singh, Nidhi, 385 Singh, Rama, 11 Singh, Vibhav Prakash, 657 Somisetti, Kiran, 375 Sonavane, Nikhil, 173 Soni, Akanksha, 463 Soni, Hemant Kumar, 315 Subburaj, T., 675 Sujatha, K., 97 Suresh, S., 35 Surianarayanan, Chellammal, 193 Suthendran, K., 675
T Thakur, Anil Kumar, 657 Tomar, Dimpal, 61 Tomar, Pradeep, 61 Tripathi, Khushboo, 105 U Ugale Pradip, 255 V Vaish, Ankita, 11
Vasistha Bhargavi, G., 1
W Wadhwa, Amit, 501
Y Yadav, Ashwani Kumar, 477 Yadav, Dilip, 385 Yadav, Rajesh, 297