Algorithms for Intelligent Systems Series Editors: Jagdish Chand Bansal · Kusum Deep · Atulya K. Nagar
Manish Prateek · T. P. Singh · Tanupriya Choudhury · Hari Mohan Pandey · Nguyen Gia Nhu Editors
Proceedings of International Conference on Machine Intelligence and Data Science Applications MIDAS 2020
Algorithms for Intelligent Systems Series Editors Jagdish Chand Bansal, Department of Mathematics, South Asian University, New Delhi, Delhi, India Kusum Deep, Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India Atulya K. Nagar, School of Mathematics, Computer Science and Engineering, Liverpool Hope University, Liverpool, UK
This book series publishes research on the analysis and development of algorithms for intelligent systems with their applications to various real world problems. It covers research related to autonomous agents, multi-agent systems, behavioral modeling, reinforcement learning, game theory, mechanism design, machine learning, meta-heuristic search, optimization, planning and scheduling, artificial neural networks, evolutionary computation, swarm intelligence and other algorithms for intelligent systems. The book series includes recent advancements, modification and applications of the artificial neural networks, evolutionary computation, swarm intelligence, artificial immune systems, fuzzy system, autonomous and multi agent systems, machine learning and other intelligent systems related areas. The material will be beneficial for the graduate students, post-graduate students as well as the researchers who want a broader view of advances in algorithms for intelligent systems. The contents will also be useful to the researchers from other fields who have no knowledge of the power of intelligent systems, e.g. the researchers in the field of bioinformatics, biochemists, mechanical and chemical engineers, economists, musicians and medical practitioners. The series publishes monographs, edited volumes, advanced textbooks and selected proceedings.
More information about this series at http://www.springer.com/series/16171
Manish Prateek · T. P. Singh · Tanupriya Choudhury · Hari Mohan Pandey · Nguyen Gia Nhu
Editors
Proceedings of International Conference on Machine Intelligence and Data Science Applications MIDAS 2020
Editors Manish Prateek School of Computer Science University of Petroleum and Energy Studies Dehradun, India Tanupriya Choudhury Department of Informatics School of Computer Science University of Petroleum and Energy Studies Dehradun, India
T. P. Singh Department of Informatics School of Computer Science University of Petroleum and Energy Studies Dehradun, India Hari Mohan Pandey Edge Hill University Ormskirk, UK
Nguyen Gia Nhu Duy Tan University Da Nang, Vietnam
ISSN 2524-7565 ISSN 2524-7573 (electronic) Algorithms for Intelligent Systems ISBN 978-981-33-4086-2 ISBN 978-981-33-4087-9 (eBook) https://doi.org/10.1007/978-981-33-4087-9 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
MIDAS-2020 aims to promote and provide a platform for researchers, academia and practitioners to meet and exchange ideas on recent theoretical and applied machine and artificial intelligence and data sciences research. The conference will provide a platform for computer science, computer engineering and information technology professionals, scientists, researchers, engineers, educators and students from all over the world to share their scientific contributions, ideas and views. We feel glad to welcome all readers to the International Conference MIDAS-2020 at the University of Petroleum and Energy Studies being organized in association with NGCT Society. This time, due to COVID-19 pandemic conditions, the conference is taking place in virtual space, but interestingly with a larger participation ever. The theme of the conference was very thoughtfully decided as machine intelligence and data sciences. With increasing interest in building more and more sophisticated systems on the wheels of tremendous data generation and its processing capabilities, the topic appears to be most suitable for the current research inclination of the academic fraternity. The type of research submissions received further strengthens the suitability of the theme of the conference. Again it is a great pleasure to share that the researchers from 22 different countries around the continents submitted their research contributions in the form of articles, thus making the conference a wonderful platform for the exchange of ideas and further strengthening the collaborations among the interested peoples in the research fraternity. This volume is a compilation of the chapters of various presentations presented in the conference with the aim to be a memoir of the event. The theme of the international conference is machine intelligence and applications, which comprises four tracks. Track 1 addresses the algorithmic aspect of machine intelligence, while Track 2 includes the framework and optimization of various algorithms. Track 3 includes all the papers related to wide applications in various fields, and the book volume may end with
Track 4 which will include interdisciplinary applications. We truly believe that the book will fit as a good read for those looking forward to exploring areas of machine learning and its applications.
Manish Prateek, Dehradun, India
T. P. Singh, Dehradun, India
Tanupriya Choudhury, Dehradun, India
Hari Mohan Pandey, Ormskirk, UK
Nguyen Gia Nhu, Da Nang, Vietnam
Contents
1 Probabilistic Machine Learning Using Social Network Analysis . . . 1
A. Rohini, T. Sudalai Muthu, and Tanupriya Choudhury
2 Prioritization of Disaster Recovery Aspects Implementing DEMATEL Technique . . . 13
Rana Majumdar, Tanupriya Choudhury, Ravi Tomar, and Rachna Jain
3 A Study on Machine Learning-Based Predictive Modelling for Pick Profiling at Distribution Centers . . . 27
Aswin Ramachandran Nair, Aarti Laddha, Andreas Munson, and Anil Sebastian
4 Analysis of Computational Intelligence Techniques in Smart Cities . . . 35
Ayesha Shakeel, Ved Prakash Mishra, Vinod Kumar Shukla, and Kamta Nath Mishra
5 Proposed End-to-End Automated E-Voting Through Blockchain Technology to Increase Voter's Turnout . . . 55
Ashish Singh Parihar, Devendra Prasad, Aishwarya Singh Gautam, and Swarnendu Kumar Chakraborty
6 Future of Data Generated by Interactive Media . . . 73
Divyansh Kumar and Neetu Narayan
7 Efficient Load Optimization Method Using VM Migration in Cloud Environment . . . 83
Sarita Negi, Man Mohan Singh Rauthan, Kunwar Singh Vaisla, and Neelam Panwar
8 Analysis of Search Space in the Domain of Swarm Intelligence . . . 99
Vaishali P. Patel, Manoj Kumar Rawat, and Amit S. Patel
9 Smart Cane 1.0 IoT-Based Walking Stick . . . 111
Azim Uddin Ansari, Anjul Gautam, Amit Kumar, Ayush Agarwal, and Ruchi Goel
10 Web Crawler for Ranking of Websites Based on Web Traffic and Page Views . . . 117
Nishchay Agrawal and Suman Pant
11 Semantic Enrichment for Non-factoid Question Answering . . . 129
Manvi Breja and Sanjay Kumar Jain
12 Genre-Based Recommendation on Community Cloud Using Apriori Algorithm . . . 139
Manvi Breja and Monika Yadav
13 AntVMp: An Approach for Energy-Efficient Placement of Virtual Machines Using Max–Min Ant System . . . 153
Varun Barthwal, Man Mohan Singh Rauthan, and Rohan Varma
14 CPU Performance Prediction Using Various Regression Algorithms . . . 163
Abdullah Al Sefat, G. M. Rasiqul Islam Rasiq, Nafiul Nawjis, and S. K. Tanzir Mehedi
15 IBM Watson: Redefining Artificial Intelligence Through Cognitive Computing . . . 173
Fatima Farid Petiwala, Vinod Kumar Shukla, and Sonali Vyas
16 An Improved Multi-objective Water Cycle Algorithm to Modify Inconsistent Matrix in Analytic Hierarchy Process . . . 187
Hemant Petwal and Rinkle Rani
17 Comparative Approach of Response Surface Methodology and Particle Swarm Optimization-Artificial Neural Network (PSO-ANN) in Rehydration Ratio Optimization for Bael (Aegle marmelos (L) Correa) Powder Production . . . 199
Tanmay Sarkar, Molla Salauddin, Sudipta Kumar Hazra, and Runu Chakraborty
18 Load Balancing with the Help of Round Robin and Shortest Job First Scheduling Algorithm in Cloud Computing . . . 213
Shubham Kumar and Ankur Dumka
19 Downlink Performance Improvement Using Base Station Cooperation for Multicell Cellular Networks . . . 225
Sandeep Srivastava, Pramod Kumar Srivastava, Tanupriya Choudhury, Ashok Kumar Yadav, and Ravi Tomar
20 A Survey on Moving Object Detection in Video Using a Moving Camera for Smart Surveillance System . . . 241
Manoj Kumar, Susmita Ray, and Dileep Kumar Yadav
21 Tomato Leaf Features Extraction for Early Disease Detection . . . 255
Vijaya Mishra, Richa Jain, Manisha Chahande, and Ashwani Kumar Dubey
22 Cardiac MRI Segmentation and Analysis in Fuzzy Domain . . . 267
Sandip Mal, Kanchan Lata Kashyap, and Niharika Das
23 Color Masking Method for Variable Luminosity in Videos with Application in Lane Detection Systems . . . 275
Om Rastogi
24 An Automated IoT Enabled Parking System . . . 285
Meghna Barthwal, Keshav Garg, and Mrinal Goswami
25 Real-Time Person Removal from Video . . . 295
B. Bharathi Kannan, A. Daniel, Dev Kumar Pandey, and Prashant Johri
26 Characterization and Identification of Tomato Plant Diseases . . . 299
Richa Jain, Vijaya Mishra, Manisha Chahande, and Ashwani Kumar Dubey
27 Dual-Layer Security and Access System to Prevent the Spread of COVID-19 . . . 313
Arjun Vaibhav Srivastava, Bhanu Prakash Lohani, Pradeep Kumar Kushwaha, and Suryansh Tyagi
28 Parameter Estimation of Software Reliability Using Soft Computing Techniques . . . 329
Sona Malhotra, Sanjeev Dhawan, and Narender
29 An Ensemble Approach for Handling Class Imbalanced Disease Datasets . . . 345
Sayan Surya Shaw, Shameem Ahmed, Samir Malakar, and Ram Sarkar
30 A Comparative Study on Recognition of Degraded Urdu and Devanagari Printed Documents . . . 357
Sobia Habib, Manoj Kumar Shukla, and Rajiv Kapoor
31 Machine Learning Based Feature Extraction of an Image: A Review . . . 369
Namra Shamim and Yogesh
32 A Hybrid Approach to Image Fusion Using DWT and Fuzzy Logic . . . 385
Archit Aggarwal and Garima Aggarwal
33 The BLEU Score for Automatic Evaluation of English to Bangla NMT . . . 399
Goutam Datta, Nisheeth Joshi, and Kusum Gupta
34 A Classifier to Predict Document Novelty Using Association Rule Mining . . . 409
Binesh Nair
35 Analysis of Faculty Teaching Performance Based on Student Feedback Using Fuzzy Mamdani Inference System . . . 421
Vedna Sharma and Sourabh Jain
36 Statistical and Geometrical Alignment for Unsupervised Deep Domain Adaptation . . . 433
Leehter Yao, Sonu Prasad, Rakesh Kumar Sanodiya, and Debdeep Paul
37 Deception Detection on "Bag-of-Lies": Integration of Multi-modal Data Using Machine Learning Algorithms . . . 445
Karnati Mohan and Ayan Seal
38 An LDA-Based Approach Towards Word Sense Disambiguation in Malayalam . . . 457
S. Sruthi, Kannan Balakrishnan, and Binu Paul
39 Functional Electrical Stimulation for Hand Movement of Person . . . 465
Avinash Bhute, Rakesh S. Patil, Shreyas Sanghai, Prajakta Yadav, and Sanketkumar Sathe
40 Neuro-image Classification for the Prediction of Alzheimer's Disease Using Machine Learning Techniques . . . 483
Yusera Farooq Khan and Baijnath Kaushik
41 Binary Classification of Celestial Bodies Using Supervised Machine Learning Algorithms . . . 495
Anwesha Ujjwal Barman, Kritika Shah, Kanchan Lata Kashyap, Avanish Sandilya, and Nishq Poorav Desai
42 Bangla Document Classification Using Deep Recurrent Neural Network with BiLSTM . . . 507
Saifur Rahman and Partha Chakraborty
43 Comparative Evaluation of Image Segmentation Techniques with Application to MRI Segmentation . . . 521
Rahul Chauhan and R. C. Joshi
44 An End-to-End Approach for Automatic and Consistent Colorization of Gray-Scale Videos Using Deep-Learning Techniques . . . 539
Amit Mahajan, Nishit Patel, Akshay Kotak, and Bhakti Palkar
45 Thermal Object Detection Using Yolov3 and Spatial Pyramid Pooling . . . 553
Sachin Kumar and Deepak Gaur
46 Security Issues of Edge Computing in IoT . . . 567
Adarsh Mall, Prajitesh Singh, Ashutosh Thute, Shailesh Pancham Khapre, and Achyut Shankar
47 Gradient Local Auto Correlation Co-occurrence Machine Learning Model for Endometrial Tuberculosis Identification . . . 581
Varsha Garg, Anita Sahoo, and Vikas Saxena
48 Intelligent Abnormality Detection Method in Cyber Physical Systems Using Machine Learning . . . 595
S. Krishna Narayanan, S. Dhanasekaran, and V. Vasudevan
49 Efficient Data Search and Update in Named Data Networking with Integrity Preservation . . . 607
Tanusree Chatterjee, Chirasree Pal, Apurba Kumar Gorai, and Sipra DasBit
50 A Comparison of Domain-Specific and Open-Domain Image Caption Generation . . . 621
Abinash Dutta, Sharath Chandra Punna, and Rohit Pratap Singh
51 Offline Computer-Synthesized Font Character Recognition Using Machine Learning Approaches . . . 635
Raghunath Dey and Rakesh Chandra Balabantaray
52 Analysis of Object Detection Algorithms Using Neural Networks . . . 649
Anmol Singh Ahluwalia, Anukarsh Sondhi, and Shruti Gupta
53 Real-Time Data Pipeline in Confluent Kafka and Mule4 ESB with ActiveMQ . . . 667
Ashutosh Srivastava, Prashant Johri, Arvind Kumar, Nitin Gaur, and Rachna Jain
54 Imbalanced Cardiotocography Data Classification Using Re-sampling Techniques . . . 681
Jayashree Piri and Puspanjali Mohapatra
55 An Exploration of Acoustic and Temporal Features for the Multiclass Classification of Bird Species . . . 693
Sugandha Gupta and Nilima Salankar
56 Dataset Annotation on Chronic Disease Comorbidities Study in Type 2 Diabetes Mellitus . . . 713
Suparna Dutta, Saswati Mukherjee, Susovan Jana, Medha Nag, and Sujoy Majumdar
57 A Deep Learning Solution to Detect Text-Types Using a Convolutional Neural Network . . . 727
A. K. M. Shahariar Azad Rabby, Md. Majedul Islam, Nazmul Hasan, Jebun Nahar, and Fuad Rahman
58 A Search of Diversity in Type Ia Supernova Using Self Organizing Map (SOM) . . . 737
Neha Malik, Vivek Jaglan, Shashikant Gupta, and Meenu Vijarania
59 Kickstarter Project Success Prediction and Classification Using Multi-layer Perceptron . . . 745
Suchitra Patil, Jahnavi M. Mehta, Hrishikesh S. Salunkhe, and Hitansh V. Shah
60 Detection of Artificially Ripen Mango Using Image Processing . . . 759
Md Mahbubur Rahman, S. M. Taohidul Islam, Md. Alif Biswas, and Chinmay Bepery
61 Intelligent Negotiation Agent Architecture for SLA Negotiation Process in Cloud Computing . . . 771
Rishi Kumar, Mohd Fadzil Hassan, and Muhamad Hariz M. Adnan
62 An Efficient Pneumonia Detection from the Chest X-Ray Images . . . 779
Rajdeep Chatterjee, Ankita Chatterjee, and Rohit Halder
63 Recognition of Car License Plate Name Using Reformed Template-Based Method . . . 791
Faisal Imran and Md. Ali Hossain
64 Impact of Deep Learning on Arts and Archaeology: An Image Classification Point of View . . . 801
Rajdeep Chatterjee, Ankita Chatterjee, and Rohit Halder
Author Index . . . 811
About the Editors
Prof. Dr. Manish Prateek received his bachelor’s and master’s degree from Southwest State University, Kursk, Russia, and in master’s degree his area of specialization was in microprocessor designing. He has received his Ph.D. degree in the year 2007 in the field of manufacturing and robotics. He has more than 10 years of experience with IT industry and 14 years of experience in teaching. Currently, he is working as Professor and Dean, School of CS at UPES Dehradun. He has also been Wing Founder of domain-specific programme at UG level with industry collaboration with IBM, Xebia, Oracle, etc. He has so far guided 7 Ph.D. scholars with 53 publications in international journals and conferences throughout India and abroad. He is Founder President of Next Generation Computing Technologies (NGCT). He is holding life membership of ISTE, and a member of CSI, and he is also Executive Vice President at Pentagram Research Centre Pvt. Ltd. He is also the recipient of lifetime achievement award for his contribution to research and academics by the board of directors of pentagram research centre in the year 2010 and also he is holding the Fellow of Institution of Engineers since 2020. Dr. T. P. Singh is currently positioned as Professor and HoD of Informatics Department, under School of Computer Science, University of Petroleum & Energy Studies, Dehradun, UK. He holds Doctorate in Computer Science from Jamia Millia Islamia University, New Delhi. He carries 22 years of rich experience with him. He has been associated with Tata Group and Sharda University, Greater Noida, NCR, India. His research interests include machine intelligence, pattern recognition and development of hybrid intelligent systems. He has guided 15 Masters Theses. Currently, 05 research scholars are working towards their doctoral degree under him. There are more than dozens of publications to his credit in various national and international journals. Dr. Singh is a member of various professional bodies including IEEE, ISTE, IAENG, etc., and also on the editorial/ reviewer panel of different journals.
Dr. Tanupriya Choudhury received his bachelor’s degree in CSE from West Bengal University of Technology, Kolkata, India, and master’s degree in CSE from Dr. M.G.R University, Chennai, India. He has received his Ph.D. degree in the year 2016. He has nine years of experience in teaching. Currently, he is working as Associate Professor in Informatics Department, under School of Computer Science at UPES Dehradun. Recently, he has received Global Outreach Education Award for Excellence in best Young researcher Award in GOECA 2018. His areas of interests include human computing, soft computing, cloud computing, data mining, etc. He has filed 14 patents till date and received 16 copyrights from MHRD for his own software. He has been associated with many conferences in India and abroad. He has authored more than 85 research papers till date. He has delivered invited talk and guest lecture in Jamia Millia Islamia University, Maharaja Agrasen College of Delhi University, Duy Tan University Vietnam, etc. He has been associated with many conferences throughout India as a TPC member and Session Chair. He is a lifetime member of IETA, a member of IEEE and a member of IET(UK) and other renowned technical societies. He is associated with Corporate, and he is Technical Adviser of DeetyaSoft Pvt. Ltd. Noida, IVRGURU and Mydigital360. Dr. Hari Mohan Pandey is Sr. Lecturer in the Department of Computer Science at Edge Hill University, UK. He is specialized in Computer Science & Engineering. His research area includes artificial intelligence, soft computing techniques, natural language processing, language acquisition and machine learning algorithms. He is Author of various books in computer science engineering and published over 50 scientific papers in reputed journals and conferences, served as Session Chair and Leading Guest Editor and delivered keynotes. He has been given the prestigious award “The Global Award for the Best Computer Science Faculty of the Year 2015”, award for completing INDO-US project “GENTLE”, award (Certificate of Exceptionalism) from the Prime Minister of India and award for developing innovative teaching and learning models for higher education. Previously, he worked as Research Fellow of machine learning at Middlesex University, London. He worked on a European Commission project—DREAM4CAR under H2020. He is Associate Fellow of Higher Education Academy (UK Professional Standard Framework) and has rich experience of teaching at higher education level. Dr. Nguyen Gia Nhu received the Ph.D. degree in Mathematical for Computer Science from Ha Noi University of Science, Vietnam National University, Vietnam. Currently, he is Dean of Graduate School, Duy Tan University, Vietnam. He has a total academic teaching experience of 19 years with more than 60 publications in reputed international conferences, journals and online book chapter contributions (Indexed By: SCI, SCIE, SSCI, Scopus, ACM DL, DBLP). His area of research includes healthcare informatics, network performance analysis and simulation, and computational intelligence. Recently, he has been in the Technical programme committee and review committee and Track Chair for international conferences:
FICTA 2014, ICICT 2015, INDIA 2015, IC3T 2015, INDIA 2016, FICTA 2016, IC3T 2016, IUKM 2016, INDIA 2017, FICTA 2017, FICTA 2018, INISCOM 2018, INISCOM 2019 under the Springer ASIC/LNAI series. He has published six computer science books with Springer, IGI Global, CRC and Wiley. Presently, he is Associate Editor of the IGI Global International Journal of Synthetic Emotions (IJSE).
Chapter 1
Probabilistic Machine Learning Using Social Network Analysis A. Rohini, T. Sudalai Muthu, and Tanupriya Choudhury
1 Introduction

The study of the structural approach based on interactions among nodes or actors is called social network analysis; the relationships link individual objects or nodes, which may be demographics, organizations or actors. Students report that they are often unable to identify better academic institutions. Even meritorious students frequently end up in colleges below their potential because of family circumstances and a lack of information about the reputed institutions around them. This field of study deals with building awareness for choosing a higher education programme according to one's capability, affordability and accessibility. A challenging problem that arises in this domain is the multiplicity of connections and tie strengths. Only a few studies have examined the connectivity of links on social media for searching universities. A solution to this problem, informed by the latest educational trends, suggests to students the right universities based on their needs. Finding the right college involves some crucial milestones; research parameters help you to know yourself, your strengths, weaknesses, aptitude, perception, passion, interests and ambitions, which should be taken into account while selecting universities. The tool evaluates the parameters and then maps them to the respective parameters. The result of the test helps to give suggestion links and to predict for the students the right guidance and approach, as it helps them to understand. In this paper, the social network plays a role in the transactions, behaviors and preferences of connecting agents; this empathetic preference satisfies their intrinsic preferences. The objective of the classifier is to predict the learning environment based on the weather condition and built area. The impact of social network analysis gives the real facets of the socio-economic situation, determined by the crucial factors along with the parameters. Identifying individual link traits is a major concern of research in network analysis. The current focus of social data is pointing toward influence propagation in the analytics.

A. Rohini (B) · T. S. Muthu, Hindustan Institute of Technology and Science, Chennai, India, e-mail: [email protected]; T. S. Muthu, e-mail: [email protected]; T. Choudhury, Department of Informatics, University of Petroleum and Energy Studies (UPES), Bidholi, Dehradun, Uttarakhand, India, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_1
2 Related Works Nowadays, academic education is being focused on big data, to deliver the learning environment for the public [1]. Data analytics present to educational institutions give a framework for a vast array of data efficiently in the learning environment [2]. It is a relevant significant number of issues in the learning system and skill to the students to enhance the future. The results were analyzed by using the Bayes algorithm and the poison distribution model in the large training sample set [3]. Proposed a method of discretization and structural improvement, to deal with the independent and relevant attributes based on the mutual and conditional set [4]. The authors bring some information about the text classification using Bayes model and their related experiments were proved for the classification and clustering tech-niques in the preprocessing data [5]. Authors have carried out the data evaluation by three different parameters for access the data frequency, quantity and recent access used the weightbased algorithm, yielded values had given better hit rates in various link weight [6]. Gizem Korkmaz et al., addresses big data analytics in higher education. The study has gathered the learning management system, social network blogs, and student information system. A more comprehensive description can be found in the decision making [7]. Experimented in data replication and the results were compared with the Binomial placed replacement algorithm [8]. Reviewed in worldwide educational system big data is an integral part; it is improving learning and teaching methodologies [9]. Researcher has reviewed systematic literature review for big data analytics for education landscape. But it is not sufficient for the predictive models and educational framework [10]. Processed the forecasting future courses in organizations and suggest future planning analyzation for business. Their case study has discussed with All India Council of Technical education [11]. Focused the Big data is significant role in university opportunities and to educational quality guide to the students and recommends that institutions carry out programs to relevant data science, which is every individual student gets maximum benefit and improve the educational quality [12, 13]. They identified the issues in educational research and introduced the data
science methodology in academic research which is the development of educational research with epistemic tools and approach in data science [14, 15].
3 Link Probability Model

The probabilities of the links have been derived from empirical analysis in different perspectives from which the data was collected over multiple frequencies of time ranges on a group of nodes; the estimated link probabilities are the proportion of link occurrences in each node of an adjacency matrix, and the statistical distribution of each potential link has each set of nodes V1, V2, ..., Vn (Fig. 1).

Fig. 1 Multinomial Mining Permissible classifier model (pipeline: Pre-Processing → Training → Multinomial Mining Permissible Classifier → Trained Model Identification → Predicted Data → Decision)
Multinomial Mining Permissible algorithm is an approach to estimate the probabilistic distribution of all possible values; this mining permissible algorithm is to derive the complexity of all possible inputs of likelihood evidence and predict the class values.
4 Classification of Nominal Nodes

In a knowledge data mining classification, there are multiple sets of nodes, say N1, N2, ..., Nk. The prediction of node links aims to calculate the possibilities of an object by the conditional probability of a feature vector V1, V2, V3, .... Equation (1) shows that an object belongs to a particular node Ni; it can be described by each outcome of the links, statistically defined by
P(N_i \mid V_1, V_2, \ldots, V_n) = \frac{P(V_1, V_2, \ldots, V_n \mid N_i)\, P(N_i)}{P(V_1, V_2, \ldots, V_n)}, \quad 1 \le i \le k

Now each possible outcome of the probability:

P(v_1, v_2, \ldots, v_n \mid N_i)\, P(N_i) = P(v_1, v_2, \ldots, v_n, N_i)   (1)
= P(v_1 \mid v_2, \ldots, v_n, N_i)\, P(v_2, \ldots, v_n, N_i)   (2)
= P(v_1 \mid v_2, \ldots, v_n, N_i)\, P(v_2 \mid v_3, \ldots, v_n, N_i) \cdots   (3)
Finally, the conditional probability of features is independent and the assumption of the event factors gives:

P(N_i \mid v_1, v_2, \ldots, v_n) = \frac{P(N_i) \prod_{j=1}^{n} P(v_j \mid N_i)}{P(v_1, v_2, \ldots, v_n)}, \quad 1 \le i \le k   (4)

Since P(V_1, V_2, \ldots, V_n) is stable (constant) for all classes,

P(N_i \mid v_1, v_2, \ldots, v_n) = P(N_i) \prod_{j=1}^{n} P(v_j \mid N_i), \quad 1 \le i \le k   (5)
From Eq. (5), the average complexity value has estimated and evaluated in Tables 1, 2, 3 and 4 from different perspectives of the location. The obtained average-case complexity has shown in Figs. 3 and 4.
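To make the scoring rule of Eq. (5) concrete, the short sketch below estimates class priors and per-feature likelihoods from counts and multiplies them to score each class for a new instance. The records, feature values and smoothing choice are purely illustrative assumptions, not data from the paper; only the general form of the computation follows Eq. (5).

```python
from collections import Counter, defaultdict

# Toy records: (built_area, weather, environment) -> liked (T/F); values are illustrative only.
records = [
    (("Hill", "Cool", "Green campus"), "T"),
    (("Urban", "Hot", "High rise building"), "F"),
    (("Town", "Hot", "High rise building"), "T"),
    (("Urban", "Hot", "Green campus"), "T"),
    (("Cosmopolitan", "Mild", "High rise building"), "F"),
]

priors = Counter(label for _, label in records)
likelihoods = defaultdict(Counter)  # (feature index, class label) -> value counts
for features, label in records:
    for idx, value in enumerate(features):
        likelihoods[(idx, label)][value] += 1

def score(features, label):
    """Unnormalized P(label) * prod_j P(v_j | label), as in Eq. (5)."""
    s = priors[label] / sum(priors.values())
    for idx, value in enumerate(features):
        counts = likelihoods[(idx, label)]
        # simple Laplace smoothing so unseen feature values do not zero the product
        s *= (counts[value] + 1) / (sum(counts.values()) + len(counts) + 1)
    return s

query = ("Urban", "Hot", "High rise building")
best = max(priors, key=lambda lbl: score(query, lbl))
print({lbl: round(score(query, lbl), 4) for lbl in priors}, "->", best)
```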
5 The Outlook of Likelihood Features

The infidelity of nodes will occur and create problems for understanding; these nodes are unable to give correctness and reliability, and there is no supporting evidence. The dataset was collected through different environments, capturing the influence of all the nodes involved in a particular network over a particular path of characteristics of the social network. Thumbnail analysis provides the link between the student and organizational features; decision rules are used in the datasets. The probability of a student getting a university is connected to the traits, location, mode of study and specialization of courses and the conventions that are going on at the time. The factor analysis facilitates the grouping of variables and is described by the learning analytics through an end-user. Using the Multinomial Mining Permissible (MMP) algorithm, each feature is classified as self-sufficient (independent) of the others. Consider the dataset which describes the location of an organization and tells how students have to choose the location of an organization (Fig. 2).

Location → {Built area, Weather condition, Learning environment}

Fig. 2 Classifier for location (Location → Built Area {Hill, Urban, Town, Cosmopolitan}; Weather {Mild, Hot, Cold})
How students will choose the college; each feature has correlated with the students. With the attributes of relation, when the features are not paired it is dependent.
6 Maximum Likelihood Training Maximum Likelihood training can be evaluated by closed-form expressions. Parameters of Weather and Learning environment possibilities of classification are shown here. Weather conditions being cold have nothing to do with the learning environment which is green campus; hence, the features are assumed to be independent. Secondly, each feature is given the weight to predict the outcome of Location. Below given features are the same as tabulated. Our proposed algorithm, from Eq. (6), finds the probability of an occurring event node and the already occurred event P(E|W ) = (P(W |E)P(E))/(P(W ))
(6)
the learning environment (E) and weather conditions (W) of an event are used to find the probability of event E given that event W is true. Event E is also termed the evidence of location. P(E) is the prior of E. The evidence is an attribute value of an unknown instance (it is W). P(E|W) is the a posteriori probability of W; it is the probability of the event after the evidence is seen.

Location → {Built area, Hostel, Weather condition, Learning Environment}

It is termed as L → {B, H, W, L}. A feature vector and the corresponding class variables can be B = (Hill, Urban, Cosmopolitan, Rural, Hot, Cold) and L = (Green campus, High rise building):

P(L \mid B) = \frac{P(B \mid L)\, P(L)}{P(B)}   (7)

P(B|L) here means the probability of the built area and location conditions given that the university belongs to the green campus or high rise building class. Similarly, P(W|B) is obtained for

B = b_1, b_2, b_3, \ldots, b_n   (8)
The remaining parameters are similarly categorized from (8).
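As a rough sketch of the maximum likelihood step, the snippet below estimates the likelihood weights P(e | w) as relative frequencies (the count of a feature value within a class divided by the class total), producing a small table in the spirit of Tables 1 to 3. The toy observations and the helper name likelihood_table are assumptions for illustration, not the paper's dataset or code.

```python
from collections import Counter

# Illustrative observations: (feature value, class label), where T/F marks whether a node liked it.
built_area = [("Hill", "T"), ("Urban", "T"), ("Urban", "F"), ("Town", "T"), ("Cosmopolitan", "F")]

def likelihood_table(observations):
    """Maximum likelihood estimate of P(value | class) for a single categorical feature."""
    class_totals = Counter(label for _, label in observations)
    joint = Counter(observations)
    values = sorted({value for value, _ in observations})
    return {
        value: {label: joint[(value, label)] / class_totals[label] for label in class_totals}
        for value in values
    }

for value, probs in likelihood_table(built_area).items():
    print(f"{value:14s} P(T)={probs.get('T', 0):.3f}  P(F)={probs.get('F', 0):.3f}")
```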
7 Result and Discussion

The assumption of conditional independence among the three features lets us split the evidence into independent parts: if two events E and W are independent, then P(E, W) = P(E) P(W). The class value w chosen is the one that yields the maximum probability over the set of inputs:

P(w \mid e_1, \ldots, e_n) = \frac{P(e_1 \mid w)\, P(e_2 \mid w) \cdots P(e_n \mid w)}{P(e_1)\, P(e_2) \cdots P(e_n)}   (9)

P(w \mid e_1, \ldots, e_n) = P(w) \prod_{i=1}^{n} P(e_i \mid w)   (10)
Calculating the variables P(w) and P(E|W ). P(y) is a class probability and P(E i |W ) is a conditional probability, the assumptions of the distribution P. Bayes classifiers differ mainly regarding the distribution of P(E i |W ).
The above formula has been applied on the location dataset, and the pre-computations are as follows. We need to find P(e_i | w_j) for each e_i in E and w_j in W. In our educational dataset for locations, we used some features for predicting the students' conditional values; by the independence assumption, we make distributions regarding P(e_i | w) and P(e_i | w_j) for each e_i in E and w_j in W. All these calculations are demonstrated in the tables below. The indicator random variable facilitates the use of the multinomial mining classifier of probabilistic analysis for the likelihood values of built area:

I(B) = { 0 if the node does not like, 1 if the node does like }

Using this we have Table 1. The evidence value of P(T) = 0.453 and P(F) = 0.5 for urban; high rise building is the highest evidence value compared to the green campus (Tables 2, 3 and 4). Hill and cool conditions raise the probability of choosing the green campus environment, while hot, mild, urban and town conditions favour high rise buildings. Similarly, the other features are iterated and their values categorized; these will be predicted in our class variables (Fig. 3).

Table 1 Likelihood weight values for built area

No. | Built area (B) | T | F | P(T)  | P(F)
1   | Hill           | 1 | 0 | 0.286 | 0.1
2   | Urban          | 1 | 1 | 0.453 | 0.5
3   | Town           | 1 | 0 | 0.416 | 0.4
4   | Cosmopolitan   | 0 | 1 | 0.245 | 0.32
Table 2 Likelihood weight values for weather

No. | Weather (W) | T | F | P(T) | P(F)
1   | Cool        | 2 | 0 | 0.4  | 0
2   | Hot         | 3 | 2 | 0.6  | 1
3   | Mild        | 1 | 0 | 0.2  | 0
Table 3 Likelihood weight values for the learning environment

No. | Learning environment (E) | T | F | P(T) | P(F)
1   | Green campus             | 3 | 0 | 1    | 0
2   | High rise building       | 0 | 2 | 0    | 1
Table 4 Discriminate values in certain possibilities

No. | Conditions | Possibilities | P(T) P(F)
1   | True       | 3             | 0.6
2   | False      | 2             | 0.8
Fig. 3 Likelihood weight values of built area
The line graph shows the percentage of students to choose some aspects of the built area; we can see that the range of 0–1 of the lesser attraction was cosmopolitan fallen down in the range of 0.2 in the P(T ) of likelihood condition. To least popular attraction was town in the range of 0.4 and gradually likelihood values of urban 0.453 and then random decrease of the hill is 0.286 less than the urban in the likelihood of P(T ) condition. When compared to P(F), the urban has raised due to the hot variable of weather. According to the graph, urban, town and cosmopolitan are chosen 50% of nodes has accepted likely to study the high raise building of learning environment in the P(T ) condition but the nodes are not like the green campus environment due to the weather parameters of cool in the P(F) condition in the possibilities of discriminate values of 3 and 2. During the town, there was a sharp decrease in students from 10 to 20%, then the ranges has gradually increased to 40% in the P(T ) condition and raise down to less than 10% in the P(F) condition. The parameter of the cosmopolitan similar to the town, students are being chosen cosmopolitan increased rapidly to 100% then gradually raise down at the P(T ) condition and gradually raised from P(T ) condition to less than 10–40% it will raise in future. The line for the urban increased 100% and steady between true and false conditions; it raises down up to 40% and a significant raise to 60%. This analyzation framework has preserved the probability of likelihood nodes to the flexibility of dyadic node of relationship to understand the probability of choosing the location of the organization; thus, statistical analysis values are realizing the links and nodes of the traits to choose the location by the social network link analysis. The
parameters of the distribution depend on the various location factors such as built area, weather and learning environment. The MMP algorithm analyzes the links as friendship, followers of friends and common interest. Thus, dependent variables of relationship have communicated the frequency in the structural data environment. The given graph shows the possibilities of the learning environment in two conditions P(T) and P(F). As we observed from the graph, green campus got less like 40% than the high raise building of 40%. On the average, high raise building got more like from green campus from the possibilities of choosing high raise. Initially, the green campus has 30% out of 35%. The possibilities of chance declined and reached less than 10% at P(F) condition. On the other side, high raise building possibilities of chances to choose 40–50% gradually raise at P(T) condition. In conclusion, high raise building had been more possibilities of chances to like than the green campus in terms of P(T) and showed fluctuation in Fig. 4.

Fig. 4 Possibilities of learning environment
8 Conclusion and Future Work

The study analyzed four factors of career guidance efficacy and used behavioral analysis on social media data to predict the correlation between students and educational institutions. The set of parameters includes location, weather condition and nature of the building/environment. We proposed multinomial mining-based algorithms to correlate the factors. The results were analyzed and discussed, with scope limited to a constrained environment. The prediction values for location, weather,
learning environment yielded a good accuracy of 91.5% for the prediction of the correlation ratio of on-location learning environments. Further study extends the computation to international universities, and predicting the students' links will be the focus of future knowledge mining in machine learning analysis.
References 1. Alemany J (2019) Metrics for privacy assessment when sharing information in online social networks. IEEE Access 7:143631–143645 2. Babu SS (2020) Earlier detection of rumors in online social networks using certainty-factorbased convolutional neural networks. Soc Netw Anal Min 10(1):1–17 3. Chakraborty K (2020) A survey of sentiment analysis from social media data. IEEE Trans Comput Soc Syst 7(2):450–464 4. Chen P-Y (2019) Identifying influential links for event propagation on twitter: a network of networks approach. IEEE Trans Sig Inf Process Over Netw 5(1):139–151 5. Fahad Razaque NS (2018) Using Naïve Bayes algorithm to student guidance. IEEE J 5(1):139– 151 6. Ghori KM (2020) Performance analysis of different types of machine learning classifiers for non-technical loss detection. IEEE Access 8:16033–16048 7. Gizem Korkmaz CJ (2019) A computational study of homophily and diffusion of common knowledge on social networks based on a model of Facebook. Soc Netw Anal Min 10(1):5 8. Qiao S, Han N, Gao Y (2020) Dynamic community evolution analysis framework for largescale complex networks based on strong and weak events. IEEE transactions on systems, man, and cybernetics: systems. https://doi.org/10.1109/TSMC.2019.2960085 9. Zhao G, Lei X, Qian X, Mei T (2019) Exploring users internal influence from reviews for social recommendation. IEEE Trans Multimedia 21(3):771–781 10. Hamid Khalifi SD (2020) Enhancing information retrieval performance by using social analysis. Soc Netw Anal Min 10(1):24 11. Hashimoto TT (2019) Time series topic transition based on micro-clustering. In: IEEE international conference on big data and smart computing (BigComp) 12. Guo J (2019) Node degree and neighbourhood tightness based link prediction in social networks. IEEE Explorer 13. Chu J, Wang Y (2020) Social network community analysis based large-scale group decision making approach with incomplete fuzzy preference relations. Inf Fusion 60:98–120 14. Wang L, Sun T (2020) Applying social network analysis to genetic algorithm in optimizing project risk response decisions. Inf Sci 512:1024–1042 15. Lu M (2019a) A unified link prediction framework for predicting arbitrary relations in heterogeneous academic networks. IEEE Access 7:124967–124987 16. Lu M (2019b) Topic influence analysis based on user intimacy and social circle difference. IEEE Access 7:101665–101680 17. Lu Z, Sagduyu YE, Shi Y (2019) Integrating social links into wireless networks: modeling, routing, analysis, and evaluation. IEEE Trans Mob Comput 18(1):111–124 18. Hashimoto T, Uno T, Kuboyama T, Shin K, Shepard D (2019) Time series topic transition based on micro-clustering. IEEE international conference on big data and smart computing (BigComp). https://doi.org/10.1109/BIGCOMP.2019.8679255 19. Guo J, Shi L, Liu L (2019) Node degree and neighbourhood tightness based link prediction in social networks. 9th International Conference on Information Science and Technology (ICIST). https://doi.org/10.1109/ICIST.2019.8836821
20. Kamis NH, Chiclana F (2019) An influence-driven feedback system for preference similarity network clustering based consensus group decision making model. Inf Fusion 52:257–267 21. Octavio Loyola-González AL-C-P (2019) Fusing pattern discovery and visual analytics approaches in tweet propagation. Inf Fusion 46:91–101 22. Piškorec M, Šmuc T, Šiki´c M (2019) Disentangling sources of influence in online social networks. IEEE Access 7:131692–131704 23. Ren R, Tang M (2020) Managing minority opinions in micro-grid planning by a social network analysis-based large scale group decision making method with hesitant fuzzy linguistic information. Knowl Based Syst 189:105060 24. Skaperas SL (2019) Real-time video content popularity detection based on mean change point analysis. IEEE Access 7:142246–142260 25. Song B, Wang X (2020) Reliability analysis of large-scale adaptive weighted networks. IEEE Trans Inf For Secur 15:651–665
Chapter 2
Prioritization of Disaster Recovery Aspects Implementing DEMATEL Technique Rana Majumdar, Tanupriya Choudhury, Ravi Tomar, and Rachna Jain
1 Introduction In today’s competitive world, organizations irrespective of its dominance in market might experience losses in terms of market potential if disaster transpired which in turn might increase downtime leads to a catastrophic situation. There is a widespread and continuing belief that the bigger the problem, the more involvement of technology is necessary to fix it. This is, of course, a conceptual situation in its own right. As it seldom succeeds, but remains wildly popular, the result is that worsening conditions engender yet more dependence on technological solutions, and vulnerability continues to rise. Disasters are unpredictable in nature and can happen with no prior caution [1]. So from organizations point of view, retrieving loosed or damaged data can prove to be costly and time intense especially for small-scale organization with less emphasis on disaster recovery solution. On contrary, organization with well-equipped tools and techniques might reduce the loss up to some
R. Majumdar (B) Meghnad Saha Institute of Technology, Kolkata, India e-mail: [email protected] T. Choudhury · R. Tomar Department of Informatics, School of Computer Science, University of Petroleum and Energy Studies (UPES), Bidholi, Dehradun, Uttarakhand, India e-mail: [email protected] R. Tomar e-mail: [email protected] R. Jain Amity University Tashkent, Tashkent, Uzbekistan e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_2
extent. Thus, it is necessary for organizations to equipped themselves with such situation and tackle such situations with greater control and easy. DEMATEL method implemented here for the purpose of identifying and prioritizing the utmost critical attributes for disaster recovery solution [2]. In recent times, MCDM approach used as a good tool for examine and evaluating of attributes since they are able to use a combination of qualitative and quantitative factors with greater flexibility and can be introduced to any organization [3] based on their problem statement. From past few decades, organizations have become extremely dependent on information technology to mark their presence and improve their efficiency for better decision making and for better governance of the establishment. Therefore, organizations with IT enabled services should encourage disaster prone environment for better functionality and operation [4, 5]. DEMATEL technique which is used for prioritizing disaster recovery attributes helps to attain better knowledge about attributes to recognize their influential capability in this regard. Taken together, all observations ensure that there is no guarantee that a better understanding of vulnerability would lead to better management of it, but it is nevertheless clear that more and more knowledge of physical hazards does less and less for the process of reducing disaster. Moreover, difficulties arise in choosing the right attributes for disaster recovery solution keeping in mind that the goals should be practical, less complicated while handling criticality like uncertainties. From the past few years, researchers trying to answer a common query, “Will a set of attributes work every where?” To get an idea about its applicability and more notably its desirability [6]. DEMATEL tries to resolve this issue through its unique orientation process by identifying and prioritizing the attributes based on their criticality and by providing a model that can help large corporations be disaster ready in the near future [7]. Another reason is that disaster recovery attributes selection decisions are contingent and resource dependent, which may vary based on the available resources, their effects and present technology. Literature review reveals that there exists other MCDM techniques with their own formation and orientation but at times they are more problem specific. This study preferred DEMATEL over other existing technique because of its uniqueness. In this, it influences components of a system with an initial direct relation matrix. Influences of components can ripple transitively to other components, which is modeled by raising the initial direct relation matrix to powers. The total influence is computed by summing up matrices of all powers based on the assumption that the matrix raising to the power of infinity would converge to zero. On the other hand, while using DEMATEL technique, as it is based on graph theory, a complete pictorial discerption is available for visualization and further it divide attributes in two groups: cause and effect, and exhibiting casual relationships between criteria’s visually, hence reducing our efforts and increasing the efficiency of the decision-making process [8, 9].
2 Methodology Adopted Decision support tools and techniques are used by organizations to get a better understanding about their precise objective and common goals. But unfortunately, their implementation is still restricted due to its subjectivity and small and medium sized organizations need to put more emphasis on this aspect. Although presently rigorous efforts have been made by organization’s to implement these tools but majorly for supportive tasks and not for actual decision making [10]. Decision-Making Trial and Evaluation Laboratory (DEMATEL) is one of those techniques. This technique helps in making decisions using the Multi-Criteria Decision-Making (MCDM) approach [11, 12]. In order to challenge disaster, unwanted situations here the authors identified ten critical factors as attributes to provide support and demonstrate its applicability to the industry practitioners. They are summarized in Table 1.
Table 1 Identified attributes

Sl. No. | Attribute | Expanded form                    | Description
P1      | BST       | Backup strategies                | Generating available verified copies of the database files and transaction journals
P2      | BS        | Backup services                  | For initiating the backup process
P3      | PM        | Project management               | Temporary endeavor undertaken to create a unique product or service
P4      | RPO       | Recovery point objective         | Time between two backup operations
P5      | DRP       | Disaster recovery procedure      | Tools and techniques for the recovery process
P6      | ND        | Natural disaster                 | Major events that result from natural processes of nature
P7      | RTO       | Recovery time objective          | Duration of time and a service level within which a business process must be restored after a disaster
P8      | TI        | Telecommunication infrastructure | Exchange of data over significant distances using electronics
P9      | MD        | Man-made disaster                | Disastrous event caused due to negligent human actions
P10     | DM        | Detecting and monitoring         | Initiating control mechanisms

2.1 Business Continuity and Disaster Recovery

In the literature, disaster recovery refers to actions initiated by organizations to resume normal operations after an unwanted and catastrophic shutdown [13, 14]. In the field
of information technology, prompt action has proven to be beneficial from a business perspective, as it involves restoring servers or mainframes from backups and provisioning LANs to make them fully operational within the RPO and RTO. Business continuity talks about protocols and standard procedures for smooth operation against disaster [15]. It also emphasizes more practical planning for long-term success. Team members, supply chain breakdowns, failures and critical malware infections are some of the major concerns for the business continuity process.
2.2 DEMATEL Technique

The DEMATEL technique was developed in the years 1972–1979 by the Geneva Research Center of the Battelle Memorial Institute. The core purpose of the DEMATEL technique was to study the performance of multifaceted and problematical clusters. The method transforms the most complicated systems into a simple causal structure. It has already been widely accepted as a good tool to resolve cause and effect relationships among aspects and criteria [16, 17]. This method can be used to finalize the relationship among the attributes. The DEMATEL method is being applied in many diverse fields like decision making, knowledge management, marketing, consumer behavior, etc. This section comprises the data collection, the metaphors and the DEMATEL assessment model. Data was collected through a survey where questionnaires were framed to gather information about the degree of influence of factors influencing disaster recovery solutions from a group of targeted contributors. They were requested to rate the degree of influence between two factors on a scale of 0 to 4. It was completed in a pairwise comparison manner where one factor was compared with another factor. One of the objectives of the DEMATEL method is judging all the direct and indirect causal relationships and the strength of influence between all variables of a complicated system by using matrix calculation [10]. Figure 1 illustrates the phases involved in the DEMATEL process. It consists of six phases, execution of which will transform to a framework for real-time implementations. Z will be the average of the entire matrix generated from all the experts.

Phase 1: Accumulating experts' opinions and calculation of the average matrix Z (Table 2):

Z = \begin{bmatrix} 0 & x_{12} & \cdots & x_{1n} \\ x_{21} & 0 & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & 0 \end{bmatrix}   (1)
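As a minimal sketch of Phase 1, the snippet below averages several expert score matrices into Z. The individual expert matrices shown are hypothetical placeholders (the chapter only reports the averaged Z of Table 2); only the element-wise averaging step itself follows Eq. (1).

```python
import numpy as np

# Attribute order used throughout the chapter (Table 1).
ATTRS = ["BST", "BS", "PM", "RPO", "DRP", "ND", "RTO", "TI", "MD", "DM"]

def average_matrix(expert_matrices):
    """Phase 1: element-wise average of the experts' 10x10 influence matrices.

    Each matrix holds pairwise scores on the 0-4 scale, with zeros on the
    diagonal (an attribute does not influence itself).
    """
    stacked = np.stack([np.asarray(m, dtype=float) for m in expert_matrices])
    return stacked.mean(axis=0)

# Hypothetical example: four experts with random integer scores 0-4 off the diagonal.
rng = np.random.default_rng(0)
experts = []
for _ in range(4):
    m = rng.integers(0, 5, size=(10, 10)).astype(float)
    np.fill_diagonal(m, 0.0)
    experts.append(m)

Z = average_matrix(experts)
print(np.round(Z, 2))
```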
Based on the questionnaires, experts are asked to provide their independent opinion marks on the basis of the degree of direct influence between the identified attributes, i.e., the influence of one attribute on the remaining ones. The score is categorized as: no influence (0), low influence (1), medium influence (2), high influence (3) and very high influence (4).

Fig. 1 Schematic diagram showing steps followed in DEMATEL method

Table 2 Z matrix (realizing from Eq. (1))

Z matrix | BST  | BS   | PM   | RPO  | DRP  | ND   | RTO  | TI   | MD   | DM
BST      | 0    | 2.75 | 2.25 | 2.75 | 2.25 | 1.75 | 2.25 | 2    | 2.5  | 1.75
BS       | 2    | 0    | 2.5  | 3    | 1.75 | 1.5  | 2.75 | 2.5  | 2.5  | 2.25
PM       | 2.25 | 1.5  | 0    | 2    | 2.5  | 2.25 | 1.75 | 2    | 2.75 | 1.5
RPO      | 1.25 | 3.25 | 1.75 | 0    | 1.75 | 2.5  | 1.75 | 2    | 1.75 | 2.5
DRP      | 2.5  | 1.75 | 2    | 2    | 0    | 2.5  | 1.75 | 2.75 | 2    | 1.5
ND       | 2.5  | 2.25 | 1.75 | 2.25 | 2.5  | 0    | 2    | 1.75 | 2.5  | 1.75
RTO      | 2.5  | 1.25 | 1.75 | 2.75 | 2.5  | 2    | 0    | 1    | 1.5  | 3
TI       | 1.5  | 2    | 2.25 | 1.25 | 2.25 | 1.75 | 2.5  | 0    | 2.25 | 2.5
MD       | 2.25 | 2.25 | 1.75 | 1.5  | 2.5  | 1.75 | 2.75 | 1.25 | 0    | 1.75
DM       | 1.75 | 1.75 | 2.5  | 2    | 2    | 2    | 2.5  | 2.25 | 2    | 0
Every expert is expected to form an n × n non-negative matrix A^k = [a^k_bc], where k denotes the respective expert involved in the process. The average of these expert matrices is denoted as the Z matrix.

Phase 2: Normalized initial direct-relation matrix D. Each element of matrix D lies between 0 and 1; the calculation is expressed below (Table 3). Define
$$
\lambda = \frac{1}{\max\limits_{1 \le i \le n} \sum_{j=1}^{n} x_{ij}} \quad \text{and} \quad N = \lambda X \tag{2}
$$
Phase 3: Derivation of the total relation matrix T. The total relation matrix T is derived from the equation below, where I is the identity matrix of size n × n. An element of T reflects the indirect effect that factor i has on factor j.

$$
T = \lim_{k \to \infty} \left( N + N^2 + \cdots + N^k \right) = N (I - N)^{-1} \tag{3}
$$
Phase 4: In matrix T, calculate the sum of each row and each column, where r and c denote the row sums and column sums, respectively; together they give the prominence, which identifies the degree of importance of each criterion (Table 4).

$$
r_i = \left[ \sum_{j=1}^{n} t_{ij} \right]_{n \times 1} \tag{4}
$$

$$
c_j = \left[ \sum_{i=1}^{n} t_{ij} \right]_{1 \times n} \tag{5}
$$
Finally, one needs to calculate r + c and r - c. The value of ri + ci reflects the total effect associated with attribute i, both as a cause of effects on other attributes and as a receiver of their effects. A higher ri + ci value indicates higher priority, so the attributes can be ranked on the basis of their ri + ci scores and this rank reflects their priority. The net contribution of an attribute to the system is denoted by ri - ci: if ri - ci is positive, the attribute belongs to the cause group and directly influences the other attributes; if ri - ci is negative, the attribute is affected by the other attributes.

Phase 5: Calculate the value of alpha (threshold value). The value of alpha is calculated as the mean of all elements of the total relation matrix T.

Phase 6: Plot the cause and effect relationship graph. This graph is realised using the ri + ci and ri - ci values and demonstrates the cause-and-effect relationships among the disaster recovery attributes. The causal diagram is obtained by mapping all coordinates (ri + ci, ri - ci) onto a two-dimensional plane, which may provide insight when making decisions; it visualizes the importance and classification of all criteria.
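To make the six phases concrete, the following is a minimal sketch of the calculation in Python with NumPy; it is only an illustration assuming the averaged expert scores are available as a square array, and the function name and demo values are hypothetical, not part of the study.

```python
# Illustrative sketch of the DEMATEL phases described above, using NumPy.
import numpy as np

def dematel(Z):
    """Return prominence (r+c), net effect (r-c) and the threshold alpha."""
    Z = np.asarray(Z, dtype=float)
    n = Z.shape[0]
    # Phase 2: normalize the direct-relation matrix, Eq. (2)
    lam = 1.0 / Z.sum(axis=1).max()
    N = lam * Z
    # Phase 3: total relation matrix, Eq. (3)
    T = N @ np.linalg.inv(np.eye(n) - N)
    # Phase 4: row sums r and column sums c, Eqs. (4)-(5)
    r = T.sum(axis=1)
    c = T.sum(axis=0)
    prominence = r + c    # degree of importance of each attribute
    net_effect = r - c    # positive = cause group, negative = effect group
    # Phase 5: threshold value alpha (mean of all elements of T)
    alpha = T.mean()
    return prominence, net_effect, alpha

# Hypothetical 3x3 example of averaged expert scores:
Z_demo = [[0, 3, 2],
          [1, 0, 4],
          [2, 2, 0]]
p, d, a = dematel(Z_demo)
print(np.argsort(-p))  # attribute indices ranked by prominence, highest first
```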
Table 3 D matrix (realizing from Eq. (2))

| D matrix | BST | BS | PM | RPO | DRP | ND | RTO | TI | MD | DM |
|---|---|---|---|---|---|---|---|---|---|---|
| BST | 0 | 0.1325 | 0.1084 | 0.1325 | 0.1084 | 0.0843 | 0.1084 | 0.0964 | 0.1205 | 0.0843 |
| BS | 0.0964 | 0 | 0.1205 | 0.1446 | 0.0843 | 0.0723 | 0.1325 | 0.1205 | 0.1205 | 0.1084 |
| PM | 0.1084 | 0.0723 | 0 | 0.0964 | 0.1205 | 0.1084 | 0.0843 | 0.0964 | 0.1325 | 0.0723 |
| RPO | 0.0602 | 0.1566 | 0.0843 | 0 | 0.0843 | 0.1205 | 0.0843 | 0.0964 | 0.0843 | 0.1205 |
| DRP | 0.1205 | 0.0843 | 0.0964 | 0.0964 | 0 | 0.1205 | 0.0843 | 0.1325 | 0.0964 | 0.0723 |
| ND | 0.1205 | 0.1084 | 0.0843 | 0.1084 | 0.1205 | 0 | 0.0964 | 0.0843 | 0.1205 | 0.0843 |
| RTO | 0.1205 | 0.0602 | 0.0843 | 0.1325 | 0.1205 | 0.0964 | 0 | 0.0482 | 0.0723 | 0.1446 |
| TI | 0.0723 | 0.0964 | 0.1084 | 0.0602 | 0.1084 | 0.0843 | 0.1205 | 0 | 0.1084 | 0.1205 |
| MD | 0.1084 | 0.1084 | 0.0843 | 0.0723 | 0.1205 | 0.0843 | 0.1325 | 0.0602 | 0 | 0.0843 |
| DM | 0.0843 | 0.0843 | 0.1205 | 0.0964 | 0.0964 | 0.0964 | 0.1205 | 0.1084 | 0.0964 | 0 |
Table 4 T matrix (realizing from Eqs. (3), (4) and (5))

| T matrix | BST (p1) | BS (p2) | PM (p3) | RPO (p4) | DRP (p5) | ND (p6) | RTO (p7) | TI (p8) | MD (p9) | DM (p10) |
|---|---|---|---|---|---|---|---|---|---|---|
| BST | 0.9784 | 1.1071 | 1.0720 | 1.1451 | 1.1452 | 1.0330 | 1.1450 | 1.0151 | 1.1399 | 1.0562 |
| BS | 1.0829 | 1.0064 | 1.0994 | 1.1721 | 1.1448 | 1.0406 | 1.1836 | 1.0501 | 1.1575 | 1.0953 |
| PM | 0.9979 | 0.9764 | 0.8936 | 1.0291 | 1.0713 | 0.9752 | 1.0389 | 0.9385 | 1.0661 | 0.9638 |
| RPO | 0.9619 | 1.0507 | 0.9794 | 0.9497 | 1.0452 | 0.9894 | 1.0469 | 0.9460 | 1.0325 | 1.0122 |
| DRP | 1.0202 | 1.0002 | 0.9958 | 1.0437 | 0.9776 | 0.9977 | 1.0536 | 0.9819 | 1.0512 | 0.9786 |
| ND | 1.0441 | 1.0440 | 1.0089 | 1.0799 | 1.1095 | 0.9128 | 1.0888 | 0.9642 | 1.0946 | 1.0117 |
| RTO | 0.9978 | 0.9584 | 0.9639 | 1.0527 | 1.0602 | 0.9585 | 0.9503 | 0.8921 | 1.0056 | 1.0163 |
| TI | 0.9566 | 0.9800 | 0.9812 | 0.9878 | 1.0487 | 0.9432 | 1.0573 | 0.8395 | 1.0331 | 0.9939 |
| MD | 0.9697 | 0.9744 | 0.9433 | 0.9826 | 1.0393 | 0.9262 | 1.0478 | 0.8809 | 0.9164 | 0.9465 |
| DM | 0.9875 | 0.9942 | 1.0123 | 1.0407 | 1.0621 | 0.9753 | 1.0796 | 0.9579 | 1.0464 | 0.9084 |
Fig. 2 Disaster recovery attributes and aspects
3 Prioritization of Attributes

A business continuity model is a framework that lists the attributes required for effective disaster recovery in any organization [18]. Business continuity describes the protocols and standard operating procedures that ensure the smooth operation of the organization whenever disaster strikes. Disaster recovery comprises disaster-related and recovery-related attributes and aspects. These can include natural disasters, man-made disasters, backup strategies, backup servers, etc. Natural disasters include floods, earthquakes, etc., while man-made disasters include soil erosion, forest fires, terrorist attacks, etc. Backup strategies include methods like RAID and cloud backup through which the organizational data can be backed up. Figure 2 shows the division of each group of disaster recovery attributes and aspects. Next, these attributes and aspects were pairwise compared on a 0–4 scale; the pairwise comparison was done first for the groups of attributes and aspects and then for the individual attributes within each group. This evaluation was done by the focus group, where the experts rated each attribute based on consensus.
4 Observations

Both industry experts and academicians from diverse disciplines with a technology background were invited to provide their valuable inputs to build an exemplary framework using DEMATEL, which proved beneficial for practitioners working on disaster recovery solutions. Based on their responses, the Z and D matrices were considered first.
Table 5 Attributes placed based on their ranks

| R + C | Rank | Attribute |
|---|---|---|
| 21.124 | P1 | BS |
| 20.834 | P2 | BST |
| 20.804 | P3 | DRP |
| 20.548 | P4 | RTO |
| 20.497 | P5 | RPO |
| 20.17 | P6 | MD |
| 20.111 | P7 | ND |
| 20.047 | P8 | DM |
| 19.901 | P9 | PM |
| 19.288 | P10 | TI |
Consequently, the threshold value was realized and the cause-effect graph was shaped. The important evaluation attributes were determined by the ri + ci values. Table 5 reveals that backup servers (BS) were the most important attribute for the disaster recovery and business continuity of organizations, with the highest ri + ci value of 21.124, followed by backup strategies (BST) with an ri + ci value of 20.834, while telecom infrastructure (TI) was the least important perspective with the smallest ri + ci value of 19.288. Based on the ri + ci values, the prioritization of the importance of the ten evaluation perspectives was: P2 > P1 > P5 > P7 > P4 > P9 > P6 > P10 > P3 > P8. A further observation is that if the value of ri - ci (r - c) was positive, the perspective was classified in the cause group and directly affected the other attributes; the factors with the highest ri - ci values had the greatest direct impact on the others. Conversely, P4, P5, P7 and P9 showed negative ri - ci values and are considered the most affected attributes in this study. It is apparent from Table 5 that the most significant disaster recovery attribute is backup servers, followed by backup strategies; telecom infrastructure is placed last on the priority list and least affects the other attributes. The causal relations can be seen in Fig. 3, which shows the relations among the different disaster recovery attributes, i.e., backup servers, backup strategies, telecom infrastructure, etc. Figure 3 illustrates the cause-effect relationships between the diverse attributes of disaster recovery: the attributes with positive ri - ci values are the causes, whereas the attributes with negative ri - ci values are the effects (Table 6). In this work, the authors demonstrate the usage and applicability of the DEMATEL technique in an application area such as disaster recovery solutions, after consultation with industry practitioners and researchers. This work presented a scenario that revealed the fundamental benefits of the DEMATEL technique in the disaster recovery and business continuity process. DEMATEL guided the identification of the most influential or critical attributes, which leads to the accomplishment of disaster recovery in any organization. Here,
Fig. 3 The cause and effect diagram for the disaster recovery attributes
Table 6 Preference of importance for factors

| Net group | Preference of importance |
|---|---|
| Causal factors: P1, P2, P3, P6, P8, P10 | P2 > P1 > P5 > P7 > P4 > P9 > P6 > P10 > P3 > P8 |
| Effect factors: P4, P5, P7, P9 | |
one fact should be considered: the results obtained in this work are subjective in nature and depend on the beliefs of the expert group. The priorities of the attributes may or may not change depending upon the opinions of different focus groups, which may vary under altered situations.
5 Conclusion

This disaster recovery model, in combination with DEMATEL, can be used to prioritize disaster recovery attributes, which will help organizations fight back after disasters. Because the attributes are ordered by importance, organizations will know which attributes and aspects they need to concentrate on more to tackle issues, and also where their areas of improvement lie. The results from DEMATEL imply that among the various disaster recovery attributes, backup servers, backup strategies and disaster recovery planning were the more important ones, and the cause-effect relationships showed that these attributes are interrelated [19]. These insights should help an organization achieve faster recovery in case of disasters and, more importantly, better business continuity after a disaster. Telecom infrastructure was found to be a less influential attribute than the others. According to the DEMATEL prioritization list, the most important attribute is backup servers, followed by backup strategies [20]. Using DEMATEL, it was found that these attributes influence the other attributes in some manner. Since backup servers and backup strategies were found to be important, organizations should pay special attention to them in order to plan better, faster and more efficient business continuity. From a managerial perspective, for an organization to be disaster-ready and to achieve disaster recovery with minimum downtime, backup servers must be looked at more closely. Also, according to our research, telecom infrastructure is last in the prioritization order, so it can be handled by the organization as convenient.
References

1. Palmer R (2012) Disaster recovery and business continuity with E-commerce businesses. In: IS 8300 disaster recovery/business continuity planning
2. Khoshkholghi MA, Abdullah A, Latip R, Subramaniam S, Othman M (2014) Disaster recovery in cloud computing: a survey. Comput Inf Sci 7(4):39–54
3. Kashi K (2015) DEMATEL method in practice: finding the causal relations among key competencies. In: The 9th international days of statistics and economics, Prague
4. Martin BC (2002) Disaster recovery plan: strategies and process. Boston, MA, SANS Institute 1(3)
5. Xu W, Luo J, Xi J, The research on electronic data backup and recovery system based on network. People's Police College, Nanchang
6. Sumrit D, Anuntavoranich P (2013) Using DEMATEL method to analyze the causal relations on technological innovation capability evaluation factors in Thai technology-based firms. Int Trans J Eng Manag Appl Sci Technol 4(2):81–103
7. Feng N, Li M (2011) An information systems security risk assessment model under uncertain environment. Appl Soft Comput 11:4332–4340
8. Wasson T, Choudhury T, Sharma S, Kumar P (2018) Integration of RFID and sensor in agriculture using IOT. In: Proceedings of the 2017 international conference on smart technology for smart nation. SmartTechCon 2017. https://doi.org/10.1109/SmartTechCon.2017.8358372
9. AjazMoharkan Z, Choudhury T, Gupta SC, Raj G (2017) Internet of Things and its applications in E-learning. In: 2017 3rd international conference on computational intelligence and communication technology (CICT), pp 1–5
10. Yang C-L, Yuan BJC, Huang C-Y (2015) Key determinant derivations for information technology disaster recovery site selection by the multi-criterion decision making method. Sustainability 7:6149–6188
11. Singhal A, Sarishma, Tomar R (2016) Intelligent accident management system using IoT and cloud computing. In: 2016 2nd international conference on next generation computing technologies (NGCT), pp 89–92
12. Tomar R, Prateek M, Sastry HG (2017) A novel approach to multicast in VANET using MQTT. Ada User J 38(4):231–235
13. Eslami H, Pourdarab S, Nadali A (2011) Credit risk assessment of bank customers using DEMATEL and fuzzy expert system. In: 2011 international conference on economics and finance research. Singapore
14. de Brito MM, Evers M (2016) Multi-criteria decision-making for flood risk management: a survey of the current state of the art. Nat Hazards Earth Syst Sci (Germany) 16:1019–1033
15. Yang C-K, Lee B-J, Lei T-C, Lanasari T (2014) Assessing the influential factors of fire rescue using DEMATEL method. Int J Innov Manag Technol 5(4):239–243
16. Yin S-H, Wang C-C, Teng L-Y, Hsing YM (2012) Application of DEMATEL, ISM, and ANP for key success factor (KSF) complexity analysis in R&D alliance. Sci Res Essays 7(19):1872–1890
17. Majumdar R, Kapur PK, Khatri SK (2019) Assessing software upgradation attributes & optimal release planning using DEMATEL & MAU. Int J Ind Syst Eng 31(1):70–94
18. Triantaphyllou E, Mann SH (1998) Multi-criteria decision making: an operations research approach. In: Encyclopedia of electrical and electronics engineering. Wiley, New York, pp 175–186
19. Sengupta S, Annervaz K (2014) Multi-site data distribution for disaster recovery—a planning framework. Future Gen Comput Syst 41:53–64
20. Mann SH, Triantaphyllou E (1995) Using the analytic hierarchy process for decision making in engineering applications: some challenges. Int J Ind Eng Appl Pract 2(1):35–44
Chapter 3
A Study on Machine Learning-Based Predictive Modelling for Pick Profiling at Distribution Centers Aswin Ramachandran Nair, Aarti Laddha, Andreas Munson, and Anil Sebastian
1 Introduction

The devices business is a flagship unit for the organization, and attaining service-level expectations without incremental inventory and cost overheads is of paramount importance to achieving expected growth and target revenue. However, the current planning process uses legacy tools and rudimentary approaches to arrive at a forward-looking view of the distribution center pick type projections; these have low accuracy and are time consuming. This created a need for Microsoft Supply Chain to develop a robust statistics-based automated predictive modelling solution to improve pick profile predictions and their accuracy. An overview of the different types of customers and how they are segmented, along with the pick profiles in our distribution centers, is provided below to help understand our proposed solution better.
A. R. Nair (B) · A. Laddha · A. Munson · A. Sebastian Microsoft Devices Supply Chain Analytics & Data Science, Hyderabad, India e-mail: [email protected] A. Laddha e-mail: [email protected] A. Munson e-mail: [email protected] A. Sebastian e-mail: [email protected]
1.1 Customer Segmentation

Customer segment indicates the size and strategic importance of the customer and defines the selling and marketing model/business processes used by Microsoft to engage with the customer. It is derived from the combination of (End Customer) subsegment, pricing level, and business.

i. Managed Retail: Describes revenue generated through the managed retail channel managed by the WW retail sales and marketing team. The primary end users are consumers. The primary products sold through this channel are Xbox, Surface, PC Hardware, Games, Office and Windows Software, etc. Examples of customers under this category include Best Buy, Walmart, Amazon, Target, etc.
ii. Commercial: Includes large and mid-sized customers like corporates, institutions, etc.
iii. MS Stores: Purchases made through Microsoft's owned network of retail stores (brick-and-mortar stores).
iv. Online Stores: Online stores are Microsoft's e-commerce channel where consumers directly buy Microsoft's products over the Internet without an intermediary service/company.
1.2 Distribution Center Handling Types (Pick Profile)

1.2.1 Pick Profile

A pick profile defines the numerous pick configurations in which we can handle a base item (SKU) at the distribution center. A pick profile can have any number of levels, and each pick type corresponds to a level in the hierarchy. At Microsoft, we have essentially three different pick profiles: (i) Each, (ii) Master Pack and (iii) Pallet.

Each: The lowest level of the pick profile, labeled for individual sale. For example, a single SKU (Stock Keeping Unit) of Surface Laptop, if handled/shipped out of the distribution center, would classify as an Each pick. This kind of handling is seen for online customers and as replenishment packs for small retailers.

Master Pack: A group of Eaches that are intended to be sold as a single unit to one customer. For example, 12 Surface Laptops packed together being handled/shipped from the distribution center to a commercial partner is considered a Master Pack pick. This kind of handling is typical for mid-sized retailers, small commercial orders and replenishment packs for big retailers.
Pallet: The basic cell of the supply chain. A pallet is a flat structure that contains multiple master packs, for example, Surface Laptops in standardized pallet quantities of 10 master packs per pallet. This kind of handling is for big retailers and commercial partners.
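As a small, hypothetical illustration of the three handling levels just described, the snippet below breaks an ordered quantity into pallet, master pack and each picks; the pack sizes of 12 units per master pack and 10 master packs per pallet are taken from the examples above, and the function itself is not from the paper.

```python
# Break an ordered quantity into pallet / master pack / each picks,
# using hypothetical pack sizes drawn from the examples above.
def split_picks(units, units_per_master=12, masters_per_pallet=10):
    units_per_pallet = units_per_master * masters_per_pallet
    pallets, rest = divmod(units, units_per_pallet)
    masters, eaches = divmod(rest, units_per_master)
    return {"pallet": pallets, "master_pack": masters, "each": eaches}

print(split_picks(1375))   # -> {'pallet': 11, 'master_pack': 4, 'each': 7}
```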
2 Solution

We have used advanced techniques like time series analyses and machine learning algorithms for generating high-accuracy forecasts. As an initial step, an array of data exploration exercises was completed with the help of Power BI models in combination with descriptive statistics in R. This helped in understanding behaviors at various levels (SKU, distribution center, region, segment, etc.). Following these analyses, we decided to group the individual distribution centers into distribution center/segment (commercial, consumer, Microsoft stores) combinational clusters; this grouping was finalized by analyzing the available historical data. It was noticed that it drastically improved the predictability of the pick profiles, as within a segment the seasonality indices and trends remain highly correlated year on year. The exercise also reduced the instances of history insufficiency for some low-volume distribution centers, which is necessary for time series model training and forecasting. Once these groups of clusters are identified and finalized, we supply the inputs into the modelling module of the analytics framework. This module contains several forecasting methods, where we use time series (TS) and regression (ML)-based models. The TS models use techniques like Holt-Winters exponential smoothing, Theta forecasting, ARIMA, SARIMA (seasonal), ARIMAX and random walk forecasting (RWF). The clusters are passed through this model and a 'Best Fit' forecasting technique is identified by accuracy comparison over a testing period of one historical year. A multi-technique time series approach has been shown to deliver significant improvements [1], and this was evident in this project implementation as well. We use Eq. (1) to calculate the forecast error, Eq. (2) for the overall error, and Eq. (3) for the overall accuracy.

$$
\text{Absolute Forecast Error} = \frac{\lvert \text{Forecasted Demand} - \text{Shipments} \rvert}{\text{Shipments}} \tag{1}
$$

$$
\text{WMAPE} = \frac{\sum_{i=1}^{N} (\text{Absolute Forecast Error} \times \text{Shipments})}{\sum_{i=1}^{N} \text{Shipments}} \tag{2}
$$

WMAPE is the volume-weighted mean absolute percentage error.

$$
\text{Accuracy\%} = 1 - \text{WMAPE} \tag{3}
$$
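As a minimal illustration, Eqs. (1)-(3) can be computed as follows; the function names and the NumPy-based implementation are assumptions made for this sketch, not the authors' production code.

```python
# Hedged sketch of Eqs. (1)-(3): per-record absolute forecast error,
# volume-weighted MAPE and the resulting accuracy. Names are illustrative.
import numpy as np

def wmape(forecast, shipments):
    forecast = np.asarray(forecast, dtype=float)
    shipments = np.asarray(shipments, dtype=float)
    abs_error = np.abs(forecast - shipments) / shipments          # Eq. (1)
    return (abs_error * shipments).sum() / shipments.sum()        # Eq. (2)

def accuracy(forecast, shipments):
    return 1.0 - wmape(forecast, shipments)                       # Eq. (3)

# Example with hypothetical monthly volumes:
print(accuracy([120, 80, 45], [100, 90, 50]))
```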
The remaining clusters with low testing-period accuracy are forwarded to the regression-based model, which is also enabled with like-modelling and feature-based machine learning. As of now, feature-based modelling uses intrinsic factors such as financial estimates, region/sub-region level behavior, segment-level behavior, customer behavior and ordering patterns, and not extrinsic factors such as competition insights, market behavior and trends. We do not have future information on these factors and features, so we use a conventional time series model to forecast these variables for the next one year. This can be used confidently, as the forecast accuracy was consistently observed to be very high (approximately 85–90%) for most of the feature-selected independent variables. The forecasted variable values are supplied as input to the regression model, which leverages the historical data for training and the forecasted data for predictions (the independent variables are assigned these derived weights). The regression model applies methods such as single-/multi-variable linear regression, polynomial regression, decision trees, support vector machines (SVM), and the random forest algorithm. There is also a like-modelling and clustering-based forecasting output from this tool, which analyses features and attributes for behavioral similarities, then analyzes attribute-level time series components for behavioral matches and generates predictions, i.e., a region/segment-level seasonality index is used against the cluster's own trend/level to arrive at a forecast (Fig. 1). As in the TS models, MAPE is computed and analyzed for the regression model as well and then validated over a year-long historical period (at month level). The best-fit forecast selected from this analysis is pushed to the visualization/BI tool. The analytics team then evaluates the predictions against the actuals and makes manual amendments if necessary, before collaborating and reviewing the insights with the fulfillment and logistics team. Figure 2 demonstrates the model process flow and Fig. 3 demonstrates the multistep analytics framework.
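The paper's models were built in R; purely as an illustration of the regression step just described, a comparable flow in Python with scikit-learn might look as follows. The feature names, the random data and the choice of a random forest regressor are assumptions for this sketch, not the authors' implementation.

```python
# Illustrative regression-based pick-profile forecast: train on historical
# months with engineered features, then predict future months whose feature
# values come from upstream time series forecasts. All names are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Hypothetical engineered features: financial estimate, segment index, month index
X_hist = rng.random((36, 3))
y_hist = rng.random(36) * 1000           # historical pick volumes
X_future = rng.random((12, 3))           # forecasted feature values for next year

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_hist, y_hist)
y_future = model.predict(X_future)       # predicted pick volumes per future month
print(y_future[:3])
```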
Fig. 1 Regression model output simulation (validation and testing period)
Fig. 2 Analytics framework
Fig. 3 Forecasting engine
2.1 Modelling Specifics and Suggested Framework

• Data exploration: Transactional datasets of the last 8 years were analyzed to uncover initial patterns, characteristics, and points of interest using a combination of charts. The data exploration analyses enabled the identification of underlying behaviors and trends from historical transactional data, viz. transition evaluation, attribute analysis, segment, geo-location, etc. This helped in building familiarity with the data and historical trends [2, 3].
• Time Series Model is the forecasting engine which uses methods such as ARIMA, SARIMA (seasonal), ARIMAX, triple exponential smoothing, Theta forecast, and random walk forecast (RWF) to forecast for every DC cluster combination [4–6].
• Regression Model is the forecasting engine which uses techniques such as multivariate linear regression, support vector machines, polynomial regression, and decision trees/random forest to forecast for every DC cluster combination. This module also performs behavior-based like-modeling and clustering to improve the model efficacy [7].
• Best Fit Analysis assists the validation and review of the models used. It looks at the outputs of the time series and regression models listed above for each DC cluster combination and picks the one with the minimum error using the volume-weighted normalized mean absolute percentage error (MAPE) method [8] (a small illustrative sketch of this selection step follows this list).
• Visualization is the final module, which is used to help users understand, visualize and analyze the historical actuals versus the model predictions. The Power BI-based visualization dashboard contains beacons, flags, and metrics to identify clusters with bias, class- and volume-based analysis [9] and accuracy issues. This module helps in utilizing the picking percentages generated by our model and applying them against the baseline forecast provided by collaborative planning with regional planners. The monthly model forecast, segmented into Pallets, Master Packs and Eaches, is then used by the Fulfillment and Logistics team to plan/formulate mitigation strategies and take corrective actions accordingly [10–12].
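The following is a hedged sketch of the best-fit selection described above: several candidate forecasts for a cluster are scored with WMAPE over a held-out year and the technique with the lowest error is retained. The candidate methods shown are deliberately simple stand-ins, not the full set of techniques used by the team.

```python
# Pick the 'best fit' technique per cluster by minimum WMAPE on a holdout period.
import numpy as np

def wmape(forecast, actual):
    forecast, actual = np.asarray(forecast, float), np.asarray(actual, float)
    return np.abs(forecast - actual).sum() / actual.sum()

def naive(train, horizon):                        # repeat the last observation
    return np.repeat(train[-1], horizon)

def seasonal_naive(train, horizon, season=12):    # repeat the last season
    return np.tile(train[-season:], horizon // season + 1)[:horizon]

def best_fit(train, holdout):
    candidates = {"naive": naive, "seasonal_naive": seasonal_naive}
    scores = {name: wmape(f(train, len(holdout)), holdout)
              for name, f in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical 3 years of monthly history: first 24 months train, last 12 holdout
history = np.arange(36) % 12 + 10.0
best, all_scores = best_fit(history[:24], history[24:])
print(best, all_scores)
```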
2.2 Evidence the Solution Works

These are the model results at pick profile level for the testing months (Sep'19–Dec'19, Fig. 4) and the go-live months (Jan'20–Mar'20, Fig. 5). *Accuracy% [9].
3 Current Scenario

The models (TS/ML) have been created in R, the input databases reside in Azure Data Warehouse, and all the modules have been deployed in Power BI to serve all the visualization requirements.
Fig. 4 Testing period results. *Accuracy% = 1 − WMAPE
Fig. 5 Go live results. *Accuracy% = 1 − WMAPE
Technologies used encompass Databricks, SQL, R Programming, VB.Net, and Power BI.
4 Next Steps Apart from improving the forecast quality and overall solution, we plan on creating a mechanism to safety check pick profile configurations and rectify any inaccurate configurations. The second area of focus would also be on improving the forecasting accuracy for each picking profile which has performed with low accuracy. We plan to deploy the solution to the cloud (Azure Data Bricks) for timely execution and ease of access in coming days.
References

1. Wagner N, Michalewicz Z, Schellenberg S, Chiriac C, Mohais A (2011) Intelligent techniques for forecasting multiple time series in real-world systems. Int J Intell Comput Cybernet 4(3):284–310. https://doi.org/10.1108/17563781111159996
2. Jayaraman S, Choudhury T, Kumar P (2017) Analysis of classification models based on cuisine prediction using machine learning. In: 2017 international conference on smart technologies for smart nation (SmartTechCon), pp 1485–1490
3. Chhabra AS, Choudhury T, Srivastava AV, Aggarwal A (2018) Prediction for big data and IoT in 2017. In: 2017 international conference on Infocom technologies and unmanned systems: trends and future directions, ICTUS 2017, Jan 2018. https://doi.org/10.1109/ICTUS.2017.8286001
4. Madan S, Kumar P, Rawat S, Choudhury T (2018) Analysis of weather prediction using machine learning big data. In: Proceedings on 2018 international conference on advances in computing and communication engineering, ICACCE 2018. https://doi.org/10.1109/ICACCE.2018.8441679
5. Verma S, Choudhury T, Kumar P, Gupta SC (2019) Emendation of neural system for stock index price prediction. In: Proceedings of the 2018 international conference on communication, computing and internet of things, IC3IoT 2018. https://doi.org/10.1109/IC3IoT.2018.8668122
6. Zhang G, Patuwo BE, Hu MY (1998) Forecasting with artificial neural networks: the state of the art. Int J For 14:35–62
7. Gupta A, Chaudhary DK, Choudhury T (2018) Stock prediction using functional link artificial neural network (FLANN). In: Proceedings—2017 international conference on computational intelligence and networks, CINE 2017. https://doi.org/10.1109/CINE.2017.25
8. Ravi T, Tiwari R (2019) Information delivery system for early forest fire detection using internet of things. In: International conference on advances in computing and data sciences. Springer, Singapore, pp 477–486
9. Petersen CG, Aase GR, Heiser DR (2004) Improving order-picking performance through the implementation of class-based storage. Int J Phys Distrib Log Manag 34(7):534–544. https://doi.org/10.1108/09600030410552230
10. Yadav AK, Tomar R, Kumar D, Gupta H (2012) Security and privacy concerns in cloud computing. Int J Adv Res Comput Sci Softw Eng 2(5):121
11. Tomar R, Prateek M, Sastry HG (2017) A novel approach to multicast in VANET using MQTT. Ada User J 38(4):231–235
12. Ghosh S, Rana HS, Tomar R (2016) Word prediction using collaborative filtering algorithm. Int J Control Theory Appl 9(22):115–122
Chapter 4
Analysis of Computational Intelligence Techniques in Smart Cities Ayesha Shakeel, Ved Prakash Mishra, Vinod Kumar Shukla, and Kamta Nath Mishra
1 Introduction

1.1 What is Computational Intelligence?

An informal definition of intelligence given by Legg and Hutter states that the intelligence of an agent is a measure of its ability to accomplish goals in a broad variety of environments [1]. So, computational intelligence can be defined as the ability of computers to learn to perform a specific task from experimental observation or data, thus allowing computers to accomplish different goals in a range of different environments. It is also referred to as soft computing because it mostly deals with uncertain problems having partial truths to reach an approximate, stable, and low-cost optimal solution. Since computational intelligence is tolerant to uncertainties, it has an edge over traditional hard computing, as many real-life problems cannot be converted into absolute 0s and 1s. CI uses incomplete and inexact knowledge and attempts to address complex real-world problems that cannot be solved by traditional computing.

A. Shakeel (B) · V. P. Mishra · V. K. Shukla Department of CSE, Amity University Dubai, Dubai, UAE e-mail: [email protected] V. P. Mishra e-mail: [email protected] V. K. Shukla e-mail: [email protected] K. N. Mishra Birla Institute of Technology Mesra, Ranchi, India e-mail: [email protected]
1.2 Computational Intelligence Techniques

The methods or techniques used by computational intelligence are biologically or linguistically inspired computational paradigms. Traditionally, the three primary pillars of CI have been neural networks, fuzzy systems, and evolutionary computation.

i. Neural Networks—Our brain is able to recognize patterns by clustering raw input, with the help of a specialized cell called the neuron. Artificial neural networks, inspired by biological neural networks, are a set of algorithms that are designed to detect patterns by clustering and classifying unlabeled data according to their similarities. They are able to do this when they already have a labeled set of observations to learn from.
ii. Fuzzy systems—Inspired by human language, which consists of words such as 'maybe,' 'probably,' 'mostly,' etc., fuzzy systems model imprecision and solve uncertain problems by performing approximate reasoning. In fuzzy systems, the input variables may be any real number between 0 and 1, both inclusive. Thus, the system is able to handle partial truths, where the truth value ranges between completely false and completely true.
iii. Evolutionary Computation—The process of biological evolution, which comprises survival of the fittest and modification of characteristics in species to better suit their environment, gave rise to the field of evolutionary computation. Evolutionary techniques are used for optimization problems, where we need to find the best possible solution. A random set of solutions is generated based on some basic requirements, and then a new generation of solutions is produced by eliminating less suitable solutions and introducing small new changes in the solutions. This process is repeated until the most suitable solution is found (a toy sketch of this loop is given at the end of this subsection).

Apart from these three main pillars, the evolving field of computational intelligence encompasses several other nature-inspired computing paradigms such as artificial immune systems, swarm intelligence, artificial endocrine networks, social reasoning, etc.

i. Artificial Immune System—AIS is a system inspired by the biological immune system where algorithms are put together to achieve systemic goals. This involves computer and mathematical modeling of immune systems and the abstraction of immunology-related principles into algorithms to provide system security and to enable computers to be protected from and to respond to threats better.
ii. Swarm Intelligence—Throughout nature, we can see many examples of simple creatures who can perform only a limited range of functions individually, but when present as a group or swarm, they show an astounding amount of complexity. This concept is what inspired swarm intelligence, in which several small units that follow only a simple set of rules individually are able to solve complex problems when they work collectively.
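As a minimal illustration of the evolutionary loop described above (generate candidate solutions, keep the fitter ones, mutate, repeat), the toy example below evolves a bit string toward a simple fitness target; it is a generic sketch and is not tied to any smart city application discussed in this chapter.

```python
# Toy evolutionary/genetic algorithm: evolve a bit string toward all ones.
import random

def fitness(candidate):              # number of ones; higher is better
    return sum(candidate)

def evolve(length=20, pop_size=30, generations=50, mutation_rate=0.05):
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # keep the fitter half of the population (survival of the fittest)
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        # refill the population with mutated copies of the survivors
        children = []
        for parent in survivors:
            child = [1 - gene if random.random() < mutation_rate else gene
                     for gene in parent]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)

print(fitness(evolve()))             # close to 20 after a few generations
```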
1.3 What Are Smart Cities?

A smart city is a title given to a city that utilizes information and communication technologies (ICT) for improving the quality and performance of urban services such as transportation, energy, water supply, information systems, crime detection, schools, hospitals, etc. while reducing costs, consumption of resources and wastage. So, basically, the overall mission of a smart city is to optimize city functions and improve the quality of life for its citizens with the help of smart technology and data analysis, while increasing economic growth. The main characteristics of smart cities are:

i. Smart energy—Both residential and commercial buildings, street lights, etc. use less energy. The Internet of Things (IoT) helps in better management and distribution of energy.
ii. Smart data—Huge volumes of data are collected in a smart city, and it has to be analyzed rapidly to provide valuable information to service providers and residents.
iii. Smart Transport—A smart city must have reduced vehicle traffic, fewer road accidents, low pollution levels, and various different means for transportation of people and goods.
iv. Smart infrastructure—This refers to integration of various technologies such as IoT, Big Data, etc. in infrastructure which would enable proactive maintenance and more efficient use of resources.
v. Connected Devices—IoT devices are an integral component of smart cities. The sensors in these devices collect useful data that can be analyzed to acquire relevant information.
vi. Mobility of data—In a smart city, data must move without interruption through various municipal and administrative systems to be able to provide fast and efficient services.
For all these characteristics to exist, we need something that can analyze huge, repetitive data, make useful inferences from it and provide optimization solutions based on those inferences. Computational intelligence is the answer to all these requirements, and thus, no city can be a ‘smart’ city without CI (Fig. 1).
1.4 Technology Used in Smart Cities

As mentioned above, a smart city uses various different forms of technologies to connect all the components of the city and provide better services. The key technologies used in a smart city are:

i. Smart Grid—A smart grid refers to an electrical grid which utilizes information technology by using smart meters and smart appliances and tries to make energy consumption as eco-friendly as possible by using renewable energy and energy-efficient resources.
Fig. 1 Smart diamond to define smart cities [2] (dimensions: Smart Governance, Smart Energy, Smart Citizen, Smart Healthcare, Smart Building, Smart Mobility, Smart Technology, Smart Infrastructure)
Since a smart city aims to have better management of energy, smart grids are essential components for the development of a smart city. Smart grids and their smart meters actively monitor the energy usage of homes or buildings and provide optimization techniques and solutions to reduce energy usage. Thus, several solutions have been developed to enhance energy services, such as advanced metering infrastructure or prepaid energy applications.
ii. Intelligent traffic systems—Cameras are present on smart traffic lights to monitor the flow of traffic so that the traffic lights can operate accordingly, and thus improve the flow of traffic. Also, many locations have 'smart parking', a system that assists drivers in finding an empty parking spot using sensors in every parking space to detect whether a vehicle is present or not; it accordingly directs drivers towards vacant parking spots. Moreover, automatic vehicle location (AVL) and Global Positioning Systems (GPS) are being used to estimate approximate arrival times, and thus people are able to get real-time information such as at what time a bus will reach a bus stop. Additionally, in smart cities, various sensors are used to collect data on the movement of people, bikes, and all forms of vehicles. This data is then studied and used to develop solutions for greatly reducing traffic, enabling easy movement of goods and people through several different means.
iii. Smart IoT devices—IoT devices are one of the most important components in a smart city as they bind everything together. Sensors and actuators form inevitable elements of the IoT networks. For example, smart parking uses sensors to detect whether a vehicle is in the parking spot or not. Actuators, however, operate oppositely to sensors: they take electrical signals as input and transform them into physical action. Here is an example (see Fig. 2). Sensors and actuators are necessary in smart cities. A smart city works due to the variety of reporting devices like sensors or other end points that create data. Sensors deployed in various locations in a smart city gather information, and with this information exchanged freely, advanced and complicated city systems can be managed in real time and, with sufficient integration, can minimize undesired consequences.
Fig. 2 Sensor to actuator flow [3]: a temperature sensor detects heat and sends the signal to the control center; the control center sends a command to the sprinkler actuator, which turns on and puts out the flame.
iv. Geospatial Technology—To build the right plan for a city, accurate, concise, and detailed geographic data is required, which is provided by geospatial technology. Geospatial technology is a relatively new and growing field of study that consists of Geographic Information System (GIS), Global Positioning System (GPS), and Remote Sensing (RS). It gives an essential framework for the collection of data and the transformation of observations in these collections to facilitate software-based solutions regarding smart infrastructure.
v. Artificial and Computational Intelligence—The huge amount of data generated in a smart city is of no use until and unless it is processed. This brings in the part of artificial intelligence that can make inferences from the data. AI enables machines to interact with each other by processing the data and making sense out of it. Let us consider an example in order to understand the application of AI in regard to smart cities. In a system where there are regular energy spikes, artificial intelligence can discover where they normally occur and under what circumstances, and this information can then be utilized for improving the management of the power grid. Similarly, AI is also used for intelligent management of traffic as well as healthcare facilities. Computational intelligence serves the same purpose but is used in places where artificial intelligence may not be enough, i.e., situations that require soft computing, for example, applications based on evaluating human language or emotions.
vi. Blockchain technology—Blockchain is a relatively recent smart city concept. Blockchain technology secures the flow of data. Integrating it into smart cities could allow all city services to be connected in a better manner, along with increasing transparency and security. In some ways, Blockchain is expected to influence cities through smart contracts, which help with processing transactions, billing, and handling facilities management. Smart grids can also make use of Blockchain in order to allow easy energy sharing.
1.5 What Are the Advantages and Disadvantages of These Technologies?

Advantages

i. These technologies make the city more efficient and eco-friendly. Smart grids ensure that energy is being used only when it is needed. Moreover, buildings which actively monitor their energy usage and report this data to utilities can cut down their costs. Ultimately, this will reduce pollution and increase the efficiency of cities as they become increasingly urbanized.
ii. Intelligent traffic systems reduce congestion, and by making parking smarter, people spend less time looking for parking spots and circling city blocks. Also, with public transport becoming more systematic and easy to use, more people start using it, thereby reducing congestion, accidents, and pollution.
iii. Smart infrastructure allows for proactive maintenance and better planning for future demand. The data collected from the smart devices is used to make meaningful changes in future city plans. This allows the city to stay ahead of time and come up with solutions for problems before they even arise.
iv. Intelligent systems optimize functions and give the most economic solutions. Thus, smart cities are expected to generate 60% of the world's GDP by 2025, according to McKinsey research [4].

Disadvantages

i. Unconstrained movement of data between systems raises security, PR, and privacy concerns. People do not even come to know what information about them is being stored and shared and who is able to access this information. Smart home systems also raise the concern of data privacy.
ii. Another major problem with a smart city is the huge amount of data. Too much data can be overwhelming. Data received at a time when one is unable to take advantage of it is essentially noise.
iii. Adoption of smart systems requires knowledge of different technologies in order to use them. Learning how to use these technologies takes a bit of instruction and practice. Without such instruction or training, people may find the technology difficult to use, especially senior citizens.
2 Literature Review 2.1 Smart Cities Initially, in literature, a city was said to be a smart one based on its transportation system [5]. The quality and efficiency of the transportation system was thought
to be the best example of the harmony between development of city and modern technologies. The term smart city is also used in literature [6, 7] where education of its inhabitants is highlighted. A smart city has smart inhabitants in terms of their level of education. Intelligent subjects will affect the way in which the information is received, understood, and then used or applied.
2.2 How Secure Are Smart City Technologies?

In smart cities today, while manufacturers are producing new 'smart products' at a rate like never before, racing to be first-to-market, municipalities must be more careful. There could be a much greater impact and very serious consequences if a security failure of a smart city initiative happens to occur. Take, for instance, the smart city initiative of smart metering. Now, imagine a smart city being hacked, resulting in electrical outages or endangering personal information or payment data of customers. This is a possible large-scale event that could have an effect on not only individual users, but entire communities. The potential impacts and risks are real. Research by the Ponemon Institute and IBM estimates that the average cost of a data breach is $3.9 million [8]. Attributes for secure devices are [9]:

i. The hardware-based root of trust
ii. Small, trusted computing base
iii. Defense in depth
iv. Compartmentalization
v. Certificate-based authentication
vi. Renewable security
vii. Failure reporting.
However, existing smart city defenses are not up to this standard of security and are insufficient. In April 2019, servers of the city of Stuart, Florida were infected by ransomware which brought down payroll, email, and other crucial functions, and eventually cost the city $600,000 in ransom fees. According to a study by McMaster University, 88% of people are worried about their privacy in the context of smart cities [10]. As smart city implementation is taking place at an accelerating rate, cities are falling behind at preventing, identifying, and responding to cyber attacks and privacy risks because of the following reasons:

i. City developers and planners have not prioritized security.
ii. Decentralized smart city initiatives compromise a centralized and consistent approach to security. Without a centralized approach to security, any initiative which leaves a gap in control or security policy in a smart city implementation elevates the risk of a security breach.
iii. Physical threats to connected systems are not addressed by security teams. Because of the long distances in power transmission systems, pipelines, and other utilities, remote locations are left exposed.
iv. Cities are being overwhelmed by the large volumes of data being collected. Security teams are just beginning to develop data collection, classification, and flow mapping in the IT environment; in OT, they are further behind.
v. Security teams are not ready to fight data integrity attacks. They do not have the ability to prove that the data and algorithms on which the city functions depend for decision making have not been tampered with.

Ultimately, how municipalities manage security while employing smart city technology will determine the success of their endeavors. Authenticated, authorized, and encrypted communication ensures success.
2.3 Data in Smart City [11]

Smart cities continuously generate a large variety of unstructured data (depending on the nature of the source of data) from a range of applications like traffic management, healthcare, energy management, environment monitoring, etc., which results in a huge volume of data. This variety of data can be categorized into the following types:

i. Time series data—Sequences of values or events obtained by taking repeated measurements over time.
ii. Streaming data—Data that is continuously arriving, for example internet traffic, sensor data, etc.
iii. Sequence data—Ordered elements or events that are recorded with/without a particular notion of time. For example, human DNA, data from smart-retail systems, etc.
iv. Graph data—Data that can be represented in the form of a graph data structure. Common examples are information from the World Wide Web, social networks, etc.
v. Spatial data—Data acquired from sources like geographical information systems, remote sensing, or medical imaging data.
vi. Multimedia data—Images, video, and audio.
Each of the different data types mentioned above has its own unique characteristics and is analyzed with the help of different data mining techniques (Fig. 3). As we have seen above, a huge volume of data, both structured and unstructured, is collected on a daily basis in a smart city. What's more, the volume of data grows exponentially with time. Such data is known as 'big data.' Thus, the main challenge for smart cities is collecting, storing, and analyzing this Big Data.
Fig. 3 Different types of data generated in smart cities [11]: time series data (stock market, smart cars, smart homes), streaming data (sensor systems, surveillance, internet traffic), sequence data (retail systems, human DNA, protein structure), graph data (healthcare systems, social networks, WWW), spatial data (satellite data, medical imaging, GIS data) and multimedia data (smart homes, surveillance systems).
2.4 Collection of Data Data collection in a smart city is done by IoT devices. A huge variety of data is collected by the matrix of sensors placed in a modern smart city, usually embedded in various devices [11]. For example, sensors in smart homes, connected to smart meters collect digital information regarding energy usage in order to monitor energy consumption patterns in real life. Some of the devices used in data collection include cameras, biometric terminals, card readers, bar code readers, mobile phones, computers, etc. For example, public transportation networks in smart cities collect data using devices such as card readers, GPS trackers, etc., and this data expresses the accessibility to these networks by different communities. Likewise, street cameras can collect information and produce data on the vehicular traffic on roads, traffic accidents or even human activity on pavement. Specific sensors collect information on public safety and process this through digital data processing platforms. These sensors create repositories of information and assist law enforcers in controlling civic violations and arresting criminals within city limits. Additionally, social media platforms are also a form of digital technology that enables governments to gather public information with the intention of promoting smart governance. Smart city applications are based on a combination of the raw data (such as that collected from sensors, cameras, etc.) and some level of enhanced data (data regarding energy usage, traffic conditions, etc. from analytics platforms, edge management, etc.) [12] (see Fig. 4). In many cases, raw data is enough to perform the application.
Fig. 4 Consumption of data in smart cities [12]: raw data from sensors, cameras, environmental and city records, surveillance and crowdsourcing, together with enhanced data from analytics platforms, edge management and augmented reality, is used by data consumers for location, waste fill level, traffic and road conditions, health monitoring, energy usage and parking management.
However, in some cases, a combination of raw data and enhanced data is required to support applications that need analytics or special formatting.
2.5 Importance of Analyzing the Data Collected [13] Data analytics has a very important role to play in all aspects of smart city operations and public services. For example, the goal of becoming zero-carbon cities requires advanced data analytics for enabling the cities, utilities, and other partners to optimize energy and resource flow in order to meet their arduous zero-carbon targets. Analytics is very important for managing the distributed renewable energy-based community energy systems, micro grids, and storage technologies in an efficient manner. Some key applications of data analytics include: i. Better mobility ii. Better asset management iii. City benchmarking.
2.6 Challenges in Analyzing Big Data [14] We now know that analyzing data generated in smart cities is very useful and important. However, we face a number of challenges when attempting to analyze the data such as processing the data securely, robust handling and storing the data in emergency situations, etc. These problems cannot be tackled by traditional tools due to different reasons such as centralized nature of Big Data stores or the increased ability of attackers to infiltrate and neutralize conventional security systems.
In order to make use of the advantages of Big Data analytics in a progressively knowledge-driven society, we need solutions that decrease the complexity and cognitive strain on accessing and processing these vast volumes of data. Since real-time applications are becoming increasingly complex, many challenges arise from the implementation of Big Data in the real world. One of the factors, due to which real-time applications have become complex, is that datasets possess a high dimensionality degree which increases the difficulty in processing and analyzing the data. Secondly, data is usually collected from many different input channels and diverse sources and all this data has to be integrated and the different data types have to be concurrently analyzed. Thus, the processing becomes very difficult due to the variety of input. Additionally, the data collected often consists of multiple types of input which sometimes may not be complete or precise due to various possible reasons for uncertainty, imprecision, or missing data (e.g. inaccurate sensors or malfunctioning). So, the computational techniques used for Big Data analytics should be able to deal with all the aforementioned challenges, and most importantly, should be able to extract knowledge from the data in a way that can be interpreted easily. The techniques used should make the patterns existing in the data clear to the person trying to understand and use them. Also, there is a need for these techniques to be able to include user-specific and contextual components in their design as well as their decision-making process, in a computationally feasible and user-friendly manner. We can create successful applications if all of the aforementioned features are present in the computational techniques used for processing and analyzing Big Data.
2.7 Use of CI Techniques in Smart Cities

Various techniques of computational intelligence have been identified as potential tools for the analysis of big data, and many of them are already being used for this purpose. These techniques are able to overcome most or all of the challenges mentioned in the previous section, and thus have been chosen as ideal for Big Data analytics. Given below are explanations of how different CI techniques can contribute to analyzing Big Data and to other applications. Deep learning algorithms are becoming increasingly popular as they have proved to be effective CI techniques for performing tasks like object recognition and speech perception. Large-scale data with high dimensionality can be modeled using these approaches. Deep learning (DL) algorithms are built on the concept of Artificial Neural Networks (ANNs) having numerous hidden layers, where the algorithms can be trained in an unsupervised manner (finding hidden patterns within the data or discovering features by themselves) to produce higher-level representations of sensory data, which can then be used to train a classifier based on standard supervised training algorithms [15]. Deep neural networks have the potential to discover correlations in data and are stated to be very efficient at pattern recognition [16].
Kiremidjian et al. [17, 18], Gul et al. [19] and Yao et al. [20] have shown that pattern recognition-based damage detection algorithms can detect structural damage in civil structures and are thus ideal for structural health monitoring (SHM). Thus, ANNs are widely used for such structural engineering applications [21]. Smarsly et al. [22] even proposed a self-managing structural health monitoring system that uses multiagent technology coupled with advanced computing methods and demonstrated the practicability, reliability, and robustness of the system. Deep learning is effectively applied in tracking of objects, image recognition, examination and interpretation of huge and multilayered information, and data [23, 24]. A prominent application of neural networks in smart cities is for predicting flow of traffic. Ayalew Belay Habtie and others presented a model based on neural network model to predict the flow of traffic in cities. Their experiments used simulation as well as real-world data and displayed the potential of the model for providing correct results [25]. There exist many more smart city applications as well that have used neural networks successfully. J. Lwowski and others presented a system for regional detection which uses deep convolution networks to detect pedestrians in real-time. Their proposed system was accurate up to 95.7%, while being fast and simple due to which it is fitting for real-time use [26]. Gupta and others proposed a model that utilizes a Hopfield neural network (HNN) to optimize traffic light sequences at intersections and uses a genetic algorithm to calculate the optimal green time for traffic lights. The traffic flow in a smart city could be greatly improved by using this method [27]. Dementia monitoring and predictive care recommendation solutions can be developed by advanced approaches of deep learning, which could predict behavioral changes of each patient, by using data on other dementia patients. Such care recommendation decisions would help caretakers in carrying out therapy plans and adaptive care, according to the requirements of the patients which keep changing as a result of their decreasing cognitive ability [14]. Another set of challenges faced in the analysis of Big Data is regarding the fact that data is collected from a variety of different sources many of which may even be noise contaminated. Such data tends to contain a lot of noise and uncertainty. It has been shown that systems based on fuzzy logic can effectively deal with uncertainties in data. For example, systems that can predict the emotion of the user requires the use of databases which have a lot of uncertainty which is due to the fact that our emotions are fuzzy in nature. It has been proved that fuzzy systems can handle this problem, and thus, powerful fuzzy systems have been created which perform comparable to or better than other techniques that are more complicated, while maintaining a decent tradeoff between time performance and accuracy in classification [28]. This is one of the most important factors while handling large volumes of data as it enables the classification to be performed within a feasible timespan. Because fuzzy logic depends upon the fuzzy rules of natural language, it is able to successfully visualize the hidden relations that exist in data and thus allows the users of the applications to easily visualize these concealed relations [29]. Moreover, adaptive fuzzy logic
systems have been seen to have a very good potential at modeling and accounting for differences in each agent as well as circumstantial information while maintaining a very feasible computational burden due to which they have been considered as a very good choice for creating user-centered and personalized systems [28, 30, 31]. Fuzzy logic has also been found to be a great solution for the problem of high-power consumption by heating ventilation and air conditioning (HVAC) systems. Applying fuzzy logic systems to HVAC enables the modification of the functioning of the AC and reduction in its intake of electrical energy while using all available resources efficiently [32]. Omar Behadada and others proposed a procedure for characterizing fuzzy partition rules semi-automatically, which made use of huge data sets available to the public, experimental data and data from specialists, to build a system that detects heart rate arrhythmia. Their proposed system shows a remarkable tradeoff between interpretability and accuracy [33]. Furthermore, many applications that use Big Data derived from social networks for analyzing public opinion have made use of fuzzy logic. For example, Bing developed an FMM system which is a matrix-based fuzzy system for mining data collected from Twitter. The system demonstrated an impressive performance at prediction and also has low processing times [34]. Another noteworthy application used fuzzy logic-based matching algorithms along with MapReduce to perform Big Data analytics for the purpose of clinical decision support [35]. The developed system showed high flexibility and was able to deal with data from various sources. Fuzzy logic was also used as a computational base for a successfully tested model that produced accurate flood alerts, proposed by Melo et al. [36]. Evolutionary algorithms (EA) or genetic algorithms (GA) are yet another CI paradigm that can handle the challenges of Big Data analytics. Because Big Data is subject to a high degree of sparseness and dimensionality, EA become very good candidates for Big Data analysis as EA are excellent at exploring the search space [37]. In the paper by Bhattacharya et al., an EA had been proposed which has the ability to deal with high dimensionality and sparseness. Though the researchers considered this application to not be ready for handling Big Data, their approach was later found to perform very well in comparison to other techniques. Many problems related to machine learning, that are used for analyzing Big Data like clustering, feature selection, etc. also use evolutionary algorithms. Recent research has shown the use of EA in fusion with signal inputs like EEG signals, for new and challenging application fields like multi-brain computing [38]. Genetic algorithms have also provided effective smart city solutions in the past. They are also the backbone of a recent research by Radek Fujdiak and others, which used a GA for the optimization of waste collection by the municipality, a task that is challenging yet vital task for any efficient smart city. The algorithm calculated the most optimal routes for the garbage trucks, thus improving the efficiency of the waste collection process [39]. Genetic algorithms are usually also combined with other CI techniques to produce the best solutions for smart city problems. An example of this would be the work by
Vlahogianni et al., which uses a genetically optimized multilayer perceptron to address the problem of parking unavailability. Their approach used data collected from the smart city of Santander in Spain and was able to predict regional parking occupancy rates up to half an hour ahead of the current time [40]. As seen above, CI techniques and their combinations can be used to extract meaning and insights from data and to build solutions that find applications in many different fields. These solutions must be applied to hardware and software, offline and online, data control, and processing needs, and can then be further optimized to domain-specific dynamics and constraints. Thus, CI approaches may be used to provide efficient, multi-purpose intelligent data analysis as well as decision support systems for commercial or industrial applications that involve huge amounts of complex or vague information which must be analyzed in order to make operational and cost-effective decisions [41].
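As a rough illustration of how a genetic algorithm explores a search space such as the waste-collection routing problem mentioned above, the self-contained sketch below evolves an ordering of collection points that shortens the total route. The coordinates, population size, and operators are arbitrary choices for illustration; this is not the algorithm used in the cited study.

# Minimal sketch of a genetic algorithm that orders collection points so the
# total route length is short (a toy version of waste-collection routing).
import math
import random

random.seed(1)
POINTS = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(12)]

def route_length(order):
    # Total distance of visiting the points in the given order and returning home.
    dist = 0.0
    for i in range(len(order)):
        x1, y1 = POINTS[order[i]]
        x2, y2 = POINTS[order[(i + 1) % len(order)]]
        dist += math.hypot(x2 - x1, y2 - y1)
    return dist

def crossover(a, b):
    # Order crossover: copy a slice from parent a, fill the rest in b's order.
    i, j = sorted(random.sample(range(len(a)), 2))
    child = [None] * len(a)
    child[i:j] = a[i:j]
    rest = [g for g in b if g not in child]
    for k in range(len(a)):
        if child[k] is None:
            child[k] = rest.pop(0)
    return child

def mutate(order, rate=0.1):
    # Occasionally swap two points in the route.
    order = order[:]
    if random.random() < rate:
        i, j = random.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order

population = [random.sample(range(len(POINTS)), len(POINTS)) for _ in range(60)]
for generation in range(200):
    population.sort(key=route_length)            # shorter routes are fitter
    parents = population[:20]                    # keep the best fifth
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(len(population) - len(parents))]
    population = parents + children

best = min(population, key=route_length)
print("best route length:", round(route_length(best), 2))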
2.8 Applications of CI and Big Data in Smart Cities
Transport and Traffic Management. As seen in the previous section, transport and traffic management are a key application area for computational intelligence techniques. The UK has several examples of smart city projects, especially ones based on traffic management, such as Peterborough's virtual model, the smart transportation system of Milton Keynes, and Cambridge's intelligent technology for managing traffic, the energy network, air quality, and health and social care, among others [18]. As shown by the research examples mentioned earlier, intelligent transportation is one of the main application areas that can benefit enormously from Big Data analytics and computational intelligence in smart cities. Many initiatives have also been introduced to satisfy personalized, user-defined needs related to improving user mobility, satisfaction, and utility while helping to avoid congestion. These initiatives use technologies such as vehicular ad hoc networks (VANETs) and vehicular content centric networks (VCCNs), which are intended to recover and deliver data quickly and efficiently in a vehicular environment [14].
Healthcare. Promoting the quality of life and standard of living of residents is one of the main aims of a smart city. A very important factor for achieving this aim is better healthcare, which requires better insight into the healthcare needs of individuals and groups; such insight is essential for providing tailored treatment and therapy intervention recommendations. Computational health informatics may be applied to large-population data by developing interpretable decision support models that reveal the healthcare needs of communities and support efficient health policy or intervention planning, for example when managing crises related to famine and disease epidemics. An example of such models is the dementia monitoring system mentioned in the previous section.
Security. While smart cities are becoming increasingly intelligent, they must not neglect security and should provide their residents with a secure environment. Computational intelligence methods may be used to classify behavioral traits and correctly identify abnormal or suspicious behavior in applications such as crowd surveillance or vehicle theft detection, by monitoring and analyzing many data streams in real time. Physiological information may additionally be used to improve the wellbeing and safety of people who work long, repetitive, or late-night shifts. For example, driver drowsiness can be detected with feature classification algorithms that track faces and their orientation along with blink rate, gaze, and eyelid closure, all of which are indicators of drowsiness. A study has shown a close relation between drowsiness, high blink rate, and lower heart rate, and these can be detected by unobtrusive modern sensors [42].
Economy. Big Data and CI are also increasingly influencing how businesses operate and the strategies they use, and thus the economic development of smart cities. Computational techniques can be applied to the stock market and other financial factors, giving organizations insight that helps them improve their strategies and regulate their decisions to keep up with the current market situation and the overall economic environment. Combining fuzzy techniques and deep learning enables the processing of diverse data sources and the production of successful business models that can effectively compensate for the inevitable uncertainty linked to market drivers such as government policy and consumer demand. Additionally, business models may be designed to foresee and forewarn about potentially dangerous situations or conditions. In 2015, Sridhar and Nagaraj presented a bankruptcy prediction system that categorizes organizations by their bankruptcy risk using machine learning algorithms and Big Data representing the surrounding financial environment; such a system can be used as a decision support system [43]. Another key factor for developing a prosperous and functional smart city is having strong and effective manufacturing industries. Fault detection in complex manufacturing domains may be performed using a combination of CI techniques, as mentioned in the previous section, to model and classify different kinds of faults, whether in hardware or software, while accounting for uncertainties caused by the environment. The models can then be used to further optimize the production process in order to decrease cost, machine down-time, scrap production, and human effort. Maniak et al. proposed a deep learning neural network approach that used labeled historical data and unlabeled sound data collected by experts to automatically detect and identify defective audio signaling devices [44–46]. The system eliminates the need to inspect the devices manually and has achieved high recognition accuracy. The success of this system shows the
potential computational intelligence techniques have for the purpose of improving the performance of modern manufacturing companies.
3 Comparative Analysis of CI Techniques in Smart Cities
Fuzzy logic, having the fewest disadvantages, seems to be the most favorable computational technique. However, fuzzy logic may not be sufficient on its own for various applications, and many applications therefore combine different techniques to obtain the most suitable algorithm (Table 1).

Table 1 Advantages and disadvantages of the different CI techniques

Deep learning
Advantages: Can be used for multiple abilities such as vision, speech, language, decision-making, etc.; can discover correlations in data; effective for object recognition; can be used for large-scale data with high dimensionality.
Disadvantages: Highly data intensive; requires a lot of supervised data to work properly; high storage and computational power is required.

Fuzzy logic
Advantages: Simple; can handle inaccuracies and uncertainty in data, making it ideal for data related to human emotion or language; can visualize the hidden relations in data; has very good potential for accounting for individual differences and circumstantial information; useful for reducing energy consumption, particularly in heating, ventilation, and air conditioning (HVAC).
Disadvantages: Designing fuzzy logic controllers requires more time and hardware; may become a hurdle in verifying the reliability of highly complex systems, because there are many ways of interpreting fuzzy rules, combining the outputs of different fuzzy rules, and de-fuzzifying the output.

Evolutionary algorithms
Advantages: The concept is easy to understand; supports multi-objective optimization; can work effectively in noisy environments; can deal with high dimensionality and sparseness and is excellent at exploring the search space.
Disadvantages: GA is computationally expensive, i.e., time-consuming; designing an objective function and getting the representation and operators right can be difficult; EAs are stochastic, so it cannot be guaranteed that two runs with the same conditions would provide the same solutions, or that the algorithm really converged on the most optimal candidate solution.
4 Summary and Conclusion Computational intelligence, which is the ability of computers to learn to perform a specific task from experimental observation or data, has several useful paradigms, that are—fuzzy systems, neural networks, evolutionary algorithms, etc. Fuzzy systems model imprecision and solve uncertain problems by performing approximate reasoning. Neural networks are a set of algorithms designed to detect patterns by clustering and classifying unlabeled data according to their similarities. Evolutionary techniques are used for optimization problems, where a random set of solutions are generated and then less suitable solutions are eliminated repeatedly until the most suitable solution is obtained. Computational intelligence techniques are very useful for many applications of smart cities, which are cities that use information and communication technologies (ICT) to improve the quality and performance of urban services. A smart city uses technologies such as IoT networks, smart grids, intelligent traffic systems, geospatial technology, artificial intelligence, blockchain technology, etc. These technologies help smart cities become more efficient, ecofriendly, and economic. However, they do have disadvantages such as inability to handle and analyze enormous amounts of data and most importantly security and privacy of residents. Existing smart city defenses are not up to the required standard of security, and thus, improvement is required in this field. Smart cities generate a huge amount and variety of structured and unstructured data. This data can be classified into various types such as time series data, streaming data, sequence data, graph data, spatial data, and multimedia data. This data is collected through IoT devices such as sensors, GPS trackers, cameras, mobile phones, card readers, etc. It is very important to analyze this data as analysis of the data provides useful knowledge and insights that can be applied in various areas of a smart city, such as better mobility, asset management, etc. However, certain challenges are faced while analyzing the data such as the large volume of data, different sources and types of inputs, etc. Since CI techniques have features like the ability to handle uncertainties and multiple hidden layers, these techniques are able to overcome the challenges of Big Data analytics. Thus, computational intelligence is being used for this purpose in smart cities by being applied in several fields such as transport and traffic management, healthcare, security, and improvement of economy, as discussed. Deep learning is primarily used for pattern recognition, classification, and object recognition related tasks while Fuzzy logic is broadly used for tasks involving evaluation or prediction of human language or emotion, apart from several other uses. Evolutionary algorithms are mainly used for various optimization problems, usually in combination with other CI techniques. The use of CI techniques has allowed for the creation of solutions that were not possible with traditional hard computing and of systems that truly increase the efficiency in different aspects of life. Thus, it can be concluded that CI is a great tool for the development of smart cities and that its different techniques should be used appropriately for the particular advantages that each one offers.
References 1. Legg S, Hutter M (2007) Universal intelligence: a definition of machine intelligence. Mind Mach 17(4) 2. Khound K (2013) Smart cities: from concept to reality. Frost and Sullivan: Mountain View, CA, USA, 2013 3. Kayla L (2019) IoT systems: sensors and actuators. DZone, 6 Aug 2019. https://dzone.com/ articles/iot-systems-sensors-and-actuators 4. Dobbs R, Smit S, Remes J, Manyika J, Roxburgh C, Restrepo A (2011) Urban world: mapping the economic power of cities. McKinsey Global Institute, Mar 2011 5. Giffinger R, Fertner C, Kramar H, Meijers E (2007) City-ranking of eurpoean medium-sized cities. Centre of Regional Science, Vienna Univiersity of Technology (UT), Austria, pp 1–12 6. Caragliu A, Del Bo C, Nijkamp P (2009) Smart cities in europe. J Urban Technol 18(0048) 7. Schaffers H, Komninos N, Tsarchopoulos P, Pallot M, Trousse B, Posio E, Fernadez J, Hielkema H, Hongisto P, Almirall E, et al. (2012) Landscape and roadmap of future internet and smart cities. Technical Report, Hal Open Archive 8. Ponemon Institute (2019) Cost of a data breach report. https://www.ibm.com/security/data-bre ach.html Accessed 10 Aug 2020 9. Hunt G, Letey G, Nightingale EB (2017) The seven properties of highly secure devices. Technical Report MSR-TR-2017-16, Microsoft Research 10. Bannerman S (2019) The privacy implications of smart cities. McMaster University, Feb 2019 11. Pal D, Triyason T, Padungweang P (2018) Big data in smart-cities: current research and challenges. Indonesian J Electr Eng Inform (IJEEI) 12. Atis (2018) Data sharing framework. https://www.atis.org/smart-cities-data-sharing/.http Accessed 10 Aug 2020 13. Eric W (2019) Data analytics: the key to delivering smart cities. https://www.smart-energy.com/ magazine-article/data-analytics-the-key-to-delivering-smart-cities/. Accessed 10 Aug 2020 14. Iqbal R, More F B, Mahmu S, Yousuf U (2017) Big data analytics: computational intelligence techniques and application areas. Int J Inf Manage 15. Hinton G E, Salakhutdinov R R (2016) Reducing the dimensionality of data with neural networks. Science 313 16. Chung J, Gulcehre C, Cho K H, Bengio Y (2014) Emperical evaluation of gated neural networks on sequence modeling. In: Andrew YN, Yoshua B, Adam C, et al, Deep Learning and Representation Learning, Montreal, Canada 17. Cheung A, Cabera C, Sarabandi P, Nair K K, Kiremidjian A, Wenzel H (2008) The application of statistical pattern recognition methods for damage detection to field data. Smart Mater Struct 17 18. Andre J, Kiremidian A, Liao Y, Rajagopal R (2016) Structural health monitoring approach for detecting ice accretion on bridge cables using the autoregressive model. Smart Struct Syst 6 19. Gul M, Catbas F N (2009) Statistical pattern recognition for structural health monitoring using time series modeling: theory and experimental verifications. Mech Syst Sign Proces 23(7) 20. Yao R, Pakzad S N (2012) Autoregressive statistical pattern recognition algorithms for damage detection in civil structures. Mech Syst Sign Proces 21. Salehi H, Burgueno R (2018) Emerging artificial intelligence methods in structural engineering. Eng Struct 22. Smarsly K, Law K H, Hartmann D (2012) A multiagent-based collaborative framework for a self-managing structural health monitoring system. J Comput Civil Eng 26 23. Paganelli F, Turchi S, Giuli D (2014) A web of things framework for restful applications and its experimentation in a smart city. IEEE Syst J 10(4) 24. 
Eric W, David A, Roberto R, Rowan W (2016) Assessment of Strategy and Execution of the UK’s Leading Smart Cities. UK Smart Cities Index 25. Habtie AB, Abraham A, Midekso D (2016) A neural network model for road traffic flow estimation. Advances in Nature and Biologically Inspired Computing
26. Lwowski J, Kolar P, Benavidez P, Rad P, Prevost J J, Jamshidi M (2017) Pedestrian detection system for smart communities using deep convolutional neural networks. IEEE Syst Syst Eng 27. Gupta V, Kumar R, Reddy K S, Panigrahi B K (2017) Intelligent traffic light control for congestion management for smart city development. 2017 IEEE Region 10 Symposium 28. Karyotis C, Doctor F, Iqbal R, James A (2015) An intelligent framework for monitoring students affecting trajectories using adaptive fuzzy systems. IEEE International Conference on Fuzzy Systems 29. Doctor F, Iqbal R (2012) An intelligent framework for monitoring student performance using fuzzy based linguistic summarisation. IEEE International Conference on Fuzzy Systems 30. Doctor F, Hagras H, Callaghan V (2005) A type-2 fuzzy embedded agent to realise ambient intelligence in ubiquitous computing environments. Inf Sci 171(4) 31. Karyotis C, Doctor F, Iqbal R, James A, Chang V (2018) A fuzzy computational model of emotion for cloud based sentiment analysis. Inf Sci 32. Fakhrudding H N M, Ali S A, Muzafar M, Azam P Q S (2016) Fuzzy logic in HVAC for human comfort. Int J Sci Eng Res 7 33. Behadada O, Trovati M (2015) Big data-based extraction of fuzzy partition rules for heart arrhythmia detection: a semi-automated approach. Concurrency and Computation: Practice and Experience 34. Bing L, Chan K C C, Ou C (2014) Public sentiment analysis in twitter data for prediction of a company’s stock price movements. In: Li Y, Guo J (eds) Proceedings of the 11th IEEE International Conference on e-Business Engineering, IEEE Computer Society, Guangzhou, China, pp 1–10 35. Duggal R, Khatri S K, Shukla B (2015) Improving patient matching: single patient view for clinical decision support using big data analytics. 4th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) 36. Melo FS, Silva JLM, Macedo HT (2016) Flood monitoring in smart cities based on fuzzy logic about urban open data. Proceedings of the 2016 8th Euro American Conference on Telematics and Information Systems, Institute of Electrical and Electronics Engineers (IEEE) 37. Bhattacharya M, Islam R, Abawajy R (2016) Evolutionary optimization: a big data perspective. J Netw Comput App 38. Kattan A, Doctor F, Arif M (2015) Two brains guided interactive evolution. In: Systems, Man, and Cybernetics (SMC). 2015 IEEE International Conference. IEEE, pp 3203–3208 39. Fujdiak R, Masek P, Mlynek P, Misurec J, Oishannikova E (2016) Using genetic algorithm for advanced municipal waste collection in smart city. 10th International Symposium on Communication Systems, Networks and Digital Signal Processing (CSNDSP) 40. Vlahogianni EI, Kepaptsoglou K, Tsetsos V, Karlaftis MG (2016) Real-time parking prediction system for smart cities. J Intell Transp Syst 20(2) 41. Doctor F, Hagras H (2013) Neuro type-2 fuzzy based method for decision making. US Patent No. US8515884B2 42. Borghini G, Astolfi L, Vecchiato G, Mattia D, Babiloni F (2014) Measuring neurophysiological signals in aircraft pilots and cat drivers for the assessment of mental workload, fatigue and drowsiness. Neuroscience and Biobehavioral Reviews 43. Nagaraj K, Sridhar A (2015) A predictive system for detection of bankruptcy using machine learning techniques. Int J Data Min Knowl Manage Process 5(1) 44. Maniak T, Iqbal R, Doctor F, Jayne C (2015) Automated sound signalling device quality assurance tool for embedded industrial control applications. 
In: 2013 automated intelligent system for sound signalling device quality assurance. Proceedings - 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2013), Manchester, United Kingdom, pp 4812–4818 45. Mishra VP, Shukla B, Bansal A (2019) Analysis of alarms to prevent the organizations network in real-time using process mining approach. Cluster Comput 22(3):7023–7030 46. Mishra VP, Shukla B (2017) Process mining in intrusion detection-the need of current digital world. In: International conference on advanced informatics for computing research. Springer, Singapore, pp 238–246
Chapter 5
Proposed End-to-End Automated E-Voting Through Blockchain Technology to Increase Voter’s Turnout Ashish Singh Parihar, Devendra Prasad, Aishwarya Singh Gautam, and Swarnendu Kumar Chakraborty
1 Introduction
The voting system of India is unique, and conducting it is a highly responsible task. Any citizen who is 18 years or older can cast his or her vote to choose the desired representative, and the candidate who gets the maximum number of votes wins. This whole process is overseen by the Election Commission (EC), which in India is a very powerful and strict body. For each election, the government of India tries to ensure that voter turnout increases compared with the previous election, and every EC official pushes for increased voting. There are various types of elections in the country: Members of Parliament (Lok Sabha), Members of the State Legislative Assembly, Members of Parliament (Rajya Sabha), Members of the State Legislative Council, and members of local panchayats or city corporation councils. Lok Sabha elections are also called general elections, in which Members of Parliament are elected by the votes of all adult citizens of India.
A. S. Parihar (B) · D. Prasad Department of CSE, DIT University, Dehradun, Uttarakhand, India e-mail: [email protected] D. Prasad e-mail: [email protected] A. S. Gautam Department of ECE, Tezpur University, Tezpur, Assam, India e-mail: [email protected] S. K. Chakraborty Department of CSE, National Institute of Technology Arunachal Pradesh, Yupia, Arunachal Pradesh, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_5
The election process of every country is a crucial turning point for its citizens as well as for the country itself. Each country wants a strong government that can take it to new heights on the global stage. Youth play a very important role in this process, since young people make up the largest portion of the population of any country, so it becomes their duty to vote, no matter what the circumstances are. Under the current voting scenario, the physical presence of the voter at the voting location is required, which is often not possible because many people work at remote locations. At every election, the government spreads awareness of the effects of voting among its citizens and explains the benefits of higher turnout. Various technical and non-technical communication media are used for this voting awareness, but for several reasons the voter turnout in the country is still not up to the mark. Table 1 shows the total voting percentage for the state of Uttar Pradesh, India, in the general elections to the state legislative assembly (vidhansabha) from 1991 to 2017 [1]. From Table 1, it can be observed that the voting percentage is much lower than expected. From time to time, surveys are conducted to understand the root cause of this low turnout. Voter turnout during 1951–2014 can be observed in Fig. 1, in which the maximum turnout is up to 68%. Figure 2 compares the number of registered electors with the number of actual voters. Some of the reasons for low voting are:
• Citizens do not have faith in any political party.
• The voter's name is missing from the voter list.
• The voter is not physically available at the required location.
• Transportation issues for old and disabled voters, etc.
To elect a strong and trustworthy government, it is very important that a higher number of votes is cast. The importance of casting a vote is listed below:
• Voting is considered a process of change.
• Every vote counts and matters in selecting a strong government.
• It offers a medium of expression.
• Voting is not just a process; it is a responsibility.
• Voting is a matter of pride and honor for citizens.
In this article, an end-to-end automated voting system is proposed using cryptography and blockchain technology. With the help of the public–private key concept in cryptography, all communications are encrypted so that messages flow securely over the communication channel, and with blockchain technology the flow of transactions becomes tamper-proof and secure. The rest of the article is organized as follows. First, a survey of the related literature is presented in Sect. 2. Then, background knowledge of cryptography and blockchain is briefly introduced in Sect. 3. Section 4 gives a brief explanation of the proposed methodology. Section 5 analyzes and evaluates sample metrics for the proposed framework. Limitations and assumptions of this proposal are discussed in Sect. 6. Finally, we conclude this study in Sect. 7.
Table 1 Voting percentage in general election to state legislative assembly (vidhansabha) of Uttar Pradesh, India (1991–2017), as per EC's database

Year | Category | Male | Female | Others | Total | Total voting in %
2017 | Number of electors | 77,042,607 | 64,613,747 | 7292 | 141,663,646 | 61.24
2017 | Number of electors who voted | 45,570,067 | 40,906,123 | 277 | 86,755,499 |
2012 | Number of electors | 70,256,859 | 57,232,002 | 3975 | 127,492,836 | 59.40
2012 | Number of electors who voted | 41,225,412 | 34,500,316 | 65 | 75,725,793 |
2007 | Number of electors | 61,581,185 | 51,968,165 | 0 | 113,549,350 | 45.95
2007 | Number of electors who voted | 30,392,935 | 21,784,164 | 0 | 52,177,099 |
2002 | Number of electors | 54,738,486 | 45,017,841 | 0 | 99,756,327 | 53.80
2002 | Number of electors who voted | 31,047,200 | 22,621,474 | 0 | 53,668,674 |
1996 | Number of electors | 55,532,926 | 45,372,893 | 0 | 100,905,819 | 55.73
1996 | Number of electors who voted | 33,455,897 | 22,776,712 | 0 | 56,232,609 |
1993 | Number of electors | 49,053,353 | 40,477,388 | 0 | 89,530,741 | 57.13
1993 | Number of electors who voted | 30,766,752 | 20,379,284 | 0 | 51,146,036 |
1991 | Number of electors | 95,988 | 78,659 | 0 | 174,647 | 57.24
1991 | Number of electors who voted | 59,579 | 40,388 | 0 | 99,967 |
2 Literature Survey
A study of the available literature shows that automated e-voting has been attempted with various available technologies [2–9]. Among all considerations, the most important factors for this kind of voting are maintaining security and keeping the citizens' trust in the process. Each individual approach has its own pros
Fig. 1 Voter turnout in Lok Sabha election during 1951–2014
Fig. 2 Comparison of electors and voters in national elections during 1951–2014
and cons, but the government must ensure that the advantages of the process clearly outweigh the disadvantages. Karim [10] proposed a system where the voter is first authenticated through his or her physical presence at the polling office, after which the creation of blocks on the blockchain begins. This automated process includes the sequential creation, sealing, and termination of blockchain blocks. At the very beginning of the process, the voter's physical presence is mandatory, because only in this way can the voter be authenticated through a national-level authenticated ID.
Sedky and Hamed [11] proposed an e-voting process designed to overcome issues of accuracy, transparency, and cost-effectiveness in a very secure way. Before the final outcome report of each polling station is transferred automatically to the district committee, the report is checked manually by every candidate's representative, who is asked to sign it as proof of agreement with the presented result. The report is then physically delivered to the office of the appropriate district committee, where an evaluator compares the physically delivered result report of the polling station with the automatically generated report to make sure that the outcomes have been accumulated without any failure. Hjalmarsson et al. [12] proposed a blockchain-based e-voting system that uses smart contracts to enable cost-efficient and secure elections while maintaining voter privacy. In their paper, they show that blockchain technology opens up a new possibility to reduce the adoption barriers and limitations of e-voting systems, ensures election security and integrity, and lays the ground for transparency. With a private Ethereum blockchain, it is possible to send multiple transactions per second to the blockchain, and every aspect of the smart contract is designed to ease the load on the blockchain network. Kshetri and Voas [13] proposed a blockchain-based e-voting scheme using a digital-currency analogy: the process issues every voter a "wallet" containing the voter's credential, and every voter holds a single "coin" that gives him or her the opportunity to cast a vote. Casting a vote means transferring the voter's coin to the nominated candidate's wallet, and each voter can spend the coin only once. Additionally, it is up to the voter whether he or she wants to change their vote within the existing deadline. Garg et al. [14] presented a comparative analysis of the issues generally faced by e-voting systems. According to their paper, there always remains the concern of authenticating the current voter, which requires some sort of biometric device or unique ID. They recommend blockchain as a better alternative for the voting process compared with other technologies, since it is essentially tamper- and error-proof. Yavuz et al. [15] implemented and verified an e-voting application as a smart contract using Ethereum wallets and the Solidity language for the Ethereum network. In their approach, an Android environment is considered to allow voting by voters who do not have an Ethereum wallet. Once the election is over, the records of the votes are held by Ethereum, and voters can cast their votes through an Android-based device or through an Ethereum wallet. Sheer Hardwick et al. [16] presented a solution with the potential to address the lack of interest in voting among young people; blockchain was their candidate technology for making such e-voting more transparent, open, and independently auditable. Khoury et al. [17] proposed a novel decentralized voting platform based on blockchain technology to overcome the trust issues of the voter.
Data integrity and transparency were the main features of their system and implemented
one vote per one mobile number for each poll with guaranteed privacy. In their system, Ethereum Virtual Machine (EVM) was used to behave as the blockchain run time environment on which consistent and deterministic smart contracts were deployed by the organizers for every voting event in order to run the voting rules. Voters were authenticated via their mobile numbers without the need of any third party.
3 Background Knowledge of Cryptography and Blockchain
3.1 Cryptography
Whenever data travels between two workstations, it is secured only at the endpoints, i.e., at the source and the destination. However, the data needs to travel over some medium such as wired lines, wireless links, Bluetooth, routers, etc., and at any point in these media someone could simply capture the data (i.e., a kind of hack). Initially, the data is in raw format, called plain text. To secure this data, we need mechanisms called encryption and decryption, and the whole process of encryption and decryption is known as cryptography [18]. Symmetric and asymmetric cryptography are different strategies for implementing cryptography. Public–private key cryptography is a form of asymmetric cryptography in which no common key is shared among the communicating stations. Some popular public–private key cryptographic algorithms are:
• Diffie–Hellman key exchange
• DSS with digital signature
• ElGamal
• Various elliptic curve techniques
• Paillier cryptosystem
• RSA algorithm
• Cramer–Shoup cryptosystem.
The workflow of the public–private key cryptography process is shown in Fig. 3.
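To illustrate the public–private key idea with one of the algorithms listed above (RSA), the short sketch below encrypts a message with the receiver's public key so that only the holder of the matching private key can decrypt it. It assumes the Python cryptography package is available; the key size, padding choice, and message are illustrative assumptions.

# Minimal sketch of public-private key (asymmetric) encryption with RSA.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The receiver generates a key pair and publishes only the public key.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

plaintext = b"sample message to be exchanged securely"  # hypothetical message

# Anyone can encrypt with the public key ...
ciphertext = public_key.encrypt(
    plaintext,
    padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                 algorithm=hashes.SHA256(), label=None),
)

# ... but only the private-key holder can decrypt.
recovered = private_key.decrypt(
    ciphertext,
    padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                 algorithm=hashes.SHA256(), label=None),
)
assert recovered == plaintext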
3.2 Blockchain
Blockchain technology comprises a continuously growing ledger that holds a permanent record of every transaction that has taken place, in a secure and immutable way. This technology can be used for the secure transfer of money, property, etc., without the interference of a third-party intermediary such as a bank or government. A blockchain can also be visualized as a distributed database in which a series of blocks are chained together and the chain keeps growing. Each block in a blockchain
Fig. 3 Public–private key cryptography flow
is cryptographically hashed using SHA-256 [19], and the hash is passed to the next block in the chain so that the blocks are internally connected through that hash function [20]. A blockchain involves various components such as miners, proof of work, a public ledger, and cryptographic hash functions, which make the technology tamper-proof. Bitcoin, as a crypto-currency, is a classic implementation of blockchain technology. Three types of blockchain are available:
• Public blockchain.
• Private blockchain.
• Federated or consortium blockchain.
Depending on the nature of the environment and the implementation, a blockchain can be public, private, or a combination of the two. Some of the important features of blockchain are listed below:
• Achieves a high level of security
• Reduced threat of hacking
• Transparency of transactions
• No brokerage for intermediaries
• Various levels of accessibility
• Automatic reconciliation of accounts.
The essential steps of the blockchain workflow are shown in Fig. 4, which depicts a money transfer from one party to another. Initially, the transaction is placed in a block, and that block is then validated through a consensus mechanism. After that, the block is cryptographically hashed using SHA-256 and linked with the previously created blocks.
Fig. 4 Standard blockchain workflow
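To make the hash-and-link step concrete, the following minimal sketch uses SHA-256 from Python's standard hashlib to chain a few blocks, each storing the hash of its predecessor, so that tampering with any earlier block invalidates the rest of the chain. Field names and data are hypothetical, and consensus, mining, and networking are deliberately left out.

# Minimal sketch of a SHA-256 linked chain of blocks (not a full blockchain:
# no consensus, mining, or peer-to-peer network).
import hashlib
import json
import time

def block_hash(block):
    # Hash the block's contents in a deterministic (sorted-key) JSON form.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def new_block(data, previous_hash):
    return {"timestamp": time.time(), "data": data, "previous_hash": previous_hash}

# Genesis block followed by two transaction blocks.
chain = [new_block("genesis", previous_hash="0" * 64)]
for tx in ["A pays B 10 units", "B pays C 4 units"]:
    chain.append(new_block(tx, previous_hash=block_hash(chain[-1])))

def chain_is_valid(chain):
    # Every block must point at the hash of the block before it.
    return all(chain[i]["previous_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

print(chain_is_valid(chain))               # True
chain[1]["data"] = "A pays B 999 units"    # tamper with an earlier block
print(chain_is_valid(chain))               # False: the link to the next block breaks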
4 Proposed Methodology
4.1 Voter Registration Through EC's Portal
Before the actual voting starts, the EC begins registering voters through its own portal. Registration is carried out by the voters themselves at dedicated camps organized by the EC in their city or town. During the registration process, voters are authenticated via a unique identification (UID) number issued by the government of the country (such as the Aadhaar card number in India) and then have to provide answers to some questions about themselves. These questions are designed to be tricky but basic questions about the voter to which only the voter knows the answers. The EC prepares the questions so that they are applicable to all voters yet complex for any other voter to answer, in order to avoid threats. Sample questions might be the middle two digits of the birth year, the company of the voter's first vehicle, the fourth letter of the mother's first name, etc. The questions are designed so that they are very basic for the specific voter for whom they are intended, but very complex and time-consuming for any other voter to guess or answer within the same time. The workflow of voter registration on the EC's portal is shown in Fig. 5.
4.2 Exchange of Data Using Public–Private Key Cryptography
Since the design, creation, and exchange of these questions and their answers for a specific voter are highly sensitive, there needs to be a secure mechanism
Fig. 5 Voter’s registration through EC portal
between the EC and the voter. The exchange of questions and their answers between the EC and the voter is done via public–private key cryptography.
A. Question level
a. At the EC side: (Question) encrypted by the EC with its private key = (Cipher Question)
b. At the voter side: (Cipher Question) decrypted by the voter with the EC's public key = (Question).
B. Answer level
a. At the voter side: (Answer) encrypted by the voter with its private key = (Cipher Answer)
b. At the EC side: (Cipher Answer) decrypted by the EC with the voter's public key = (Answer).
At the very beginning of registration, the EC and the voters exchange their public keys. In this way, communication between the EC and the voters can be kept secure during registration.
4.3 Display of Voter List
After the registration process on the EC portal is over, the EC publishes a list of polling stations and their corresponding voters. Each polling station is represented by a unique polling ID, and voters should check whether their names appear under their polling station (PS). After the exchange of questions and answers, the list of voters is published at the PS level, as shown in Fig. 6.
Fig. 6 Publishing voters list
4.4 Creation of Blocks
On the day of voting, the presiding officer (PO) at each polling station starts the voting blockchain by creating the PS's genesis block. Each voter is authenticated through the combination of their UID and PS ID and can join the PS voting blockchain by answering the questions within a short time frame (for example, only 5–10 s per question) to avoid any kind of threat. On successfully completing the answers, the voter adds a block containing the vote of their choice to the PS voting blockchain, and their name is removed from the PS's voter list. This is shown in Fig. 7.
Fig. 7 Blockchain of PS’s blocks
4.5 Chaining and Sealing the Blocks
The creation of blocks at each polling station stops in either of the two cases below:
I. The time permitted for voting expires (as decided by the EC).
II. All voters of the specific PS have cast their votes.
Each voter's block is hashed using SHA-256 and the hash is passed to the next voter's block, and this process continues until one of the two cases above occurs. The genesis block of each PS, created by the PO, is also hashed using SHA-256, and the keyed hash value is passed to the first voter of that PS. The PS-level block is shown in Fig. 8. This process applies to all polling stations; once the creation of blocks at a PS stops, the whole chain is hashed again using SHA-256 to create a single block for that PS. The single blocks corresponding to all the polling stations are then chained together by the same procedure and hashed into a single block at the city level, again using SHA-256, as shown in Fig. 9. Once the city-level blocks are created, the EC generates its own genesis block, which is chained with the city-level blocks to create the final voting blockchain, shown in Fig. 10.
Fig. 8 Creation of PS level block
Fig. 9 Creation of city-level block
Fig. 10 Final blockchain with city-level block
Algorithm I
1. START
2. Registration And Data Exchange(Voter's ID, Voter's Answers)
3.   if (Voter's ID == Valid National ID) then
4.     Accept Voter & Start Questions
5.     return Voter's Answers
6.   end if
7.   else
8.     Abort session
9.   end else
10. end Registration And Data Exchange(Voter's ID, Voter's Answers)
11. Display List(Polling Station's list)
12.   while (Polling Station's list != empty)
13.     Iterate (each polling station)
14.     display(Voter's Name according to Polling Station)
15.     Polling Station's list -= 1
16.   end while
17. end Display List(Polling Station's list)
18. Block Creation(PO's ID, Voter's ID, PS's ID)
19.   genesis block creation using PO's ID for each polling station
20.   for each (Voter belongs to valid polling station)
21.     if (Voter's answer == valid)
22.       create and add block to the prev. generated block
23.     end if
24.     else
25.       Abort session
26.     end else
27.   end for each
28. end Block Creation(PO's ID, Voter's ID, PS's ID)
29. Chain and Seal(PO's ID, Voter's ID, PS's ID)
30.   List createBlock = new Block Creation(PO's ID, Voter's ID, PS's ID)
31.   while (voting timeout != true && list of voters != empty)
32.     createBlock = Block Creation(PO's ID, Voter's ID, PS's ID) and chain them using SHA-256 blockchain policy
33.   end while
34.   Seal the chain
35. end Chain and Seal(PO's ID, Voter's ID, PS's ID)
36. END
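As a rough companion to Algorithm I, the sketch below expresses the aggregation of Sect. 4.5 in plain Python: each polling-station chain is sealed into a single SHA-256 digest, the station digests are combined into a city-level digest, and the city digests are chained under the EC's genesis value. All identifiers and data here are hypothetical, and validation, timing, and networking are omitted.

# Minimal sketch of the hierarchical sealing described in Sect. 4.5:
# voter blocks -> polling-station (PS) digest -> city digest -> EC-level chain.
import hashlib

def sha256(text):
    return hashlib.sha256(text.encode()).hexdigest()

def seal(blocks, genesis):
    # Chain the blocks after a genesis value and return the final (sealing) hash.
    running = sha256(genesis)
    for block in blocks:
        running = sha256(running + block)
    return running

# Hypothetical votes recorded at two polling stations of one city.
ps_votes = {
    "PS-001": ["voter01:candidateA", "voter02:candidateB"],
    "PS-002": ["voter03:candidateA"],
}

# Each PS chain is sealed into one digest (its genesis comes from the PO's ID).
ps_digests = {ps: seal(votes, genesis="PO-" + ps) for ps, votes in ps_votes.items()}

# The PS digests are combined into a single city-level digest ...
city_digest = seal(sorted(ps_digests.values()), genesis="CITY-01")

# ... and the city digests are chained under the EC's own genesis block.
final_chain_hash = seal([city_digest], genesis="EC-GENESIS")
print(final_chain_hash)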
4.6 Voting Result Display
Once the final blockchain at the EC level is created, the blockchain administrator collects all the transaction data of the voting. Since the chaining is implemented through a blockchain, there is no scope for tampering with or modifying the corresponding data. Once all the data has been gathered by the administrator, the final result is displayed.
5 Analysis and Metrics Evaluation of Proposed Framework
For an Ethereum implementation of the proposed framework, the expected block time of Ethereum is 10–19 s, where block time means the time within which a new block is created or generated. Unlike Bitcoin, Ethereum has no block size limit. The time taken by the registration function of Algorithm I would therefore be around 50–60 s for each voter, and the display function of Algorithm I would take around 2–3 min per polling station. Let us assume the following scenario:
Number of registrations per hour (Avg.) = 100,000.
Number of created blocks per hour = 6897.
Avg. Trans./block = (Avg. Trans./hour)/(Avg. blocks per hour) = 14.5.
Avg. Trans. Size = BS (Block Size)/(Avg. Trans./block) = 1.4 KB.
It can be observed from the above calculation that the average transaction size is approximately 1.4 KB. The following system configuration was used for testing purposes:
• Intel Core i5-7200U Processor (3 M Cache, up to 3.10 GHz)
• 16 GB of memory with Windows 64-bit Operating System (V. 10).
For further evaluation, we have used the following metrics, which cover the throughput and latency of the proposed framework:
• Throughput is the amount of data travelling along the chain from one location to another per unit time. In Fig. 11, throughput is evaluated against the number of voters;
Fig. 11 Throughput with number of voters
Fig. 12 Latency while increasing throughput
for example, 10,000 voters correspond to a throughput of 596 KBps with the proposed voting framework, i.e., maintaining this throughput allows the data of 10,000 voters to be transferred from one location to another.
• Latency is the delay during which one chain waits for another to join. The waiting time for a block to join the chain is evaluated in terms of latency and is shown in Fig. 12. JMeter was used as the tool for this evaluation of latency in the framework.
Both of the above parameters have been evaluated on sample voter data and monitored over various values, so that the effectiveness of the framework can be observed.
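The back-of-the-envelope figures above can be reproduced in a few lines; note that the block size used below is an assumption chosen so that the result matches the quoted 1.4 KB, not a value fixed by Ethereum.

# Reproducing the rough averages quoted above (illustrative values only).
registrations_per_hour = 100_000   # assumed average registrations per hour
blocks_per_hour = 6_897            # assumed average blocks created per hour
block_size_kb = 20.3               # assumed block size in KB (chosen to match the text)

tx_per_block = registrations_per_hour / blocks_per_hour
avg_tx_size_kb = block_size_kb / tx_per_block

print(round(tx_per_block, 1))      # ~14.5 transactions per block
print(round(avg_tx_size_kb, 1))    # ~1.4 KB per transaction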
6 Limitations and Assumptions
Implementation of this automated voting system has certain limitations and pre-assumptions, which are listed below:
(1) Prior knowledge of the technologies and their implementation is required.
(2) A private blockchain is assumed in this kind of implementation.
(3) The government must convince voters of the system's security.
(4) Voter awareness of the technical aspects is required, and the government must provide proper training in rural areas.
(5) Upper management should be highly trustworthy.
(6) A prototype of the entire system should be built before the actual voting.
(7) No Internet failure should happen during the process.
(8) The entire system should support architectural neutrality.
7 Discussion and Conclusion
Voting, for any country, is not just a process; it is the way a country decides its future for the coming years. Every individual in a country wants a government that is unbiased, trustworthy, and has a broad vision for the country. Conducting a voting process is not an easy task for any organization or institution: it is a highly responsible task that also involves many technical aspects. Different countries have different strategies for conducting the voting process, but the final aim is that it should be completely fair and unbiased. The idea of this research paper comes from the Indian voting process, where various reasons exist for low voter turnout. In this research, one of the possible causes of low turnout is targeted, and an implementation is proposed accordingly. Choosing blockchain, one of the trending technologies, is the most effective feature of the proposed solution, as it makes the transaction of votes highly secure and tamper-proof. There has always been hesitation to adopt new technologies in matters such as voting, which is a highly sensitive issue for any government. In the initial phase, implementing these technologies at the national level is quite complex, but after several revisions, once the entire environment is ready, the whole process of voting becomes quite acceptable.
References 1. https://eci.gov.in, Feb 2019 2. Patil PS, Bansal A, Raina U, Pujari V, Kumar R (2018) E-smart voting system with secure data identification using cryptography. In: 2018 3rd international conference for convergence in technology (I2CT). https://doi.org/10.1109/i2ct.2018.8529497 3. Bhuvanapriya R, Rozil Banu S, Sivapriya P, Kalaiselvi VKG (2017) Smart voting. In: 2017 2nd international conference on computing and communications technologies (ICCCT). https://doi. org/10.1109/iccct2.2017.7972261 4. Hanifatunnisa R, Rahardjo B (2017) Blockchain based e-voting recording system design. In: 2017 11th international conference on telecommunication systems services and applications (TSSA). https://doi.org/10.1109/tssa.2017.8272896 5. Lai W-J, Hsieh Y, Hsueh C-W, Wu J-L (2018) DATE: a decentralized, anonymous, and transparent e-voting system. In: 2018 1st IEEE international conference on hot information-centric networking (HotICN). https://doi.org/10.1109/hoticn.2018.8605994 6. Moura T, Gomes A (2017) Blockchain voting and its effects on election transparency and voter confidence. In: Proceedings of the 18th annual international conference on digital government research—DGO’17. https://doi.org/10.1145/3085228.3085263
7. Wang S, Ni X, Yuan Y, Wang F-Y, Wang X, Ouyang L (2018) A preliminary research of prediction markets based on Blockchain powered smart contracts. In: 2018 IEEE international conference on internet of things (iThings) and IEEE green computing and communications (GreenCom) and IEEE cyber, physical and social computing (CPSCom) and IEEE smart data (SmartData). https://doi.org/10.1109/Cybermatics_2018.2018.00224 8. Andrian HR, Kurniawan NB, Suhardi (2018). Blockchain technology and implementation : a systematic literature review. In: 2018 international conference on information technology systems and innovation (ICITSI). https://doi.org/10.1109/icitsi.2018.8695939 9. Hellebrandt L, Homoliak I, Malinka K, Hanacek P (2019) Increasing trust in tor node list using Blockchain. In: 2019 IEEE international conference on Blockchain and cryptocurrency (ICBC). https://doi.org/10.1109/bloc.2019.8751340 10. Karim MM, Khan NS, Zavin A, Kundu S, Islam A, Nayak B (2017) A proposed framework for biometrie electronic voting system. In: 2017 IEEE international conference on telecommunications and photonics (ICTP). https://doi.org/10.1109/ictp.2017.8285916 11. Sedky MH, Hamed EMR (2015) A secure e-Government’s e-voting system. In: 2015 science and information conference (SAI). https://doi.org/10.1109/sai.2015.7237320 12. Hjalmarsson FP, Hreioarsson GK, Hamdaqa M, Hjalmtysson G (2018) Blockchain-based evoting system. In: 2018 IEEE 11th international conference on cloud computing (CLOUD). https://doi.org/10.1109/cloud.2018.00151 13. Kshetri N, Voas J (2018) Blockchain-enabled e-voting. IEEE Softw 35(4):95–99. https://doi. org/10.1109/ms.2018.2801546 14. Garg K, Saraswat P, Bisht S, Aggarwal SK, Kothuri SK, Gupta S (2019) A comparitive analysis on e-voting system using Blockchain. In: 2019 4th international conference on internet of things: smart innovation and usages (IoT-SIU). https://doi.org/10.1109/iot-siu.2019.8777471 15. Yavuz E, Koc AK, Cabuk UC, Dalkilic G (2018) Towards secure e-voting using ethereum blockchain. In: 2018 6th international symposium on digital forensic and security (ISDFS). https://doi.org/10.1109/isdfs.2018.8355340 16. Sheer Hardwick F, Gioulis A, Naeem Akram R, Markantonakis K (2018) E-voting with Blockchain: an e-voting protocol with decentralisation and voter privacy. In: 2018 IEEE international conference on internet of things (iThings) and IEEE green computing and communications (GreenCom) and IEEE cyber, physical and social computing (CPSCom) and IEEE smart data (SmartData). https://doi.org/10.1109/cybermatics_2018.2018.00262 17. Khoury D, Kfoury EF, Kassem A, Harb H (2018) Decentralized voting platform based on ethereum Blockchain. In: 2018 IEEE international multidisciplinary conference on engineering technology (IMCET). https://doi.org/10.1109/imcet.2018.86030 18. https://en.wikipedia.org/wiki/Public-key_cryptography, May 2019 19. https://en.bitcoinwiki.org/wiki/SHA-256, May 2019 20. https://support.blockchain.com/hc/en-us/articles/211160223-What-is-blockchain-technology, Aug 2019
Chapter 6
Future of Data Generated by Interactive Media Divyansh Kumar and Neetu Narayan
D. Kumar · N. Narayan (B) Amity School of Engineering and Technology, Amity University Noida, Sector-125, Noida, Uttar Pradesh, India e-mail: [email protected] D. Kumar e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_6
1 Introduction
Data was our past, it is our present, and it will very much be our future. Everything people do online is recorded: every search they make, every song they listen to, every movie they watch, and everywhere they go while the location services of their electronic devices are turned on. But why does this happen? Why is our information being stored? Does it have any value to anyone other than us? If so, why is our permission not taken, and why are we not compensated for all the personal and private information we provide? Almost all of the information collected from people's Internet activity is used to serve them better advertisements and better recommendations for all sorts of products and services. This is done with the help of recommendation systems. These recommendation systems need to be fed variables that help them decide which recommendations to present to a particular user. However, recommendation systems can only work with the data they get, and that data is limited to a user's search history and their interaction with the content of the sites that deploy these systems, such as music, podcast, and movie streaming services and even online retailers, for example, 'Discover Weekly' by Spotify, 'For You' by Netflix, 'Customers who bought this also bought…' by Amazon, etc. From a customer's point of view, recommendations are helpful suggestions which make it easier for them to find relevant content from a large number of options
present online. But from a vendor’s point of view, the recommendations are targeted advertisements to increase their profit [1] and they also help in retaining the user to their website or application for a longer period of time as the website is showing the customers what they might like. Now, with the introduction of new age interactive movies, companies have a new way of improving their recommendations and also help other companies improve theirs. Interactive movies give users the choice to choose what happens in particular scenes of the movies; therefore, giving users the control over the movie’s storyline. These choices made by the user are stored and can be used to improve the recommendations provided to the user thereon [2]. But the benefit and application of interactive media are not limited to just that, it can be much more.
2 Recommendation Systems
Recommendation systems (or engines) are programs that are fed user data and output the ratings that a certain user would be expected to give to a certain product or service. These systems are used to recommend everything from songs, movies, books, news, and search queries to jokes, life insurance, financial services, and even romantic partners. Recommendation systems are deployed on almost all websites today, and they take their input from users in many ways, for example by looking at users' search history and focusing on certain keywords, or by looking at the content they consume (songs, videos, etc.). For example, if we feed an Eminem rap song, Lose Yourself, to a song recommendation system, it will output other Eminem songs or even other rap songs in general, based on the genre of the song or the artist. Recommendation engines are used in different ways by a large number of companies providing various services and products: some companies feed the recommendation system with keywords entered by the user, while others take input from the type of media the user consumes. Recommendation systems are built with the help of machine learning algorithms. They produce recommendations in one of two ways—collaborative filtering or content-based filtering (also known as the personality-based approach). The collaborative filtering approach takes into consideration the user's past behavior along with the products and services chosen by users with similar interests in order to recommend products and services to the user. The content-based filtering approach uses a set of features of an item to recommend items with similar features. These two approaches are often combined into hybrid recommendation systems to get the most accurate recommendations. But what if there was another way to help recommendation engines give more accurate recommendations?
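For a concrete sense of content-based filtering, the toy sketch below scores items by cosine similarity between their feature vectors and a profile built from the items a user already liked. The item names and feature values are made up for illustration and do not come from any real service.

# Toy content-based recommender: items are feature vectors (e.g. genre flags),
# and a user profile is the average of the items they already liked.
import numpy as np

# Hypothetical feature columns: [rap, pop, acoustic, high-energy]
items = {
    "Lose Yourself":   np.array([1.0, 0.0, 0.0, 1.0]),
    "Delicate":        np.array([0.0, 1.0, 1.0, 0.0]),
    "Endgame":         np.array([0.3, 1.0, 0.0, 0.7]),
    "Till I Collapse": np.array([1.0, 0.0, 0.0, 0.9]),
}

liked = ["Lose Yourself"]                         # the user's listening history
profile = np.mean([items[name] for name in liked], axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

scores = {name: cosine(profile, vec)
          for name, vec in items.items() if name not in liked}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")                 # most similar items first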
3 Interactive Media

Interactive media can be defined as a form of media whose output depends on input from the user; it works through user interaction. Although interactive media is otherwise the same as any other media people consume, the one big difference is that it gives users the ability to interact with the media and alter the output. The latest and best-known example of interactive media is Netflix's Black Mirror: Bandersnatch. Interactive movies work as follows: as the movie progresses, users are given two choices at many points in the movie, and these inputs determine how the storyline proceeds. This way the users get to decide what they want to see in the movie. As the movie progresses, users usually start to connect with the characters and make choices based on what they want the characters to go through. In a two-hour interactive movie, a user can be asked for input ten or more times. The choices made by users alter the storylines of the movie and thus its endings; different users choose different options and see different endings.
4 How Interactive Movies Collect Data

Interactive movie streaming services collect user data by storing the user's choices, which are mapped to particular scenes in the movie and stored under variable names of the service's choosing. For instance, Netflix stores the data of every user in a table in which every choice made while streaming an interactive movie is recorded as a variable name [2]. Suppose you are asked to select the music you want a character to listen to and have to choose between, say, Lose Yourself (Eminem) and Delicate (Taylor Swift); this choice could then be stored as 1L or 1D. Netflix keeps these choices in a separate table for each user [2]. Along with the choices, the date, time, streaming source (website/app), and similar attributes can also be saved to build a better picture of the user and their habits. The time at which a user streams the interactive media can itself be significant for interpreting the choices they make.
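As a rough sketch of how such choice events could be represented, the following Java fragment stores one record per choice, keyed by user. The field names and the 1L/1D-style codes follow the example above, but the schema is an illustrative assumption and not the actual design of any streaming service.

    import java.time.Instant;
    import java.util.*;

    // Hypothetical per-user log of interactive-movie choices.
    public class ChoiceLogSketch {

        record ChoiceEvent(String userId, String titleId, String sceneId,
                           String choiceCode, Instant timestamp, String source) {}

        private final Map<String, List<ChoiceEvent>> perUser = new HashMap<>();

        void record(ChoiceEvent e) {
            perUser.computeIfAbsent(e.userId(), k -> new ArrayList<>()).add(e);
        }

        public static void main(String[] args) {
            ChoiceLogSketch log = new ChoiceLogSketch();
            log.record(new ChoiceEvent("user-42", "bandersnatch", "scene-1",
                                       "1L", Instant.now(), "app"));
            log.record(new ChoiceEvent("user-42", "bandersnatch", "scene-7",
                                       "7F", Instant.now(), "app"));
            System.out.println(log.perUser.get("user-42"));
        }
    }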
5 Marketing Aspect of Interactive Movies

Apart from being an interesting way to watch movies, interactive movies also generate a large amount of data about the user interacting with them. The data
generated has a lot of significance and can be used in various ways, like improving recommendations for new products and services. Tables 1 and 2 give examples of scenes that can be used in interactive movies to get input from users, which can later generate a lot of insight about the user and their likes and dislikes. Companies can use interactive movies to get market surveys done for their upcoming products and even get insights about which of their products fare well compared with other products made by them and by their competitors.

Table 1 Product recommendation from straight-forward choices

S. No. | Situation in the media | Choices
1 | User is asked to choose what type of cereal they want the character to consume | Quaker Sugar Puffs / Kellogg's Frosties
2 | User is asked to choose what type of music they want the character to listen to | Lose Yourself (Eminem) / Endgame (Taylor Swift)
3 | User is asked to choose what book they want the character to buy at the book store | The Lord of the Rings (J. R. R. Tolkien) / The 7 Habits of Highly Effective People (Stephen R. Covey)
4 | User is asked to choose what computer they want the character to buy at the electronics store | Apple MacBook Pro / Dell Alienware M15 Gaming
5 | User is asked what they want the character to do at a night stay with friends | Play FIFA 19 / Watch Stranger Things

Table 2 Product recommendation from psychological/subconscious choices

S. No. | Situation in the media | Choices
1 | User is asked if they want the character to get into a fight or avoid it | Hell yeah! / No
2 | Choosing if the user wants the character to go out on a date with another character | Yes / No
3 | Choosing whether the main character goes on a roller coaster with their friends | Sure, why not? / No thanks, I'm good
4 | Choosing if the character should talk to their family/friends about how they feel or should tell them they're fine | Talk about it / Tell them you're fine
5 | Choosing if the character should go to the grocery shop to buy groceries themself or order online | Go to the store / Order online
There are a lot of ways in which interactive movies can be used to gather more data about users [3, 4]. Table 1 gives an example of how different items like cereals, books, songs, laptops, and even vehicles can be featured to get an understanding of a user's likes and dislikes. These choices are just a few examples, and there can be many more choices in movies to get better insights about the user. Let us examine each choice one by one and try to understand what it means in terms of marketing and improving recommendations.

1. When the user is asked to choose the type of cereal they would like the character in the movie to consume, the user's choice depicts their preference to a certain extent. The user is most likely to choose the cereal that they have an affinity toward or have heard about. The choice can be used to identify the product that is preferred by the masses across different age groups, genders, and even locations. Companies can also sponsor the featuring of their products in an interactive movie to see which of their products are preferred by which demographic and how they compare with their competitors.

2. The user's song selection gives insights about their affinity toward a particular genre or artist. The choices made here can be used to improve music recommendations for that particular user, provided the email ID they use on the Interactive Media Streaming Service (IMSS) is the same as their email ID on any music streaming service that is in partnership with the IMSS. The MAC address of the user's device can also be used to map the IMSS user to a music streaming service and then improve the recommendations for the user of that device. Apart from improving recommendations, this information can also be used for targeted advertisements featuring that particular song or artist to grab the user's attention.

3. The user's choice here gives insights about their preferred book genre or even their preference for a certain author. This information can be used for targeted advertisements by online retail stores that are in partnership with the IMSS; they can show particular books from a similar genre in their advertisements to this user.

4. The user's choice here gives insights about their preference for a certain operating system or brand of computer. This information can be used by an online retail store for targeted advertisements, and even by the manufacturers to better understand their target demographic for future advertisement campaigns.

5. This choice gives better insights about the user's preference between watching a TV series and playing a video game. This can help improve targeted advertisements for streaming services, online gaming platforms, and even gaming-PC building websites.

All these choices pertaining to the storyline of interactive movies can also be pre-decided in partnership with a brand to help it market its products and better understand its target demographic [5, 6].
6 Psychological Aspect of Interactive Movies

Psychological questions go a long way in helping us understand how a person thinks, but these questions do not help much when asked in the form of a survey. This is largely because survey participants keep in mind that they will be judged on the basis of their answers. The case is totally different when such questions are asked through the medium of interactive movies. These questions are strategically placed in movie scenes in order to elicit a reflex psychological response from the user, and this response has a higher chance of being more precise than a written or online survey, since the user is subconsciously attached to the characters in the movie. By giving users the opportunity to choose what they want the characters to do, a rough idea of how the user thinks, what the user wants, and how they want it can be obtained. Table 2 gives examples of strategically placed psychological choices in interactive movies which give an insight into the user's thought process. These questions and situations help media service providers, like Netflix, understand how the user thinks, and they can also help improve recommendations and targeted advertisements. These are just a few examples of choices given to users, and there can be many more choices in movies that help in understanding the user better. What do these choices mean in terms of understanding the user's psychology? Let us examine each choice one by one.

1. This choice tells us the user's preference between watching a fight scene in a movie and avoiding it. It gives insights about whether the user likes to watch movies with fight scenes or not, and it might also hint at the user's stand on violence in real life. If the user likes to watch hand-to-hand combat in movies, they can be recommended movies where such fight scenes are present, and they can also be targeted with hand-to-hand combat training advertisements.

2. This choice can be used to get a better understanding of the user's preferences vis-à-vis their dating partner [7]. This information can be used by dating sites to improve that particular user's matches and to display certain types of individuals in their advertisements to attract the user.

3. This choice gives an idea of the user's affinity toward recreational and adventurous activities; such users can be targeted with advertisements for theme parks, water parks, and the like.

4. This choice tells us whether the user is open to discussing their personal life with another individual or not. It can also indicate the user's subconscious inclination toward anxiety and depression, so sites which provide therapy services can target this particular type of user with positive and welcoming advertisements.

5. This choice tells us whether the user enjoys going to the grocery shop and the experience of touching and feeling the items they want to buy, or prefers online shopping. Depending on the answer, the user can be targeted by online grocery stores and shopping sites.
These choices can help us understand every user uniquely and obtain insights that were not available before, which can further improve targeted advertisements so that they reach the users who actually find them useful. So how, under the hood, do these choices help companies like Netflix get better at recommendations? Earlier, the only way to recommend products and services to a person was by analyzing their Internet activity; interactive movies can now help us understand how the user thinks, and can thus help target the user's subconscious mind, which yields better results and more interaction with the advertisements from the user's end.
7 Future of Interactive Movies

At this time, there are only two options to choose from at each decision point in an interactive movie. The reason is not only that this keeps the experience simple on the users' end, but also that it is easier to develop a movie with fewer options at every turn: as the number of pathways in the movie decreases, the complexity of its production and the size of the movie also decrease [8]. It also saves the production team time, effort, and money. However, as more production houses start making interactive movies, they may invest more money, and the end result may be more choices to choose from, eventually progressing toward many more storylines and endings. This is a costly process, as different scenes are shot for different storylines and the budget of the whole movie increases. New content genres may be created after analyzing what different demographics of users like to watch and what they are attracted to; this way, movies that have a higher chance of being profitable will be made. Interactive movies can also help solve the new-user cold-start problem. The cold-start problem is faced by websites and applications that rely on recommendation systems to improve their services when they do not have enough information on new users to draw any inferences. If a new user has watched interactive media on an interactive media streaming service, and a website has a contract with that streaming service, the website may be able to make some inferences about the new user if they use the same email ID or MAC address with which they access the streaming service.
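A small, hypothetical sketch of this cold-start idea: the choices linked to a shared email ID or device identifier are mapped to genre weights and summed into an initial preference profile for the new user. The choice codes and the genre mapping below are invented for illustration only.

    import java.util.*;

    // Hypothetical cold-start seeding from interactive-movie choice codes.
    public class ColdStartSeedSketch {

        static final Map<String, Map<String, Double>> CHOICE_TO_GENRE = Map.of(
            "1L", Map.of("rap", 1.0),     // chose Lose Yourself
            "1D", Map.of("pop", 1.0),     // chose Delicate
            "3F", Map.of("action", 0.7)   // chose the fight scene
        );

        static Map<String, Double> seedProfile(List<String> choiceCodes) {
            Map<String, Double> profile = new HashMap<>();
            for (String code : choiceCodes) {
                CHOICE_TO_GENRE.getOrDefault(code, Map.of())
                    .forEach((genre, w) -> profile.merge(genre, w, Double::sum));
            }
            return profile;
        }

        public static void main(String[] args) {
            // Choices retrieved for the shared email ID / MAC address
            System.out.println(seedProfile(List.of("1L", "3F", "1L")));
        }
    }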
8 Privacy

Privacy as a legal concept dates back to 'The Right to Privacy', published in 1890, which became a pioneering work of traditional American law. Warren and Brandeis argue that the right to privacy is a distinct right that should protect individuals from unwarranted disclosure of details they want to keep secret in their personal lives [9].
The Internet has become a part of life, and people leave huge data footprints on major websites every day; in a big data environment, personal privacy is disclosed all the more easily [10]. Privacy is a big concern for people today, and the uses of interactive-movie data suggested in this paper might be seen as a violation of the user's privacy. This paper does suggest using the choices a user makes while watching an interactive movie to learn more about the user and their preferences. But this is no different from what YouTube, Instagram, TikTok, and other such platforms do to train their recommendation engines, or from how Google's AdSense shows users advertisements based on their browsing history. When the necessary measures are not taken during the processing and sharing of big data, which may contain sensitive and personal information, privacy disclosures and other harms are inevitable [11]. So the information retrieved from users via interactive media should be shared carefully among partner companies such that it does not put the privacy of the users at risk. The recommended and ethical way to use the information is to ask for the user's consent, stating that the information generated from the interactive media will be used to improve recommendations and create new content and might also be shared with partner companies.
9 Conclusion

Various conclusions can be drawn from this research about what the future holds for interactive media, about new ways to gather more information about users, and about what to do with that information. Interactive movies are the future, and they have the potential to help understand audiences in ways that were not possible before. This paper explores different ways in which interactive movies can be used to gain information about the user at a deeper level than is possible through surveys and browsing history. It also briefly discusses the privacy aspect of using the information a user generates while streaming interactive media.
10 Applications

There are various applications of this research on interactive movies when it comes to improving marketing and the quality of recommendations. This research shows, with examples, how different scenes in interactive movies can be used to obtain more information about the user. It also shows how the choices made by the user can help improve the accuracy of targeted advertisements and recommendations to unprecedented levels, which can in turn result in much greater response to and interaction with the advertisements from users.
References

1. Chaudhari DD, Agnihotri N, Dandwani P, Nair S, Kulange S (2016) A choice based recommendation system using WUM and clustering. In: 2016 international conference on data mining and advanced computing (SAPIENCE), Ernakulam, India, 16–18 Mar 2016
2. Veale M (@mikarv) Remember everyone quickly speculating whether Black Mirror: Bandersnatch was a data mining experiment. I used my GDPR right of access to find out more. (short thread) #Bandersnatch, 11 Feb 2019, 4:02 P.M. Tweet (Thread)
3. AjazMoharkan Z, Choudhury T, Gupta SC, Raj G (2017) Internet of Things and its applications in E-learning. In: 2017 3rd international conference on computational intelligence and communication technology (CICT), pp 1–5
4. Choudhury T, Kumar V, Nigam D, Mandal B (2016) Intelligent classification of lung & oral cancer through diverse data mining algorithms. In: Proceedings of 2016 international conference on micro-electronics and telecommunication engineering, ICMETE 2016. https://doi.org/10.1109/ICMETE.2016.24
5. Singhal A, Sarishma, Tomar R (2016) Intelligent accident management system using IoT and cloud computing. In: 2016 2nd international conference on next generation computing technologies (NGCT), pp 89–92
6. Tomar R, Prateek M, Sastry HG (2017) A novel approach to multicast in VANET using MQTT. Ada User J 38(4):231–235
7. Shah S (2019) Netflix promises more interactive shows like 'Bandersnatch'. Engadget, 12 Mar 2019 [Online]. Available: https://www.engadget.com/2019/03/12/netflix-interactive-shows-bandersnatch/
8. Schwartz DI (2019) Beyond 'Bandersnatch,' the future of interactive TV is bright. The Conversation, 27 Mar 2019 [Online]. Available: https://theconversation.com/beyond-bandersnatch-the-future-of-interactive-tv-is-bright-111037
9. Warren SD, Brandeis LD (1890) The right to privacy. Harvard Law Review, pp 193–220
10. Wang C (2018) Research on the protection of personal privacy of tourism consumers in the era of big data. In: 2018 international symposium on computer, consumer and control (IS3C), Taichung, Taiwan, 6–8 Dec 2018
11. Canbay Y, Vural Y, Sagiroglu S (2018) Privacy preserving big data publishing. In: 2018 international conference on big data, deep learning and fighting cyber terrorism (IBIGDELFT), Ankara, Turkey, 3–4 Dec 2018
Chapter 7
Efficient Load Optimization Method Using VM Migration in Cloud Environment Sarita Negi, Man Mohan Singh Rauthan, Kunwar Singh Vaisla, and Neelam Panwar
1 Introduction

Cloud computing, or Internet computing [1], emerged in its early form around 1996 and was popularized in 2006 by Amazon's Elastic Compute Cloud service. The goal of cloud computing is to make computer system resources available to cloud users as on-demand services, on a metered or non-metered basis, and these services are delivered largely without direct human intervention. In this new era of advancement in the technology world, traditional desktop applications have been migrating to the cloud computing world [2]. Before the idea of cloud computing, many computing technologies were used to provide connectivity between machines and the Internet; the main evolution of the cloud can be traced through distributed systems, grid computing, utility computing, and parallel computing. Even earlier, in the late 1960s, the client/server model came into existence; although it shares the property of centralized storage with the cloud, it did not have a user-centric focus. Client/Server computing is the "hurry up
and wait" experience, in which control resides with the mainframe or server. After a decade, the P2P model emerged, sitting between client/server computing and distributed computing; in contrast with client/server, P2P is not a centralized network. Distributed computing harnesses the unused resources of computers that are not fully utilizing their own computing power; distributed.net and SETI@home are well-known examples of linking millions of individual computers. From the working point of view, the cloud follows the architecture of grid computing, which is characterized by decentralized resource control, standardization, and QoS (latency, throughput, and reliability) [3]. Examples of grid computing are the Earth System Grid, Open Science Grid, EGEE, TeraGrid, and caBIG [4]. The cloud is a combination of grid services, utility computing, virtualization, and multiple administrative domains. The literature [4] describes cloud and grid as quite similar from the perspective of computing cost, reliability, and pay-per-use flexibility, while their services differ in terms of resource management. The immense growth of the cloud forces providers to adopt a well-organized load balancing mechanism to correct an unbalanced cloud [5]. A cloud is said to be balanced when all user tasks are properly assigned to the available Virtual Machines (VMs) under their associated Physical Machines (PMs); when the tasks assigned to VMs overload them, load imbalance may arise at the PM level. Datacenters (DCs) are the major working entities of the cloud environment and consist of PMs, VMs, and their tasks. Researchers have shown strong interest in task scheduling, VM allocation, load balancing, and migration-related work in the cloud computing field [6, 7], showing how an efficient task scheduling approach can lead to better load balancing and improved utilization of resources. Load balancing and scheduling of resources are NP-hard problems. For this reason, researchers have opted for machine learning (Artificial Neural Network (ANN)), optimization (Particle Swarm Optimization (PSO)), and soft computing (fuzzy logic and Genetic Algorithm (GA)) based approaches to solve these problems and achieve more realistic outcomes. Cloud performance is measured by metrics such as makespan, resource utilization, throughput, execution time, processing cost, completion time, degree of imbalance, load fairness, and Service Level Agreement compliance. Each metric has its own impact on cloud Quality of Service (QoS): makespan, execution time, completion time, processing cost, and degree of imbalance should be as low as possible, whereas resource utilization, throughput, and load fairness should always be as high as possible. Load balancing can be achieved by supervised and unsupervised methods [8]; the authors of [8] introduced a hybrid approach of ANN and i-Kmeans algorithms in which an ANN is implemented to calculate the optimized load of each VM and PSO is used for the task scheduling process. A TOPSIS-based PSO task scheduling algorithm has been introduced for efficient task mapping and resource utilization [9]; its authors used the mathematical multi-criteria Technique for Order Preference by Similarity to Ideal Solution to calculate the fitness value in PSO.
Performing VM migration would be a better approach. In the literature [10, 11], an innovative and interruption-free protocol model based on Named Data Networking (NDN) is introduced for continuous VM migration within DCs. The organization of the paper is as follows: background work on load balancing in the cloud environment is discussed in Sect. 2; Sect. 3 presents the proposed model; Sect. 4 elaborates the implementation of the proposed model, while Sect. 5 explains the evaluated outcomes of the proposed work; lastly, the conclusion and future scope are covered in Sect. 6.
2 Literature Review

A Honey Bee-Behavior inspired Load Balancing (HBB-LB) approach introduced an efficient method of load balancing among VMs [12]. The HBB-LB algorithm performs task migration, in which tasks from overloaded VMs are migrated to underloaded VMs. Task priority is taken into account by minimizing the waiting time in the queue of tasks removed from the overloaded VMs; the removed tasks (like honey bees) act as information updaters. HBB-LB achieved better results in terms of makespan and degree of imbalance. However, tasks are randomly assigned to VMs without considering the initial loads of the available VMs, which may worsen makespan and resource utilization.

Live-VM-migration-based real-time scheduling in a share-nothing IaaS cloud was introduced [11] to enhance load balancing. A file system named MigrateFS is used by the broker to provide synchronization and replication of virtual disks during the migration process. The authors focused on evaluating task prioritization and the cost requirements of migration; the decision on live VM migration is made by the hypervisor based on RAM and CPU performance. Due to its complexity, this method is less effective in resource utilization.

An improved weighted round-robin algorithm for non-preemptive dependent tasks was introduced to achieve load balancing [13]. The work focused on attributes such as VM capability, requested job length, and dependencies between tasks. The authors used a threshold value, load per unit capacity, and a load imbalance factor to perform task migration from overloaded to underloaded VMs; a load balancer is responsible for load balancing and task migration within the DC.

A utilization-prediction-based, energy-aware approach was introduced for energy efficiency in cloud computing [14]. The work includes a VM consolidation policy that considers both current and future resource utilization. The authors investigated resource utilization prediction using real workload platforms such as PlanetLab and a Google cluster, with a regression model used to approximate the future utilization of VMs and PMs. The work is effective in predicting resource utilization, but it increases the number of VM migrations, which adds overhead to the cloud system.

A mathematical multi-criteria approach, TOPSIS, is used to obtain an efficient task scheduling algorithm called TOPSIS-PSO [9]. The work
included TOPSIS for the evaluation of the fitness function (relative closeness) in PSO using three attributes, i.e., execution time, transmission time, and processing cost. TOPSIS-PSO works as a multi-objective algorithm for task scheduling issues; it is suitable for task scheduling but is not efficient in terms of load balancing in the cloud environment. A meta-heuristic cloud job scheduling approach using fuzzy and genetic methods, called FUGE, has also been introduced [15]. The hybridization of fuzzy logic and GA covers both load balancing and job scheduling: VM speed, VM memory, and bandwidth, together with job length, are considered when assigning jobs. The performance was evaluated on degree of imbalance, makespan, execution time, and cost. This work suits task-scheduling-related issues but not a dynamic cloud environment. The literature above has suggested many effective models, but an efficient and dynamic load balancing method is still in demand for dynamic cloud environments. To overcome the issues of previous work, this paper introduces a new load balancing model that uses a soft computing technique (fuzzy logic) for an efficient VM migration approach.
3 Proposed Model

The proposed model is designed to perform load balancing between physical machines (PMs) in a cloud environment. Cloud-based applications are modeled with the usual cloud entities, such as Datacenters (DCs), a VM Manager, Physical Machines (hosts), Virtual Machines (VMs), and tasks. The cloud comprises DCs = {DC1, DC2, …, DCq}; each DC maintains a set of PMs = {PM1, PM2, …, PMn}, and each PM hosts a number of VMs = {VM1, VM2, …, VMj}. The major VM parameters stored for each VM are image size (in MB), RAM (in MB), million instructions per second (MIPS), bandwidth, number of CPUs, and virtual machine manager (VMM). Tasks Ts = {T1, T2, …, Ti} are assigned to each VMj along with their Task_ID; task length, file size, output size, processing elements, and utilization model are the initial parameters of a task. Every DC, PM, and VM is represented by its ID in the cloud environment. The role of the VM manager is to maintain the list of PMs and VMs in the cloud. For efficient load balancing between PMs, VMs on an overloaded PM are migrated to underloaded PMs. Here, finding an appropriate underloaded PM as the final destination for a migrated VM is the major task, and a soft-computing-based fuzzy logic technique is implemented to solve it. The detailed working structure of the fuzzy logic migration (FLM) method is discussed in the following sections.
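The entity model described above can be sketched in plain Java as follows; the classes and field names simply mirror the parameters listed in the text (MIPS, RAM, bandwidth, task length, and so on), and the concrete values are placeholders rather than the experimental settings of this work.

    import java.util.*;

    // Plain-Java sketch of the cloud entities used by the proposed model.
    public class CloudModelSketch {

        record Task(int id, long lengthMi, long fileSize, long outputSize, int pes) {}

        static class Vm {
            final int id; final int mips; final int ramMb; final long bwMbps;
            final List<Task> tasks = new ArrayList<>();
            Vm(int id, int mips, int ramMb, long bwMbps) {
                this.id = id; this.mips = mips; this.ramMb = ramMb; this.bwMbps = bwMbps;
            }
        }

        static class Pm {
            final int id; final List<Vm> vms = new ArrayList<>();
            Pm(int id) { this.id = id; }
        }

        static class Datacenter {
            final int id; final List<Pm> pms = new ArrayList<>();
            Datacenter(int id) { this.id = id; }
        }

        public static void main(String[] args) {
            Datacenter dc = new Datacenter(0);
            Pm pm = new Pm(0);
            Vm vm = new Vm(0, 2400, 4096, 100_000);  // placeholder VM parameters
            vm.tasks.add(new Task(0, 20_000, 500, 500, 1));
            pm.vms.add(vm);
            dc.pms.add(pm);
            System.out.println("VMs on PM 0: " + dc.pms.get(0).vms.size());
        }
    }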
3.1 VM Migration

In the proposed model, the cloud is assumed to be a set of PMs, each hosting a set of VMs with assigned tasks. Each VM uses its resources and initialized parameters for the execution of tasks, and the PMs distribute resources to the VMs according to their computational load. Studies show that it is difficult to maintain PM-level load balancing in the cloud; such issues are resolved by effective migration schemes. Migration techniques offer the potential to achieve both VM-level and PM-level load balancing: at the VM level, tasks are transferred from a source VM to a destination VM, whereas at the PM level, VMs are migrated from an overloaded PM to an underloaded PM. In this research, PM-level load balancing is addressed using VM migration. Different migration schemes can be used, such as live migration, non-live migration, pre-copy, and post-copy. The live (or hot) migration scheme migrates a VM from the overloaded PM to an underloaded PM, including pre-migration and post-migration brownout states; all the state of the VM is moved from the source PM to the destination PM, which avoids long delays during the migration. Non-live (or cold) VM migration pauses the execution of tasks on the VM at the source PM; once the VM state has been sent to the destination, the execution of tasks is resumed. For this reason, non-live migrations are slower than live migrations. Figure 1a, b represent the working structure of live and non-live VM migration, respectively. VM migration transfers memory in three phases: push, stop-and-copy, and pull. In the push phase, memory pages are pushed from the source to the destination PM while the VM keeps running on the source. In the stop-and-copy phase, the VM at the source PM is paused, the remaining memory pages are copied to the destination PM, and the new VM is initialized. In the pull phase, the new VM at the destination PM starts executing, and any missing memory page is pulled from the VM at the source PM on demand. Through these schemes, VM migration can be fully achieved, which helps to balance load among PMs. Imbalanced loads on PMs may result in an unbalanced cloud and underutilization of resources. To overcome such issues, the load on each PM must be monitored so that an overloaded PM can migrate its VMs to an underloaded PM; the VM manager is responsible for providing the information on each PM to the DC. The load of each VM is calculated as the ratio of the number of tasks on VMj to the service rate of the jth VM [12],

    L(VMj) = T(VMj) / SR(VMj)    (1)

where T(VMj) is the number of tasks on the jth VM and SR(VMj) is the service rate of VMj.
Fig. 1 a Live VM migration. b Non-live VM migration
Load on a PM is obtained from the loads of all VMs that use the resources of that PM and is expressed as

    PMn(Load) = Σj L(VMj)    (2)

where the load of the nth PM is the summation of the loads of all VMs hosted on that PM. To perform VM migration, the VM manager takes its decision on the basis of the current load of the PMs. If the load on a PM increases and it becomes overloaded, then a heavily loaded VM on that PM can be migrated to an optimal PM that is self-sufficient to run the VM using its own resources. The VM to migrate from the overloaded PM is selected according to its migration time, the total time required to switch the VM from one machine to another, which must be minimized and is calculated as

    Migration Time(VMj) = RAM of VMj / Bandwidth of VMj    (3)

The VM with the minimum migration time is selected for migration:

    Selected VM = minimum over j of Migration Time(VMj)    (4)
The Selected VM for migration is represented as migrated VM (M(VM)). The VM migration is said to be efficient if the selected destination PM is optimal with respect to its available resources.
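For clarity, the following minimal Java sketch applies Eqs. (1)–(4): it computes the VM and PM loads and selects, among the VMs of an overloaded PM, the one with the minimum migration time. The task counts, service rates, RAM, and bandwidth values are placeholders, not experimental data.

    import java.util.*;

    // Sketch of Eqs. (1)-(4): VM load, PM load, migration time, and selection of
    // the VM with minimum migration time on an overloaded PM.
    public class MigrationSelectionSketch {

        record VmStat(int id, int tasks, double serviceRate, double ramMb, double bwMbps) {
            double load()          { return tasks / serviceRate; }   // Eq. (1)
            double migrationTime() { return ramMb / bwMbps; }        // Eq. (3)
        }

        static double pmLoad(List<VmStat> vms) {                     // Eq. (2)
            return vms.stream().mapToDouble(VmStat::load).sum();
        }

        static VmStat selectForMigration(List<VmStat> vms) {         // Eq. (4)
            return vms.stream()
                      .min(Comparator.comparingDouble(VmStat::migrationTime))
                      .orElseThrow();
        }

        public static void main(String[] args) {
            List<VmStat> vms = List.of(
                new VmStat(0, 12, 4.0, 4096, 1000),
                new VmStat(1,  8, 2.0, 2048, 1000),
                new VmStat(2, 20, 5.0, 1024,  500));
            System.out.println("PM load = " + pmLoad(vms));
            System.out.println("Migrate VM " + selectForMigration(vms).id());
        }
    }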
3.2 Fuzzy Logic Based PM Selection

Selecting the optimal destination PM is one of the major tasks in VM-migration-based load balancing. The availability of resources on the destination PM should be adequate to share among the migrated VMs, so the selection of the optimal PM must follow a well-defined process. To achieve this, a soft-computing-based Fuzzy Logic (FL) technique is implemented. The FL method uses a set of partial-truth input elements to which IF-THEN fuzzy rules are applied to obtain fuzzy implications (inference) and build the fuzzy system. A fuzzy set is characterized by its fuzzy membership function (MF), which can be defined over either a discrete or a continuous universe of discourse. Triangular, trapezoidal, Gaussian, generalized bell (Cauchy), and sigmoidal functions are some of the well-known membership functions in fuzzy logic. Fuzzy implication is used with the set of linguistic variables, universes of discourse, and if-then rules to put the fuzzy elements into an appropriate form. The output of the inference mechanism is then defuzzified, which means the fuzzy elements are converted into a crisp output. There are
four main defuzzification methods: Lambda-cut, weighted average, maxima, and centroid. Even with these methods, a general FL system has limited potential in situations where a large amount of uncertainty occurs. Type-1 fuzzy logic systems (T1 FLS) always use fixed MFs, so representing linguistic knowledge in the rule base of a T1 FLS is not possible. Type-2 fuzzy logic systems (T2 FLS) were introduced by Zadeh to provide more degrees of freedom for systems that involve a lot of uncertainty [16]. T2 FLS have the potential to provide better performance than T1 FLS and general FLS.
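To illustrate the membership-function and defuzzification mechanism in isolation, here is a minimal type-1 sketch with a triangular MF and centroid (center-of-area) defuzzification over a sampled output universe. It only illustrates the general idea; the system proposed in this paper is an interval type-2 FLS built with the Juzzy toolkit, as described in Sect. 4, and the output set below is a made-up example.

    // Minimal type-1 fuzzy sketch: triangular MF and centroid defuzzification.
    public class FuzzySketch {

        // Triangular MF with feet at a and c and peak at b.
        static double triangular(double x, double a, double b, double c) {
            if (x <= a || x >= c) return 0.0;
            return x <= b ? (x - a) / (b - a) : (c - x) / (c - b);
        }

        // Centroid defuzzification: sum of x*mu(x) divided by sum of mu(x),
        // approximated by sampling the output universe [lo, hi].
        static double centroid(java.util.function.DoubleUnaryOperator mu,
                               double lo, double hi, int samples) {
            double num = 0, den = 0, step = (hi - lo) / samples;
            for (int i = 0; i <= samples; i++) {
                double x = lo + i * step;
                double m = mu.applyAsDouble(x);
                num += x * m;
                den += m;
            }
            return den == 0 ? 0 : num / den;
        }

        public static void main(String[] args) {
            // Hypothetical "desirability of PM" output set on [0, 1], clipped at 0.6
            java.util.function.DoubleUnaryOperator out =
                x -> Math.min(0.6, triangular(x, 0.5, 1.0, 1.5));
            System.out.printf("crisp output = %.3f%n", centroid(out, 0.0, 1.0, 1000));
        }
    }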
3.2.1 Type2 Fuzzy Logic (T2FL)
Type-2 fuzzy logic is more flexible with respect to uncertainty and degrees of freedom. It can improve the characteristics of an FLS by enlarging the Footprint of Uncertainty (FoU) of the MFs. This work uses three decision variables as crisp inputs to the fuzzifier, namely the available CPU, the available memory, and the current load of each PM, in order to select the optimal destination PM for VM migration. Low, medium, and high are the linguistic terms, which are converted into fuzzy weights by the membership functions. A fuzzy rule is used to identify the optimal destination PM; for example, if available memory is high and available CPU is high and current load is low, then the PM is highly desirable as the destination. In these rules, available memory, available CPU, current load, and optimal destination PM are the universes of discourse for the low, medium, and high linguistic terms. The T2FL method performs type reduction and defuzzification over the fuzzy output set: the Karnik–Mendel (KM) algorithm [17] is used to compute the centroid type reduction, and the centroid-of-sums (CoS) technique is used for defuzzification. The steps for selecting the optimal destination PM are given in Algorithm 1.

Algorithm 1: Selection of destination PM using T2FL

  Compute the load on each PM using Eq. (2)
  If any PM is overloaded:
    for all VMs on the overloaded PM:
      compute the migration time MT(VM) using Eq. (3)
      select the VM that satisfies Eq. (4)
    Initialize the T2FL method
      Crisp inputs: available memory, available CPU, current load
      Output: optimum PM
    Fuzzify the crisp inputs (fuzzy input set) for each PM:
      Avail_Mem_PM = broker.getVmList().get(i).getHost().getStorage();
      Avail_CPU_PM = broker.getVmList().get(i).getMips();
      Load_PM = broker.getVmList().get(i).getCloudletScheduler().runningCloudlets();
    Set up the membership functions making up the T2F sets for each input and output
    Set up the antecedents and consequents to associate the inputs
    Set up the rule base and add rules
    Apply the inference rules to the fuzzified inputs
    Optimal_PM_Rule(Avail_Mem_PM, Avail_CPU_PM, Load_PM) {
        // Crisp rule table: inputs are coded linguistic levels (0 = low, 2 = high,
        // following the rule stated in the text); the returned value encodes how
        // desirable the PM is as a destination (2 = highly desirable).
        int Optimum_PM;
        if      (Avail_Mem_PM == 0 && Avail_CPU_PM == 0 && Load_PM == 0) Optimum_PM = 0;
        else if (Avail_Mem_PM == 0 && Avail_CPU_PM == 0 && Load_PM == 2) Optimum_PM = 0;
        else if (Avail_Mem_PM == 0 && Avail_CPU_PM == 2 && Load_PM == 0) Optimum_PM = 1;
        else if (Avail_Mem_PM == 0 && Avail_CPU_PM == 2 && Load_PM == 2) Optimum_PM = 0;
        else if (Avail_Mem_PM == 2 && Avail_CPU_PM == 0 && Load_PM == 0) Optimum_PM = 1;
        else if (Avail_Mem_PM == 2 && Avail_CPU_PM == 0 && Load_PM == 2) Optimum_PM = 1;
        else if (Avail_Mem_PM == 2 && Avail_CPU_PM == 2 && Load_PM == 0) Optimum_PM = 2;
        else                                                             Optimum_PM = 1;
        return Optimum_PM;
    }

    Combine the rule outputs and process the resulting IT2 fuzzy sets from the inference step
    Perform defuzzification using the centroid method
    Obtain the non-fuzzy crisp output values (desirability of each candidate destination PM)
    end for
  until an underloaded PM is found or the maximum number of iterations is reached
  end for
  Select the optimal destination PM
  Move the selected VM to the optimal destination PM
4 Implementation

The proposed VM Migration based Load Balancing using Fuzzy Logic System (VMMLB-FLS) algorithm has been implemented for efficient load balancing among PMs. The performance evaluation of the proposed VMMLB-FLS method is carried out on the CloudSim simulator [18], which supports modeling and simulation of cloud computing environments and provides simulation models of the broker, datacenters, hosts, VMs, tasks, and scheduling and utilization models. The cloud simulation parameters are set as shown in Table 1. CloudSim is executed on a 2.20 GHz processor with 16 GB of memory. The Java-based fuzzy package Juzzy [19] has been integrated into CloudSim for the T2FLS. The VMMLB-FLS algorithm is implemented to select the optimal destination PM for VM migration. Figure 2a–d show the upper and lower membership functions of each crisp input and of the output, using the vertical-slice representation. A triangular MF has been used for the input crisp sets, whereas a Gaussian MF has been used for the output. A type-2 set can be represented through vertical-slice, horizontal-slice, wavy-slice, or zSlices-based representations; this experiment uses the vertical-slice representation for its simplicity. The fuzzy set is fed into the inference model, which combines the fuzzy sets with the rules given in the rule base. Based on these rules, the T2FL produces a single output for the multiple inputs.
Table 1 Simulation parameters for cloud environment

Parameter | Value
Physical machine
  Number of PMs | 2–6
  Number of processing units in one PM | 4
Virtual machine
  MIPS | 9600
  Storage capacity | 11 TB
  RAM | 4 GB
  Scheduling interval | 30 ms
  Monitoring interval | 180 ms
  Number of VMs in each PM | 10–50
  MIPS | 2400
  Number of processing units | 4
Task
  Number of tasks | 50–2000
  Maximum task length | 20,000
  Task size | 500
  MIPS required | 10,000
  Average RAM | 512 MB
  Average bandwidth | 100,000 Mbps
Fig. 2 MF for input given in (a), (b), (c) and output in (d)
Hence, the PM with high available memory, high available CPU, and low load is selected as the optimal destination for migration. The VM migration process thus results in load balancing among PMs, and this load sharing gives the PMs the flexibility to utilize their resources efficiently.
5 Result and Discussion

The obtained results and their comparative performance are analyzed in this section. Experimental observations for the proposed VMMLB-FLS are compared with existing cloud-environment algorithms: Round Robin (RR) and the Cloudlet Migration based enhanced First Come First Serve (CM-eFCFS) algorithm [7]. The cloud performance metrics used in the evaluation are the following.

Transmission Time (TT): the ratio of the size of a task to the VM bandwidth,

    TT = Size of task / Bandwidth of VM    (5)

MakeSpan (MS): minimizing MS is the most effective optimization performance objective. MS is the difference between the finishing time of the last task and the starting time of the first task on a VM, and minimizing it yields efficient and faster execution of tasks:

    MS = Final time of last task − Initial time of first task    (6)

Execution Time (ET): the time during which a task is being executed; it must be minimized for efficient optimization. ET is the ratio of the length of the ith task to the MIPS of the jth VM,

    ET = Length of ith task / MIPS of jth VM    (7)

Number of Migrations (NoM): this metric counts the migrations carried out in the system. It is important to keep NoM low in the cloud environment: since VM migration consumes energy, bandwidth, and CPU, a large NoM increases energy consumption. VM migration tends to be used frequently in unbalanced cloud systems, so this metric plays a vital role in evaluating load balancing.
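A short sketch of how these metrics can be computed from simulated task records is given below; the task lengths, sizes, and timestamps are placeholder values, not the results reported in this section.

    import java.util.*;

    // Sketch of the evaluation metrics (Eqs. 5-7) computed from simulated task records.
    public class MetricsSketch {

        record TaskRun(long lengthMi, long sizeBytes, double start, double finish) {}

        static double transmissionTime(TaskRun t, double vmBandwidth) {  // Eq. (5)
            return t.sizeBytes() / vmBandwidth;
        }

        static double executionTime(TaskRun t, double vmMips) {          // Eq. (7)
            return t.lengthMi() / vmMips;
        }

        static double makespan(List<TaskRun> runs) {                     // Eq. (6)
            double first = runs.stream().mapToDouble(TaskRun::start).min().orElse(0);
            double last  = runs.stream().mapToDouble(TaskRun::finish).max().orElse(0);
            return last - first;
        }

        public static void main(String[] args) {
            List<TaskRun> runs = List.of(
                new TaskRun(20_000, 500, 0.0, 8.3),
                new TaskRun(15_000, 500, 0.5, 6.9));
            System.out.println("makespan = " + makespan(runs));
            System.out.println("TT(task0) = " + transmissionTime(runs.get(0), 100_000));
            System.out.println("ET(task0) = " + executionTime(runs.get(0), 2400));
        }
    }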
94
S. Negi et al.
Fig. 3 Comparative analysis on Transmission Time
5.1 Comparative Analysis

5.1.1 Analysis of Transmission Time
This metric evaluates the total transmission time, which depends on the size of each task and the bandwidth of the jth VM as given in Eq. (5); TT should be low for an efficient cloud system. Figure 3 presents the comparative analysis of the proposed VMMLB-FLS against the RR and CM-eFCFS algorithms for five VMs with ten to thirty tasks assigned. The analysis shows that TT increases with the number of tasks. From the figure, the RR algorithm suffers from higher TT than CM-eFCFS and VMMLB-FLS, because RR does not effectively achieve load balancing; it simply schedules the tasks onto VMs. The VMMLB-FLS algorithm maintained 63.38% and 31.92% lower TT than the RR and CM-eFCFS methods, respectively. The load balancing process ensures proper assignment of tasks to underloaded PMs and thereby decreases TT, so the VMMLB-FLS method minimizes transmission time more effectively.
5.1.2 Analysis of MakeSpan
MS captures the overall completion time of the tasks and is computed from the task start and finish times as given in Eq. (6); the lower the MS, the better the cloud system. The results obtained for the existing and proposed methods with respect to MS are compared in Fig. 4. For every set of tasks, VMMLB-FLS obtained better values than the RR and CM-eFCFS algorithms: the proposed method achieved 34.83% and 42.52% lower MS than RR and CM-eFCFS, respectively. This analysis shows that the proposed method performs well in the cloud system.
Fig. 4 Calculated MakeSpan for different algorithms
Fig. 5 Calculated Execution Time for different algorithms
5.1.3 Analysis of Execution Time
ET is computed from the task length and VM MIPS as given in Eq. (7); a cloud system must keep ET low to achieve better performance. The obtained ET results are compared with the existing methods in Fig. 5, which shows that VMMLB-FLS obtained 21.33% and 40.84% lower ET than the RR and CM-eFCFS algorithms, respectively, for different sets of tasks. Hence, the proposed method performs better in the cloud system with respect to execution time.
5.1.4 Analysis of Number of Migrations
Load balancing among PMs is carried out by VM migrations, which consume energy as well as time. Hence it is necessary to minimize the number of migrations in the system. A comparative analysis of the number of VM migrations
Fig. 6 NoM for different algorithms
of the VMMLB-FLS method with the existing HBB-LB method is depicted in Fig. 6. In the proposed method only one VM migration is performed, whereas the HBB-LB method performs five VM migrations. Since the VMs are balanced in the VMMLB-FLS method, fewer VM migrations are required, leading to less PM overloading.
6 Conclusion

Migration-based load balancing is one of the key solutions for the cloud computing environment, as it enables an organized distribution of the workload. In this research, a novel load balancing model is introduced that uses a soft-computing-based type-2 fuzzy logic technique (T2FL) for the selection of the optimal destination PM for VM migration. The VM migration based load balancing using fuzzy logic system (VMMLB-FLS) method starts from three crisp inputs, i.e., available PM memory, available PM CPU, and current PM load. These linguistic variables are used to set the fuzzy inference rules; hence, a destination PM is optimal for VM migration only if its available memory is high, its available CPU is high, and its current load is low. The introduced model, which migrates the selected VM (the one with minimum migration time) to the optimal destination PM, shows better results than the existing Round Robin and CM-eFCFS algorithms, and the number of VM migrations in the proposed method is comparatively lower than in the HBB-LB method. The VMMLB-FLS method has achieved remarkable results, which motivates us to extend the work with respect to load fairness and the energy consumption of the cloud.
References

1. Sadiku MNO, Musa SM, Momoh OD (2014) Cloud computing: opportunities and challenges. IEEE Potentials 33(1):34–36
2. Mezmaz M, Melab N, Kessaci Y, Lee YC, Talbi EG, Zomaya AY, Tuyttens D (2011) A parallel bi-objective hybrid metaheuristic for energy-aware scheduling for cloud computing systems. J Parallel Distrib Comput 71(11):1497–1508
3. Weinhardt C, Anandasivam WA, Blau B, Borissov N, Meinl T, Michalk WW, Stößer J (2009) Cloud computing: a classification, business models, and research directions. Bus Inf Syst Eng 5:391–399
4. Ahson SA, Ilyas M (2011) Cloud computing and software services: theory and techniques. CRC Press, Taylor and Francis
5. Milani AS, Navimipour NJ (2016) Load balancing mechanisms and techniques in the cloud environments: systematic literature review and future trends. J Netw Comput Appl 71:86–98
6. Zhan Z-H, Liu X-F, Gong Y-J, Zhang J (2015) Cloud computing resource scheduling and a survey of its evolutionary approaches. ACM Comput Surv 47(4):1–33
7. Jeyakrishnan V, Sengottuvelan P (2017) A hybrid strategy for resource allocation and load balancing in virtualized data centers using BSO algorithms. Wirel Pers Commun 94(4):2363–2375
8. Negi S, Panwar N, Vaisla KS, Rauthan MMS (2020) Artificial neural network based load balancing in cloud environment. In: Advances in data and information sciences. Lecture notes in networks and systems, vol 94, pp 203–215
9. Panwar N, Negi S, Rauthan MMS, Vaisla KS (2019) TOPSIS–PSO inspired non-preemptive tasks scheduling algorithm in cloud environment. Cluster Comput 4:1–18
10. Xie R, Wen Y, Jia X, Xie H (2015) Supporting seamless virtual machine migration via named data networking in cloud data center. IEEE Trans Parallel Distrib Syst 26(12):3485–3497
11. Tsakalozos K, Verroios V, Roussopoulos M, Delis A (2017) Live VM migration under time-constraints in share-nothing IaaS-clouds. IEEE Trans Parallel Distrib Syst 28(8):2285–2298
12. Dhinesh Babu LD, Venkata Krishna P (2013) Honey bee behavior inspired load balancing of tasks in cloud computing environments. Appl Soft Comput 13:2292–2303
13. Devi DC, Uthariaraj VR (2016) Load balancing in cloud computing environment using improved weighted round robin algorithm for nonpreemptive dependent tasks. Sci World J 2016:1–14
14. Farahnakian F, Pahikkala T, Liljeberg P, Plosila J, TrungHieu N, Tenhunen H (2016) Energy-aware VM consolidation in cloud data centers using utilization prediction model. IEEE Trans Cloud Comput, 99
15. Shojafar M, Javanmardi S, Abolfazli S, Cordeschi N (2015) FUGE: a joint meta-heuristic approach to cloud job scheduling algorithm using fuzzy theory and a genetic method. Clust Comput 18(2):829–844
16. Mendel JM, John RI, Liu F (2006) Interval type-2 fuzzy logic systems made simple. IEEE Trans Fuzzy Syst 14(6):808–821
17. Mendel J (2001) Uncertain rule-based fuzzy logic systems: introduction and new directions. Prentice Hall, Upper Saddle River, NJ
18. Buyya R, Ranjan R, Calheiros RN (2009) Modeling and simulation of scalable cloud computing environments and the CloudSim toolkit: challenges and opportunities. High Perform Comput Simul, 1–11
19. Wagner C (2013) Juzzy: a Java based toolkit for type-2 fuzzy logic. In: IEEE symposium on advances in type-2 fuzzy logic systems (T2FUZZ)
Chapter 8
Analysis of Search Space in the Domain of Swarm Intelligence Vaishali P. Patel, Manoj Kumar Rawat, and Amit S. Patel
1 Introduction

Bio-inspired algorithms, which replicate the evolution and foraging patterns of the different living entities that exist in the world, are broadly classified into two sub-fields: evolutionary and swarm-based algorithms. Evolutionary algorithms are derived from the way species survive in nature through population growth, progress, companionship, mate selection, and breeding; genetic algorithms, differential evolution, and evolution strategies are a few among them. Swarm algorithms are inspired by foraging processes that exhibit the social and cognitive behavior and the decentralized, self-organized patterns of swarms. Particle swarm optimization, the artificial bee colony algorithm, the glowworm swarm algorithm, the firefly algorithm, the cuckoo search algorithm, the bat algorithm, the gray wolf optimizer, and spider monkey optimization are algorithms that follow the swarm approach.
2 Complexity of Search Space in Swarm Intelligence

When meta-heuristic algorithms are applied, they are surrounded by a huge amount of data, which may lie in a neighboring or in a far, unknown region of the search space, and both must be analyzed effectively to reach the optimum solution. These two activities are known as exploitation and exploration, respectively. A higher rate of exploitation may make the process converge faster but yield a premature solution and miss the global optimum; a higher rate of exploration may slow down the process or fail to reach the true solution. Therefore, it is important to find techniques that balance the local and global search processes.
3 Literature Analysis

After studying many research articles, in this work we produce a comprehensive review of search-space analysis in swarm intelligence. Figure 1 shows the general strategies for balancing exploration and exploitation: this is done either by structural modification or by information exchange. Information exchange is further divided according to which strategy is applied: population initialization, population update, or population control. The algorithm structure can be modified by updating the governing equation, by hybridization with a new algorithm, or with a new topology.
3.1 Information Sharing

A swarm population is dynamic in nature as it searches for better food sources, mates, or prey, or for mutual communication. During this behavior, the individuals share information locally or globally.
Fig. 1 Graphical depiction of sorting process for search strategies
Table 1 Methods to generate initial population

Authors/years | Algorithm | Method to generate initial population
Guha [1] | GWO | Quasi opposition-based learning theory
Kumar [2] | ABC | K-means algorithm
Moradi [3] | ABC | Logistic maps
Tian [4] | PSO | Logistic maps
Chun-Feng [5] | ABC | Good point set theory
Heidari [6] | GWO | Oppositional-based learning
Zhou [7] | GWO | Differential evolution algorithm
AliIbrahim [8] | GWO | Chaotic logistic map and opposition-based learning
Liu [9] | GWO | Inverse parabolic spread distribution
Table 2 Attribute

Year/authors | Algorithm | Attribute
Tran [10] | ABC | Use current, global and random property of data
Gan [11] | BA | Local search will be disturbed to find the global best
Cai [12] | GWO | Random local search and random global search
Zhou [13] | MO | Cooperation process
Tu [14] | GWO | Global best around the current best solution and cooperative strategy
Kumar [15] | GWO | Each gray wolf learns from the movement of the sun
Wen [16] | GWO | Personal historical best position and the global best position
Saad [17] | ABC | Use previous knowledge gained by the predecessor
Banati [18] | BA | Best neighbor
3.1.1 Population Generation
Table 1 presents the different mechanisms proposed in the literature for generating a better initial population.
3.1.2 Population Update
Population Attribute: the best or worst information available from personal, neighbor, historical, or newly generated data is shared with the other candidates to find the optimum solution, as shown in Table 2.
Population Grouping: the grouping criteria are shown in Table 3.
Table 3 Grouping criteria

Year/authors | Algorithm | Grouping criteria
Saad [17] | CB-ABC | Best solution from history and self-adaptive information
Cui [19] | ABC | Convergence and diversion population
Yao [20] | IDABC | Based on value of fitness function
Sharma [21] | Ageist spider MO | Levels of ability to interact and to track changes in the environment
Table 4 Selection scheme

Year/authors | Algorithm | Selection scheme
Babalik [22] | ABC | Utilize objective values instead of fitness values in the greedy selection process
Zheng [23] | ABC | Greedy selection is replaced with Deb's constrained method
Biswas [24] | ABC | Greedy scheme of positional perturbation
Li [25] | ACO | Node random selection mechanism
Al-Betar [26] | BA | Six selection mechanisms: global best, proportional, exponential, random, linear and tournament rank
Heidari [6] | GWO | Greedy selection mechanisms
Awadallah [27] | ABC | Four selection schemes: global best, exponential, tournament and linear rank
Wu [28] | GSO | Greedy selection mechanisms
Kamoona [29] | CS | Greedy selection mechanisms
Selection Scheme: Table 4 summarizes the different selection schemes proposed in the literature.
EA Operators: Table 5 gives a brief review of the EA operators used in bio-inspired algorithms.
3.1.3 Population Control
In this section, we list some of the parameter strategies used to control the search process, as shown in Table 6.
Table 5 EA operators
Year/authors | Algorithm | EA operators
Wang [30] | CS | Probabilistic mutation strategy
Tian [31] | PSO | Gaussian mutation strategies
Liu [32] | PSO | Gaussian chaotic mutation operators
Yan [33] | ABC | Crossover operator
Sharma [34] | PSO | Mutation operator
Xia [35] | PSO | Probabilistic mutation operator applied on historical best position
Sharifipour [36] | ACO | Gaussian mutation with (1 + 1) ES algorithm
Eskandari [37] | ACO | Mutation of the global best and personal best of each solution
Tian [38] | FA | Adaptive mutation
Dash [39] | CS | Mutation operator of differential evolution technique
Cao [40] | ACO | Roulette wheel algorithm of genetic algorithm
Table 6 Control strategy
Year/authors | Algorithm | Control strategies
Babalik [22] | ABC | Modification rate: to change more than one parameter during update phase
Xie [41] | FA | Attractiveness coefficient is replaced with a randomized control matrix
Long [42] | GWO | Nonlinear control parameter strategy
Akhand [43] | SMO | Swap sequence and swap operator-based operations
Wen [16] | GWO | Nonlinear adjustment strategy
Lin [44] | PSO | The comprehensive learning probability is updated dynamically by quality of solution
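Several of the strategies in Table 6 replace a constant or linearly decreasing parameter with a nonlinear schedule (e.g., the nonlinear control parameter strategies for GWO in Long [42] and Wen [16]). The sketch below contrasts the standard linear decay of the GWO parameter a (from 2 to 0) with a generic nonlinear decay; the exponent mu is an illustrative assumption and not the exact formula used in those papers.

```python
def linear_a(t, t_max, a0=2.0):
    """Standard GWO schedule: a decreases linearly from a0 to 0."""
    return a0 * (1.0 - t / t_max)

def nonlinear_a(t, t_max, a0=2.0, mu=2.0):
    """A generic nonlinear schedule: for mu > 1 the decay is slower early
    (more exploration) and faster late (more exploitation)."""
    return a0 * (1.0 - (t / t_max) ** mu)

t_max = 100
for t in (0, 25, 50, 75, 100):
    print(t, round(linear_a(t, t_max), 3), round(nonlinear_a(t, t_max), 3))
```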
3.2 Change in Structure
3.2.1 Modified Governing Equation
In this section, we introduce the work done on modifying governing equations, as shown in Table 7.
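As one example of the kind of modification collected in Table 7, Zhu [45] biases the standard ABC candidate-generation equation toward the global best solution, an idea borrowed from PSO. The Python sketch below follows the commonly cited gbest-guided formulation v_ij = x_ij + phi*(x_ij - x_kj) + psi*(gbest_j - x_ij); the weight bounds are assumptions and should be checked against the original paper before reuse.

```python
import numpy as np

def gbest_guided_update(x_i, x_k, gbest, j, C=1.5):
    """Gbest-guided ABC candidate generation (after Zhu [45]): only dimension j
    of x_i is perturbed, pushed away from a random neighbour x_k and pulled
    toward the global best solution gbest."""
    phi = np.random.uniform(-1.0, 1.0)   # standard ABC perturbation weight
    psi = np.random.uniform(0.0, C)      # attraction toward the global best
    v = x_i.copy()
    v[j] = x_i[j] + phi * (x_i[j] - x_k[j]) + psi * (gbest[j] - x_i[j])
    return v
```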
3.2.2 Hybridization
Hybridization with a New Algorithm: Tables 8 and 9 show hybridization with a new algorithm.
Table 7 Modified governing equation
Year/authors | Algorithm | Equation modification
Zhu [45] | PSO | ABC equation is modified by global best from PSO
Sharma [46] | ABC | ABC equation is modified by the local and global best from PSO
Imanian [47] | ABC | New solution derived from global, local best and velocity equation of PSO
Alqattan [48] | ABC | Velocity equation of PSO is embedded with onlooker phase
Chen [49] | ABC | ABC updated with fireworks explosion search
Ramli [50] | BA | Velocity equation is enhanced by inertia weight which is a function of velocity and speed
Wu [28] | GSO | ABC and PSO are combined to develop a new movement formula
Yelghi [51] | FA | Firefly attractiveness is replaced with tidal force formula
Singh [52] | MO | Local leader phase is modified by the Nelder–Mead method
Table 8 Hybridization
Year/authors | Algorithm | Hybridization with new algorithm
El-Abd [53] | ABC-SPSO | Component-based: PSO is combined with ABC component to enhance personal best
Baktash [54] | PSABC | Fitness value of ABC is optimized by PSO
Goel [55] | ACO and FA | The initial population of firefly is obtained through ant
Table 9 Hybridization
Year/authors | Algorithm | Hybridization with new algorithm
Lebedev [56] | Ant and Bee | Swap-based: ants and bees exchange their function
Imane [57] | BA | Tabu search is used to select new solution in bat algorithm
Saraswathi [58] | CS-BA | Local output of cuckoo search algorithm is assigned to bat algorithm to find global optimum solution
Murugan [59] | BA-ABC | Recombination method: first phase is BA, the second is onlooker bee and the last is scout bee phase
Imitating a Biological Organism or Physical System: New solutions are generated by imitating biological organisms or physical systems that exist on the earth. This may include the behavior, interaction or survival phenomena of living entities, or any fixed rule governing a physical system. In Table 10, we present recent work from the literature.
Table 10 Biological organism or physical system
Year/authors | Algorithm | Biological organism or physical system
Mishra [60] | PSO | Basic human qualities: maturity, leader, awareness, follower's relationship and leadership are combined with PSO
Zhou [61] | Symbiotic organism search algorithm | Biological interaction: mutualism, commensalism and parasitism phase
Table 11 New topology
Year/authors | Algorithm | New topology
Lin [62] | PSO | Ring topology
Ji [63] | ABC | Topology of scale-free network
Lu [64] | GWO | Cellular automaton

3.2.3 New Topology
Table 11 presents the different topologies used in swarm intelligence for effective search results.
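To make the topology idea concrete, the sketch below shows how a ring topology (as used for PSO in Lin [62]) restricts the "best" information each particle sees to its immediate neighbours instead of the whole swarm. The neighbourhood size of one particle on each side is an assumption for illustration only.

```python
def ring_neighbourhood_best(fitnesses):
    """For each particle i, return the index of the best particle among
    {i-1, i, i+1} on a ring (minimization assumed)."""
    n = len(fitnesses)
    best = []
    for i in range(n):
        neigh = [(i - 1) % n, i, (i + 1) % n]
        best.append(min(neigh, key=lambda k: fitnesses[k]))
    return best

# Particle 0's neighbourhood best is chosen only from particles n-1, 0 and 1.
print(ring_neighbourhood_best([3.2, 0.5, 2.7, 1.1, 4.0]))
```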
4 Current Problems and Future Opportunities In our study, we find that the balance between exploration and exploitation plays a major role in the success of any swarm-inspired algorithm. Many researchers have proposed good solutions, but challenging issues remain. Based on this, we summarize some open problems and future directions.
Population Generation: In the literature, most of the methods are centered on chaotic or opposition-based learning. In future, new methods should be developed based on new statistical properties, nonlinear distributions, numerical methods or simulation techniques that are compatible with the surrounding environment.
Population Update: In our findings, the population is updated based on data attributes, grouping, selection schemes and EA operators. In future, different attributes should be identified based on the inherent properties of the data. Grouping may be combined with clustering for a clearer understanding of the data. Selection schemes should be adaptive or automatic and data dependent, and advanced EA operators can be applied.
Population Control: It is found that most of the strategies are based on constant parameter updates; in future, automatic parameter tuning may be a good domain to emerge.
Change in Structure: It is found that, in the modification of governing equations, most formulations are based on formulas borrowed from other algorithms. It is recommended that, in future, other novel parameters, formulas or steps from biological systems, physical laws, chemical processes, mathematical rules or any real-life application should be applied.
Hybridization: It is proven that hybridization of different algorithms opens new opportunities. It is recommended that recently developed algorithms in swarm intelligence should be studied and, based on their properties, combined with other compatible algorithms.
New Topology: In the literature, few researchers have tried to develop the algorithm structure with a new topology. This is still an active research area. New topological structures from other domains should be investigated and applied to swarm algorithms.
References 1. Guha D, Roy PK, Banerjee S (2016) Load frequency control of large scale power system using quasi-oppositional grey wolf optimization algorithm. Eng Sci Technol an Int J 19:1693–1713 2. Kumar Y, Sahoo G (2017) A two-step artificial bee colony algorithm for clustering. Neural Comput Appl 28:537–551 3. Moradi P, Imanian N, Qader NN, Jalili M (2018) Improving exploration property of velocitybased artificial bee colony algorithm using chaotic systems. Inf Sci (NY) 465:130–143 4. Tian D, Shi Z (2018) MPSO: modified particle swarm optimization and its applications. Swarm Evol Comput 41:49–68 5. Chun-Feng W, Kui L, Pei-Ping S (2014) Hybrid artificial bee colony algorithm and particle swarm search for global optimization. Math Probl Eng 2014:832949 6. Heidari AA, Abbaspour RA, Chen H (2019) Efficient boosted grey wolf optimizers for global search and kernel extreme learning machine training. Appl Soft Comput 81:105521 7. Zhou Z, Zhang R, Wang Y et al (2018) Color difference classification based on optimization support vector machine of improved grey wolf algorithm. Optik (Stuttg) 170:17–29 8. Ibrahim RA, Elaziz MA, Lu S (2018) Chaotic opposition-based grey-wolf optimization algorithm based on differential evolution and disruption operator for global optimization. Exp Syst Appl 108:1–27 9. Liu X, Tian Y, Lei X et al (2019) An improved self-adaptive grey wolf optimizer for the daily optimal operation of cascade pumping stations. Appl Soft Comput 75:473–493 10. Tran DC (2015) A novel hybrid data clustering algorithm based on artificial bee colony algorithm and K-means. Chinese J Electron 24(4):694-701(7) 11. Gan C, Cao W, Wu M, Chen X (2018) A new bat algorithm based on iterative local search and stochastic inertia weight. Exp Syst Appl 104:202–212 12. Cai Z, Gu J, Luo J et al (2019) Evolving an optimal kernel extreme learning machine by using an enhanced grey wolf optimization strategy. Exp Syst Appl 138:112814
13. Zhou Y, Chen X, Zhou G (2016) An improved monkey algorithm for a 0–1 knapsack problem. Appl Soft Comput 38:817–830 14. Tu Q, Chen X, Liu X (2019) Multi-strategy ensemble grey wolf optimizer and its application to feature selection. Appl Soft Comput 76:16–30 15. Kumar V, Kumar D (2017) An astrophysics-inspired Grey wolf algorithm for numerical optimization and its application to engineering design problems. Adv Eng Softw 112:231–254 16. Long W, Jiao J, Liang X, Tang M (2018a) Inspired grey wolf optimizer for solving large-scale function optimization problems. Appl Math Model 60:112–126 17. Saad E, Elhosseini MA, Haikal AY (2019) Culture-based artificial bee colony with heritage mechanism for optimization of wireless sensors network. Appl Soft Comput 79:59–73 18. Banati H, Chaudhary R (2017) Multi-modal bat algorithm with improved search (MMBAIS). J Comput Sci 23:130–144 19. Cui L, Li G, Luo Y et al (2018) An enhanced artificial bee colony algorithm with dual-population framework. Swarm Evol Comput 43:184–206 20. Zhou J, Yao X, Chan FTS et al (2019) An individual dependent multi-colony artificial bee colony algorithm. Inf Sci (Ny) 485:114–140 21. Sharma A, Sharma A, Panigrahi BK et al (2016) Ageist spider monkey optimization algorithm. Swarm Evol Comput 28:58–77 22. ÖZKIS¸ A, Babalik A (2014) Performance comparision of ABC and A-ABC algorithms on clustering problems. In: Proceedings of the international conference on machine vision and machine learning. Prague, Czech Republic 23. Zhang C, Ouyang D, Ning J (2010) An artificial bee colony approach for clustering. Exp Syst Appl 37:4761–4767 24. Biswas S, Bose D, Kundu S (2012) A clustering particle based artificial bee colony algorithm for dynamic environment. In: Panigrahi BK, Das S, Suganthan PN, Nanda PK (eds) Swarm, evolutionary, and memetic computing. Springer, Berlin, pp 151–159 25. Li X, Yu D (2019) Study on an optimal path planning for a robot based on an improved ANT colony algorithm. Autom Control Comput Sci 53:236–243 26. Al-Betar MA, Awadallah MA, Faris H et al (2018) Bat-inspired algorithms with natural selection mechanisms for global optimization. Neurocomputing 273:448–465 27. Awadallah MA, Al-Betar MA, Bolaji AL et al (2019) Natural selection methods for artificial bee colony with new versions of onlooker bee. Soft Comput 23:6455–6494 28. Wu B, Qian C, Ni W, Fan S (2012) The improvement of glowworm swarm optimization for continuous optimization problems. Exp Syst Appl 39:6335–6342 29. Kamoona AM, Patra JC (2019) A novel enhanced cuckoo search algorithm for contrast enhancement of gray scale images. Appl Soft Comput 85:105749 30. Wang L, Zhong Y, Yin Y (2016) Nearest neighbour cuckoo search algorithm with probabilistic mutation. Appl Soft Comput 49:498–509 31. Tian D, Zhao X, Shi Z (2019) Chaotic particle swarm optimization with sigmoid-based acceleration coefficients for numerical function optimization. Swarm Evol Comput 51:100573 32. Liu G, Chen W, Chen H, Xie J (2019) A quantum particle swarm optimization algorithm with teamwork evolutionary strategy. Math Probl Eng 2019:1805198 33. Yan X, Zhu Y, Zou W, Wang L (2012) A new approach for data clustering using hybrid artificial bee colony algorithm. Neurocomputing 97:241–250 34. Sharma M, Chhabra JK (2019) Sustainable automatic data clustering using hybrid PSO algorithm with mutation. Sustain Comput Inform Syst 23:144–157 35. 
Xia Y, Feng Z, Niu W et al (2019) Simplex quantum-behaved particle swarm optimization algorithm with application to ecological operation of cascade hydropower reservoirs. Appl Soft Comput 84:105715. https://doi.org/10.1016/j.asoc.2019.105715 36. Sharifipour H, Shakeri M, Haghighi H (2018) Structural test data generation using a memetic ant colony optimization based on evolution strategies. Swarm Evol Comput 40(76–91):9 37. Eskandari L, Jafarian A, Rahimloo P, Baleanu D (2019) A modified and enhanced ant colony optimization algorithm for traveling salesman problem: theoretical aspects. pp 257–265
38. Tian M, Bo Y, Chen Z et al (2019) A new improved firefly clustering algorithm for SMC-PHD filter. Appl Soft Comput 85:105840 39. Dash J, Dam B, Swain R (2017) Optimal design of linear phase multi-band stop filters using improved cuckoo search particle swarm optimization. Appl Soft Comput 52:435–445 40. Cao M, Yang Y, Wang L (2019) Application of improved ant colony algorithm in the path planning problem of mobile robot. In: Proceedings of the 2019 3rd high performance computing and cluster technologies conference. Association for Computing Machinery, New York, pp 11–15 41. Xie H, Zhang L, Lim CP et al (2019) Improving K-means clustering with enhanced firefly algorithms. Appl Soft Comput 84:105763 42. Long W, Jiao J, Liang X, Tang M (2018b) An exploration-enhanced grey wolf optimizer to solve high-dimensional numerical optimization. Eng Appl Artif Intell 68:63–80 43. Akhand MAH, Ayon SI, Shahriyar SA et al (2020) Discrete spider monkey optimization for travelling salesman problem. Appl Soft Comput 86:105887 44. Lin A, Sun W, Yu H et al (2019a) Adaptive comprehensive learning particle swarm optimization with cooperative archive. Appl Soft Comput 77:533–546 45. Zhu G, Kwong S (2010) Gbest-guided artificial bee colony algorithm for numerical function optimization. Appl Math Comput 217:3166–3173 46. Sharma TK, Pant M, Abraham A (2013) Blend of local and global variant of PSO in ABC. In: 2013 world congress on nature and biologically inspired computing. pp 113–119 47. Imanian N, Shiri ME, Moradi P (2014) Velocity based artificial bee colony algorithm for high dimensional continuous optimization problems. Eng Appl Artif Intell 36:148–163 48. Zakaria N. Alqattan RA (2015) A hybrid artificial bee colony algorithm for numerical function optimization. Int J Mod Phys 26(10):1550109 49. Chen X, Wei X, Yang G, Du W (2020) Fireworks explosion based artificial bee colony for numerical optimization. Knowl Based Syst 188:105002. https://doi.org/10.1016/j.knosys.2019. 105002 50. Ramli MR, Abas ZA, Desa MI et al (2019) Enhanced convergence of Bat algorithm based on dimensional and inertia weight factor. J King Saud Univ Comput Inf Sci 31:452–458 51. Yelghi A, Köse C (2018) A modified firefly algorithm for global minimum optimization. Appl Soft Comput 62:29–44 52. Singh PR, Elaziz MA, Xiong S (2018) Modified spider monkey optimization based on NelderMead method for global optimization. Expert Syst Appl 110:264–289 53. El-Abd M (2012) On the hybridization of the artificial bee colony and particle swarm optimization algorithms. J Artif Intell Soft Comput Res 2 54. Baktash N and MMR (2011) A new hybridized approach of PSO and ABC algorithm for optimization. In: Proceedings of the 2011 international conference on measurement and control engineering, pp 309–313 55. Goel R, Maini R (2018) A hybrid of ant colony and firefly algorithms (HAFA) for solving vehicle routing problems. J Comput Sci 25:28–37 56. Lebedev BK, Lebedev OB, Lebedeva EM, Kostyuk AI (2019) Integration of models of adaptive behavior of ant and bee colony. In: Silhavy R (ed) Artificial intelligence and algorithms in intelligent systems. CSOC2018 2018. Advances in Intelligent Systems and Computing, vol 764. Springer, Cham. https://doi.org/10.1007/978-3-319-91189-2_18 57. Imane M, Nadjet K (2016) Hybrid Bat algorithm for overlapping community detection. IFAC Pap Online 49:1454–1459 58. Saraswathi M, Murali GB, Deepak BBVL (2018) Optimal path planning of mobile robot using hybrid cuckoo search-Bat algorithm. Proc Comput Sci 133:510–517 59. 
Murugan R, Mohan MR, Rajan CCA et al (2018) Hybridizing bat algorithm with artificial bee colony for combined heat and power economic dispatch. Appl Soft Comput 72:189–217 60. Mishra KK, Bisht H, Singh T, Chang V (2018) A direction aware particle swarm optimization with sensitive swarm leader. Big Data Res 14:57–67 61. Zhou Y, Wu H, Luo Q, Abdel-Baset M (2019) Automatic data clustering using nature-inspired symbiotic organism search algorithm. Knowl Based Syst 163:546–557
62. Lin A, Sun W, Yu H et al (2019b) Global genetic learning particle swarm optimization with diversity enhancement by ring topology. Swarm Evol Comput 44:571–583 63. Ji J, Song S, Tang C et al (2019) An artificial bee colony algorithm search guided by scale-free networks. Inf Sci (NY) 473:142–165 64. Lu C, Gao L, Yi J (2018) Grey wolf optimizer with cellular topological structure. Exp Syst Appl 107:89–114
Chapter 9
Smart Cane 1.0 IoT-Based Walking Stick Azim Uddin Ansari, Anjul Gautam, Amit Kumar, Ayush Agarwal, and Ruchi Goel
1 Introduction A blind or visually impaired person cannot identify objects in the environment. They rely on others for major tasks like taking public transport, crossing the roads, etc. and on their senses and touch for household chores. Electronic devices are built to help them. Also, boarding and de-boarding buses or trains is a major issue while commuting between work and home. In cities, markets are often at a distance from the residential places, and they face problems in getting basic daily supplies. In metro rails, they find it difficult to locate the platform or use stairs or lifts. Smart Cane 1.0 proposes a system based on two ultrasonic sensors working in the range of 10–250 cm in the forward direction and 5–100 cm in the downward direction. The user will be alerted through a buzzer, vibration pad and audio signals. The device will be connected to a mobile phone via Bluetooth. The system is powered using a battery. The main focus of the system is minimum cost and efficiency. Also, the user can use the smart box with their existing white cane; users do not need to buy a new white cane.
A. U. Ansari (B) · A. Gautam · A. Kumar · A. Agarwal · R. Goel Krishna Engineering College, Ghaziabad, Uttar Pradesh, India e-mail: [email protected] A. Gautam e-mail: [email protected] A. Kumar e-mail: [email protected] A. Agarwal e-mail: [email protected] R. Goel e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_9
2 Literature Review A lot of ideas are constantly being proposed in the field of walking aids for visually impaired people. Earlier, visually impaired people had to rely on others for help in their day-to-day life, but nowadays many devices are available to make them self-dependent. Gayathri et al. [1] proposed a smart walking stick model using sensors like an ultrasonic sensor, pit sensor, water sensor and GPS receiver. The model intends to provide overall artificial vision, object detection and assistance using GPS. Nada et al. [2] proposed a model using the 18F46K80 microcontroller embedded system, a vibration motor and ISD1932 flash memory. The model can detect obstacles ahead, pits in the road and water pools on roads. Kumar et al. [3] proposed a model based on five sensors: ultrasonic, water, infrared, light (LDR) and fire. The model is intended to help both day and night. Chaurasia and Kavitha [4] proposed a model using ultrasonic and infrared sensors to detect obstacles. The system uses two of each type of sensor mounted on the stick along with a vibrating sensor. Adhe et al. [5] proposed a project based on an ultrasonic sensor and a water sensor; outputs are provided through a buzzer and a Bluetooth module. Users can locate the stick by pressing a button on the remote provided. Anwar and Aljahdali [6] proposed an affordable and lightweight system consisting of five sensors controlled using the Arduino Uno R3 microcontroller; outputs are provided through buzzers, vibrators and a voice alarm, and GPS navigation is used to guide in unfamiliar places. Some of the existing systems are costly: the already proposed systems need the user to purchase a new stick, which adds to the cost, and they are also of limited use in crowded areas. Smart Cane 1.0 tries to address these limitations of high cost and reduced effectiveness in crowded places. The proposed system suggests the use of hearing devices
like earphones for providing the alert. The smart box of smart cane 1.0 can attach to the existing white cane. The user does not need to buy a new cane, thus decreasing the cost.
3 Methodology Smart Cane 1.0 uses two ultrasonic sensors. One ultrasonic sensor is placed in the forward direction and checks for obstacles ahead, such as walls, poles, parked or upcoming traffic, or any other possible obstacle. The second one is placed in the downward direction and checks for pits, holes and pebbles in and on the road; it also checks for staircases. Both ultrasonic sensors are controlled using the Arduino Uno R3 microcontroller. In case an obstacle is detected, alerts are provided using a vibration pad, a buzzer and earphones. The earphone alert works better in crowded areas. The Smart Cane 1.0 box provides an SOS emergency system consisting of a button that sends an SOS to the nearby police station and an emergency contact using the Android application. The system is powered using a Li-ion battery, making it easily rechargeable and environment-friendly. Smart Cane 1.0 uses the following devices and sensors:
1. Ultrasonic sensor: It is used to detect an obstacle in the forward or downward direction. It helps in detecting the distance of the obstacle and is used for generating alert signals.
2. Arduino Uno R3: It is a microcontroller board based on the ATmega328 AVR microcontroller. It is used to manage the connected devices and synchronize their functions.
3. Piezo-buzzer: An electronic device used to produce an alarm or sound. Indoors, the piezo-buzzer is effective, and it also uses little energy.
4. LED bulb indicators: They help non-visually impaired people identify the presence of the visually impaired person at night-time.
5. Vibration motor: A mini DC motor that generates vibrations, not sounds. It is used as a backup for the piezo-buzzer.
6. Bluetooth module HC-05: It works on the serial port protocol (SPP) and is used for wireless connectivity. In the proposed system, it is used to send inputs like the distance of obstacles and the SOS message signal to the Android app.
7. Battery unit: The power unit consists of a 3.7 V 2600 mAh Li-ion battery, making the system lightweight and able to work for more days (Figs. 1, 2, and 3).
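The decision logic tying the two ultrasonic sensors to the alert outputs can be summarized by the following sketch. The actual device runs as firmware on the Arduino Uno R3; this Python version is only an illustration of the detection windows described above, and the function names, thresholds and message strings are hypothetical.

```python
FORWARD_RANGE_CM = (10, 250)    # forward-facing sensor detection window
DOWNWARD_RANGE_CM = (5, 100)    # downward-facing sensor detection window

def check_obstacles(forward_cm, downward_cm):
    """Return the list of alerts to raise for one pair of sensor readings."""
    alerts = []
    if FORWARD_RANGE_CM[0] <= forward_cm <= FORWARD_RANGE_CM[1]:
        alerts.append("obstacle ahead")
    if DOWNWARD_RANGE_CM[0] <= downward_cm <= DOWNWARD_RANGE_CM[1]:
        alerts.append("pit or step below")
    return alerts

def raise_alerts(alerts, send_to_phone):
    """Drive the buzzer/vibration motor and forward audio alerts to the phone."""
    for alert in alerts:
        print("BUZZER + VIBRATION:", alert)   # on the device: digital/PWM output
        send_to_phone(alert)                  # via the HC-05 Bluetooth link

if __name__ == "__main__":
    raise_alerts(check_obstacles(forward_cm=120, downward_cm=40), print)
```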
4 Conclusions The proposed system is cost effective and user-friendly. In our proposed work, we are trying to make a system that is based on concepts of IoT. A single box-based design is proposed so that the visually impaired can attach it to their existing stick.
Fig. 1 Flowchart of smart cane 1.0. Self-drawn using
Two ultrasonic sensors are used, making the system effective. Ultrasonic sensors are used to keep the cost to a minimum. The alerts are provided using a speaker and vibration pad fixed in the smart cane box. The system provides an audio signal using earphones making it effective in public transport, metro rail and crowded areas. The SOS alert system sends an alert message to the emergency contact and ambulance in case of any emergency. The Smart Cane 1.0 box and Android application are in sync with an auto-reconnect feature.
Fig. 2 Circuit diagram of the proposed model. Self-drawn using
Fig. 3 Interface of proposed Android application
The main focus is to make the system cost effective and user-friendly.
References 1. Gayathri G, Vishnupriya M, Nandhini R, Banupriya M (2014) Smart walking stick for visually impaired. Int J Eng Comput Sci 3(3) 2. Nada AA, Mashelly S, Fakhr MA, Seddik AF (2015) Effective fast response smart stick for blind people, Apr 2015 3. Kumar M, Kabir F, Roy S (2017) Low cost smart stick for blind and partially sighted people. Int J Adv Eng Manag 2(3):65–68 4. Chaurasia S, Kavitha KVN (2014) An electronic walking stick for blinds 5. Adhe S, Kunthewad S, Shinde P, Kulkarni VS (2015) Ultrasonic smart stick for visually impaired people 6. Anwar A, Aljahdali S (2017) A smart stick for assisting blind people. J Comput Eng 19(3):86–90 7. Louis L (2016) Working principle of Arduino and using it as a tool for study and research. Int J Control Autom Commun Syst 1(2):21–29 8. Zhmud VA, Kondratiev NO, Kuznetsov KA, Trubin VG, Dimitrov LV (2018) Application of ultrasonic sensor for measuring distances in robotics 9. Cotta A, Devidas NT, Ekoskar VKN (2016) Wireless communication using HC-05 bluetooth module interfaced with Arduino. Int J Sci Eng Technol Res 5(4) 10. Torcolini N, Oh J, Effects of vibration motor speed and rhythm on perception of phone call urgency 11. https://create.arduino.cc/projecthub/SURYATEJA/use-a-buzzer-module-piezo-speaker-usingarduino-uno-89df45 12. https://www.arduino.cc/en/tutorial/blink 13. https://www.tinkercad.com 14. https://lucidchart.com
Chapter 10
Web Crawler for Ranking of Websites Based on Web Traffic and Page Views Nishchay Agrawal and Suman Pant
1 Introduction The World Wide Web contains a large amount of information [1, 2]. It is a collection of documents and web resources such as images, videos, text files, etc., identified by Uniform Resource Identifiers on the internet. Figure 1 shows how different algorithms of a web crawler determine the order of the search results. The search engine depends on a large collection of pages and multimedia resources such as images, videos, etc. that are obtained with the support of the crawler to update the web content of the websites and the indexes of other website content and resources [3]. The extraction of the hyperlinks on the web pages and their storage for indexing purposes is done by an automated program known as a web crawler [4]. The web pages for every hyperlink are indexed according to the relevancy of the content. Different web crawlers are used by search engines for crawling purposes; for example, Bing uses the Bingbot crawler [5]. Due to the utilization of spiders by all the engines, researchers are gaining interest in the optimization and improvement of the results obtained through web crawlers. Different web crawlers are proposed by researchers for crawling the web, based on the application requirements. For crawling the surface web, different web crawlers [6–9] are proposed. For spidering the hidden web, different web crawlers [10–13] are proposed. The robots.txt file restricts web crawlers from accessing the unauthorized portion of a website and provides exclusion rules and regulations for the website. The crawled pages are then indexed according to the relevant information. Ranking is the process in which the search engine deals with the ordering of search results.
N. Agrawal (B) · S. Pant Department of Computer Science and Engineering, Tula's Institute, Dehradun, India e-mail: [email protected] S. Pant e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_10
Fig. 1 Various phases for fetching relevant information related to search query
The ranking algorithm is influenced by a large number of factors such as site-level factors, entity match and the number of backlinks. Sometimes the traffic on a website needs to be considered for providing the most relevant search results, as users often want to access the most popular websites. In this work, we propose a crawler that depends upon traffic density and page views. The number of users visiting a particular website, i.e., the quantity of traffic on the web page, is called web traffic. It is determined by two factors: the number of visitors on the website and the number of web pages visited by visitors. Web traffic is updated regularly due to the gradual increase or decrease in the visitors to the websites. Page views is another parameter for the ranking algorithm that depends on the requested website URL. Web crawling and ranking algorithms are both essential for gathering relevant information related to user search queries while web surfing. Ranking is important for ordering websites based on different ranking parameters. The performance of a web crawler is analyzed based on different parameters such as robustness, manageability, and reconfigurability.
2 Related Work Different techniques are used by search engines to access relevant information from a huge source of information [14, 15]. Ranking is used for measuring the popularity of websites and improving their order in search results [16, 17]. The author in [18] has proposed a clickstream-based approach to rank web pages and
to discover the popularity of the web pages on the internet. The author in [19] has proposed an approach for ranking pages based on four parameters: the PageRank algorithm, a term weighting technique, visitor count, and user feedback. The problem faced by this approach is that the PageRank algorithm depends only on in-links and out-links of web pages rather than the query. In [20, 21], the authors have proposed algorithms to rank web pages using machine learning and explain the role of supervised and unsupervised machine learning for ranking purposes. To retrieve authentic search results, the author in [22] has proposed a modified PageRank algorithm based on parameters such as backlinks, navigation, time, and synonyms. In [23], the Two-Phase Page Ranking (TPPR) algorithm for ranking and sequencing web pages in the search results is proposed along with a content mining approach. The WEBIR framework is proposed in [24] for effective web information retrieval based on content-based result feedback techniques. A framework for an advanced crawler for deep web interfaces based on a binary vector and the PageRank algorithm to improve the results is proposed in [25]. The authors in [26] have proposed the User Preference-Based Page Ranking algorithm that provides an effective search navigation experience. With the availability of large amounts of dynamic, diverse, and complex information on the internet, ranking search results becomes a big challenge. In this paper, we present an efficient crawler with URLs ranked according to their popularity.
3 Proposed Approach We deal with the crawling and ranking of website URLs based on the amount of data sent and received by visitors to a website and on page views. A seed URL and a search keyword are passed to the web crawler, which systematically and automatically traverses the Web's hypertext structure. In this web crawling, the spider first starts with the seed URL and search string, then follows the hyperlinks to visit and, by crawling the content of the web pages, identifies and adds the hyperlinks to the list of visited URLs of the websites that are relevant to the search keyword. A log file (crawler.log) stores and manages all the URLs and pages crawled by the web crawler; it acts as an intermediary between the web crawler and the ranking algorithm. Next, we rank the URLs of the websites crawled by the web spider based on their relevancy. In this paper, we design the ranking algorithm based on two parameters: web traffic and page views. In this stage, the ranking algorithm takes the crawled URLs of the websites one by one from the index database, processes each input URL and produces the website URLs with their ranking based on traffic and page views. The simplexml_load_file() function takes a crawled URL from the input text file. After getting the URL of the website, the isset() function is used to provide the rank of the website based on traffic density and web pages viewed by the visitors if the URL of the website is valid. Finally,
the URL of the website along with its rank is stored in an unsorted ranked URL file (unsorted_output.txt). This process is repeated iteratively until all crawled URLs of the websites are processed by our ranking algorithm. At last, our comparison and sorting algorithm is used to sort the unsorted ranked URLs text file. We convert this text file into a linear data structure (array) using the file_get_contents() and explode() functions.
This linear data structure is further divided into key-value pairs, where the key is the URL of the website and the value is the website ranking. The array is sorted using the usort() function, which takes two parameters: the array and a user-defined comparison function. This user-defined comparison function contains the logic for sorting the array based on the ranking of the URLs of the websites (values) in ascending order. The ranked list of URLs is stored in the sorted ranked URLs text file (sorted_output.txt). Finally, we get the ranked crawled URLs of the websites in the search results based on the page views and traffic in response to the search query and seed URL, as shown in Fig. 2.
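The pipeline described above is implemented with PHP functions (file_get_contents(), explode(), usort()); the following Python sketch reproduces the same load-score-sort flow only for illustration. The tab-separated file format and the score_site() helper that combines the two ranking parameters are assumptions, not the authors' exact implementation.

```python
def load_unsorted_ranks(path="unsorted_output.txt"):
    """Read 'URL<TAB>rank' lines into (url, rank) pairs, mirroring the
    file_get_contents() + explode() step of the PHP implementation."""
    pairs = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            url, rank = line.strip().split("\t")
            pairs.append((url, float(rank)))
    return pairs

def score_site(traffic_percent, page_views):
    """Hypothetical combination of the two parameters: higher traffic and more
    page views yield a better (lower) rank value."""
    return 1.0 / (traffic_percent + page_views + 1e-9)

def sort_by_rank(pairs):
    """Sort ascending so the lowest rank value (most popular site) comes first,
    mirroring the usort() comparison function described in the paper."""
    return sorted(pairs, key=lambda p: p[1])
```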
Fig. 2 Flowchart for crawling and ranking of URLs of the websites
4 Simulation Results In this section, we present preliminary results acquired using the proposed approach. For validation and testing of our approach, we take the parameters from Alexa web traffic for the ranking of the crawled URLs of the websites. We start by collecting relevant URLs of the websites related to the search keyword. Some of the keywords are java, python tutorials, restaurants, hotels, movie-related websites, etc. (Tables 1, 2, 3, 4 and 5).
Table 1 Comparisons of ranking of URLs of websites based on traffic for "Java" search keyword
URLs of the website | Web traffic (in %) | Proposed ranking based on web traffic and page views
https://www.w3schools.com | 84.2 | 4
https://www.geeksforgeeks.org | 93.1 | 1
https://www.javatpoint.com | 75.1 | 5
https://www.tutorialspoint.com | 89 | 3
https://www.oracle.com | 52 | 6
https://www.guru99.com | 90.3 | 2
https://www.java.com | 38 | 7
Table 2 Comparisons of ranking of URLs of websites based on traffic for "Python" search keyword
URLs of the website | Web traffic (in %) | Proposed ranking based on web traffic and page views
https://www.w3schools.com | 84.2 | 3
https://www.python.org | 69.7 | 6
https://www.tutorialspoint.com | 89 | 2
https://www.python-course.eu | 90.3 | 1
https://www.learnpython.org | 76.6 | 4
https://www.javatpoint.com | 75.1 | 5
Table 3 Comparisons of ranking of URLs of websites based on traffic for "Restaurants" search keyword
URLs of the website | Web traffic (in %) | Proposed ranking based on web traffic and page views
https://www.tripadvisor.com | 66.7 | 3
https://en.wikipedia.org | 86.2 | 1
https://www.zomato.com | 62.6 | 4
https://www.justdial.com | 85.3 | 2
https://www.unsplash.com | 31.8 | 5
Table 4 Comparisons of ranking of URLs of websites based on traffic for "Movies websites" search keyword
URLs of the website | Web traffic (in %) | Proposed ranking based on web traffic and page views
https://www.hotstar.com | 23.7 | 6
https://www.thelivemirror.com | 82.6 | 2
https://www.tech21century.com | 81.6 | 3
https://www.fossbytes.com | 58 | 5
https://www.probytes.net | 84.7 | 1
https://www.popcornflix.com | 66 | 4
As shown in the figures, the x-axis represents the proposed ranking based on traffic and page views, and the y-axis shows web traffic (in %). In Fig. 3, the www.geeksforgeeks.org website gets the top rank compared to other websites in the search results for the java search query. In Fig. 4, the www.python-course.eu website gets the top rank compared to other websites in the search results for the python search query. In Fig. 5, the en.wikipedia.org website gets the top rank compared to other websites in the search results for the restaurants search query. In Fig. 6, the www.probytes.net website gets the top rank compared to other websites in the search results for the movies search
Table 5 Comparisons of ranking of URLs of websites based on traffic for "Hotels" search keyword
URLs of the website | Web traffic (in %) | Proposed ranking based on web traffic and page views
https://www.trivago.in | 49.9 | 3
https://www.hotels.com | 38 | 5
https://www.makemytrip.com | 54.6 | 2
https://www.yatra.com | 56.6 | 1
https://www.agoda.com | 34.6 | 6
https://www.hotetonight.com | 40.1 | 4
Fig. 3 Ranking of URLs of websites for “Java” search keyword
query. In Fig. 7, the www.yatra.com website gets the top rank compared to other websites in the search results for the hotels search query. These bar graphs show how the ranking of the URLs of the websites varies with the number of users visiting the websites (web traffic).
5 Conclusion In this paper, we discuss the crawling of URLs for ranking websites according to traffic and page views. In this approach, web traffic and the number of URL requests by users are the parameters used to rank the crawled URLs of the websites. Good crawling time and good stability of the web crawler for ranking websites based on web traffic and page views are achieved. The proposed web crawler
Fig. 4 Ranking of URLs of websites for “Python” search keyword
Fig. 5 Ranking of URLs of websites for “Restaurants” search keyword
is reliable and robust based on the simulation results. We obtain the list of ranked URLs of the websites based on web traffic and page views; websites with a low rank value are listed at the top position compared to websites with a high rank number in the search results. The advantage of the proposed approach is that it motivates website owners to improve the combination of the traffic and page view parameters so that the popularity of their websites increases on the internet, and it also increases user attraction to well-ranked websites so that relevant information is obtained efficiently.
Fig. 6 Ranking of URLs of websites for “Movies” websites search keyword
Fig. 7 Ranking of URLs of websites for “Hotels” search keyword
References 1. Agrawal N, Johari S (2019) A survey on content based crawling for deep and surface web. In: 2019 fifth international conference on image information processing (ICIIP). IEEE, Shimla, India, pp 491–496. https://doi.org/10.1109/ICIIP47207.2019.8985906 2. Brodlie K (1997) Visualization over the World Wide Web. In: Scientific visualization conference (Dagstuhl’97). IEEE, Dagstuhl, Germany, p 23. https://doi.org/10.1109/IV.1997.626485 3. Shkapenyuk V, Suel T (2002) Design and implementation of a high-performance distributed web crawler. In: Proceedings 18th international conference on data engineering. IEEE, San Jose, CA, USA, pp 357–368. https://doi.org/10.1109/ICDE.2002.994750 4. Leng AGK, Kumar PR, Singh AK, Dash RK (2011) PyBot: an algorithm for web crawling. In: 2011 international conference on nanoscience, technology and societal implications. IEEE, Bhubaneswar, pp 1–6. https://doi.org/10.1109/NSTSI.2011.6111993 5. Algiryage N, Dias G, Jayasena S (2018) Distinguishing real web crawlers from fakes: googlebot example. In: 2018 Moratuwa engineering research conference (MERCon). IEEE, Moratuwa, pp 13–18 6. Agre GH, Mahajan NV (2015) Keyword focused web crawler. In: 2015 2nd international conference on electronics and communication systems (ICECS). IEEE, Coimbatore, pp 1089– 1092. https://doi.org/10.1109/ECS.2015.7124749 7. Nakashe SM, Kolhe KR (2018) Smart approach to crawl web interfaces using a two stage framework of crawler. In: 2018 fourth international conference on computing communication control and automation (ICCUBEA). IEEE, Pune, India, pp 1–6. https://doi.org/10.1109/ ICCUBEA.2018.8697592 8. Sharma S, Gupta P (2015) The anatomy of web crawlers. In: International conference on computing, communication and automation. IEEE, Noida, pp 849–853. https://doi.org/10.1109/ CCAA.2015.7148493 9. Wang H, Li C, Zhang L, Shi M (2018) Anti-crawler strategy and distributed crawler based on Hadoop. In: 2018 IEEE 3rd international conference on big data analysis (ICBDA). IEEE, Shanghai, pp 227–231. https://doi.org/10.1109/ICBDA.2018.8367682 10. Kumar M, Bhatia R (2016) Design of a mobile web crawler for hidden web. In: 2016 3rd international conference on recent advances in information technology (RAIT). IEEE, Dhanbad, pp 186–190. https://doi.org/10.1109/RAIT.2016.7507899 11. Peisu X, Ke T, Qinzhen. H (2008) A framework of deep web crawler. In:2008 27th Chinese control conference. IEEE, Kunming, pp 582–586. https://doi.org/10.1109/CHICC.2008. 4604881 12. Akilandeswari J, Gopalan NP (2008) An architectural framework of a crawler for locating deep web repositories using learning multi-agent systems. In: 2008 third international conference on Internet and web applications and services. IEEE, Athens, pp 558–562. https://doi.org/10. 1109/ICIW.2008.94 13. Liu G, Liu K, Dang Y (2011) Research on discovering deep web entries based on topic crawling and ontology. In: 2011 international conference on electrical and control engineering. IEEE, Yichang, pp 2488–2490. https://doi.org/10.1109/ICECENG.2011.6057954 14. Kumar G, Duhan N, Sharma AK (2011) Page ranking based on number of visits of links of web page. In: 2011 2nd international conference on computer and communication technology (ICCCT-2011). IEEE, Allahabad, pp 11–14. https://doi.org/10.1109/ICCCT.2011.6075206 15. Alahmadi SH (2018) Information retrieval of distributed databases a case study: search engines systems. In: 2018 1st international conference on computer applications and information security (ICCAIS). IEEE, Riyadh, pp 1–5. 
https://doi.org/10.1109/CAIS.2018.8441966 16. Manek FS, Reddy AJ, Panchal V, Pinjarkar V (2017) Hybrid crawling for time-based personalized web search ranking. In: 2017 international conference of electronics, communication and aerospace technology (ICECA). IEEE, Coimbatore, pp 252–255. https://doi.org/10.1109/ ICECA.2017.8203681
17. Guha SK, Kundu A, Dattagupta R.: Web page ranking using domain based knowledge. In: 2015 international conference on advances in computing, communications and informatics (ICACCI). IEEE, Kochi, pp 1291–1297. https://doi.org/10.1109/ICACCI.2015.7275791 18. Ahmadi-Abkenari F, Selama A (2011) A clickstream-based web page significance ranking metric for web crawlers. In: 2011 Malaysian conference in software engineering. IEEE, Johor Bahru, pp 223–228. https://doi.org/10.1109/MySEC.2011.6140674 19. Batra N, Kumar A, Singh D, Rajotia RN (2014) Content based hidden web ranking algorithm (CHWRA). In: 2014 IEEE international advance computing conference (IACC). IEEE, Gurgaon, pp 586–589. https://doi.org/10.1109/IAdCC.2014.6779390 20. Chauhan V, Jaiswal A, Khan J (2015) Web page ranking using machine learning approach. In: 2015 fifth international conference on advanced computing and communication technologies. IEEE, Haryana, pp 575–580. https://doi.org/10.1109/ACCT.2015.56 21. Yong SL, Hagenbuchner M, Tsoi AC (2008) Ranking web pages using machine learning approaches. In: 2008 IEEE/WIC/ACM international conference on web intelligence and intelligent agent technology. IEEE, Sydney, pp 677–680. https://doi.org/10.1109/WIIAT.2008.235 22. Sen T, Chaudhary DK, Choudhury T (2017) Modified page rank algorithm: efficient version of simple page rank with time, navigation and synonym factor. In: 2017 3rd international conference on computational intelligence and networks (CINE). IEEE, Odisha, pp. 27–32. https://doi.org/10.1109/CINE.2017.24 23. Usha M, Nagadeepa N (2018) Combined two phase page ranking algorithm for sequencing the web pages. In: 2018 2nd international conference on inventive systems and control (ICISC). IEEE, Coimbatore, pp 876–880. https://doi.org/10.1109/ICISC.2018.8398925 24. Shekhar S, Arya KV, Agarwal R, Kumar R (2011) A WEBIR crawling framework for retrieving highly relevant web documents: evaluation based on rank aggregation and result merging algorithms. In: 2011 international conference on computational intelligence and communication networks. IEEE, Gwalior, pp 83–88. https://doi.org/10.1109/CICN.2011.17 25. Mahale VV, Dhande MT, Pandit AV (2018) Advanced web crawler for deep web interface using binary vector and page rank. In: 2018 2nd international conference on I-SMAC (IoT). IEEE, Palladam, India, pp 500–503. https://doi.org/10.1109/I-SMAC.2018.8653765 26. Gupta D, Singh D (2016) User preference based page ranking algorithm. In: 2016 international conference on computing, communication and automation (ICCCA). IEEE, Noida, pp 166– 171. https://doi.org/10.1109/CCAA.2016.7813711
Chapter 11
Semantic Enrichment for Non-factoid Question Answering Manvi Breja and Sanjay Kumar Jain
1 Introduction Semantics means understanding the meaning of a text or sentence. It is a term linked to various other terms like enrichment, classification, tagging and indexing. Semantic enrichment annotates a text with useful information linking related concepts. It analyzes the headings, content and metadata of a text by searching keywords and analyzing the role of each entity involved in the text in order to determine its importance and thus find the meaning of the text [1]. Non-factoid question answering is one of the applications that utilize semantic relations to understand the user's query and extract appropriate answer candidates to a question. Non-factoid questions are subjective in nature; they do not ask for facts but rather seek the opinions and experience of different users. According to [2], why-type, definitional, what-if hypothetical, and how-type questions fall in the category of non-factoid questions. Since there are numerous possible answers to a question, differing with user, context, and time, the need is to address such questions semantically. In English language text, entities are related by various types of semantic relations such as part-whole, part-of, if–then, cause–effect [3]. To address why-type questions, causality plays a major role as it connects two phrases representing a cause and its effect. Researchers utilize causality to understand the meaning of text and determine the relation between different parts of a sentence. Causality is determined from the cause–effect relation in a sentence, which plays a very crucial role for decision making.
M. Breja (B) · S. K. Jain National Institute of Technology, Kurukshetra, Haryana, India e-mail: [email protected] S. K. Jain e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_11
The paper tries to explore the
importance of causality in retrieving candidate answers to a why-type question and highlight various other relations that can be utilized. Causality is involved in different phenomenon such as it connects two processes, where cause part gives influence to effect part, sometimes causal parts representing past action or event with its effect part as its consequence to be occurring in future. The cause and effect may or may not be contained in one sentence [4]. Causality is explored for answering why-type questions motivated by the fact in Aristotle’s philosophy which states that the "cause" refers to "explanation" and is determined as "answer to a why-question" since their answers require explanation/reasoning [5]. The paper is divided into various sections. Section 2 discusses the importance of semantic relations in semantic enrichment. Section 3 discusses the role of semantic relations for QAS modules. Section 4 introduces different types of semantic relations viz. discourse and lexical-semantic relations used for why- and how-type questions. Section 5 explores challenges encountered in identifying semantic relations and methods used to tackle them. Section 6 briefs researches deploying semantics for QAS modules. Finally, Sect. 7 concludes the work with future research directions.
2 Importance of Semantic Relations A Question Answering System (QAS) is an application that requires ontology mapping, which represents knowledge as relations between the concepts involved in a domain. Ontologies represent existing knowledge and introduce new domain knowledge by making inferences. The need for ontology mapping is addressed using lexico-semantic relations between terms and clauses. There are different types of lexical-semantic relations between entities involved in text, such as synonymy, antonymy, meronymy, polysemy, homonymy and hypernymy, which determine related terms contained in a text [6]. The adverbial clauses in a relevant document depict different events occurring in a sentence. The clauses and phrases in a sentence are also classified as containing semantic relations such as place, cause/reason, result, manner, purpose, condition and contrast. These semantic relations play a significant role in retrieving appropriate answer candidates to a question. Different semantic relations are contained in answer candidates for different types of question. For example, "cause/reason," "result," and "purpose" are expected to be contained in an answer to a why-type question, "manner" in answers to how-type questions, "place" in answers to where-type questions, and "substitution and contrast" in answers to comparative-type questions [7, 8].
3 Role of Lexical-Semantic Relations in Different Parts of Question Answering Lexical-semantic relations provide a proper structure to lexicons, taxonomies and ontologies, which helps to determine the meaning of and relations between the words contained in them. Various fields of information retrieval utilize lexical-semantic relations, as discussed below [9]:
1. Document retrieval: A QAS answers the user's natural language question with appropriate facts or text extracted from a relevant set of documents. The process matches semantic relations between the concepts contained in the documents and the user's question, which depend on the type of question.
2. Recall and precision enhancement: Recall enhancement aims to increase the number of relevant documents retrieved, whereas precision enhancement targets reducing the number of non-relevant documents retrieved. This is accomplished by query expansion through incorporating syntagmatic and paradigmatic relations.
3. Query expansion: Query expansion appends the user's query with other related terms to understand the need of the user and resolve the lexical mismatch issue between the query and relevant answer documents. The appropriate terms are identified through semantic relations [10] (a small illustrative sketch follows this list).
4. Automatic document summarization: Summarization is a process of identifying important and non-redundant information from a set of documents. The concepts shared between different documents are identified using semantic relations, which further help to determine the discourse structure of the documents.
5. Ontology mapping: An ontology provides a representation of the domain knowledge. A thesaurus is utilized to find the domain-oriented terms and the relations between them. The Semantic Web is based on ontology, which describes different types of semantic relations and the entities linked by these relations [6].
6. Determining inter-sentential relations: The relations between different sets of sentences are analyzed to interpret the logic and interpretation between the sentences. These utilize relations such as entailment, which states that a sentence S entails another sentence S', designated as text and hypothesis. Different researchers provide different categorizations of these relations.
The above discussed areas play some role in question answering, which clearly depicts the importance of determining semantic relations for improving the accuracy of question answering.
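As a small illustration of how lexical-semantic relations can support query expansion (point 3 above), the sketch below uses WordNet through NLTK to collect synonym and hypernym lemmas of a query term. It assumes NLTK and its WordNet corpus are installed; a real QAS would additionally disambiguate the intended word sense.

```python
from nltk.corpus import wordnet as wn   # requires: nltk.download("wordnet")

def expand_term(term, max_senses=2):
    """Collect synonym and hypernym lemmas for a query term via WordNet."""
    expansions = set()
    for synset in wn.synsets(term)[:max_senses]:
        expansions.update(l.name().replace("_", " ") for l in synset.lemmas())
        for hyper in synset.hypernyms():
            expansions.update(l.name().replace("_", " ") for l in hyper.lemmas())
    expansions.discard(term)
    return sorted(expansions)

print(expand_term("car"))   # e.g. 'auto', 'automobile', 'motor vehicle', ...
```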
4 Classification of Semantic Relations There are five phases of natural language processing (1) Lexical analysis, which segregates the text into words, sentences, and paragraphs. It determines the structure
of terms contained in the text, (2) Syntactic analysis determines the arrangement of words in a sentence using English grammatical rules, (3) Semantic analysis finds the meaning of the text by considering the task domain of the sentence, (4) Discourse integration determines the meaning of succeeding sentences based on the preceding sentences, and (5) Pragmatic analysis extracts an interpretation from the sentence by understanding its context [11–13].
1. Discourse relations. There are different taxonomies of discourse relations.
• Rhetorical relations based on rhetorical structure theory: condition, circumstance, elaboration, interpretation, justify, motivation, purpose, contrast, evidence, concession, etc. [14].
• Relations in the Penn Discourse TreeBank, classifying explicit and implicit discourse relations as comparison (but, while, however, although, though, still, yet), expansion (and, also, for example, in addition, instead, indeed, for instance, unless, in fact, as a result), contingency (if, because, so, since, thus, as a result), and temporal (when, as, after, then, before, meanwhile, until, later, once) [15].
2. Lexical-semantic relations
• Is-a: Reflects a generic-specific relation in a hierarchical inheritance, e.g., red is a color.
• Part-of: Reflects the hierarchical domain structure or entities having a meronymy relation, e.g., Inorganic Chemistry is a part of Chemistry.
• Made-of: Relation which links objects to the material they are made of, e.g., cement is made of lime, silica, alumina, iron, and gypsum.
• Located-in: Relation corresponding to the location of an object.
• Takes place in: Relation related to processes in spatial and temporal dimensions, e.g., dredging takes place in the sea.
• Result of: Relation related to events resulting from other events.
• Causes: Relation which links clauses describing a cause and its immediate effect or consequence, e.g., an earthquake causes a tsunami.
• Affects: Relation linking entities that affect or change another event or object [16].
With reference to non-factoid question answering, there are several parameters employed by researchers to classify semantic relations and use them to answer questions.
1. For why-type questions: Causal relations are utilized in different scenarios to find an answer to a question, as why-questions involve phrases linking causes and their effects. The answers are expected to contain the following [17]:
• Ambiguous terms not always expressing causation, and non-ambiguous terms always comprising causation.
• Explicit patterns containing cue phrases, and implicit patterns that semantically comprise causation but do not contain cue phrases.
• Explicit patterns comprising adverbial connectives (e.g., for this reason, with the result that), prepositional connectives (e.g., as a result of, because of, due to), and subordinate connectives addressing an immediate effect (e.g., so, since, because); a small illustrative sketch of matching such explicit cues is given at the end of this section.
• Explicit patterns comprising causation verbs such as linking verbs (e.g., lead to, force, generate, cause), resultative causatives (e.g., kill, melt, break), and instrumental causatives (e.g., poison, hang, clean).
• Explicit patterns containing causative adverbs of causatives and effects.
• Implicit verbs representing causality to agent and patient.
• Explicit causal questions comprising explicit keywords, semi-explicit questions containing ambiguous keywords, and implicit questions containing no cue phrases but containing phrases linked with causes and their effects.
Besides causal (cause–effect) relations, there are other semantic relations expected to be contained in answers to why-type questions. The answers are expected to contain the reason–consequence relation, where the consequence is the main clause, the reason is the subordinate clause, and the consequence is inferred by the speaker of the question. Another relation, motivation–result, is also contained in answers to why-type questions, where the main clause expresses a result depicting the intention of the speaker, and the result is expressed by the subordinate clause in an answer candidate. In the circumstance–consequence relation, the reason depends on the circumstance or condition of the question asked. The purpose–consequence relation is realized by clauses such as so, so that, to make some event happen, to achieve the intention of a task; the purpose can further be expressed in a positive or negative sense [18, 19].
2. For how-type questions: Similar to the research on why-type non-factoid questions, similar cue phrases and patterns can be investigated to answer how-type questions. But before addressing the patterns and connectives contained in them, there is a need to determine the different types of how-type questions.
• Answers to "How-much" type questions are expected to contain clauses related to quantity, space, position and direction.
• Answers to "How-do" or "How-to" questions refer to adverbial clauses of procedure describing manner, means, instrument and agency.
• Answers to "How-is" questions refer to adverbial clauses of contrast and process, e.g., "though," "even though," "although," "whereas," "while."
• Answers to "How-long" questions refer to adverbial clauses of time, e.g., "after," "as soon as," "before," "by the time," "since."
Thus, adverbial clauses linking different entities help to determine their semantic role, which helps to find the expected answer candidates to a question. The section has also explored other semantic relations that can be utilized to address why- and how-type question answering.
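The following is a minimal sketch of how the explicit causal cue phrases listed above could be matched in candidate answer sentences. The cue list is a small illustrative subset only; a real system would need to disambiguate the ambiguous cues and handle implicit causality, as discussed in Sect. 5.

```python
import re

# A small, non-exhaustive subset of the explicit causal connectives and
# causative verbs mentioned above.
CAUSAL_CUES = [
    r"\bbecause(?: of)?\b", r"\bdue to\b", r"\bas a result(?: of)?\b",
    r"\bfor this reason\b", r"\bsince\b", r"\bso that\b",
    r"\blead(?:s|ing)? to\b", r"\bcause(?:s|d)?\b",
]
CUE_RE = re.compile("|".join(CAUSAL_CUES), re.IGNORECASE)

def causal_cue_score(sentence):
    """Count explicit causal cues in a sentence; a higher count suggests the
    sentence is more likely to contain a cause-effect pair relevant to a
    why-type question."""
    return len(CUE_RE.findall(sentence))

print(causal_cue_score("The flight was delayed because of heavy fog."))  # 1
```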
5 Addressing Challenges in Identifying Semantic Relations and Methods to Tackle Them
This section discusses the challenges faced while utilizing semantic relations and the methods to overcome them.
Challenge 1: Ambiguous and implicit semantic relations: There are certain phrases which always express a relation, and some which rarely express a relation, depending on the context of their appearance in a sentence. It is difficult to understand the nature of such a sentence, which requires semantic interpretation to understand the implications of the cue phrases encountered in it [20].
Method: Contextual knowledge is required to identify the context in which the sentence is spoken. This is addressed using (1) feedback from the user, (2) finding textual inferences using textual entailment, (3) contextual knowledge taken from previously asked QA pairs, and (4) common sense reasoning applied through logic [21].
Challenge 2: Automatic extraction of semantic relations: Sometimes there are no direct semantic relationships in a sentence. For example, there can be a chain of causes and effects, and it is very cumbersome to automatically extract the final effect and cause [22–24].
Method: These indirect relationships are approached by (1) identifying sentence boundaries using CRFs [25], (2) traversing the paths in a dependency tree (a minimal sketch of this idea is given below), and (3) using the commonsense knowledge base ConceptNet [26].
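A minimal sketch of the dependency-path idea mentioned in the second method is given below. It assumes spaCy with its small English model (en_core_web_sm) is installed, and the clause-pairing heuristic around the connective "because" is an illustrative simplification, not the procedure of [25] or [26].

```python
# pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def cause_effect_pairs(text: str):
    """Yield rough (cause_clause, effect_clause) guesses from 'because' subtrees."""
    doc = nlp(text)
    for token in doc:
        if token.lower_ == "because":
            # The subtree of the marker's head approximates the cause clause;
            # the head of that clause approximates the effect clause.
            cause_head = token.head
            effect_head = cause_head.head
            cause_ids = {t.i for t in cause_head.subtree}
            cause = " ".join(t.text for t in cause_head.subtree if t.i != token.i)
            effect = " ".join(t.text for t in effect_head.subtree
                              if t.i not in cause_ids)
            yield cause.strip(), effect.strip()

if __name__ == "__main__":
    sentence = "The match was cancelled because heavy rain flooded the field."
    for cause, effect in cause_effect_pairs(sentence):
        print("cause :", cause)
        print("effect:", effect)
```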
6 Brief on Research Utilizing Semantics in Different Modules of QAS
Table 1 summarizes the research utilizing semantics for the different phases of QAS.
7 Conclusion and Future Work
The paper addresses the concept of semantic enrichment and its importance in different fields of information retrieval. It also discusses the semantic relations employed by researchers in question answering systems. Besides the existing research on why-type QAS, it also points toward exploring semantic relations for answering different categories of how-type questions. In the future, we plan to utilize semantic relations integrated with a common sense knowledge base to improve the accuracy of non-factoid question answering.
Table 1 Research utilizing semantics for QAS modules (S. no. | References | QAS phase | Approach | Merit)
1 | Moschitti et al. 2007 [27] | Question classification | Annotation using Propbank is used to produce shallow semantic representation of both question and answers and determine semantically similar sentences | Overcomes the weakness of BOW model and helps to pinpoint an answer to a question
2 | Feng et al. 2015 [28] | Question classification | Find semantic similar questions by calculating semantic distance | Used to map questions into semantically similar questions with answers in the database; questions classified into domains represented by positive and negative query examples
3 | Prabhumoye et al. 2014 [29] | Question analysis | Determines relation between words and finds keyphrase to search right document for answer | Understand meaning of query depending on question classification and parsing
4 | Yih et al. 2013 [30] | Candidate answer selection | Find semantically related pairs of words measured by word relations like synonymy/antonymy, hypernymy/hyponymy, semantic word similarity | Helps to find association between words involved in question and answer candidates. Incorporating lexical semantics increases MAP and MRR scores from 0.6483 to 0.6784 and 0.7150 to 0.7552, respectively, for logistic regression models, and from 0.6243 to 0.6967 and 0.6895 to 0.7899 in boosted decision tree models
5 | Yih et al. 2014 [31] | Candidate answer selection | Uses convolutional neural network-based semantic model to project each word to a contextual feature vector and uses cosine similarity to compute relevancy score of pattern and relation | Helps to capture contextual information between question and answer candidates; semantic and contextual information thus improves F1 score by seven points
6 | Oh et al. 2013 [25] | Answer re-ranking | Use three types of features, partial tree matching, term matching, and excitation polarity matching to express causal relations in answer candidates to be an appropriate answer to a question | Utilizing intra- and inter-sentential causal relations helps to improve the precision of QAS and thus finding the quality answers
7 | Jansen et al. 2014 [32] | Answer re-ranking | Uses two measures: (1) lexical-semantic similarity between Q and A calculated as cosine similarity between question and answer vectors, and (2) overall similarity score by summing vectors and renormalizing to unit vector | Semantic information with discourse structures improves the performance of QAS
8 | Fried et al. 2015 [33] | Answer re-ranking | Uses NNLM using overall similarity, average, minimum, and maximum pairwise similarities of question and answer candidate and alignment models to compute conditional probability of a word in question given word in answer | Estimates the indirect associations between question and answer texts
References 1. Clarke M, Harley P (2014) How smart is your content? Using semantic enrichment to improve your user experience and your bottom line. Sci Ed 37(2) 2. Fukumoto JI (2007) Question answering system for non-factoid type questions and automatic evaluation based on BE method. In: NTCIR 3. Murphy ML (2003) Semantic relations and the lexicon: antonymy, synonymy and other paradigms. Cambridge University Press 4. Lombrozo T, Vasilyeva N (2017) Causal explanation. In: Oxford handbook of causal reasoning, pp 415–432 5. Falcon A (2006) Aristotle on causality 6. Arnold P, Rahm E (2014) Enriching ontology mappings with semantic relations. Data Knowl Eng 93:1–18 7. Quirk R (2010) A comprehensive grammar of the English language. In: Pearson Education India 8. Breja M, Jain SK (2020) Causality for question answering. In: COLINS, pp 884–893 9. Khoo CS, Na JC (2006) Semantic relations in information science. Ann Rev Inf Sci Technol 40(1):157–228 10. Voorhees EM (1994) Query expansion using lexical-semantic relations. In: SIGIR’94. Springer, London, pp 61–69 11. Kumar E (2011) Natural language processing. IK International Pvt Ltd. 12. Kumar S, Tomar R (2018) The role of artificial intelligence in space exploration. In: 2018 international conference on communication, computing and internet of things (IC3IoT). IEEE, pp 499–503 13. Bansal P, Aggarwal B, Tomar R (2019) Low-voltage multi-input high trans-conductance amplifier using flipped voltage follower and its application in high pass filter. In: 2019 international conference on automation, computational and technology management (ICACTM). IEEE, pp 525–529 14. RST Homepage, https://www.sfu.ca/rst/01intro/intro.html. Last accessed on 24 Apr 2020 15. Pitler E, Raghupathy M, Mehta H, Nenkova A, Lee A, Joshi AK (2008) Easily identifiable discourse relations. In: Technical reports (CIS), pp 884 16. Arauz PL, Faber P (2010) Natural and contextual constraints for domain-specific relations. In: Proceedings of the workshop semantic relations. Theory and applications, pp 12–17 17. Arauz PL, Faber P (2012) Causality in the specialized domain of the environment. In : Semantic relations-II. Enhancing resources and applications workshop programme. Citeseer, p 10 18. Karyawati AE, Winarko E, Azhari A, Harjoko A (2015) Ontology-based why-question analysis using lexico-syntactic patterns. Int J Electr Comput Eng 5(2):318 19. Karyawati AE, Putri LR (2018) Two-step ranking document using the ontology-based causality detection. In: 2018 5th international conference on information technology, computer, and electrical engineering (ICITACEE). IEEE, pp 287–292 20. Lin Z, Kan MY, Ng HT (2009) Recognizing implicit discourse relations in the Penn Discourse Treebank. In: Proceedings of the 2009 conference on empirical methods in natural language processing, pp 343–351 21. Roemmele M, Bejan CA, Gordon AS (2011) Choice of plausible alternatives: an evaluation of commonsense causal reasoning. In: 2011 AAAI spring symposium series 22. Jayaraman S, Choudhury T, Kumar P (2017) Analysis of classification models based on cuisine prediction using machine learning. In: 2017 international conference on smart technologies for smart nation (SmartTechCon), pp 1485–1490 23. Purri S, Choudhury T, Kashyap N, Kumar P (2017) Specialization of IoT applications in health care industries. In: 2017 International conference on big data analytics and computational intelligence (ICBDAC), pp 252–256 24. Oh J, Torisawa K, Kruengkrai C, Ryu IIDA, Kloetzer J (2020) Non-Factoid question answering device. 
U.S. Patent Application No. 16/629,293.
25. Oh JH, Torisawa K, Hashimoto C, Sano M, De Saeger S, Ohtake K (2013) Why-question answering using intra-and inter-sentential causal relations. In: Proceedings of the 51st annual meeting of the association for computational linguistics, vol 1, Long Papers, pp 1733–1743 26. Havasi C, Speer R, Alonso J (2007) Conceptnet: a lexical resource for common sense knowledge. In: Recent advances in natural language processing V: selected papers from RANLP, vol 309, pp 269 27. Moschitti A, Quarteroni S, Basili R, Manandhar S (2007) Exploiting syntactic and shallow semantic kernels for question answer classification. In: Proceedings of the 45th annual meeting of the association of computational linguistics, pp 776–783 28. Feng G, Xiong K, Tang Y, Cui A, Bai J, Li H, Li M (2015) Question classification by approximating semantics. In: Proceedings of the 24th international conference on world wide web, pp 407–417 29. Prabhumoye S, Rai P, Sandhu LS, Priya L, Kamath S (2014) Automated query analysis techniques for semantics based question answering system. In: 2014 international conference on recent trends in information technology. IEEE, pp 1–6 30. Yih SWT, Chang MW, Meek C, Pastusiak A (2013) Question answering using enhanced lexical semantic models 31. Yih WT, He X, Meek C (2014) Semantic parsing for single-relation question answering. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, vol 2: Short Papers, pp 643–648 32. Jansen P, Surdeanu M, Clark P (2014) Discourse complements lexical semantics for nonfactoid answer reranking. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, vol 1: Long Papers, pp 977–986 33. Fried D, Jansen P, Hahn-Powell G, Surdeanu M, Clark P (2015) Higher-order lexical semantic models for non-factoid answer reranking. Trans Assoc Comput Linguist 3:197–210
Chapter 12
Genre-Based Recommendation on Community Cloud Using Apriori Algorithm Manvi Breja and Monika Yadav
1 Introduction Social Network Analysis (SNA) is an emerging area of computer science which is used to understand the connections between different entities. It has its significance in various fields of social media, information retrieval, and cloud computing, etc. Since the number of people interacting in a domain are increasing exponentially, there is a need to visualize connections between them. As the number of people on social media platforms is increasing at an exponential rate, it becomes a necessity to understand the intricacies of connections between them. SNA uses the concept of networks and graph theory to visualize the connections between different nodes. In a network, person, group, or entities are represented by nodes while edges (ties or links) represent their relationships. It is used to perform mathematical analysis to judge how connections are there between the nodes [1]. It has its dimensions in analyzing social behavior in animals [2], finding justifications for information diffusion [3], and analyzing users’ interactions on social media platforms like Twitter [4], Facebook [5], and Instagram, etc. In data mining process, knowledge discovery is one of the crucial stages as it is responsible for pulling out relevant information by digging the available dataset with the help of a computer-assisted process. In order to veil out patterns from the database that might be crucial for the decision making process, various data mining tools are flourishing in the marketplace. These modern tools are extremely fast and M. Breja (B) · M. Yadav Department of Computer Science and Engineering, The NorthCap University, Gurugram, Haryana, India e-mail: [email protected] M. Yadav e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_12
more predictive than the traditional method of knowledge discovery. Data mining is used for finding important information from transactional and analytical systems, for which the Apriori algorithm mines frequent patterns from itemsets.
Cloud computing has seen massive growth, with millions of Internet users shifting their computing infrastructure and services to a virtual machine manager. The virtual machine manager, commonly known as the hypervisor, eliminates the need to purchase, configure, and maintain that infrastructure and those services. Cloud is not limited to networks, storage, servers, and services; it also provides other distinct advantages based on on-demand self-service which can be accessed at any time through the Internet. Nowadays, everyone is adopting cloud as it provides several incentives such as lower operational cost of software and hardware and reduced human effort. Cloud computing and social networking have blended in a variety of ways. Most noticeably, social networks can be hosted on cloud as platform as a service (PaaS), and the host can be made scalable using software as a service (SaaS) within the social networks. Public, private, and hybrid cloud deployment models can be used to host social networking services on cloud.
YouTube is one of the most popular platforms to share and upload videos; the motive of the paper is to visualize the trend and form a network of users having common interests. For this, social network analysis is utilized to visualize the diversity of video genres uploaded by different users and the most popular uploader in a genre. A recommendation algorithm is also proposed to suggest new genres to uploaders that may be of interest to them, calculated from the various genre pairs found by the Apriori algorithm (a minimal sketch of such frequent-pattern mining follows). Based on the outcome of Apriori, a relevant community cloud deployment model is recommended to the user. When several entities having similar requirements come together on a multitenant platform, this type of architecture is known as a community cloud.
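A minimal sketch of such frequent-pattern mining with Apriori is shown below, assuming the mlxtend library; the uploader/genre transactions are invented for illustration and are not the YouTube dataset analyzed in this paper.

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Each row: the set of genres a single uploader has published in (illustrative).
uploads = [
    ["Music", "Gaming"],
    ["Music", "Gaming", "Vlog"],
    ["Education", "Vlog"],
    ["Music", "Vlog"],
    ["Gaming", "Vlog"],
]

encoder = TransactionEncoder()
onehot = pd.DataFrame(encoder.fit(uploads).transform(uploads),
                      columns=encoder.columns_)

# Frequent itemsets (genre combinations) above a minimum support threshold.
frequent = apriori(onehot, min_support=0.4, use_colnames=True)

# Rules such as {Gaming} -> {Vlog} can then back a genre recommendation.
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```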
2 Social Networks Analysis: Properties This section discusses the properties which clarify the concepts of social network analysis. Table 1 discusses the properties which are utilized to determine connectivity and distributions in a network [6, 7].
3 Literature Review In the past few decades, researchers have shifted their focus toward the study of social network analysis. In the early twenty-first century, rapid growth was seen in the study of behavioral sciences through social media analysis. SNA is one of the most researched areas, and ample research has already been done on it, but a few of its areas are still untouched and must
Table 1 Properties of social network analysis 2.1. Properties to determine connectivity (a) Homophily
It determines the probability of one entity to be linked with other entity. Two entities are expected to have similar features and attributes. It is viewed in real life scenario as two human beings are more likely to become friends if they have similar interests and desires
(b) Multiplexity
It measures the degree of connection between two nodes
(c) Network closure or structural holes
It is characterized as clusters of connections with similar interests. For instance, the clusters of people where there is a uniformity of information, new ideas, and their behavior
2.2. Properties to determine distribution (a) Degree centrality
It is used to find out number of nodes to which a particular node is connected in a network, i.e., it determines the neighbors of a particular node. It is calculated as
$C_D = \frac{\sum_{i=1}^{g} [C_D(n^*) - C_D(i)]}{(N-1)(N-2)}$   (1)
where graph G has two components n and i, n: number of nodes, i: number of edges, n*: node having highest value, N: total number of nodes in network, and $C_D$: value of degree centrality
(b) Betweenness centrality
It determines the influential nodes in a network which is involved in information flow and mostly appeared on the shortest path of a network. It is expressed as
$C_B(n) = \sum_{a<b} \frac{g_{ab}(n)}{g_{ab}}$   (2)
… PmUtil + requested capacity of VMcandidate)
14. then VMcandidate is added to VMready list
15. end for
16. if (VMready list is not empty), then
17. for each i to VMready list size
18. probability calculation
19. end for
20. Choose VM randomly according to the probability
21. allocate VM to PM with PmIndex
22. set solution matrix (allocatedVM id, PmIndex) = 1
23. update PmUtil of PM
24. Remove selected VM from VMcandidate list
25. Clear VMready list
26. end while
27. ant solution = solution matrix
28. end for
29. for each ant a {0,m} do
30. best solution = ant solution (min. no. of active PMs)
31. pheromone update
32. end for
33. return the best solution
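A minimal Python sketch of the probability-based VM selection (steps 17–20) and the pheromone update (step 31) that such a max–min ant-system placement loop relies on is given below; the data structures, parameter values, and the simple heuristic are illustrative assumptions rather than the exact AntVMp implementation.

```python
import random

# Illustrative ACO helpers: choose a VM for the current PM with probability
# proportional to pheromone^alpha * heuristic^beta, then reinforce the best
# solution while clamping pheromone to max-min bounds. All constants assumed.
ALPHA, BETA = 1.0, 2.0
TAU_MIN, TAU_MAX = 0.1, 5.0   # max-min ant system pheromone bounds

def choose_vm(ready_vms, pheromone, heuristic):
    """Pick one VM id from the ready list by roulette-wheel selection."""
    weights = [(pheromone[v] ** ALPHA) * (heuristic[v] ** BETA) for v in ready_vms]
    total = sum(weights)
    r, acc = random.uniform(0, total), 0.0
    for vm, w in zip(ready_vms, weights):
        acc += w
        if r <= acc:
            return vm
    return ready_vms[-1]

def update_pheromone(pheromone, best_assignment, evaporation=0.1, deposit=1.0):
    """Evaporate all trails, reinforce the best solution, clamp to [TAU_MIN, TAU_MAX]."""
    for vm in pheromone:
        pheromone[vm] *= (1.0 - evaporation)
    for vm in best_assignment:          # VMs placed in the best solution
        pheromone[vm] += deposit
    for vm in pheromone:
        pheromone[vm] = min(TAU_MAX, max(TAU_MIN, pheromone[vm]))

if __name__ == "__main__":
    ready = [0, 1, 2]
    tau = {v: 1.0 for v in ready}
    eta = {0: 0.5, 1: 0.8, 2: 0.3}      # e.g., how well the VM fills the PM
    picked = choose_vm(ready, tau, eta)
    update_pheromone(tau, [picked])
    print("picked VM", picked, "pheromone:", tau)
```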
4 Simulation Environment Most of the research experiments were done in a simulated environment due to the unavailability of the datacenter. CloudSim [12] provides a simulated environment of the cloud datacenter, where we have performed the experiments. We simulated the cloud environment in CloudSim according to Table 1. For the evaluation of our algorithm, we have taken three cases, heavy workload, average workload, and low workload. The workload was represented by the number of VM requests. In case of a heavy workload, 20 VMs requests were mapped to 10 PMs; in average case, 15
Table 1 Cloud environment
Specification | PM Type I | PM Type II | VM Type I | VM Type II | VM Type III | VM Type IV
Computing capacity | 3720 MIPS | 5320 MIPS | 500 MIPS | 1000 MIPS | 2500 MIPS | 2500 MIPS
Memory | 8192 MB | 8192 MB | 613 MB | 1.7 GB | 1.7 GB | 0.85 GB
Cloud task | One task is assigned to one VM; cloud length 10,000
Table 2 Power model
Utilization (%) | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100
EC (W) | 105 | 112 | 118 | 125 | 131 | 137 | 147 | 153 | 157 | 164 | 169
VMs were mapped to 10 PMs, and 8 VMs had to be mapped in the case of a low workload. We have used the SPEC1 power benchmark for the assessment of EC in the algorithm (Table 2). In Table 2, for each CPU utilization percentage the corresponding energy consumption is given in watts. As can be seen, for 0% CPU utilization the corresponding EC is 105 W, so it is better to shut down those PMs which show 0% CPU utilization.
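A standalone sketch of how such a utilization-to-power profile can be turned into a power estimate by linear interpolation between the 10% steps of Table 2 is given below; it is written in Python for illustration and is not CloudSim's own (Java) power model implementation.

```python
# Table 2 power profile at 0%, 10%, ..., 100% CPU utilization.
POWER_WATTS = [105, 112, 118, 125, 131, 137, 147, 153, 157, 164, 169]

def power_at(utilization: float) -> float:
    """Interpolate PM power draw (W) for a CPU utilization in [0, 1]."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    idx = int(utilization * 10)
    if idx == 10:
        return POWER_WATTS[10]
    lower, upper = POWER_WATTS[idx], POWER_WATTS[idx + 1]
    frac = utilization * 10 - idx
    return lower + (upper - lower) * frac   # piecewise-linear interpolation

if __name__ == "__main__":
    for u in (0.0, 0.35, 0.7, 1.0):
        print(f"utilization {u:4.0%} -> {power_at(u):6.1f} W")
```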
5 Simulation Results and Discussion In earlier research [13], it was found that the CPU utilization of a PM has a linear relationship with EC, so in our research CPU utilization is used to assess the EC in the cloud datacenter. We have performed simulations for all cases and found that AntVMp performed better in comparison to the other heuristic approaches. It can be seen in Fig. 1 that the consumption of energy has been reduced successfully. If a PM is in the idle state, it still consumes about 70% of its peak energy usage [13, 14]. Therefore, AntVMp consolidates VMs onto a minimum number of PMs; as can be observed in Fig. 2, AntVMp significantly reduces the total number of active PMs when compared with the other approaches. A comparative study was also done to assess the performance of AntVMp in contrast to WF and BFD (Table 3). It is found that AntVMp reduced the EC by 53, 32, and 16% in the case of 8, 15, and 20 VMs, respectively, when compared with WF.
1 https://www.spec.org/power_ssj2008/.
Fig. 1 Energy consumption in each algorithm (WF, AntVMp, BFD; 8, 15, and 20 VMs)
Fig. 2 No. of active PMs in each algorithm (WF, AntVMp, BFD; 8, 15, and 20 VMs)
Table 3 EC comparison
Number of VMs | AntVMp versus WF (%) | AntVMp versus BFD (%)
8 | 53 | 20
15 | 32 | 12
20 | 16 | 8
When AntVMp and BFD are compared, then there is an improvement of 20, 11, and 8% in EC in the case of 8, 15, and 20 VMs, respectively.
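The percentage figures reported here and in Table 3 are presumably relative energy reductions of the form

$\text{reduction}(\%) = \frac{EC_{\text{baseline}} - EC_{\text{AntVMp}}}{EC_{\text{baseline}}} \times 100$

with WF or BFD as the baseline; for example, halving the baseline energy consumption would correspond to a 50% reduction.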
6 Conclusion We have performed an offline placement of VMs onto PMs using max–min ant meta-heuristic. It searches the solution randomly from a large domain of solution and arrives at an optimum solution using multiple iterations. The main aim of the proposed research is to find out the applications of meta-heuristic approaches to develop the solution for the VM placement while maintaining the energy consumption. Simulation results demonstrated that AntVMp is found effective for the autonomous management of VMs. As a future direction, we are in the process of pursuing the solution for dynamic consolidation of VMs using ACO and other meta-heuristic-based approaches.
References 1. Dorigo M, Di Caro G, Gambardella LM (1999) Ant algorithms for discrete optimization. Artificia Life 5:137–172, 49, 98 2. Dorigo M, Gambardella l (1997) Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Trans Evol Comput 1(1):53–66 3. Tawfeek MA, El-Sisi AB, Keshk AE, Torkey FA (2014) Virtual machine placement based on ant colony optimization for minimizing resource wastage. In: Hassanien AE, Tolba MF, Taher Azar A (eds) Advanced machine learning technologies and applications. Communications in computer and information science, vol 488. Springer, Cham 4. Xu P, He G, Li Z, Zhang Z (2018) An efficient load balancing algorithm for virtual machine allocation based on ant colony optimization. Int J Distrib Sens Netw 5. Shabeera T, Kumar SM, Salam SM, Krishnan KM (2017) Optimizing vm allocation and data placement for data-intensive applications in cloud using ACO metaheuristic algorithm. Eng Sci Technol Int J 20:616–628 6. Liu X-F, Zhan Z-H, Zhang J (2017) An energy aware unified ant colony system for dynamic virtual machine placement in cloud computing. Energies 10(5):609 7. Gao Y et al (2013) A multi-objective ant colony system algorithm for virtual machine placement in cloud computing. J Comput System Sci 79(8):1230–1242 8. Farahnakian F et al (2015) Using ant colony system to consolidate VMs for green cloud computing. IEEE Trans Serv Comput 8(2):187–198 9. Feller E (2012) Autonomic and energy-efficient management of large-scale virtualized data centers. Distrib Parallel, Cluster Comput Université Rennes 1 10. Wei W, Gu H, Lu W, Zhou T, Liu X (2019) Energy efficient virtual machine placement with an improved ant colony optimization over data center networks. IEEE Access 7:60617–60625 11. Stützle T, Hoos H (1998) Improvements on the ant-system: Introducing the MAX-MIN ant system. In: Artif Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10. 1007/978-3-7091-6492-1_54 12. Calheiros RN, Ranjan R, Beloglazov A, Rose CAFD, Buyya R (2011) CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. J Softw Pract Exp 41:23–50 13. Fan X, Weber WD, Barroso LA (2007) Power provisioning for a warehouse-sized computer. In: Proceedings of the 34th annual international symposium on computer architecture (ISCA 2007), ACM New York, NY, USA, pp 13–23 14. Beloglazov RB (2012) Optimal online deterministic algorithms and adaptive heuristics for energy and performance efficient dynamic consolidation of virtual machines in cloud data centers. Conc Comput 24(13):1397–1420
Chapter 14
CPU Performance Prediction Using Various Regression Algorithms Abdullah Al Sefat, G. M. Rasiqul Islam Rasiq, Nafiul Nawjis, and S. K. Tanzir Mehedi
1 Introduction In order to predict the value of a target variable in a dataset, regression algorithms are used. There are quite a few number of regression algorithms and their performances differ. In this work, we have created a dataset of central processing unit (CPU) components and we predict the performance of the CPUs using five different kinds of regression algorithm, namely, linear regression, gradient boosting regression, support vector machine regression, decision tree regression and random forest regression and compare the coefficient of determination (R 2 ) score and different kinds of errors, such as, mean absolute error, mean squared error and root mean squared error. The result have differed for each regression algorithm and we present a comparative analysis.
A. Al Sefat (B) · N. Nawjis Computer Science and Engineering Department, Ahsanullah University of Science and technology, Dhaka, Bangladesh e-mail: [email protected] N. Nawjis e-mail: [email protected] G. M. Rasiqul Islam Rasiq Computer Science and Engineering Department, Daffodil International University, Dhaka, Bangladesh e-mail: [email protected] S. K. T. Mehedi Information and Communication Technology Department, Mawlana Bhashani Science and Technology, Tangail, Bangladesh e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_14
2 Methodology Our methodology is divided into several steps. In Fig. 1 all the broad steps are outlined. A short description of the total procedure is given as follows.
2.1 Data Acquisition The first and foremost task of this research work was to collect data. We collected data on CPU units that were available for purchase in the market and also OEM processors used in various pre-built systems. CPUs from two manufacturers, Advanced Micro Devices (AMD) and Intel, are available in the market. It is important that we keep the dataset coherent and reasonable, which is why we cannot include processors that are very old and obsolete; doing so would create a skewed dataset [1].
Fig. 1 Flowchart of the complete work plan
The size of our dataset, therefore, is small, with 100 tuples. The following features were collected for every single CPU chip available in market : – Brand: This is a categorical feature. The brand of a CPU is either AMD or Intel. – Core count: The number of physical cores on board the CPU is taken a feature. Many have a misconception that a processor with a higher core count performs better than a processor with a lower core count, but that is not invariably the case. – Clock speed: The clock speed of CPU in Giga Hertz is taken as a feature. The faster the clock, the more instructions the processor can complete per second [2]. Although, a higher clock speed does not directly translate to better performance since, a processor’s performance depends on multiple factors. However, if two processors are taken with the exact same specifications except clock speed, the processor with the higher clock speed will perform better, as it will execute more instructions per unit time. – Cache size: The cache is a very fast memory and holds most frequently used data so that it can be easily accessed and used. This reduces access time and increases efficiency [3]. There are three kinds of cache memory namely L1, L2 and L3. L3 cache is the fastest of cache memory, most expensive and impacts performance the most. The value of L3 cache in Megabytes is taken as a feature. – Simultaneous multithreading: Simultaneous multithreading schedules tasks in a way that one physical core appears as two logical cores and therefore, increases performance. However, the performance gain is not equivalent to an additional physical core. Intel uses a variant simultaneous multithreading called hyperthreading in quite a lot of their desktop grade processors [4]. In the dataset, this feature is a boolean categorical feature, where the value is true or false depending on whether the processor unit supports simultaneous multithreading or not. – Turbo boost: A feature exclusive to the Intel CPU chips is turbo boost. This feature increases the the performance of the processor by increasing the clock speed under load and the CPU operates in a lower speed when it is idle. This advanced feature is very efficient. For out dataset, turbo boost is also a boolean categorical feature, and the value is true or false depending on whether the processor unit supports turbo boost or not. – Unlocked: Clock speed of processors are usually locked to a specific value beyond which the clock speed can not exceed. This measure is taken because processor units heat as clock speed increases and the limit on clock speed prevents processors from overheating, potential damage and reduced longevity. However, high end processor units tend to unlocked, meaning that, they can be overclocked to a higher clock speed than the base clock speed, although usage of a more powerful cooler is recommended if a CPU is overclocked. We have taken this overclocking capability as a boolean categorical feature. – Lithography: We have taken the size of transistors in a CPU in nanometers as a numerical feature. The quantity of transistors in an integrated circuit doubles in value every two years, which is possible due to size of transistors getting smaller [5]. The smaller the transistor size, the more transistors a CPU can accommodate, thus increasing performance.
– Cinebench multi core performance score: We have used cinebench test scores as a performance metric of the processors. Cinebench is an application that measures the performance of a CPU by testing its efficiency and time it takes to render complex 3D images which can be taxing [6]. The quicker a CPU can render the image the higher score is assigned to the CPU. The cinebench test suite provides multi core performance score which indicates how well all of the cores of the processor handles the 3D rendering task. The higher the score the better a CPU will perform in multi threaded tasks. We are using cinebench test score as the target variable. For prediction, we have used supervised learning. In order to perform supervised machine learning algorithms we need a dataset that has values of the target variable and therefore, we have gathered cinebench test scores of CPUs available in the market [7].
2.2 Data Preprocessing and Feature Selection Before the gathered data can be used for prediction by any machine learning regression algorithm, it has to be preprocessed into a format which will yield a better result in prediction. Failing to preprocess data may give low accuracy in prediction and high error. Many machine learning algorithms perform poorly unless the dataset has been standardized, i.e., transformed into a format similar to a normal distribution [8]. We have used scikit-learn for preprocessing our gathered data and applying machine learning algorithms [9]. We have transformed our dataset using the min-max scaler. The min-max scaler transforms the dataset by scaling the features into a given range [10]. We have set the range to be between 0 and 1 [11]. After the dataset has been scaled and transformed, we perform feature selection. Feature selection is the process where a subset of feature variables is selected for training the model; features that do not positively impact the prediction of the model and features which hinder the accuracy of the model are excluded. For feature selection we have used the filter method. In this method of selecting features, we first measured the Pearson's correlation of the feature variables with the target variable. We selected the features that correlate most with the target variable and excluded the ones that do not correlate very well. Figure 2 is a correlation matrix of the dataset: a table where the Pearson's correlation coefficient between the features in our dataset is shown, and each cell of the table shows the correlation between two features. In Pearson's correlation, a value closer to +1 indicates a stronger positive correlation. From Fig. 2 we can see that the target variable Multi core performance score (MCS) has a very poor correlation with the feature variables Brand and Lithography, and we will therefore exclude these features while training our regression models. Cache memory size, core count, clock speed, and unlocked have a good positive correlation with the target, and therefore we will use these four feature variables to train our regression models.
Fig. 2 Correlation matrix of the dataset
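A minimal sketch of the preprocessing and filter-based feature selection described above is given below, assuming pandas and scikit-learn; the tiny DataFrame is a synthetic stand-in whose column names mirror the features discussed earlier, not the authors' 100-CPU dataset.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data; columns mirror the features described above.
data = pd.DataFrame({
    "core_count":  [4, 6, 8, 12, 16],
    "clock_ghz":   [3.6, 3.7, 3.8, 3.8, 3.4],
    "l3_cache_mb": [8, 32, 32, 64, 64],
    "unlocked":    [0, 1, 1, 1, 1],
    "lithography": [14, 7, 7, 7, 7],
    "mcs":         [2300, 9000, 12000, 18000, 23000],   # target variable
})

# Min-max scaling of every column to the [0, 1] range.
scaled = pd.DataFrame(MinMaxScaler(feature_range=(0, 1)).fit_transform(data),
                      columns=data.columns)

# Filter method: keep features whose Pearson correlation with the target
# exceeds a threshold (0.5 here is an arbitrary illustrative cut-off).
corr_with_target = scaled.corr()["mcs"].drop("mcs")
selected = corr_with_target[corr_with_target.abs() > 0.5].index.tolist()
print(corr_with_target.round(2))
print("selected features:", selected)
```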
2.3 Training Regression Models For estimating our target variable multi core performance score(MCS) we have trained five different regression models namely, linear regression, support vector machine regression, decision tree regression, random forest regression and gradient boosting regression. Linear regression model tries to fit the data points on a straight line in an n + 1 dimensional space, n being the number of features. Support vector machine uses a set of points called support vector point and it uses a decision function, which is evaluated by these vector points. Decision tree creates regression models as a tree. This algorithm forms a tree as it creates subset of the dataset. This results in a tree with decision nodes and leaf nodes, called a decision tree. Random forest regression is an ensemble machine learning algorithm. Random forest contains a set of decorrelated decision trees. Random forest takes the decisions of those decision trees and gives the mean as result. This increases the accuracy of the result. Gradient boosting is also an ensemble machine learning algorithm. Ensemble algorithms use multiple models and then combines their results by taking average using weighted voting. If a machine learning algorithm is trained and tested using the same data it would result in a phenomenon known as overfitting. Overfitting causes the regression model to be very precise for the training dataset but however, may perform extremely poor on a novel dataset. In order to avoid overfitting for supervised machine learning algorithms, a subset of the dataset, called the test data set is left out while training the model. We have used k-fold cross validation, which divides the dataset into k parts or folds, k-1 folds are used to train the model and one fold is left as test dataset. This process is done in k iterations, with a different test dataset for each iteration and then the average of accuracy in each iteration is taken. Figure 3 illustrates 5-fold cross validation.
Fig. 3 Visual representation of 5-fold cross validation
2.4 CPU performance Prediction We have used 5-fold cross validation and 7-fold cross validation to train the five different regression models mentioned in the previous section. We have calculated the Coefficient of determination, mean absolute error, mean squared error and root mean squared error for each corresponding model trained. The results of the prediction are discussed in the following section.
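A sketch of how the five regressors can be evaluated with 5-fold and 7-fold cross-validation using scikit-learn is shown below; the synthetic X and y merely stand in for the scaled feature matrix and the multi core performance scores.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((100, 4))            # placeholder for the four selected features
y = 50 + 200 * X @ np.array([3.0, 2.0, 1.5, 0.5]) + rng.normal(0, 5, 100)

models = {
    "linear":            LinearRegression(),
    "svr":               SVR(),
    "decision_tree":     DecisionTreeRegressor(random_state=0),
    "random_forest":     RandomForestRegressor(random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

for k in (5, 7):                    # 5-fold and 7-fold cross-validation
    for name, model in models.items():
        r2 = cross_val_score(model, X, y, cv=k, scoring="r2")
        print(f"{k}-fold {name:18s} mean R^2 = {r2.mean():.3f}")
```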
3 Result Analysis
3.1 Metrics
The R^2 score indicates the accuracy of the model, i.e., how well the model predicts CPU performance, which in our case is the Multi Core Performance Score. We have also calculated various error metrics. The following metrics were used to evaluate the predictions made by our regression models.
– Mean absolute error: Mean of the absolute difference between predicted and actual values. It can be calculated using the following equation, where $y_i$ is the predicted value, $x_i$ is the actual value of the ith sample and n is the population size.
$\mathrm{MAE} = \frac{\sum_{i=1}^{n} |y_i - x_i|}{n}$   (1)
– Mean squared error: Mean squared error measures the mean of the square of the difference between actual and predicted value. It can be calculated using the following equation, where $y_i$ is the predicted value, $x_i$ is the actual value of the ith sample and n is the population size.
$\mathrm{MSE} = \frac{\sum_{i=1}^{n} (y_i - x_i)^2}{n}$   (2)
– Root mean squared error: Root mean squared error measures the square root of the mean of the square of the difference between the actual and predicted value. It can be calculated using the following equation, where $y_i$ is the predicted value, $x_i$ is the actual value of the ith sample and n is the population size.
$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - x_i)^2}{n}}$   (3)
– Coefficient of determination (R^2): The coefficient of determination is a performance metric used to determine how well a machine learning model performs while predicting for future outcomes. It can be calculated using the following equation, where $SS_{reg}$ stands for sum of squared regression and $SS_{tot}$ stands for sum of squared total:
$r^2 = \frac{SS_{reg}}{SS_{tot}}$   (4)
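The following short NumPy sketch implements Eqs. (1)–(3) directly on illustrative values; note that the R^2 helper below uses the common 1 − SS_res/SS_tot formulation rather than the ratio given in Eq. (4).

```python
import numpy as np

def mae(y_pred, y_true):
    return np.mean(np.abs(y_pred - y_true))       # Eq. (1)

def mse(y_pred, y_true):
    return np.mean((y_pred - y_true) ** 2)        # Eq. (2)

def rmse(y_pred, y_true):
    return np.sqrt(mse(y_pred, y_true))           # Eq. (3)

def r2(y_pred, y_true):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot                  # common 1 - SSres/SStot form

# Illustrative values only.
y_true = np.array([2300.0, 9000.0, 12000.0, 18000.0])
y_pred = np.array([2500.0, 8700.0, 12400.0, 17500.0])
print(mae(y_pred, y_true), mse(y_pred, y_true), rmse(y_pred, y_true), r2(y_pred, y_true))
```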
3.2 Results In Table 1 mean absolute error, mean squared error, root mean squared error and r square scores of the different regression models used are given for 5-fold cross validation. We then trained the models again but using 7-fold cross validation and achieved a slightly better result compared to 5-fold cross validation. In Table 2 we can see the mean absolute error, mean squared error, root mean squared error and r square scores of the different regression models used for 7-fold cross validation. From the table it is clear that the models performed better when applied using 7-fold cross validation instead of 5-fold cross validation since the errors have reduced and coefficient of determination score have improved. However, for both 5 and 7 fold cross validation the support vector machine regression performed very poorly, yielding a R 2 score of −0.173 and −0.148 respectively. Support vector machines work well on datasets that have a huge number of features but a comparatively smaller set of training examples and in our case the number of features in the dataset were very low, which most likely resulted in the SVM regressor performing extremely poor.
Table 1 Result using 5-fold cross validation
Model | Mean absolute error | Mean square error | Root mean square error | R^2
Linear regression | 125.57 | 23529.97 | 150.28 | 0.934
Support vector regression | 508.29 | 644593.60 | 776.59 | -0.173
Decision tree regression | 74.86 | 18394.51 | 129.03 | 0.952
Random forest regression | 98.55 | 27963.36 | 159.04 | 0.935
Gradient boosting regression | 78.12 | 18668.57 | 130.19 | 0.957

Table 2 Result using 7-fold cross validation
Model | Mean absolute error | Mean square error | Root mean square error | R^2
Linear regression | 123.43 | 22047.64 | 147.01 | 0.936
Support vector regression | 504.56 | 641750.61 | 768.09 | -0.148
Decision tree regression | 74.38 | 22196.19 | 129.64 | 0.966
Random forest regression | 77.10 | 17546.75 | 119.14 | 0.971
Gradient boosting regression | 64.54 | 14974.76 | 103.42 | 0.978
4 Conclusion In this research work, we tried to perform a novel task of estimating the performance and computing power of a central processing unit. Our task of estimating CPU performance is most likely the first of its kind and as a result, we could not find scholarly articles pertaining this topic and thus resorted to textbooks of Computer Architecture, Microprocessors and articles published by CPU manufacturers about the technologies they have used. From the performance metrices we can see that ensemble models work the best and gradient boost regression has the best scores.
References 1. Anuj Kumar Y, Tomar R, Kumar D, Gupta H (2012) Security and privacy concerns in cloud computing. Int J Adv Res Comput Sci Softw Eng 2(5) 2. Kumar NS, Saravanan M, Jeevananthan S (2011) Microprocessors and microcontrollers. Oxford University Press Inc., USA 3. Stallings W (2002) Computer organization and architecture, 6th edn. Prentice Hall Professional Technical Reference 4. Intel® hyper-threading technology. https://www.intel.com/content/www/us/en/architectureand-technology/hyper-threading/hyper-threading-technology.html 5. Moore’s law—Oxford reference. https://www.oxfordreference.com/view/10.1093/oi/ authority.20110803100208256 6. Cinebench—Wikipedia. https://en.wikipedia.org/wiki/Cinebench 7. Duda RO, Hart PE, Stork DG (2000) Pattern Classif, 2nd edn. Wiley, USA 8. Gron A (2017) Hands-on machine learning with Scikit-learn and TensorFlow: concepts, tools, and techniques to build intelligent systems, 1st edn. O’Reilly Media, Inc 9. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830 10. Min max scaler—Scikit learn. https://scikit-learn.org/stable/modules/generated/sklearn. preprocessing.MinMaxScaler.html 11. Agarwal A, Gupta S, Choudhury T (2018) Continuous and integrated software development using DevOps. In: 2018 International conference on advances in computing and communication engineering (ICACCE), pp 290–293
Chapter 15
IBM Watson: Redefining Artificial Intelligence Through Cognitive Computing Fatima Farid Petiwala, Vinod Kumar Shukla, and Sonali Vyas
1 Introduction IBM Watson, a supercomputer, which was named in honor of, Thomas J. Watson, who is IBM’s founder, was first created as a question answering machine/system which is claimed to be capable of communication in natural language, which only humans can use. It was developed initially for “Jeopardy!” a quiz show where it won first prize worth $1 million against Brad Rutter and Ken Jennings. Blasting off on its very first official debut in February 2011. Apart from the will to be able to conquer the show jeopardy!, IBM looked forward to revolutionize the era of technology by creating an AI supercomputer that would be able to find and analyze solutions even from unstructured data setting it apart from other common systems that could only work on structured data [1, 2]. The goal of creating Watson was never to be able to replicate human brain and its amazing abilities stated David Ferricci, it was to be able to build a system that would be able to communicate with the users in “natural language” however not necessarily entirely in a similar pattern that we humans do [1, 2]. It replicates and surpasses a humans ability to answer questions at the rate of as much as 80 teraflops, in addition to which it has access of over 90 servers with all in all its data combined to be of more than 200 million worth of pages containing all sorts of data, information, and other F. F. Petiwala · V. K. Shukla (B) Amity University, Dubai, UAE e-mail: [email protected] F. F. Petiwala e-mail: [email protected] S. Vyas University of Petroleum and Energy Studies, Dehradun, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_15
details; this mammoth amount of information is processed against 6 million logic rules [3, 4]. Unlike most chatbots, which concentrate on mimicking human interactions and often leave users frustrated due to misunderstandings, Watson goes through several steps of analysis and interpretation, keeping in mind that AI cannot be human but can be the best assistant [3]. Watson is well capable of understanding the requirements of a user in a particular situation, and depending on the context it analyzes when to use its knowledge base to look for an answer, when to ask the user about their requirements for clarity, and when the user should be directed to a human.
1.1 IBM Watson Health Watson is a high-tech cognitive system that uses its cognitive skills and abilities to solve problems. Cognition refers to how any individual perceives the world and acts in it. These abilities are basically brain-based skills that are required by us in order to be able to carry out any simple or complex task at hand. Rather than dealing with actual knowledge, they have a relation with our mechanism of how we as humans do things, learn, think, our problem solving skills, remembering, or even paying attention to details. Take for example, answering the phone involves us hearing the ringtone and recognizing what the next step would be (perception), next step would be making a decision of whether to pick it up or not, followed by our inbuilt motor skills which involves lifting up the phone, then comes talking on the phone and being able to interpret its meaning (language skills), etc. [5].
1.2 Watson's Working We can interact with Watson in many ways, including a mobile app with a voice interface or a virtual assistant in existing social media apps and sites such as Facebook [6]. As shown in Fig. 1, the end user inputs the data to be processed/analyzed via a channel, which could be either voice or text, in order to get a specific output. The assistant receives and processes it, and then asks further questions if the instructions require clarification. General answers are provided to the user to test accuracy and customer satisfaction; it then searches for more data and related results within its database and provides the desired output.
Fig. 1 How Watson works [6, 7]
The data given by the user is taken in by Watson using the dialog skill, which helps interpret what exactly the user's requirements are. Based on its understanding, Watson asks the user more specific questions to understand the exact needs, and based on the information it receives it applies algorithms to solve the given query. Questions that remain unanswered are handed to the search skill, which not only explores the data given by the user but also searches through the entire database and tries to match related keywords in order to find the answer; finally, Watson formulates the output and presents it to the user.
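A deliberately generic sketch of this dialog-skill/search-skill routing is given below; it does not call IBM's actual Watson services, and the intents, documents, and thresholds are invented purely for illustration.

```python
# Generic routing sketch: a dialog skill answers known intents or asks for
# clarification; unanswered questions fall through to a keyword search skill.
KNOWN_INTENTS = {
    "store_hours": "We are open 9am-6pm on weekdays.",
    "return_policy": "Items can be returned within 30 days.",
}
DOCUMENT_STORE = {
    "warranty": "All products carry a one-year limited warranty.",
}

def dialog_skill(user_text: str):
    """Very rough intent match; returns (answer_or_question, matched)."""
    text = user_text.lower()
    hits = [intent for intent in KNOWN_INTENTS if intent.replace("_", " ") in text]
    if len(hits) == 1:
        return KNOWN_INTENTS[hits[0]], True
    if len(hits) > 1:
        return "Could you tell me which of these you mean: " + ", ".join(hits), True
    return None, False

def search_skill(user_text: str) -> str:
    """Fallback keyword search over a document collection."""
    for keyword, passage in DOCUMENT_STORE.items():
        if keyword in user_text.lower():
            return passage
    return "I could not find an answer; let me connect you to a human."

def assistant(user_text: str) -> str:
    answer, matched = dialog_skill(user_text)
    if matched:
        return answer                 # direct answer or a clarifying question
    return search_skill(user_text)    # unanswered questions go to the search skill

if __name__ == "__main__":
    print(assistant("What are your store hours?"))
    print(assistant("Does the warranty cover accidental damage?"))
```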
1.3 Watsons Shortcomings and Main Reasons for the Negative Publicity One of the earliest problems faced by Watson for oncology back in 2012 was that of the M. D. Anderson Cancer Center located at the university of Texas hospital having used Watson for decision making in clinical terms by matching the records of the cancer patients to the data about clinical trials, in order to improve worldwide outcomes. Despite the huge cost of 62 million US dollars, the M. D. Andersons cancer centers Watson backed oncology expert advisor (OEA) unfortunately had to stop before it could even take off following an external audit which had been requested by the council of the university. During this audit, the shocking news that Watson for oncology was unable to integrate with the cancer centers electronic medical recording system due to compatibility issues and hence the research on leukemia and lung cancer was all continued with the help of the Clinic Station system that was already in use before Watson [8, 9]. This point further proves that it is indeed not the technology, perhaps it is the time required by it. A technology this advanced cannot be expected to be implemented properly within a span of few years and requires patience which was absent. Craft stated that the primary cause due to which Watson for oncology department was attracting all the bad comments and reports was due to the fact that the product that was promised by the marketing department of IBM was not something which could possibly be delivered, which is extremely well stated because a technology at this level would definitely require the investment of more time, effort, data, and research and should have probably be kept in the incubation or prep stage for a while longer before bringing it into the market. She further added to her statement saying that due to the same above-mentioned reason IBM had succeeded in disappointing their clients and created further skepticism concerning the integrity and reliability of their most advanced product [9]. IBM personnels have to work in conjunction with the hospital clients in order to be able to ensure that their product Watson would function efficiently in a proper way, leading to contradictions from the commercial version that the clients were hoping for however did not receive the same. Another issue to be highlighted would be that as far as the training of Watson is concerned IBM mainly used data in collaboration with
their partners in development Memorial Sloan Kettering Cancer Center (MSKCC), due to the fact that the AI system had mainly been trained using records from only this hospitals research and patient data the results and analysis would obviously be based on the particular pattern of whatever organization is taken into consideration, hence leading to biased analysis and proposed techniques [8, 9]. According to Perlich, data scientists who dream and aspire to introduce platforms similar to Watson however simpler could take into consideration multiple sources like Microsoft Azure, Data Ninja or even Amazon web services, as these are assumed to be able to perform in a similar way as Watson. However, using Watson would provide the clients with “branding” which the others in comparison fail to do so, hence losing the competition. Despite all the bad publicity, anyone and everyone is more than happy to be able to claim that they have worked with “Watson.” He further added saying “So I think Watson is monetizing primarily on the brand perception.” Leading to the one obvious conclusion that despite all the issues stated all parties are interested in Watson as it offers great branding unlike other competitors [10]. There are also ethical issues relating to Watson’s approach, keeping in mind that one wrong step can be extremely crucial especially when in context to human life. Telling a patient we can cure cancer and showing them fake statistics just to get their hopes up high is wrong [11]. Watson though emerged as the glorious winner at the popular show Jeopardy! In 2011, unlike other platforms working hard in improving their systems and platforms bringing out newer better and more creative versions, IBM has decided to just settle with the current Watson version making no efforts for a newer version, instead of this limited improvement in technology they should rather focus on bigger improvements in their technology to be able to keep up with the advancement in competitors technology as well [12]. According to research data of 2018, IBM’s Watson health division has faced several criticisms on not being able to deliver what was promised in order to be able to use artificial intelligence to provide better personalized medicine and prescriptions in a more enhanced and smarter way [11]. Imagining an AI device can analyze our symptoms and generate detailed reports and even goes as far as to give personalized medicine shots up hope like no other. However, due to the extreme hype of the product, the margin that the expectation levels had reached were sky high—impossible to satisfy, leading to wide spread disappointment. No doubt IBM has delivered a great project and with technology there is always room for growth and improvement, but when it comes to satisfying the demands Watson has scored a 6. With the change in IBM’s Watson health division leadership in 2018 and the announcement for deployment of a business strategy relying on “Hybrid cloud” rather than limiting to either a public or private cloud model only, it was assumed that things would change [11, 13, 14]. However, throughout the previous years in addition to praises and acknowledgment, Watson for Oncology which is focused on study and research of tumors had taken spotlight for all the directed criticism toward Watson health, accused of not being able to meet expectations and even as far as not being able to offer the right advice to the physicians. 
IBM’s Watson-for-oncology is mainly a commercial, cognitive computing cloud assistance platform found and developed by
IBM which allows easy analysis of huge amount of data generated from the patients’ health care and thus in turn based on this data offers cancer treatment options to the physicians. The well-reputed, vice president of the research for the partners healthcare and strategy business, Laura Craft had stated that in the recent third-quarter results that the cognitive computing division of IBM unfortunately did not do as good as it was expected to do [11, 15, 16].
1.4 IBM’s Defense Around the July of 2018, Stat a healthcare news publication organization sent out a report in which it claimed the so-called top secret “internal IBM documents,” which portrayed that the extremely hyped new generation AI technology Watson the supercomputer apparently gave out incorrect, wrong, or misleading advice for cancer treatment and that the customers alongside the medical experts and specialists within the company identified some results as “unsafe treatment techniques recommendations and incorrect diagnosis” alongside IBM promoting its “groundbreaking AI technology” [8]. In the following month of August, the senior vice president at IBM’s department of Cognitive solutions and Research, John Kelly, in one of the online blog’s fired back toward all the negative reports and publicity received and targeted toward Watson reminding the audience that stating that there were no patients and physician benefits at all would be like denying the time and effort it had saved for many specialists and how easy it had become to analyze several volumes of data available after research. As far as the incorrect treatment plans issue is concerned, the vice president at IBM’s department of the external relations had swiftly declined to all the charges mentioned in the report [8].
1.5 Solutions If IBM really wants Watson to be successful as a commercial product, then IBM should not be having analysts within the client base helping build up suitable and appropriate models as required by the client and interfering with whatever the client data is, these models and designs instead of being hidden by IBM should be directly available to the clients without any interference whatsoever. Doing so will help build up trust relationships with their customers. Also, highlighting the fact that any organizations collaboration is based on complete trust, doing the above fails to keep up with that, eventually leading to conflicts. After all the client has paid a hefty sum for the technology not receiving what was promised is unfair and hinders the integrity of the parent company. As for the issue of biased results, Watson should probably be trained further with records and research data and patient records from various hospitals instead of sticking on to just one of them, in turn helping widen the scope of the technology
and wider range of cases to be referred to in future, helping to allow access various distinct and unique cases where the treatment option or technique is often hard to comprehend. In order to prevent disappointment due to overhype, they should focus on presenting it realistically rather than portraying it as a product that it is not. In comparison to competitors, Watson has less features and is more expensive due to the branding which should be worked on. They should try and bring out a better and more improved version with wider range of features in the future. IBM should try and establish associations with organizations from diverse industries in order to be able to gain maximum benefit in terms of extended variety of data and better opportunities for incorporation of new technologies. Considering its current risks in the IT market, IBM should consider diversifying the scope of its business which would help reduce the estimated risks. In order to be able to deal with external competition, IBM should focus on bringing up some new innovative reform process which can improve the performance ability and distinctiveness of its products and help eliminate duplication and extreme competition [17, 18].
2 Watsons Alternative—Comparisons Observing the analytical data in Table 1, we see that Microsoft has a lot more to offer compared to the others, along with the largest number of supported frameworks, followed by Amazon and Google, and lastly IBM. In the context of custom modeling platforms, all of them offer similar features. However, Microsoft ML Studio would be the best option if a drag-and-drop type interface is required [19, 20] (Table 1).
2.1 Amazon ML—Predictive Analysis
Amazon provides its machine learning services at two levels: Amazon ML for predictive analysis, and the SageMaker tool for data scientists. Amazon ML's predictive analysis offers only three options—binary classification, multiclass classification, and regression. The user does not have to learn or understand any machine learning methods, as Amazon automatically analyzes the data and chooses the methods on its own. The high level of automation it provides comes with both pros and cons; nonetheless, if the user requires a fully automated solution, Amazon ML is a good choice, whereas if more customization is required, SageMaker can be considered [19, 21].
Table 1 Machine learning services provided by the above-mentioned vendors for custom predictive analytics tasks

Automated and semi-automated ML services
Service | Amazon (Amazon ML) | Microsoft (Azure ML Studio) | Google (Cloud AutoML) | IBM (Watson ML Model Builder)
Classification | Yes | Yes | Yes | Yes
Regression | Yes | Yes | Yes | Yes
Clustering | Yes | Yes | No | No
Anomaly detection | No | Yes | No | No
Recommendation | No | Yes | Yes | No
Ranking | No | Yes | No | No

Platforms for custom modeling
Feature | Amazon SageMaker | Azure ML Services | Google ML Engine | IBM Watson ML Studio
Built-in algorithms | Yes | No | Yes | Yes
Supported frameworks | TensorFlow, MXNet, Keras, Gluon, PyTorch, Caffe2, Chainer, Torch | TensorFlow, scikit-learn, Microsoft Cognitive Toolkit, Spark ML | TensorFlow, scikit-learn, XGBoost, Keras | TensorFlow, Spark MLlib, scikit-learn, XGBoost, PyTorch, IBM SPSS, PMML
Cloud machine learning services comparison [19]
2.2 Amazon SageMaker
This is a machine learning environment intended to simplify the work of data scientists. It does so by providing tools for quicker model building and deployment; for example, Jupyter, a notebook environment for authoring and running code, allows simplified exploration and analysis of data and takes away the inconvenience of server management. Amazon also provides built-in algorithms designed for large databases and distributed computation. A good thing about SageMaker is that if, at any point, the user does not want to use these pre-made algorithms, they can add their own methods and run them using the deployment features it provides [19].
2.3 Google Cloud AutoML
Another of Google's offerings, reached through a variety of data science tools, is TensorFlow, its open-source machine learning library, which is not an ML-as-a-service product in itself. Combining the Google Cloud services with TensorFlow provides both platform and infrastructure as a service with built-in capabilities [19].
2.4 IBM Watson ML Studio
It provides an environment for beginners and experienced individuals alike, offering users two pathways—manual or automated—aimed at different levels of expertise. Watson Studio, in comparison with Amazon ML and Google Prediction, provides AutoAI services (Table 2).

Table 2 Machine learning APIs from Amazon, Microsoft, Google, and IBM comparison
Feature | Amazon | Microsoft | Google | IBM
Speech recognition (speech into text) | Yes | Yes | Yes | Yes
Text into speech conversion | Yes | Yes | Yes | Yes
Entities extraction | Yes | Yes | Yes | Yes
Key phrase extraction | Yes | Yes | Yes | Yes
Language recognition | 100+ | 120 | 120+ | 60+
Topics extraction | Yes | Yes | Yes | Yes
Spell check | No | Yes | No | No
Auto completion | No | Yes | No | No
Voice verification | Yes | Yes | No | No
Intention analysis | Yes | Yes | Yes | Yes
Metadata extraction | No | No | No | Yes
Relations analysis | No | Yes | No | Yes
Sentiment analysis | Yes | Yes | Yes | Yes
Personality analysis | No | No | No | Yes
Syntax analysis | No | Yes | Yes | Yes
Tagging parts of speech | No | Yes | Yes | No
Filtering inappropriate content | No | Yes | Yes | No
Low-quality audio handling | Yes | Yes | Yes | Yes
Translation in languages | 6 | 60+ | 100 | 21+
Chatbot toolset | Yes | Yes | Yes | Yes
Speech and text processing APIs comparison [19]
Table 3 Image analysis APIs comparison [19]
Feature | Amazon | Microsoft | Google | IBM
Object detection | Yes | Yes | Yes | Yes
Scene detection | Yes | Yes | Yes | No
Face recognition (person face identification) | Yes | Yes | Yes | Yes
Facial analysis | Yes | Yes | No | No
Inappropriate content detection | Yes | Yes | Yes | Yes
Celebrity recognition | Yes | Yes | Yes | No
Text recognition | Yes | Yes | Yes | No
Written text recognition | No | Yes | Yes | No
Search for similar images on web | No | No | Yes | No
Logo detection | No | No | Yes | No
Landmark detection | No | Yes | Yes | No
Food recognition | No | No | No | Yes
Dominant colors detection | No | Yes | Yes | No
From the above feature analysis, it is clear that Microsoft offers the richest set of features, followed by Amazon and Google, with IBM ranking last; however, the most important and critical features are offered by all competitors. The application program interfaces provided by the above-mentioned vendors fall into three main divisions—translation, recognition, and analysis of text; image and video recognition and related analysis; and other, uncategorized services that fit the specific needs of different users [19] (Table 3). The table clearly shows that, in terms of image analysis, the most distinct and adaptable features are provided by Google Cloud, followed by Microsoft, then Amazon and, last again, IBM [19]. It is clearly evident that, in comparison with the other products, the number of "No" entries on IBM's side is heavily weighted, which suggests that for almost the same price plans and packages there are indeed many alternatives that can replace the supercomputer Watson.
3 IBM Watson Products and Services See Table 4.
4 Growth of AI Market
Grand View Research released a report estimating that the global AI market will reach approximately USD 35,870.0 million by the year
Table 4 Products and services provided by IBM Watson [22]

AI assistance: Watson Assistant—a smart AI which provides solutions to user queries depending on the data given to it as input to be analyzed; it is said to provide fast, simple and precise responses [23]
Data: Watson Studio—aims at applying and implementing ML in businesses in order to speed up the incorporation of AI and promote innovation [24]. Watson Machine Learning—lets data scientists and engineers improve the implementation of AI and machine learning; it is a free, extensible model process that allows companies to automate and optimize AI at scale in every domain [25]. Watson Knowledge Catalog—a catalog management platform which businesses can use to bind data and knowledge to the individuals who will work with it [26]
Knowledge: Watson Discovery—an AI technology focused mainly on search and data analysis, providing results for specific user queries by analyzing the relationships between data [27]. Discovery News—provides enriched news content and claims to deliver "smarter news" in real time [28]. Natural Language Understanding—provides advanced text analysis by using deep machine learning to extract metadata from various texts [29]. Knowledge Studio—customers can "teach" Watson how their organization works; it quickly grasps the concepts of the domain it is applied in and studies its linguistic variation [30]
Vision: Visual Recognition—enables search, tagging, classification and extraction of visual data with the help of ML; users can also develop models for detection of specific data/content inside an application using custom parameters [31]
Speech: Speech to Text—converts voice into text by applying ML. Text to Speech—converts text into audio (voice) in diverse languages and tones
Language: Language Translator—translates text from one language to another, offering a wide variety of languages. Natural Language Classifier—helps analyze, mark and organize text data easily
Empathy: Personality Insights—provides detailed information about users' activities, personality, needs, demands and values, based on written text in the form of digital communication. Tone Analyzer—helps gain insights into the user's communication style, emotions, etc.
2025. It has been anticipated that, according to the current growth trends of the AI market, advanced AI technology will be implemented in major projects such as robots, drones, and even self-driven vehicles [32]. With regard to the current development trends, Fig. 2 shows rapid growth in the revenue obtained from the AI market. Comparing the data from 2016 to 2025, the increase is quite significant. Observing the graph further, the year 2025 reaches the peak value of
Fig. 2 Revenue from AI for enterprise applications market worldwide, from 2016 to 2025 (in million US dollars) [33]
revenue. Clearly the field of AI is growing, and at an extremely accelerated pace. With the things that humanity has currently achieved we can only imagine what the future holds for us.
5 Conclusion
From all of the above data, including several other sources, it is by now easy to observe that IBM Watson is not quite what it claims to be. An AI computer? Yes, no doubt; but the best that can be offered? Certainly not. From the analysis and comparison with competitors, we can clearly observe that there are technological systems which offer exactly what Watson does and, in some cases, even more. The discussion about "fame from branding" has thus been concluded to be true. There is no doubt that Watson has contributed a lot to the medical as well as other fields, but the number of inaccuracies, as it is "not there yet", cannot be ignored. However, it is important to mention again that this is technology, and the scope of technology is ever growing. Turning from nothing into a proper system back in 2004, making a grand debut on the show Jeopardy! in 2011, and eventually widening its scope toward AI assistance and Watson for Health, it has definitely grown and developed a lot, and this enhancement cannot be ignored. However, the main question was: is it overhyped? The concluded answer would be yes. It was definitely not what was expected of it; more time needs to be invested, more data from distinct organizations should be collected and, most importantly, the engineers need to pursue one specific goal at a time instead of trying to bring up ten different systems for various fields. As the well-known saying goes, Watson has indeed become the jack of all trades but master of none.
References
1. IBM100 A computer called Watson. IBM, [online]. https://www.ibm.com/ibm/history/ibm100/us/en/icons/watson/
2. Rouse M IBM Watson supercomputer. Search Enterprise AI, [online]. https://searchenterpriseai.techtarget.com/definition/IBM-Watson-supercomputer
3. Noyes K (2016) It's (not) elementary: how Watson works. PCWorld, [online]. https://www.pcworld.com/article/3128401/its-not-elementary-how-watson-works.html (7 Oct 2016)
4. Rastogi A (2018) IBM Watson and its key features. newgenapps (13 Aug 2018)
5. Michelon P (2006) What are cognitive abilities and skills, and how to boost them? SharpBrains, [online]. https://sharpbrains.com/blog/2006/12/18/what-are-cognitive-abilities/ (18 Dec 2006)
6. IBM Redbooks (2012) The era of cognitive systems: an inside look at IBM Watson and how it works. Redbooks, [online]. https://www.redbooks.ibm.com/abstracts/redp4955.html?Open (12 Dec 2012)
7. IBM Cloud Docs About Watson assistance. IBM Cloud Docs, [online]. https://www.cloud.ibm.com/docs/assistant?topic=assistant-index
8. Mearian L (2018) Did IBM overhype Watson health's AI promise? ComputerWorld, [online]. https://www.computerworld.com/article/3321138/did-ibm-put-too-muchstock-in-watson-health-too-soon.html (14 Nov 2018)
9. Strickland E (2019) How IBM Watson overpromised and underdelivered on AI health care. IEEE Spectrum, [online]. https://spectrum.ieee.org/biomedical/diagnostics/how-ibm-watson-overpromised-and-underdelivered-on-ai-health-care (02 Apr 2019)
10. Brown J (2017) Why everyone is hating on IBM Watson—including the people who helped make it. Gizmodo, [online]. https://gizmodo.com/why-everyone-is-hating-on-watson-including-the-people-w-1797510888 (08 Oct 2017)
11. Snapp S (2019) How IBM is distracting from the Watson failure to sell more AI. Bright Work Research, [online]. https://www.brightworkresearch.com/demandplanning/2019/06/how-ibmis-distracting-from-the-watson-failure-to-sell-more-ai/ (13 June 2019)
12. Weebly IBM Watson. Weebly, [online]. https://ibmwatson237.weebly.com/, last accessed 2020/04/19
13. Higbee T (2017) Watch out … you'll be surprised, and then VERY sorry! Seems to be a great product—with limited to no support. TrustRadius, [online]. https://www.trustradius.com/reviews/ibm-bluemix-2017-07-12-00-55-09 (15 July 2017)
14. Jaklevic MC (2017) MD Anderson Cancer Center's IBM Watson project fails, and so did the journalism related to it. Health News Review, [online]. https://www.healthnewsreview.org/2017/02/md-anderson-cancer-centers-ibm-watson-project-fails-journalism-related/ (23 Feb 2017)
15. Lardinois F (2018) IBM launches deep learning as a service inside its Watson Studio. TechCrunch, [online]. https://techcrunch.com/2018/03/19/ibm-launches-deep-learning-as-aservice-inside-its-watson-studio/ (20 Mar 2018)
16. Bloomberg J (2017) Is IBM Watson a 'joke'? Forbes, [online]. https://www.forbes.com/sites/jasonbloomberg/2017/07/02/is-ibm-watson-a-joke/#50433a5ada20 (2 July 2017)
17. Rowland C (2017) IBM SWOT analysis and recommendations. Panmore Institute, [online]. https://panmore.com/ibm-swot-analysis-recommendations (12 June 2017)
18. Hernandez D, Greenwald T (2018) IBM has a Watson dilemma. WSJ, [online]. https://www.wsj.com/articles/ibm-bet-billions-that-watson-could-improve-cancer-treatment-it-hasnt-worked-1533961147 (11 Aug 2018)
19. Altexsoft (2019) Comparing machine learning as a service: Amazon, Microsoft Azure, Google Cloud AI, IBM Watson. Altexsoft, [online]. https://www.altexsoft.com/blog/datascience/comparing-machine-learning-as-a-service-amazon-microsoft-azure-google-cloud-ai-ibm-watson/ (27 Sept 2019)
20. FinancesOnline (2020) IBM Watson analytics alternatives. FinancesOnline, [online]. https://alternatives.financesonline.com/p/ibm-watson-analytics/, last accessed 2020/04/19
21. Mathur N (2018) Predictive analytics with AWS: a quick look at Amazon ML. Packtpub, [online]. https://hub.packtpub.com/predictive-analytics-with-amazon-ml/ (9 Aug 2018)
22. IBM Watson products and services. IBM, [online]. https://www.ibm.com/watson/in-en/products-services/
23. IBM Watson assistant. IBM, [online]. https://www.ibm.com/cloud/watson-assistant/
24. IBM Cloud Watson studio. IBM, [online]. https://cloud.ibm.com/catalog/services/watson-studio#about
25. IBM IBM Watson machine learning. IBM, [online]. https://www.ibm.com/cloud/machine-learning
26. IBM Watson knowledge catalog—overview. IBM Cloud Docs, [online]. https://dataplatform.cloud.ibm.com/docs/content/wsj/catalog/overview-wkc.html
27. IBM Cloud Docs Watson discovery. IBM Cloud Docs, [online]. https://www.ibm.com/cloud/watson-discovery
28. IBM Watson discovery news. IBM Cloud Docs, [online]. https://www.ibm.com/watson/services/discovery-news/
29. IBM Watson natural language understanding. IBM Cloud Docs, [online]. https://www.ibm.com/cloud/watson-natural-language-understanding
30. IBM Cloud Docs Watson knowledge studio. IBM Cloud Docs, [online]. https://www.ibm.com/watson/services/knowledge-studio/
31. IBM Watson visual recognition. IBM Cloud Docs, [online]. https://www.ibm.com/cloud/watson-visual-recognition
32. Vision ZM Market-Opportunities.over-blog.com
33. Columbus L (2018) 10 charts that will change your perspective on artificial intelligence's growth. Forbes, [online]. https://www.forbes.com/sites/louiscolumbus/2018/01/12/10-charts-that-will-change-your-perspective-on-artificial-intelligences-growth/#722ba4ef4758
Chapter 16
An Improved Multi-objective Water Cycle Algorithm to Modify Inconsistent Matrix in Analytic Hierarchy Process Hemant Petwal and Rinkle Rani
1 Introduction
The analytic hierarchy process (AHP) [1] is among the decision-making techniques widely utilized for solving multi-criteria decision-making (MCDM) problems. In AHP, a problem is decomposed into sub-problems, and a hierarchical structure is formed. The hierarchical structure keeps the objective on top and arranges the decision factors as well as the alternatives in descending order below it. A hierarchically decomposed problem facilitates decision-makers in analyzing a problem critically and computing the priorities, which further helps them in sorting, ranking, or selecting the most suitable solution for an MCDM problem. However, various factors, e.g., the number of decision-makers involved, the method utilized for priority assessment, and the consistency evaluation, affect the effectiveness of AHP by influencing the decisions in a pair-wise comparison matrix (PCM). This paper focuses only on PCM consistency measurement. The elements of the PCM are represented as a_ij, which defines the dominance of the ith alternative over the jth alternative. In a PCM, the value of each a_ij lies between 1 and 9 and equals 1/a_ji. Whenever priorities are computed in AHP, consistency measurement of the PCM plays a decisive role in determining the reliability of the outcome [2]. In AHP, a consistency threshold is used for evaluating the consistency of a PCM. If a PCM fails to satisfy the required consistency threshold, it turns into an inconsistent PCM. An inconsistent PCM always produces unreliable outcomes and requires modification so as to achieve satisfactory results. In recent
years, the cosine consistency index (CCI) [3] has been used for measuring the consistency of a PCM. A PCM qualifies the consistency test only if it meets the desired consistency threshold (≥0.90). Although their contribution [3] is useful, they do not focus on improving CCI values. Improving an inconsistent PCM is a crucial task because it is challenging to compare matrix elements; further, it takes long to ask the involved decision-makers repeatedly for revised judgments. These challenges motivated the present study to modify an inconsistent PCM automatically to achieve the optimal CCI threshold. Optimal CCI values represent the PCM that not only satisfies the desired consistency threshold but also has a minimum deviation from the original inconsistent matrix, so as to maintain the original decision.
2 Related Work
In recent years, several methods have been proposed to improve an inconsistent pair-wise comparison matrix [2, 4]. Inconsistency with respect to the cosine consistency index in AHP has been improved using the weighted arithmetic mean (WAM) and the weighted geometric mean (WGM); both approaches find the element having the maximum deviation in the inconsistent PCM [5]. Similarly, AHP and ACO have been combined to improve an inconsistent PCM [6]; the two key issues that the study focused on were the consistency threshold and the deviation of the modified PCM from the original PCM. An induced matrix has been proposed to modify the elements of an inconsistent PCM [7]; the induced matrix aims to identify and change the matrix elements causing the inconsistency, and the method modifies only those elements suspected of making the PCM inconsistent. Also, in recent years, modifying an inconsistent matrix using an intelligent algorithm has been a growing research trend. A genetic algorithm (GA) [8, 9] has been proposed to obtain a consistent PCM automatically, and an automatic repair approach based on particle swarm optimization (PSO) and the Taguchi method has been proposed for modifying inconsistent PCMs [10]. The GA [9] and the PSO [10] focus on two essential aspects: the difference index between the modified PCM and the original PCM, and eigenvalues nearly equal to the number of compared elements. Although these aspects are added to the objective function to minimize the overall index, there is no significant consideration of how to find the matrix with the optimal difference index between the inconsistent and the modified PCM. The present study highlights the fact that a matrix becomes more significant if it satisfies consistency and is also closer to the original PCM. To implement this idea, the concept of the multi-objective water cycle algorithm (MOWCA) [11], one of the intelligent nature-inspired optimization algorithms, is used in this study; however, no research contribution has been found that modifies an inconsistent PCM using MOWCA. WCA is deemed an efficient algorithm for solving different optimization problems, including scheduling [12], the traveling salesman problem (TSP) [13], and clustering [14]. This paper proposes an improved MOWCA algorithm to make an inconsistent matrix optimally consistent.
3 Proposed Multi-objective Algorithm (MOA) This section describes the proposed multi-objective algorithm (MOA) in detail. The algorithm starts with parameter initialization. Subsequently, initial solutions based on the PCM encoding approach [6] are produced. Then, the fitness of each initial solution is computed, and based on the obtained fitness, the best solutions are updated in the population. The population follows this update process iteratively until the condition of termination is reached and finally obtains the optimum solution. A detailed workflow of the proposed MOA is presented in Algorithm 1.
3.1 Objective Function
To modify an inconsistent PCM and obtain an optimally consistent matrix, two objectives are framed in this paper. The first objective is to maximize the CCI value, and the second objective is to minimize the distance between the modified PCM and the original inconsistent PCM so that the initial judgments of the decision-makers can be maintained. The distance (DI) between the two matrices is computed using the Euclidean distance. The objective functions are as follows: Maximize Fitness(x) = F1(x) and Minimize Fitness(x) = F2(x), where

F1(x) → CCI = C* / matrix size,
F2(x) → DI = sqrt( Σ_{i=1}^{n} (InconsistentPCM_i − ModifiedPCM_i)² )    (1)
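As an illustration, the two objective values of a candidate matrix can be computed as in the Python sketch below. The DI term follows Eq. (1) directly; the CCI term follows the cosine maximization idea of [3], where C* is taken as the largest attainable sum of cosine similarities between a unit priority vector and the unit-normalized columns of the PCM. The exact normalization used by the authors may differ, so that part is an assumption, and the function names are only illustrative.

```python
import numpy as np

def cci(pcm):
    """Cosine consistency index CCI = C*/n, with C* taken as the maximum possible
    sum of cosines between a unit priority vector and the unit-normalized columns
    of the PCM (assumption based on the cosine maximization method of [3])."""
    a = np.asarray(pcm, dtype=float)
    cols = a / np.linalg.norm(a, axis=0)        # unit-normalize every column
    c_star = np.linalg.norm(cols.sum(axis=1))   # max of the sum of cosines over unit vectors
    return c_star / a.shape[0]

def di(original_pcm, modified_pcm):
    """Deviation index of Eq. (1): Euclidean distance between the two matrices."""
    diff = np.asarray(original_pcm, float) - np.asarray(modified_pcm, float)
    return float(np.sqrt((diff ** 2).sum()))

# F1 (to be maximized) and F2 (to be minimized) for a candidate modified matrix:
# f1, f2 = cci(modified), di(original, modified)
```

For a perfectly consistent PCM the normalized columns coincide, so C* equals the matrix size and CCI evaluates to 1, which is the expected upper bound of the index.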
3.2 Improved Evaporation Rate
In the proposed MOA, the evaporation process [15] of the existing MOWCA [11] is improvised. The proposed evaporation rate considers two crucial aspects of running water. The first aspect concerns the salt content of running water: the proposed MOA assumes that a river's evaporation rate depends upon the amount of salt it carries, which in turn depends on the volume of its tributary streams. The second aspect concerns infiltration: the volume of running water (surface run-off) is affected by infiltration and decreases as infiltration increases. Moreover, an increase in the salt amount (S) increases the mass (Mass) and density (DI) of the water, resulting in a reduction of the evaporation rate. The process of the improved evaporation rate (EVRimp) is as follows:

S = LB + (UB − LB) · rand,
Mass_i = S_i · Volm_i, where Volm_i = CCI_i / DI_i and DI_i ∝ Mass_i,
i ∈ {Sea, River_1, ..., River_(Numbersr − 1), Stream_1, ..., Stream_Numberstream}    (2)

Here, infiltration is represented by Volm = CCI/DI, and it varies as the position of the updated solution changes. The initial evaporation rate (EVR) and the improved evaporation rate (EVRimp) of each river are computed as

EVRimp_river_i = 1 / Density_river_i,   EVR_river_i = 1 / Density_river_i    (3)

where Density_river_i = DI_river_i + Σ_{s=1}^{Numberstream} DI_stream_s. Each river's evaporation rate is updated as all tributary streams and rivers update their positions with respect to their rivers and the sea, respectively. The improved evaporation rate EVRimp of each river is compared with its previous evaporation rate (EVR); as a result, the rivers with a higher evaporation rate evaporate. The proposed evaporation rate between a river and its streams is computed as

Density_river_i = DI_river_i + 1/EVR_river_i,  EVR_river_i = 1/Density_river_i,  if EVRimp_river_i > EVR_river_i;
EVR_river_i = EVRimp_river_i  otherwise,  where i = 1, 2, 3, ..., Numbersr − 1    (4)
4 Experiments and Results The experiments are carried out in two sub-sections. Section 4.1 focuses on the comparison of the performance among the proposed MOA and existing algorithms. In Sect. 5, a statistical significance test to validate the proposed MOA is presented.
4.1 Performance Comparison Among the Proposed MOA and Existing Algorithms The proposed MOA is assessed on different inconsistent PCMs; i.e. PCM2 [5], PCM11 [6], PCM14 [10], PCM15 [10]. Subsequently, based on the performance, the proposed MOA is compared with existing algorithms. Parameter Setting
In the experiment, the following parameters, i.e., the size of the population (Sizepop), the size of the external Pareto archive (SizeEpa), the total number of rivers and sea (Numbersr), the lower and upper bounds of the sea (Sealb, Seaub), the lower and upper bounds of the rivers (Riverlb, Riverub), and the lower and upper bounds of the streams (Streamlb, Streamub), are initialized as follows: Sizepop is set to 100, SizeEpa is set to 1, Numbersr is set to 8, Sealb, Riverlb and Riverub are set to 0.05, Seaub is set to 3.5, and Streamlb and Streamub are set to 0.01.
Results and Discussion
The experimental evaluations are performed over 50 iterations. Then, based on the obtained outcomes, the proposed MOA is compared with the existing algorithms MOWCA [11], ER-WCA [15] and the approach of [5] to measure the difference in performance. The results are tabulated in Table 1, which indicates that the proposed MOA optimally modifies all the considered inconsistent PCMs as compared to the existing algorithms. The results of Table 1 are also presented in Fig. 1; Fig. 1a, b reveal that the proposed MOA has achieved the highest CCI value and the lowest deviation, respectively, for all modified inconsistent PCMs as compared to the existing algorithms.
5 Statistical Significance Test
For validating the significant improvement in CCI values obtained by the proposed MOA, a paired sample t-test is performed. The paired t-test shows whether the results obtained by the proposed MOA are significant (P-value < 0.05) or not. So, the initial CCI value and the optimized CCI value of the PCMs are taken to implement the t-test. The results, shown in Table 2, demonstrate that the P-value of the conducted
Table 1 Performance comparison among the proposed MOA and the existing algorithms: optimized (CCI, DI, improvement in CCI) obtained for each inconsistent PCM

PCM2 (initial CCI 0.7283): Proposed MOA (0.9229, 13.81, 0.1946); MOWCA (0.9156, 14.76, 0.1873); WCA (0.9101, 15.12, 0.1818); Khatwani and Kar [5] (0.9023, 15.16, 0.1714)
PCM11 (initial CCI 0.8616): Proposed MOA (0.9185, 2.0, 0.0569); MOWCA (0.9074, 2.6, 0.0458); WCA (0.936, 7, 0.07); Khatwani and Kar [5] (0.9166, 12.15, 0.055)
PCM14 (initial CCI 0.7934): Proposed MOA (0.9217, 6.15, 0.1283); MOWCA (0.9026, 6.57, 0.1092); WCA (0.9009, 9.65, 0.1075); Khatwani and Kar [5] (0.9233, 11.88, 0.12993)
PCM15 (initial CCI 0.7767): Proposed MOA (0.9421, 7.02, 0.165); MOWCA (0.9122, 7.34, 0.1355); WCA (0.9092, 9.26, 0.1325); Khatwani and Kar [5] (0.9256, 5.41, 0.1489)
t-test satisfies the level of significance (P-value < 0.05), confirming that the improvement in CCI obtained by the proposed MOA is statistically significant.

6 Conclusion
The proposed MOA modifies an inconsistent PCM into one that satisfies the desired consistency threshold (≥0.90) and has the lowest deviation (distance) from the original PCM, so that the original judgment of the decision-makers can be preserved. For measuring the performance, the proposed MOA is compared with the existing algorithms on different inconsistent PCMs. Further, the proposed MOA is validated using the paired sample t-test. Results suggest that the proposed algorithm produces statistically significant optimal results as compared to existing algorithms.
Fig. 1 a Comparison between the proposed MOA and existing algorithm on different inconsistent matrices based on CCI values. b Comparison between the proposed MOA and existing algorithm on different inconsistent matrices based on DI values
Table 2 Paired sample t-test of the proposed MOA based on inconsistent matrices

Paired values: optimized − initial CCI values
Paired difference: mean 0.137, std. dev. 0.061, std. error 0.031
95% confidence interval of the difference: lower −0.2351, upper −0.0401
t = 4.49, df = 3, P-value = 0.0206 (significant)
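The reported statistics can be reproduced with a standard paired t-test, for instance with SciPy, using the initial and MOA-optimized CCI values of the four PCMs from Table 1; the resulting mean difference (≈0.137), t ≈ 4.5, df = 3 and p ≈ 0.02 agree with Table 2.

```python
from scipy import stats

# Initial and MOA-optimized CCI values for PCM2, PCM11, PCM14 and PCM15 (Table 1).
initial_cci   = [0.7283, 0.8616, 0.7934, 0.7767]
optimized_cci = [0.9229, 0.9185, 0.9217, 0.9421]

t_stat, p_value = stats.ttest_rel(optimized_cci, initial_cci)
mean_diff = sum(o - i for o, i in zip(optimized_cci, initial_cci)) / len(initial_cci)

print(f"mean difference = {mean_diff:.3f}")    # ~0.137
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # ~4.5 and ~0.02 (df = 3), significant at 0.05
```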
The limitation of the proposed work is that it focuses only on modifying precise and complete inconsistent PCMs; it does not address incomplete inconsistent PCMs. Furthermore, a trade-off of the proposed MOA is that, although it is competent and efficient for modifying higher-dimensional PCMs, a large number of iterations is required to do so. In the future, several directions can enhance the proposed work: it can be extended to modify incomplete inconsistent PCMs, and it can also be enhanced to modify an inconsistent PCM while changing the fewest possible elements of the original matrix.
References 1. Saaty TL (1990) How to make a decision: the analytic hierarchy process. Eur J Oper Res 48(1):9–26 2. Lin C, Kou G, Ergu D (2014) A statistical approach to measure the consistency level of the pair-wise comparison matrix. J Oper Res Soc 65(9):1380–1386 3. Kou G, Lin C (2014) A cosine maximization method for the priority vector derivation in AHP. Eur J Oper Res 235(1):225–232 4. Cao D, Leung LC, Law JS (2008) Modifying inconsistent comparison matrix in analytic hierarchy process: a heuristic approach. Decis Support Syst 44(4):944–953 5. Khatwani G, Kar AK (2017) Improving the Cosine Consistency Index for the analytic hierarchy process for solving multi-criteria decision-making problems. Appl Comput Inf 13(2):118–129 6. Girsang AS, Tsai CW, Yang CS (2015) Ant algorithm for modifying an inconsistent pair-wise weighting matrix in an analytic hierarchy process. Neural Comput Appl 26(2):313–327 7. Ergu D, Kou G, Peng Y, Shi Y (2011) A simple method to improve the consistency ratio of the pair-wise comparison matrix in ANP. Eur J Oper Res 213(1):246–259 8. da Serra Costa JF (2011) A genetic algorithm to obtain consistency in analytic hierarchy process. Brazil J Oper Prod Manag 8(1):55–64 9. Lin CC, Wang WC, Yu WD (2008) Improving AHP for construction with an adaptive AHP approach (A3). Autom Constr 17(2):180–187 10. Yang IT, Wang WC, Yang T (2012) Automatic repair of inconsistent pair-wise weighting matrices in analytic hierarchy process. Autom Constr 22:290–297 11. Sadollah A, Eskandar H, Bahreininejad A, Kim JH (2015a) Water cycle algorithm for solving multi-objective optimization problems. Soft Comput 19(9):2587–2603 12. Gao K, Zhang Y, Sadollah A, Lentzakis A, Su R (2017) Jaya, harmony search and water cycle algorithms for solving large-scale real-life urban traffic light scheduling problem. Swarm Evol Comput 37:58–72
13. Osaba E, Del Ser J, Sadollah A, Bilbao MN, Camacho D (2018) A discrete water cycle algorithm for solving the symmetric and asymmetric traveling salesman problem. Appl Soft Comput 71:277–290 14. Qiao S, Zhou Y, Zhou Y, Wang R (2019) A simple water cycle algorithm with percolation operator for clustering analysis. Soft Comput 23(12):4081–4095 15. Sadollah A, Eskandar H, Bahreininejad A, Kim JH (2015b) Water cycle algorithm with evaporation rate for solving constrained and unconstrained optimization problems. Appl Soft Comput 30:58–71
Chapter 17
Comparative Approach of Response Surface Methodology and Particle Swarm Optimization-Artificial Neural Network (PSO-ANN) in Rehydration Ratio Optimization for Bael (Aegle marmelos (L) Correa) Powder Production Tanmay Sarkar, Molla Salauddin, Sudipta Kumar Hazra, and Runu Chakraborty
1 Introduction
Bael (Aegle marmelos (L.) Correa) belongs to the Rutaceae family. It is a subtropical fruit mostly found in Southeast Asia. Besides household consumption, it is also used for medicinal purposes [1]. The colour of the ripe fruit pulp is generally yellow or orange, and it is consumed either fresh or processed into products like marmalade, syrups and jam. Since bael is a seasonal fruit in India, it is mostly available in the months of May–June [2]. As a good source of phytochemicals [3], it has become popular for consumption throughout the year, so the fruit needs to be preserved to remain available. In studying the drying characteristics of bael fruit, the most crucial parameter to be taken into consideration is the rehydration ratio of the dried product [4], because the acceptability of a dried product largely depends on criteria like its flavour, colour and nutritional quality; after reconstitution, these criteria should match the original fruit to a great extent. Drying techniques and process parameters greatly affect the quality of the dried product. Response surface methodology is an approximation method which utilizes statistical techniques to solve complex designs. It involves the design of experiments and response fitting to establish a relationship between variables and responses.
Appropriate response values can be produced by optimization of the variables. The RSM technique has been used previously in the optimization of hot air drying of olive leaves [5] and crops [6]; osmotic drying of potatoes [7], kiwi fruit [8], carrot cubes [9], aonla slices [10], green pepper [11] and red paprika [12]; microwave drying of apple slices and button mushroom [13]; and far-infrared-assisted freeze drying of yam slices [14]. Besides all these advantages, RSM cannot be used in cases where more complex relationships exist between process variables and responses as the number of variables increases. Like RSM, the particle swarm optimization (PSO) method also contributes to the optimization of designs with multiple variables. Kennedy and Eberhart are recognized as the developers of the algorithm [15]. The method is inspired by the swarm behaviour of bird flocking and the movement dynamics of insects and fish. A particle in the algorithm is represented by a bird; in search of the optimal location, the particle traverses a hyper-dimensional space, and the position of each particle is altered to find the best position for itself and also for its neighbours [16]. This study compares RSM and PSO on the rehydration ratio for various drying techniques of bael fruit, namely hot air drying, microwave-assisted drying, freeze drying and sun drying.
2 Materials and Methods
Baels were collected from the Agri-Horticultural Society of India, Kolkata, West Bengal, India. Fresh, ripe and good-quality baels were washed with clean water, and the hard outer shell was then cut open with a knife. Pulp was extracted by screening with mesh sizes varying from 4 to 6. The pulp was blended in an electric blender (Prestige Stylo, Serial no. 9B 4030, 2800 rpm) and then stored in a deep freezer (New Brunswick Scientific, England, C340-86) at a temperature of −180 °C for future use.
2.1 Preparation of Hot Air Dried Bael (HAD)
The pulp was spread over petri dishes (puree load 0.6 g/cm²) and dried in a hot air oven (Concepts International, Kolkata, India) to a final moisture content of 12–15%. A central composite design (CCD) was adopted to plan the drying experiments (Table 1). The three independent variables were drying temperature (60–80 °C), screen opening (4–6 mesh) and homogenization time (30–60 s); preliminary tests were carried out to determine the ranges of the input variables. Samples were collected in triplicate at intervals of 10 min over a period of 5 h, and the moisture content was estimated by the gravimetric method [17].
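The CCD settings of Table 1 can also be generated programmatically; the sketch below assumes a rotatable design (axial distance α ≈ 1.682 for three factors), which reproduces the axial levels 53.18/86.82 °C, 3.32/6.68 mesh and 19.77/70.23 s appearing in Table 1. It is only an illustration of the design, not the authors' software workflow.

```python
import numpy as np

# Coded central composite design for 3 factors: 8 factorial points, 6 axial points
# at +/- alpha and 6 centre runs, giving the 20 runs of Table 1.
alpha = (2 ** 3) ** 0.25                       # rotatable CCD, alpha ~ 1.682
factorial = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])
axial = np.array([[s * alpha if i == j else 0.0 for j in range(3)]
                  for i in range(3) for s in (-1, 1)])
centre = np.zeros((6, 3))
coded = np.vstack([factorial, axial, centre])

# Map coded levels to the actual hot air drying units:
# temperature 70 +/- 10 °C, screen opening 5 +/- 1 mesh, homogenization 45 +/- 15 s.
centres = np.array([70.0, 5.0, 45.0])
half_ranges = np.array([10.0, 1.0, 15.0])
actual = centres + coded * half_ranges
print(np.round(actual, 2))   # reproduces the factor settings listed in Table 1
```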
Table 1 Process variables and response for hot air dried bael powder

Run | Temperature of drying (°C) | Screen opening (mesh) | Homogenization time (s) | Rehydration ratio (RR) | RSM predicted RR | PSO-ANN predicted RR
1 | 70.00 | 5.00 | 45.00 | 3.81 | 3.81 | 3.81
2 | 80.00 | 4.00 | 60.00 | 3.42 | 3.50 | 3.46
3 | 70.00 | 5.00 | 70.23 | 3.90 | 3.88 | 3.91
4 | 80.00 | 4.00 | 30.00 | 3.08 | 3.17 | 3.03
5 | 70.00 | 3.32 | 45.00 | 3.75 | 3.69 | 3.73
6 | 70.00 | 5.00 | 45.00 | 3.81 | 3.81 | 3.81
7 | 70.00 | 5.00 | 45.00 | 3.80 | 3.81 | 3.80
8 | 60.00 | 6.00 | 30.00 | 4.02 | 3.89 | 4.06
9 | 53.18 | 5.00 | 45.00 | 4.48 | 4.68 | 4.58
10 | 70.00 | 6.68 | 45.00 | 3.82 | 3.96 | 3.90
11 | 80.00 | 6.00 | 60.00 | 3.87 | 3.93 | 3.89
12 | 70.00 | 5.00 | 45.00 | 3.81 | 3.81 | 3.81
13 | 80.00 | 6.00 | 30.00 | 3.51 | 3.45 | 3.48
14 | 60.00 | 4.00 | 60.00 | 4.35 | 4.35 | 4.36
15 | 70.00 | 5.00 | 45.00 | 3.81 | 3.81 | 3.81
16 | 70.00 | 5.00 | 45.00 | 3.81 | 3.81 | 3.80
17 | 70.00 | 5.00 | 19.77 | 3.09 | 3.19 | 3.08
18 | 60.00 | 4.00 | 30.00 | 4.11 | 4.00 | 4.10
19 | 86.82 | 5.00 | 45.00 | 3.72 | 3.60 | 3.69
20 | 60.00 | 6.00 | 60.00 | 4.53 | 4.39 | 4.48
2.2 Preparation of Microwave Dried Bael (MWD)
A laboratory microwave dryer (Samsung, Combi CE1031LAT, Mumbai, India) was used for the drying operation. The CCD was adopted to plan the drying (Table 2), where the input process parameters were microwave power level (100–300 W), screen opening (4–6 mesh) and homogenization time (30–60 s), to obtain a final product with 12–15% moisture content.

2.3 Preparation of Freeze Dried Bael (FD)
Freeze drying was performed using a laboratory freeze dryer (FDU 1200, EYELA, Japan) at −40 °C and 0.13 ± 0.02 mbar pressure for 18 h. The CCD was adopted to plan the drying (Table 3). The two independent variables were screen opening (4–6 mesh) and homogenization time (30–60 s).
Table 2 Process variables and response for microwave dried bael powder

Run | Power level (W) | Screen opening (mesh) | Homogenization time (s) | Rehydration ratio (RR) | RSM predicted RR | PSO-ANN predicted RR
1 | 300 | 4.00 | 60.00 | 3.90 | 3.93 | 3.92
2 | 240 | 5.00 | 45.00 | 4.20 | 4.22 | 4.21
3 | 180 | 4.00 | 60.00 | 4.62 | 4.58 | 4.59
4 | 240 | 5.00 | 70.23 | 4.25 | 4.32 | 4.30
5 | 180 | 6.00 | 30.00 | 4.59 | 4.50 | 4.53
6 | 240 | 5.00 | 45.00 | 4.20 | 4.22 | 4.21
7 | 300 | 6.00 | 30.00 | 3.81 | 3.85 | 3.87
8 | 300 | 4.00 | 30.00 | 3.72 | 3.81 | 3.74
9 | 240 | 3.32 | 45.00 | 4.24 | 4.18 | 4.20
10 | 340 | 5.00 | 45.00 | 3.72 | 3.67 | 3.70
11 | 240 | 5.00 | 45.00 | 4.18 | 4.22 | 4.20
12 | 300 | 6.00 | 60.00 | 4.11 | 3.98 | 4.10
13 | 180 | 4.00 | 30.00 | 4.53 | 4.45 | 4.49
14 | 180 | 6.00 | 60.00 | 4.71 | 4.63 | 4.69
15 | 139 | 5.00 | 45.00 | 4.62 | 4.76 | 4.68
16 | 240 | 5.00 | 45.00 | 4.22 | 4.22 | 4.22
17 | 240 | 6.68 | 45.00 | 4.17 | 4.26 | 4.21
18 | 240 | 5.00 | 19.77 | 4.15 | 4.11 | 4.17
19 | 240 | 5.00 | 45.00 | 4.21 | 4.22 | 4.20
20 | 240 | 5.00 | 45.00 | 4.19 | 4.22 | 4.20
2.4 Preparation of Sun Dried Bael (SD)
Bael pulp was poured onto glass petri dishes previously smeared with oil and kept under sunlight in open-air conditions. The CCD was adopted to plan the drying (Table 4), where the two independent variables were screen opening (4–6 mesh) and homogenization time (30–60 s). The ambient temperature was 35 ± 5 °C for the entire drying period (3 days).

2.5 Preparation of Bael Powder
The differently dried products were separately placed in an electric blender (Prestige Stylo, Serial no. 9B 4030, 2800 rpm) and mixed for 30 to 60 s. The resulting powder was collected and stored in a deep freezer (New Brunswick Scientific, England, C340-86) at a temperature of −18 °C for future use.
Table 3 Process variables and response for freeze dried bael powder

Run | Screen opening (mesh) | Homogenization time (s) | Rehydration ratio (RR) | RSM predicted RR | PSO-ANN predicted RR
1 | 6.00 | 60.00 | 5.52 | 5.34 | 5.50
2 | 4.00 | 60.00 | 5.01 | 4.92 | 5.08
3 | 5.00 | 23.79 | 4.71 | 4.79 | 4.72
4 | 6.41 | 45.00 | 5.09 | 5.29 | 5.10
5 | 5.00 | 45.00 | 5.06 | 4.99 | 5.05
6 | 6.00 | 30.00 | 5.10 | 5.06 | 5.08
7 | 5.00 | 66.21 | 4.90 | 5.19 | 5.00
8 | 4.00 | 30.00 | 4.59 | 4.64 | 4.55
9 | 5.00 | 45.00 | 5.05 | 4.99 | 5.05
10 | 5.00 | 45.00 | 5.05 | 4.99 | 5.05
11 | 5.00 | 45.00 | 5.06 | 4.99 | 5.05
12 | 5.00 | 45.00 | 5.04 | 4.99 | 5.05
13 | 3.59 | 45.00 | 4.62 | 4.69 | 4.60
14 | 5.00 | 45.00 | 5.07 | 4.99 | 5.05
Table 4 Process variables and response for sun dried bael powder

Run | Screen opening (mesh) | Homogenization time (s) | Rehydration ratio (RR) | RSM predicted RR | PSO-ANN predicted RR
1 | 5.00 | 45.00 | 2.83 | 2.83 | 2.83
2 | 5.00 | 45.00 | 2.81 | 2.83 | 2.80
3 | 5.00 | 45.00 | 2.85 | 2.83 | 2.86
4 | 5.00 | 45.00 | 2.80 | 2.83 | 2.80
5 | 5.00 | 45.00 | 2.80 | 2.83 | 2.80
6 | 5.00 | 45.00 | 2.89 | 2.83 | 2.86
7 | 6.00 | 60.00 | 3.39 | 3.39 | 3.39
8 | 6.00 | 30.00 | 2.99 | 3.00 | 3.00
9 | 5.00 | 23.79 | 2.60 | 2.63 | 2.60
10 | 5.00 | 66.21 | 3.36 | 3.40 | 3.35
11 | 3.59 | 45.00 | 2.73 | 2.83 | 2.80
12 | 6.41 | 45.00 | 3.30 | 3.28 | 3.30
13 | 4.00 | 30.00 | 2.61 | 2.53 | 2.60
14 | 4.00 | 60.00 | 3.30 | 3.22 | 3.30
2.6 Rehydration Ratio
Rehydration ratio (RR) was calculated following the method referred by Adiletta et al. [18] with little modification: 1 g of bael powder was dissolved in 10 g of water at a temperature of 60 °C, with continuous stirring for 20 min. The following equation was used in the calculation of RR:

RR = M_Rh / M    (1)

where M_Rh = mass of bael powder after rehydration (g) and M = mass of bael powder before rehydration (g).
2.7 Response Surface Methodology (RSM) Optimization
Though all the investigations were done in triplicate, they were documented as an average. All the observations were analysed using Design-Expert software version 7.0.0 (Stat-Ease Inc., Minneapolis, USA) considering a central composite design model. The mathematical optimization of bael powder processing was intended to find the process inputs that maximize RR.
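Outside the Design-Expert environment, the response-fitting step of RSM can be sketched in Python by fitting a full second-order polynomial to the CCD runs; scikit-learn is used here only as a convenient least-squares tool and is not the software employed in this study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

def fit_response_surface(X, y):
    """Fit a full quadratic RSM model (linear, interaction and squared terms).

    X: array of shape (n_runs, n_factors) holding the CCD settings, e.g. the
       temperature / screen opening / homogenization time columns of Table 1.
    y: measured response for each run (here, the rehydration ratio).
    """
    poly = PolynomialFeatures(degree=2, include_bias=False)
    model = LinearRegression().fit(poly.fit_transform(np.asarray(X, float)), y)
    return poly, model

# Usage: poly, model = fit_response_surface(X_runs, rr_values)
#        model.predict(poly.transform([[60, 6, 60]]))  -> predicted RR at a new setting
```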
2.8 Particle Swarm Optimization-Artificial Neural Network (PSO-ANN)
In the proposed model, the weight and bias parameters of the ANN are optimized using the PSO technique. For PSO, the position (x_b) and velocity (v_b) of each particle are represented as:

x_b = (x_b1, x_b2, x_b3, ..., x_bn) for b = 1, 2, 3, ..., n    (2)
v_b = (v_b1, v_b2, v_b3, ..., v_bn) for b = 1, 2, 3, ..., n    (3)

For each iteration, the velocity (v_b) and position (x_b) of the bth particle are updated at the (i + 1)th step according to:

v_b(i + 1) = w · v_b(i) + A1 · r1 · (Plbest − x_b(i)) + A2 · r2 · (Pgbest − x_b(i))    (4)
x_b(i + 1) = x_b(i) + v_b(i + 1)    (5)
where A1 and A2 denote the acceleration coefficients; r1 and r2 are random numbers generated by the system; w is the inertia weight; Plbest stands for the local best position of the particle, whereas Pgbest represents the global best position found by the best particle in the population [19].
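For illustration, Eqs. (4) and (5) translate directly into a single update step in Python; the inertia weight and acceleration coefficients used below are common default assumptions, since their actual values are not reported in this chapter.

```python
import numpy as np

def pso_step(x, v, p_lbest, p_gbest, w=0.7, a1=2.0, a2=2.0, rng=np.random.default_rng()):
    """One PSO iteration following Eqs. (4)-(5).

    x, v        : (n_particles, n_dims) position and velocity arrays
    p_lbest     : each particle's best position so far (same shape as x)
    p_gbest     : best position found by the whole swarm (broadcastable to x)
    """
    r1 = rng.random(x.shape)          # random numbers r1, r2 in [0, 1)
    r2 = rng.random(x.shape)
    v_new = w * v + a1 * r1 * (p_lbest - x) + a2 * r2 * (p_gbest - x)   # Eq. (4)
    x_new = x + v_new                                                    # Eq. (5)
    return x_new, v_new
```

In the PSO-ANN scheme described above, each particle's position vector would hold the candidate ANN weights and biases, and the fitness guiding Plbest and Pgbest would be the network's prediction error for RR.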
3 Results and Discussion
The predictive capability of a statistical approach, response surface methodology (RSM), and an artificial intelligence method, particle swarm optimization (PSO), was established for optimization of the RR of bael powder produced by different state-of-the-art drying methods. The highest RR value was observed for bael powder processed under FD (5.52), whereas the RR for MWD ranged from 3.72 to 4.71, with the maximum of 4.71 for the bael powder produced at 180 W, a screen opening of 6 mesh and a homogenization time of 60 s (Table 2). Similarly, for HAD and SD, RR ranged from 3.08 to 4.53 and from 2.6 to 3.39, respectively. The RR value attained its maximum for HAD at 60 °C, a screen opening of 6 mesh and a homogenization time of 60 s (Table 1); on the other hand, RR was found to be highest for SD at a screen opening of 6 mesh and a homogenization time of 60 s (Table 4). The ANOVA indicates that the hot air oven temperature, screen opening mesh size, homogenization time, and the quadratic effects of oven temperature and homogenization time significantly (p < 0.05) affect the RR of bael powder (Table 5); the significance of the model is substantiated by an F-value of 17.83, with only a 0.01% probability that such a model F-value could occur due to noise for the HAD model. Similarly, the linear effects of microwave power level and homogenization time significantly (p < 0.05) alter the RR of MWD bael powder (Table 5); the significance of the model is substantiated by an F-value of 85.96, with only a 0.01% probability of failure due to noise for the MWD model. The linear effects of screen opening and homogenization time significantly (p < 0.05) affect the RR of the FD sample; the significance of the model is substantiated by an F-value of 13.76, with only a 0.1% probability of failure due to noise. The linear effects of screen opening and homogenization time, their interaction, and the quadratic effects of both screen opening and homogenization time significantly (p < 0.05) affect the RR of the SD sample; the significance of the model is substantiated by an F-value of 48.39, with only a 0.01% probability of failure due to noise. The regression models relating the RR of the product to the processing factors obtained for HAD, MWD, FD and SD satisfied the lack-of-fit test (p > 0.05) and yielded R² values of 0.9413 (HAD), 0.9416 (MWD), 0.7144 (FD) and 0.9676 (SD) (Table 6). The fitted models revealed that RR increased with parameters such as oven temperature and screen opening for HAD, with the linear effect of screen opening for MWD and FD, and with both the linear and quadratic effects of screen opening for SD.
Table 6 Regression models concerning rehydration ratio and independent variables for bael powder processing

Hot air drying (response: rehydration ratio): RR = 13.99616 − 0.24488 × Temperature − 0.78594 × Screen opening + 0.042950 × Homogenization time + 9.875 × 10^-3 × Temperature × Screen opening − 4.16667 × 10^-5 × Temperature × Homogenization time + 2.41667 × 10^-3 × Screen opening × Homogenization time + 1.17928 × 10^-3 × Temperature² + 6.55844 × 10^-3 × Screen opening² − 4.26542 × 10^-4 × Homogenization time²; R² = 0.9413, adjusted R² = 0.8886
Microwave drying (response: rehydration ratio): RR = 5.20248 − 5.39853 × 10^-3 × Power level + 0.024330 × Screen opening + 4.18925 × 10^-3 × Homogenization time; R² = 0.9416, adjusted R² = 0.9306
Freeze drying (response: rehydration ratio): RR = 3.52153 + 0.21059 × Screen opening + 9.23917 × 10^-3 × Homogenization time; R² = 0.7144, adjusted R² = 0.6624
Sun drying (response: rehydration ratio): RR = 3.75814 − 0.73549 × Screen opening + 4.70669 × 10^-3 × Homogenization time − 4.83333 × 10^-3 × Screen opening × Homogenization time + 0.11125 × Screen opening² + 4.16667 × 10^-4 × Homogenization time²; R² = 0.9676, adjusted R² = 0.948
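As a quick numerical check, the fitted hot air drying model of Table 6 can be evaluated directly in Python; at 60 °C, a screen opening of 6 mesh and 60 s of homogenization it returns roughly 4.39, which matches the RSM-predicted RR of run 20 in Table 1. The function name is only illustrative.

```python
def rr_had(temp, screen, time):
    """Fitted RSM model for hot-air-dried bael powder (coefficients from Table 6)."""
    return (13.99616
            - 0.24488 * temp
            - 0.78594 * screen
            + 0.042950 * time
            + 9.875e-3 * temp * screen
            - 4.16667e-5 * temp * time
            + 2.41667e-3 * screen * time
            + 1.17928e-3 * temp ** 2
            + 6.55844e-3 * screen ** 2
            - 4.26542e-4 * time ** 2)

print(round(rr_had(60, 6, 60), 2))   # ~4.39, the RSM-predicted RR at the HAD optimum
```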
Table 5 Effect of process factors on RR of bael powder

Drying method | Model F-value | Model P-value
HAD | 17.83 | 0.0001
MWD | 85.96 | 0.0001
FD | 13.76 | 0.0010
SD | 48.39 | 0.0001
A linear effect of the hot air oven temperature (°C), B linear effect of screen opening, C linear effect of homogenization time (s), AB interaction of oven temperature and screen opening, AC interaction of the hot air oven temperature (°C) and homogenization time (s), BC interaction of screen opening and homogenization time (s), A² quadratic effect of hot air oven temperature (°C), B² quadratic effect of screen opening, C² quadratic effect of homogenization time (s). A′ linear effect of microwave power level (W), B′ linear effect of screen opening, C′ linear effect of homogenization time (s). For FD and SD: A linear effect of screen opening, B linear effect of homogenization time (s), AB interaction of screen opening and homogenization time, A² quadratic effect of screen opening, B² quadratic effect of homogenization time
The dried tissues absorb water during the process of rehydration and, owing to the natural springiness of the cell walls, try to return to their original structure. Microwave radiation, however, disrupts the cellular arrangements, which may be why a lower RR was observed for MWD bael powder; analogous interpretations were reported by Argyropoulos et al. [20] for the rehydration of dried mushroom slices. On the other hand, intercellular void spaces are created by the penetration of microwave radiation, and moisture may enter through these capillary spaces, which may be responsible for the higher RR of MWD than of HAD and SD bael powder; Ozcan et al. [21] observed comparable results for kumquat drying. Based on Tables 1, 2, 3 and 4, the optimized process inputs achieved by RSM and PSO-ANN were virtually comparable. With respect to predictive competence, RSM predicted RR values of 4.39 (HAD), 4.63 (MWD), 5.34 (FD) and 3.39 (SD), whereas the RR values projected through PSO-ANN were 4.48 (HAD), 4.69 (MWD), 5.50 (FD) and 3.39 (SD). The errors in prediction for the RSM and PSO-ANN techniques at the optimal production conditions were 30.09% (HAD), 0.72% (MWD), 3.26% (FD) and 0.00% (SD), and 1.10% (HAD), 0.24% (MWD), 0.36% (FD) and 0.00% (SD), respectively. An R² value of ≥0.75 is considered to indicate a robust model [22]. PSO-ANN fits the experimental results with a better correlation than the RSM modelling. It can be seen from Table 7 that both the RSM and the PSO-ANN approaches built efficient models, though the predictive performance of PSO-ANN was more consistent than that of RSM for all the drying methods employed. The convergence criteria were met in fewer than 100 iterations for all the drying methods (Fig. 1). The elapsed times to reach convergence for HAD, MWD, FD and SD were 40.443038, 1.089084, 1.093933 and 1.520142 s, respectively; owing to the complex quadratic interactions among the model parameters, a relatively complex model was derived for HAD, leading to the higher convergence time. The RSM model does not fit FD well, whereas the PSO-ANN methodology produced a correlation coefficient close to unity (0.9751).

Validation of Models
The aim of the optimization is to minimize the energy cost of bael powder production, lower the material cost and maximize the RR for better reconstitution of the product. Further experimental observations were carried out to validate the predicted process parameters. The RR at the projected process conditions was 4.5 for HAD, 4.7 for MWD, 5.51 for FD and 3.39 for SD through both RSM and PSO-ANN modelling. All the experimental RR values were in the near vicinity of the projected results, supporting the validity of the anticipated models.

Table 7 Comparison of optimization efficiencies of rehydration ratio between RSM and PSO-ANN
Model correlation coefficient | RSM | PSO-ANN
R² value for HAD | 0.9413 | 0.9915
R² value for MWD | 0.9416 | 0.9878
R² value for SD | 0.9676 | 0.9937
R² value for FD | 0.7144 | 0.9751
Fig. 1 PSO convergence characteristics curves. a Hot air drying. b Microwave drying. c Freeze drying. d Sun drying
The percentage errors for the RSM and PSO-ANN models were 0.44% (HAD), 0.21% (MWD), 0.18% (FD) and 0.00% (SD), and 2.44% (HAD), 1.49% (MWD), 3.08% (FD) and 0.00% (SD), respectively, for the validation data matrix, which exemplifies the proficiency of PSO-ANN in predicting the precise process variables for optimum RR for the different drying methods, in comparison with RSM.
4 Conclusion Virtually, both the RSM and PSO-ANN predict the RR value for all the differently processed bael powder. The highest RR value for HAD was attained at 60 °C oven temperature, screen opening of 6 mesh and 60 s of homogenization time. The same for MWD was reached at 180 W power level, screen opening of 6 mesh and 60 s of homogenization time. Screen opening of 6 mesh and 60 s of homogenization time were found to be the optimum process condition for attainment of maximum RR value for FD and SD. Though RSM model could not predict FD with desired
accuracy (R² = 0.7144), the PSO-ANN tool indeed appeared to build a more robust model (R² = 0.9751). The wet-lab investigation and the predicted results show that the PSO-ANN technique is an unconventional practice for obtaining the highest RR for differently dried products. Certainly, PSO-ANN delivers rapid identification of the optimal process variables for bael powder production, in comparison with a traditional practice like RSM.
References
1. Sarkar T, Salauddin M, Chakraborty R (2020) In-depth pharmacological and nutritional properties of bael (Aegle marmelos): a critical review. J Agri Food Res 2:100081. https://doi.org/10.1016/j.jafr.2020.100081
2. Hazra S, Salauddin M, Sarkar T, Roy A, Chakraborty R (2020) Effect of freeze drying on antioxidant properties of bael fruit (Agle marmelos). In: Biotechnology and biological sciences—proceedings of the 3rd international conference of biotechnology and biological sciences, BIOSPECTRUM 2019. https://doi.org/10.1201/9781003001614-71
3. Sarkar T, Nayak P, Salauddin M, Hazra S, Chakraborty R (2020) Characterization of phytochemicals, minerals and in vitro medicinal activities of bael (Aegle marmelos L.) pulp and differently dried edible leathers. Heliyon 6(10):e05382. https://doi.org/10.1016/j.heliyon.2020.e05382
4. Sarkar T, Nayak P, Chakraborty R (2020) Storage study of mango leather in sustainable packaging condition. Mater Today Proc 22:2001–2007. https://doi.org/10.1016/j.matpr.2020.03.177
5. Erbay Z, Icier F (2009) Optimization of hot air drying of olive leaves using response surface methodology. J Food Eng 91(4):533–541
6. Kumar D, Prasad S, Murthy GS (2014) Optimization of microwave-assisted hot air drying conditions of okra using response surface methodology. J Food Sci Technol 51(2):221–232
7. Sarkar T, Salauddin M, Hazra S, Chakraborty R (2020) Comparative study of predictability of response surface methodology (RSM) and artificial neural network-particle swarm optimization (ANN-PSO) for total colour difference of pineapple fortified rasgulla processing. Int J Intelligent Net 1:17–31. https://doi.org/10.1016/j.ijin.2020.06.001
8. Cao H, Zhang M, Mujumdar AS, Du WH, Sun JC (2006) Optimization of osmotic dehydration of kiwifruit. Drying Technol 24(1):89–94
9. Singh B, Panesar PS, Gupta A, Kennedy JF (2007) Optimization of osmotic dehydration of carrot cubes in sucrose-salt solutions using response surface methodology. Eur Food Res Technol 225(2):157–165
10. Alam MS, Amarjit S, Sawhney B (2010) Response surface optimization of osmotic dehydration process for aonla slices. J Food Sci Technol 47(1):47–54
11. Ozdemir M, Ozen BF, Dock LL, Floros JD (2008) Optimization of osmotic dehydration of diced green peppers by response surface methodology. LWT-Food Sci Technol 41(10):2044–2050
12. Ade-Omowaye B, Rastogi N, Angersbach A, Knorr D (2002) Osmotic dehydration behavior of red paprika (Capsicum annuum L.). J Food Sci 67(5):1790–1796
13. Han Q-H, Yin L-J, Li S-J, Yang B-N, Ma J-W (2010) Optimization of process parameters for microwave vacuum drying of apple slices using response surface method. Drying Technol 28(4):523–532
14. Lin Y-P, Lee T-Y, Tsen J-H, King VA-E (2007) Dehydration of yam slices using FIR-assisted freeze drying. J Food Eng 79(4):1295–1301
15. Eberhart RC, Shi Y, Kennedy J (2001) Swarm intelligence. Elsevier
16. Du KL, Swamy MNS (2016) Particle swarm optimization. In: Search and optimization by metaheuristics. Springer International Publishing, pp 153–173
17. Sarkar T, Salauddin M, Hazra S, Chakraborty R (2020) Effect of cutting edge drying technology on the physicochemical and bioactive components of mango (Langra variety) leather. J Agri Food Res 2:100074. https://doi.org/10.1016/j.jafr.2020.100074 18. Adiletta G, Wijerathne C, Senadeera W, Russo P, Crescitelli A, Matteo MD (2018) Dehydration and rehydration characteristics of pretreated pumpkin slices. Ital J Food Sci 30:684–706 19. Beheshti Z, Shamsuddin SMH, Hasan S (2013) MPSO: Median-oriented particle swarm optimization. Appl Math Comput 219(11):5817–5836 20. Argyropoulos D, Heindl A, Müller J (2011) Assessment of convection, hot-air combined with microwave vacuum and freeze-drying methods for mushrooms with regard to product quality. Int J Food Sci Technol 46:333–342. https://doi.org/10.1111/j.1365-2621.2010.02500.x 21. Ozcan SG, Özkan KA, Tamer C, Copur OU (2018) The effect of hot air, vacuum and microwave drying on drying characteristics, rehydration capacity, color, total phenolic content and antioxidant capacity of Kumquat (Citrus japonica). Food Sci Technol (Campinas). https://doi.org/ 10.1590/fst.34417 22. Sarkar T, Salauddin M, Choudhury T, Um JS, Pati S, Hazra S, Chakraborty R (2020) Spatial Optimisation of mango leather production and colour estimation through conventional and novel digital image analysis technique. Spat Inf Res. https://doi.org/10.1007/s41324-020-003 77-z
Chapter 18
Load Balancing with the Help of Round Robin and Shortest Job First Scheduling Algorithm in Cloud Computing Shubham Kumar and Ankur Dumka
1 Introduction Load balancing is the science and practice of distributing load units over a set of processors that are linked by a network and may be spread across the globe. In any system it is possible that some nodes carry less load than others; in such a case, the processors can be classified according to their present load. A node's load falls into three categories: a heavily loaded node has a sufficient number of jobs waiting for execution, a lightly loaded node has fewer jobs waiting, and an idle or empty processor has no jobs to execute [1]. Scheduling is the mechanism that controls the execution of the processes performed by a computer. Load balancing in cloud computing assigns the total load over the nodes of the shared system so as to make resource usage effective and to improve the response time of the jobs. In this work, we use the shortest job first (SJF) and Round Robin (RR) scheduling algorithms. Cloud computing is an emerging methodology that supports the processing of large amounts of data, and scheduling is the means by which it provides good performance and better utilization of resources. Scheduling therefore plays a major role in cloud computing for improving the management of resources: service providers offer resources to clients in a seemingly unlimited manner, and all of this is handled with the help of scheduling algorithms [2].
S. Kumar Graphic Era Deemed to be University, Dehradun, India e-mail: [email protected] A. Dumka (B) Women Institute of Technology, Dehradun, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_19
A natural question is why scheduling is needed at all instead of simply providing a resource whenever a demand comes up. Some users purchase cloud services with high priority; in such a case the service provider grants the resource to that user while the others wait, but in many scenarios waiting is not acceptable, and this is why scheduling algorithms are used [3, 4]. Scheduling is a core part of cloud computing: job or workflow scheduling operates at the platform-as-a-service level, while infrastructure-as-a-service relies on virtual machine scheduling, where the scheduler decides which job should go on which virtual machine. Round Robin is a time-quantum-based preemptive algorithm, and SJF is a burst-time-based algorithm that may be preemptive or non-preemptive. RR works in two phases: the expected finish time is calculated for each process [5], and the process with the minimum total finish time is scheduled, so the smallest task no longer has to wait for a processor. SJF works in a similar fashion for both high and low bandwidth. Load balancing can be made more efficient and dynamic by making some changes in the previously implemented RR and SJF algorithms [6]; here the two algorithms are merged to reduce the response time and the average waiting time, which in turn reduces the operational cost and increases resource utilization. The main design issues of load balancing are the following:
i. Load Calculation Policy: a critical task in designing a load balancing algorithm is choosing the method used to calculate the workload of a particular node. Estimating the workload of a particular node is a difficult problem for which no totally satisfactory solution exists; a node's workload can be estimated from measurable parameters such as the total number of processes on the node and the resource demands of these processes [7, 2].
ii. Process Forwarding Policy: used to forward processes from heavily loaded nodes to lightly loaded nodes. Some algorithms use a threshold policy to decide whether a node is heavily or lightly loaded; the threshold value limits the workload of a node and can be determined statically (predefined) or dynamically (calculated). Below the threshold value a node accepts processes for execution, and above it the node transfers processes to a lightly loaded node [8].
iii. Load Location Policy: defines the node on which a process is to be executed.
iv. State Information Exchange Policy: used for the exchange of state information among the nodes. Periodic broadcast gives all nodes the information after every T units of time, which causes heavy traffic and poor scalability; broadcast-on-state-change broadcasts when a process arrives at or departs from a node; on-demand exchange broadcasts only when a node shifts to the overloaded or underloaded state; and polling-based exchange searches for a partner node by polling other nodes one by one until a poll limit is reached.
v. Priority Assignment Policy: determines whether local or remote processes are executed first.
vi. Migration Policy: determines the number of times a process can be migrated. An uncontrolled process can be migrated any number of times, whereas the number of migrations of a controlled process is determined by predefined factors.
The Scheduling Performance Criteria
(a) CPU Utilization: the extent to which the CPU is kept busy with working processes; it should be high for superior performance [9].
(b) Burst Time: the time for which a process controls the CPU continuously until its completion.
(c) Landing or Arrival Time: the time at which a process joins the queue for execution [9].
(d) Throughput: the number of processes completed during a period of time; a bigger throughput means better achievement.
(e) Response or Feedback Time: the time spent waiting for the initial response after the submission of a request [9].
(f) Waiting Time: the time a process waits for the CPU in the queue [7, 9].
Waiting Time = Turnaround Time − Burst Time
(g) Turnaround Time: the time spent from a process's arrival until its completion.
(h) Turnaround Time = Completion Time − Arrival Time
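As a small numerical illustration of criteria (f)–(h), the Python sketch below computes the waiting and turnaround times of a few processes from their arrival, burst, and completion times. The process data are hypothetical and serve only to show the two formulas above in use.

```python
# Hypothetical processes: arrival, burst, and completion times in ms.
processes = {
    "P1": {"arrival": 0, "burst": 5, "completion": 9},
    "P2": {"arrival": 1, "burst": 3, "completion": 4},
    "P3": {"arrival": 2, "burst": 8, "completion": 17},
}

for name, p in processes.items():
    turnaround = p["completion"] - p["arrival"]   # Turnaround = Completion - Arrival
    waiting = turnaround - p["burst"]             # Waiting = Turnaround - Burst
    print(f"{name}: turnaround = {turnaround} ms, waiting = {waiting} ms")
```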
2 Work Done This algorithm aims to improve the average waiting time and response time in cloud computing with the help of the RR and SJF scheduling algorithms. The highlights of this scheduling are reduced operational cost and reduced waiting time [10]. The average task length can be calculated as

AV = \frac{\sum_{j=1}^{k} LT_j}{a} \quad (1)

where LT is the length of the process and a is the number of processes. The execution time depends on the VM and the time taken to execute on it:

Execution\ Time\ (ET) = LT / VMP \quad (2)
The completion time is the sum of the execution times of the previous tasks and the execution time (ET_n) of the current process allotted to the VM, as shown in this equation [8, 11]:

CT = ET_n + \sum_{j=1}^{k-1} ExT_j \quad (3)
The period time applies to machines with varying processing power and essentially corresponds to the makespan. Basically, the period is the overall length of the schedule, the total time needed to complete the whole group of tasks, and it is specified by the completion time of the last process [2]:

Period\ Time = CT_a

The response time under load balancing in cloud computing shows the amount of time required to respond, where SB is the submission time of a task [12]:

RT = \sum_{j=1}^{a} (CT_j + SB_j) \quad (4)
The average response time is then calculated for every VM, where a is the number of processes in the VM [10]:

ART = RT / a \quad (5)
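A minimal Python sketch of Eqs. (1)–(5) is given below; it assumes hypothetical task lengths and submission times, takes VMP to denote the processing rate of the VM (Eq. 2), accumulates completion times on a single VM, and computes the response time exactly as Eq. (4) is written (summing CT_j + SB_j).

```python
# Hypothetical inputs for one VM: task lengths LT_j, submission times SB_j,
# and the VM processing rate VMP (units are illustrative only).
LT = [400, 250, 600, 150]
SB = [0.0, 0.5, 1.0, 1.5]
VMP = 100.0

a = len(LT)
AV = sum(LT) / a                         # Eq. (1): average task length
ET = [lt / VMP for lt in LT]             # Eq. (2): execution time of each task

CT, running = [], 0.0                    # Eq. (3): completion time accumulates prior executions
for et in ET:
    running += et
    CT.append(running)

period_time = CT[-1]                     # makespan: completion time of the last task
RT = sum(ct + sb for ct, sb in zip(CT, SB))   # Eq. (4), as written in the text
ART = RT / a                             # Eq. (5): average response time

print(AV, ET, CT, period_time, ART)
```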
Computation of the load unbalance factor: the sum of the loads of all virtual machines is defined as

Load\ (L) = \sum_{j=1}^{k} L_j \quad (6)

where j indexes the virtual machines in a data center. The load per unit capacity is defined as

Load\ Per\ Cell\ (LPC) = \frac{L}{\sum_{j=1}^{k} C_j} \quad (7)
The threshold is T_j = LPC * C_j, where C_j is the capacity (size) of node j. The load unbalance factor of a specific virtual machine is then given by the following cases:

VM < T_j - \sum_{j=1}^{k} L_v : under-loaded
VM > T_j - \sum_{j=1}^{k} L_v : overloaded
VM = T_j - \sum_{j=1}^{k} L_v : balanced
The under-loaded virtual machine obtains load from the overloaded virtual machine before the load on that machine exceeds the threshold, and the difference is λj. The relocation of load from the overloaded virtual machine is carried out until its load is less than the threshold. This implies that the volume of load that can be transferred or removed from the overloaded virtual machine will lie in the range between λ and μ [3]. The Round Robin algorithm distributes a process to the next virtual machine in the queue regardless of the load on that virtual machine. The RR approach does not consider the resource capabilities, the arrangement, or the length of the task, so high-priority and lengthy processes end up with greater response times [6]. The Round Robin (RR) algorithm uses the ring of user requests as its queue for storing tasks; each task in the queue receives an equal slice of execution time and is executed in turn. If a task cannot be finished during its turn, it is saved and returned to the queue to wait for the next turn. The SJF algorithm, in contrast, chooses the process having the smallest execution time for the next execution; this scheduling approach can be preemptive or non-preemptive [13]. A task that raises a request first is either executed or put on hold. The main problem of SJF is its preemptive implementation, which may lead to a longer average wait for a large job to finish, and the well-known starvation effect can arise. Cloud scheduling is divided into the user level and the system level: at the customer level, scheduling deals with problems raised by the service agreement between provider and customers [11].
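As an illustration of Eqs. (6)–(7) and the threshold rule, the sketch below classifies each VM as under-loaded, overloaded, or balanced. The VM loads and capacities are hypothetical, and the comparison is made directly against the per-VM threshold T_j = LPC × C_j, a simplification of the conditions above.

```python
# Hypothetical VM loads L_j and capacities C_j.
loads = [30.0, 75.0, 50.0]
capacities = [1.0, 1.0, 2.0]

L = sum(loads)                    # Eq. (6): total load over all VMs
LPC = L / sum(capacities)         # Eq. (7): load per unit capacity

for j, (load, cap) in enumerate(zip(loads, capacities), start=1):
    T_j = LPC * cap               # per-VM threshold
    if load < T_j:
        state = "under-loaded"
    elif load > T_j:
        state = "overloaded"
    else:
        state = "balanced"
    print(f"VM{j}: load={load}, threshold={T_j:.1f} -> {state}")
```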
3 Proposed Algorithm In this proposal, we combine the two scheduling algorithms: first, all processes in the ready queue are arranged by the CPU in ascending order of their burst time, and then the time quantum for RR is determined from these burst times [14].
Simple Flow Chart

Initialize:
  Current position_repetition_t = 1 (phenomenon time)
  Current position_empty_solution = null
If a % 4 == 0:
  Time Quantum = P(a/4)        (here P(a) is the burst time of process number a)
Else:
  Time Quantum = P(a/4 + 1)
A process selected out of the ready queue is executed; the remaining tasks are re-arranged, and the burst time is the measure.

1. Running phase
   i.   Measure the burst time of each process.
   ii.  Prepare one copy of all burst times.
   iii. Initial position value τjk(t) = Completion for every route between processes and VMs; optimize the ready queue and check the burst times.
   iv.  Insert n processes on the initiating VMs.
   v.   For j = 1 : a
          If {ct(j + 1) < ct(j)}        (ct is the completion time)
            length = ct(j);
            ct(j + 1) = ct(j);
            length = ct(j + 1);
          End
        End
   vi.  For request R = 1 to a do
          Submit the opening ready queue to the VM.
          Do while all processes have not finished their requests:
            every process chooses the VM for its next task accordingly;
            V = mod(a, 2);
            If (V == 0)
              K = a/2;                  (K is a subset of processes)
            Else
              R = a + 1;  K = q/2;
            Insert the selected process queue into a virtual machine.
          End do
   vii. S = ct(k);
   viii. Measure the greatest burst time:
   ix.  Greatest = 0;
        For i = 1 : a
          If {greatest < ct(i)}
            Burst Time(i) = greatest;  K = i;
          End
        End
   x.–xi. Execute the processes using the sorted queue and the time quantum:
        While (greatest >= 0)
          For q = 1 : a
            If {completion time(q) ~= 0}
              wait(q) = wait(q) + (t − slow(q));
              If (ct(q) >= z)
                ct(q) = ct(q) − z;  t = z + t;
              Else
                Time = completion time(q) + Time;  completion time(q) = 0;
              End
              If (ct(q) == 0)
                Slow(q) = t;
              End
              P = mod(a, q + 1);
              Greatest = Greatest + 1;
            End
          End
          Greatest = Greatest − s;
        End

Illustration of the Proposed Algorithm Result
The proposed algorithm is compared on the following parameters: CPU utilization, throughput, turnaround time, and waiting time. Let VM = (VM_1, VM_2, …, VM_m) be the set of m virtual machines which should process the a tasks represented by the incoming requests R = (T_1, T_2, …, T_a). All the VMs run in parallel and independently; each VM runs on its own resources, and there is no sharing of its resources by other VMs. We schedule non-preemptive dependent tasks to these VMs. We select a ready queue and connect the load balancer to the Internet; a virtual private cloud is created to connect the cloud operators. All requests come through the load balancer to the data center, and the load is distributed from the node to particular data-center machines (Fig. 1). The proposed algorithm mainly balances the incoming requests and distributes them to a particular node or virtual machine. In this paper, we mathematically reduce the total processing time, the transformation cost, and the average waiting time. The completion time (ct) and the waiting time (wt) are calculated in seconds, and the execution time (ExT) of an ordered process is given by

T^a_{execution\ Time} = T^a_{interaction\ Time} + T^a_{computation\ Time}

where
T^a_{execution\ Time} → execution time of all a processes,
T^a_{interaction\ Time} → interaction time of all a processes,
T^a_{computation\ Time} → computation time of all a processes.

Turnaround\ Time = Process\ Requirement / Process\ Capability + Process\ Waiting\ Time
T_{processing\ Time} = T_{received\ Time} + T_{execution\ Time} + T_{order} + T_{submit}

With this, the waiting time of all processes under the unified shortest-job-first and bandwidth-aware scheme is the smallest regardless of the number of cloudlets discarded (Fig. 1).
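The sketch below is one possible Python reading of the proposed hybrid, under simplifying assumptions: all processes arrive at time zero, the ready queue is first sorted by burst time (SJF ordering), the time quantum is taken from the burst time of the (a/4)-th process in the sorted queue as in the flow chart, and Round Robin is then applied with that quantum. The burst times used are hypothetical, so the numbers printed do not reproduce the table below.

```python
def hybrid_sjf_rr(burst_times):
    """Sort by burst time (SJF order), pick a quantum from the a/4-th burst,
    then run Round Robin with that quantum. Returns average waiting and
    turnaround times, assuming every process arrives at time 0."""
    a = len(burst_times)
    sorted_bursts = sorted(burst_times)                 # SJF ordering of the ready queue

    # Time quantum rule from the flow chart (1-based index a/4, else a/4 + 1).
    idx = a // 4 if a % 4 == 0 else a // 4 + 1
    quantum = sorted_bursts[max(idx - 1, 0)]

    remaining = sorted_bursts[:]
    completion = [0.0] * a
    t = 0.0
    while any(r > 0 for r in remaining):                # Round Robin passes
        for i in range(a):
            if remaining[i] > 0:
                run = min(quantum, remaining[i])
                t += run
                remaining[i] -= run
                if remaining[i] == 0:
                    completion[i] = t

    turnaround = completion                             # arrival time is 0 for every process
    waiting = [ta - b for ta, b in zip(turnaround, sorted_bursts)]
    return sum(waiting) / a, sum(turnaround) / a

# Hypothetical burst times (ms).
print(hybrid_sjf_rr([24, 3, 3, 12, 6, 9, 15, 4]))
```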
Fig. 1 Process flow to the data center under the proposed algorithm
Comparison result of the scheduling algorithms

Algorithm           | Average waiting time (ms) | Average turnaround time (ms) | Average response time (ms) | Context switches
FCFS                | 16.25                     | 22.16                        | 15.35                       | 11
Round Robin         | 14.25                     | 20.12                        | 14.25                       | 9
SJF                 | 13.88                     | 19.75                        | 7.5                         | 8
Proposed algorithm  | 9.38                      | 15.25                        | 9.38                        | 7
Comparison Chart of Algorithms
4 Conclusion and Future Work A critical literature review has been made of load balancing proficiency and scheduling algorithms. Based on this review, research gaps were identified and a new algorithm was proposed with the help of Round Robin (RR) and shortest job first (SJF). The proposed scheduling algorithm conducts process scheduling with minimal construction. We show empirical evidence that the proposed scheduling performs better than RR and SJF: the empirical results show that the proposed scheduling is more efficient and improves both the response time and the waiting time. In future work, Shortest Job First (SJF) and Round Robin (RR) can serve as time-measure-based scheduling algorithms whose architecture connects both time quanta to the ready queue; the turnaround time can be reduced, and the throughput and performance increased, by re-measuring the time quantum every time a process or task enters or exits the queue.
References 1. Yasin A, Faraz A, Rehman S (2017) Performance analysis of cloud applications using cloud analyst. IEEE pp 79–84 2. Ru J, Keung J (2013) An empirical investigation on the simulation of priority and shortest job first scheduling for cloud-based software system. IEEE, pp 78–87 3. Pradeep K, Prem Jacob T Dr (2016) Comparative analysis of scheduling and load balancing algorithm in cloud environment. IEEE, pp 526–531 4. Narale SA, Butey PK Dr (2018) Throttled load balancing scheduling policy assist to reduce grand total cost and data center processing time in cloud environment using cloud analyst. IEEE, pp. 1464–1467
5. Gatot Subroto JJ (2016) Comparison analysis of CPU scheduling: FCFS, SJF and Round Robin. IJEDR, pp 338–342 6. Kang L, Ting X (2015) Application of adaptive load balancing algorithm based on minimum traffic in cloud computing architecture. IEEE 7. Jbara YH Dr (2019) A new improved Round Robin-based scheduling algorithm-a comparative analysis. IEEE 8. Xoxa N, Zotaj M, Tafa I, Fejzaj J (2013) Simulation of first come first served (FCFS) and shortest job first (SJF) algorithms. IEEE, pp 444–449 9. Deepa T, Cheelu D Dr (2017) A comparative study of static and dynamic load balancing algorithms in cloud computing. IEEE, pp 3375–3378 10. Mishra RK, Kumar S, Sreenu NB (2014) Priority based Round-Robin service broker algorithm for cloud-analyst. IEEE, pp 878–881 11. Kumar P, Bundele D Dr, Somwansi D (2018) An adaptive approach for load balancing in cloud computing using MTB load balancing. IEEE 12. Balharith T, Alhaidari F (2019) Round Robin scheduling algorithm in CPU and cloud computing. IEEE 13. Dubey AK, Mishra V (2017) Performance analysis of cloud applications using cloud analyst. IEEE pp 78–84 14. Kishor L, Goyal D (2013) Time quantum based improved scheduling algorithm. Int J Adv Res Comput Sci Softw Eng. 3.8 15. Velde V, Rama B (2017) An advanced algorithm for load balancing in cloud computing using fuzzy technique. pp 1042–1047. https://doi.org/10.1109/ICCONS.2017.8250624
Chapter 19
Downlink Performance Improvement Using Base Station Cooperation for Multicell Cellular Networks Sandeep Srivastava, Pramod Kumar Srivastava, Tanupriya Choudhury, Ashok Kumar Yadav, and Ravi Tomar
1 Introduction In traditional mobile cellular scenarios, a major capacity-degrading phenomenon that limits the overall performance of the system is inter-cell interference (ICI). It arises because the neighbouring cells generally use the same frequency resources. As a result, ICI affects the throughput, especially in terms of data loss at the user or mobile station (MS) terminal. Users at the cell boundaries, or near the cell edge, are affected the most by ICI because such users receive signals from more than one cell: only the signal from the home cell is the desired signal, while the signals from the other cells act as interference. Suppressing ICI is therefore a major challenge in cellular networks. S. Srivastava G L Bajaj Institute of Technology and Management, Greater Noida, Uttar Pradesh, India e-mail: [email protected] P. K. Srivastava Rajkiya Engineering College, Azamgarh, Uttar Pradesh, India e-mail: [email protected] T. Choudhury (B) · R. Tomar Department of Informatics, School of Computer Science, University of Petroleum and Energy Studies (UPES), Dehradun, Uttarakhand, India e-mail: [email protected] R. Tomar e-mail: [email protected] A. K. Yadav Galgotias University, Greater Noida, Uttar Pradesh, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_20
To suppress ICI, many types of autonomous techniques have been used and recommended [1, 2]. Furthermore, SINR is another key performance factor in this scenario: an MS that is near its home base station generally experiences a high SINR, whereas an MS at the cell boundary, far from the cell centre, experiences a low SINR. In [3, 4], multi-input multi-output (MIMO) is analysed as a key method for improving the system in terms of power efficiency and spectrum utilization. However, MIMO also carries several unsolved problems, one of which is the capacity region of the MIMO broadcast channel (BC): because of the lack of a general theory of non-degraded broadcast channels, an achievable region for the MIMO broadcast channel can be obtained when dirty paper coding (DPC) is applied at the transmitter [5, 6]. This paper focuses on cell-interior users in the multicell cellular scenario, since users at the cell interior as well as at the cell edge experience ICI. As a solution, we investigate ICI suppression for the main cell-interior users by coordinating the cells at the boundary of those users. In [7–10] it is observed that optimal power control combined with base station cooperation removes the ICI risk: a grid of interfering cells with homogeneous cell power can behave as a single cell, and to maximize the multicell capacity, optimal and distributed scheduling techniques come into effect [5–7, 11]. We analyse multicell cooperation as a new and developing communication model that yields significant progress in system capacity by reducing ICI. Base station cooperation (BSC) means sharing control signals, user data, transmitted or broadcast data, CSI, and the precoders over high-capacity wired backhaul links for coordinated transmission. CSI plays an important role at the base stations in improving the system performance. In our scenario, we consider this cooperation scheme for the multicell environment while other-cell interference is negligible. SINR is the key parameter used to determine the transmission rate to each user; with two base stations, cooperative transmission can maximize the SINR by transmitting jointly to a single MS at an instant. In cellular networks, the fractional frequency reuse (FFR) scheme has been further investigated together with BSC, also called coordinated multi-point (CoMP) [12–14]: CoMP with base station joint transmission is applied to heterogeneous multiuser scenarios at the FFR cell edge, while the remaining area at the centre of the FFR cell is not used [15, 16]. In [13], the authors consider a three-cell cluster with cooperative MIMO and an FFR scheme and observe that its spectral efficiency can be improved by antenna rearrangement. Another technique, called inter-cell interference coordination (ICIC), has been analysed for multi-cell networks with aggressive frequency reuse patterns. In the current generation, the FFR scheme has attracted the attention of researchers in various standardization bodies and forums. The rationale behind FFR is as follows: MSs in the interior cell area are much more robust to interference because of the low path loss, and consequently they can tolerate a higher reuse compared with cell-boundary MSs, which experience strong interference as well as high path losses [17, 18]. It therefore makes sense to use different reuse levels for MSs depending on their position in the cell.
In a typical case, a network of BSs employs FFR: the cell-centre and cell-edge zones are served with a combination of reuse factors of 1, 3, and 7. The performance of this plan is compared with that of some reference plans and, in particular, with pure reuse-1, reuse-3, and reuse-7 plans [19, 20]. In [21] the authors consider a cluster of 19 cells and conclude that the cell-edge performance, in terms of capacity, can be maximized by applying multicell coordination among the adjacent cells.
2 System Model In a cellular framework, three independent phenomena, namely large-scale shadowing, small-scale fading, and path loss variation with distance, characterize the radio propagation. We assume a large reuse factor to isolate the cells in the 19-cell cluster, and the inter-cell interference is made negligible by assigning the spectrum carefully among the coordinated BSs. Now consider a cellular network with P cells that coordinate with each other, where each cell has I antennas, as shown in Fig. 1a. Every cell comprises K users, and every user is equipped with J antennas. At the base station side, perfect CSI is assumed, and each cell is considered to be identical. The transmit signal vector x_k of user k, which is already precoded, is given by x_k = T_k s_k
(1)
where T_k and s_k indicate the precoding matrix and the information symbol for user k, respectively. Therefore, the received signal vector at user k can be defined as
Fig. 1 a Blueprint for multi cell cellular network for downlink case, b rectangular and polar coordinates to locate
y_k = H_{k,k} x_k + \sum_{j=1, j \neq k}^{L} H_{k,j} x_j + n_k \quad (2)
In Eq. (2), the first term represents the desired signal, the second term represents the ICI, and n_k is a Gaussian noise vector with variance σ². The fading coefficients stay quasi-static within a time interval (also known as a block) and vary independently among blocks. Using a random N × M matrix, the channel from user k to the jth cell can be written as

H_{k,j} = \sqrt{c\, d_{k,j}^{-\alpha}\, g_{k,j}}\; B_{k,j} \quad (3)
where c d_{k,j}^{-\alpha} represents the path loss, d_{k,j} denotes the distance (in km) between user k and the base station, α is the path loss exponent, c is the median path loss accounting for the loss at a 1 km reference distance, g_{k,j} is the log-normally distributed shadowing variable, and the small-scale fading is represented by B_{k,j} ∈ C^{N×M}. In a cell, spatially multiplexed data streams are allocated to each individual user with high SINR, based on the MIMO capacity and the MIMO channel rank. Low-SINR cell-border MSs that seek cooperation are always assigned a single data stream. The SINR received by MS k is given by

SINR_k = \frac{H_{k,k} T_k T_k^H H_{k,k}^H}{\sum_{j=1, j \neq k}^{L} H_{k,j} T_{k,j} T_{k,j}^H H_{k,j}^H + \sigma^2 I} \quad (4)
Assume that (μ_k, ω_k) represents the polar coordinates of user k and (ς_j, ψ_j) the polar coordinates of the jth remote antenna unit (RAU) in a cell; d_{k,j} is the distance determined by the positions of the base station, the user, and the RAU, as depicted in Fig. 1b. The distance between the base station and the MS is then given by

d_{k,j} = \sqrt{\mu_k^2 + \varsigma_j^2 + 2\mu_k \varsigma_j \cos(\omega_k - \psi_j)} \quad (5)
We know that the cell-edge as well as the cell-interior performance is characterized by the location-specific downlink spectral efficiency [12]. For the multicell network, the mutual information of the unguided channel can be written as

I = \log_2 \det\!\left(I_{LN} + \frac{P}{M} H_{k,j} H_{k,j}^H\right) = \log_2 \det\!\left(I_M + \frac{P}{M} H_{k,j}^H H_{k,j}\right) \quad (6)

I = \log_2 \det\!\left(I_M + \frac{P_c}{M} \sum_{j=1, j \neq k}^{L} \frac{g_{k,j}}{d_{k,j}^{\alpha}} B_{k,j} B_{k,j}^H\right) \quad (7)
This is a random variable which depends on the fading conditions. By taking the mean of the above equation with respect to the small-scale fading coefficients and the shadowing, we obtain the expression for the location-specific spectral efficiency, i.e.

C(\mu_k, \omega_k) = E\!\left[\log_2 \det\!\left(I_M + \frac{P_c}{M} \sum_{j=1, j \neq k}^{L} \frac{g_{k,j}}{d_{k,j}^{\alpha}} B_{k,j} B_{k,j}^H\right)\right] \quad (8)
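Equation (8) can be estimated numerically by Monte Carlo averaging over the small-scale fading. The Python sketch below does this for hypothetical distances, shadowing gains, and antenna numbers, assuming the fading matrices have i.i.d. unit-variance complex Gaussian entries; it is only an illustrative evaluation of the formula, not the authors' simulation code.

```python
import numpy as np

def location_spectral_efficiency(d, g, Pc=1.0, M=4, N=2, alpha=3.6, trials=2000, seed=0):
    """Monte Carlo estimate of Eq. (8):
    E[log2 det(I_M + (Pc/M) * sum_j (g_j / d_j**alpha) B_j B_j^H)]."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(trials):
        A = np.eye(M, dtype=complex)
        for dj, gj in zip(d, g):
            # i.i.d. CN(0,1) small-scale fading matrix (dimensions chosen to match I_M).
            B = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
            A += (Pc / M) * (gj / dj**alpha) * (B @ B.conj().T)
        _, logdet = np.linalg.slogdet(A)
        vals.append(logdet / np.log(2))
    return float(np.mean(vals))

# Hypothetical distances (km) and shadowing gains for a few contributing cells.
print(location_spectral_efficiency(d=[0.3, 0.8, 1.0], g=[1.0, 0.6, 0.9]))
```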
2.1 Multicell Cooperation Scheme It is observed that a conventional cellular network cannot improve the downlink performance, in terms of efficiency, spectral efficiency, and coverage, under the growing number of mobile users and the limited spectrum resources of the cellular network. Cooperation among base stations and among users is therefore adopted to address this coverage-limited problem. Multicell cooperation is a key technology in wireless cellular networks for mitigating several kinds of constraints, for example ICI; in other words, by suppressing ICI, the new and emerging multicell cooperation improves the communication channel and the system capacity. Base station cooperation means sharing control signals, user data, transmit data, CSI, and the precoders via high-capacity wired backhaul links for coordinated transmission. CSI plays an important role at the base stations in improving the system performance. In our scenario, we consider this cooperation scheme for the multicell environment while other-cell interference is negligible. SINR is the key parameter used to determine the transmission rate to each user. With two base stations, cooperative transmission can improve the SINR by transmitting jointly to a single MS at a time; still, this improvement in output may not always be sufficient to improve the output of an individual MS. The home and neighbour BSs generate the signal for the terminal at the same instant and are frame-synchronized [22, 23]. Furthermore, the improvement of the downlink performance of the cellular network is also characterized in terms of the cost of obtaining CSI, through feedback and channel training in an FDD system, observed in parallel for several transmit and receive antennas as well as several MSs in the network, in order to keep the sum-rate difference consistent with the full-CSI rate.
2.1.1 No Cooperation Case
Generally, in the case of normal operation there is no cooperation, and the user receives the desired signal from the home cell only. The capacity for the terminal MS in the case of no cooperation can therefore be approximated from the Shannon capacity expression as Cnc = log2 (1 + β SINRnc )
(9)
The value of β can be obtained by calculating the SNR gap between the theoretical limit and a practical coding scheme.
2.1.2 Cooperation Case
In the case of cooperation, the BSs cooperate to transmit information to the mobile station, and the SINR of the channel (from the BS side to the MS) depends on the scheme that is applied. In this case, the capacity for the terminal mobile station is approximated as Ccoop = δ log2 (1 + β SINRcoop )
(10)
where the capacity is in bits/s/Hz in both cases and δ is the ratio in which the resource is shared between the nodes under cooperation. Since we consider resource fairness in the network, the value of δ is 1/2.
2.2 Inter-Cell Interference (ICI) ICI is a performance-degrading factor. Generally, ICI grows when the user moves away from the centre of the home cell. The SINR then degrades for two common reasons: first, the received signal strength falls because the path loss increases with distance from the home cell; second, the ICI rises because, as the user moves away from one base station, it comes closer to another base station, as depicted in Fig. 2a. We consider a user that is connected to BS1 and moving in the direction of another base station, BS2. We also consider universal frequency reuse, meaning that the same frequency resources are transmitted by both BSs, so the signal transmitted from BS2 acts as interference to the user. In Eq. (5), assume θ = 90° and β = 0, and suppose d = ρ for the polar coordinates; the SINR obtained by the user at distance d from BS1 can then be written, following Eq. (4) in rectangular coordinates, as

SINR = \frac{P_1 h_k d^{-\alpha}}{\sum_{j \neq k} P_2 h_j (2R - d)^{-\alpha} + N_0} \quad (11)
Fig. 2 a An example of two-cell interference; b interference received by the cell-interior user in the multicell network for the downlink case
In Eq. (11), α denotes the path loss exponent, N_0 is the noise power, P_k is the transmit power of the kth base station, R is the cell radius, and 2R is the distance between the two base stations. Generally, all base stations in a system operate at the same transmit power, so we assume P_1 = P_2. For an interference-limited system, the background noise can be ignored, and Eq. (11) simplifies to

SINR = \frac{P_1 d^{-\alpha}}{N_0 + P_2 (2R - d)^{-\alpha}} \approx \frac{P d^{-\alpha}}{P (2R - d)^{-\alpha}} = \frac{1}{\left(\frac{2R}{d} - 1\right)^{-\alpha}} = \left(\frac{2R}{d} - 1\right)^{\alpha} \quad (12)
We observe that the SINR goes down as the distance d increases. Also, for a given d < R, the SINR is larger for a larger α, since for d < R the interference travels the larger distance and is more attenuated for greater α. We consider the same path loss model for the desired BS1 and the interfering BS2, as in Eqs. (13) and (14):

PL_s = 128.1 + 37.6 \log_{10}(d)\ \text{dB} \quad (13)

PL_i = 128.1 + 37.6 \log_{10}(2R - d)\ \text{dB} \quad (14)
The SINR received by the user can then be written as

SINR_{with\text{-}ICI} = \frac{P h_k\, 10^{PL_s/10}}{N_0 + P h_i\, 10^{PL_i/10}} \quad (15)
When there is no ICI, the SINR for the user can be written as

SINR_{without\text{-}ICI} = \frac{P h_k\, 10^{PL_s/10}}{N_0} \quad (16)
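Equations (13)–(16) can be evaluated directly. The sketch below computes the SINR with and without ICI as the user moves from BS1 toward BS2; it assumes unit fading gains, hypothetical values for P, N0, and R, and treats the dB path losses as attenuations (received power = P·h·10^(−PL/10)), which is an interpretation of Eqs. (15)–(16) rather than a literal transcription.

```python
import math

def sinr_with_without_ici(d, R=0.5, P_dbm=46.0, N0_dbm=-104.0, h_k=1.0, h_i=1.0):
    """Evaluate Eqs. (13)-(16) at distance d (km) from BS1, with BS2 at distance 2R.
    Path losses are applied as attenuations: received power = P * h * 10**(-PL/10)."""
    P = 10 ** (P_dbm / 10)                         # mW
    N0 = 10 ** (N0_dbm / 10)                       # mW
    PL_s = 128.1 + 37.6 * math.log10(d)            # Eq. (13), serving link
    PL_i = 128.1 + 37.6 * math.log10(2 * R - d)    # Eq. (14), interfering link
    S = P * h_k * 10 ** (-PL_s / 10)
    I = P * h_i * 10 ** (-PL_i / 10)
    sinr_with = S / (N0 + I)                       # Eq. (15)
    sinr_without = S / N0                          # Eq. (16)
    return 10 * math.log10(sinr_with), 10 * math.log10(sinr_without)

for d in (0.1, 0.25, 0.4, 0.45):                   # distances (km) toward the cell edge, R = 0.5 km
    print(d, sinr_with_without_ici(d))
```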
2.3 Multicell Network for the Case of Frequency Reuse: Cell-Interior Performance We consider a hexagonal cell architecture with two levels of interferers, tier 1 and tier 2, and universal frequency reuse, as depicted in Fig. 2b. We consider a cluster size of 19 cells: the home cell is surrounded by six neighbouring cells numbered 2–7, each cell is equipped with its own BS with M transmit antennas and k users, and every cell has N receive antennas. The radius of each cell is R. Each cell has an extra parameter, the inter-cell coordination distance defined in the system model, which measures the distance between all cell interiors in the layout of Fig. 2b. To handle the high level of interference generated by the neighbouring cells, we apply frequency reuse. Several neighbouring cells have the channel information of the cell-edge and cell-interior users, and they coordinate for signal transmission. Out of the 19 cells, one works as the home cell that transmits the data to such a user at the cell edge or cell interior, while the rest of the neighbouring cells take this user into account when designing their precoding matrices. With pre-cancellation of ICI at the other neighbouring cells, and pre-cancellation of the ICI generated by the home cell, the cell-edge user receives no interference from those cells. Such a coordinated strategy is useful for efficient interference mitigation for users at the cell edge and interior. FFR is another technique in which the channel is divided into fractions so that the nearest cells transmit and receive on separate groups of channels. The inter-cell coordination strategy is applied to perform prior cancellation of interference at all the neighbouring cells for the active cell-interior user, and only one cell is selected for data transmission to this user. We use multicell MIMO as the precoding technique for inter-cell coordination, similarly to intra-cell interference. Each cell-interior user selects the home cell based on the channel condition, while the other neighbouring cells play a role in the data transmission; the remaining cells are interferer cells [24, 25].
2.4 Cell Interior SINR and Channel Capacity In our system model, we consider a user (MS) at the cell-center as shown in Fig. 2b experiencing interference from six first-level cells. Consider that the user is located
near the centre of the cell, the six first-level interferers are at a distance of almost 1.7R. The user also receives interference from 12 second-level cells, with six interferers at a distance of almost 3R and the remaining six second-tier interferers at a distance of 3.4R. The SINR of this user at the cell centre can be written as

SINR_{reuse\text{-}1} = \frac{(\beta R)^{-\alpha}}{6(1.7R)^{-\alpha} + 6(3R)^{-\alpha} + 6(3.4R)^{-\alpha}} \quad (17)

= \frac{(\beta)^{-\alpha}}{6(1.7)^{-\alpha} + 6(3)^{-\alpha} + 6(3.4)^{-\alpha}} \quad (18)
where βR (≤ R) is the distance of the user from the home-cell base station. Assuming a frequency reuse of 3, the interference of two interferers at a distance of 1.7R can be ignored, so the SINR for this case can be estimated as

SINR_{reuse\text{-}3} = \frac{3(\beta R)^{-\alpha}}{4(1.7R)^{-\alpha} + 6(3R)^{-\alpha} + 6(3.4R)^{-\alpha}} \quad (19)

= \frac{3(\beta)^{-\alpha}}{4(1.7)^{-\alpha} + 6(3)^{-\alpha} + 6(3.4)^{-\alpha}} \quad (20)
Let’s assume that a frequency reuse of seven, the interference of all the firstlevel interferers can be removed resulting in SINR given by the following expression below: SINRreuse−7 =
7 × (β R)−α 7 × (β)−α −α −α = 6 × (3R) + 6 × (3.4R) 6 × (3)−α + 6 × (3.4)−α
(21)
The capacity limits of the cell-centre user for reuse of 1, 3, and 7 can be calculated as

C_{reuse\text{-}1} = 1 \cdot \log_2(1 + SINR_{reuse\text{-}1}) \quad (22)

C_{reuse\text{-}3} = \frac{1}{3} \cdot \log_2(1 + SINR_{reuse\text{-}3}) \quad (23)

C_{reuse\text{-}7} = \frac{1}{7} \cdot \log_2(1 + SINR_{reuse\text{-}7}) \quad (24)
where all three capacities are in bps/Hz. When we apply a frequency reuse scheme, say a frequency reuse of 3, the interference received in the downlink by the users from the other cells is suppressed at BS1. This frequency reuse scheme therefore provides a large improvement in SINR; however, we observe that when we apply a frequency reuse of 3, the corresponding improvement in downlink capacity is low.
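The reuse-factor comparison can be reproduced numerically from Eqs. (18), (20)–(24). The sketch below evaluates the cell-interior SINR (in dB) and the capacity limits at β = 0.3 for a chosen path-loss exponent, matching the setting discussed later in the simulation results; the values are a direct evaluation of the formulas, not the authors' simulation output.

```python
import math

def reuse_metrics(beta=0.3, alpha=3.6):
    """Cell-interior SINR (Eqs. 18, 20, 21) and capacity limits (Eqs. 22-24)."""
    s = beta ** (-alpha)
    tier1 = 1.7 ** (-alpha)
    tier2 = 6 * 3.0 ** (-alpha) + 6 * 3.4 ** (-alpha)

    sinr1 = s / (6 * tier1 + tier2)          # Eq. (18), reuse 1
    sinr3 = 3 * s / (4 * tier1 + tier2)      # Eq. (20), reuse 3
    sinr7 = 7 * s / tier2                    # Eq. (21), reuse 7

    caps = (math.log2(1 + sinr1),            # Eq. (22)
            math.log2(1 + sinr3) / 3,        # Eq. (23)
            math.log2(1 + sinr7) / 7)        # Eq. (24)
    sinr_db = tuple(10 * math.log10(x) for x in (sinr1, sinr3, sinr7))
    return sinr_db, caps

print(reuse_metrics())
```

With β = 0.3 and α = 3.6, this evaluation gives roughly a 6–7 dB SINR gain of reuse 3 over reuse 1 and roughly a 10 dB gain of reuse 7 over reuse 3, while reuse 1 still yields the highest capacity in bps/Hz, consistent with the discussion above.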
2.5 Analysis for Multicell Network Performance with ICI In this segment, we will discuss the cell interior performance by considering the ICI. From Eq. (2), the ICI in addition to noise can be written as
n = \sum_{j=1, j \neq k}^{L} H_{k,j} x_j + n_k \quad (25)
The covariance matrix Q_k, conditioned on the MS's position and the log-normal shadowing, can be estimated as

Q_k = E\!\left[n_k n_k^H\right] = \mathrm{diag}\!\left(1 + \frac{P_c}{M}\sum_{j=2, j \neq k}^{L} \frac{g_{1,j}}{d_{1,j}^{\alpha}},\; \ldots,\; 1 + \frac{P_c}{M}\sum_{j=2, j \neq k}^{L} \frac{g_{L,j}}{d_{L,j}^{\alpha}}\right) \otimes I_m \quad (26)

Here the symbol ⊗ represents the Kronecker product. By the central limit theorem, the ICI is asymptotically Gaussian when the number of interferers is large; therefore, n_k can be treated as an equivalent Gaussian noise with the covariance in (26). With this Gaussian approximation of the ICI, the mutual information of the downlink channel, given the channel matrix H_k and the noise covariance Q_k, can be approximated as

I = \log_2 \det\!\left(I_{LN} + \frac{P}{M} H_{k,j} H_{k,j}^H Q_k^{-1}\right) = \log_2 \det\!\left(I_{LN} + \frac{P}{M} H_{k,j}^H Q_k^{-1} H_{k,j}\right)
= \log_2 \det\!\left[I_M + \frac{P_c}{M} \sum_{j=1}^{k} \left(1 + \frac{P_c}{M}\sum_{j=2, j \neq k}^{L} \frac{g_{k,j}}{d_{k,j}^{\alpha}}\right)^{-1} H_{k,j}^H H_{k,j}\right] \quad (27)

I = \log_2 \det\!\left[I_M + \frac{P_c}{M} \sum_{j=1}^{k} \left(1 + \frac{P_c}{M}\sum_{j=2, j \neq k}^{L} \frac{g_{k,j}}{d_{k,j}^{\alpha}}\right)^{-1} \frac{g_{k,j}}{d_{k,j}^{\alpha}} H_{k,j}^H H_{k,j}\right] \quad (28)
where Q_k^{-1} represents the inverse of Q_k. The location-specific spectral efficiency of the distributed antenna selection with ICI can then be written as

C(\mu_k, \omega_k) = E\!\left[\log_2 \det\!\left(I_M + \frac{P_c}{M} \sum_{j=1}^{k} \left(1 + \frac{P_c}{M}\sum_{j=2, j \neq k}^{L} \frac{g_{k,j}}{d_{k,j}^{\alpha}}\right)^{-1} \frac{g_{k,j}}{d_{k,j}^{\alpha}} B_{k,j} B_{k,j}^H\right)\right] \quad (29)
2.6 Power Constraints for Every BS We now discuss the downlink sum-rate maximization problem for the multicell network. Several ICI mitigation schemes have been proposed, ranging from partially coordinated beamforming to the fully cooperative network [20, 21, 26–29]. In our system model, we apply a limited-cooperation network MIMO approach, where the BSs under cooperation work as a single distributed MIMO transmitter and the other-cell interference is treated as noise. The transmit covariance matrix of BS k under the power constraint can be characterized through its eigendecomposition as

Q_k = V_k \Lambda_k V_k^H \quad (30)

where Λ_k is the diagonal matrix of Q_k associated with the power allocation of the kth user. The sum-rate maximization with a per-cell power constraint can then be characterized as

\arg\max_{\{\Lambda_k\}} \;\; \sum_{j=1, j \neq k}^{L} \log \left| I + H_{j,k} V_{j,k} \Lambda_k V_{j,k}^H H_{j,k}^H \right| \quad (31)
\text{s.t.} \;\; \mathrm{tr}\!\left(\sum_{k} V_k \Lambda_k V_k^H\right) \le P_k, \;\; k = 1, \ldots, K
\Lambda_k \ge 0, \;\; k = 1, \ldots, K.

In this way, the problem is characterized as a convex optimization problem, and the sum power constraint is

\mathrm{tr}(Q_k) = \sum_{k=1}^{K} \sum_{j=1}^{L} \lambda_k^{j} \le P_k

where P_{sum} = \sum_{k=1}^{L} P_k. The sum-rate maximization can be further simplified as
\arg\max_{\{\Lambda_k\}} \;\; \sum_{k=1}^{K} \sum_{j=1}^{L} \log\!\left(1 + \gamma_k^{j} \lambda_k^{j}\right).

Let Γ_k = H_k^H H_k and let γ_k^j denote its jth diagonal element, for j = 1, 2, 3, …, L. For the per-BS constraints, the Lagrangian function of the above expression can be written as

£(\Lambda, \lambda) = \sum_{k=1}^{K} \sum_{j=1}^{L} \log\!\left(1 + \gamma_k^{j} \lambda_k^{j}\right) - \lambda P \quad (32)

where λ ≥ 0 is the dual variable vector associated with the BS power constraints. The KKT conditions are given by

\frac{\partial £}{\partial \lambda_k^{j}} = \frac{\gamma_k^{j}}{1 + \gamma_k^{j} \lambda_k^{j}} - \lambda \le 0.

By further solving, we find

\frac{\gamma_k^{j}}{1 + \gamma_k^{j} \lambda_k^{j}} - \lambda = 0,

which means

\lambda_k^{j} = \frac{1}{\lambda} - \frac{1}{\gamma_k^{j}}.

The result of the above equation, together with the sum power constraint, is evaluated through water-filling.
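The closed form λ_k^j = 1/λ − 1/γ_k^j is the classical water-filling rule. Below is a small sketch of water-filling under a sum power constraint, with hypothetical eigen-channel gains γ; the water level 1/λ is found by bisection. It illustrates the rule, not the full per-BS optimization.

```python
def water_filling(gains, P_total, iters=60):
    """Allocate power p_j = max(mu - 1/g_j, 0) so that sum(p_j) = P_total,
    where mu = 1/lambda is the water level found by bisection."""
    lo, hi = 0.0, P_total + max(1.0 / g for g in gains)
    for _ in range(iters):
        mu = (lo + hi) / 2
        used = sum(max(mu - 1.0 / g, 0.0) for g in gains)
        if used > P_total:
            hi = mu
        else:
            lo = mu
    return [max(mu - 1.0 / g, 0.0) for g in gains]

# Hypothetical eigen-channel gains and a total power budget.
gains = [2.0, 1.0, 0.5, 0.1]
print(water_filling(gains, P_total=4.0))   # strongest channels get the most power
```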
3 Simulation Parameters

Parameters                 | Value
Number of cells            | 19
BSs under cooperation      | 7
Cell structure             | Hexagon structure
BS location                | On circle with radius
MS location                | 0.8R
(BS, user) antenna number  | (7, 3)
Carrier frequency          | 2 GHz
Used bandwidth             | 10 MHz
Log-normal shadowing       | Gaussian distributed with zero mean, 10 dB standard deviation
Power of transmission      | 46 dBm
SNR                        | −15 to 20 dB
Path loss exponent         | 3.6
Path loss expression       | 128.1 + 37.6 log10(d)
Fading                     | i.i.d. Rayleigh
3.1 Simulation Results We evaluated the proposed results via MATLAB simulation. In Fig. 3a we plot the SINR as a function of the distance d from the cell centre, with and without inter-cell interference (ICI), for a user (MS) receiving the transmission. We assume the total background noise N0 is −104 dBm for a 10 MHz bandwidth. Moreover, we take the base station transmit power P to be 46 dBm, and using Eqs. (15) and (16) we evaluate the SINR with respect to the distance d. From Fig. 3a we observe that the SINR obtained when ICI is eliminated is greater than when ICI is considered at the MS, although the SINR goes down as the user moves farther from the cell centre; this is the situation of users at the cell edge. From this discussion, we conclude that inter-cell interference cannot be ignored, especially for cell-edge users. The SINR as a function of the path loss exponent α for different reuse factors is plotted in Fig. 3b using Eqs. (18), (20), and (21). We calculated the cell-interior SINR at β = 0.3, which means the cell-centre user is considered to be at a distance of 0.3R from the midpoint of the cell. From Fig. 3b we observe that there are only small incremental gains in cell-interior SINR as the reuse factor increases, although a reuse factor of 7 provides a larger SINR than a reuse factor of 3, yielding almost a 10-dB improvement in cell-interior SINR.
Fig. 3 a SINR with respect to distance. b Cell-interior SINR at β = 0.3 for several reuse factors
Fig. 4 a Cell-interior capacity limits at β = 0.3 for several reuse factors. b Capacity of sum rate for multiple cell cooperation
A reuse factor of 3 gives almost 7 dB of SINR improvement when compared with a reuse factor of 1; all observations are taken at a path loss exponent of 3.6, as shown in Fig. 4a. This is explained by the fact that a user at the cell centre receives little ICI, so eliminating ICI increases the SINR only slightly. The cell-interior channel capacity limits as a function of α for different reuse factors are plotted in Fig. 4a using Eqs. (22), (23), and (24). We examined the channel capacity at different cell-interior positions, i.e. 0.1R, 0.2R, …, R, and concluded that in the cell interior a reuse factor of 1 provides the larger throughput up to 0.6R from the cell centre when compared with reuse factors of 3 and 7; beyond 0.6R from the cell centre, the throughput gain of reuse 1 goes down due to the increase in ICI. We took the observation at a distance of 0.3R from the cell centre. We can see from Fig. 4a that reuse of 3 and 7 provides worse throughput than a reuse factor of 1, even though the cell-interior SINR is increased by using reuse factors of 3 and 7, respectively. In Fig. 4b, we simulate a nineteen-cell full-reuse multicell network to analyse the improvement in SINR and MS capacity for two transmission conditions, namely with cooperation and without cooperation. In our simulation, we assume a cell size of 500 m, operation at 1800 MHz, and one cell-interior user per cell at a distance of 0.8R from the home-cell centre. The channel gain depends on the path loss model, which also includes log-normal shadowing and fading for both the desired signal and the corresponding interference; we consider an i.i.d. random variable with zero mean and unit variance as the fading component, while each base station's transmission power is 46 dBm. Detailed simulation parameters are given in the table above. From Fig. 4b it is clear that when the BSs do not cooperate, i.e. they do not share user data and CSI among the corresponding base stations, the network's achievable capacity is smaller than in the cooperation case, whereas with BS cooperation, i.e. sharing of channels among the adjacent (coordinated) cells, there is an improvement in capacity. The cooperating BSs operate in such a manner that they decode the user collectively while operating at the same frequency. For the cooperation case it is also observed that power allocation can be performed in both manners, that is, either with individual power constraints for each user or with the sum power constraint.
4 Conclusion In this paper, we examined the effect of ICI on cell-edge as well as cell-interior users. Focusing on increasing the cell-interior capacity, we deployed a 19-cell cluster network with two tiers of interference and analysed the cell-interior performance with and without ICI with respect to the distance from the corresponding base station. Based on the various reuse schemes, we showed that a reuse factor of 3 provides almost 7 dB of SINR improvement with respect to reuse 1, while a reuse factor of 7 provides a 10 dB improvement over a reuse factor of 3 at a path loss exponent of 3.6. Moreover, we proposed a base station cooperation scheme for multicell MU-MIMO and showed that this scheme improves the throughput of cell-interior users. This scheme may also be useful for LTE networks in order to improve spectral efficiency. As part of future work, intelligent algorithms can be used to further optimize and improve the reuse factor [30].
References 1. Yousafzai AK, Nakhai MR (2010) Block QR decomposition and near-optimal ordering in intercell cooperative multiple-input multiple-output-orthogonal frequency division multiplexing. IET Commun 4(12), 1452–1462 2. Le TA, Nakhai MR (2013) Downlink optimization with interference pricing and statistical CSI. IEEE Trans Commun 61(6):2339–2349 3. Telatar E (1999) Capacity of multi-antenna Gaussian channels. Eur Trans Telecommun 10(6):585–595 4. Spencer QH, Peel CB, Lee Swindlehurst A, Haardt M (2004) An introduction to the multi-user MIMO downlink. IEEE Commun Mag 42(10):60–67 5. Costa M (1983) Writing on dirty paper (corresp.). IEEE Trans Inf Theory 29(3):439–441 6. Yu W, Cioffi JM (2004) Sum capacity of Gaussian vector broadcast channels. IEEE Trans Inf Theory 50(9):1875–1892 7. Weingarten H, Steinberg Y, Shamai S (2004) The capacity region of the Gaussian MIMO broadcast channel. In: Proceedings of international symposium on information theory. ISIT 2004. IEEE, p 174 8. Yagyu K, Nakamori T, Ishii H, Iwamura M, Miki N, Asai T, Hagiwara J (2011) Physical layer aspects for evolved universal terrestrial radio access (UTRA) Physical layer aspects for evolved universal terrestrial radio access (UTRA). IEICE Trans Commun 94(12):3335–3345 9. Gesbert D, Hanly S, Huang H, Shamai Shitz S, Simeone O, Yu W (2010) Multi-cell MIMO cooperative networks: a new look at interference. IEEE J Select Areas Commun 28(9):1380– 1408 10. Zhang R (2010) Cooperative multi-cell block diagonalization with per-base-station power constraints. IEEE J Sel Areas Commun 28(9):1435–1445
11. Viswanath P, Tse DNC (2003) Sum capacity of the vector Gaussian broadcast channel and uplink-downlink duality. IEEE Trans Inf Theory 49(8):1912–1921 12. Kiani SG, Gesbert D (2008) Optimal and distributed scheduling for multicell capacity maximization. IEEE Trans Wireless Commun 7(1):288–297 13. Wang L-C, Yeh C-J (2011) 3-cell network MIMO architectures with sectorization and fractional frequency reuse. IEEE J Sel Areas Commun 29(6):1185–1199 14. Xu L, Yamamoto K, Murata H, Yoshida S (2010) Cell edge capacity improvement by using adaptive base station cooperation in cellular networks with fractional frequency reuse. IEICE Trans Commun 93(7):1912–1918 15. Sung DK, Jung BC (2012) Method for inter-cell interference mitigation for a mobile communication system. U.S. Patent 8,145,252, issued 27 Mar 2012 16. Giambene G, Yahiya TA (2013) LTE planning for soft frequency reuse. In: 2013 IFIP wireless days (WD). IEEE, pp 1–7 17. Khan F (2009) LTE for 4G mobile broadband: air interface technologies and performance. Cambridge University Press 18. You X, Wang D, Zhu P, Sheng B (2011) Cell edge performance of cellular mobile systems. IEEE J Sel Areas Commun 29(6):1139–1150 19. James JVB, Ramamurthi B (2011) Distributed cooperative precoding with SINR-based cochannel user grouping for enhanced cell edge performance. IEEE Trans Wireless Commun 10(9):2896–2907 20. Khan MHA, Chung J-G, Lee MH (2016) Downlink performance of cell edge using cooperative BS for multicell cellular network. EURASIP J Wireless Commun Netw (1):56 21. Sharma S, Singh J, Kumar R, Singh A (2017) Throughput-save ratio optimization in wireless powered communication systems. In: 2017 international conference on information, communication, instrumentation and control (ICICIC). IEEE, pp 1–6. https://ieeexplore.ieee.org/abs tract/document/8279031 22. AjazMoharkan Z, ChoudhuryT, Gupta SC, Raj G (2017) Internet of things and its applications in E-learning. In: 2017 3rd international conference on computational intelligence and communication technology (CICT), pp 1–5 23. Piyush N, Choudhury T, Kumar P (2017) Conversational commerce a new era of e-business. In: Proceedings of the 5th international conference on system modeling and advancement in research trends, SMART 2016. https://doi.org/10.1109/SYSMART.2016.7894543 24. Tomar R, Prateek M, Sastry HG (2017) A novel approach to multicast in VANET using MQTT. Ada User J 38(4):231–235 25. Tomar R, Tiwari R (2019) Information delivery system for early forest fire detection using internet of things. In: International conference on advances in computing and data sciences. Springer, Singapore, pp 477–486 26. Huh H, Tulino AM, Caire G (2011) Network MIMO with linear zero-forcing beamforming: large system analysis, impact of channel estimation, and reduced-complexity scheduling. IEEE Trans Inf Theory 58(5):2911–2934 27. Hassan NU, Yuen C, Zhang Z (2012) Optimal power control and antenna selection for multiuser distributed antenna system with heterogeneous QoS constraints. In: 2012 IEEE globecom workshops. IEEE, pp 1112–1117 28. Hassan NU, Yuen C, Zhang Z (2012) Optimal power control between two opportunistic cooperative base stations. In: 2012 IEEE 13th international workshop on signal processing advances in wireless communications (SPAWC). IEEE, pp 194–198 29. Kumar R, Singh A (2018) Throughput optimization for wireless information and power transfer in communication network. In: 2018 conference on signal processing and communication engineering systems (SPACES). IEEE, pp 1–5. 
https://ieeexplore.ieee.org/abstract/document/8316303/ 30. Singh A, Sharma S, Singh J, Kumar R (2019) Mathematical modelling for reducing the sensing of redundant information in WSNs based on biologically inspired techniques. J Intell Fuzzy Syst Preprint 1–11. https://content.iospress.com/articles/journal-of-intelligent-and-fuzzy-systems/ifs190605
Chapter 20
A Survey on Moving Object Detection in Video Using a Moving Camera for Smart Surveillance System Manoj Kumar, Susmita Ray, and Dileep Kumar Yadav
1 Introduction In computer vision, the detection of moving objects from a video is a process widely used in various fields such as traffic monitoring, traffic analysis, activity analysis, border surveillance, ATM surveillance, the packaging industry, gas-leakage detection, underwater surveillance, indoor–outdoor surveillance, healthcare, robotics, manufacturing, and various other intelligent video surveillance systems [1–4]. The detection of moving objects from a moving camera is a more challenging task compared to object detection using a static camera [4–7]. Due to the motion of the camera, the detection of moving objects faces more challenges, because of which the basic background subtraction method cannot be efficiently applied to the detection of moving objects with a moving camera [6, 7]. In reality, there is no single method in image or video processing that can detect all kinds of objects from colour as well as thermal video. In the last decade, various researchers have contributed their efforts and surveyed the detection of moving objects using a static camera [7–9]. With a static camera, the object is in motion while the camera is fixed, whereas in the case of a moving camera, both the camera and the object are in motion [6, 9–11]. Various researchers and scientists have contributed to this area; some of their work is discussed in the literature below.
M. Kumar (B) · S. Ray Manav Rachna University, Faridabad, India e-mail: [email protected] S. Ray e-mail: [email protected] D. K. Yadav Galgotias University, Noida, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_21
Kalsotra et al. [3] have presented a study of various datasets for background subtraction algorithms. Ortego et al. [11] have suggested a framework that improves the performance of foreground segmentation and background subtraction algorithms. Abandoned object detection is also a hot topic in the visual-surveillance domain, so Luna et al. [12] have presented a literature study related to abandoned objects. Various methods have been described: in background subtraction and frame differencing, the difference between two consecutive frames in the time series is calculated to obtain the moving object information through a threshold. The optical flow method uses the motion target vector information, which changes over time, to detect the moving area in an image sequence; it gives better results under the vision of a moving camera [4, 5]. Horn [13] focused on the concept of optical flow for computing motion. Another method, given by Dosovitskiy et al. [14], computes the motion through optical flow using a convolutional network. Chen et al. [15] focused on a method based on low-rankness and sparsity with contextual regularization to detect motion-based objects in a multi-scenario surveillance system; this work treats the contiguous outlier detection problem using a low-rank constraint and contextual regularization, and models the background for multiple scenarios using a dictionary to which a learning-based sparse representation is applied. In [16], an effective motion-from-memory (MFM) module is developed; it encodes temporal context to improve detection in surveillance videos, and along with CNN-based appearance features, the MFM maintains a dynamic memory of the input frames together with output motion features on every frame. Another work [17] on object detection in motion-based video was developed by segmenting different batches. To improve the continuous infrastructure of the transportation system, Dai [18] developed a method that counts vehicles and categorizes them; this method also computes routes to obtain detailed information about the traffic flow. Another popular method, YOLOv3 (Redmon [19]), also experimentally performs better than its peer methods for vehicle counting; YOLOv3 predicts an objectness score using logistic regression for each bounding box.
2 Application of Moving Object Detection In the real world, detection and tracking are very challenging tasks for a smart surveillance system. Smart surveillance systems use both stationary and non-stationary cameras. The main objectives of a smart surveillance system include object detection, moving object tracking, traffic monitoring and analysis, human behaviour analysis, and many more. • Defence: Nowadays, object detection is very useful for border surveillance [20–23] and monitoring for the safety of the country. Border areas should be equipped with highly intelligent, AI-based smart surveillance systems that detect every movement of enemies at the border.
• Facial Recognition: A facial recognition [24–26] system needs object detection to read the facial expressions of humans. Faces can be recognized for various applications such as human recognition, model recognition, crime reduction, identification of criminals, police work, and law enforcement. • Parking System: The object detection system detects every vehicle coming into the parking area, and such systems can automate parking [27–29]. • Industry: Nowadays, the packaging industries widely use object detection for packing and logo checking on their goods, so detection and recognition are highly recommended for the packaging industry [30, 31]. It is also applicable to crack detection in manufacturing units. • Self-Driving Cars: One of the best uses of vehicle/lane/object/traffic-sign detection [32] is in driverless cars [33, 34]; what to do next, such as whether to accelerate or to apply the brakes, depends directly on object detection. • Robotics: Robots are provided with the ability to process visual data in real-time situations to take decisions or to complete their tasks. Robots can be used in numerous real-life applications [35–38], from manufacturing units to wars. • Transportation: In transportation, moving vehicles can be identified and tracked. It is applied to acquire information related to traffic density, traffic movement, and lane/sign/number plate detection [32, 39–41]. Nowadays, driverless cars [33, 34] are equipped with such features. • Indoor-Outdoor Surveillance: Moving object detection can be applied in various indoor-outdoor surveillance applications [1–3, 7, 9, 42].
3 Moving Object Detection Methods A large number of methods have been proposed for the detection of moving objects using a static camera over the last few years, and some algorithms have also been proposed for moving object detection using a moving camera. The detection of moving objects using a moving camera is a comparatively new area for researchers, and many techniques are available for such detection. In the basic background subtraction technique, the first few frames are used for training to construct the background model. In practice, this does not work properly when the bootstrapping problem arises. Various methods such as statistical, fuzzy, CNN, or deep CNN models can be used for training the background model (Fig. 1).
Fig. 1 Overview of various methods
3.1 Spatio-Temporal Difference Method In this section, the spatio-temporal difference method is explored, in which two consecutive frames are subtracted in the spatial domain. These methods are not appropriate for handling dynamic situations. The approach is presented in Fig. 2.
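A minimal frame-differencing sketch of this idea in Python with OpenCV is given below; the video path, threshold value, and blur size are illustrative assumptions, not values taken from the surveyed works:

```python
import cv2

def temporal_difference(video_path, diff_thresh=25):
    """Detect motion by differencing consecutive frames (illustrative sketch)."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    prev_gray = cv2.GaussianBlur(cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY), (5, 5), 0)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (5, 5), 0)
        diff = cv2.absdiff(prev_gray, gray)              # |frame_t - frame_{t-1}|
        _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
        cv2.imshow("moving pixels", mask)
        if cv2.waitKey(30) & 0xFF == 27:                 # Esc to quit
            break
        prev_gray = gray

    cap.release()
    cv2.destroyAllWindows()

temporal_difference("traffic.mp4")
```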
Fig. 2 Temporal frame difference method
3.2 Background Subtraction Method As the name suggests, background subtraction is the mechanism of separating the foreground from the background in video frames [1, 2, 5, 42]. In the background subtraction method, the first step is to construct the background model from the start of the video sequence; after obtaining the background model, the difference between the constructed background model and the current frame is computed [8, 10, 11]. Basic background subtraction uses one global threshold (T) for all pixels in each frame. A further problem is that this threshold is not a function of time t, so it does not generate good results; a local threshold, maintained with adaptive methods, can be used instead. This method is more effective than the spatio-temporal difference, and the better the constructed background model, the better the outcome that can be achieved. The major limitations of this method, which lead to poor results, arise in the following conditions:
i. if the considered background is bimodal
ii. if the scene consists of lots of slowly moving objects
iii. if there is illumination variation, i.e. fluctuations in the light over time
Fig. 3 Background subtraction techniques
Fig. 4 Optical flow representation
iv. if abrupt motion is present in a cluttered environment
v. if the objects in the current frame are moving fast while the frame rate is slow.
Various researchers have made different modifications and changes to trade off such issues. The main aim of a robust background subtraction algorithm is to handle lighting variation and abrupt/fast/repetitive motion in cluttered situations, and to handle illumination variation and camouflage problems [5]. This method is also applicable to night-vision-based vehicles in transportation systems [10, 42].
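As a rough illustration (not the survey's own implementation), the adaptive background modelling idea can be sketched with OpenCV's built-in MOG2 Gaussian-mixture subtractor; the parameter values below are assumptions chosen only for demonstration:

```python
import cv2

def background_subtraction(video_path):
    """Adaptive background modelling with a Gaussian mixture (MOG2) sketch."""
    cap = cv2.VideoCapture(video_path)
    # history/varThreshold are illustrative; detectShadows marks shadows as gray (127)
    subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                    varThreshold=16,
                                                    detectShadows=True)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        fg = subtractor.apply(frame)                       # per-pixel foreground mask
        fg[fg == 127] = 0                                  # drop detected shadows
        fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN, kernel)  # remove small noise blobs
        cv2.imshow("foreground", fg)
        if cv2.waitKey(30) & 0xFF == 27:
            break

    cap.release()
    cv2.destroyAllWindows()

background_subtraction("surveillance.mp4")
```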
3.3 Optical Flow (Differential Method) The optical flow method is a computationally expensive approach for the detection of moving objects using a non-stationary camera. In this method, moving objects are detected by estimating the optical flow of the image sequence and then clustering pixels according to the optical flow distribution [4, 6]. The optical flow method [13, 14] is quite sensitive to noise, which makes real-time use difficult. It represents the distribution of apparent velocities of the objects in individual frames (Figs. 4 and 5). According to the available literature [4, 6, 13, 14, 43, 44], optical-flow-based methods are computationally complex and difficult to implement.
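A short dense optical flow sketch using the Farneback estimator in OpenCV is shown below; the magnitude threshold used to declare a pixel "moving" is an illustrative assumption:

```python
import cv2
import numpy as np

def optical_flow_motion(video_path, mag_thresh=2.0):
    """Flag moving pixels from dense Farneback optical flow (illustrative sketch)."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # flow[y, x] = (dx, dy) displacement of each pixel between the two frames
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag, _ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        moving = (mag > mag_thresh).astype(np.uint8) * 255
        cv2.imshow("optical-flow motion mask", moving)
        if cv2.waitKey(30) & 0xFF == 27:
            break
        prev_gray = gray

    cap.release()
    cv2.destroyAllWindows()

optical_flow_motion("dashcam.mp4")
```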
4 Challenging Issues in Moving Object Detection The major objective of moving object detection is to take a video from a moving or static camera and output a binary image which represents the moving objects for each frame of the video. A few challenges are discussed here in detail. Occlusion can be handled properly if a good appearance model of the moving objects is updated in a timely manner. Camera movement is not a simple issue; in the case of PTZ cameras, the PTZ camera
Fig. 5 Directions of optical flow [13] a straight line motion, b rotation based optical flow
[13] is useful in surveillance and is typically stable or mounted on walls, whereas in the case of handheld cameras, stability is a challenging issue. Noise is another factor that affects the quality of the video sequence and is generated by various kinds of problems such as hardware, software, etc. Table 1 briefly summarizes the major issues faced during moving object detection, especially in video surveillance and other computer vision applications.
5 Comparison of Moving Object Detection Methods This section depicts the benefits and limitations of moving object detection methods, obtained through the study of various works in the literature and their experiments on challenging datasets, as given in Table 2.
6 Publicly Available Datasets In this section, some of the publicly available datasets are mentioned. Various realistic, natural, and problematic datasets are available for a video surveillance system. These are based on resolution, cluttered nature of background, diversity in scenes, and object’s activity/event categories etc. These publicly available datasets are the benchmark for the computer vision community. These are available in Table 3.
Table 1 Real-time challenges in object detection in video

Challenge | Description
Foreground aperture [45] | Part of a moving homogeneous object is classified as background instead of being classified as moving pixels
Camouflage [45] | Due to the similar colour of background and foreground, it is very difficult to classify whether a pixel belongs to the background or the foreground
Bootstrapping [45] | The unavailability of background frames causes a problem in the construction of the reference frame
Illumination variation [45] | Illumination change occurs when lights are turned on or off; a sudden change in light intensity gives rise to the illumination problem and to the bogus appearance of pixels
Environmental effect [5, 7, 8, 10] | The major environmental effects are dust, mist, fog, cloudy, sunny, and rainy conditions, etc.
Dynamic situation [45, 46] | Motion in the background, such as moving water in a river or spouting of water behind the moving object
Camera jitter [46, 47] | Camera jitter describes motion of the camera itself caused by vibration
Occlusion [48–51] | An object may be occluded by another object, part of the object may be camouflaged with a similar part of another object, one object may be fully hidden by other objects, or two objects may come so close to each other that they appear to merge
7 Performance Analysis Parameters For evaluation and comparison, the ground truth of each dataset on which the experiments are performed should be available. In an object detection method, the results are binary images in which white pixels represent the detected moving object. The main parameters [5, 10, 42, 46] used for performance analysis are TP, FP, FN, and TN, and most analysis metrics are computed from them.
• True Positive (TP): Also called a hit; the number of pixels detected as moving object that are also marked as moving object in the ground truth.
• False Positive (FP): Also called a false alarm; the number of pixels detected as moving object that are marked as background in the ground truth.
• True Negative (TN): Also called a true rejection; the number of pixels detected as background that are also marked as background in the ground truth.
• False Negative (FN): Also called a miss; the number of pixels detected as background that are marked as moving object in the ground truth.
Some of the most useful metrics are given as follows:
Table 2 Benefits and limitations of various methods

Method | Benefits | Limitations
Spatio-temporal frame difference [1–3, 11] | Simple and easy; subtracts consecutive frames | Not able to resolve cluttered issues; not appropriate for motion estimation
Background subtraction method [2, 5, 7, 10, 11] | Easily applied to real-time applications and fast; provides better detection with a static background; more efficient for a static camera | Dependent on the background model and adaptive maintenance schemes; does not work well for a moving camera and moving object; does not give an estimate of the motion
Optical flow method [4, 6, 43, 44] | Easy to handle; accurate time derivatives; automatic object extraction from the frame; also estimates the motion; works for a moving camera too | Complex; applied in real time only if a special hardware configuration is available
Object tracking method [43, 44] | Works fine for a moving camera; medium in complexity; more efficient in the case of a moving camera | No information on the object's shade; the primary selection should be good
• Precision: Precision is also called the positive predictive value. It measures the percentage of detected pixels that belong to the moving object. Precision = TP/(TP + FP)
• Recall: It measures the percentage of all pixels related to the moving object that are detected correctly. Recall = TP/(TP + FN)
• F1 Measure: It is the weighted (harmonic) average of Precision and Recall. F1-Measure = 2 * Precision * Recall/(Precision + Recall)
• Accuracy: It measures the percentage of all correctly detected and correctly rejected pixels. Accuracy = (TP + TN)/(TP + TN + FN + FP)
This kind of work can also be handled with Big Data technology, as the data are captured through various sensory devices and cameras. If a cloud environment is used to store such Big Data in the form of video, it can be handled easily and many data storage problems are resolved. Video summarization can also be used to store only informative
Table 3 Publicly available benchmark datasets

Dataset | Sequence | Source: URL
Microsoft's wallflower [45] | Bootstrap, Camouflage, Foreground Aperture, Light Switch, Moved Object, Time Of Day, Waving Trees | https://www.microsoft.com/en-us/download/details.aspx?id=54651
Change detection [46] | Library, Dining room, Park, Corridor, Lakeside | https://changedetection.net/
CAVIAR [52] | Walking, Meeting, Shopping, Fighting, Passing Out | https://groups.inf.ed.ac.uk/vision/CAVIAR/CAVIARDATA1/
CSIR-CSIO India (MOTIID) [53] | Hot Ambassador Near, Innova | https://vcipl-okstate.org/pbvs/bench/
Terravic infrared [53] | Terravic Vehicle/Weapon/Motion Infrared | https://vcipl-okstate.org/pbvs/bench/
OSU-pedestrian [53] | OSU-Padestrian_004 | https://vcipl-okstate.org/pbvs/bench/
Houston-zoo [53] | HoustonZoo_Rino_l5b1 | https://www.vcipl.okstate.edu/otcbvs/bench/
VIRAT video [54] | Realism and natural scenes | https://viratdata.org/
Caltech [55] | Detection and recognition | https://www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.html
Various datasets [56] | Huge collection of datasets | https://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm
video data on the cloud and then analyse them easily. Nowadays, various tools are available in MATLAB, OpenCV, and Python libraries which are very helpful for performing such experiments.
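As a small illustration of the metrics above (not code from the surveyed works), precision, recall, F1, and accuracy can be computed from a detected binary mask and its ground truth as follows:

```python
import numpy as np

def evaluate_mask(detected, ground_truth):
    """Compute TP/FP/TN/FN-based metrics for two binary masks (0 = background, 1 = object)."""
    detected = detected.astype(bool)
    ground_truth = ground_truth.astype(bool)

    tp = np.sum(detected & ground_truth)      # object pixels correctly detected
    fp = np.sum(detected & ~ground_truth)     # background pixels wrongly detected
    tn = np.sum(~detected & ~ground_truth)    # background pixels correctly rejected
    fn = np.sum(~detected & ground_truth)     # object pixels that were missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Toy example with random masks just to show the call
rng = np.random.default_rng(0)
pred = rng.integers(0, 2, size=(240, 320))
gt = rng.integers(0, 2, size=(240, 320))
print(evaluate_mask(pred, gt))
```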
8 Conclusion and Future Directions In this manuscript, several aspects of moving object detection have been discussed. The detection of moving objects using static or moving cameras is very common, but achieving the targeted outcome is a genuinely tough task. This manuscript has tried to cover most of the important aspects: various applications, challenges, and methods for moving object detection and tracking have been discussed, along with the parameters most commonly used for evaluating performance. This paper should help new researchers understand the basics of the related domain. In future work, based on the above methods, we will investigate a new method intended to improve on the existing literature.
References 1. Akula A, Ghosh R, Kumar S, Sardana HK (2013) Moving target detection in thermal infrared imagery using spatiotemporal information. JOSA A 30(8):1492–1501 2. Yazdi M, Bouwmans T (2018) New trends on moving object detection in video images captured by a moving camera: a survey. Comput Sci Rev 28:157–177 3. Kalsotra R, Arora S (2019) A comprehensive survey of video datasets for background subtraction. IEEE Access, pp 59143–59171 4. Yadav DK, Suri A, Sharma SK (2019) Moving object detection using optical flow and fuzzy algorithm. J Adv Res Dyn Control Syst 11(11): 840–847 5. Yadav DK, Singh K (2019) Adaptive background modeling technique for moving object detection in video under dynamic environment. Int J Spatio-Temporal Data Sci, Indersci 1(1):4–21 6. Cho J, Yung Y, Kim D, Lee S, Jung Y (2019) Moving object detection based on optical flow estimation and a gaussian mixture model for advanced driver assistance systems. Sensors, MDPI 19:1–14 7. Edward KKN, Delp J (2011) Background subtraction using a pixel-wise adaptive learning rate for object tracking initialization. In: Visual information processing and communication II, Proceedings of SPIE digital library, vol 7882 8. Stauffer C, Grimson WEL (1999) Adaptive background mixture models for real-time tracking. In: IEEE computer society conference on computer vision and pattern recognition, pp 246–252 9. Wu Y, He X, Nguyen TQ (2017) Moving object detection with a freely moving camera via background motion subtraction. IEEE Trans Circ Syst Video Technol 27(2):236–248 10. Yadav DK, Singh K (2016) A combined approach of Kullback-Leibler divergence method and background subtraction for moving object detection in thermal video. Inf Phys Technol 76:21–31 11. Ortego D, SanMiguel JC, Martinez JM (2019) Hierarchical improvement of foreground segmentation masks in background subtraction. IEEE Trans Circ Syst Video Technol 21(6):1645–1658 12. Juan EL, SanMiguel C, Ortego D, Martinez JM (2018) Abandoned object detection in videosurveillance: survey and comparison. Sensors 18(12):1–32 13. Horn BKP, Schunck BG (1981) Determining optical flow. Artif Intell 17:185–203 14. Dosovitskiy A, et al (2016) FlowNet: learning optical flow with convolutional networks. In: 2015 IEEE international conference on computer vision (ICCV) 15. Chen BH, Shi LF, Ke X (2019) A Robust moving object detection in multi-scenario big data for video surveillance. IEEE Trans Circuits Syst Video Technol 29(4):982–995 16. Liu W, Liao S, Hu W (2019) Perceiving motion from dynamic memory for vehicle detection in surveillance videos. IEEE Trans Circuits Syst Video Technol 29(12):3558–3567 17. Li K, Tao W, Liu L (2019) Online semantic object segmentation for vision robot collected video. IEEE Access 7:107602–107615 18. Dai Z, Song H, Wang X, Fang Y, Yun X, Zhang Z, Li H (2019) Video-based vehicle counting framework. IEEE Access 7:64460–64470 19. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. https://arxiv.org/pdf/ 1804.02767v1.pdf 20. Dr˘agana C, Stamatescu G, Dobrin A, Popescu D (2016) Evaluation of continuous consensus algorithm in border surveillance missions. In: 2016 8th international conference on electronics, computers and artificial intelligence (ECAI), Ploiesti, pp 1–6 21. Dhulekar PA, Gandhe ST, Sawale N, Shinde V, Khute S (2018) Surveillance system for detection of suspicious human activities at war field. In: 2018 international conference on advances in communication and computing technology (ICACCT), Sangamner, pp 357–360 22. 
Wan Q, Kaszowska A, Samani A, Panetta K, Taylor HA, Agaian S (2018)Aerial border surveillance for search and rescue missions using eye tracking techniques. In: 2018 IEEE international symposium on technologies for homeland security (HST), Woburn, MA, pp 1–5
23. Arjun D, Indukala P, Menon KAU (2019) Integrated multi-sensor framework for intruder detection in flat border area. In: 2019 2nd international conference on power and embedded drive control (ICPEDC), Chennai, India, pp 557–562 24. Klare BF, Burge MJ, Klontz JC, Vorder Bruegge RW, Jain AK (2012) Face recognition performance: role of demographic information. IEEE Trans Inf Forensics Secur 7(6):1789–1801 25. Smith DF, Wiliem A, Lovell BC (2015) Face recognition on consumer devices: reflections on replay attacks. IEEE Trans Inf Forensics Secur 10(4):736–745 26. Soldera J, Schu G, Schardosim LR, Beltrao ET (2017) Facial biometrics and applications. IEEE Instrum Meas Mag 20(2):4–30 27. Zhu H, Qiu S, Shen J, Yu F (2018) High-accuracy parking surveillance based on collaborative decision making. In: 2018 IEEE international conference on mechatronics and automation (ICMA), Changchun, pp 730–736 28. Kashid SG, Pardeshi SA (2014) Detection and identification of illegally parked vehicles at no parking area. In: 2014 international conference on communication and signal processing, Melmaruvathur, pp 1025–1029 29. Lee JT, Ryoo MS, Riley M, Aggarwal JK (2009) Real-time illegal parking detection in outdoor environments using 1-D transformation. IEEE Trans Circuits Syst Video Technol 19(7):1014– 1024 30. Lim N, Kim J, Lee S, Kim N, Cho G (2009) Screen printed resonant tags for electronic article surveillance tags. IEEE Trans Adv Packag 32(1):72–76 31. Unander T, Nilsson H (2011) Evaluation of RFID based sensor platform for packaging surveillance applications. In: 2011 IEEE international conference on RFID-technologies and applications, Sitges, pp 27–31 32. Wei X, Zhang Z, Chai Z, Feng W (2018) Research on lane detection and tracking algorithm based on improved hough transform. In: 2018 IEEE international conference of intelligent robotic and control engineering (IRCE), Lanzhou, pp 275–279 33. Abueh YJ, Liu H (2016) Message authentication in driverless cars. In: 2016 IEEE symposium on technologies for homeland security (HST), Waltham, MA, pp 1–6 34. Dhall A, Dai D, Van Gool L (2019) Real-time 3D traffic cone detection for autonomous driving. In: 2019 IEEE intelligent vehicles symposium (IV), Paris, pp 494–501 35. Probst T, Maninis K, Chhatkuli A, Ourak M, Poorten EV, Van Gool L (2018) Automatic tool landmark detection for stereo vision in robot-assisted retinal surgery. IEEE Rob Autom Lett 3(1):612–619 36. Abdalla GOE, Veeramanikandasamy T (2017) Implementation of spy robot for a surveillance system using internet protocol of Raspberry Pi. In: 2017 2nd IEEE international conference on recent trends in electronics, information and communication technology, Bangalore, pp 86–89 37. Das H, Chakraborty H, Chowdhury MSU (2019) Design and implementation of voice command based bipedal surveillance robot. In: 2019 1st international conference on advances in science, engineering and robotics technology, Dhaka, Bangladesh, pp 1–5 38. Kim K, Bae S, Huh K (2010) Intelligent surveillance and security robot systems. In: 2010 IEEE workshop on advanced robotics and its social impacts, Seoul, pp 70–73 39. Shin Y, Hwang K, Park J, Kim D, Ahn S (2019) Precise vehicle location detection method using a wireless power transfer (WPT) system. IEEE Trans Veh Technol 68(2):1167–1177 40. Feng R, Fan C, Li Z, Chen X (2020) Mixed road user trajectory extraction from moving aerial videos based on convolution neural network detection. IEEE Access 8:43508–43519 41. Dai Z et al (2019) Video-based vehicle counting framework. 
IEEE Access 7:64460–64470 42. Akula A, Khanna N, Ghosh R, Kumar S, Das A, Sardana HK (2014) Adaptive contourbased statistical background subtraction method for moving target detection in infrared video sequences. Infrared Phys Technol 63:103–109 43. Kanagamalliga S, Vasuki S (2017) Contour-based object tracking in video scenes through optical flow and Gabor features. Optics 157:787–797 44. https://nanonets.com/blog/optical-flow/ 45. https://www.microsoft.com/en-us/download/details.aspx?id=54651
46. Goyette N, Jodoin P-M, Porikli F, Konrad J, Ishwar P (2012) changedetection.net: a new change detection benchmark dataset. In: Proceedings IEEE workshop on change detection (CDW-2012) at CVPR-2012, Providence, RI, 16–21 June 2012. https://changedetection.net/ 47. Avola D, Cinque L, Foresti GL, Massaroni C, Pannone D (2017) A key point-based method for background modeling and foreground detection using a PTZ camera. Pattern Recogn Lett 96:96–105 48. Liu D, Shyu M, Zhu Q, Chen S (2011) Moving object detection under object occlusion situations in video sequences. In: 2011 IEEE international symposium on multimedia, Dana Point CA, pp 271–278 49. Kim JU, Kwon J, Kim HG, Ro YM (2020) BBC Net: bounding-box critic network for occlusionrobust object detection. IEEE Trans Circuits Syst Video Technol 30(4):1037–1050 50. Ouyang W, Zeng X, Wang X (2016) Partial occlusion handling in pedestrian detection with a deep model. IEEE Trans Circuits Syst Video Technol 26(11):2123–2137 51. Chen X, Xu X, Yang Y, Wu H, Tang J, Zhao J (2020) Augmented ship tracking under occlusion conditions from maritime surveillance videos. IEEE Access 8:42884–42897 52. https://homepages.inf.ed.ac.uk/rbf/CAVIARDATA1/ 53. https://vcipl-okstate.org/pbvs/bench/ 54. https://viratdata.org/ 55. https://www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.html 56. https://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm
Chapter 21
Tomato Leaf Features Extraction for Early Disease Detection Vijaya Mishra, Richa Jain, Manisha Chahande, and Ashwani Kumar Dubey
1 Introduction Nowadays, India's economy is growing rapidly. The recent growth and advancement of the food technology industry have placed the country at the fifth rank in consumption, production, and expected growth in the food processing industry. Since the problem of plant disease is a worldwide concern, disease detection at an early age of the plant has become a major concern and an important field of research. Once plant disease symptoms are observed, the presence of disease and its category are verified using certain algorithms and disease detection techniques. Tomato is prone to many bacterial and fungal diseases. These diseases are difficult to control, and hence early detection becomes very important. Tomato is a vegetable with an incredible rate of production throughout the world. In this paper, a histogram-based analysis is done on the types of diseases that can affect the tomato plant. Histograms are graphs used to depict the contrast, dynamic range, and saturation level in an image. Images are stored as pixel values, and each pixel value reflects an intensity level.
2 Related Work Tomato is produced in large amounts everywhere in the world due to its high consumption; therefore, it is also targeted by thousands of pests and diseases [1]. Usually, these kinds of diseases are treated with pesticides [2]. Many researchers have used leaves to identify diseases in plants [3, 4]. In another application, pattern recognition on cotton leaves has been implemented to identify and classify diseases in cotton plants [5]. The K-means clustering method has also been implemented to isolate each infected spot on the leaves [6]. Histogram-based segmentation and disease detection in potato has also been implemented, achieving a classification accuracy of up to 97.5% [7]. A fuzzy-based weed shape detection method can achieve 92.9% accuracy [8]. Similarly, a support vector machine (SVM) can give detection accuracy up to 98.2% [9]. Sankaran and Ehsani [10] used k-nearest neighbour (KNN) and quadratic discriminant analysis (QDA) to detect disease in citrus leaves and found the highest accuracy to be 99.9%. Cheng et al. [11] used AlexNet to identify and classify agricultural pests and obtained accuracy up to 98.67%. Ferreira et al. [12] used ConvNets to perform disease detection in soybean plants and achieved an accuracy of 99.5%. Recently, deep convolutional neural networks (DCNN) have been gaining popularity for accurately classifying multiple diseases [13, 14].
3 Diseases in Tomato Leaves 3.1 Early Blight This disease is very common on tomato as well as on potato plants. Early blight is mainly caused by Alternaria solani, which is a fungus found throughout the USA. It first appears on the lower or older leaves as small brown spots resembling a bull's eye. As the disease matures, it spreads throughout the leaf, turning the leaf yellow until it dies; afterwards the upper part of the plant starts getting infected. This disease has also been observed on tomato seeds [14], as shown in Fig. 1 [15].
Fig. 1 Early blight in tomato leaf (Lycopersicon) [15]
Fig. 2 a Green and yellow mosaic pattern on leaf infected with Tomato mosaic virus (TMV), b virus symptoms on a tomato seedling [17]
3.2 Mosaic Virus Tomato mosaic virus is one of the oldest plant viruses affecting the tomato plant. It spreads extremely easily and can be destructive to plants, mainly tomato. The symptoms of mosaic virus on the tomato plant are hard to spot. Tomato mosaic virus symptoms can be found at any stage of growth, and all parts of the plant may be infected. They are often seen as a general mosaic appearance on the surface of the plant. When the plant is severely affected, leaves may become stunted [16], as shown in Fig. 2 [17].
3.3 Target Spot Target spot, often simply called early blight, is one of the most common diseases found in potatoes and tomatoes. It is caused by the fungus Alternaria solani. It originates as small circular to oval dark brown spots. These spots first enlarge and then become oval to angular in shape. Under favourable conditions, the individual spots grow up to a certain size. When the disease becomes severe and a cause for concern, all the spots merge with each other, causing an upward rolling of the leaf tips and eventually leading to death [18], as shown in Fig. 3 [19].
3.4 Yellow Leaf Curl Virus This virus is an ailment of the tomato plant in which an infected plant shows stunted and upright growth. Plants infected at an early stage of growth show a severe amount of stunting [20], as shown in Fig. 4 [21].
Fig. 3 Close-up to show the ring patterns in the leaf spot caused by target spot on tomato [19]
Fig. 4 Yellow leaf curl virus on tomato leaf [21]
4 Methodology The evaluation of diseases is a hard problem [22], so this work concentrates on only one plant, mainly tomato. In this paper, images have been used to pick out diseases based on the signs that differentiate one plant from another [23]. Recent advancements in the fields of machine learning and computer vision require distinctive features to generate accurate inferences for the tasks they are designed for. The images used in this work are taken from the available datasets [24] of distinct types of tomato plants, as given in Fig. 5.
Fig. 5 Images of diseased tomato leaves from plant village dataset [24]
The major steps involved in the feature extraction process of tomato leaf images are: acquisition of image, pre-processing of the image, segmentation, feature extraction, and disease identification as presented in Fig. 6. Fig. 6 Process flow of tomato disease detection through leaves pattern analysis
4.1 Image Acquisition In this paper, a direct image acquisition process is not adopted. The image datasets have been taken from the Plant Village Dataset [24] for analysis and evaluation.
4.2 Image Pre-processing This process is used to resize the captured image from high to low resolution. Each captured image needs to have a definite size so that it can be analyzed consistently. This intermediate process is necessary to extract accurate statistical features of the tomato leaves [23].
4.3 Disease Segmentation Segmentation means dividing large data (an image) into multiple fragments or segments for simpler analysis. Disease segmentation is a crucial step to make the data more meaningful. Here, the dataset (leaf images) is classified into small segments on the basis of the different types of diseases in tomato leaves: early blight, mosaic virus, target spot, and yellow leaf curl virus. This classification can be done with the help of histograms and statistical features computed on the different datasets [23], as shown in Figs. 7, 8, 9 and 10.
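A rough sketch of such a colour-based segmentation of diseased regions is given below; the HSV bounds are illustrative assumptions for brownish/yellowish lesions and are not values reported in the paper:

```python
import cv2
import numpy as np

def segment_lesions(leaf_bgr):
    """Roughly isolate discoloured (non-green) lesion regions of a leaf image."""
    hsv = cv2.cvtColor(leaf_bgr, cv2.COLOR_BGR2HSV)
    # Healthy leaf tissue is assumed to fall in a broad green hue band
    healthy = cv2.inRange(hsv, np.array([35, 40, 40]), np.array([85, 255, 255]))
    leafy = cv2.inRange(hsv, np.array([0, 30, 30]), np.array([180, 255, 255]))
    lesions = cv2.bitwise_and(leafy, cv2.bitwise_not(healthy))  # leaf pixels that are not green
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    lesions = cv2.morphologyEx(lesions, cv2.MORPH_OPEN, kernel)  # drop speckle noise
    return cv2.bitwise_and(leaf_bgr, leaf_bgr, mask=lesions)

img = cv2.imread("tomato_leaf.jpg")
cv2.imwrite("lesions_only.jpg", segment_lesions(img))
```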
Fig. 7 Segmentation of early blight disease
Fig. 8 Segmentation of mosaic virus disease
Fig. 9 Segmentation of target spot disease
4.4 Feature Extraction This is the most remarkable step of image processing: it helps in dimensional reduction of the image and gives a compact feature representation so that further classification becomes easier for the classifiers [22]. In this paper, the following methods
Fig. 10 Segmentation of yellow leaf curl virus disease
have been adopted to explore the image: colour histograms and Haar wavelets. The extracted features will be used to classify the following diseases: early blight, mosaic virus, target spot, and yellow leaf curl virus.
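As a hedged sketch of the histogram-based part of this step (the Haar wavelet part is omitted), the statistics reported later in Tables 1 and 2, i.e. mean, median, range, and standard deviation, could be computed per image roughly as follows:

```python
import cv2
import numpy as np

def histogram_features(image_path):
    """Compute simple intensity statistics and a colour histogram for one leaf image."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    stats = {
        "mean": float(np.mean(gray)),
        "median": float(np.median(gray)),
        "range": int(gray.max()) - int(gray.min()),
        "std_dev": float(np.std(gray)),
    }
    # 32-bin histogram per colour channel as an additional colour feature
    colour_hist = [cv2.calcHist([img], [c], None, [32], [0, 256]).flatten()
                   for c in range(3)]
    return stats, np.concatenate(colour_hist)

stats, hist = histogram_features("tomato_leaf.jpg")
print(stats)
```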
4.5 Identification of Disease in Tomato Plant With the help of the features extracted above, a machine or deep learning-based classifier can classify the diseases in tomato plants, and an early disease predictor can predict which disease type is likely to occur.
5 Results and Discussions In this work, 250 tomato leaf images were taken, of which 50 were healthy tomato leaf images and 200 were diseased tomato leaf images. To evaluate the similarities and differences of each disease, we first visualize the histogram of each analyzed image and compare it with samples of the various diseases. Here, we computed the mean, median, range, and standard deviation, as shown in Tables 1 and 2, from the leaf histograms for diseased and healthy tomato plants. Comparing the features of healthy tomato leaves, such as the mean, there is only a very slight deviation in the values
Table 1 Statistical values calculated from leaf disease histograms

S. No | Tomato leaf disease | Mean | Median | Range | Std. Dev.
1 | Early blight | 97.81 | 87 | 255 | 57.46
  |              | 103 | 104 | 198 | 50.8
  |              | 122.5 | 135 | 255 | 46.8
  |              | 128.1 | 140 | 255 | 52.27
2 | Mosaic virus | 119.6 | 129 | 255 | 39.25
  |              | 119.2 | 130 | 255 | 47.32
  |              | 127.1 | 146 | 255 | 53.47
  |              | 122.8 | 137 | 255 | 47.92
3 | Target spot | 106.6 | 122 | 255 | 52.52
  |             | 118 | 129 | 255 | 52.55
  |             | 115.2 | 128 | 255 | 48.48
  |             | 126.2 | 124 | 221 | 39.68
4 | Yellow leaf curl | 93.01 | 95 | 222 | 37.58
  |                  | 139.7 | 160 | 255 | 59.25
  |                  | 117.7 | 128 | 255 | 55.93
  |                  | 124 | 141 | 255 | 60.46
Table 2 Statistical values calculated from healthy tomato leaf histograms

S. No | Healthy leaf | Mean | Median | Range | Std. Dev.
1 | Healthy leaf 1 | 96.5 | 102 | 255 | 65.05
2 | Healthy leaf 2 | 96 | 100 | 255 | 66.3
3 | Healthy leaf 3 | 97 | 97 | 255 | 72.57
4 | Healthy leaf 4 | 92.25 | 86 | 255 | 53.87
of the mean, whereas a much greater variation in mean values is observed when considering diseased tomato leaves (Fig. 11). On comparing the statistical data shown in Tables 1 and 2, it has been found that the mean of healthy leaves is around 96, whereas for disease-affected leaves it comes out between 100 and 140, and the standard deviation for healthy leaves is around 66, whereas for disease-affected leaves it is between 46 and 60 for the sample dataset. Hence, such statistical data will be fruitful for an early disease prediction system for tomato plants, and preventive measures can be taken to protect the plants at an early stage, before further cost is added to the crop.
Fig. 11 Histograms of four types of tomato leaf diseases
6 Conclusions In this paper, critical diseases that generally occur in tomato plants have been discussed. Statistical features have been extracted from healthy and defected tomato leaves. An analysis of the collected leaf feature data shows that there is a large difference between the statistical features of healthy and defected tomato leaves. The type of disease and the respective histogram are also analyzed to see the intensity distribution in the defected leaf images. A remarkable difference in feature values is obtained while analyzing the statistical leaf feature data. From the histogram visualization, one can determine the colour information, which is helpful for identifying the particular disease in the plant. Hence, an effective leaf-feature-based early disease prediction system can be built to protect the plants before more cost is added to the tomato crop as a whole.
References 1. Mohanty SP, et al (2016) Using deep learning for image-based plant disease detection. Front Plant Sci 7:1419. https://doi.org/10.3389/fpls.2016.01419 2. Bergougnoux V (2013) The history of tomato: from domestication to biopharming. Biot. Adv. 32:170–189. https://doi.org/10.1016/j.biotechadv.2013.11.003 3. Qin F et al (2016) Identification of alfalfa leaf diseases using image. PLoS ONE 11(12):e0168274. https://doi.org/10.1371/journal.pone.0168274 4. Prince G et al (2015) Automatic detection of diseased tomato plants using thermal and stereo visible light images. PLoS ONE 10(4):e0123262 5. Rothe P, et al (2015) Cotton leaf disease identification using pattern recognition techniques. In: 2015 international conference on prevention computer (ICPC). IEEExplore, Pune, India, pp 1–6. https://doi.org/10.1109/PERVASIVE.2015.7086983 6. Sannakki SS, et al (2011) Leaf disease grading by machine vision and fuzzy logic. Intl J Com Tech App 2(5):1709–1716. https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1. 208.8002&rank=1 7. Samanta D, et al (2012) Scab diseases detection of potato using image processing. Intl J Comp T Tech 3:109–113. https://ijcttjournal.org/archives/ijctt-v3i1p118 8. Herrera PJ et al (2014) A novel approach for weed type classification based on shape descriptors and a fuzzy decision-making method. Sensors 14(8):15304–15324 9. Cheng B, et al (2015) Feature based machine learning agent for automatic rice and weed discrimination. In: Rutkowski L, et al (eds) Artificial intelligence and soft computing. ICAISC 2015. LNCS, vol 9119. Springer, Cham. https://doi.org/10.1007/978-3-319-19324-3_46 10. Sankaran S, et al (2013) Comparison of visible-near infrared and mid-infrared spectroscopy for classification of Huanglongbing and citrus canker infected leaves. Agr Eng Int CIGR J 15(3):75–79. https://cigrjournal.org/index.php/Ejounral/article/view/2469/1765 11. Cheng X et al (2017) Pest identification via deep residual learning in complex background. Com Elect Agric 141:351–356. https://doi.org/10.1016/j.compag.2017.08.005 12. Ferreira AS et al (2017) Weed detection in soybean crops using Conv Nets. Com Elec Agr 143:314–324. https://doi.org/10.1016/j.compag.2017.10.027 13. Sladojevic S, et al (2016) Deep neural networks based recognition of plant diseases by leaf image classification. Comp Intell Neuro 289801:1–11 14. Fuentes A, et al (2016) Characteristics of tomato plant diseases a study for tomato plant disease identification. In: ISITC 2016 international Symposium on information technology and conversion. Shanghai, China, pp 226–230 15. https://www.planetnatural.com/pest-problem-solver/plant-disease/early-blight, last accessed 2020/04/22 16. https://www.missouribotanicalgarden.org/gardens-gardening/your-garden/help-for-the-homegardener/advice-tips-resources/pests-and-problems/diseases/fungal-spots/early-blight-of-tom ato.aspx, last accessed 2020/04/22 17. https://www.gardeningknowhow.com/edible/vegetables/tomato/managing-tomato-mosaicvirus.htm, last accessed 2020/04/22 18. https://extension.umn.edu/diseases/tomato-mosaic-virus-and-tobacco-mosaic-virus, last accessed 2020/04/22 19. https://barmac.com.au/problem/target-spot-in-tomato/, last accessed 2020/04/22 20. https://www.pestnet.org/fact_sheets/tomato_target_spot_163.htm, last accessed 2020/04/22 21. https://www2.ipm.ucanr.edu/agriculture/tomato/tomato-yellow-leaf-curl/, last accessed 2020/04/22 22. Sankaran S et al (2010) A review of advanced techniques for detecting plant diseases. Comp Elect Agric 72(1):1–13. 
https://doi.org/10.1016/j.compag.2010.02.007 23. Chaudhary S, et al (2019) A review: crop plant disease detection using image processing. Intl J Inn Tech Expl Eng 8(1):472–477 24. https://www.kaggle.com/noulam/tomato, last accessed 2020/04/22
Chapter 22
Cardiac MRI Segmentation and Analysis in Fuzzy Domain Sandip Mal, Kanchan Lata Kashyap, and Niharika Das
1 Introduction Congestive heart failure is a main cause of mortality among humans in this era. Cardiovascular diseases affect the blood vessels (veins, arteries, and capillaries) and the heart. Different kinds of cardiac imaging techniques such as echography, computed tomography, coronary angiography, and cMRI are available for the diagnosis of cardiac diseases. cMRI plays an important role among cardiac imaging techniques for research and clinical treatment [1]. It is a non-invasive technique by which various physiological measures like cardiac anatomy, functionality, perfusion, and myocardial properties can be obtained. cMRI also provides good image quality of the heart and blood vessels as compared to other image modalities. Physicians also use cMRIs to assess the anatomical composition and functioning of the organ. cMRI has evolved as a standard imaging technique for the non-invasive evaluation of ventricular function, as it provides good-quality information on cardiac structure using adequate protocols. The extraction of information like morphology, perfusion, and tissue viability from cMRI can be done correctly by defining regional borders [2]. Physicians primarily do manual segmentation of the region of interest, which is extensive and tiresome work as well as prone to error. Segmentation of the RV is difficult due to the hazy and poor contrast of the wall boundaries of the RV. This problem increases as we move from the apex to the base of the RV. The inhomogeneity of the blood flow and its similar contrast to the myocardium create hurdles in the delineation of the wall. Thus, an automated approach needs to be developed for
proper segmentation of the RV. Proper segmentation of RV helps in accurate quantification of regional and structural parameters and for the prognosis of various cardiac diseases. In this work, automatic segmentation of RV is done based on the fuzzy property of cMRI. The present work is organized in different sections. Section 2 describes the related work. Section 3 discusses the applied algorithm for segmentation. Section 4 presents the experimental outcomes. Finally, Sect. 5 gives the conclusions.
2 Literature Review Numerous segmentation techniques have already been applied effectively by various authors for the segmentation and diagnosis of different heart ailments. These segmentation techniques include threshold-based, pixel-based, edge- and area-based, and atlas-based techniques [3]. The cMRI is processed separately to find the endocardial and epicardial contours due to the challenges present in segmenting these contours. Endocardial contours can be located in a first step by applying a thresholding or dynamic programming technique. The second step is to find the epicardial contours, which often relies on the endocardial segmentation. Katouzian et al. used an arbitrary point inside the cavity as a reference and applied a power-law transformation to the image [4]; a thresholding technique is then applied to the transformed image to remove small objects and obtain the final segmented image. A combination of morphological and threshold decomposition opening techniques is applied to power-law transformed images by Vincent et al. [5]. Goshtaby et al. applied an edge-based technique to obtain the position and size of the ventricles [6]. Pednekar et al. applied the Hough transformation and a seed point to segment the left ventricle [7]. Gaussian mixture learning with an expectation-maximization model is used for segmentation of the left ventricle by Vlassis et al. [8]. Pednekar et al. and Cootes et al. applied the active appearance model (AAM) and active shape model (ASM) for LV segmentation [9, 10]. A hybrid model of AAM and ASM is applied to segment both LV and RV by Cootes et al. [11]. Rohlfing et al. applied an atlas-guided approach for segmentation [12]. From the literature review, it is found that not much work has been done on segmentation of the right ventricle, due to the fuzziness in the lower part of the ventricle. Therefore, a robust algorithm is required with minimal user interaction that provides accurate segmentation. Furthermore, accurate segmentation gives useful information which is helpful for determining cardiac health.
3 Proposed Algorithm for RV Extraction Extraction of the right ventricle (RV) is obtained by utilizing the fuzzy properties of the image morphology. The regions of the image are not always well defined, so it
can be represented as a fuzzy subset of the image [13]. Different types of fuzzy geometrical properties, which include area, length, breadth, and index of area coverage, are utilized to measure the fuzziness present in the cMRI [14]. Here, the cMRI is represented as I of dimension M × N. Membership values of the fuzzy geometrical properties are determined by selecting a membership function, which assigns higher membership values to the pixels whose chances of belonging to the object are greater. The S-function is used to represent the bright image plane and is defined as:

µ(x) = S(x; p, q, r) = 0                          if x ≤ p
                     = 2((x − p)/(r − p))²        if p ≤ x ≤ q
                     = 1 − 2((x − r)/(r − p))²    if q ≤ x ≤ r
                     = 1                          if x ≥ r        (1)

Here, q and w are known as the cross-over point and the window size, respectively. Mathematically, they are defined as:

q = (p + r)/2 and w = r − p

The proposed algorithm for RV extraction is given as:
1. Define the membership function µ(x) as in Eq. 1.
2. Calculate the different fuzzy geometrical properties, which are defined below.

Area: area(µ) = Σ_{x=1..M} Σ_{y=1..N} µ(x, y)    (2)

Length: length(µ) = max_x { Σ_y µ(x, y) }    (3)

Breadth: breadth(µ) = max_y { Σ_x µ(x, y) }    (4)

Index of area coverage (IOAC): IOAC(µ) = area(µ) / (length(µ) · breadth(µ))    (5)
3. Select those membership values µ for which IOAC has local minimum by changing value of q between minimum and maximum gray values. After that, global minimum value is chosen from selected local minima as cross-over point. 4. The membership plane µ which has minimum IOAC value can be considered as an extracted image of RV.
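A hedged Python sketch of these steps is given below; the ROI handling, the candidate range for q, and the fixed window size are illustrative assumptions, not values prescribed by the paper:

```python
import numpy as np

def s_function(x, p, q, r):
    """Standard S-function membership (Eq. 1), applied element-wise."""
    x = x.astype(float)
    mu = np.zeros_like(x)
    mu[x >= r] = 1.0
    lo = (x > p) & (x <= q)
    hi = (x > q) & (x < r)
    mu[lo] = 2.0 * ((x[lo] - p) / (r - p)) ** 2
    mu[hi] = 1.0 - 2.0 * ((x[hi] - r) / (r - p)) ** 2
    return mu

def ioac(mu):
    """Index of area coverage (Eqs. 2-5) of a fuzzy membership plane."""
    area = mu.sum()
    length = mu.sum(axis=0).max()    # max over x of the column sums
    breadth = mu.sum(axis=1).max()   # max over y of the row sums
    return area / (length * breadth + 1e-12)

def select_membership_plane(roi_gray, window=40):
    """Scan cross-over points q and keep the plane with the globally minimum IOAC."""
    best_q, best_mu, best_val = None, None, np.inf
    for q in range(int(roi_gray.min()) + 1, int(roi_gray.max())):
        p, r = q - window / 2.0, q + window / 2.0
        mu = s_function(roi_gray, p, q, r)
        val = ioac(mu)
        if val < best_val:
            best_q, best_mu, best_val = q, mu, val
    return best_q, best_mu

# Example with a synthetic ROI of the size used in the paper, just to show the call
rng = np.random.default_rng(1)
roi = rng.integers(0, 256, size=(155, 180)).astype(np.uint8)
q_star, mu_star = select_membership_plane(roi)
print("selected cross-over point:", q_star)
```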
4 Experimental Results The proposed algorithm has been validated on 25 end-diastole images taken from different patients. A short-axis view of a cMRI of the heart is shown in Fig. 1. This image contains the RV, which receives blood from the right atrium and supplies it to the lungs, and the left ventricle (LV), which supplies oxygen-rich blood to the other parts of the body. The cardiac image contains regions of similar shape with different or similar intensity values. The area chosen for RV extraction is known as the region of interest (ROI); an ROI of 155 × 180 is selected manually from the cMRI in this work. After that, the RV is segmented by implementing the proposed algorithm explained in Sect. 3. The membership value is selected as 0 and 1 for pixels having gray value less than q and greater than q, respectively. The resultant thresholded image is used for further processing. In the next step, erosion followed by dilation morphological operations are performed on the thresholded image to reduce irregularities of the resulting endocardial contour. Further, two features, area and convexity, are computed for all regions. An object is convex if, for any line segment whose end points lie in the object, all points of the segment also lie within the object. Small regions, having area less than the average area of all the regions, are removed from the image. After that, the binary thresholded image contains the LV, the RV, and some other regions. From the literature, it is known that the shape of the LV is circular and the convexity of the left ventricle is the highest among all regions, so the second most convex region is selected as the segmented RV. Figure 2a, b shows the original cMRI and the segmented RV, respectively.
Fig. 1 Short axis cMRI of heart
Fig. 2 a Original images, b extracted RV images
Accuracy of the extracted RV is validated by calculating the dice coefficient (DC) and the Hausdorff metric (HM) [15, 16]. The DC measures the similarity between the ground truth region and the extracted image region and is formulated by the following equation:

DC(X, Y) = 2 |X ∩ Y| / (|X| + |Y|)    (6)

Here, X and Y represent the areas of the ground truth and the segmented image, respectively. The value of DC varies from 0 (complete non-overlap) to 1 (complete overlap). The Hausdorff metric is defined as the distance between two contours [17]:

HM(X, Y) = max { max_{x∈X} min_{y∈Y} D(x, y), max_{y∈Y} min_{x∈X} D(x, y) }    (7)
Here, D(x, y) indicates the Euclidean distance between two points x and y, and X and Y represent the ground truth and segmented contours, respectively. The average values of DC and HM for the segmented images are shown in Table 1. The accuracy of the proposed algorithm is compared with existing techniques, which include BIT-UPM, GEWU, ICL, and LITIS [18]. It is observed that the proposed methodology outperforms the existing methodologies when analyzing the computed values
Table 1 Average values of dice coefficients and Hausdorff metric with (± standard deviation)
Method | DC | HM (mm)
BIT-UPM | 0.80 ± 0.19 | 11.15 ± 6.62
GEWU | 0.59 ± 0.24 | 20.21 ± 9.72
ICL | 0.78 ± 0.20 | 9.26 ± 4.93
LITIS | 0.76 ± 0.20 | 9.97 ± 5.49
Proposed method | 0.85 ± 0.16 | 10.80 ± 0.19
of DC and HM. The average computed value of DC is 0.85, which is close to the complete-overlap value of 1, and the HM value is 10.80 mm, which is smaller than that of the BIT-UPM and GEWU techniques.
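For illustration only (not the authors' implementation), the two validation measures can be computed from binary masks and contour point sets roughly as follows:

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(seg, gt):
    """Dice coefficient between two binary masks (1 = right ventricle)."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    inter = np.sum(seg & gt)
    return 2.0 * inter / (seg.sum() + gt.sum() + 1e-12)

def hausdorff_metric(contour_a, contour_b):
    """Symmetric Hausdorff distance between two (N, 2) arrays of contour points."""
    d_ab = directed_hausdorff(contour_a, contour_b)[0]
    d_ba = directed_hausdorff(contour_b, contour_a)[0]
    return max(d_ab, d_ba)

# Toy example with two overlapping squares
seg = np.zeros((64, 64), dtype=np.uint8); seg[20:40, 20:40] = 1
gt = np.zeros((64, 64), dtype=np.uint8); gt[22:42, 18:38] = 1
print("DC:", dice_coefficient(seg, gt))
print("HM:", hausdorff_metric(np.argwhere(seg), np.argwhere(gt)))
```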
5 Conclusions A semi-automatic right ventricle segmentation technique for cMRI is proposed in this work. The problem is mapped into the fuzzy domain to overcome the difficulties which arise during the segmentation of cMRI. Four fuzzy morphological properties of the image, namely area, length, breadth, and IOAC, are computed to select the threshold value for extraction of the right ventricle. The proposed algorithm is validated on end-diastole (ED) images only. The accuracy of the segmentation technique is computed based on the dice coefficient and Hausdorff distance and is better than the values computed with existing techniques. In future, a robust method will be proposed to perform segmentation on end-systole (ES) images as well.
References 1. Haddad F, Hunt S, Rosenthal D, Murphy D (2008) Right ventricular function in cardiovascular disease, part i. Circulation 117(11):1436–1448 2. Caudron J, Fares J, Lefebvre V, Vivier P-H, Petitjean C, Dacher J-N (2012) Cardiac MR assessment of right ventricular function in acquired heart disease: factors of variability. Acad Radiol 19(8):991–1002 3. Petitjean C, Dacher J (2011) A review of segmentation methods in short axis cardiac MR images. Med Image Anal 15:169–184 4. Katouzian A, Prakash A, Konofagou E (2006) A new automated technique for left and right ventricular segmentation in magnetic resonance imaging. In: EMBS annual international conference, Aug 30–Sept 3 5. Vincent L (1993) Morphological grayscale reconstruction in image analysis: application and efficient algorithms. IEEE Trans Image Process 2:176–201 6. Goshtaby A, Tuner D (1995) Segmentation of cardiac cine MR images for extraction of right and left ventricular chambers. IEEE Trans Med Images 14(1):56–64 7. Pednekar A, Kurkure U, Muthupillai R, Flamm S, Kakadiaris IA (2006) Automated left ventricular segmentation in cardiac MRI. IEEE Trans Biomed Eng 53(7):1425–1428 8. Vlassis N, Likas A (2002) A greedy EM algorithm for Gaussian mixture learning. Neural Process Lett 15(1):77–87 9. Pednekar A, Kakadiaris IA (2006) Image segmentation based on fuzzy connectedness using dynamic weights. IEEE Trans Image Process 15(6):1555–1562 10. Kedenburg G, Cocosco CA, Köthe U, Niessen WJ, Vonken EPA, Viergever MA (2006) Automatic cardiac MRI myocardium segmentation using graphcut. In: SPIE proceedings, 61440A 11. Cootes TF, Taylor CJ, Cooper DH, Graham J (1995) Active shape models-their training and application. Comput Vis Image Underst 61(1):38–59 12. Rohlfing T, Brandt R, Menzel R, Russakoff D, Maurer C, Vadis Q (2005) Atlas-based segmentation. In: The handbook of medical image analysis: segmentation and registration models, chap 11
13. Rosenfeld A (1984) The fuzzy geometry of image subsets. Pattern Recogn Lett 2:311–317 14. Pal SK, Rosenfeld A (1988) Image enhancement and thresholding by optimization of fuzzy compactness. Pattern Recogn Lett 7:77–86 15. Pal SK, Ghosh A (1990) Index of area coverage of fuzzy image subsets and object extraction. Pattern Recogn Lett 11:831–841 16. Abi-Nahed J, Jolly M-P, Yang G-Z (2006) Robust active shape models: a robust, generic and simple automatic segmentation tool. In: Proceedings of MICCAI, pp 1–8 17. Huttenlocher D, Klanderman G, Rucklidge W (1993) Comparing images using the Hausdorff distance. IEEE Trans Pattern Anal Mach Intell 15(9):850–863 18. Petitjean C, et al (2013) Right ventricle segmentation from cardiac MRI: a collation study
Chapter 23
Color Masking Method for Variable Luminosity in Videos with Application in Lane Detection Systems Om Rastogi
1 Introduction The inspiration for this paper came from a human's ability to identify and distinguish colors into categories. It can be inferred that the ability of a computer to mask objects with respect to color is noteworthy and makes it more human-like. Color thresholding is a concept in image processing which separates pixels into groups with similar color characteristics [1]. The color masking method segments the image based on boundaries manually set by the user, to extract specific information related to color. However, these methods are not effective under varying luminous circumstances in a video and rely heavily on uniform light conditions. In his paper [2], Peter Belhumeur conveyed that the detection of objects under different illumination conditions cannot be solved using hard constraints. Hence this paper is focused on obtaining variable color thresholding values for variant illumination in a lane detection application. The proposed method is able to provide variant threshold values. As shown in Fig. 1, after adjusting each frame to similar conditions using a luminosity filter, the equation for the upper boundary of the mask is calculated and the masking takes place. Luminosity is the amount of light emitted from a certain light source, whereas brightness is the amount of light manifested or received.
Fig. 1 Different levels of the method
2 Literature Review The lane detection system has been an important area of research for people working on autonomous vehicle systems, and a large number of lane detection algorithms have become available in the literature in recent years [3]. McDonald and Palmer et al. presented methods based on detecting edges and applying the Hough Transform in [4, 5], respectively. However, Parajuli stated in his paper [3]: "These edge methods are sensitive to illumination, heavily affected by shadows, environmental conditions. Additionally, these methods have experienced difficulties when choosing a suitable intensity threshold to remove the unwanted edges." Researchers have addressed the problem of variable luminosity and presented their solutions. Parajuli [3] presented a method for lane boundary detection that is not affected by shadows, illumination, and uneven road conditions: first, points on the adjacent right and left lanes are recognized using local gradient descriptors, followed by a simple linear prediction model to predict the direction of the lane markers. Alvarez et al. [6] proposed an effective approach that relies on a shadow-invariant feature space combined with a model-based classifier. Alberto et al. [7] presented a novel vehicle detection and tracking system with a recursive background modeling approach, introducing luminosity sensors that assist adaptive background modeling of the static scene.
3 Luminosity Filter When the light is bright, objects with a high reflection index shine, making them brighter, so they are not masked. This makes color masking too noisy and unfit for detection purposes. The idea is to normalize the light effect for the object of identification. Some light is reflected from the source; if it is too bright or too dim, then identification becomes difficult. If the incident light is adjusted to near the mid-value, then identification will be sufficient. The process for filtering is (Fig. 2): 1. The image is divided into two parts: – containing the light source – containing the object for segmentation 2. Calculate the mean luminosity from the light-source part.
Fig. 2 Object of identification
3. This mean value is then divided by the middle value of the color range; the result is called "lumin". 4. Lumin is applied to all the pixel values of the object image (Figs. 3, 4 and 5). A minimal sketch of this filter is given below.
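The sketch below illustrates one plausible reading of steps 1–4 with OpenCV. The paper does not spell out the exact arithmetic of "lumin", so here the lightness channel is simply shifted so that the mean luminosity of an assumed light-source region (the top rows of the frame) lands near the middle of the range; the region choice and the shift formulation are assumptions, not the authors' implementation.

```python
import cv2
import numpy as np

def luminosity_filter(frame_bgr, source_rows=slice(0, 100)):
    """Shift the lightness channel so the light-source region sits near mid-range."""
    hls = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HLS).astype(np.float32)
    light = hls[:, :, 1]
    # Steps 1-2: mean luminosity of the part of the frame assumed to hold the light source.
    mean_lum = light[source_rows, :].mean()
    # Steps 3-4: an offset playing the role of "lumin", applied to every pixel of the image.
    lumin = 127.5 - mean_lum
    hls[:, :, 1] = np.clip(light + lumin, 0, 255)
    return cv2.cvtColor(hls.astype(np.uint8), cv2.COLOR_HLS2BGR)
```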
Fig. 3 Without luminosity filter (range 58–200)
Fig. 4 Without luminosity filter (range 100–200)
Fig. 5 With luminosity filter (range 58–200)
4 Variable Color Masking A masking task requires upper and lower boundary conditions: all pixels with values between the boundaries are set to one, while the others are set to zero in a new binary image. These boundary values depend on the required segmentation and on the color model of the image. For variable luminosity, the HSL or HSV color model is considered the most suitable, as the last channel controls the brightness of the color. In the HSV model, the 'V' channel is the one relevant to luminous changes, hence this value is varied in each frame. The mean and standard deviation of the image are used to train a regression model [8], which gives a linear equation to calculate the lower boundary of the mask, n = C + Ax + By, where n is the lower bound, x and y are two features, A and B are the coefficients of x and y respectively, and C is the intercept. For the study, the bound n was set manually for each frame individually, while the other features were calculated simultaneously and stored. Four measures were considered for the study:
1. Standard deviation of the image histogram of the horizontal projection (hsd)
2. Standard deviation of the image histogram of the vertical projection (vsd)
3. Mean of the source image (mean)
4. Standard deviation of the histogram of the object image (lsd).
Figure 6 implies mean value and vsd value have the highest correlation, while hsd value is least related. Fig. 6 Correlation chart of features
Figures 7, 8 and 9 show the correlation of the boundary threshold value n with different properties of the image. The lsd value carries little weight and behaves quite randomly, hence it is not considered in the model. Only mean and vsd are used in the linear regression model, n = C + Ax + By, where x is the mean, y is the vsd value, A and B are their coefficients, and C is the intercept (Fig. 10). For the fitted model, A = −0.61412, B = −0.00125, C = 231.4917.
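A minimal per-frame masking sketch follows, assuming the reported coefficients, that the computed n is used as the lower bound on the V channel with a manually chosen upper bound, and a simple definition of the vsd feature (standard deviation of the histogram of the column-wise projection); these details are not fully specified in the text.

```python
import cv2
import numpy as np

A, B, C = -0.61412, -0.00125, 231.4917  # coefficients reported in the text

def frame_features(gray):
    mean = gray.mean()                       # mean of the source image
    vertical_projection = gray.sum(axis=0)   # column-wise projection
    vsd = np.histogram(vertical_projection, bins=256)[0].std()
    return mean, vsd

def variable_mask(frame_bgr, upper=255):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    mean, vsd = frame_features(gray)
    n = C + A * mean + B * vsd               # per-frame threshold from the regression
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    v = hsv[:, :, 2]
    # keep pixels whose V value lies between the computed bound and `upper`
    return cv2.inRange(v, int(np.clip(n, 0, 255)), upper)
```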
5 Results The proposed method gives more information than the classical thresholding approach; a comparison between both methods is shown. Figures 11 and 12 contain the same frames, processed with the existing and the proposed method respectively. Figure 11 shows frames taken at different instances after color masking with constant bounds, i.e., the classical color thresholding method tuned to the initial frames. Most frames cannot deliver any information, and hence this method is not usable. Figure 12 captures the same frames with the variable color masking method, tuned to the variable luminosity, and all of them provide the information essential to detect lanes. Fig. 7 Vsd versus n graph
Fig. 8 Mean versus n graph
Fig. 9 lsd versus n graph
Fig. 10 Linear regression model
6 Limitations Color masking fails and makes mistakes for high-contrast frames. This can be seen whenever intense light from a slit-type structure is observed (Fig. 13). The method is also limited by problems such as occlusion and clutter [9]. Shiny surfaces also cannot be masked out, as they show colors similar to the lane markers. Irregular surfaces and lighting also obscure the information, as in Figs. 12f, h and 14.
Fig. 11 a Frame 1. b Frame 2. c Frame 3. d Frame 4. e Frame 5. f Frame 6. g Frame 7. h Frame 8. i Frame 9
Fig. 12 a Frame 1. b Frame 2. c Frame 3. d Frame 4. e Frame 5. f Frame 6. g Frame 7. h Frame 8. i Frame 9
Fig. 13 High saturation of light
Fig. 14 Color masked image of above frame
7 Conclusion In the above experiment, the method was able to provide lane information much more clearly than normal color masking techniques. Although this does not by itself create a robust or state-of-the-art lane detection technology, it is a fairly accurate and computationally cheap method to segment road marker information for further processing. However, some images cannot be color masked due to extreme saturation and high contrast, and overcoming the failure of the color masking algorithm for high-contrast frames remains a subject of future work. This research aims not only to improve lane detection algorithms but also to increase the scope of color masking technology. Color is one of the most essential parts of vision, and understanding the effect of luminosity on color will benefit computer vision technology.
References 1. Fermin CD, Degraw S (1994) Colour thresholding in video imaging. Department of Pathology and Laboratory Medicine, Tulane University School of Medicine, New Orleans and I Oncor Imaging, Gaithersberg, USA 2. Belhumeur P, Jacobs DW (1998) Comparing images under variable illumination. In: Proceedings of CVPR98 3. Parajuli A, Celenk M, Riley HB (2013) Robust lane detection in shadows and low illumination conditions using local gradient features. Open J Appl Sci 4. McDonald J (2001) Detecting and tracking road markings using the Hough transform. In: Proceedings of the irish machine vision and image processing conference 5. Palmer PL, Kittler J, Petrou M (1993) An optimizing line finder using a Hough transform algorithm. Comput Vis Image Underst 68(1):1–23 6. Álvarez JM, Lopez AM (2010) Road detection based on illuminant invariance. IEEE Trans Intell Transp Syst
7. Faro A, Giordano D, Spampinato C (2011) Adaptive background modeling integrated with luminosity sensors and occlusion processing for reliable vehicle detection. In: IEEE transactions on intelligent transportation systems 8. Montgomery DC, Peck EA, Vining GG (2012) Introduction to linear regression analysis 9. Steger C (2002) Occlusion, clutter, and illumination invariant object recognition. In: International archives of photogrammetry and remote sensing 10. Arshad N, Moon K-S, Park S-S, Kim J-N (2011) Lane detection with moving vehicles using color information. In: Proceedings of the world congress on engineering and computer science 2011 vol. I WCECS 2011, 19–21 Oct 2011 11. McCall JC, Trivedi MM (2004) An integrated, robust approach to lane marking detection and lane tracking. In: IEEE intelligent vehicles symposium, University of Parma, Parma, Italy 12. Kulkarni N (2012) Color thresholding method for image segmentation of natural images. In: I.J. image, graphics and signal processing 13. Du EY, Chang CI (2004) An unsupervised approach to color video thresholding. In: Remote sensing signal and image processing laboratory 14. Jumb V, Sohani M, Shrivas A (2014) Color image segmentation using K-means clustering and otsu’s adaptive thresholding. Int J Innov Technol Exp Eng (IJITEE) 15. Lipsk C (2008) A fast and robust approach to lane marking detection and lane tracking. IEEE SSIAI, 24–26 Mar 2008 16. Kim Z (2008) Robust lane detection and tracking in challenging scenarios. IEEE Trans Intell Transp Syst 9(1):16–26 17. Sivaraman S, Trivedi MM (2010) Improved vision-based lane tracker performance using vehicle localization. In: Proceedings of IEEE intelligent vehicles symposium, 21–24 June 2010 18. Borkar A, et al (2009) A layered approach to robust lane detection at night. In: IEEE CIVVS, Mar 30–Apr 2 19. Ramstrom O, Christensen H (2005) A method for following unmarked roads. In: Proceedings IEEE intelligence vehicles symposium 20. Tsai C-Y, Liu T-Y (2015) Real-time automatic multilevel color video thresholding using a novel class-variance criterion. In: Machine vision and applications
Chapter 24
An Automated IoT Enabled Parking System Meghna Barthwal, Keshav Garg, and Mrinal Goswami
1 Introduction One of the most concerning issues faced in today's lifestyle is traffic congestion, caused not only by the unavailability of sufficient parking but also by illegal parking [2]. This problem is becoming more prominent in urban cities with shopping malls, cinema theatres, metro stations, etc. In India, network failures are very common and not all drivers are highly skilled, so a parking system must not rely on the network and should use minimum technology at the user end with minimum prerequisites. Several parking systems have been proposed to date, but none solves the above problem entirely [3, 4]. Existing systems use Arduino, Raspberry Pi, Wi-Fi modules, and IR/ultrasonic sensors to detect the availability of parking slots and deliver this information to the user [5]. Ultrasonic sensors detect the presence of a vehicle at a particular parking spot and store a suitable flag ("1" if occupied, "0" if empty), and a Wi-Fi module then sends this data to the cloud, where it is processed by a microcontroller (Arduino, Raspberry Pi). These systems use a large number of Wi-Fi modules (i.e., one at every parking spot), thus increasing the cost of internet utilization charged by the ISP [6]. Some concepts even use cloud services to maintain a database of vehicles availing the parking service, through IBM Cloud or other providers, and give the customer an interface (GUI) to book slots and collect information [1, 4, 7]. However, these parking systems do not solve the issue of illegal parking, as nothing is proposed regarding the authenticity of the user using the parking space; moreover, there is no automation
M. Barthwal · K. Garg · M. Goswami (B) Department of Systemics, School of Computer Science, University of Petroleum and Energy Studies, Bidholi, Dehradun, India e-mail: [email protected] K. Garg e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_25
proposed while entering the parking space (i.e., the checkpoint), so a guard is required to open the boom barrier and let the vehicle enter the parking area. The few systems that reduced the cost of internet usage again failed to provide authorization and security for the vehicles [4]. The existing structure of parking areas needs only a small change to serve a smarter India, as do the new parking areas coming up. The parking system proposed here is based on RFID authorization, implemented using an Arduino UNO Wi-Fi and ultrasonic sensors. Regarding the technology used at the customer's end to resolve issues like network failure or other connectivity problems, ID cards (containing an RFID chip) are allotted to each customer. A card can be recharged using any wallet or by visiting the respective parking office. On arriving at the parking checkpoint, the customer scans their ID card, and on successful detection a parking charge is deducted and displayed on a nearby screen. The next section highlights the existing parking systems. Section 3 investigates the proposed system, a comparison of the proposed system with the existing systems is described in Sect. 4, and Sect. 5 concludes the paper.
2 Related Work In [7], an automated parking system is investigated which focuses on providing the user with information on all the empty slots to save the time and hassle of parking vehicles. It also checks whether a parked vehicle is properly parked by taking snapshots of the parking space and reading the number plate of the parked vehicle. Further, it tracks the time duration for which the vehicle is parked. However, it fails to provide security in the system, and it uses old and slow technologies, which in turn increase maintenance as well as development cost. An improved version came onto the market that solved the problem of cost and internet usage [4]: a smart parking system using LoRa technology, which uses LoRa transmitters and receivers instead of Wi-Fi modules to reduce the cost. In addition, an IBM Cloud database was maintained, and all the information regarding the available slots was sent to the user's mobile phone through an Android application. Despite several advantages, the system fails to address some important issues like authorization and the entry or exit of the vehicle (still manual). Moreover, an Android phone is used to act as a stopper; however, not everyone in India has a smartphone. In [1], another parking system was proposed that solved the problem of automation and authorization. The system, which uses RFID chips for authorization, was built with an Arduino microcontroller and Wi-Fi modules. IR sensors were used to detect the availability of empty spaces, the information was stored on cloud web services, and the user was notified about the details of parking slots through SMS on the registered mobile number (using a GSM module). IR sensors fail when a black vehicle arrives, as the black color absorbs the emitted light and the output reads as infinite distance. Also, without a valid ID card, the customer cannot enter the parking area. The entire system was divided into 3 major parts: first being the ID authorization
and slot detection; second, updating the cloud database; and third, SMS notification to the user. However, the biggest concern with that system is network failure in a country like India: if networks fail, there is no guarantee of getting the SMS on time. Moreover, Wi-Fi modules are heavily used, which in turn increases both the cost and internet usage. Also, the payment gateway problem is still unsolved.
3 Proposed Methodology This work targets a fully automated parking system with minimal use of technology at the user's end to avoid disruption in the process. The only prerequisites are a valid ID card and the ability to drive; the rest of the process is managed automatically. An ultrasonic sensor is used for slot detection, unlike the IR sensor in [1]. The ultrasonic sensors here mainly detect the presence of a vehicle and provide the corresponding data (0 or 1) to the Arduino, which is the control center of the entire proposed system. RFID, on the other hand, is essential for entering the parking area and serves as the basis of authorization and security for the parked vehicle. On successful detection of the RFID ID card, a servo motor rotates and the boom barrier is lifted, allowing the automobile to enter. A smarter way of updating the database is followed, which saves the cost of using multiple Wi-Fi modules. As discussed in the introduction, the parking system proposed here is based on RFID authorization, implemented using an Arduino UNO Wi-Fi and ultrasonic sensors. ID cards (containing an RFID chip) are allotted to each customer to resolve issues like network failure or other connectivity problems at the customer's end. A card can be recharged using any wallet or by visiting the respective parking office. On arriving at the parking checkpoint, the customer scans their ID card; on successful detection, a parking charge is deducted and displayed on a nearby screen, alongside a list of all available parking slots, and the barrier is lifted automatically to let the vehicle in. In addition, the available balance can also be displayed on the screen. During exit, the card must be swiped again, but no amount is deducted this time (to ensure the safety of the vehicle). Instead of using multiple Wi-Fi modules at different parking lots, the proposed system uses an Arduino with integrated Wi-Fi, which saves both internet usage cost and system complexity [4]. Moreover, the system comes in three variants suitable for different situations, as follows:
1. Commercial Parking (available for all)
• No specific need to maintain a database.
• An ID card is a must to avoid illegal parking, and a dedicated parking amount is always deducted.
2. Residential Parking (for societies and apartments)
• A database can be maintained to distinguish between residents and visitors.
• No deduction of a parking amount; only authorization is required.
3. Companies/Organizations
• A database can be maintained to keep track of customers and employees.
• No parking amount deduction for employees.
3.1 Design and Implementation This section discusses the complete architecture of the proposed system. The proposed parking system is shown in Fig. 1. It can be divided into three phases: 1. ENTRY: The entry flow of the proposed system is shown in Fig. 2. On approaching the check post of the parking area, the user needs to swipe/punch his/her ID card on the RFID reader. On successful detection of the card, the boom barrier opens up, allowing the vehicle to enter the area, and a parking amount is deducted and displayed on the screen along with the empty slots (ordered by their distance from the check post).
Fig. 1 The proposed parking system
Fig. 2 Entry flow
2. BACKEND: All the parking slots are stored in a data structure (say, an array) in ascending order of their distance from the check post. Ultrasonic sensors installed at all the parking slots are controlled by the Arduino, which stores flag values of '0' and '1' for occupied and available, respectively, in another data structure (i.e., a flag value is allotted for every slot). Whenever a card is detected, the Arduino processes all the slots with flag value '1', and these are then displayed on the screen in the same order (a sketch of this logic is given after this list). On successful detection of the card, the Arduino also commands the barrier, and the entry closes one minute after opening to avoid congestion. After the vehicle is parked at its chosen parking slot, the ultrasonic sensor updates the value of that slot from '1' (available) to '0' (occupied). Once the ID card is punched, the details linked with the respective card can be stored in a remote database over the cloud (if required, as mentioned for variants 2 and 3). The Arduino UNO Wi-Fi comes with inbuilt Wi-Fi, which makes it easy to send data from the Arduino to the cloud. For variant 1, or scenarios where database maintenance
Fig. 3 Exit flow
is not required, the Arduino does not require a Wi-Fi connection; for this scenario a plain Arduino UNO is used instead of the Arduino UNO Wi-Fi, which also helps reduce the prototype development cost. 3. EXIT: The exit flow of the proposed system is shown in Fig. 3. To exit the parking area, the driver encounters another boom barrier where he/she again has to scan the ID card for the barrier to open, allowing the driver to exit the area. The complete connection is shown in Fig. 4.
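The following host-side Python sketch mirrors the slot bookkeeping described above (ordered slot list, 0/1 flags, charge deduction, one-minute entry window). It is only a simulation of the logic; the real system runs on the Arduino, and the slot names, distances, and charge are made-up values.

```python
import time

class ParkingBackend:
    def __init__(self, slot_distances):
        # slots kept in ascending order of distance from the check post
        self.slots = sorted(slot_distances.items(), key=lambda kv: kv[1])
        self.available = {slot: True for slot in slot_distances}  # True corresponds to flag '1'

    def sensor_update(self, slot, occupied):
        """Called when an ultrasonic sensor reading changes for a slot."""
        self.available[slot] = not occupied

    def on_card_detected(self, balance, charge=20):
        """Deduct the parking charge and list free slots nearest-first."""
        if balance < charge:
            return None, balance
        free = [slot for slot, _ in self.slots if self.available[slot]]
        barrier_open_until = time.time() + 60  # close entry one minute after opening
        return {"free_slots": free, "barrier_open_until": barrier_open_until}, balance - charge

backend = ParkingBackend({"A1": 5, "A2": 8, "B1": 12})
backend.sensor_update("A1", occupied=True)
print(backend.on_card_detected(balance=100))
```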
Fig. 4 Circuit connection
4 Comparison Table 1 reports a comparison of all the previous systems with the proposed system. Unlike previous systems, only users with a valid ID card and a sufficient balance are allowed into the parking space, which mitigates the problem of illegal parking. It also increases the security of parked vehicles, which is not considered in any of the previous systems. The biggest challenge for the previous parking systems is the automated behavior of the complete system. As is apparent from Table 1, the proposed system is automatically controlled from entry to exit and is thus established as an automated system, whereas previous systems need manual intervention at some point [1, 4, 7] and therefore fail to act as automated systems. A significant advantage of the proposed system is that the driver is not compelled to carry a smartphone and does not need any sort of technical knowledge; a mere easy-to-carry ID card does the job, which also reduces the cost of the system. Ultrasonic sensors set to a distance of 25 cm check the availability of parking slots instead of IR sensors, so customers need not worry about finding available space, as this is already done by the system automatically. The IR sensor in [1] uses light to calculate distance, but the emitted light is absorbed by black surfaces; thus, if a black vehicle occupies a parking space, the sensor will still report the slot as available. Another big advantage of the proposed system is that it reduces problems like network failure at the user end, which are very common in a country like India: as minimal technology is used at the user end, there is no room for problems like network or power failure. Apart from all this, an automatic payment facility is also available in the proposed system, which is missing from all of the previous systems discussed here.
Table 1 Comparison of previous systems with the proposed system
Features         | In [7]     | In [4]     | In [1]        | Proposed
Authorization    | No         | No         | No            | Yes
Automation       | Low        | Moderate   | Partial       | Complete
Security feature | NA         | NA         | NA            | Available
Smart phone      | Required   | Required   | Not necessary | Not necessary
Payment facility | No         | No         | No            | Yes
SMS alert        | Yes        | No         | No            | No
Cost             | Low        | Moderate   | High          | Low
Detection sensor | Ultrasonic | Ultrasonic | IR            | Ultrasonic
5 Conclusion In this paper, an IoT-enabled automatic parking system is investigated. After implementing the hardware and compiling the code, users have a simple and easy-to-use mechanism to find parking slots and complete payment in a completely automatic process. The driver only needs a valid ID card to enter the premises and nothing else. The system is purposely made for countries like India, with minimal technology use at the user end, owing to frequent network failures and unskilled drivers. A database is also maintained for all parked vehicles using the timestamp of the ID card punch and the details attached to the parking card. To overcome the issues discussed earlier, RFID tags along with ultrasonic sensors are used for efficient identification of available parking slots. Motors are used to control the entry and exit gates, which makes the proposed system fully automated. Using this parking system, the nation will be a step closer to the dream of smart cities for a smarter and greater India.
References 1. Elakya R, Seth J, Ashritha P, Namith R (2019) Smart parking system using IoT. Int J Eng Adv Technol 9:6091–6095 2. Ghorpade SN, Zennaro M, Chaudhari BS (2020) GWO model for optimal localization of IoTenabled sensor nodes in smart parking systems. IEEE Trans Intell Transport Syst 1–8 3. Ke R, Zhuang Y, Pu Z, Wang Y (2020) A smart, efficient, and reliable parking surveillance system with edge artificial intelligence on iot devices. IEEE Trans Intell Transport Syst 1–13 4. Kodali RK, Borra KY, Sharan Sai GN, Domma HJ (2018) An IoT based smart parking system using LoRa. In: 2018 International conference on cyber-enabled distributed computing and knowledge discovery (CyberC), pp 151–1513 https://doi.org/10.1109/CyberC.2018.00039 5. Mahendra BM, Sonoli S, Bhat N, Raghu T (2017) IoT based sensor enabled smart car parking for advanced driver assistance system. In: 2017 2nd IEEE international conference on recent trends in electronics, information communication technology (RTEICT), pp 2188–2193
6. Mendiratta S, Dey D, Rani Sona D (2017) Automatic car parking system with visual indicator along with IoT. In: 2017 International conference on microelectronic devices, circuits and systems (ICMDCS), pp 1–3 7. Sadhukhan P (2017) An IoT-based e-parking system for smart cities. In: 2017 International conference on advances in computing, communications and informatics (ICACCI), pp 1062– 1066
Chapter 25
Real-Time Person Removal from Video B. Bharathi Kannan, A. Daniel, Dev Kumar Pandey, and PrashantJohri
1 Introduction The process of removing specific areas, such as a human or any other unwanted object, from a video is known as video inpainting. In video inpainting, the unwanted object is removed from the video and the region is filled with the surrounding video background pixels. In earlier times, artists used inpainting techniques to fix or repair images. In general, inpainting automatically fills holes and gaps in an image or video using information from the surrounding area or background in such a way that the observer finds it difficult to spot the modified region. Therefore, inpainting techniques are also called image/video completion. In this work, we have used the TensorFlow framework to run the BodyPix human segmentation model. It segments the video into pixels and returns an array of binary values, with 0 representing all non-human pixels and 1 representing human pixels. A sample segmentation result from the model is shown in Fig. 1.
B. Bharathi Kannan · A. Daniel (B) · PrashantJohri School of Computing Science and Engineering, Galgotias University, Delhi NCR, UP, India e-mail: [email protected] B. Bharathi Kannan e-mail: [email protected] D. K. Pandey Department of Computer Science and Engineering, Galgotias University, Delhi NCR, UP, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_26
Fig. 1 A sample segmentation output on the canvas
2 Proposed Work The objective of this work is to provide a web-based approach for video inpainting that can remove human objects from video. With a static background and a moving human subject, this approach should be able to remove the human object from the video. The work uses Google's TensorFlow.js to run the TensorFlow framework on the web: the user can select any video and use the BodyPix segmentation model to remove the human object from it. Segmentation: Segmentation refers to the segregation of objects from a collection. In this work, segmentation assigns binary values of 0 and 1 to the segmented video pixels. The segmentation model we have used is BodyPix, a pre-trained human segmentation machine learning model that is highly efficient. Given a video, we first render it frame by frame on a temporary canvas, where we process the frame image using the ML model. Next is the segmentation configuration: BodyPix provides options to increase the segmentation accuracy at the cost of processing speed.
3 Algorithm The steps of our algorithm are summarized below.
1. Convert the video into frames on the temporary canvas.
2. Load the BodyPix model using the segmentation configuration that gives maximum accuracy.
3. Get the segmented pixel data of the frame on the canvas.
4. The segmentation result is a one-dimensional array of binary values; all non-human pixels have a value of 0.
5. Loop over every pixel of the frame image.
6. If the pixel value is 0, copy the pixel data from the video to the output canvas; if it is 1, skip the update.
A sketch of this per-pixel copy follows. RESULT: After removing the human pixels (Fig. 2).
Fig. 2 The output result showing no human object in video
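The per-pixel copy in steps 4–6 can be illustrated with NumPy as below. The actual implementation runs in the browser on canvas ImageData with the TensorFlow.js BodyPix output, so the array names and shapes here are illustrative only.

```python
import numpy as np

def update_background(output, frame, person_mask):
    """Copy only non-person pixels from the current frame into the output.

    `person_mask` is the flat 0/1 array produced by a person-segmentation
    model (1 = person); pixels marked 1 keep their previous output value,
    so the person disappears once the background behind them has been
    observed at least once.
    """
    mask = person_mask.reshape(frame.shape[0], frame.shape[1]).astype(bool)
    output[~mask] = frame[~mask]
    return output

# usage with dummy data standing in for decoded video frames
frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
mask = np.zeros(480 * 640, dtype=np.uint8)   # no person detected in this frame
output = np.zeros_like(frame)
output = update_background(output, frame, mask)
```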
4 Conclusion From the results obtained, we can conclude that the segmentation model performed well and its accuracy is good. It only separates human from non-human pixels; shadows are therefore treated as non-human pixels, and as of now we cannot remove shadows using this model. The proposed method fails when there is a moving background or a static foreground, because the model then lacks the pixel data needed to inpaint the video. Also, the configuration of the segmentation model may have to be adjusted for each video as needed. Future work will deal with video sequences having moving backgrounds and foregrounds, as well as random camera movement (including zooming), changes in the size of the moving object, and dynamic backgrounds. The algorithm does not currently tackle the complete analysis of the moving object, and work is underway to adapt the methodology to such scenarios. Dealing with shifts in lighting across the sequence is a problem too; results for adapting to changes in lighting have recently appeared as an extension.
References 1. Sufyan M. Unwanted object removal from video using video Inpainting 2. Beralmio M, Caselles V, Sapiro G, Ballester C (2000) Image inpainting. In: Proceedings of ACM SIGGRAPH, 2000, pp 417–424 3. Patwardhan KA, Bertalmío M, Sapiro G (2007) Video inpainting under constrainedcamera motion. IEEE Trans Image Process 16(2):545–553 4. Zarif S, Faye I, Rohaya D (2013) Static object removal from video scene using local similarity. In: 2013 IEEE 9th international colloquium on signal processing and its applications, 8–10 Mar 2013
Chapter 26
Characterization and Identification of Tomato Plant Diseases Richa Jain, Vijaya Mishra, Manisha Chahande, and Ashwani Kumar Dubey
1 Introduction Diseases in plants cause considerable losses in the economy as well as in plant production. The causes include elevated levels of CO2, which further increase the temperature; temperature affects plants severely, for example through changes in plant structure and plant stress [1, 2]. Plant diseases are not only a problem in India but exist throughout the world, so disease in plants has become a major topic of concern nowadays. Disease in plants majorly affects farmers and, as a result, is a threat to food security [3]. Once an individual plant is affected, the disease spreads to the whole crop, causing severe damage. To prevent this damage, we are left with one option, the use of pesticides; the quantity and concentration of pesticide depend upon the type and severity of the disease [4]. There are various methods to detect the variety of diseases in plants. Different diseases have different colors, shapes, and structures. Hence, in this paper, we have tried to analyze the major diseases that may occur in the fruit of the tomato plant. The diagnosis of disease is the most important step in the disease management of the tomato plant. The symptoms associated with the diseases of the tomato plant are also included in the analysis. Tomato is the most popular vegetable plant in the world, but producing it is not easy. It brings about lots of challenges, as various kinds of
R. Jain · V. Mishra · M. Chahande · A. K. Dubey (B) Amity School of Engineering and Technology, Amity University Uttar Pradesh, Sector-125, Noida, UP, India e-mail: [email protected] V. Mishra e-mail: [email protected] M. Chahande e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_27
disorders, insects, diseases, and pests may affect the plant during its growth and may damage production, leading to poor-quality tomato fruit. Tomato is an essential crop in Indian agriculture; the leading tomato-cultivating states in India are mainly Andhra Pradesh, Punjab, Bihar, Maharashtra, Karnataka, and Gujarat [5]. Tomato is used in various ways in Indian cuisine, mostly as ketchup, in soups, as puree, etc. But producing the tomato plant is not easy, as it suffers from a huge number of diseases, including both air-borne and seed-borne diseases. Diseases are mainly caused by two kinds of factors, biotic and abiotic; biotic factors mainly include bacteria, fungi, and viruses. Thus, it is important to detect these diseases at the right time so as to protect this precious crop.
2 Related Work We have chosen tomato for our analysis because of the availability of data on infected and healthy samples and because of its usefulness [6]. To reduce the impact of infection due to diseases, pesticides are used, which are neither environment friendly nor hygienic [7, 8]. A few researchers have tried to detect disease in tomato plants through different approaches, but the most acceptable approach is the non-destructive one [9]. Fuentes et al. [10] studied how to distinguish diseases in tomato plants. Other approaches to plant disease detection were proposed by Rothe et al. [8] and by Khairnar et al. [11], but they focused on cotton leaf disease identification, and their algorithms were based on pattern recognition. Leaf disease grading was done by Sannakki et al. [9] and Rastogi et al. [12] using machine vision and fuzzy logic. Hence, it may be concluded that a consistent and reliable algorithm that can identify disease in the tomato accurately is still needed.
3 Tomato Plant Diseases and Their Causes The tomato plant is attacked by several diseases and pests [13]. Abiotic disorders are mainly caused by environmental factors such as humidity and various temperature changes [14]. Some common pests are hornworms, tomato fruit worms, stinkbugs, and leaf miners [12, 15]. Some diseases are also caused by bacteria, fungi, and viruses [13, 16, 17]. The symptoms of disease may appear early on the leaves and stem as well, and may look different on different sides of the leaf.
Fig. 1 Nutritional tomato [20]
3.1 Nutritional Tomato Tomatoes are a good source of various vitamins and minerals, such as vitamins A, C, K, and B1 to B7, folate, Fe, K, Mg, Cr, Zn, and P [18, 19]. Initially they appear green and become red when mature; a healthy, nutritious tomato looks as shown in Fig. 1. Different categories of diseases and their sub-types are discussed below.
3.2 Disease Category—Fungal 3.2.1
Anthracnose
The symptoms of this type are the most commonly seen, and the disease also infects the stem, leaves, and roots. The characteristic symptom is a sunken circular lesion on the fruit whose center turns dark, tan in color, and dotted as the lesions mature. A lesion can grow very large. This disease favors warm weather and mostly appears in early spring, as shown in Fig. 2 [20].
Fig. 2 a Sunken and circular lesions on the fruit is a characteristic symptom. b Anthracnose on tomato fruit [20]
Fig. 3 a Early blight symptoms. b A dark lesion at the shoulder of the fruit [20]
Fig. 4 a Raised spots on unripe fruit. b Bird’s eye spot on tomato [17]
3.2.2
Early Blight
The symptoms of this fungal disease begin with an oval-shaped lesion region (LR) and a yellow chlorotic region (YCR) around it. After some time, the LR changes to dark flecks with a ring pattern. At an early stage the lesions seem small, but as the fruit transits to the marketplace they broaden, as shown in Fig. 3 [20].
3.3 Disease Category—Bacterial 3.3.1
Bacterial Canker
There is no specific age at which this disease affects the plant, but it may cause a large amount of crop loss. The spots on tomato fruit due to bacterial canker are known as bird's eye spots, as represented in Fig. 4 [17, 20].
3.3.2
Bacterial Spot
The bacteria that cause bacterial spot in the plant belong to the genus Xanthomonas. Bacterial spot can sometimes be very harmful and is almost impossible to control when conditions favor it [21]. Small water-soaked spot-like lesions appear on the surface, as shown in Fig. 5 [20].
Fig. 5 Bacterial spot [21]
Fig. 6 a Symptoms of TSWV on tomato fruit. b Complex ring spots on fruit infected with TSWV [20, 22]
3.4 Disease Category—Virus 3.4.1
Tomato Spotted Wilt Virus (TSWV)
The infection caused by the TSWV virus is shown in Fig. 6 [20–22]. Here, distinctive bronze- and purple-colored areas can be seen on the surface of the tomato.
3.5 Disease Category—Physical Disorder 3.5.1
Blossom-End Rot
Blossom-end rot is a physical disorder that happens due to a low concentration of calcium in the fruit. Initially it appears tan in color, and thereafter it turns black and leathery in structure, as shown in Fig. 7 [20].
Fig. 7 a, b Blossom-end rot on tomato fruit [20]
Fig. 8 Catfacing symptoms [20]
3.5.2
Catface
This disorder often appears in tomatoes that show a crack on the surface with a corky, brown scar, as shown in Fig. 8 [20].
4 Methodology In this paper, we analyze the major diseases and disorders that may occur in the tomato plant, such as fungal infection, bacterial spot, physical disorder, and viral infection, using image processing, and we then compare the statistical values of diseased tomatoes with those of nutritious tomatoes so that identification of disease can be done early enough to protect the tomato plant. The process flow of disease characterization in the tomato plant is shown in Fig. 9.
Fig. 9 Flow chart of the process in identifying the tomato disease
4.1 Image Acquisition We have collected a large number of images which contain the diseases in different amounts, depending on the farm where they were taken. The dataset has been taken from the Plant Village Dataset, as shown in Fig. 10 [20].
4.2 Image Pre-processing This step resizes the captured image from high to low resolution; each captured image should have a definite size so that it can be analyzed consistently. Then, the captured image is compared with the captured infected image so that it can be diagnosed, using histogram equalization (refer to Figs. 11, 12, 13, 14 and 15). A short pre-processing sketch is given below.
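The following small OpenCV sketch mirrors the pre-processing described above; the 256 × 256 target size and the choice to equalize only the luminance channel are assumptions, not values taken from the paper.

```python
import cv2

def preprocess(path, size=(256, 256)):
    """Resize a captured tomato image to a fixed size and equalize its histogram."""
    image = cv2.imread(path)                       # BGR image from disk
    image = cv2.resize(image, size)                # bring every image to one definite size
    ycrcb = cv2.cvtColor(image, cv2.COLOR_BGR2YCrCb)
    ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])   # equalize luminance only
    return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)
```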
Fig. 10 Dataset of diseased and nutritional tomatoes from Plant Village Dataset [20]
Fig. 11 Segmentation of nutritious tomato fruit
4.3 Disease Segmentation Segmentation means dividing large data (an image) into multiple fragments or segments for simpler analysis, and disease segmentation is a crucial step in making our data more meaningful. Here, we divide our dataset (fruit images) into small segments on the basis of the different types of diseases in the tomato fruit. Diseases in tomato fruit include bacterial spot, fungal infection, mosaic virus, and physical disorder. These classifications can be done with the help of the histograms obtained from the different datasets (refer to Figs. 11, 12, 13, 14 and 15) [15, 19–21].
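One way such histogram-based grouping could look is sketched below, where a sample image is assigned to whichever disease class has the most similar (correlated) reference histogram. This is an illustrative comparison, not necessarily the exact procedure used by the authors, and reference_histograms is assumed to be built beforehand from labeled samples.

```python
import cv2
import numpy as np

def normalized_hist(image_bgr):
    """256-bin grayscale histogram, normalized so differently sized images compare fairly."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    hist = cv2.calcHist([gray], [0], None, [256], [0, 256])
    return cv2.normalize(hist, None).flatten()

def nearest_class(sample_bgr, reference_histograms):
    """Assign the sample to the disease class whose reference histogram correlates best."""
    sample_hist = normalized_hist(sample_bgr)
    scores = {
        name: cv2.compareHist(np.float32(sample_hist), np.float32(ref), cv2.HISTCMP_CORREL)
        for name, ref in reference_histograms.items()
    }
    return max(scores, key=scores.get)
```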
4.4 Feature Extraction This is the most interesting part of the image processing pipeline: it reduces the dimensionality of the image and gives us a compact feature representation so that further classification becomes easier. In this paper, the following methods have been
Fig. 12 Segmentation of bacterial spot disease
Fig. 13 Segmentation of physical disorder
Fig. 14 Segmentation of fungal disease
Fig. 15 Segmentation of mosaic virus disease
Fig. 16 Histogram comparison between diseased tomatoes and nutritious tomatoes
adopted to describe the image: the color histogram and Haar wavelets, as shown in Fig. 16. A sketch of the histogram-based features is given below.
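The statistics reported later in Tables 1 and 2 (mean, median, standard deviation, range) can be computed as sketched below; whether the authors computed them over the grayscale pixel values or over a particular color channel is not stated, so the grayscale choice here is an assumption.

```python
import cv2
import numpy as np

def histogram_features(path):
    """Grayscale histogram of a tomato image plus the summary statistics used in the tables."""
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    hist = cv2.calcHist([gray], [0], None, [256], [0, 256]).ravel()
    values = gray.ravel()
    return {
        "mean": float(values.mean()),
        "median": float(np.median(values)),
        "std": float(values.std()),
        "range": int(values.max()) - int(values.min()),
        "histogram": hist,
    }
```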
5 Analysis and Data Preparation In this paper, the main focus is on the characterization and detection of the varieties of disease in tomato fruit. We have collected a large number of images containing different varieties of diseases, all taken from the Plant Village Dataset [20], which is open source and covers 13 crop species, 26 diseases, 54,306 images, and 38 classes. There are many types of diseases which occur in tomato fruit, such as fungal infection, bacterial spot, physical disorder, viral infection, and so on. At the end, we compare the histogram values of diseased tomatoes with those of healthy tomatoes, as shown in Fig. 16.
6 Results and Discussions In this work, 250 tomato images were taken, of which 50 were healthy tomato images and 200 were diseased tomato images, 50 for each of the four types of disease. To evaluate the similarities or differences of each disease, we first visualized the histogram of each analyzed image and compared it with a sample of the various diseases. We computed the mean, median, standard deviation, and range from the tomato histograms for diseased and healthy tomato plants, as shown in Tables 1 and 2. Compared with the features of the healthy tomatoes, where there is only a slight deviation in the values of mean, median, and standard deviation, a much greater variation in these values is observed for the diseased tomatoes. On comparing the statistical data shown in Tables 1 and 2, it is found that the mean of the nutritious tomatoes lies between 126 and 133, whereas for the diseased tomatoes it lies between 58 and 147, and the standard deviation of the nutritious tomatoes lies between 71 and 75, whereas for the diseased tomatoes it lies between 38 and 76 for the sample data set.
Table 1 Histogram values of nutritious tomato
Nutritious tomato | Mean  | Median | Std. dev. | Range
1. Tomato 1       | 131.4 | 151    | 73.3      | 255
2. Tomato 2       | 129.2 | 121    | 75.85     | 255
3. Tomato 3       | 133.9 | 132    | 74.56     | 255
4. Tomato 4       | 126.4 | 153    | 71.43     | 255
Table 2 Histogram values of diseased tomato
Diseased tomato      | Mean  | Median | Std. dev. | Range
1. Bacterial spot    | 101.7 | 96     | 71.2      | 255
                     | 147.8 | 168    | 49.52     | 255
                     | 114   | 79     | 38.66     | 255
                     | 119.8 | 117    | 76.91     | 255
2. Fungal infected   | 115   | 103    | 69.44     | 255
                     | 100   | 83     | 79.74     | 255
                     | 115.7 | 115    | 69.86     | 255
                     | 81.78 | 115    | 51.75     | 255
3. Mosaic virus      | 97.16 | 98     | 57.11     | 255
                     | 85.67 | 82     | 56.3      | 255
                     | 100   | 104    | 63.12     | 255
                     | 110.8 | 107    | 71.48     | 255
4. Physical disorder | 136.5 | 152    | 42.04     | 255
                     | 63.72 | 51     | 49.77     | 255
                     | 58.8  | 33     | 56.7      | 255
                     | 67.02 | 43     | 57.09     | 255
7 Conclusions This paper presents a statistical approach for tomato plant disease detection based on segmented feature analysis. Optimal results were obtained with very little computational effort, which helps in identifying tomato disease at an early or initial stage. Statistical features were extracted from healthy and defective tomatoes, an analysis was carried out on the collected feature data, and it was found that there is a large difference between the statistical features of healthy and defective tomatoes. The proposed method of disease detection shows good results in detecting plant diseases accurately.
References 1. Cheng X et al (2017) Pest identification via deep residual learning in complex background. Comput Electron Agric 141:351–356. https://doi.org/10.1016/j.compag.2017.08.005 2. Sankaran S et al (2010) A review of advanced techniques for detecting plant diseases. Comput Electron Agric 72(1):1–13. https://doi.org/10.1016/j.compag.2010.02.007 3. Chaudhary S et al (2019) A review: crop plant disease detection using image processing. Int J Inn Tech Expl Eng 8(1), 472–477. ISSN 2278-3075. https://www.ijitee.org/wp-content/upl oads/papers/v8i7/F3506048619.pdf 4. Samanta D et al (2012) Scab diseases detection of potato using image processing. Int J Comput Trends Tech 3(1):97–101. ISSN 2231-2803. https://ijcttjournal.org/archives/ijctt-v3i1p118 5. https://shodhganga.inflibnet.ac.in/bitstream/10603/79514/10/10_chapter1.pdf 6. https://www.gardeningknowhow.com/edible/vegetables/tomato/managing-tomato-mosaicvirus.htm. Last accessed on 22 Apr 2020 7. Dacal-Nieto A et al (2011) Common scab detection on potatoes using an infrared hyperspectral imaging system. In: Maino G, Foresti GL (eds) Image analysis and processing—ICIAP 2011. ICIAP 2011. LNCS, vol 6979. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3642-24088-1_32 8. Rothe P et al (2015) Cotton leaf disease identification using pattern recognition techniques. In: 2015 International conference on pervasive computing (ICPC), Pune, India, pp 1–6. https:// doi.org/10.1109/PERVASIVE.2015.7086983 9. Sannakki SS et al (2011) Leaf disease grading by machine vision and fuzzy logic. Int J Comput Tech Appl 2(5):1709–1716. ISSN 2229-6093. https://citeseerx.ist.psu.edu/viewdoc/download? doi=10.1.1.208.8002&rep=rep1&type=pdf 10. Fuentes A et al (2017) A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition. Sensors 17(9):2022. https://doi.org/10.3390/s17092022 11. Khairnar K et al (2018) Image processing based approach for diseases detection and diagnosis on cotton plant leaf. In: Pawar P et al (eds) Techno-societal 2018. Springer, Cham. https://doi. org/10.1007/978-3-030-16848-3_6 12. Rastogi A et al (2015) Leaf disease detection and grading using computer vision technology & fuzzy logic. In: 2015 2nd International conference on signal Proc. and integrated networking (SPIN), Noida 2015, pp 500–505. https://doi.org/10.1109/SPIN.2015.7095350
13. Lange WH et al (1981) Insect pests of tomatoes. Ann Rev. Entomol 26:345–371 14. Velásquez AC et al (2018) Plant-pathogen warfare under changing climate conditions. Curr Biol 28(10):R619–R634. https://doi.org/10.1016/j.cub.2018.03.054 15. Griffin RP (2019) Tomato insect pests. Available at https://hgic.clemson.edu/factsheet/tomatoinsect-pests/ 16. Damicone Jet al (2017) Common diseases of tomatoes-Part II: diseases caused by bacteria, viruses and nematodes. Extension, Id: EPP-7626 Available at https://extension.okstate.edu/ fact-sheets/common-diseases-of-tomatoes-part-ii-diseases-caused-by-bacteria-viruses-andnematodes.html 17. Young PA et al (1940) Common diseases of tomatoes. Tcx Agr Expt Sta Cir 86–32 18. https://www.livescience.com/54615-tomato-nutrition.html. Accessed on 10 May 2020 19. https://www.britannica.com/plant/tomato 20. https://plantvillage.psu.edu/topics/tomato/infosFuentes 21. https://extension.msstate.edu/publications/common-diseases-tomatoes 22. https://vegetablemdonline.ppath.cornell.edu/factsheets/Virus_SpottedWilt.htm
Chapter 27
Dual-Layer Security and Access System to Prevent the Spread of COVID-19 Arjun Vaibhav Srivastava, Bhanu Prakash Lohani, Pradeep Kumar Kushwaha, and Suryansh Tyagi
1 Introduction We aim to elucidate the effectiveness and advantageousness of the dual-layer security and access system in multiple avenues where existing solutions fail in some capacity. This paper illustrates how this system is better than traditional systems such as biometric- or keypad-based systems. This system is especially beneficial for physically challenged people. The system is built from the ground up with a singular focus on privacy. It will prevent leakage or hacks of sensitive user biometric data such as fingerprints. It also enables a much higher level of flexibility with respect to the security, as making a system extremely secure has its own trade offs. Human–brain– computer interface works on the principle of first converting the EEG waves from the brain into digital signals and then into meaningful information that can be parsed and processed by the computer provided. The formatter will need to create these components, incorporating the applicable criteria that follow. After the preliminary stage of facial recognition, the password authentication is done by using a non-invasive method of brain–computer interface using either earphones or a heads up display. The dual-layer security system is extremely future proof as it can be enhanced with only software updates without requiring any updated hardware thus increasing the A. V. Srivastava (B) · B. P. Lohani · P. K. Kushwaha · S. Tyagi Department of Computer Science and Engineering, Amity University, Noida, Uttar Pradesh, India e-mail: [email protected] B. P. Lohani e-mail: [email protected] P. K. Kushwaha e-mail: [email protected] S. Tyagi e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_28
efficiency and conserving resources considerably. The brain–computer interface can also be used as an input mechanism to enter data and select options.
2 Literature Review A password hashing technique called RIG [1] can be used for client-independent systems. This provides flexibility for choosing between memory and the number of iterations. RIG is useful for systems with limited resources [2]. This paper shows the vital importance of linking biometric information with individuals and using this data to prevent attacks on national security by anarchists and terrorists [3]. This paper shows that users do not know how to properly protect their passwords and presents various techniques to prevent brute-force attacks; the implementation, known as password multiplier, is available for all to use [4]. Biometrics enhance security, but there are still problems, and these problems can be solved by fusion and multimodal systems [5]. The paper leads us to the conclusion that eigenfaces algorithms work better in 2D and fisherfaces algorithms work better in 3D [6]. This paper shows that biometric systems contain flaws, that they should not be used without proper analysis, and that multilayer biometric systems are better [7]. This paper likewise shows that biometric systems contain flaws, should not be used without proper analysis, and that multilayer biometric systems are better [8]. The paper shows that it is better to store data in the cloud because it is more scalable and provides greater performance; cloud storage has lower cost and can have better security than traditional methods [9]. This paper shows that a BCI system can be created using low power and at low cost, and how Bluetooth can be used to transfer data to the computer for processing and to progress to the final layer of the system [10]. This paper shows that BCIs have enormous potential in other arenas, such as video games along with virtual reality, to make the experience immersive [11]. The paper concludes that facial recognition is a new technology and only a tool, not a perfect or final solution, and that before using it we need to study the constitutional and legal implications [12]. This paper shows the growth in the field of brain–computer interfaces and mentions the future potential of such systems, along with the various challenges that we face [13]. This paper shows the way in which a low-cost brain–computer interface can be created to better the lives of people and allow them to live more independently; virtual keyboards can be based on EOG and EMG signals [14]. This study shows the usage of a brain–computer interface with only one intracranial electrode, which allows users to type letters using EEG waves [15]. This paper shows that physically disabled people, such as those with paralysis, are able to move cursors on a virtual screen by using a brain–computer interface [16]. Facial recognition has emerged as the single most useful tool in the arsenal of law enforcement to ensure guilty people are punished and innocent people are spared [17]. Coronavirus is a very deadly disease that can kill millions of people around the world and can wreck the world economy. It can become the biggest problem that has been faced by humankind since the world war. Research suggests that coronavirus
can kill up to 2 to 3% of the affected population. It is a very deadly and uniquely horrible virus that needs to be taken very seriously; it had already killed more than 150,000 people by mid-April 2020. If we do not act by changing our infrastructure for a post-COVID world, then we are committing a big mistake [18]. Research suggests that coronavirus can survive on surfaces such as metal for more than 2–3 days, and that it survives on different surfaces for different durations; the survival strength of the virus depends on the humidity and the temperature of the environment, and this paper shows steps that can be taken to curtail the spread of the virus [19]. This paper has shown that coronavirus can last on surfaces for different durations, and that the relative humidity and temperature of the environment in which the surface is present are the biggest factors affecting how long the virus can last on it [20]. There is potential for using our computers in a way that frees us from constraints and allows the transfer of thoughts directly to the computer; this type of interface can have tremendous implications for the world of software and computer science [21]. Research suggests that masks prevent the spread of coronavirus by means of droplets, and this can lead to a decrease in the number of those infected and subsequently reduce the number of deaths; masks are very effective in stopping the spread of coronavirus [22]. Current expert opinion, along with the guidelines mandated by the various governments of the world, suggests that mask wearing is most effective when a large percentage of the population is compliant; masks should be worn for the greater good along with rational self-interest [23]. Brain–computer interfaces can be used to authenticate users in some types of computer networks, and this can be done by means of electrophysiological signals in reaction to specific inputs.
3 Approach The dual-layer security and access system works on the principle of screening first and verifying later. The first preliminary layer involves facial recognition using a very fast algorithm that uses minimal resources and does not require any special equipment such as infrared sensors, apart from a basic camera (Fig. 1). The final layer of verification involves brain–computer interface. The user is required to wear a specialized headset that can detect changes in the EEG waves produced by the brain, and these waves show a noticeable difference when the user performs a specified action such as blinking or sleeping (Fig. 2). The brain–computer interface software can be trained to recognize the times when a user blinks and register the response along with the associated time stamp instantaneously. This phenomenon can be utilized to authenticate a numeric password. The password can be relayed through audio via headphones into the ears of the user. The user needs to blink when they hear the correct digit on the headphones. Let us take an example where the passcode is 321. Now the numbers start being played in the headphones, and as soon as the first digit of the passcode appears, the user needs to blink, and the response is registered. This means that the user should blink after
Fig. 1 Example of facial recognition technique
Fig. 2 Explanation of how brainsense works
hearing the first digit, i.e., ‘3’, then the next digit, i.e., ‘2’, and then the next and final digit, i.e., ‘1’. As soon as the passcode is correctly authenticated, the login process is successful and the user can access the physical space or data (Fig. 3). There are a vast number of steps that can be taken to ensure enhanced security. The first step that can be taken is to ensure that the numbers relayed in the headphones for authentication appear in a random order so that nefarious elements are unable to interpret the passcode by calculating the time elapsed between blinks (Fig. 4).
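A host-side simulation of the blink-driven passcode check described above is sketched below. The one-second blink window, the 1.5 s digit spacing, one randomized pass of the ten digits per passcode position, and the simulated blink generator are all assumptions; a real deployment would take blink timestamps from the EEG headset's SDK rather than from a function like fake_blinks.

```python
import random

def detect_selected_digit(playback_log, blink_times, window=1.0):
    """Return the digit whose playback window contains a blink, or None.

    `playback_log` is a list of (digit, time_played) pairs for one pass of
    the ten digits; `blink_times` are blink timestamps reported by the EEG
    headset for that pass. A digit counts as selected if a blink occurs
    within `window` seconds after it is played.
    """
    for digit, t in playback_log:
        if any(t <= b <= t + window for b in blink_times):
            return digit
    return None

def authenticate(passcode, get_blinks, spacing=1.5, window=1.0):
    """One pass of randomly ordered digits per passcode position."""
    for expected in passcode:
        order = random.sample(range(10), 10)   # random order defeats timing observation
        playback = [(d, i * spacing) for i, d in enumerate(order)]
        if detect_selected_digit(playback, get_blinks(playback), window) != expected:
            return False
    return True

# simulated "user" who blinks right after hearing the next digit of the code 3-2-1
secret = iter([3, 2, 1])
def fake_blinks(playback, _next=secret):
    target = next(_next)
    return [t + 0.3 for d, t in playback if d == target]

print(authenticate([3, 2, 1], fake_blinks))  # True for this simulated user
```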
Fig. 3 Class diagram of facial recognition system
Fig. 4 Real-life image of brainsense device
4 Methodology There are various different types of security systems used all around the world, in areas ranging from mobile phones to microwaves and from suitcases to nuclear submarines. The various security systems, such as physical keypads, fingerprint-based biometrics, or simple lock and key systems, have their own advantages and disadvantages. For example, let us take a simple lock and physical key-based system. The biggest utility of such a system lies in its sheer simplicity and ease of operation. Such systems are cheap to create and require no maintenance (Fig. 5).
Fig. 5 Proposed system
Brain–computer interface-based security and access systems have a multitude of features that make the system the best option for a variety of avenues. The dual-layer security and access system is especially useful for providing physically disabled people with the means to access utilities such as lockers. If a blind person has to use traditional lock systems, such as those based on fingerprint biometrics or entering passcodes through a keypad, they face enormous difficulties which might be insurmountable at times. The problems can range from being unable to locate the keyhole to a lack of security in entering the passcode in plain sight of nefarious elements. The dual-layer security and access system is especially useful in areas where privacy and extreme security are the major concerns. The dual-layer security and access system can be used to allow physically challenged people who suffer disabilities such as blindness, paralysis, and loss of limbs to operate terminals such as ATM systems, bank lockers, and home security systems with ease. The dual-layer security and access system allows them to enter their authentication details just by thinking. The dual-layer security and access system is highly secure and can be used in places such as defense installations and other places where security is paramount. The dual-layer security and access system is better than traditional systems used for ATM access, as the traditional systems leave open the possibility of fraud and theft. ATM cards can be cloned or stolen. The process of entering the password is also not secure: when the user enters the password into the terminal, there is no privacy, and nefarious elements can find out the password by using hidden cameras or binoculars. The dual-layer security and access system solves all these issues by requiring a dual layer of security in a way in which no one can guess or find out the password. The dual-layer security and access system is also designed with the possibility of forced access to security systems in mind. In cases when thieves force a user to open a lock for an ATM, home locker, household door lock, or any other such lock or access system, the user is helpless and has to provide the thief with access. In the dual-layer security and access system, the user will have a special password that will unlock the lock but will alert the authorities and send the GPS coordinates of the location to the police for prevention of the crime. This makes the dual-layer security and access system unique in giving the user physical security when in
positions of coercion while at the same time keeping the data or the physical valuables secure. There are various different types of security systems that exist. They can be as primitive as lock and key, or as esoteric as voice signature-based systems. A biometric system is one that uses one or more of the physical or behavioral features of the individual, such as fingerprint, retina, iris, ear, voice, gait, palmprint, and others. These unique features of an individual are known as traits, modalities, indicators, or identifiers. Unimodal biometric systems, such as those using only a single physical trait, are not successful in all cases as they suffer from numerous setbacks such as lack of universality, lack of acceptability, and the problem of the trait not being distinct enough. Such biometric systems, henceforth referred to as unimodal biometric systems, lack accuracy and are not as operationally efficient in terms of performance. Absolute accuracy is desired in biometric systems; it cannot be achieved, but we can get close to perfection. Absolute accuracy is much harder to achieve in systems that are unimodal because of issues of intra-class variations, spoof attacks, lack of universality, interoperability issues, noise in the sensor data, inter-class similarities, and other issues. Biometric systems can have varying levels of accuracy, and this can be calculated by the usage of two distinct rates known as “FAR and FRR.” “FAR refers to the false acceptance rate.” “FRR refers to the false rejection rate.” Multimodal biometric systems minimize the FAR and increase the FRR, but this is a trade-off that we have to be willing to make. The perfect biometric system is one that is secure, permanent, universal, distinct, and highly acceptable. There is no current biometric system that meets all of the specified requirements at the same time. It has been found after a lot of research that there might not be a single trait that can satisfy our requirements, but a combination of different traits is capable of doing the job. This means that multimodal biometric systems are successful at this while unimodal biometric systems fail at it. The key is to aggregate data from unimodal biometric systems and use intelligent decision making on them to reach a correct decision. Multimodal biometric systems have emerged as a new and unique solution to the problems faced by security systems of earlier times. This is a highly promising approach that uses the evidence of various biometric traits to create a final decision that has a nearly certain chance of being the correct decision. Due to their high reliability, multimodal systems have evolved in the last few years and are considered a better option as compared to unimodal systems. While unimodal biometric systems work only on the basis of one single trait like fingerprint or retina, multimodal systems work by combining various traits, such as using both fingerprint and retina at the same time. This allows the system to use the better features from each individual biometric. There are multiple avenues in life where a security system needs to be highly secure. There are also many security systems that are a high-value target in the eyes of highly capable thieves and forgers. ATM systems are an intersection of both of these and need the latest technology to stay safe from unwanted and nefarious elements, as attacks on ATMs have taken an upward trajectory due to the coming of new means of hacking and cloning.
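The FAR and FRR mentioned above can be estimated from matcher scores in the straightforward way sketched below; the scores and the decision threshold are invented illustrative values, not figures from the chapter.

```python
# Minimal sketch of estimating FAR and FRR for a biometric matcher.

def far_frr(genuine_scores, impostor_scores, threshold):
    """Accept a claim when the match score is >= threshold."""
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    far = false_accepts / len(impostor_scores)   # false acceptance rate
    frr = false_rejects / len(genuine_scores)    # false rejection rate
    return far, frr

genuine = [0.91, 0.85, 0.78, 0.95, 0.60]   # scores from genuine attempts
impostor = [0.20, 0.35, 0.72, 0.15, 0.40]  # scores from impostor attempts
print(far_frr(genuine, impostor, threshold=0.70))  # (0.2, 0.2)
```

Raising the threshold lowers the FAR at the cost of a higher FRR, which is the trade-off discussed in the text.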
ATMs can be compromised by eavesdropping attacks that involve the thief watching the person entering the PIN into the machine. They can be cracked by spoofing where a
copy of the card is created. Malware and shimming attacks are also a constant source of worry for bank and security officials.
5 Analysis
5.1 Face Detection Recognition System
For the “face detection recognition system,” suppose we have an input face image V which is to be verified. After finding the mean weight vector Ω and the V th weight image vector Ω_V, the distance to be calculated is the Euclidean distance, which can be computed by the formula

ε_V = ‖Ω − Ω_V‖    (1)
In case the distance is lower than the threshold θ, the entered image is said to be a known image, and if it is greater, the entered image is termed an unknown image. In this process, a total of 180 images were used, of which 90 were authentic and 90 were unauthentic. All the coordinates of authentic images were grouped in fours and placed in the learning set, and then the distance of every member was found and checked. In this process of face detection, the accuracy was found to be more than 95%, which was higher than that of some previous face detectors which we had reviewed. This type of face detection technique was developed using the nulled and ranged space of a class within the parameters of the face area. In this technique, amalgamation was attempted in two different ways:
• at the feature level
• at the decision level.
Consolidating data at the decision level calls for the development of classifiers exclusively on the null space and the range space. These two classifiers are then joined using three decision fusion methodologies. Alongside two classical decision fusion strategies, the sum rule and the product rule, we utilize our own decision fusion method, which treats each classifier space independently to upgrade the combined performance. Our method of decision fusion utilizes “linear discriminant analysis (LDA) and nonparametric LDA” on the classifier’s response to upgrade class separability at the classifier output space (Figs. 6, 7 and 8).
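A minimal sketch of the distance test of Eq. (1) is given below, assuming the weight vectors have already been obtained by projecting the images; the vectors and the threshold θ are illustrative values only.

```python
import numpy as np

# Hedged sketch of the verification test in Eq. (1): a probe image's weight
# vector is compared with the stored mean weight vector, and the face is
# declared "known" when the Euclidean distance falls below the threshold theta.
# Variable names and values (omega_probe, omega_mean, theta) are illustrative.

def is_known_face(omega_probe, omega_mean, theta):
    epsilon = np.linalg.norm(omega_probe - omega_mean)  # Euclidean distance
    return epsilon < theta

omega_mean = np.array([0.42, -1.10, 0.75, 0.05])   # stored class representative
omega_probe = np.array([0.40, -1.00, 0.70, 0.10])  # projection of the input face
print(is_known_face(omega_probe, omega_mean, theta=0.5))  # True (distance ~0.12)
```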
5.2 Image Enhancement
An ideal high-pass filter is utilized for image enhancement to achieve image sharpening. These filters emphasize fine details in the image, and the quality of
Fig. 6 Sample for face detection technique
Fig. 7 After fusion level
an image degrades severely when the high frequencies are attenuated or completely removed. Conversely, enhancing the high-frequency components of an image leads to an improvement in the image quality.
Fig. 8 After adding in database successfully
For instance, if the face image is given as the input, then the filter function of an ideal high-pass filter is expressed as Eqs. (2)–(3),

H(u, v) = 0, if D(u, v) ≤ D0    (2)
H(u, v) = 1, if D(u, v) > D0    (3)

where D(u, v) is the distance from the centre of the frequency rectangle and J_F(u, v) is the enhanced image. Similarly, the iris and palm images are enhanced by this filter, and they are expressed as J_I(u, v) and J_P(u, v), respectively.
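The ideal high-pass filtering of Eqs. (2)–(3) can be sketched in the frequency domain as follows; the cut-off D0 and the random test image are assumed values, not parameters reported in the chapter.

```python
import numpy as np

# Sketch of the ideal high-pass filter applied with NumPy's FFT:
# frequencies within radius d0 of the spectrum centre are zeroed,
# all others are passed unchanged, which sharpens the image.

def ideal_highpass(image, d0):
    rows, cols = image.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    V, U = np.meshgrid(v, u)
    D = np.sqrt(U**2 + V**2)          # distance from the centre of the spectrum
    H = (D > d0).astype(float)        # 0 inside the cut-off radius, 1 outside
    F = np.fft.fftshift(np.fft.fft2(image))
    enhanced = np.fft.ifft2(np.fft.ifftshift(H * F))
    return np.real(enhanced)

face = np.random.rand(64, 64)          # stand-in for a face image
sharpened = ideal_highpass(face, d0=10)
print(sharpened.shape)                 # (64, 64)
```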
5.3 FEP-RSA-MM Testing
Image acquisition is a procedure to capture the raw input image. The captured input image is enhanced by the high-pass filters. The enhanced image feature values are extracted by the BEMD. After that, the feature values are fused by the fusion technique. At last, the fused feature values are given for matching. Then, the cipher texts of the face and finger are decrypted in the FEP-RSA-MM testing by RSA decryption. The private key for RSA decryption is obtained from the key generation algorithm [23, 24].
5.4 RSA Decryption Algorithm
Input: the receiver's private key (d) and the received encrypted cipher text.
Fig. 9 RSA operation
Output: the original plain text. After completing the RSA decryption, the trained feature values from the database and the input image features from the testing phase are given as the input to the correlation-based matching (Fig. 9).
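A minimal sketch of textbook RSA decryption, m = c^d mod n, is shown below with a small toy key; it only illustrates the decryption step named in the algorithm, not the authors' full FEP-RSA-MM pipeline.

```python
# Textbook RSA decryption: each cipher block c is recovered as m = c^d mod n
# using the receiver's private key (d, n). The toy key below
# (n = 3233, e = 17, d = 2753) is purely illustrative.

def rsa_decrypt(cipher_blocks, d, n):
    return [pow(c, d, n) for c in cipher_blocks]

n, e, d = 3233, 17, 2753
message = [65, 66, 67]                        # plain-text blocks (must be < n)
cipher = [pow(m, e, n) for m in message]      # what the sender would transmit
print(rsa_decrypt(cipher, d, n))              # [65, 66, 67]
```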
5.5 Correlation-Based Matching
In the FEP-RSA-MM, the correlation-based matching component is the most significant procedure, and it is processed on two sets of input feature values: the database feature values and the input image feature values. A matching technique predicts the decision about a person based on the correlation among the features. The image-based biometric verification (e.g., face and finger) is then obtained by this correlation-based matching. The trained image fused feature vector and the input matching image fused feature vector are represented as f(x, y) and g(x, y) at position (x, y), respectively, with the trained image and the input matching image having different sizes. Multimodal biometric systems play a significant role in recognition frameworks. Here, biometric components such as the face and finger are considered while performing the verification procedure. Nowadays, hackers steal a person's biometric traits for accessing the user's data. In view of this issue, the biometric features are fused by the FLA so as to enhance the security of the user's data. Moreover, RSA cryptography is utilized for encrypting the fused vector to improve the strength of this framework [25, 26] (Table 1).
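A hedged sketch of correlation-based matching between a stored fused feature vector and an input fused feature vector is given below; the normalized correlation coefficient and the acceptance threshold of 0.9 are assumptions for illustration.

```python
import numpy as np

# Sketch of correlation-based matching: both fused feature vectors are
# standardized and their normalized cross-correlation (Pearson) coefficient
# is computed; a score close to 1 is taken as a match.

def correlation_match(trained_vec, input_vec, threshold=0.9):
    t = (trained_vec - trained_vec.mean()) / (trained_vec.std() + 1e-12)
    g = (input_vec - input_vec.mean()) / (input_vec.std() + 1e-12)
    score = float(np.mean(t * g))     # normalized correlation coefficient
    return score, score >= threshold

trained = np.array([0.2, 0.8, 0.5, 0.9, 0.1])
probe = np.array([0.25, 0.78, 0.52, 0.88, 0.12])
print(correlation_match(trained, probe))   # high score, accepted
```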
Table 1 Risk analysis
Risk type | Risk intensity | Chances of event | Solution strategy
Compile error | Low | Unlikely | Ensure that the code is correct and does not have errors
Error in input of data from brainsense | High | Somewhat likely | Make sure that the device is calibrated properly and that it has been worn properly on the head
Problem in data transfer from brainsense to processing computer | High | Unlikely | Ensure that the wireless connection made via Bluetooth is strong and it does not get disturbed
Change of facial structure | Medium | Unlikely | In cases when the person's face undergoes a rapid and drastic change, the system can adapt by requiring the longer full password
Privacy issues | Low | Unlikely | Make sure that the data gathered from the sensors is stored using state-of-the-art hashing techniques in secure clouds
Security access | Medium | Unlikely | Use a random order of numbers to be played in the earpiece for authentication using the brain–computer interface
Processing issues | Medium | Somewhat likely | Ensure that the algorithms and programs behind the dual-layer security and access system have enough data to reach the correct results
Run time errors | High | Unlikely | Check that all the data is accessible and the code does not contain any compilation errors, while also ensuring that the data transfer and data gathering stages do not encounter any glitches
Technology obsoletion | Low | Unlikely | Use the latest hardware of a modular nature so that it can be updated in the future without changing the whole setup. Give the system the ability to receive remote software updates to enhance its capabilities
6 Result
The results of the dual-layer security system are successful. We can see that the facial recognition and the EEG sensor work along with the RSA encryption to provide a superlative experience to the user. We get enhanced security compared to other existing security systems, and we get even better results than we would have gotten if we had just used a single type of security apparatus. We find that authentication using brainsense is successful the vast majority of the time, and failure is associated only with either hardware problems or incorrect parsing of data.
6.1 Facial Recognition
To evaluate the impact of a higher number of learning samples on the performance of the nulled and ranged spaces, we evaluate the detection accuracies for various numbers of learning sample images. The accuracies are evaluated for numbers of learning samples per level ranging from two to nine. From Table 2, it can be said that the performance of the nulled area lessens with more and more samples and provides only a fifty percent correctness rate when the highest number of learning test pieces is taken. This shows that there is a negative effect on the nulled area when the number of learning samples
Table 2 Sample of nulled area and ranged area
No. of learning samples | Nulled area | Ranged area
Two | 96.11 | 66.12
Three | 92.34 | 75.43
Four | 88.75 | 82.89
Five | 83.12 | 78.22
Six | 76.23 | 67.21
Seven | 76.34 | 64.22
Eight | 61.49 | 59.21
Nine | 50.00 | 56.78
Table 3 Sample division for facial database
Sample set | PID First trial | PID Second trial | ORB First trial | ORB Second trial
Train | 5 | 5 | 2 | 2
Authentication | 5 | 5 | 3 | 3
Evaluation | 70 | 70 | 5 | 5

Table 4 Sample division for finger database
Sample set | XKa First trial | XKa Second trial | XKb First trial | XKb Second trial
Train | 3 | 2 | 2 | 1
Authentication | 1 | 2 | 2 | 3
Evaluation | 4 | 4 | 4 | 4
is higher. A small increment in the number of nulled area samples improves the performance of the nulled area, while in the ranged area, performance improves with a larger number of learning samples up to a limit, but the performance shows a decrement when the limit is crossed. From the table, it is clear enough that the nulled area performance lessens with a higher number of samples, while the ranged area performance increases up to a limit but then lessens again.
6.2 Combined Face Recognition, EEG, and RSA
We have amalgamated face, finger, and RSA at the decision level. We took all the samples from the stored database and found the distribution per subject according to various parameters like test, learning, and authentication. The distribution to each segment was done according to the sample distribution done individually for each segment. Tables 3 and 4, respectively, show the segment division for each parameter.
7 Conclusion and Future Scope
The results of the dual-layer security system are successful. We can see that the facial recognition and the EEG sensor work along with the RSA encryption to provide a superlative experience to the user. We get enhanced security compared to other existing security systems, and we get even better results than we would have gotten if we had just used a single type of security apparatus. We find that authentication using brainsense is successful the vast majority of the times and failure is only
associated with either hardware problems or incorrect parsing of data. The dual-layer security and access system has been created by using brainsense and implementing the algorithms in MATLAB and OpenCV. The other supporting hardware required is a Bluetooth-enabled computer, a computer-connected camera, input and output devices, and a network connection. The dual-layer security and access system is successful and is deployed on a system where it can be used to enhance security and safety. The dual-layer security and access system also provides help and assistance to the physically challenged to enable them to access services and unlock systems. The biggest potential benefit of the dual-layer security and access system can be in reducing the spread of coronavirus and in helping humanity to fight this disease. Coronavirus can be spread by touching the same surfaces in situations like ATM machines, on which the coronavirus can remain. If the dual-layer security and access system is used, then we can eliminate all problems of touching surfaces for entering codes in ATMs or buildings. This will be a helpful boost in humanity's fight against the menace of coronavirus. The dual-layer security and access system has huge benefits for user privacy by not storing details of biometrics that can be compromised. The dual-layer security and access system can also be used in the future for national security purposes by enabling a centralized and highly secure database to access and process the facial data recorded by the cameras and to use this data for preventive and predictive purposes. The dual-layer security and access system can be used for both forward and backward reasoning. The dual-layer security and access system can be used for interacting with computers by simply thinking of what we want to do and the computer understanding it. The modular nature of the dual-layer security and access system allows constant upgradation without changing the whole system and also allows remote software updates to enhance capabilities. In the future, the dual-layer security and access system will enable us to live in the world of science fiction.
References 1. Chang D, Jati A, Mishra S, Sanadhya SK (2015) Rig: A simple, secure and flexible design for password hashing. Lect Notes Comput Sci (Including Subser Lect Notes Artif Intell Lect Notes Bioinformatics) 8957:361–381. https://doi.org/10.1007/978-3-319-16745-9_20 2. Woodward JD. Rand Rp1194 using biometrics to achieve identity dominance in the global war on terrorism 3. Halderman JA, Waters B, Felten EW (2005) A convenient method for securely managing passwords, p 471. https://doi.org/10.1145/1060745.1060815 4. Guger C, Allison B, Leuthardt EC (2014) Brain-computer interface research-summary-2. Biosyst Biorobotics. https://doi.org/10.1007/978-3-642-54707-2 5. Cutillo A, Molva R, Strufe T, Antipolis S. Security and privacy issues in OSNs 6. Models HM, Filters C, Rentzeperis E (2003) A comparative analysis of face recognition algorithms. 3:201–204 7. Zhang C, Florencio D, Ba DE, Zhang Z (2008) Maximum likelihood sound source localization and beam forming for directional microphone arrays in distributed meetings. IEEE Trans Multimedia
8. Adkins LD (2007) Biometrics: weighing convenience and national security against your privacy. Michigan Telecommun Technol Law Rev 13:541–555 9. Duc NM, Minh BQ, Vulnerability S (2009) Your face is NOT your password Face Authentication By Passing Lenovo–Asus–Toshiba. Black Hat Briefings 1–16 10. Rajeswari S, Kalaiselvi R (2018) Survey of data and storage security in cloud computing. In: IEEE international conference on circuits and systems ICCS 2017, pp 76–81. https://doi.org/ 10.1109/ICCS1.2017.8325966 11. Soni YS, Somani SB, Shete VV (2017) Biometric user authentication using brain waves. In: Proceedings of the international conference on inventive computation technologies, ICICT 2016 12. Lécuyer A, Lotte F, Reilly RB et al (2008) Brain-computer interfaces, virtual reality, and videogames. Computer (Long Beach Calif) 41:66–72. https://doi.org/10.1109/MC.2008.410 13. Woodward JD, Horn C, Gatune J, Thomas A (2002) Documented briefing 14. Alwasiti HH, Aris I, Jantan A (2010) Brain computer interface design and applications: challenges and future. Appl Sci 11:819–825 15. Dhillon HS, Singla R, Rekhi NS, Jha R (2009) EOG and EMG based virtual keyboard: A braincomputer interface. In: Proceedings of 2009 2nd IEEE international conference on computer science and information technology ICCSIT 2009, pp 259–262. https://doi.org/10.1109/ICC SIT.2009.5234951 16. Verity R, Okell LC, Dorigatti I et al (2020) Estimates of the severity of coronavirus disease 2019: a model-based analysis. Lancet Infect Dis 20:669–677. https://doi.org/10.1016/S14733099(20)30243-7 17. Kakushadze Z, Liew JK-S (2020) Coronavirus: case for digital money? SSRN Electron J 1–12. https://doi.org/10.2139/ssrn.3554496 18. Casanova LM, Jeon S, Rutala WA et al (2010) Effects of air temperature and relative humidity on coronavirus survival on surfaces. Appl Environ Microbiol 76:2712–2717. https://doi.org/ 10.1128/AEM.02291-09 19. Thorpe J, Van Oorschot PC, Somayaji A (2006) Pass-thoughts: Authenticating with our minds. Proc New Secur Paradig Work 2006:45–56 20. Feng S, Shen C, Xia N et al (2020) Rational use of face masks in the COVID-19 pandemic. Lancet Respir Med 8:434–436. https://doi.org/10.1016/S2213-2600(20)30134-X 21. Cheng KK, Lam TH, Leung CC (2020) Wearing face masks in the community during the COVID-19 pandemic: altruism and solidarity. Lancet 2019:2019–2020. https://doi.org/10. 1016/S0140-6736(20)30918-1 22. Lopez-Gordo MA, Ron-Angevin R, Pelay F (2015) Authentication of brain-computer interface users in network applications. In: Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics) 23. Choudhury T, Kumar V, Nigam D, Mandal B (2016) Intelligent Classification of lung & oral cancer through diverse data mining algorithms. In: Proceedings - 2016 international conference on micro-electronics and telecommunication engineering, ICMETE 2016. https://doi.org/10. 1109/ICMETE.2016.24 24. Kumar P, Choudhury T, Rawat S, Jayaraman S (2016) Analysis of various machine learning algorithms for enhanced opinion mining using twitter data streams. In: International Conference on Micro-Electronics and Telecommunication Engineering (ICMETE) 2016, pp 265–270 25. Tomar R, Prateek M, Sastry HG (2017) A novel approach to multicast in VANET Using MQTT. Ada User Journal 38(4):231–235 26. Tomar R, Tiwari R (2019) Information delivery system for early forest fire detection using Internet of Things. In: International conference on advances in computing and data sciences. 
Springer, Singapore, pp 477–86.
Chapter 28
Parameter Estimation of Software Reliability Using Soft Computing Techniques Sona Malhotra, Sanjeev Dhawan, and Narender
1 Introduction
Soft computing methods, in contrast to conventional computing methods, deal with complex real-life problems and give promising results. Soft computing is highly tolerant of ambiguity, vagueness, partial truths, and best guesses, and it deals with approximate models. Soft computing is an emerging assortment of methodologies which aim to exploit tolerance for imprecision, uncertainty, and partial truth to achieve robustness and low solution cost [1]. Soft computing techniques have been invaluable in numerous applications and have outperformed various statistical and intelligent methods [2]. Rather than analytical strategies, soft computing systems imitate consciousness and cognition in several significant respects: they can learn from experience; they can generalize into areas where direct experience is missing; and they can perform mappings from inputs to outputs faster than inherently sequential analytical representations. The motivation for such an augmentation is the expected decrease in computational burden and the subsequent speed-up that permits a more robust system. This leads to a number of applications in many areas. Soft
S. Malhotra · S. Dhawan · Narender (B) UIET, Kurukshetra University, Kurukshetra, Haryana, India e-mail: [email protected] S. Malhotra e-mail: [email protected] S. Dhawan e-mail: [email protected] Narender Manav Rachna University, Faridabad, Haryana, India © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_29
Fig. 1 Soft computing techniques
computing has been broadly categorized into machine learning, fuzzy logic, evolutionary computation, meta-heuristic search algorithms, Bayesian networks, etc. [3]. The components of soft computing [4] are shown in Fig. 1. This paper covers an assessment of the different soft computing techniques. Section 2 covers parameter estimation using conventional methods. Comparison criteria for parameter estimation techniques are covered in Sect. 3. Section 4 covers the parameter estimation techniques using soft computing techniques; machine learning, fuzzy logic, evolutionary computation, and Bayesian networks are discussed in that section. Section 5 covers the recent work and development in the area of soft computing applications. The conclusion is discussed in Sect. 6.
2 Parameter Estimation Using Traditional Method
Maximum likelihood estimation and least square estimation methods have been used consistently to approximate the parameters of software reliability models for the last forty years. These methods gave promising results when applied to linear data but suffered a lot when applied to nonlinear complex problems.
2.1 Maximum Likelihood Estimation (MLE)
The MLE method has been used largely for estimating the parameters of software reliability for statistical models. It is a method of estimating the parameters by choosing the mean and variance that make the observed result most probable. It gives the desired results when applied to large samples of data, as it has a number of statistical characteristics favouring optimality. Along with this, it has a number of disadvantages:
• It gives optimal results when applied to huge samples of data but requires a set of complex equations to be solved numerically [5].
• It is heavily biased when applied to small sample sizes having a small number of failures [5].
• It usually requires some specialized software to solve the complex problems [6].
2.2 Least Square Estimation (LSE)
LSE minimizes the sum of squared differences 'J' between the actual values and the predicted values. LSE can be easily implemented due to its adaptive compatibility with other methods [5], but MLE confers better results as compared to LSE for finding the best fit and predictive validity [7]. 'J' can be expressed by the equation

Min J = Σ_{t=0}^{T} [m(t) − μ(t)]²    (1)
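As an illustration of minimizing J in Eq. (1), the sketch below fits an assumed Goel-Okumoto mean value function μ(t) = a(1 − e^(−bt)) to synthetic cumulative failure data with SciPy's least-squares curve fitting; both the model choice and the data are assumptions for illustration, not taken from the chapter.

```python
import numpy as np
from scipy.optimize import curve_fit

# Least-squares estimation of SRGM parameters: (a, b) of an assumed
# Goel-Okumoto mean value function are chosen to minimize the residual
# sum of squares J of Eq. (1). The failure data below are synthetic.

def mu(t, a, b):
    return a * (1.0 - np.exp(-b * t))

t = np.arange(1, 11, dtype=float)                     # test weeks
m = np.array([8, 15, 21, 26, 30, 33, 36, 38, 39, 40],
             dtype=float)                             # observed cumulative failures

(a_hat, b_hat), _ = curve_fit(mu, t, m, p0=[50.0, 0.1])
J = np.sum((m - mu(t, a_hat, b_hat)) ** 2)            # value of Eq. (1) at the fit
print(a_hat, b_hat, J)
```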
3 Comparison Criteria for Parameter Estimation Techniques
There may be a number of criteria for the comparison of parameter estimation techniques. This paper presents three criteria on which the comparison is based:
1. Parameters considered: the components considered by a parameter estimation method, for example, population size, membership function, expected output, etc.
2. Simplicity: a parameter estimation technique should be easy to understand; even a person without a mathematical background should be able to apply it to problems.
3. Goodness of fit: a measure or function that can minimize the average error or maximize the reliability prediction.
4 Parameter Estimation Using Soft Computing Techniques
4.1 Machine Learning
Machine learning provides the ability to treat data and algorithms as a single entity to keep the data accurate and complete. Machine learning can be easily merged with other methods, and there are many examples where machine learning has been integrated with optimization problems.
4.1.1 Neural Networks (NN)
Neural networks [8] mimic the behavior of biological neural networks. It is assumed that W_ij is the weighted connection between neurons i and j, and in_j is the value of the input variable for input layer neuron j. The input nodes are connected to the output nodes through the hidden layers of the network. The neural network repeatedly adjusts the various weights until the variation between the estimated output and the actual output of the network is minimized.
Parameters Considered The summed input Net_j, the output Out_j at neuron j, and the minimization over the vector E(W) of connection weights W can be given, respectively, by

Net_j = in_j, if j is an input layer neuron; Net_j = Σ_{i=1}^{N} W_{ij} Out_i − θ_j, otherwise    (2)

Out_j = Net_j, if j is an input layer neuron; Out_j = 1 / (1 + e^(−Net_j / T)), otherwise    (3)

E(W) = Σ_{p=1}^{N} Σ_{i=1}^{M} (d_pi − Out_pi)²    (4)

where
Out_i: output of the neuron i
θ_j: threshold of neuron j with n input neurons
T: gain adjustment of the function
d_pi: desired output (ith layer)
Out_pi: actual output (ith layer).
Simplicity Application is a little difficult because a lot of equations need to be understood; however, the assumptions on the data are nominal.
Goodness of Fit E(W ) can be represented by goodness of fit.
4.1.2 Support Vector Machine (SVM)
SVM [9] is suited for extreme data points (a kind of support vectors). SVM maps the input data to a higher dimensional space. The inputs can be considered as {[u1, v1], [u2, v2], …, [un, vn]} ⊂ R^n × R. Here u_i and v_i are the input vector and the associated target, respectively, for dimension n.
Parameters Considered The linear regression function f(u), Vapnik's ε-insensitive loss function L_ε(v, f(u)), and the cost function R(w) can be given, respectively, by

f(u) = (w · u) + b, w ∈ R^n, b ∈ R    (5)

L_ε(v, f(u)) = 0, if |v − f(u)| ≤ ε; |v − f(u)| − ε, if |v − f(u)| > ε    (6)

R(w) = (1/2)‖w‖² + C Σ_{i=1}^{n} L_ε(v_i, f(u_i))    (7)
Simplicity Originally was a classification function, but has been extended to regression with the help of loss function. Goodness of Fit R(w) is represented as goodness of fit.
4.2 Fuzzy Logic (FL)
In FL [10], the working memory takes the rules from the fuzzy rule base and produces fuzzy outputs for the defuzzification interface. Finally, the last element of the system produces a crisp output.
Parameters Considered A fuzzy set 'A' and a fuzzy time series are represented, respectively, as
A = f_A(x1)/x1 + f_A(x2)/x2 + ··· + f_A(xn)/xn    (8)

F(T) = F(T − 1) o R(T, T − 1)    (9)
where
x1, x2, …, xn ∈ X (universe of discourse)
f_A: membership function of fuzzy set A (f_A: X → [0, 1])
f_A(x_i): grade of membership
F(T − 1) and F(T): the current state and the next state, respectively; F(T − 1) is the cause of F(T) in such a way that F(T − 1) → F(T)
R(T, T − 1): fuzzy relation between F(T) and F(T − 1)
"o": min–max operator.
Simplicity The study involves approximation, fuzzification, and defuzzification, so a lot of calculation is involved. Goodness of Fit F(T) can be used as an optimization function.
5 Evolutionary Computation
5.1 Genetic Algorithm (GA)
Mutation and crossover operations are carried out between two parents to obtain new offspring [11]. The new offspring replace individuals in the population, and the process is repeated until the final solution is found.
Parameters Considered Population P_k ∈ G^n with a constant size 'm'. Natural selection of the fittest from P_k is based on f(x). P_nk is the new population for the remaining chromosomes. The relative fitness function and the number of copies of each chromosome are given, respectively, by

g(x) = f(x) / Σ_{x ∈ P_k} f(x)    (10)

e(x) = m · g(x), e(x) < m    (11)
Finally, P_nk is replaced by P_{k+1}, which is the final solution, by operating on P_nk. Simplicity GA is simple to apply but very time consuming because of the number of equations. Goodness of Fit g(x) has been treated as the goodness of fit.
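Eqs. (10)–(11) can be illustrated with the short roulette-wheel selection sketch below; the fitness function and the population are toy values chosen only for illustration.

```python
import random

# Fitness-proportional (roulette-wheel) selection: relative fitness g(x) as in
# Eq. (10) and the expected number of copies e(x) = m * g(x) as in Eq. (11).

def select_next_population(population, fitness, m):
    total = sum(fitness(x) for x in population)
    g = [fitness(x) / total for x in population]          # Eq. (10)
    expected_copies = [m * gx for gx in g]                # Eq. (11)
    # sample m parents with probability proportional to relative fitness
    parents = random.choices(population, weights=g, k=m)
    return parents, expected_copies

fitness = lambda x: x * x          # toy fitness function
pop = [1, 2, 3, 4, 5]
parents, copies = select_next_population(pop, fitness, m=len(pop))
print(parents, [round(c, 2) for c in copies])
```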
5.1.1 Differential Evolution (DE)
The DE [12] algorithm takes 'N' candidate solutions as the population and then mutates the population to obtain the degree of perturbation. Crossover is applied on the mutant vector and the perturbed individual for generating the new population. The new population is then selected and compared with its counterpart in the current population.
Parameters Considered The current population is characterized as X_{i,G} = (x1, x2, …, xn) of 'N' candidate solutions. The application of mutation, crossover, and selection is given by the functions V_{i,G+1}, U_{j,i,G+1}, and X_{i,G+1}, respectively:

V_{i,G+1} = X_{r3,G} + F (X_{r1,G} − X_{r2,G})    (12)

U_{j,i,G+1} = V_{j,i,G+1}, if rand_j < crossover factor; X_{j,i,G}, if rand_j ≥ crossover factor    (13)

X_{i,G+1} = U_{i,G+1}, if f(U_{i,G+1}) < f(X_{i,G}); X_{i,G}, if f(U_{i,G+1}) ≥ f(X_{i,G})    (14)

where X_{ri,G} (i = 1, 2, 3) are candidate solutions drawn from the 'N' and F is a user-specified value. Each candidate solution is then compared with its counterpart in the current population.
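One DE generation implementing Eqs. (12)–(14) can be sketched as follows; F, the crossover factor, and the sphere objective are assumed illustrative choices.

```python
import numpy as np

# One DE generation: rand/1 mutation (Eq. 12), binomial crossover (Eq. 13),
# and greedy selection (Eq. 14) on a toy minimization problem.

def de_step(pop, f_obj, F=0.8, CR=0.9, rng=np.random.default_rng(0)):
    N, dim = pop.shape
    new_pop = pop.copy()
    for i in range(N):
        r1, r2, r3 = rng.choice([j for j in range(N) if j != i], 3, replace=False)
        v = pop[r3] + F * (pop[r1] - pop[r2])             # mutation, Eq. (12)
        cross = rng.random(dim) < CR
        u = np.where(cross, v, pop[i])                    # crossover, Eq. (13)
        if f_obj(u) < f_obj(pop[i]):                      # selection, Eq. (14)
            new_pop[i] = u
    return new_pop

sphere = lambda x: float(np.sum(x ** 2))
pop = np.random.default_rng(1).uniform(-5, 5, size=(10, 3))
print(sphere(de_step(pop, sphere)[0]) <= sphere(pop[0]))  # True: never worse
```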
5.1.2 Ant Colony Optimization (ACO)
ACO [13] is motivated by the collective activities of ants in search of food. An ant has the property of releasing pheromone throughout the path it covers. In the whole process, the path with the highest accumulated pheromone becomes the optimal path.
Parameters Considered The amount of pheromone deposited for the transition of the kth ant from x to y, the amount of pheromone deposited by 'm' ants without evaporation and with evaporation, and the probability of the kth ant not moving to some path other than x to y are given, respectively, by

τ_xy = Σ_{k=1}^{m} Δτ_xy^k (without evaporation)    (15)

τ_xy = (1 − ρ) τ_xy + Σ_{k=1}^{m} Δτ_xy^k (with evaporation)    (16)

P_xy^k = (τ_xy)^α (η_xy)^β / Σ_{z ∈ allowed_x} (τ_xz)^α (η_xz)^β    (17)

where
ρ: the evaporation coefficient
η_xz: heuristic value (path visibility)
τ_xz, z ∈ allowed_x: trail level for the other possibilities
α: the influence of pheromone, and β is the influence of the heuristic.
Simplicity ACO is easy to understand and implement and can also be used with other methods. Goodness of Fit 'τ' has been treated as a function of goodness of fit.
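The pheromone update and transition probability of Eqs. (16)–(17) can be sketched as below; ρ, α, β, the deposits, and the heuristic matrix are assumed illustrative values.

```python
import numpy as np

# Pheromone update with evaporation (Eq. 16) and the transition probabilities
# of an ant at node x over its allowed next nodes (Eq. 17).

def update_pheromone(tau, deposits, rho=0.5):
    # deposits is the summed pheromone laid by the m ants on each edge (x, y)
    return (1.0 - rho) * tau + deposits                   # Eq. (16)

def transition_probs(tau_x, eta_x, allowed, alpha=1.0, beta=2.0):
    weights = (tau_x[allowed] ** alpha) * (eta_x[allowed] ** beta)
    return weights / weights.sum()                        # Eq. (17)

tau = np.ones((4, 4))
deposits = np.array([[0, .2, 0, 0], [0, 0, .5, 0], [0, 0, 0, .3], [.1, 0, 0, 0]])
tau = update_pheromone(tau, deposits)
dist = np.array([[1, 2, 4, 7], [2, 1, 3, 5], [4, 3, 1, 2], [7, 5, 2, 1]])
eta = 1.0 / dist                                          # visibility heuristic
print(transition_probs(tau[0], eta[0], allowed=np.array([1, 2, 3])))
```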
5.1.3 Particle Swarm Optimization (PSO)
In PSO ‘n’ Particles travel into a chasing space to locate the worldwide finest. They [14] renew their situation with their past finest and fellow’s present finest. Parameter Considered Population representation of particles, Velocity update of individual, position update of particles are shown respectively by
X = L + rand × (U − L)
(18)
vi (t + 1) = wvi (t) + c1r1 x pi (t) − xi (t) + c2 r2 x gi (t) − xi (t)
(19)
xi = xi + vi
(20)
Here c1 , c2 , r 1 , r 2 are the random values. Simplicity It can be used with other methods. Goodness of Fit ‘x gi ’ has been treated as a global best solution or function of goodness of fit.
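Eqs. (18)–(20) translate into the short PSO loop sketched below; the coefficient values w, c1, c2 and the toy objective are assumptions for illustration.

```python
import numpy as np

# Basic PSO: random initialization in [L, U] (Eq. 18), velocity update (Eq. 19),
# and position update (Eq. 20) on a toy sphere minimization problem.

rng = np.random.default_rng(0)
L, U, n, dim = -5.0, 5.0, 8, 2
x = L + rng.random((n, dim)) * (U - L)          # Eq. (18)
v = np.zeros((n, dim))
f = lambda p: np.sum(p ** 2, axis=1)            # toy objective (minimization)

p_best = x.copy()                               # personal best positions
g_best = x[np.argmin(f(x))].copy()              # global best position

w, c1, c2 = 0.7, 1.5, 1.5
for _ in range(20):
    r1, r2 = rng.random((n, dim)), rng.random((n, dim))
    v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)   # Eq. (19)
    x = x + v                                                      # Eq. (20)
    improved = f(x) < f(p_best)
    p_best[improved] = x[improved]
    g_best = p_best[np.argmin(f(p_best))].copy()

print(g_best, float(np.min(f(p_best))))
```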
5.1.4 Cuckoo Search (CS)
The cuckoo [15] uses the nests of other birds for laying its eggs. The cuckoo has the characteristic of imitating the pattern, size, and color of the host bird's eggs. The cuckoo replaces the host bird's eggs with her own to increase the hatching likelihood of her eggs.
Parameters Considered The finest nests with good-quality eggs are the solutions for the next generation. The new solution and the random walk of the cuckoos with a random step size drawn from a Lévy distribution are given, respectively, by

X_i^(t+1) = X_i^(t) + α ⊕ Levy(λ), α > 0    (21)

Levy ∼ u = t^(−λ), 1 < λ ≤ 3    (22)

where α is the step size (α = 1 in most cases) and ⊕ denotes entry-wise multiplication. Simplicity A little complex due to the long tail of the distribution. Goodness of Fit It can be represented by Levy ∼ u.
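A sketch of the Lévy-flight update of Eq. (21) is given below. Step lengths are drawn with Mantegna's algorithm, a common way of approximating Levy(λ) steps; using it here is an assumption, since the chapter does not specify how the steps are generated.

```python
import numpy as np
from math import gamma, sin, pi

# Levy-flight move of a cuckoo, X(t+1) = X(t) + alpha (entry-wise) Levy(lambda).
# Mantegna's algorithm is used to draw heavy-tailed step lengths.

def levy_step(dim, lam=1.5, rng=np.random.default_rng(0)):
    sigma = (gamma(1 + lam) * sin(pi * lam / 2) /
             (gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.normal(0, sigma, dim)
    v = rng.normal(0, 1, dim)
    return u / np.abs(v) ** (1 / lam)

def new_solution(x_current, alpha=1.0, rng=np.random.default_rng(0)):
    step = levy_step(x_current.size, rng=rng)
    return x_current + alpha * step            # Eq. (21), entry-wise product

x = np.array([1.0, 2.0, 3.0])
print(new_solution(x))
```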
5.1.5 Weed Optimization (WO)
WO [16] was inspired by the colonization behavior of invasive weeds, that is, how they colonize and find better places for their growth and reproduction.
Parameters Considered Maximum number of iterations, population, and number of seeds. The equation for reproduction and the standard deviation of the normal distribution over which the seeds are spread are given, respectively, by

N(W_i) = [S_max (max_fit − fit(w_i)) + S_min (fit(w_i) − min_fit)] / (max_fit − min_fit)    (23)

σ_t = ((T − t)/T)^n (σ_initial − σ_final) + σ_final    (24)

where T is the maximum iteration, n is the nonlinear modulation index, and S_max and S_min give the linear decrement of grains.
Simplicity Widely acceptable because of its simplicity. Goodness of Fit σ t is treated as goodness of fit that tells production of more fit plants and elimination of unfit plants.
5.2 Bayesian Networks (BN)
The foundation for Bayesian networks (BN) is probability theory. Bayes' theorem as a probabilistic rule is specified by

P(A/B, h) = (P(B/A, h) / P(B/h)) ∗ P(A/h), for P(B/h) > 0    (25)

where P(A/h) is the probability of A when B is not known, and P(A/B, h) is the probability of A when B is known.
Parameters Considered The generalized sum rule Σ_B P(A/B, h) and the rule of expansion for P(A/h) can be given, respectively, by the equations

Σ_B P(A/B, h) = P(A/h)    (26)

P(A/h) = Σ_B P(A/B, h) P(B/h)    (27)

Simplicity It is easy to apply. Goodness of Fit It is signified by P(A/h).
6 Recent Work and Development
In the last forty years, several reliability models have been proposed using the soft computing techniques shown in Fig. 2. All of them showed better results compared to existing ones. There are some methods which give better results when merged with other methods. Some recent models and their work and development are discussed here. Noekhah et al. [17] proposed a hybrid approach for software reliability using a multilayer perceptron NN and the imperialist competitive algorithm. The authors showed that the approach is a solution to many previous problems and convergence problems. Arunima et al. [18] applied a machine learning technique with an adaptive neuro-fuzzy inference system (ANFIS) and compared the same application with multilayer
Fig. 2 Hybrid approach using soft computing
perceptron, back propagation and feed forward NNs, support vector machine, and regression NN. Tyagi et al. [19] also compared the ANFIS with a simple fuzzy inference system. Manjula and Florence [20] developed a hybrid approach using a deep NN and GA. The authors incorporated a new version of GA that gives a good representation of the chosen software features along with an adaptive auto-encoder deep NN. Ozyurt et al. [21] proposed a hybrid approach using a convolutional NN and a perceptual hash function that can find the fair features of an image. Results showed good classification and maintenance of liver images up to a threshold. Dash [22] presented a study of NNs trained with evolutionary algorithms for intrusion detection. The approach was compared with some other conventional methods and presented better accuracy. Zhang and Wong [23] developed an enhanced ACO with a number of modifications for the job shop setting. The authors showed that the modified algorithm outperformed many meta-heuristic methods. Ahmad et al. [24] presented a hybrid approach of GA and ACO for generating test cases based on precedence. Solanki et al. [25] proposed an approach based on modified ACO for test case prioritization. The authors did this by altering the natural behavior of real ants. Kaur and Dhiman [26] proposed a hybrid meta-heuristic approach using PSO and the spotted hyena optimizer (SHO). The authors compared this hybrid approach with four meta-heuristic methods (PSO, SHO, GA, DE) and showed that the proposed approach gave promising results compared to the other meta-heuristic approaches. Behrouz et al. [27] proposed an approach combining ANN and PSO for predicting the safety factor of a uniform slope. The authors performed a number of sensitivity analyses for the modeling procedure and predicted the safety factor with great performance compared to ANN. Rijwan et al. [28] projected a hybrid approach using GA and CS for minimizing the cost and time of the testing process. The authors showed that the hybrid approach produces better results when compared to either CS or GA alone. Li et al. [29] presented an extension of CS based on individual knowledge learning (IKLCS). The authors used the individual's historical knowledge for the optimization and compared it with other evolutionary algorithms. Results demonstrated IKLCS as a new evolutionary algorithm. Wu et al. [30] projected a hybrid system to overcome the deficiency of CS by using fuzzy reasoning. The authors improved the algorithm in terms of acceleration of the convergence speed and optimization of the local and global optimum. Mellouk et al. [31] used a hybrid CS and GA algorithm and
demonstrated improvements such as execution time reduction, better exploration of the search space and diversification, and choosing a near-optimal solution within a shorter time period. History shows a huge application of WO in the areas of mechanical engineering, electronics, and agriculture. Yue and Zhang [32] proposed an enhanced hybrid bat algorithm with weed optimization. The authors enhanced the local search ability and demonstrated the application in image segmentation. Malekmohammadi and Moghadam [33] applied Bayesian networks (BN) in a hierarchical structure for risk assessment. The authors demonstrated reasonable and acceptable risk assessment in producing a suitable solution. Tosun et al. [34] presented a literature review on the application of BN to predict the quality of software. The authors presented a framework for the replication and extension of BN studies. Chatterjee and Bappa [35] designed an algorithm, based on BN, for forecasting the errors in the starting phase of software design. The authors proposed a framework that can provide software personnel with the required information on software metrics.
7 Conclusion
A number of conventional methods have been used for estimating the parameters in the last few decades. Most of them fall short in estimating the parameters accurately, yet they still attract researchers. The reason is that, today, a single estimation technique is not capable of estimating the parameters on its own. There are a number of reasons for this: for example, a number of methods are applicable to small samples of data, some are applicable to large samples, some are based on assumptions, and some of them are prone to premature convergence and poor exploration. That is why hybrid approaches are used nowadays. One approach is merged with another so that it can give accurate results for all kinds of problems. The literature shows that, most of the time, an individual parameter estimation method cannot estimate the parameters accurately, so there is a need for a hybrid approach to estimate the parameters accurately. The paper covers an analysis of various soft computing techniques with different parameter considerations so that future aspects can be emphasized.
Table 1 Assessment of soft computing techniques
Technique | Author (year introduced) | Parameters considered (Eq. no.) | Pros | Cons
NN | McCulloch and Pitts [8] | (2)–(4) | Accurate and consistent behavior in reliability | Convergence problem, local minima, lack of analysis due to weak mathematical foundation [8]
SVM | Vapnik [9] | (5)–(7) | Supports several data mining processes, e.g., processing, regression, classification | Several key parameters need to be set correctly [36]
FL | Zadeh [10] | (8), (9) | Predicts the reliability more accurately [18] | Works on very limited features of data collection [37]
GA | Holland [11] | (10), (11) | Easy to understand and implement [12] | Time consuming [38], individual fitness not declared
DE | Storn and Price [12] | (12)–(14) | Ability to control the behavior of mutation through variable distribution | Simplicity has been decreased in the extended version [12]
ACO | Dorigo [13] | (15)–(17) | Easy to understand and implement | Convergence problem, local minima [17]
PSO | Kennedy and Eberhart [14] | (18)–(20) | Wide range of fundamental application areas | Gets trapped in local maxima and often deteriorates when dimensions increase [25]
CS | Yang [15] | (21), (22) | Efficient randomization due to large tail steps, and fewer parameters need to be adjusted [9] | Premature convergence, poor balancing, low computation accuracy [29]
WO | Mehrabian and Lucas [16] | (23), (24) | Advantage in terms of premature convergence and exploration | Once it reaches the maximum weeds, new methods need to be investigated [16]
BN | Storn and Price [12] | (26), (27) | Ability to handle missing data | Sensitive to outliers, low model versatility [39]
References 1. A special issue dedicated to soft computing. www.journals.elsevier.com/ASC/ 2. Kiran NR, Ravi V (2008) Software reliability prediction using soft computing techniques. JSS. https://doi.org/10.1016/j.jss.2007.05.005 3. Kaswan KS, Choudhary S., Sharma K (2015) Software reliability modeling using soft computing techniques: critical review. IJITCS. https://doi.org/10.5815/ijitcs.2015.07.10 4. Wikipedia. www.en.wikipedia.org/wiki/Soft_computing. Last accessed on Feb 2020 5. Malik M, Garg G (2016) Parameter estimation in Software reliability. IJSR 5(7):632–637. ISSN (Online) 23197064 6. Laurent AG (2004) Conditional distribution of order statistics and distribution of the reduced in orders statistics of the exponential model. pp 652–657 7. Singh M, Bansal V (2016) Parameter estimation and validation testing procedures for software reliability growth model. IJSR 5(12):1675–1680. ISSN (Online) 23197064 8. McCulloch W, Pitts W (1943) A logical calculus of ideas immanent in nervous activity. Bltn Math Biophys 5(4):115–133 9. Boser BE, Guyon IM, Vapnik Vladimir N (1992) A training algorithm for optimal margin classifiers. In: AWCLT, COLT. CiteSeerX, p 144. ISBN 978–08979149702 10. Zadeh LA (1992) Fuzzy logic, neural networks and soft computing. One-page course announcement of CS 294–4. Spring 1993. University of California at Berkeley 11. Chakrabarty RC. Fundamental of Genetic algorithm. www.myreaders.info 12. Storn R, Price K (1997) Differential evolution—A simple and efficient heuristic for global optimization over continuous spaces. JGO. 11(4):341–359 13. Colorni A, Dorigo M, Maniezzo V (1991) Distributed optimization by ant colonies. In: ADLPC. Elsevier, Paris, France. pp 134–142 14. Kennedy J, Eberhart R. Particle swarm optimization. In: ICNN, vol IV, pp 1942–1948. https:// doi.org/10.1109/ICNN.1995.488968 15. Yang XS, Deb S (2010) Cuckoo search via Lévy flights. In: NaBIC 2009. IEEE Publications. pp 210–214 16. Mehrabian AR, Lucas C (2006) A novel numerical optimization algorithm inspired from weed colonization. Ecological Inf 1(4):355–366 17. Noekhah S, Salim NB, Zakaria NH (2017) Predicting software reliability with a novel neural network approach. In: ICRICT. Springer, Cham, pp 907–916 18. Jaiswal A, Malhotra R (2018) Software reliability prediction using machine learning techniques. IJSAEM 9(1):230–244 19. Tyagi K, Sharma A (2014) An adaptive neuro fuzzy model for estimating the reliability of component-based software systems. ACS 10(1–2):38–51 20. Manjula C, Florence L (2019) Deep neural network based hybrid approach for software defect prediction using software metrics. Cluster Comput 22(4):9847–9863 21. Ozyurt F, Tuncer T Avci E, Koc M, Serhatlioglu I (2019) A novel liver image classification method using perceptual hash-based convolutional neural network. AJSE 44(4):3173–3182 22. Dash T (2017) A study on intrusion detection using neural networks trained with evolutionary algorithms. Soft Comput 21(10):2687–2700 23. Zhang S, Wong TN (2018) Integrated process planning and scheduling: an enhanced ant colony optimization heuristic with parameter tuning. JIM 29(3):585–601 24. Ahmad FS, Singh DK, Suman P (2018) Prioritization for regression testing using ant colony optimization based on test factors. In: ICCD. Springer, Singapore pp 1353–1360 25. Solanki K, Singh Y, Dalal S, Srivastava PR (2016) Test case prioritization: an approach based on modified ant colony optimization. In: ERCICA. Springer, Singapore, pp 213–223 26. 
Dhiman G, Kaur A (2018) A hybrid algorithm based on particle swarm and spotted hyena optimizer for global optimization. AISC 816:599–615 27. Gordan B, Armaghani DJ, Hajihassani M, Monjezi M (2016) Prediction of seismic slope stability through combination of particle swarm optimization and neural network. Eng Comput 32(1):85–97
28. Khan R, Amjad M, Kumar A (2018) Optimization of automatic test case generation with cuckoo search and genetic algorithm approaches. In: ACCS, Singapore, pp. 413–423 29. Li J, Li YX, Zou J (2018) Cuckoo search algorithm based on individual knowledge learning. In: ICBICTA, Singapore, pp 446–456 30. Wu Z, Chunqi Du (2019) The parameter identification of PMSM based on improved Cuckoo algorithm. Neural Process Lett 50(3):2701–2715 31. Mellouk L, Aaroud A, Boulmalf M, Dine KZ, Benhaddou D (2019) Development and performance validation of new parallel hybrid cuckoo search–genetic algorithm. Energy Syst 1–23 32. Yue X, Zhang H (2019) Improved hybrid bat algorithm with invasive weed and its application in image segmentation. Arab J Sci Eng 44(11):9221–9234 33. Malekmohammadi B, Moghadam NT (2018) Application of Bayesian networks in a hierarchical structure for environmental risk assessment: a case study of the Gabric Dam, Iran. Environ Monit Assess. 190(5):279 34. Tosun A, Bener AB, Akbarinasaji S (2017) A systematic literature review on the applications of Bayesian networks to predict software quality. Softw Qual J 25(1):273–305 35. Chatterjee S, Bappa M (2018) A Bayesian belief network based model for predicting software faults in early phase of software development process. Appl Int 48(8):2214–2228 36. Li J, Li B (2014) Parameters selection for support vector machine based on particle swarm optimization. In: ICIC. Springer, Cham, pp 41–47 37. Chaudhary A, Tiwari VN, Kumar A (2014) Analysis of fuzzy logic based intrusion detection systems in mobile ad hoc networks. BVICA M’s IJIT 6(1):690 38. Li W (2004) A genetic algorithm approach to network intrusion detection. SANS Institute, United States 39. Bashar A, Parr G, McClean S, Scotney B, Nauck D (2014) Application of Bayesian networks for autonomic network management. JNSM 22(2):174–207
Chapter 29
An Ensemble Approach for Handling Class Imbalanced Disease Datasets Sayan Surya Shaw, Shameem Ahmed, Samir Malakar, and Ram Sarkar
1 Introduction
Disease is an abnormal condition of the body that can have many negative effects on the body. Among all diseases, chronic diseases are considered more dangerous as they are long lasting (more than three months) and cannot be cured by vaccines. Some of the chronic diseases are diabetes, heart disease, chronic kidney disease, cancer, thyroid disease, etc. Liver disorder is one of the most widespread diseases all over the world. Liver disorder can be caused by viruses, bacteria, or the consumption of alcohol, among other causes. In India, according to the statistics of the World Health Organization (WHO), in 2014 more than 21 million people (2.44% of all deaths) and in 2017 more than 25 million people (2.95% of total deaths) died due to liver disorder [23]. According to the National Cancer Institute (NCI), approximately 38.4% of people all over the world are affected by cancer, based on 2013–2015 data. The worldwide cancer rate is set to double by 2020, according to a report published by WHO. Thyroid disease is also common worldwide. In India, about 42 million people suffer from thyroid disease.
S. S. Shaw (B) Department of Computer Science and Engineering, University of Calcutta,Kolkata, India e-mail: [email protected] S. Ahmed · R. Sarkar Department of Computer Science and Engineering, Jadavpur University, Kolkata, India e-mail: [email protected] R. Sarkar e-mail: [email protected] S. Malakar Department of Computer Science, Asutosh College, Kolkata, India e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 M. Prateek et al. (eds.), Proceedings of International Conference on Machine Intelligence and Data Science Applications, Algorithms for Intelligent Systems, https://doi.org/10.1007/978-981-33-4087-9_30
In recent years, it has become feasible to obtain real-world data from hospitals because patients' information is digitally available. Machine learning approaches help to study large amounts of data to learn, generalize, and predict the correct results of diseases, and that helps doctors to detect the pattern of the disease [20]. With efficient prediction of the occurrence of a disease in a human body by a machine learning model at an early stage, and then with proper treatment, these types of chronic diseases can be cured before they go to the critical stage. A major issue while applying machine learning algorithms on medical data is the class imbalance problem. Class imbalance means there is a disproportionate ratio of observations in each class. In many cases, it is found that the imbalance is so extreme that the minority class contains only a few percent of the entire dataset. As a consequence of this, the learning ability of the classifier suffers. In such cases, the classifier tends to overlook the minority class entirely and classifies almost all the minority class samples as majority class samples. Nevertheless, the role of the minority class instances is truly significant for some domains. For example, consider cancerous and non-cancerous genes, where the number of instances of the non-cancerous class generally surpasses the number of cancerous instances, as only a few people (considering the entire population of the globe) suffer from this illness. In this case, if the learning algorithm cannot identify the cancerous genes (here the minority class), then this wrong prediction would be disastrous. To solve this class imbalance problem, researchers generally apply one of two resampling techniques: oversampling and undersampling. Oversampling is adding more training data to the minority class so that the classes become balanced. On the other hand, undersampling is the opposite of the oversampling process. It removes or merges the examples in the majority class and makes the classes balanced. Some of the undersampling techniques are random undersampling of the majority class, NearMiss, the Edited Nearest Neighbour (ENN) Rule, Condensed Nearest Neighbour (CNN), etc. In the recent past, many researchers have used the ensemble approach in various fields [1, 6, 12–15] due to its effectiveness in providing better classification performance. But the limitation of undersampling is that the examples that are removed from the majority class might be important, useful, or critical points for correct prediction. Due to this removal, underfitting may often occur, and the model may not work properly. Therefore, the choice of an effective undersampling technique plays an important role in classifying class imbalanced data. Furthermore, an undersampling technique may work for a particular dataset but might fail miserably for other datasets. Keeping the above facts in mind, in the present work, we have attempted an ensemble approach. In this work we have combined the outcomes of three different undersampling techniques so that the chance of removing important samples from the majority class becomes less, and a good balance between the majority and minority classes can be established as well. We have evaluated the proposed method on three publicly available standard disease datasets where the class imbalance problem is pertinent.
2 Literature Survey
Over the years, many researchers have developed different machine learning based algorithms which deal with classifying chronic disease data. In this section we discuss some recently proposed techniques which deal with three disease datasets: WDBC, new-thyroid, and BUPA. In 2009, the authors of the work [24] proposed undersampling to balance the imbalanced datasets on breast cancer. They used the bagging algorithm to construct an integrated decision tree model so that it can predict the result of the disease correctly. The work [7] introduced some sampling techniques like SMOTEBoost to balance the imbalanced datasets, along with performance measures more appropriate for mining imbalanced datasets. The work mentioned in [30] applied two classifiers, logistic regression and decision tree, with a combination of the Synthetic Minority Over-Sampling Technique (SMOTE), the Cost Sensitive Classifier (CSC) technique, undersampling, bagging, and boosting. They used this method on the datasets of cancer obtained from surveillance. However, SMOTE is not beneficial for high-dimensional data because most of the time it performs similarly to the uncorrected class-imbalanced classification and worse than cut-off adjustment or random undersampling [2]. The work proposed in [3] used oversampling and undersampling jointly with a cross-validation technique on disease datasets (diabetes, hepatitis, breast cancer). In their work, for experimental needs, they selected some data from the literature where poor performances were reported due to improper choice of cross-validation techniques. In 2013, the authors of the work [5] proposed an effective wrapper approach incorporated directly into a cost-sensitive neural network. They used particle swarm optimization (PSO) as the optimization function. The method was tested on the imbalanced datasets hepatitis, hypothyroid, abalone, etc. They claimed to have more effective results than the normal sampling technique. In 2015, the authors of the work [1] worked with an ensemble approach named AdaBoost incorporating linear SVM (ADASVM) on a cancer classification dataset. A density based undersampling technique called DBMUTE was used in the work mentioned in [4]. DBMUTE directly adapts the density reachable graph. Results are reported on the UCI health monitoring datasets (diabetes, Haberman's survival). In 2018, the authors in [16] proposed an efficient technique on class imbalanced datasets (EFTWSVM-CIL) that performed well on many real-world datasets as well as on disease datasets like pima, cancer, new-thyroid, etc. The work [18] adopted a distribution sensitive sample synthesis method according to the distance from the surrounding minority samples. They reported that this method especially improves the accuracy rate and recall rate of the minority class on real medical diagnostic datasets like breast-cancer and lung-cancer. The work proposed in [11] used different oversampling and undersampling algorithms on 15 different cancer datasets from the SEER program, and the performance improved in 90% of the cases after using balancing techniques. The authors of the work [8] proposed an energy efficient fog-assisted healthcare system for maintaining the blood glucose level to reduce the risk of diabetes. By using the fog-assistance technique, an emergency alert can be generated as a precaution.
The work in [10] presented a detailed usability study of an mHealth system for self-management of diabetes. Recently, many researchers have proposed various ensemble techniques in different fields. In [6], the authors proposed an ensemble of three filter-based feature selection methods, namely Mutual Information (MI), Chi-square and ANOVA F-test. In [13] a genetic algorithm (GA) based ensemble filter approach was used. The authors of [12] introduced a two-stage approach for protein secondary structure classification. The work in [29] proposed an overlap-based undersampling method: the negative class instances in the overlapping region are detected and removed, and high accuracy on the positive class was reported on datasets such as WDBC and thyroid. From the above discussion it can be observed that if a combination of undersampling methods is applied before any classifier is trained on an imbalanced dataset, the target classes become balanced and more reliable data are available for building a competent learning model. With this motivation, we attempt to develop such a combination of different undersampling methods so that prediction accuracy, especially for the minority class, can be improved.
3 Proposed Work

As mentioned earlier, in this work we form an ensemble of three undersampling methods in order to rectify the class imbalance problem of the disease datasets. By using three undersampling methods we try to ensure that the chance of losing important samples of the majority class becomes smaller. The three undersampling methods considered here to form the ensemble are briefly discussed below:

1. Condensed Nearest Neighbour (CNN): an undersampling technique that selects a subset of the sample collection such that there is no loss in model performance. It uses a 1-nearest-neighbour rule to decide iteratively whether a sample should be removed. It is a relatively slow procedure, so smaller datasets are preferred.
2. NearMiss (Version-1): selects examples from the majority class based on their distance to the minority class. With this method, clusters of majority class examples are formed around the minority class instances that overlap with them.
3. NearMiss (Version-3): used to avoid the issue of information loss. For every minority class example it selects the closest majority class examples, keeping only those majority class examples that lie on the decision boundary. In some cases it therefore selects fewer majority class samples than there are minority class samples.

With these three techniques we design our ensemble approach, which ensures less loss of important representatives of the majority class. We apply the three undersampling methods and retrieve, for each of them individually, the indices of the samples they select.
Fig. 1 The proposed framework for classification of imbalanced disease datasets
Let
• index1 = list of indices selected from the majority class using CNN
• index2 = list of indices selected from the majority class using NearMiss (version-1)
• index3 = list of indices selected from the majority class using NearMiss (version-3)
• indexfinal = index1 ∪ index2 ∪ index3.
Then we have retrieved the samples from indexfinal and got our final balanced dataset with important samples from majority class. At the end, we have used AdaBoost classifier to predict our model performance in classifying the disease samples. Figure 1 shows the proposed framework used to handle the class imbalance problem of the disease datasets.
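The union-of-indices step can be sketched in a few lines with the imbalanced-learn and scikit-learn libraries. This is only an illustrative sketch of the described procedure, not the authors' actual code: the use of each sampler's sample_indices_ attribute to recover the retained rows, and the default sampler settings, are assumptions.

```python
import numpy as np
from imblearn.under_sampling import CondensedNearestNeighbour, NearMiss
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

def ensemble_undersample(X, y):
    """Union of the row indices kept by CNN, NearMiss-1 and NearMiss-3."""
    samplers = [CondensedNearestNeighbour(random_state=0),
                NearMiss(version=1),
                NearMiss(version=3)]
    kept = set()
    for sampler in samplers:
        sampler.fit_resample(X, y)            # learns which rows to keep
        kept.update(sampler.sample_indices_)  # indices of the retained rows
    idx = np.array(sorted(kept))
    return X[idx], y[idx]

X, y = load_breast_cancer(return_X_y=True)    # WDBC as an example dataset
X_bal, y_bal = ensemble_undersample(X, y)
X_tr, X_te, y_tr, y_te = train_test_split(X_bal, y_bal, test_size=0.2, random_state=0)
clf = AdaBoostClassifier(random_state=0).fit(X_tr, y_tr)
print(clf.score(X_te, y_te))
```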
4 Results and Analysis In order to assess the performance of the proposed ensemble based undersampling method, we have considered three class imbalanced disease datasets [9] as discussed below:
Fig. 2 Imbalance representation of the three datasets considered here for experimentation: a WDBC, b new-thyroid, c BUPA. In (a) and (c) minority and majority class samples are represented using blue and orange colour respectively, whereas in (b) the colour code is reversed
1. Wisconsin Diagnostic Breast Cancer (WDBC): a real-life dataset from the UCI ML repository with 569 instances and 32 attributes. Patients with a malignant breast mass are labelled positive and those with a benign mass negative. The positive (1, malignant) class forms 37.26% of the data (minority class) and the negative (0, benign) class 62.74% (majority class), as shown in Fig. 2a.
2. New-thyroid: a real-life dataset with 215 instances and 6 attributes. Normal cases are negative, while the other two classes, taken together as positive, form the minority group. The percentages of the negative (0), positive1 (1) and positive2 (2) classes are 69.76%, 16.28% and 13.96% respectively; hence the negative (0) and positive (1) classes account for 69.76% (majority class) and 30.24% (minority class), as displayed in Fig. 2b.
3. BUPA: a real-life dataset, also selected from the UCI machine learning repository. Patients suffering from a liver disorder are labelled positive (1) and normal healthy people negative (0). There are 200 instances of class 1 and 145 instances of class 0; as shown in Fig. 2c, the negative (0) and positive (1) classes form 42.03% (minority class) and 57.97% (majority class) of the data respectively.

For all the datasets we generated scatter plots to show which samples of the majority class are selected by each of the three undersampling methods considered here. Figures 3, 4 and 5 show the scatter plots and balancing ratios for WDBC, new-thyroid and BUPA respectively.
Fig. 3 Scatter plots showing the selected samples of the majority class of the WDBC dataset using a CNN, b NearMiss (version 1), c NearMiss (version 3). d Represents the ratio of majority and minority classes after applying the proposed method. Here blue color represents minority class samples and orange color represents the majority class samples
Fig. 4 Scatter plots showing the selected samples of the majority class of the New-thyroid dataset using a CNN, b NearMiss (version 1), c NearMiss (version 3). d Represents the ratio of majority and minority classes after applying the proposed method. Here blue color represents majority class samples and orange color represents the minority class samples
Fig. 5 Scatter plots showing the selected samples of the majority class of the BUPA dataset using a CNN, b NearMiss (version 1), c NearMiss (version 3). d Represents the ratio of majority and minority classes after applying the proposed method. Here blue color represents minority class samples and orange color represents the majority class samples
The experimental results show that our framework is effective in handling the imbalanced datasets. Each of the three datasets is split into a training set (80% of the data) and a test set (20%) for classification with the AdaBoost classifier. The measures used to report performance are described below.

• True Positive (TP): a patient with the disease is predicted as positive and the actual condition is also positive.
• True Negative (TN): a patient is predicted as negative and the observation is also negative.
• False Positive (FP): a patient without the disease is predicted as positive.
• False Negative (FN): a patient with the disease is predicted as negative.

1. Accuracy Score: the fraction of samples predicted correctly among all predicted samples.

Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \times 100    (1)

However, a typical metric such as the accuracy score does not reflect the misclassification of minority class samples. For this reason precision, recall, F1 score and AUC score are also considered.

2. Precision: the proportion of examples predicted positive by the classifier that are actually positive.

Precision = \frac{TP}{TP + FP}    (2)

3. Recall: the proportion of actual positive examples that the classifier predicts as positive.

Recall = \frac{TP}{TP + FN}    (3)

4. F1 Score: the harmonic mean of precision and recall.

F1\ Score = \frac{2 \times Recall \times Precision}{Recall + Precision}    (4)

5. AUC Score: the Area Under the ROC (Receiver Operating Characteristic) Curve is a measure of a model's classification performance. It shows the extent to which the model separates the positive and negative examples, i.e. ranks them correctly [27].

Tables 1, 2 and 3 compare the results of our proposed method with existing state-of-the-art methods on the WDBC, new-thyroid and BUPA datasets respectively. In the tables, AUC denotes the AUC score, Acc the achieved accuracy, P precision, R recall and F1 the F1 score. From Table 1 it is clear that our proposed method performs better than the other state-of-the-art methods, with an AUC score of 99.7% and an F1 score of 100% on WDBC. On the new-thyroid dataset the AUC score is 99.56% and the F1 score is 100%, better than the previous results shown in Table 2. On the BUPA dataset we obtain an AUC score of 86.59% and an F1 score of 86.10%, as displayed in Table 3.
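For completeness, the metrics of Eqs. (1)–(4) and the AUC can be computed directly with scikit-learn. The snippet below is a generic illustration with placeholder label arrays, not the authors' evaluation code.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

y_true  = [0, 0, 1, 1, 1, 0, 1]                 # placeholder ground-truth labels
y_pred  = [0, 1, 1, 1, 0, 0, 1]                 # placeholder predicted labels
y_score = [0.2, 0.6, 0.9, 0.8, 0.4, 0.1, 0.7]   # predicted probability of class 1

print("Accuracy :", accuracy_score(y_true, y_pred) * 100)
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
print("AUC      :", roc_auc_score(y_true, y_score))
```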
Table 1 Performance comparison on the WDBC dataset

| Methods | Year | Technique used | AUC | Acc | P | R | F1 |
|---|---|---|---|---|---|---|---|
| Polat et al. [26] | 2017 | Feature selection using Principal Component AttributeEvaluation | – | 96.99 | – | – | – |
| Gürbüz and Kılıç [17] | 2013 | Adaptive SVM | – | 100 | – | – | – |
| Vuttipittayamongkol and Elyan [29] | 2020 | Baseline | – | – | – | – | 94.37 |
| Proposed model | – | Ensemble of undersampling methods + AdaBoost classifier | 99.7 | 100 | 100 | 100 | 100 |
Table 2 Performance comparison on the new-thyroid dataset

| Methods | Year | Technique used | AUC | Acc | P | R | F1 |
|---|---|---|---|---|---|---|---|
| Kodaz et al. [21] | 2009 | IG-AIRS (10 × FC) | – | 95.90 | – | – | – |
| Temurtas [28] | 2009 | PNN + 10-fold cross validation | – | 94.81 | – | – | – |
| Kousarrizi et al. [22] | 2012 | 10-fold cross validation | – | 98.62 | – | – | – |
| Proposed model | – | Ensemble of undersampling methods + AdaBoost classifier | 99.56 | 100 | 100 | 100 | 100 |
Table 3 Performance comparison on the BUPA dataset

| Methods | Year | Technique used | AUC | Acc | P | R | F1 |
|---|---|---|---|---|---|---|---|
| McDermott and Forsyth [25] | 2016 | Number of half-pint equivalents of alcoholic beverages drunk per day + logistic regression | – | 82 | – | – | – |
| Haque et al. [19] | 2018 | ANN, Random forest classifier | – | 85.28 | 85.71 | – | 82.76 |
| Kumar and Thakur [23] | 2020 | NWFKNN | 75.92 | 78.46 | 88.64 | – | 84.78 |
| Proposed model | – | Ensemble of undersampling methods + AdaBoost classifier | 86.59 | 86.36 | 86.28 | 85.97 | 86.10 |
5 Conclusion

In this paper we have proposed an ensemble approach to handle classification of class imbalanced disease datasets. Instead of relying on a single algorithm, we consider three different undersampling algorithms and form a union of the data samples selected by the three methods, so that important samples are retained. We have evaluated our method on three datasets, namely WDBC, new-thyroid and BUPA. The results confirm that our method outperforms several recently proposed methods, and we achieve a 100% F1 score on the WDBC and new-thyroid datasets. Our method may help to predict the disease (minority) class more accurately and can thus be beneficial to doctors in detecting the disease. Currently, we give equal weightage to all the undersampling methods considered here. Hence, in future we intend to apply an intelligent approach to assign a better weightage to each method. Besides, we can consider some
other undersampling methods with different working principles. Furthermore, we aim to apply our method to datasets with an extreme imbalance ratio, which would help to establish its robustness.
References 1. Begum S, Chakraborty D, Sarkar R (2015) Cancer classification from gene expression based microarray data using SVM ensemble. In: 2015 International conference on condition assessment techniques in electrical systems (CATCON). IEEE. https://doi.org/10.1109/catcon.2015. 7449500 2. Blagus R, Lusa L (2013) SMOTE for high-dimensional class-imbalanced data. BMC Bioinf 14(1). https://doi.org/10.1186/1471-2105-14-106 3. Blagus R, Lusa L (2015) Joint use of over- and under-sampling techniques and cross-validation for the development and assessment of prediction models. BMC Bioinf 16(1). https://doi.org/ 10.1186/s12859-015-0784-9 4. Bunkhumpornpat C, Sinapiromsaran K (2016) DBMUTE: density-based majority undersampling technique. Knowl Inf Syst 50(3):827–850. https://doi.org/10.1007/s10115-0160957-5 5. Cao P, Zhao D, Zaïane OR (2013) A PSO-based cost-sensitive neural network for imbalanced data classification. In: Lecture Notes in computer science. Springer, Berlin, Heidelberg, pp 452–463. https://doi.org/10.1007/978-3-642-40319-4_39 6. Chakraborty A, De R, Chatterjee A, Schwenker F, Sarkar R (2019) Filter method ensemble with neural networks. In: Lecture notes in computer science. Springer, pp 755–765. https:// doi.org/10.1007/978-3-030-30484-3_59 7. Chawla NV (2009) Data mining for imbalanced datasets: An overview. In: Data mining and knowledge discovery handbook. Springer, US, pp 875–886. https://doi.org/10.1007/978-0387-09823-4_45 8. Devarajan M, Subramaniyaswamy V, Vijayakumar V, Ravi L (2019) Fog-assisted personalized healthcare-support system for remote patients with diabetes. J Ambient Intell Hum Comput 10(10):3747–3760. https://doi.org/10.1007/s12652-019-01291-5 9. Dua D, Graff C (2017) UCI machine learning repository. http://archive.ics.uci.edu/ml 10. Fontecha J, González I, Bravo J (2019) A usability study of a mHealth system for diabetes self-management based on framework analysis and usability problem taxonomy methods. J Ambient Intell Hum Comput. https://doi.org/10.1007/s12652-019-01369-0 11. Fotouhi S, Asadi S, Kattan MW (2019) A comprehensive data level analysis for cancer diagnosis on imbalanced data. J Biomed Inf 90(103):089. https://doi.org/10.1016/j.jbi.2018.12.003 12. Ghosh KK, Ghosh S, Sen S, Sarkar R, Maulik U (2020) A two-stage approach towards protein secondary structure classification. Med Biol Eng Comput. https://doi.org/10.1007/s11517020-02194-w 13. Ghosh M, Adhikary S, Ghosh KK, Sardar A, Begum S, Sarkar R (2018) Genetic algorithm based cancerous gene identification from microarray data using ensemble of filter methods. Med Biol Eng Comput 57(1):159–176. https://doi.org/10.1007/s11517-018-1874-4 14. Ghosh M, Guha R, Singh PK, Bhateja V, Sarkar R (2019a) A histogram based fuzzy ensemble technique for feature selection. Evol Intell 12(4):713–724. https://doi.org/10.1007/s12065019-00279-6 15. Ghosh S, Bhowmik S, Ghosh K, Sarkar R, Chakraborty S (2019b) A filter ensemble feature selection method for handwritten numeral recognition 16. Gupta D, Richhariya B, Borah P (2018) A fuzzy twin support vector machine based on information entropy for class imbalance learning. Neural Comput Appl 31(11):7153–7164. https:// doi.org/10.1007/s00521-018-3551-9
17. Gürbüz E, Kılıç E (2013) A new adaptive support vector machine for diagnosis of diseases. Expert Syst. 31(5):389–397. https://doi.org/10.1111/exsy.12051 18. Han W, Huang Z, Li S, Jia Y (2019) Distribution-sensitive unbalanced data oversampling method for medical diagnosis. J Med Syst 43(2). https://doi.org/10.1007/s10916-018-1154-8 19. Haque MR, Islam MM, Iqbal H, Reza MS, Hasan MK (2018) Performance evaluation of random forests and artificial neural networks for the classification of liver disorder. In: 2018 International conference on computer, communication, chemical, material and electronic engineering (IC4ME2). IEEE. https://doi.org/10.1109/ic4me2.2018.8465658 20. Harimoorthy K, Thangavelu M (2020) Multi-disease prediction model using improved SVMradial bias technique in healthcare monitoring system. J Ambient Intell Hum Comput. https:// doi.org/10.1007/s12652-019-01652-0 21. Kodaz H, Öz¸sen S, Arslan A, Güne¸s S (2009) Medical application of information gain based artificial immune recognition system (AIRS): diagnosis of thyroid disease. Expert Syst Appl 36(2):3086–3092. https://doi.org/10.1016/j.eswa.2008.01.026 22. Kousarrizi MRN, Seiti F, Teshnehlab M (2012) An experimental comparative study on thyroid disease diagnosis based on feature subset selection and classification 23. Kumar P, Thakur RS (2020) Liver disorder detection using variable—Neighbor weighted fuzzy k nearest neighbor approach. Multimedia Tools Appl. https://doi.org/10.1007/s11042-01907978-3 24. Liu YQ, Wang C, Zhang L (2009) Decision tree based predictive models for breast cancer survivability on imbalanced data. In: 2009 3rd International conference on bioinformatics and biomedical engineering. IEEE. https://doi.org/10.1109/icbbe.2009.5162571 25. McDermott J, Forsyth RS (2016) Diagnosing a disorder in a classification benchmark. Pattern Recogn. Lett. 73:41–43. https://doi.org/10.1016/j.patrec.2016.01.004 26. Polat H, Mehr HD, Cetin A (2017) Diagnosis of chronic kidney disease based on support vector machine by feature selection methods. J Med Syst 41(4). https://doi.org/10.1007/s10916-0170703-x 27. Rosset S (2004) Model selection via the AUC. In: Twenty-first international conference on Machine learning—ICML 04. ACM Press. https://doi.org/10.1145/1015330.1015400 28. Temurtas F (2009) A comparative study on thyroid disease diagnosis using neural networks. Expert Syst Appl 36(1):944–949. https://doi.org/10.1016/j.eswa.2007.10.010 29. Vuttipittayamongkol P, Elyan E (2020) Overlap-based undersampling method for classification of imbalanced medical datasets. In: IFIP advances in information and communication technology. Springer, pp 358–369. https://doi.org/10.1007/978-3-030-49186-4_30 30. Wang KJ, Makond B, Wang KM (2013) An improved survivability prognosis of breast cancer by using sampling and feature selection technique to solve imbalanced patient classification data. BMC Med Inf Decis Mak 13(1). https://doi.org/10.1186/1472-6947-13-124
Chapter 30
A Comparative Study on Recognition of Degraded Urdu and Devanagari Printed Documents Sobia Habib, Manoj Kumar Shukla, and Rajiv Kapoor
1 Introduction

Any OCR system depends on the nature of the paper and on how the input image has been printed. Printing problems are the main cause of image degradation. There are two kinds of script, handwritten and printed. In this paper we concentrate on the types of degradation found in printed Urdu and Devanagari documents. Figure 1 illustrates the types of degradation in a printed document.
1.1 Urdu Script

Urdu is a cursive script with 39 basic characters. All the characters except 'dal', 'zal', 'ra' and 'zay' are written in three forms, i.e. an initial, a middle and an end form, and the shape of the character depends on where it is used. For example, 'ba' can be written in three forms: 'ﺍﺏ' (ab), 'ﺍﺑﻬﯽ' (abhi), 'ﺑﮩﻦ' (behen). In the first word, 'ab', BA is used at the end of the word; in the second word, 'abhi', BA is used in the middle; and in the final word, 'behen', BA is used at the beginning. So, the connectivity and shape of each
Fig. 1 Types of degradation in printed documents: overlapping, touching, broken, heavy printed, faxed, bleed through and show through
character therefore differ depending on its position, and the other characters of Urdu are used similarly. Many types of diacritics are also used in the Urdu script, such as zer, zabar, pesh, tashdeed and maddah. These diacritics help in the pronunciation of Urdu characters, but at times they create problems during segmentation, since it is difficult for the system to decide which character a particular diacritic is attached to. The various sorts of degradation that may appear in the script are depicted below. Identification of individual characters in the Urdu script is therefore a difficult task, which is why most researchers have used ligatures to recognise Urdu script.
1.2 Devanagari Script

Devanagari is likewise a cursive script with 45 characters, comprising 11 vowels and 34 consonants. The consonants can be written in two forms, full character and half character. Out of the 34 consonants, 9 cannot be written in half form; the remaining consonants can be used as half letters. These half letters join with other letters to form a word, which causes a great deal of confusion when segmenting a document character-wise, and if the document is degraded it becomes a serious challenge. Since Devanagari is not a Latin script, there is no upper casing of characters. Words are recognised by an OCR by separating them into three zones, namely upper, middle and lower; segmentation is applied after the upper part has been removed from the word, and the vowel modifiers occur in the upper and lower portions of the words. A simple zone-splitting sketch is given below.
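One common way to locate and remove the upper zone before character segmentation is a horizontal projection profile. The sketch below is a generic illustration under the assumption that the header line is the densest ink row; it is not the procedure of any specific surveyed work, and the function name is hypothetical.

```python
import numpy as np

def split_upper_zone(word_img_bw):
    """word_img_bw: 2-D array, 1 = ink, 0 = background."""
    profile = word_img_bw.sum(axis=1)        # ink pixels per row
    header_row = int(np.argmax(profile))     # assumed: header line is the densest row
    upper = word_img_bw[:header_row, :]      # vowel modifiers above the header line
    middle_lower = word_img_bw[header_row + 1:, :]  # characters and lower modifiers
    return upper, middle_lower
```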
2 Types of Degradation Different types of degradation and the work done on those degradations are described below.
Fig. 2 Overlapping characters in a Devanagari script, b Urdu script
2.1 Overlapping Characters

As the name suggests, overlapping means partly covering something; when a character partly covers another character, they are called overlapping characters. In Urdu, however, a character sometimes lies above another character without touching it. At segmentation time it then becomes difficult to decide which character a given pixel belongs to. In Fig. 2a, 'da' overlaps another 'da' in the words Moizuddin and Ahrazuddin; in Fig. 2b, re and seen overlap each other without touching. Zahour et al. [1] used fuzzy C-means clustering to estimate the complexity of a document, and the overlapping problem is removed by a block covering method. A nearest neighbourhood method was used by Das et al. [2] for clustering connected components, achieving 98% accuracy for connected components.
2.2 Touching Characters When there is no space between two characters, we call it touching characters. Figure 3a shows touching character in Devanagari Script where ‘sa’ and ‘ka’ is touching each other as there is no space between them and the same is with ‘va’ and ‘ja’. Figure 3b represents touching character in Urdu Script. Lot of work has been done in segmentation of touching characters. Fuzzy C-means algorithm was used by Bousesella et al. [3] for Arabic historical documents, and they have achieved 95% of accuracy for text lines. Farulla et al. [4] used Fuzzy logic for
Fig. 3 Touching characters. a Devanagari script, b Urdu script
Fig. 4 Broken characters. a Urdu script, b Devanagari script
touching characters and obtained an accuracy of 96.1%. Su Liang used pixel and profile projection method and achieved 99.4% accuracy for English documents.
2.3 Broken Characters Thresholding converts any image into two colour image, i.e. black and white. Black is used to show presence of any character in an image and white for absence. Entropy can be used to scan the shades of grey in the area where connected components almost touch each other. However, in many old images, thresholding does not work well as the characters are completely broken and the shape of the character is not clear to the system. Figure 4 depicts the broken characters both in Urdu and Devanagari script. In Urdu, broken characters are underlined by yellow line as we can see ‘noon’ in raashan is broken and in Devanagari maatra of aa is broken from the word ‘tarkeeb’. Fuzzy clustering and mathematical morphology was used by Pinto et al. [5] for old documents recovery. Sandhya et al. [6] used Local enhancement Technique for broken characters while Knowledge-based word interpretation model was used by Rocha and Pavlidis [7] to recognise broken characters. Following are the figures of broken characters taken from Urdu and Devanagari Script.
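As a point of reference, global (Otsu) binarization of a scanned page takes only a couple of OpenCV calls; degraded pages with faint or broken strokes usually need adaptive or locally enhanced variants instead. The file name and parameter values below are placeholders, not values taken from the surveyed works.

```python
import cv2

img = cv2.imread("scanned_page.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name

# Global Otsu threshold: ink becomes black, background white
_, global_bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Adaptive thresholding often preserves faint/broken strokes better on degraded pages
adaptive_bw = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                    cv2.THRESH_BINARY, 31, 10)
cv2.imwrite("binarized.png", adaptive_bw)
```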
2.4 Bleed Through and Show Through A pattern that meddles with the main text because of leaking of ink from the converse side. Similarly, show through is the effect where the characters written on backside of the paper is also visible in the front due to non-perfect opacity of the paper. This problem can appear not only in ancient documents, but also in recent or fresh documents. Urdu document bleed through is shown in Fig. 5a and show through can be seen in Fig. 5b where the words written on other side of the paper can be seen in front. Hu et al. [8] used Gaussian mixture model and extreme learning machine classifier method to solve this degradation problem. Bayesian Approach was used by Rowley-Brooke and Kokaram [9] to solve this bleed through problem and they got some promising result. Show through is another form of degradation that creates problem in identifying main characters. Wolf [10] used two hidden markov random
Fig. 5 a Bleedthrough Urdu document, b Show through Urdu document
fields and a single observation filed technique to solve this problem. Hysteresis thresholding was used by Estrada and Tomasi [11] for removing the bleed through effect in documents.
2.5 Heavy Printed and Faxed Documents Heavy print can destroy the shape of any character which can be easily segmented making it difficult to identify. The problem is basically same as touching characters. Many researchers have used the same technique to identify the characters of heavy printed documents. Faxed documents are those documents in which the main text cannot be seen very clearly and is a mixture of broken character, touching character and sometimes very few dots of some character remains. So, they also come in the category of degraded documents. Sometimes, the characters are so unclear that it is not visible by the human eyes. As we can see in Fig. 6a, b very light printed Urdu and Devanagari documents which is very difficult to read by human eyes. Figure 6c, d have heavy printed characters. When the characters are heavy printed, it is difficult to recognise the shape of the character as in Fig. 6c word ‘shahadat’ is difficult to understand because the empty space is filled with ink which is making difficult to understand the shape of ‘ha’, ‘da’, and ‘ta’. Bogiatzis and Papadopoulos [12] used fuzzy and entropy methods for recognising the characters having different illumination of ink in printed documents. Hidden markov model with hybrid modelling Techniques were used by Brakensiek et al. [13] for heavy and light printed documents (Table 1). A lot of work has been done in these types of degradation for different Scripts, but the results obtained are not much satisfactory when it comes to Urdu language. We can see that most of the researchers have taken the whole ligature instead of character during segmentation method. So, the character segmentation of Urdu language needs a lot of attention for better character identification. Here, we can conclude that Fuzzy clustering methods have been used for many Scripts. Some of them are mentioned
Fig. 6 a Faxed Urdu document, b Faxed Devanagari document, c Heavy printed Devanagari document, d heavy printed Urdu words
above, and they give better results than the other techniques used. So, we can conclude that focussing on clustering methods can overcome many degradation problems [22, 23].

Table 1 Methods used to overcome the degradation in Urdu and Devanagari script

| References | Script | Method | Accuracy (%) | Degradation type |
|---|---|---|---|---|
| Sonika et al. [14] | Devanagari | Aspect ratio of connected components | 88.95 | Overlapping and touching |
| Babu and Jangid [15] | Devanagari | Bounding box with three categories | 85 | Touching |
| Abidi et al. [16] | Urdu | Word spotting-based | 94.3 | Overlapping |
| Pal et al. [17] | Urdu | Component labelling approach with projection profile | 96.9 | Overlapping |
| Abid et al. [18] | Urdu | Hidden Markov model | 96 | Touching |
| Goyal et al. [19] | Devanagari | Neighbour pixel analysis | 94 | Broken |
| Chanda et al. [20] | Urdu | Tree classifier | 97 | Broken |
| Dhingra et al. [21] | Devanagari | MCE-based | 98.5 | Bleed through |
3 Features for Script Identification

Segmentation plays an important role in script identification. Above, we discussed the types of degradation that cause problems at segmentation time and the methods applied to solve them. After pre-processing and segmentation, a feature extraction technique is required to extract distinct features, followed by classification. Features of an image are extracted from its content, such as colour, texture, shape, position, dominant edges of image items, regions, etc. Feature extraction is either structural or statistical: properties like loops and edge crossing points are structural features, while statistical features relate to the distribution of pixels in an image [24, 25].
4 Classification Methods for Script Identification

Classification is used for the recognition of images. Starting from a large collection of different images, we train a machine learning classifier that ingests all of the data, summarises it in some way, and then outputs a model that captures how to recognise the different object categories. Finally, we apply this trained model to new images so that they can be recognised. So, rather than a single function that inputs an image and recognises some characters, we have two functions: one called train, which takes images and labels and outputs a model, and another called predict, which takes the model and makes predictions for new images.
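The train/predict pattern described above can be made concrete with any off-the-shelf classifier. The sketch below uses scikit-learn's digits data purely as a stand-in for a script-character dataset; the choice of an SVM here is arbitrary and not prescribed by this chapter.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)       # stand-in character images (8x8, flattened)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

model = SVC().fit(X_tr, y_tr)             # "train": images + labels -> model
pred = model.predict(X_te)                # "predict": model + new images -> labels
print((pred == y_te).mean())
```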
4.1 Classification Techniques Used for Urdu Script There are many classification techniques used to identify Urdu Scripts which depends on the database and quality of paper and script. Some of the techniques used are multidimensional long short term memory, hidden markov model, weighted linear classification, feed forward neural network, multi-dimensional recurrent neural network, etc. (Table 2).
Table 2 Classification techniques used for Urdu script

| References | Method | Accuracy (%) |
|---|---|---|
| Nazly and Shafait [26] | Segmentation free | 91 |
| Razzaq et al. [27] | HMM and fuzzy logic | 87.6 |
| Khan et al. [28] | Decision trees | 92.6 |
| Javed et al. [29] | Convolutional neural network | 95 |
| Javed et al. [30] | HMM | 92 |
| Khan et al. [31] | PCA | 96.2 |
| Ahmad et al. [32] | Stacked denoising autoencoder | 96 |
| Mir et al. [33] | FFNN | 87 |

Table 3 Classification techniques used for Devanagari script

| References | Classifier | Accuracy (%) |
|---|---|---|
| Narang et al. [34] | CNN, SVM | 88.95 |
| Karayil et al. [35] | LSTM | 91 |
| Puri et al. [36] | SVM | 99.54 |
| Imama and Haque [37] | Slice based | 82 |
4.2 Classification Techniques Used for Devanagari Scripts

Historical scripts are difficult to identify. Most studies have recognised individual characters rather than words or sentences, and it has been reported that character recognition gives better results than word- or sentence-level recognition. Some of the methods used to identify Devanagari scripts are given in Table 3.
4.3 Types of Classifier There are many types of classifier. Some of them are mentioned below.
4.3.1 Naive Bayes
It is based on Bayes' theorem. The method is fast and needs very little training data; its disadvantage is that it is not a very good estimator. Peng et al. [38] used a Naive Bayes classifier for classification of Chinese, Japanese, English and Greek text and obtained 86% accuracy. Goyal et al. [39] used Naive Bayes classifiers for Devanagari script and obtained 95% accuracy.
4.3.2 SVM
A support vector machine constructs a partition between two categories. It is helpful in high-dimensional spaces and consumes little memory; the disadvantage of this method is that it is computationally expensive. The overall average accuracy of this method is 96.7%. Camastra [40] used SVM for the English language and obtained 90% accuracy. Shukla et al. [41] used SVM for the Bangla script and obtained 94.78% accuracy.
4.3.3 KNN
The votes of the K nearest neighbours are considered and each point is assigned to the majority class among them. This technique is effective for noisy and large data; its disadvantages are that it is difficult to choose the value of K and that the computation cost is high. Lehal et al. applied nearest neighbour together with a decision tree for classification of Gurmukhi script and obtained an accuracy of 91.6% [42]. Matei et al. [43] used nearest neighbour and KNN classifiers and obtained 99.3% accuracy.
4.3.4 Decision Tree
Given data described by attributes along with their classes, a decision tree produces a sequence of rules that can be used to classify the data. Its advantage is that it is easy to understand; however, it can become extremely intricate as the result of a single mistake. The overall accuracy obtained from decision trees is 93.8%. Abuhaiba [44] used a decision tree for classification of symbols from Times New Roman and obtained 98% accuracy. Amin [45] used a decision tree for classification of Arabic script and obtained 92% accuracy.
4.3.5 Artificial Neural Network/Deep Learning
In an artificial neural network the information is stored across the whole network rather than in a database, so the loss of a small piece of data from one place does not disturb the system and it keeps working. The drawback is that the network can slow down and begin to degrade over time, but the advantage is that it does not shut down all at once. Artificial neural networks learn from events and make decisions by commenting on similar events, and they can perform more than one job at a time. The overall accuracy of this type of system is 98.4%. Sudholt and Fink [46] used a deep convolutional neural network for classification of English script and obtained 86.59% accuracy. Kowsari et al. [47] used hierarchical deep learning for text classification and obtained 86% accuracy.
From this review of the different types of classifier, it is expected that if artificial neural networks with deep learning are applied to Devanagari and Urdu script, the results will be better than before, especially for degraded documents.
5 Conclusion

It is concluded that, although numerous efforts have been made, further research is essential in the field of character recognition for degraded Urdu and Devanagari documents. For Urdu documents, most of the recognition work has been ligature-based and very few works have isolated individual characters, so much remains to be done in this area. For Devanagari documents, many strategies have been applied, yet none has achieved high accuracy on degraded material.
References 1. Zahour, A et al (2009) Overlapping and multi-touching text-line segmentation by Block Covering analysis. Pattern Anal Appl 12(4):335 2. Das MS et al (2010) Segmentation of overlapping text lines, characters in printed Telugu text document images. Int J Eng Sci Technol 2(11):6606–6610 3. Boussellaa W, Zahour A, Elabed H, Benabdelhafid A, Alimi AM (2010) Unsupervised block covering analysis for text-line segmentation of arabic ancient handwritten document images. In: 2010 20th International conference on pattern recognition, Istanbul, pp 1929–1932. https:// doi.org/10.1109/ICPR.2010.475. 4. Farulla A, Giuseppe NM, Rossini R (2017) A fuzzy approach to segment touching characters. 1–13 5. Pinto JRC et al (2005) Combining fuzzy clustering and morphological methods for old documents recovery. In: Iberian conference on pattern recognition and image analysis. Springer, Berlin, Heidelberg 6. Sandhya N, Krishnan R, Ramesh Babu DR (2015) A novel local enhancement technique for rebuilding broken characters in a degraded Kannada script. In: 2015 IEEE international advance computing conference (IACC). IEEE 7. Rocha J, Pavlidis T (1993) A solution to the problem of touching and broken characters. In: Proceedings of 2nd international conference on document analysis and recognition (ICDAR’93), Tsukuba Science City, Japan, pp 602–605. https://doi.org/10.1109/ICDAR.1993. 395663 8. Hu X, Lin H, Li S et al (2016) Global and local features based classification for bleed-through removal. Sens Imaging 17:9. https://doi.org/10.1007/s11220-016-0134-7 9. Rowley-Brooke R, Kokaram (2011) Degraded document bleed-through removal. In: 2011 Irish machine vision and image processing conference, Dublin, pp 70–75. https://doi.org/10.1109/ IMVIP.2011.21 10. Wolf C (2009) Document ink bleed-through removal with two hidden markov random fields and a single observation field. IEEE Trans Pattern Anal Mach Intell 32(3):431–447 11. Estrada R, Tomasi C (2009) Manuscript bleed-through removal via hysteresis thresholding. In: 2009 10th international conference on document analysis and recognition. IEEE. 12. Bogiatzis AC, Papadopoulos BK (2019) Local thresholding of degraded or unevenly illuminated documents using fuzzy inclusion and entropy measures. Evol Syst 10:593–619. https:// doi.org/10.1007/s12530-018-09262-5
13. Brakensiek A, Willett D, Rigoll G (2000) Improved degraded document recognition with hybrid modeling techniques and character n-grams. In: Proceedings 15th international conference on pattern recognition. ICPR-2000, vol 4. IEEE. 14. Narang S, Jindal MK, Kumar M (2019) Devanagari ancient documents recognition using statistical feature extraction techniques. S¯adhan¯a 44(6):141 15. Babu S, Jangid M (2016) Touching character segmentation of Devanagari script. In: Proceedings of the 7th international conference on computing communication and networking technologies. 16. Abidi A et.al (2011) Towards searchable digital urdu libraries-a word spotting based retrieval approach. In: 2011 ICDAR. IEEE 17. Pal U et.al (2003) Recognition of printed Urdu script. In: Seventh ICDAR, 2003. Proceedings. IEEE 18. Abid S et al (2015) Hidden Markov model based character segmentation factor applied to Urdu script. In: ICADIWT. 19. Goyal et al (2014) Method for line segmentation in handwritten documents with touching and broken parts in Devanagari script. IJCA 102(12):22–27 20. Chanda S et al (2005) English, Devanagari and Urdu text identification. In: Proceedings of ICDAR 21. Dhingra KD et al (2008) A robust OCR for degraded documents. In: ACSEE. Springer, Boston, MA, pp 497–509 22. Kumar P, Choudhury T, Rawat S, Jayaraman S (2016) Analysis of various machine learning algorithms for enhanced opinion mining using twitter data streams. In: International conference on micro-electronics and telecommunication engineering (ICMETE), pp 265–270 23. Choudhury T, Kumar V, Nigam D (2014) An innovative smart soft computing methodology towards disease (cancer, heart disease, arthritis) detection in an earlier stage and in a smarter way. Int J Comput Sci Mob Comput 3(4):368–388 24. Tomar R, Prateek M, Sastry HG (2017) A novel approach to multicast in VANET using MQTT. Ada User J 38(4):231–235 25. Tomar R, Tiwari R (2019) Information delivery system for early forest fire detection using Internet of Things. In: International conference on advances in computing and data sciences. Springer, Singapore, pp 477–86 26. Sabbour N, Shafait F (2013) A segmentation-free approach to Arabic and Urdu OCR. In: Proceedings of SPIE 8658, document recognition and retrieval XX, 86580N. https://doi.org/ 10.1117/12.2003731 27. Razzaq et al (2010) HMM and fuzzy logic: a hybrid approach for online Urdu script-based languages’ character recognition. KBS 23(8):914–923 28. Khan K et al (2015) Urdu text classification using decision trees. In: 2015 12th International conference on high-capacity optical networks and enabling/emerging technologies (HONET). IEEE 29. Javed N et al (2017) Classification of Urdu ligatures using convolutional neural networks-a novel approach. In: 2017 FIT. IEEE 30. Javed ST et al (2010) Segmentation free Nastalique Urdu ocr. WASET 46:456–461 31. Khan K et al (2012) Urdu character recognition using principal component analysis. IJCA 60(11) 32. Ahmad I et al (2017) Offline Urdu Nastaleeq optical character recognition based on stacked denoising autoencoder. China Commun 14(1):146–157 33. Mir S et.al. “Printed Urdu Nastalique script recognition using analytical approach. In: 2015 13th International conference on FIT. IEEE 34. Narang et al (2019) Devanagari ancient documents recognition using statistical feature extraction techniques. S¯adhan¯a 44(6):141 35. Karayil et.al. (2015) A segmentation-free approach for printed Devanagari script recognition. In: 2015 13th ICDAR. IEEE 36. 
Puri et al (2019) An efficient Devanagari character classification in printed and handwritten documents using SVM. Procedia Comput Sci 152:111–121
37. Imama B. Haque MA. A slice-based character recognition technique for handwritten Devanagari script 38. Peng F, Schuurmans D, Wang S (2004) Augmenting naive Bayes classifiers with statistical language models. Inf Retrieval 7(3–4):317–345 39. Goyal A Khandelwal K, Keshri P (2010) Optical character recognition for handwritten hindi. In: CS229 machine learning, pp 1–5 40. Camastra F (2007) A SVM-based cursive character recognizer. Pattern Recogn 40(12):3721– 3727 41. Shukla MK et al (2016) Classification of the Bangla script document using SVM. In: 2016 3rd International conference on recent advances in information technology (RAIT). IEEE 42. Lehal GS, Singh C (1999) Feature extraction and classification for OCR of Gurmukhi script. VIVEK-BOMBAY 12(2):2–12 43. Matei O, Pop PC, V˘alean H (2013) Optical character recognition in real environments using neural networks and k-nearest neighbour. Appl. Intell 39(4):739–748 44. Abuhaiba ISI (2006) Efficient ocr using simple features and decision trees with backtracking. Arab J Sci Eng 31 (Springer) 45. Amin A (2000) Recognition of printed Arabic text based on global features and decision tree learning techniques. Pattern Recogn 33(8):1309–1323 46. Sudholt S, Fink GA (2016) Phocnet: a deep convolutional neural network for word spotting in handwritten documents. In: 2016 15th international conference on frontiers in handwriting recognition (ICFHR). IEEE 47. Kowsari K, Brown DE, Heidarysafa M, Jafari Meimandi K, Gerber MS, Barnes LE (2017) HDLTex: hierarchical deep learning for text classification. In: 2017 16th IEEE international conference on machine learning and applications (ICMLA), Cancun, pp 364–371
Chapter 31
Machine Learning Based Feature Extraction of an Image: A Review Namra Shamim and Yogesh
1 Introduction Machine Learning concept focuses on two interrelated theories that how to construct computers that learn automatically from their own experience and the central, instructive, hypothetical, measurable, computational laws that administers all learning framework as PCs and human associations. Machine Learning has advantage for research center interests to handy innovation for business use. Machine Learning has risen as a strategy for decision in Artificial Intelligence for creating reasonable programming for PC vision, discourse acknowledgment, robot control, and so on. The impact of Machine Learning has been seen over the software engineering and enterprises worry with information escalated issues as purchaser administrations, in the analysis of flaws in a complex framework in charge of calculated changes for example its impact is seen from science to cosmology. Machine Learning is an idea of gaining from a huge arrangement of past existing information to anticipate the new information. Supervised learning includes mapping of the input data to the corresponding outputs whereas in unsupervised learning input data has no corresponding outputs. Here, the underlying structure of input data creates a corresponding output [1] (Fig. 1). The most significant milestone attempt in machine learning program is given by Samuel [2], using the game of checkers where a learner learns better with experiences of repeated ‘plays’, i.e. when computed with effective methods of look ahead and
Fig. 1 Models of machine learning
researchers, these methods enable the computer to raise from the status of beginners to that of tournament players. Various computer applications are used widely in the process of getting the desired feature image from a large collected data automatically. This framework is called CBIR (Content-Based Image Retrieval), which has increased extraordinary consideration in the writing of picture data. The steps involved in these frameworks are normally separated into extraction, selection, and classification. In extraction the selected features are reduced and are used in the task of classification. The unselected features are discarded [3]. Of these three exercises of CBIR, feature extraction is generally basic as it gives specific choices that straightforwardly impact the effectiveness of grouping. CBIR is partitioned into two stages. First pre-processing where the image is first prepared for the extraction of highlights/features. This includes sifting, standardization, division, and item distinguishing proof and these outcomes in a set of critical locales and objects of information. And second feature extraction that depicts the substance of the picture includes as shape, surface, shading utilization. The feature extraction innovation has been utilized for different applications as modern computerization, government managed savings, biometric confirmation, and wrongdoing prevention [4]. For recognizing the pattern and processing of an image, feature extraction is an exceptional type of innovation that gets the pertinent information from the data which is unique to find and addresses it in dimensional space which is of low value that helps in grouping as well. The data changing the information into a set of highlights is called feature extraction and it is further utilized as reduced relevant information instead of a full-sized input pattern. Highlight/Feature acknowledgment is additional material in the recently rising zone of innovation as in multimedia, database, and in handwriting data entry. Previously, these systems were more expensive than today. OCR (Optical Character Recognition) is a process where conversion of handwriting or machine print is converted into a computer processable format. In pattern recognition, feature extraction is done after pre-processing. The premise of feature/highlight acknowledgment is to accurately dole out information example to yield design with highlight grouping. Lippman proposed the criteria to choose feature extraction [5]. For the development of any example characterization extraction of the highlights/features is a significant advance as it extracts relevant information from objects or writing in the form of Feature Vector (FV). These components are used to survey input units with focused yield units and it becomes simpler for classifiers to group these highlights and settle on choices at each image point.
2 Literature Review A recent paper portrays the various clusters of AI calculations which are created for the wide assortment of information and the various types of issues showed across various AI problems [6]. Few studies illustrate that the calculations that is used for looking through an enormous space of up-comer programs, guided via preparing experience, to discover a program that enhances the presentation metric and they differs significantly, to a limited extent by how they address to competitive arrangements (choice trees, numerical capacities and general dialects) and some extent by the manner of projects (streamlining calculations with surely knew intermingling ensures and transformative pursuit techniques that assess progressive ages of arbitrarily changed programs), likewise, it also presents that numerous calculations centers around the capacity estimate issues [7]. In an examination, it has been appeared by a specific type of computational software investigation which has been utilized for a great deal as of lately that upper and lower limits on paces of the combination of improvement strategies blending admirably with the plan of Artificial Intelligence issues as the enhancement of an exhibition metric [8]. In recent years, with the advancement in supervised learning, supervised learning has a one high-sway on the profound systems, that means it is a system which supports multilayer of edge units, it registers some straightforward parameterized capacity of its sources of info which utilizes angle based streamlining calculations to change parameters all through such a multilayered organize dependent on mistakes at its yield [9]. Huge scope profound learning frameworks have had a significant impact as of late in PC vision [10] and discourse acknowledgment, where they made significant upgrades in their exhibition when contrasted with past methodologies [11]. Essentially the significant accomplishment in profound taking in has been gotten from regulated learning techniques for finding such portrayals, endeavours have likewise been made to grow profound learning calculations that find helpful portrayals of the contribution without the requirement for named preparing data [12]. A significant Machine Learning worldview is fortification of realizing where the data accessible in the preparation information is middle amongst the managed and unaided learning [13]. Further in other studies of image classification, it has been noticed that features play a vital role in differentiating a picture or an image, classified on the different basis like shading power, surface, and so on which improves the further preparation of an image, as it were, [14]. Image analysis is not the same as picture preparing tasks like reclamation, coding, and upgrade. Image analysis includes the location, division, extraction, and arrangement procedures [15]. Different techniques in writing that have been proposed for the component extraction. Feature Extraction goes under the information decrease procedure. The point of information decrease procedure is to lessen the informational index of features which is an important data present in an image [16]. Various studies show that information present in a picture is unpredictable and high dimensional, so extraction of the enlightening component from a picture for object acknowledgment and division is an important advance and it must be followed [17]. Shape included extraction strategies are separated into two which
are Form-based and locale-based techniques. The calculation of shape feature only from the boundary comes under the form-based technique and the whole region is covered under the locale based technique. Scarcely any investigations show that the guideline of these techniques incorporates two kinds of approach, for example, the persistent methodology in which fit as a fiddle is not appropriate into the relating subparts and the integral boundary is used to determine the component vector and the subsequent methodology goes under the Discrete (Worldwide) Approach which says that fit as a fiddle limit is dispersed into the comparing subparts and it figures the multi-dimensional element vector. The shape descriptor includes estimation of territory, circularity, unusualness, significant pivot direction, and twisting energy [18].
3 Methodology

3.1 Diagonal Based Feature Extraction Procedure

In this procedure, features of characters that are difficult to classify at the recognition stage are extracted along the diagonals. This is a significant stage, as doing it well improves the recognition rate and reduces misclassification [19]. Each character image of dimensions 90 × 60 pixels is divided into 54 equal zones, each of dimension 10 × 10 pixels, as shown in Fig. 2c. Each zone has 19 diagonal lines, and the foreground pixels under each diagonal line are added to obtain a single sub-feature, so 19 sub-features are obtained per zone. The 19 sub-features are consolidated into a single feature and placed according to their respective zones, as shown in Fig. 2b. The procedure is repeated for all the zones, so 54 features are extracted per character. Furthermore, 9 and 6 additional features are obtained from the values placed in the rows and columns of zones respectively. Consequently, each character is represented by 69, i.e. 54 + 15, features [20].
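A minimal NumPy rendering of this zoning step is given below. It assumes a 90 × 60 binary character image, and it summarises the consolidation of the 19 diagonal sums of each zone as a simple mean, which is an assumption; [20] only states that the sub-features are combined into one value per zone.

```python
import numpy as np

def diagonal_features(char_img):
    """char_img: 90x60 binary array (1 = foreground). Returns a 69-dim feature vector."""
    assert char_img.shape == (90, 60)
    zone_feat = np.zeros((9, 6))                       # 9 x 6 grid of 10x10 zones
    for i in range(9):
        for j in range(6):
            zone = char_img[10 * i:10 * (i + 1), 10 * j:10 * (j + 1)]
            # 19 diagonals of a 10x10 zone, each summed, then consolidated (here: mean)
            diags = [np.trace(zone, offset=k) for k in range(-9, 10)]
            zone_feat[i, j] = np.mean(diags)
    # 54 zone features + 9 row averages + 6 column averages = 69 features
    return np.concatenate([zone_feat.ravel(),
                           zone_feat.mean(axis=1),
                           zone_feat.mean(axis=0)])
```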
3.2 Fourier Descriptor This method is one of the methods in object recognition for the representation of the boundary shape of a segment in an image. Fourier Descriptors of the shape are made by the coefficient of Fourier Transformation. Fourier descriptors are mainly used for the representation schemes of the shape in frequency trait. The process of extracting features from an object is called image description and the descriptors must be free of variants like size, location, and orientation. These descriptors are classified into lower frequency descriptors and higher frequency descriptors.
Fig. 2 Steps involved in the extraction of features from the character [20]
The lower frequency descriptors provide information about the global features of a shape, while the higher frequency descriptors capture the minute and fine details of the shape. Shape descriptors demonstrate desirable properties of computational efficiency [21]. The number of coefficients generated by the transformation is quite large, and a subset of the coefficients is enough to capture the general features of the shape. Suppose the boundary of the shape has K pixels, numbered from 0 to K − 1 [5]. Then the location of the kth pixel along the contour in the Cartesian plane is represented by (x_k, y_k), and the shape can be represented by the equations [5]:

x(k) = x_k, \quad y(k) = y_k    (1)

In the complex plane this is written as [5]:

s(k) = x(k) + j\,y(k)    (2)
After that, we compute the discrete Fourier transform of s(k) using [5]:

a(u) = \frac{1}{K} \sum_{k=0}^{K-1} s(k)\, e^{-j 2\pi u k / K}, \quad u = 0, 1, \ldots, K - 1    (3)
The complex coefficients a(u) are called the Fourier descriptors of the boundary, and the inverse Fourier transform restores s(k), that is [5]:

s(k) = \sum_{u=0}^{K-1} a(u)\, e^{j 2\pi u k / K}, \quad k = 0, 1, \ldots, K - 1    (4)
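Equations (3) and (4) correspond directly to a discrete Fourier transform of the complex boundary signal. A small NumPy sketch is shown below; the ordered closed-contour input, the truncation to n_keep coefficients and the particular normalization used for invariance are assumptions for illustration, not choices made in [5].

```python
import numpy as np

def fourier_descriptors(boundary_xy, n_keep=16):
    """boundary_xy: (K, 2) array of boundary points in order. Returns n_keep descriptors."""
    s = boundary_xy[:, 0] + 1j * boundary_xy[:, 1]   # s(k) = x(k) + j*y(k), Eq. (2)
    a = np.fft.fft(s) / len(s)                       # a(u), Eq. (3)
    # Invariances: drop a(0) (translation), divide by |a(1)| (scale),
    # take magnitudes (rotation / starting point)
    return np.abs(a[1:n_keep + 1]) / np.abs(a[1])
```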
3.3 Principal Component Analysis

In statistics, Principal Component Analysis was formulated by Pearson, who described it as finding "the lines and planes of closest fit to systems of points in space" [22]. Principal Component Analysis is defined as an orthogonal linear transformation that maps the data to a new coordinate system such that the greatest variance under scalar projection lies on the first principal component, the second greatest on the second component, and so on [23]. If the data are normally distributed, the principal components are independent. The number of principal components is equal to or smaller than the number of original variables. The projection of the original values is given by [5]:

P = U^{T} X    (5)
Principal Component Analysis is also known as Karhunen Loeve Change (KLT), Hotelling Change and Appropriate Symmetrical Disintegration (Unit) for its application in the hands-on work.
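In practice the projection P = UᵀX is rarely coded by hand. A hedged sketch with scikit-learn follows; the random input matrix and the number of retained components are placeholder choices.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(200, 64)          # placeholder: 200 samples described by 64 raw features
pca = PCA(n_components=10)           # keep the 10 directions of largest variance
P = pca.fit_transform(X)             # projects X onto the top principal components
print(P.shape, pca.explained_variance_ratio_.sum())
```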
3.4 Independent Component Analysis

Independent Component Analysis is based on the idea of obtaining a factorial code for the data, used for local representation in the same spirit as sparse coding. Further studies have noticed that sparseness-maximization networks and Independent Component Analysis are interrelated. Bell and Sejnowski developed an unsupervised learning algorithm performing Independent Component Analysis based on entropy maximization in a single-layer feed-forward neural network [24]. The central idea of feature extraction by Independent Component Analysis is to visualize the data. Independent Component Analysis is a form of unsupervised learning; since it is tied only to the input distribution, it cannot guarantee good performance on classification problems [25]. ICA-based feature extraction includes steps such as preparation of the data, applying Independent Component Analysis, shrinking small weights, and extracting the features.
Preparation of data takes the N input features X = [x_1, ..., x_N]^T together with one output class c, giving an (N + 1)-dimensional input for the dataset. Next, each selected feature f_i is normalized as (f_i − m_i)/2σ_i, following Kwak et al. [26]. Applying ICA to the new dataset then yields the weight matrix W. The absolute mean of each independent row of the weight matrix is a_i = (1/(N+1)) Σ_{j=1}^{N+1} |w_ij| [26]. Every weight w_ij of W that is smaller than α·a_i is shrunk to zero, where α is a small positive number and a_i is the absolute mean. In the final feature-extraction step, each weight vector W_i is projected back to the original input feature space, i.e., the class-label weights are removed and the new weight matrix W of dimension (N + 1) × N is multiplied with the original data X to construct (N + 1)-dimensional vectors whose components f_i are the new feature candidates [26]. Therefore, Independent Component Analysis extracts features by statistically identifying components based on the input distribution.
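The pipeline above can be approximated with scikit-learn's FastICA, as in the hedged sketch below; the random patch data, the component count and the pruning threshold α are assumptions for illustration, not the exact procedure of [26].

```python
import numpy as np
from sklearn.decomposition import FastICA

# X: one flattened image patch per row (n_samples x n_pixels)
X = np.random.rand(1000, 256)

ica = FastICA(n_components=20, random_state=0)
S = ica.fit_transform(X)        # estimated independent components per sample
W = ica.components_             # unmixing weight matrix, shape (20, 256)

# shrink small weights to zero, mimicking the alpha * a_i pruning step
alpha = 0.1
abs_mean = np.abs(W).mean(axis=1, keepdims=True)   # a_i for every row
W_pruned = np.where(np.abs(W) < alpha * abs_mean, 0.0, W)

# features for a new patch are its projections onto the pruned weight vectors
new_patch = np.random.rand(256)
features = W_pruned @ new_patch
```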
3.5 Histogram of Oriented Gradients The Histogram of Oriented Gradients was first proposed by Dalal and Triggs and is now one of the most successful and popular descriptors in pattern recognition [27]. The Histogram of Oriented Gradients is an appearance descriptor, as it counts the occurrences of gradient orientations in the image, and it is used to detect objects. The image is partitioned into small square cells, histograms of edge directions are computed within each cell, and the concatenation of these histograms forms the descriptor. The Histogram of Oriented Gradients is invariant to geometric and photometric transformations. The first step of the HOG descriptor is the normalization of colour and gamma values; performance is enhanced by this image pre-processing. The computation of the gradient values is the initial step of the calculation. HOG shows excellent performance when compared with other feature sets. The descriptor blocks in the Histogram of Oriented Gradients are classified into R-HOG and C-HOG blocks. An R-HOG block consists of square grids described by three parameters: the number of cells, the cell size in pixels, and the number of channels. R-HOG blocks are evaluated on dense grids which are used to encode the information. C-HOG blocks are the shape-context descriptors. C-HOG blocks have two varieties, those with a single central cell and those whose central cell is divided angularly, and C-HOG blocks are described by four parameters: the number of angular bins, the number of radial bins, the radius, and the expansion factor. Further, the Histogram of Oriented Gradients descriptor has been applied to objects other than humans as well (Fig. 3). The detection window is used for the extraction of feature vectors with the help of overlapping blocks. The combined vectors feed a linear SVM which is used for person/non-person classification. The detection window is scanned at all positions and scales of an image.
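As a small illustration, the sketch below extracts an R-HOG style descriptor with scikit-image; the sample image and the specific cell/block parameters are assumptions chosen for demonstration, not the configuration of [27].

```python
from skimage import data, color
from skimage.feature import hog

image = color.rgb2gray(data.astronaut())      # any grayscale image works here

features, hog_image = hog(
    image,
    orientations=9,            # number of gradient-orientation bins per cell
    pixels_per_cell=(8, 8),    # R-HOG cell size in pixels
    cells_per_block=(2, 2),    # cells grouped into one normalization block
    block_norm='L2-Hys',
    visualize=True,
)
print(features.shape)          # one long descriptor vector for the whole window
```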
Fig. 3 Overview of feature extraction
3.6 Fractal Geometry Analysis This theory is used for the extraction of features of two-dimensional objects. It comprises wavelet analysis, central projection and fractal theory. Wavelet analysis is used to gather information about the minute features; central projection minimizes the dimensions of the input, whereas the fractal dimension is used as the feature vector. Fractals have fractional dimensions that indicate the degree to which the objects fill the space [28]. The size of the fractal set is denoted by the fractal dimensions with
a changing scale of observation which is not limited to integer values. A few characteristics of fractal analysis used to understand complex systems are self-similarity and non-integer dimension. Self-similarity refers to the property that the shape or feature of a fractal remains the same no matter how much it is magnified [30, 31]. Complex number fractals and Iterated Function System fractals are two examples of fractals. Fractals that start with complex numbers, i.e., z = a + ib, are called complex number fractals. This type of fractal is used to assign a value to each pixel on the screen. The complex number fractals are identified by various sets which were discovered by Gaston Maurice Julia and Benoit Mandelbrot, respectively. For the Mandelbrot set, we use the equation [29] (Fig. 4):

z_n = z_{n-1}^{2} + c    (6)
The points of the Mandelbrot set have been shaded black [29]. For the Julia set the iteration used is [29] z_1 = z_0^2 + c, where z_0 = z. The connection between the Mandelbrot set and the Julia set is shown in Fig. 5 [29]. Iterated Function System fractals (IFS) are created by a well-defined set of transformations, an initial pattern, transformation of the initial pattern, and transformation of the picture. The most celebrated Iterated Function System fractals are the Sierpinski triangle and the Koch snowflake, as shown in Figs. 6 and 7 [29]. Fractals are used in various fields like astrophysics, biological science, computer graphics, etc.
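A minimal escape-time rendering of the Mandelbrot iteration in Eq. (6) is sketched below; the image size, iteration limit and colouring rule are illustrative assumptions.

```python
import numpy as np

def mandelbrot(width=400, height=300, max_iter=50):
    """Escape-time image of the Mandelbrot set, iterating z_n = z_{n-1}^2 + c."""
    re = np.linspace(-2.0, 1.0, width)
    im = np.linspace(-1.2, 1.2, height)
    c = re[np.newaxis, :] + 1j * im[:, np.newaxis]
    z = np.zeros_like(c)
    escape = np.full(c.shape, max_iter, dtype=int)   # iteration at which |z| first exceeds 2
    for n in range(max_iter):
        z = z * z + c
        newly_escaped = (np.abs(z) > 2) & (escape == max_iter)
        escape[newly_escaped] = n
        z[np.abs(z) > 2] = 2                         # keep escaped values bounded to avoid overflow
    return escape                                    # points still at max_iter belong to the set

img = mandelbrot()   # points of the set can then be shaded black, as in the text
```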
Fig. 4 The image shows a segment of the complex plane [29]
Fig. 5 Relation between Mandelbrot Set and Julia Set [29] Fig. 6 Sierpinski triangle [29]
Fig. 7 Koch snowflake [29]
3.7 Shadow Features of Character For detecting shadow features, a rectangular boundary bounds the character image, and the rectangle is then divided into eight octants. A total of 16 shadows is obtained from the octants by dropping a perpendicular within each octant.
3.8 Chain Code Histogram Chain coding is a process which uses contour representation. The Chain Code Histogram (CCH) is a technique that observes directionality and uses the boundary of an object. In simple words, the Chain Code Histogram describes the movement from a given pixel to consecutive pixels by measuring the slope between two pixels, i.e., the angle made by the line joining them. The contour of an image might be affected by redundant pixels produced by the Canny operator, an edge detection algorithm; these are removed by a procedure which relies on the principle that every contour pixel has exactly two 8-connected contour pixels. Redundant pixels are produced when a pixel has more 8-connected neighbours than the sufficient number, which is two, and after the removal of redundant pixels the contour thickness changes. After all these steps it is observed that the structure of the contour consists of 4 possible segments [32, 33]. As a result, feature extraction becomes independent of the tracing direction. After processing, the bounding box containing the digit is analyzed; if its dimensions are not divisible by 3, one or two rows and columns are added. In the end, the image is divided into a 3 × 3 zonal area. Further, contour tracing proceeds from the endpoint of the contour, and finally the image is ready for feature extraction.
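The following sketch shows how a chain code histogram can be built from an ordered boundary; the 8-direction numbering and the toy square contour are assumptions chosen only to illustrate the idea.

```python
import numpy as np

# 8-connectivity moves, numbered 0..7 counter-clockwise starting at East
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def chain_code_histogram(contour):
    """contour: list of (row, col) pixels ordered along the boundary.
    Returns an 8-bin normalized histogram of the move directions."""
    hist = np.zeros(8)
    for (r0, c0), (r1, c1) in zip(contour, contour[1:]):
        move = (r1 - r0, c1 - c0)
        hist[DIRECTIONS.index(move)] += 1
    return hist / max(hist.sum(), 1)

# a tiny 3x3 square traced clockwise from the top-left corner
square = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0), (0, 0)]
print(chain_code_histogram(square))
```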
3.9 Geometric Analysis for Feature Extraction This method focuses on the extraction of individual characteristics of a human face, such as the mouth, lips, ears and forehead, and then converts them into three-dimensional space. The landmarks for the geometric feature arrangement are located automatically rather than manually.
4 Result and Discussions After studying the various methods in detail, it has been observed that, among all the methods used for feature selection and feature extraction, Principal Component Analysis is the most optimal one. However, every method is useful for different datasets, as feature extraction is subjective and every technique solves different problems. Principal Component Analysis deals with images having large feature dimensions, and as a result high-quality features are extracted using MATLAB and various algorithms. PCA is mainly focused on face recognition, image compression, and computer vision. It extracts the important features from a large feature space. This method stores more data than any other method used for the extraction of features.
5 Summary and Conclusion We have presented a detailed study of Machine Learning, features, and the various methods or techniques used for the extraction of features. The various methods were analyzed, and it is concluded that Principal Component Analysis is the most favourable one. The accuracy of each method and the pros and cons of every technique are provided in Table 1. Every method has its importance for a particular problem. From the literature review, it is noticed that much research involves machine learning, deep learning, and feature extraction methods. The review also presents the various ways of reducing data; these reduction methods are made essential by the huge amount of information that has to be analysed. The different types of Machine Learning, which is a subdivision of Artificial Intelligence, and its growth in various areas have also been discussed, along with numerous improvements in the field of AI. Machine Learning has been extended to biological fields such as the neural sciences and has also broadened its approach to commercial work.
Table 1 Comparison of various feature extraction methods

Diagonal based feature extraction
Details: The method emphasizes dividing a character of size 90 × 90 into 54 zones and extracting features from the pixels of each zone by moving diagonally
Advantages: Reduces misclassification; reduces the large input data into feature vectors; does not affect the wanted ratio
Disadvantages: Limited to handwritten characters only
Applications: Used in the recognition of handwritten letters; can convert handwritten documents into a structural framework

Fourier descriptors
Details: The method focuses on 2-dimensional shapes, the segment's boundary shape, and the frequency spectrum
Advantages: It captures the randomness along the spectral feature; the chance of losing information during the transform is very low
Disadvantages: The length of the Fourier transform is so large that sometimes it is unable to notice changes in frequency over time
Applications: Applied in spectral analysis, etc.

Principal component analysis
Details: The aim of the procedure is to locate the linear combination of variables having the highest variance and extract it
Advantages: Computation of PCA is easy and approachable; dispersion is detectable; it can handle a lot of parameters at a time
Disadvantages: Efficiency depends on the size of the block, so as the block size decreases the efficiency drops
Applications: Applied in various fields like neural science and risk management, as well as in portfolio analysis

Independent component analysis
Details: This method is used to separate signals from various variables into independent ones
Advantages: It is efficient and requires less memory for the algorithm and information
Disadvantages: It cannot extract the sources uniformly
Applications: Applied to face recognition, communication systems and prognostication

Histogram of oriented gradients
Details: In this method, the probability of gradient orientation in an image is counted
Advantages: A feature can be extracted globally using this method, and the method helps to find the orientation of every pixel of the image
Disadvantages: It takes more time to extract a feature because of its large size
Applications: Applied to human detection as well as face/face-pattern detection; also used in SBIR

Fractal geometry analysis
Details: The technique incorporates wavelet analysis, central projection, and fractal theory
Advantages: It has a wider range of signals
Disadvantages: It is not easily understandable because of its complex procedure
Applications: Fractal analysis has great application in the field of biology, where it is used to examine nerves, various bacterial elements, etc.

Shadow features of character
Details: The method provides shadow features of an image using rectangular boundaries
Advantages: The interpretation of object, shape, and size is easy
Disadvantages: Time-consuming; less efficient
Applications: Extracts every shadow feature from any image

Chain code histogram
Details: The method includes the usage of contour representation and observes directionality
Advantages: Requires few bits (up to 2 bits); it can be used for higher compression of the image
Disadvantages: Involves long chains and can be disturbed easily
Applications: Applied to compress images having many pixels

Geometric analysis for feature extraction
Details: This method focuses on the extraction of individual characteristics of a human face like the mouth, lips, ears, forehead, etc.
Advantages: In comparison to other approaches it is easier for extracting facial characteristics, and it works fast
Disadvantages: Simultaneous display of the data is not an easy step
Applications: It can observe the object in the way the human eye functions
References 1. Penchikala S (2016) Big data processing with apache spark—Part 4: spark machine learning. https://www.infoq.com/articles/apache-spark-machine-learning 2. Michie D (1968) MEMO. In: Functions and machine learning-reprinted from nature, vol 218, no 5136, pp 19–22 3. IEEE computer, special issue on content based image retrieval (1995), vol 28, no 9 4. Jain AK, Bolle RM, Pankanti S (eds) (1999) Biometrics: personal identification in networked society. Kluwer, Norwell, MA 5. Kumar G, Bhatia P. A detailed review of feature extraction in image processing systems. https:// doi.org/10.1109/ACCT.2014.74 6. Hastie T, Tibshirani R, Friedman J (2011) The elements of statistical learning: data mining, inference, and prediction. Springer, New York 7. Jordan MI, Mitchell TM (2015) Machine learning: trends, perspectives, and prospects. Science 349:255 8. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Foundations and trends in machine learning 3. Now Publishers, Boston, pp 1–122
9. Bengio Y (2009) Foundations and trends in machine learning 2. Now Publishers, Boston, pp 1–127 10. Krizhevsy A, Sutskever I, Hinton G (2015) Adv Neural Inf Process Syst 25:1097–1105 11. Hinton G et al (2012) IEEE Signal Process Mag 29:82–97 12. Hinton GE, Salakhutdinov RR (2006) Science 313:504–507 13. Mnih V et al (2015) Nature 518:529–533 14. Tian DP (2013) A review on image feature extraction and representation techniques. Int J Multimedia Ubiquitous Eng 8(4):385–396 15. Revathi R, Hemalatha M (2012) An emerging trend of feature extraction method in video processing. CS & IT-CSCP, pp 69–80 16. Jauregi E, Lazkano E, Sierra B (2010) Object recognition using region detection and feature extraction. Towards Auton Rob Syst 1:104–111 17. Petersen E, Ridder M, De D, Handels H (2001) Image processing with neural networks—A review. Pattern Recogn 18. Jinxia L, Yuehong Q (2011) Application of SIFT feature extraction algorithm on the image registration. In: International conference on electronic measurement & instruments. IEEE 19. Rajashekararadhya SV, Vanajaranjan P (2008) Efficient zone based feature extraction algorithm for handwritten numeral recognition of four popular south-Indian scripts. J Theor Appl Inf Technol JATIT 4(12):1171–1181 20. Pradeep J, Srinivasan E, Himavathi S (2011) Diagonal based feature extraction for handwritten character recognition system using neural network. In: 2011 3rd International conference on electronics computer technology, Kanyakumari, pp 364–368 21. Morlier J, Bergh M, Mevel L. Modeshapes recognition using fourier descriptors: a simple SHM example. Allemang R, De Clerck J, Niezrecki C, Blough J (eds) Topics in modal analysis II, vol 6. Conference proceedings of the society for experimental mechanics series. Springer, New York, NY 22. Pearson K (1901) On lines and planes of closest fit to systems of points in space. Philos Mag 6(2):559–572 23. Jolliffe IT (2002) Principal component analysis series. Springer series in statistics, 2nd edn. Springer, NY, XX1XX, 487 p, 28 illus 24. Bell AJ, Sejnowski TJ (1995) An information-maximization approach to blind separation and blind deconvolution. Neural Comput 7:1129–1159 25. Hyvarinen A, Oja E, Hoyer P, Hurri J (1998) Image feature extraction by sparse coding and independent component analysis. In: Proceedings. Fourteenth international conference on pattern recognition (Cat. No. 98EX170), Brisbane, Queensland, Australia, vol 2. pp 1268–1273 26. Kwak CHC, Jin (2001) Feature extraction using independent component analysis. 568–576. 10.100713-540-44668-0-80 27. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: CVPR 28. Mandelbrot BB (1983) The fractal geometry of nature. Macmillan. ISBN 978-0-7167-1186-5 1, Feb 2012 29. Patrzalek E. Fractals: useful beauty. Stan Ackermans Institute, Eindhoven University of Technology 30. Kumar P, Choudhury T, Rawat S, Jayaraman S (2016) Analysis of various machine learning algorithms for enhanced opinion mining using twitter data streams. In: 2016 International conference on micro-electronics and telecommunication engineering (ICMETE), pp 265–270 31. Choudhary V, Kacker S, Choudhury T, Vashisht V (2012) An approach to improve task scheduling in a decentralized cloud computing environment. Int J Comput Technol Appl 3(1):312–316 32. Tomar R, Prateek M, Sastry HG (2017) A novel approach to multicast in VANET using MQTT. Ada User J 38(4):231–235 33. 
Tomar R, Tiwari R (2019) Information Delivery system for early forest fire detection using Internet of Things. In: International conference on advances in computing and data sciences. Springer, Singapore, pp 477–86
Chapter 32
A Hybrid Approach to Image Fusion Using DWT and Fuzzy Logic Archit Aggarwal and Garima Aggarwal
1 Introduction Images are a format of multimedia which, from a technical standpoint, are used to store or record some quantity of data which can be measured and evaluated in several ways. There is some part of this data which may be deemed useful data or useful information. Multiple images have multiple aspects and multiple bits of useful information. The process of using computing techniques to define this useful information in multiple images and then processing them in order to combine or fuse these images is known as image fusion. Image fusion has seen a lot of research input over time and has the potential to advance with upcoming computational methodologies and expanding processing capabilities. Image fusion [1] has been widely divided into two major domains: the spatial domain and the transform domain. The spatial domain relies on the basic principle of division of the input images into various data categories. These data categories are then relevantly merged according to the desirable output needed. This domain primarily focuses on the spatial features of an image and considers the use of region and pixel-based techniques. The transform domain on the other hand uses the application of standard decomposition techniques to produce transform coefficients which are then evaluated and fused according to certain rules defined by the user. These coefficients represent different segments and different relative conditions of the input image data. Thus, it is possible to retain only the useful information by determining the useful coefficients by the use of certain techniques. The inverse application of the
transform technique produces the fused image. Image fusion has seen application in many domains such as medical imaging, robotics, security and surveillance, and scene perception and recreation [2, 3]. All standard image fusion techniques have various drawbacks in terms of the quality of the fused image. Fuzzy logic has been extensively used in the domain of image processing. This paper aims to explore and evaluate the impact and potential use of fuzzy systems in image fusion. The major contribution of this paper is to propose and implement two methods of image fusion using the application of DWT in combination with fuzzy logic. The proposed methods show better results in standard performance metrics, which in turn are evidence of better quality [4, 5], thus improving on the standard DWT technique. The major challenge faced while developing the system was determining the base technique which would be used for building the system and for the evaluation later on. Fuzzy logic is applied here to determine the coefficient values of the fused image for the individual bands obtained after the DWT decomposition. The rest of the paper is organised as follows: Sect. 2 discusses the literature survey and the background methodologies, Sect. 3 describes the proposed model and the experimental setup along with the dataset description, Sect. 4 presents the results and discussions, and Sect. 5 depicts the conclusion and future scope.
2 Related Works Image fusion may be visualised as the problem of determining the fusible areas, and thus, the number of applications and the pertaining literature is wide with many techniques available like DWT, DCT and PCA [6–9]. The [10] paper begins strong and provides all the relevant context in the introduction itself in a precise and clear manner while defining the problem and the relevant contribution of their work. The paper proposes image fusion using defocus by utilising the Levenberg–Marquardt algorithm to estimate the optimal value of the point spread functions (PSFs). This obtained value of PSF is then used to construct a fusion map which results in the fused image. The paper is extremely well presented and provides a large number of comparisons with other standard techniques in terms of practical implementation as well as numerical analysis using various parameters like mutual information, similaritybased index and quality of edge. The paper also provides mathematical analysis of computational times. The manuscript successfully presents a better way for multifocus image fusion using optimal defocus. This manuscript served as a detailed introduction to image fusion as it broadly covered most of the aspects involved. The relevant levels of details are also high. The paper [11] proposes a cartoon texture decomposition algorithm to break the image into several components. Then after decomposition has been achieved, it attempts to produce improved fusion rules to utilise relevant data from separate components and merges them into one image. The claim made here is that there has been a reduction in computational complexity and
overall improvement in performance for quality of image fusion. These claims are backed by a detailed mathematical analysis and comparison with standard techniques in terms of both computation times and quality. The paper successfully delivers a novel approach towards image fusion. The degree of background into the proposed technique is vast and helps grasp the overall components. The paper is well structured and presented. This manuscript allowed us to delve deeper into the decomposition of the images and depicted the importance for the time complexity issue. James and Dasarathy [12] provides a very extensive review of medical imaging with classification based on methods, modalities and application organs. The overall comparisons are very well described and provide a broad overview of the current state of the art techniques. The paper leads to the overall conclusion that there is still potential for work to be done in the field but there has been an improvement on an overall level in medical image fusion. The technique as a whole has successfully had an impact in the field of medicine. If one is starting to explore this field, the paper presents all the work in a well-summarised manner allowing the reader to have a good idea about the entire swath of the field. Deep learning is one of the most relevant fields of computing today. Liu et al. [13] present a comprehensive presentation about image fusion and deep learning. The background provided in the manuscript is very descriptive of individual components of image fusion and deep learning. The authors successfully provide the readers an in-depth background and correlation between deep learning and image fusion by listing advantages and difficulties in the field. Then the paper evaluated the current work on image fusion using deep learning and finally concluded with a description of future work possibilities. Dammavalam [14] presents a new approach to image fusion using fuzzy logic in combination to various other approaches like wavelet transform and GA. The topics of discrete wavelet transform, GA and the result analysis metrics are extremely well presented in a language methodology which is easy to read even for an eye which is not very familiar to the topics. The presentation of the proposed method is extensive with heavy visual images showing a complete picture of the system without holding a lot of information in the background. This makes the presentation more appealing and clearer enabling easy interpretation. The paper considers three examples from different fields to clearly show the system with comparison to standard techniques. The conclusion ends on a hopeful note with improvements in most of the quality metrics using the proposed technique. The paper uses two membership functions for the presented system whereas the proposed technique in this paper presents five membership function use cases. The exposition of the fuzzy details is not seen much in the literature available. Reference [15] is primarily concerned with DWT and fuzzy logic. It presents the theoretical aspects of DWT and fuzzy logic very well going deep into the sub-types of these systems. The paper also shows a representation of the DWT very well by showing all the bands in the DWT decomposition. The paper shows quality metrics well in mathematical terms. The primary strong points of the paper include the depiction of visual aspects and the quantitative comparison of the sub-types of DWT and fuzzy logic. 
The paper does not provide any information about the fuzzy rule base used to generate the output image. In this manuscript, it is attempted to depict the maximum details of the system used. The paper presents well
overall and shows better results than the standard techniques. Teggihalli et al. [16] are primarily concerned with satellite imaging rather than the usual medical imaging thus introducing a factor of variation into the fuzzy logic approach for image fusion. The paper highlights the use of PCA and DWT in terms of theory along with their advantages and disadvantages. The presentation of the aforementioned theoretical analysis is done well and provides comprehensive details. The details are defined well with quality metrics as well. The approach taken is outlined well in steps and phases giving a good view of the system, although the background fuzzy system is not shown in much detail like details of the rule base and membership functions. The paper also shows better results than standard techniques using a pure fuzzy approach to image fusion. The metrics used also show large variation across the literature [17]. The paper presents a hybrid DWT2 and fuzzy logic approach towards image fusion. The paper presents the theory aspects well and a very extensive quantitative analysis which extensively shows mathematical comparisons for various techniques over six datasets and five quality metrics. This helps assess comparisons to a very fine detail. The paper does not show any details about the membership functions and rule base used for image fusion in the fuzzy approach.
3 Preliminaries 3.1 Discrete Wavelet Transform Wavelet transform in general refers to the representation of a signal in the format of time and frequency. Discrete wavelet transform is one of these methods to produce this representation and is an improvement on its predecessors like Short-time Fourier transform. DWT has seen wide application in the domain of image processing and many of its subdomains like biometric applications. DWT has many parameters like wavelet families and the approximation bands. There are many wavelet families with their own applications, advantages and disadvantages. These families differ in the factors of conceptual complications, process runtime requirements and mathematical abilities like texture handling. The DWT process for image fusion is a simple and time-efficient process. Upon execution, it divides the input images into various bands which can be seen as the approximation and details bands. There is only one approximation band termed as the LL band or the CA band where the A is the approximation and provides the approximate of the entire input. There are three detail bands which are concerned with directional data of the image. The CH band is the band which provides data about the horizontal aspect of the input, the CV band is the band which produces the data about the vertical aspect of the input, and the CD band is the band which produces the data about the diagonal area of the input. The process of image fusion using DWT works by obtaining these bands for two input images (i1 and i2) and their decomposition. These decomposed bands of the
Fig. 1 DWT band decomposition
two input images are then used to produce four new bands for the resultant image using certain rules. These resultant bands are then merged using the inverse wavelet operation to produce the resultant image. The band division process is depicted in Fig. 1.
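To make the band-wise process concrete, the Python sketch below fuses two images with PyWavelets using simple averaging/maximum rules; the Haar wavelet, the specific rules and the random stand-in images are assumptions for illustration and are not the rules proposed later in this paper.

```python
import numpy as np
import pywt

def dwt_fusion(i1, i2, wavelet='haar'):
    """Baseline DWT fusion: average the approximation bands and take the
    maximum-magnitude coefficient in each detail band."""
    cA1, (cH1, cV1, cD1) = pywt.dwt2(i1, wavelet)
    cA2, (cH2, cV2, cD2) = pywt.dwt2(i2, wavelet)

    cA = (cA1 + cA2) / 2.0                                       # approximation: averaging rule
    pick = lambda a, b: np.where(np.abs(a) >= np.abs(b), a, b)
    cH, cV, cD = pick(cH1, cH2), pick(cV1, cV2), pick(cD1, cD2)  # details: maximum rule

    return pywt.idwt2((cA, (cH, cV, cD)), wavelet)               # inverse DWT gives the fused image

i1 = np.random.rand(128, 128)   # stand-ins for the two grayscale input images
i2 = np.random.rand(128, 128)
fused = dwt_fusion(i1, i2)
```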
3.2 Fuzzy Logic Theoretical knowledge is often riddled with assumptions in order to achieve a conclusion without having to factor in all the complexities which exist in a practical situation. The real world exists more in practice than in theory which poses the need for methodologies which are capable of accounting the complexities of the real world while maintaining accuracy in solutions. Fuzzy logic is a method capable of dealing with approximates rather than exact data [15]. It successfully takes on the challenge of combining machine powered decision making with human characteristic inputs which are applicable at an individual level [14, 18]. The fuzzy system is built on six components which are interdependent and work in a certain flow. A visual of the entire system is obtained from these components and the entire system can be explained on the basis of these six components. The process flow is shown in Fig. 2. The creation of a fuzzy system is extremely varying and is different for all use cases. The system in this paper presents the authors’ implementation from their perspective. The components have been elaborated further in regard to the use case at hand. User-Interface Interactions This is the step where the data is collected and fed to the system initially. The primary inputs here are the coefficient variables obtained from the input images.
Fig. 2 Fuzzy system flow
Fuzzification The data after being collected exists in crisp set form. Fuzzification refers to the process of conversion of data into a usable format for the fuzzy system. For the purposes of this study, trapezoidal membership functions are used for the input and the output. Knowledge Base The knowledge base can be further classified into the rule base and the database. The rule base houses the rules with all the if-else-then constructs. The database supplies the information and the components of which the rules are made. Inference Engine The inference engine is the processor of the entire system. It evaluates the conditions in the rule base based on fuzzified data inputs, and when a condition is triggered, it generates a result based on the 'then' part of the condition. Defuzzification The output of the computation of a fuzzy set needs to be converted to a crisp value in order to obtain the final output in the real-life domain. This process is known as defuzzification. Output Once all processes finish successfully, a crisp output is generated. The coefficient value of the concerned DWT band is the only output variable. A multi-input single-output system is obtained.
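The NumPy sketch below walks through these components (fuzzification with trapezoidal membership functions, rule evaluation, aggregation and centroid defuzzification) for two input coefficients and one output coefficient. The three labels, their membership-function breakpoints and the rule base are illustrative assumptions: the actual system in this paper uses five trapezoidal membership functions per variable for the approximation-band case.

```python
import numpy as np

def trapmf(x, a, b, c, d):
    """Trapezoidal membership function over the breakpoints a <= b <= c <= d."""
    return np.clip(np.minimum((x - a) / (b - a + 1e-9), (d - x) / (d - c + 1e-9)), 0.0, 1.0)

universe = np.linspace(0, 511, 512)                         # output coefficient range 0..511
params = {'low': (-1, 0, 100, 200), 'mid': (150, 230, 280, 360), 'high': (300, 400, 511, 512)}
out_sets = {k: trapmf(universe, *p) for k, p in params.items()}

def fuzzify(v):
    """Membership degree of one crisp input coefficient in every label."""
    return {k: float(trapmf(np.array([v]), *p)[0]) for k, p in params.items()}

def fuse_coefficient(c1, c2):
    """Tiny Mamdani-style inference: IF in1 is X AND in2 is Y THEN out is max(X, Y)."""
    d1, d2 = fuzzify(c1), fuzzify(c2)
    order = ['low', 'mid', 'high']
    aggregated = np.zeros_like(universe)
    for l1 in order:
        for l2 in order:
            strength = min(d1[l1], d2[l2])                                  # AND = min
            out_label = order[max(order.index(l1), order.index(l2))]
            aggregated = np.maximum(aggregated, np.minimum(strength, out_sets[out_label]))
    if aggregated.sum() == 0:
        return (c1 + c2) / 2.0
    return float((universe * aggregated).sum() / aggregated.sum())          # centroid defuzzification

print(fuse_coefficient(120.0, 380.0))   # crisp fused coefficient for one band position
```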
3.3 Quality Assessment Metrics Image fusion has been evaluated using these standard quality metrics [19], which can be divided into two broad categories: 1. Reference-Based: This may be defined as the assessment criteria which provide a result by comparing the obtained fused image with the original reference image where there is no blurring between the objects; these images are provided with the dataset. (a) Peak Signal-to-Noise Ratio: This is a metric which evaluates the ratio of the obtained image to the original images. The metric successfully evaluates the quality of the fused image. The calculation formula from the standard MATLAB library [20] is shown below

PSNR = 10 \log_{10}(\text{peakval}^2 / \text{MSE})    (1)
(b) Mean Square Error—Mean square error may be defined as the quality measure which aims to compute the difference or error between the expected image and the acquired image. The calculation for images is done at a pixel level and is concerned with the pixel density difference between the reference and the obtained image. MSE may be calculated as follows
MSE = \frac{1}{XY} \sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \| I_r(x, y) - I_f(x, y) \|^2    (2)
where X * Y pixels are under consideration, I_r is the reference image and I_f is the obtained fused image. 2. Non-Reference-Based: These evaluation criteria are used for evaluation of the resultant image when there is no reference or original image available for comparison. (a) Standard Deviation (SD): Standard deviation is the metric used to evaluate the contrast of the fused image. SD may be used for edge quality betterment. SD may be calculated as follows

SD = \sqrt{\frac{1}{XY} \sum_{x=1}^{X} \sum_{y=1}^{Y} (A(x, y) - \bar{A})^2}    (3)

where \bar{A} is the mean pixel value. (b) Spatial Frequency (SF) [21]: Spatial frequency with regard to image quality may be defined as the degree of clarity obtained in the image. It may be evaluated as follows

SF = \sqrt{RF^2 + CF^2}    (4)

where RF and CF are the row and column frequencies. (c) Entropy: Entropy associated with image quality is known to be associated with the texture layer of the image and hence serves as a suitable metric. The formula for calculation may be stated as

E = -\sum H \log_2(H)    (5)

where H is the normalized image histogram.
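The sketch below computes these metrics in Python; the peak value, histogram bin count and the synthetic test images are assumptions made for illustration (the paper itself uses the MATLAB implementations).

```python
import numpy as np

def mse(reference, fused):
    return np.mean((reference.astype(float) - fused.astype(float)) ** 2)

def psnr(reference, fused, peak=255.0):
    m = mse(reference, fused)
    return float('inf') if m == 0 else 10.0 * np.log10(peak ** 2 / m)

def spatial_frequency(img):
    img = img.astype(float)
    rf = np.sqrt(np.mean((img[:, 1:] - img[:, :-1]) ** 2))   # row frequency
    cf = np.sqrt(np.mean((img[1:, :] - img[:-1, :]) ** 2))   # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

def entropy(img, bins=256):
    hist, _ = np.histogram(img, bins=bins, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

ref = np.random.randint(0, 256, (64, 64))
out = np.clip(ref + np.random.randint(-5, 6, ref.shape), 0, 255)
print(psnr(ref, out), mse(ref, out), spatial_frequency(out), entropy(out), np.std(out))
```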
4 Proposed Algorithm The implementation has been done using MATLAB with a system specification of 8 GB RAM and an Intel i7 processor. Datasets with focus on only two objects (o1, o2) are considered, i.e., two inputs where one image has focused on one object (o1) and the other image has focused on the other object (o2). The dataset has two kinds, namely with reference and without reference. With Reference: This is the kind of dataset which has three images, namely the input A image (focus on o1), the input B image (focus on o2) and the original image (both o1 and o2 in focus) which is to be used as reference for evaluation of the proposed algorithms. For the purposes of this study, our own dataset for reference-based images was prepared. The dataset is shown in Fig. 3. The images have been captured, and the inputs have been created using the Gaussian blur feature.
Fig. 3 Self-generated reference dataset (Input A, Input B and Reference images for the Pendrive and Eraser datasets)
Without Reference: The dataset has both the input images, i.e., input A (focus on o1) and input B (focus on o2), but there is no reference image present for the evaluation of the proposed algorithm. Each band in the DWT [22] approach is a matrix representation of values. The goal is to produce four new bands in order to construct the fused image. This process involves selecting or combining the corresponding input values using some rules; existing methodologies include averaging or maximising the inputs. The proposed systems aim to modify this approach by using fuzzy logic to merge one of these bands. Let the reproduced image bands be a3 (approximation band, LL), b3 (horizontal band, LH), c3 (vertical band, HL) and d3 (diagonal band, HH). The proposed system aims to apply fuzzy logic to generate one of these bands (a3, b3, c3, d3) where the other three use standard DWT rules for generation. After these bands have been determined, the inverse DWT process takes place to produce the final fused image. This process of using DWT in combination with fuzzy logic has been illustrated in a step-wise process in Fig. 4. The Proposed Algorithm
1. Inputs: Read input images (I1 and I2)
2. If size(I1) == size(I2) then // Check if images are the same size.
3. i1 = grayscale(I1) // Convert image 1 to grayscale.
4. i2 = grayscale(I2) // Convert image 2 to grayscale.
5. [a1,b1,c1,d1] = DWT(i1) // Discrete wavelet transform of image 1.
6. [a2,b2,c2,d2] = DWT(i2) // Discrete wavelet transform of image 2.
Fig. 4 DWT fuzzy hybrid flowchart
7. import FIS // Import the fuzzy system.
8. b3 = evalFIS(b1, b2) // b3 is the selected band, evaluated using the fuzzy system.
9. a3, c3, d3 = evalBands() // Evaluate the remaining bands using standard DWT rules.
10. i3 = IDWT(a3, b3, c3, d3) // Reproduce the image using the inverse DWT.
11. end

Proposed Algorithm 1: Generation of the a3 Band The first technique proposes the generation of the approximation band, responsible for the average representation of the image, using fuzzy logic. For the generation of the approximation (a3) band, the data value matrix ranges between 0 and 511 (255 × 2). The input membership functions are trapezoidal in nature and each input has five membership functions. The output has five membership functions which are trapezoidal in nature with the range of 0–511. Proposed Algorithm 2: Generation of the d3 Band The second technique proposes the generation of the diagonal band, responsible for the diagonal information of the image, using fuzzy logic. For the generation of the d3 band, the data value matrix ranges between −45 and 41. The input membership functions are trapezoidal in nature and each input has four membership functions. Figure 3h shows the input membership functions and their ranges. These figures are identical for both the inputs. The output has four membership functions which are trapezoidal in nature with the range of −40 to −42. The primary distinction between the proposed techniques is the band produced using fuzzy logic. This is further mathematically seen in the difference between ranges and membership functions. It is observable that the two proposed approaches are based on the same hybrid concept of DWT with Fuzzy Logic but see variation
in the area of applicability. This difference in turn creates a large distinction in the results, thus showing advantages of each technique in its own unique expression.
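The hybrid scheme can be prototyped in Python along the lines of the sketch below (the paper's actual implementation is in MATLAB). The toy fuzzy rule used here is only a stand-in for the trapezoidal fuzzy inference described above, and the Haar wavelet, the random input images and the function names are assumptions for illustration.

```python
import numpy as np
import pywt

def hybrid_fusion(i1, i2, fuzzy_rule, wavelet='haar', fuzzy_band='a'):
    """One DWT band is produced element-wise by a fuzzy rule;
    the remaining three bands use standard DWT rules (average / absolute maximum)."""
    cA1, (cH1, cV1, cD1) = pywt.dwt2(i1, wavelet)
    cA2, (cH2, cV2, cD2) = pywt.dwt2(i2, wavelet)

    avg = lambda a, b: (a + b) / 2.0
    absmax = lambda a, b: np.where(np.abs(a) >= np.abs(b), a, b)
    fuzz = np.vectorize(fuzzy_rule)            # apply the scalar fuzzy rule per coefficient pair

    bands = {
        'a': fuzz(cA1, cA2) if fuzzy_band == 'a' else avg(cA1, cA2),
        'h': fuzz(cH1, cH2) if fuzzy_band == 'h' else absmax(cH1, cH2),
        'v': fuzz(cV1, cV2) if fuzzy_band == 'v' else absmax(cV1, cV2),
        'd': fuzz(cD1, cD2) if fuzzy_band == 'd' else absmax(cD1, cD2),
    }
    return pywt.idwt2((bands['a'], (bands['h'], bands['v'], bands['d'])), wavelet)

toy_rule = lambda c1, c2: 0.5 * (c1 + c2)      # placeholder for the paper's fuzzy inference
i1, i2 = np.random.rand(128, 128), np.random.rand(128, 128)
fused_a = hybrid_fusion(i1, i2, toy_rule, fuzzy_band='a')   # Proposed A: fuzzy a3 band
fused_d = hybrid_fusion(i1, i2, toy_rule, fuzzy_band='d')   # Proposed B: fuzzy d3 band
```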
5 Results Both the proposed techniques and the standard DWT approach were implemented on the datasets defined. The outputs for all the images for all the techniques are shown in Fig. 5. The fuzzy surface for proposed system A and B is shown in Fig. 6. The graph surfaces show the correlation between the input values and the outputs. The complete 3D graph presents the successful formulation of the fuzzy systems.

Fig. 5 Output for DWT, Proposed A and Proposed B (columns: Input A, Input B, DWT, Proposed A, Proposed B)
Fig. 6 Fuzzy surfaces for the proposed techniques (left: surface preview for Proposed Method A; right: surface preview for Proposed Method B)
Figure 6a, b enables a preview of all the values across both the fuzzy systems. These can be used to visualise the system value generation and the correlation between input values and the output they generate. The graphs summarise the entire fuzzy system and the output it generates for both the inputs. Objective Analysis The metrics chosen for the evaluation of the non-reference parameters were standard deviation, spatial frequency and entropy, while the metrics chosen for the reference-based parameters were PSNR and MSE. The results for the non-reference parameters and reference parameters are shown in Tables 1 and 2, respectively.

Table 1 Non-reference metric results

Method      Image dataset   SD       SF       Entropy
DWT         Pen drive       4.2492   6.1949   6.0403
DWT         Plane           5.4709   3.6013   4.9850
DWT         Book            5.9234   6.3377   7.3273
DWT         Rubber          5.9735   7.7751   6.5785
DWT         Brain           3.2779   7.5276   5.5602
Proposed A  Pen drive       3.9851   6.4746   6.1762
Proposed A  Plane           4.6837   5.4254   3.3277
Proposed A  Book            5.2589   5.8447   7.0872
Proposed A  Rubber          5.9425   8.8923   6.8266
Proposed A  Brain           2.8938   9.1326   5.8544
Proposed B  Pen drive       3.6781   6.3985   6.0058
Proposed B  Plane           4.8959   4.0459   4.6818
Proposed B  Book            9.4844   9.6019   7.3497
Proposed B  Rubber          4.8985   7.2521   6.5858
Proposed B  Brain           3.2909   9.2333   6.0021

Table 2 PSNR and MSE results for the reference dataset

PSNR - Dataset   DWT        Proposed (A)   Proposed (B)
Pen drive        19.5297    22.6777        22.6777
Rubber           11.3392    16.4023        17.8499

MSE - Dataset    DWT        Proposed (A)   Proposed (B)
Pen drive        724.6199   419.5394       351.0009
Rubber           4.7770     1.4888         1.0668

The better
performing metrics have been highlighted in bold. The evaluation shows better results when compared to the standard DWT technique. From the observed results, the following final results for each dataset were gathered—The pen drive dataset shows a higher value of SD using DWT, but the SF and entropy are higher using the Proposed A method. This implies that the image quality and texture are better using the proposed method, but the edge quality is higher in DWT. The plane dataset shows a higher value of SD and SF using DWT, but the entropy is higher using the Proposed A method. Higher entropy implies better texture using the proposed method. The book dataset shows higher values for all three metrics using the Proposed B methodology. The image quality, the edge quality and the texture are improved using the proposed method. The rubber dataset shows a higher value of SD using DWT, but the SF and entropy are higher using the Proposed A method. This implies that the image quality and texture are better using the proposed method, but the edge quality is higher in DWT. The brain dataset shows higher values using the Proposed B methodology. The image quality, the edge quality and the texture are improved using the proposed method. With the reference metrics, it is seen that both the metric performances are significantly higher than the standard DWT method. The improved PSNR values across both the reference datasets signify a better image quality obtained from the proposed method. The lower value of the MSE metric shows that the proposed method produces images with a higher degree of similarity to the expected images. The error rate is significantly high in the standard DWT method and is almost negligible in the proposed methods as shown by the graphs in Fig. 7.
Fig. 7 Graph comparison of PSNR and MSE values (left: graph for PSNR values; right: graph for MSE values)
Figure 7a clearly signifies the high level of PSNR when compared to the standard DWT technique which reinforces the result of higher quality. Figure 7b is representative of a negligible error values, i.e., 724.61 compared to 419.53 (57%) and 351.0 (48%) for the first dataset and 4.770 compared to 1.488 (31%) and 1.066 (22%) which shows that the comparative results of the proposed image are better in terms of quality and accuracy, while being lower in the rate of error.
6 Conclusion and Future Work Image fusion is becoming increasingly desirable in many fields. Thus there is a need for software which can create fused images without compromising accuracy and detail. Through this manuscript, image fusion was studied in detail while implementing multiple image fusion techniques. Two alternate methods for image fusion which provide a hybrid approach adaptable to different datasets with promising results are also proposed. The outputs and results outperform the standard technique of DWT in the subjective and the objective domain. The proposed techniques also show that there is some scope for expansion in terms of the size and variety of the dataset. The currently proposed algorithm shows some limitations in terms of the size of the input images thus showing potential for creating more accurate outputs with the addition of more parameters and inclusion of more algorithms in machine learning like neuro-fuzzy and deep learning. The field of computer science has seen rapid growth in terms of the available techniques to perform computation like image processing. It is hoped to explore this new potential and explore the most recently available technology like machine learning and artificial intelligence systems to develop more approaches towards image fusion.
References 1. Sharma M (2016) A review: image fusion techniques and applications. Int J Comput Sci Inf Technol 7(3):1082–1085 2. Choudhury T, Kumar V, Nigam D (2014) An innovative smart soft computing methodology towards disease (cancer, heart disease, arthritis) detection in an earlier stage and in a smarter way. Int J Comput Sci Mob Comput 3(4):368–388 3. Choudhary V, Kacker S, Choudhury T, Vashisht V (2012) An approach to improve task scheduling in a decentralized cloud computing environment. Int J Comput Technol Appl 3(1):312–316 4. Tomar R, Tiwari R (2019) Information delivery system for early forest fire detection using Internet of Things. In: International conference on advances in computing and data sciences. Springer, Singapore, pp 477–486 5. Anuj Kumar Y, Tomar R, Kumar D, Gupta H (2012) Security and privacy concerns in cloud computing. Int J Adv Res Comput Sci Softw Eng 2(5) 6. De I, Chanda B (2013) Multi-focus image fusion using a morphology-based focus measure in a quad-tree structure. Inf Fusion 14:136–146. https://doi.org/10.1016/j.inffus.2012.01.007
7. Wang Z, Ma Y, Gu J (2010) Multi-focus image fusion using PCNN. Pattern Recogn 43:2003– 2016. https://doi.org/10.1016/j.patcog.2010.01.011 8. Naidu V (2010) Discrete cosine transform-based image fusion. Defence Sci J 60:48–54. https:// doi.org/10.14429/dsj.60.105 9. Discrete cosine transform based image fusion techniques (2020). In: mathworks.com. https://in. mathworks.com/matlabcentral/fileexchange/36345-discrete-cosine-transform-based-imagefusion-technique. 10. Aslantas V, Toprak A (2017) Multi-focus image fusion based on optimal defocus estimation. Comput Electr Eng 62:302–318. https://doi.org/10.1016/j.compeleceng.2017.02.003 11. Liu Z, Chai Y, Yin H et al (2017) A novel multi-focus image fusion approach based on image decomposition. Inf Fusion 35:102–116. https://doi.org/10.1016/j.inffus.2016.09.007 12. James A, Dasarathy B (2014) Medical image fusion: a survey of the state of the art. Inf Fusion 19:4–19. https://doi.org/10.1016/j.inffus.2013.12.002 13. Liu Y, Chen X, Wang Z et al (2018) Deep learning for pixel-level image fusion: recent advances and future prospects. Inf Fusion 42:158–173. https://doi.org/10.1016/j.inffus.2017.10.007 14. Dammavalam S (2012) Quality assessment of pixel-level imagefusion using fuzzy logic. Int J Soft Comput 3:11–23. https://doi.org/10.5121/ijsc.2012.3102 15. Fuzzy based image fusion using wavelet transform. Available at: https://shodhganga.inflibnet. ac.in/bitstream/10603/40115/8/08_chapter3.pdf 16. Teggihalli MM, Mrs. Ramya HR (2018) Image fusion based on DWT type-2 fuzzy logic system. IJSER 2018 17. Kumaraswamy S, Srinivasa Rao D, Naveen Kumar N (2016) Satellite image fusion using fuzzy logic. Acta Universitatis Sapientiae, Informatica 8:241–253. https://doi.org/10.1515/ ausi-2016-0011 18. Liu Y, Liu S, Wang Z (2015) Multi-focus image fusion with dense SIFT. Inf Fusion 23:139–155. https://doi.org/10.1016/j.inffus.2014.05.004 19. Wang Z, Bovik A (2002) A universal image quality index. IEEE Signal Process Lett 9:81–84. https://doi.org/10.1109/97.995823 20. Peak signal-to-noise ratio (PSNR) (2020). MATLAB Psnr-Mathworks India. mathworks.com. https://in.mathworks.com/help/images/ref/psnr.html 21. Li S, Kwok J, Wang Y (2001) Combination of images with diverse focuses using the spatial frequency. Inf Fusion 2:169–176. https://doi.org/10.1016/s1566-2535(01)00038-0 22. Single-level discrete 2-D wavelet transform—MATLAB Dwt2—Mathworks India (2020). mathworks.com. https://in.mathworks.com/help/wavelet/ref/dwt2.html
Chapter 33
The BLEU Score for Automatic Evaluation of English to Bangla NMT Goutam Datta, Nisheeth Joshi, and Kusum Gupta
1 Introduction MT translates one natural language to another automatically. This translation can occur for a language pair, called bilingual translation, and if it applies to more than two languages it is called a multilingual translator. Again, this translation process can be in one direction only, called a unidirectional system, or in both directions, called a bidirectional system [1]. Statistical machine translation systems are primarily based on a probabilistic approach, selecting the most probable target y for a given source x by maximizing P(y|x), where x = (x1, x2, ..., xn). Again, among SMTs, phrase-based SMT performs better than word-based SMT [2]. Another popular approach combines a rule-based component, where linguistic experts frame the rules that can be referred to during the translation process, with a phrase-based approach to design a hybrid model which outperformed the baseline SMT model [3]. Unlike all SMT models, where the use of probability is the core idea to select the most probable words from parallel datasets, the recently used and most popular approach is the use of an artificial neural network (ANN), called neural machine translation (NMT). Although NMT systems outperform SMTs,
the major problem of an NMT system is that it is difficult to train. It requires various hyper-parameter adjustments to obtain a better result. One of the major problems of NMT is the out-of-domain problem: one word may have multiple meanings based on context, and the out-of-domain problem has a very adverse effect on NMT in terms of adequacy [4]. MT is mainly based on two important components: first, the language model and, secondly, the translation model. In NMT, the language model and translation model are a neural language model and a neural translation model, respectively. In neural language models, the commonly used techniques are feed-forward and recurrent neural networks. For the neural translation model, the most common and basic model is the encoder-decoder model. To capture the context during translation, the popularly used model is the attention-based model. There are again various other flavours of advanced neural translation models, like ensemble and convolutional architectures. The rest of the paper is organized as follows: the first section introduces the most recent approach in MT, i.e., the artificial neural network (ANN), which is widely used nowadays in automatic translation and known as neural machine translation; the second section introduces the various MT evaluation techniques; the third section carries out some experiments with bilingual texts in English and Bengali; and the last section is the discussion and conclusion.
1.1 Neural Machine Translation Neural machine translation exploits the use of ANNs, and recently deep architectures are widely used in the machine translation design process. The most simple and basic model of NMT is the encoder/decoder model. The encoder accepts strings and converts them into a fixed-size vector representation, and the decoder takes this vector and converts it into variable-size output. The main problem with the encoder/decoder approach is with longer sentences: since a longer sentence needs to be compressed into a fixed-length vector, which creates a problem for the neural network, the performance of NMT degrades with increasing sentence size [5]. In any MT system, there are mainly two important components: the language model and the translation model. Neural language models are a very efficient way of modeling conditional probability when multiple input variables are given [2]. For the NMT language model, the recurrent neural network (RNN) is very widely accepted. In a feed-forward neural network, information is not retained across time steps. In real-life situations, we sometimes have to take decisions based on both past and current information. Such situations are efficiently handled by RNNs in the language modeling context (Fig. 1). The hidden state and output y(t) are computed as follows:

h^{(t)} = g_h(W_i x^{(t)} + W_r h^{(t-1)} + b_h)
y^{(t)} = g_y(W_y h^{(t)} + b_y)
Fig. 1 Recurrent neural network
h(t) represents the current state and h(t−1) represents the previous state. Nonlinearity is handled by tanh or sigmoid activation functions. The major problems of the RNN are the vanishing gradient and exploding gradient problems. The vanishing gradient problem is addressed by the long short-term memory (LSTM) network (Fig. 2). A further major problem of the RNN is that it is unable to retain long-term dependencies; that is, it cannot recall previous information for longer sentences. The LSTM model
402
G. Datta et al.
solves this problem. Different types of gates are used in LSTM model. Input gate is used to decide how much input is required to change the memory state. Forget gate is used to decide how much earlier values are to be retained and output gate is responsible for passing on the information to the next layer. LSTM model suffers from huge training time as it contains more number of parameters. It also suffers from overfitting problem. Because of the above reasons, there is another approach gated recurrent unit is widely applied.
2 MT Evaluation Techniques There are numerous MT evaluation systems proposed by various researchers over the period of time. One of the very popular automatic MT evaluation systems is BLEU which is mainly precision based [6]. Modified n gram precision is calculated considering candidate and reference translations. Candidate translations are the outcome of MT results, whereas reference translations are test sets given by human experts, i.e., translations produced by human experts. Another very popular automatic MT evaluation technique is the METEOR automatic evaluation of machine translation which is based on both precision and recall [7]. The drawbacks of BLEU are eliminated in METEOR by incorporating recall and also by using unigrams and considers to produce higher fluency [7]. Even in the problem with BLUE during geometric averaging is it produces zero if any individual n-gram scores zero. This averaging n-gram problem is removed in METEOR NIST is another MT evaluator developed by US National Institute of Standard and Technology.
3 Experimentation We had evaluated the performance of google translate for bilingual texts English to Bangla using BLEU. For the purpose of evaluation, BLEU uses human expert generated results as test data set, which is also called reference set. Table 1 contains some English sentences and its Bengali translation generated by google translate. Considering the first sentence “Ramesh reads text book”. For this, the two reference translations are: Reference 1: Reference 2: For second sentence “Every action has equal and opposite reaction”. For this, the two reference translations are as follows: Reference 1:
Table 1 Some English to Bangla translations using Google Translate
English | Bangla
Ramesh reads text book | (Bengali output not reproduced)
Every action has equal and opposite reaction | (Bengali output not reproduced)
She missed the train | (Bengali output not reproduced)
A little learning is a dangerous thing | (Bengali output not reproduced)
He walks slowly | (Bengali output not reproduced)
Reference 2: For the third sentence, "She missed the train", the two reference translations are: Reference 1: Reference 2: For the fourth sentence, "A little learning is a dangerous thing", the two reference translations are as follows: Reference 1: Reference 2: Finally, for the last sentence, "He walks slowly", the two reference translations are: Reference 1: Reference 2: Now, let us calculate the n-gram counts for each translation. We considered only a bigram-based approach for our experimentation, although, in practice, BLEU calculates several n-gram precisions and takes their geometric mean; that is, it computes 1-, 2-, 3-, and 4-gram precisions and finally takes the geometric mean of all of them. Here, we show the manual n-gram calculation first, and then the actual BLEU score is calculated. The output produced by Google Translate for the first English sentence, "Ramesh reads text book", can be broken into two bigrams, and the count of each bigram in the MT output is one. In the two reference sets,
Table 2 Count and Count_clip calculation for translated Bengali sentences
Bigram (Bengali n-grams not rendered) | Count | Count_clip
Bigram 1 | 1 (since it appears once in MT output) | 1 (maximum number of times it appears in either of the references)
Bigram 2 | 1 (since it appears once in MT output) | 1 (maximum number of times it appears in either of the references)
Total | 2 | 2
each of these bigrams likewise appears once. Hence, the modified precision is (1 + 1)/(1 + 1) = 2/2, i.e., 1, which is the highest possible score. This is illustrated in Table 2. The BLEU score of the same sentence, calculated against one reference translation only, was 100%, since the output matched the reference translation exactly; the score changed when the translated sentence was compared with more than one reference translation, i.e., with multiple test sets. The modified 2-gram precision = Count_clip/Count = 2/2 = 1 is therefore the maximum value, calculated manually. The modified precision for the next translated sentence can be calculated using the same approach (Table 3): here Count_clip/Count = 4/6 = 66.67%, calculated manually. The BLEU score for this sentence was 27.30%, since BLEU computes 1-, 2-, 3-, and 4-gram precisions and then takes their geometric mean (Fig. 3).
Table 3 Count and Count_clip calculation for translated Bengali sentences
Bigram (Bengali n-grams not rendered) | Count | Count_clip
Bigram 1 | 1 | 0
Bigram 2 | 1 | 0
Bigram 3 | 1 | 1
Bigram 4 | 1 | 1
Bigram 5 | 1 | 1
Bigram 6 | 1 | 1
Total | 6 | 4
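The Count/Count_clip bookkeeping of Tables 2 and 3 can be made concrete with a short Python sketch; the placeholder tokens below stand in for the Bengali words, which are not reproduced here:

```python
from collections import Counter

def modified_ngram_precision(candidate, references, n=2):
    """Clipped (modified) n-gram precision as used by BLEU."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate)
    max_ref = Counter()
    for ref in references:                      # Count_clip: cap each n-gram count by the
        for gram, cnt in ngrams(ref).items():   # maximum count in any single reference
            max_ref[gram] = max(max_ref[gram], cnt)
    clipped = sum(min(cnt, max_ref[gram]) for gram, cnt in cand.items())
    total = sum(cand.values())
    return clipped / total if total else 0.0

candidate = ["w1", "w2", "w3"]                        # two bigrams, as in Table 2
references = [["w1", "w2", "w3"], ["w1", "w4", "w3"]]
print(modified_ngram_precision(candidate, references))  # 1.0
```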
Fig. 3 BLEU score for first translated sentence compared with first test data and second translated sentence
Fig. 4 BLEU score for third sentence with 1st and 2nd reference (test) data are 100 and 16.08%, respectively
In the case of higher-order n-grams, i.e., 4-grams, there may not be an exact match for all four consecutive words between the MT output and the test sets, and hence the score decreases (Fig. 4). For the third sentence, "She missed the train", the MT output exactly matches the first reference translation, and hence the BLEU score was 100%, since it returned the maximum score for all n-grams. For the same sentence, when we calculated the BLEU score using the second reference translation, the score was only 16.08% (Fig. 4). This was because the second reference translation did not exactly match the MT output, even though it was semantically as well as syntactically correct. Hence, the only solution is to compare against as many test sets as possible and then take the average, which may return a better BLEU score. Similarly, for the fourth sentence, "A little learning is a dangerous thing", the BLEU score was only 17.16% when compared with the first reference data, but 36.30% when compared with the second reference data: fewer matches were found with the first reference data, and more matches were found with the second, hence the higher score (Fig. 5).
Fig. 5 BLEU scores for fourth sentence with first and second test sets are 17.16 and 36.30% respectively and fifth sentence with first and second test sets are 61.05 and 19% respectively
Finally, we checked the BLEU score for the last sentence, "He walks slowly", against the first and second reference translations and found the scores to be 61.05 and 19.0%, respectively (Fig. 5).
4 Discussions and Future Scope From the above translations and the behavior of the BLEU score for English to Bengali translation, we have observed the following points. Some of the translations produced by Google's NMT are excellent in terms of adequacy and fluency. The first sentence, "Ramesh reads text book.", is translated adequately, because no information is lost or distorted in the Bengali sentence and the correct meaning is conveyed; the translation is also fluent because it is grammatically correct. For the second sentence, "Every action has equal and opposite reaction.", the Bengali translation produced by the MT system is neither adequate nor fluent, because one of the key terms is rendered incorrectly. As far as the BLEU score is concerned, for the first sentence, mapping with the first reference translation produced a 100% score because of the exact match between the MT output and the reference data. After analyzing the different MT outputs and their corresponding two reference sets, we can say that it is always better to measure BLEU with a larger number of test sets than with fewer, since this optimizes the matching between the MT output and the test sets and increases the chance of producing a reliable score. It is also clear from the above discussion that the n-gram approach alone sometimes fails to produce a reliable score, and hence exploiting a soft computing-based approach may help in producing a more accurate score.
References
1. Hutchins W, Somers H (1992) An introduction to machine translation. Academic Press, London
2. Koehn P (2017) Neural machine translation. https://arxiv.org/pdf/1709.07809.pdf, 2nd public draft
3. Groves D, Way A (2009) Hybrid data driven models of machine translation. Springer, Berlin
4. Koehn P, Knowles R (2017) Six challenges for neural machine translation. https://arxiv.org/pdf/1706.03872.pdf
5. Cho K, van Merrienboer B, Bahdanau D, Bengio Y (2014) On the properties of neural machine translation: encoder–decoder approaches
6. Papineni K, Roukos S, Ward T, Zhu W (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics (ACL), Philadelphia
7. Lavie A, Denkowski M (2009) The METEOR metric for automatic evaluation of machine translation. Kluwer Academic Publishers
Chapter 34
A Classifier to Predict Document Novelty Using Association Rule Mining Binesh Nair
1 Introduction IDC predicts that 80% of worldwide data will be unstructured by 2025 [1]. Gartner [2] refers to unstructured data as content which does not conform to any specific, pre-defined data model. Such content is typically generated by people and hence generally does not fit into traditional databases. Thus, enterprises face the arduous task not only of managing this type of data but also of deriving relevant insights from it. Knowledge workers have to analyze many documents in the form of whitepapers, articles, reports, etc., on a daily basis to make decisions. While there are recommender systems which can suggest relevant documents, many of these relevant documents may contain redundant information, and hence reading them can be a futile exercise [3]. The proposed work, thus, aims to assist knowledge workers by developing a classifier which not only recommends relevant documents but also flags them as novel or redundant. The novelty of a document in this work is defined as the uniqueness of its content on a subject which is relevant to the reader and which would enhance or extend the reader's existing knowledge base. Although novelty can also arise from a host of other factors, such as the structure of the document, the style of writing, or the source of the work, as briefly described in [4], none of these is within the scope of this work. The following subsections provide an introduction to the basic concepts upon which the proposed framework is built.
B. Nair, SBM, NMIMS University, Mumbai, India
1.1 Text Mining Text mining refers to the application of data mining techniques to unstructured data [5] such as documents, call center notes, blogs, tweets, comments, and so on, in order to discover new patterns or insights.
1.2 Key Terminologies Some key terminologies associated with textual data are: i. Corpus: It refers to the collection of all text associated with a context, problem, or person [5]. ii. Documents: A document is a unit of text which is typically unstructured. A corpus consists of many documents. Emails, tweets, comments, research papers, news articles, etc., are examples of documents [5].
1.3 Association Rule Mining Association rule mining techniques have typically been applied to transactional databases in the retail industry to identify how frequently (or infrequently) items are purchased together and the associations between them. Support and confidence are two popular objective measures used to determine the strength of the mined rules. i. Support: Support measures how frequently an item appears in customers' shopping baskets [5]. Support = (number of transactions containing the item)/(total number of transactions). ii. Confidence: Confidence measures how good a rule is at predicting [5]. Confidence = (number of transactions that contain the entire rule)/(number of transactions supporting the LHS of the rule).
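A small sketch of these two measures on hypothetical shopping baskets (the transactions below are invented purely for illustration):

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(lhs, rhs, transactions):
    """Support of the full rule divided by the support of its left-hand side."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

transactions = [{"bread", "milk"}, {"bread", "butter"},
                {"bread", "milk", "butter"}, {"milk"}]
print(support({"bread", "milk"}, transactions))       # 0.5
print(confidence({"bread"}, {"milk"}, transactions))   # 0.666...
```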
1.4 Unsupervised Classification Unsupervised classification is a machine learning technique wherein the classifier does not have any tagged data to aid its learning process; rather, it learns on its own by discovering patterns from the data [5].
1.5 Structure of the Paper Section 2 will focus on literature review in the area of novelty detection in unstructured data. Section 3 will describe the proposed work, evaluation methodology and the observations. Finally, Sect. 4 concludes the paper.
2 Related Work Determining novel information from unstructured data is a widely researched topic. Much of the initial work in the area of novelty detection revolved around topic detection and tracking (TDT); specifically, the TDT task in which novelty detection is applied is first story detection (FSD) [7–10]. However, 'event' and 'novelty' are not the same. Events have certain structures and occurrence patterns in the media, whereas the background documents of a typical knowledge worker are more subject-oriented and tend to follow the same subject over a longer period of time. Thus, a collection of relevant documents for a knowledge worker represents a much smaller universe of documents than a TDT task, and hence novelty detection based on such a user profile is expected to be easier than FSD in TDT [11]. Also, since documents are not event-driven like news stories, it may be worthwhile to consider a different approach for detecting the novelty of a future, unseen document. Novelty detection has primarily focused on sentences rather than documents as a whole [7, 9–11], mainly propelled by the availability of datasets for benchmarking [7]. Document-level novelty detection, on the other hand, has remained a laggard, primarily due to two factors: firstly, there has been no data available for benchmarking [12], and secondly, every document has something new to offer, which makes determining novelty a challenging task [9]. However, we have witnessed an increasing interest in document-level novelty detection recently, with these works even publishing corpora for benchmarking to extend the research further [12, 13]. For instance, [14, 15] focus on reducing the redundancy of documents in the search results of Web crawlers by identifying novel documents, [16] focuses on identifying research papers discussing new ideas based on novelty detection, and the authors in [17] use novelty detection to discover potential opportunities by mining patent documents. Primarily, researchers have focused on the following two techniques for detecting novelty in unstructured data: i. Statistical techniques like TFIDF, cosine similarity, etc. [7, 8]. ii. Language modeling [18]. While numerous studies [9, 10, 19] have compared these two techniques, there is no universal agreement that either one completely outperforms the other; in fact, they have delivered comparable results. Although language models [18] have been widely used in determining the novelty of a sentence,
the terms that make a sentence different are smoothed away in the process, which can hinder the modeling of novelty. Statistical techniques, on the other hand, consider a document as a 'bag of words' [6]. These techniques tend to be less distracted by new words than language models [10]; also, unlike language models, they do not make a naïve assumption about word independence [19]. Hence, the proposed work takes a 'bag of words' approach. We also define a novelty threshold similar to the one implemented in [7, 20], given the fact that every document will have something new to offer [9]: a document must have novel words above this pre-defined threshold in order to be declared novel. Sparse data can be another challenge in determining document novelty. Language models have used TFIDF internally, wherein the authors compared sentences with clusters of sentences [18]. However, a document-to-document comparison becomes computationally expensive as the size of the corpus grows extremely large, thereby posing scalability issues. Hence, to address this concern, we summarize the corpus using a term-document matrix (TDM). Since each document is defined as a collection of independent terms, a TDM effectively captures this relationship, which alleviates any concern about information being lost from a document. Even the size of the TDM need not be a concern, since it never needs to be retrieved in its entirety: once the strongly associated terms have been found, the matrix is no longer needed. Karkali et al. [21] used this approach to determine the novelty of a document in order to achieve better scalability, especially in a mobile setting. Similarly, Krishnamoorthy [22] used a TDM representation as a means to determine sentiment in financial news articles. Further, we also address the issue of sparse data in the TDM by removing sparse terms which are below a certain threshold. Although the application of association rule mining to unstructured data is at a nascent stage in comparison to its application to structured or semi-structured data, there have been some interesting use cases. For instance, Krishnamoorthy [22] uses association rules to determine the sentiment in financial news articles, and Haralambous and Lenca [23] employ association rule mining in document classification. Interestingly, this area of research has found applications in diverse domains: Lakshmi and Kumar [24] applied association rule mining to medical transcripts to find associations of one disease with another, and Paiva et al. [25] applied the technique to unstructured documents in order to enrich domain ontologies in the building and construction sector. We employ association rule mining to determine the novelty of an unseen document, which to the best of our knowledge has not been investigated before.
3 Proposed Work The goal of the proposed classifier is to predict the novelty of a future (unseen) document based on the knowledge assimilated from the background documents. The learning of the classifier is unsupervised. Although this work stays comfortably in the realm of text mining, it applies some typical information retrieval (IR)
techniques like construction of TDM as part of data transformation to mine the strongly associated terms.
3.1 Framework Figure 1 depicts the framework for the proposed classifier. The following is the sequence of steps involved in building the classifier:
i. Building the Corpus: A corpus is built from a set of raw documents which form the training dataset. We have used a random sample of 200 training documents from the medical domain in the NewsGroups20 dataset as our corpus, which represents the existing knowledge base of a knowledge worker. The preprocessing stage consists of loading the training documents and building them into a corpus using the 'Corpus' function within the 'tm' library in R. This corpus is
Fig. 1 Proposed framework for the classifier
then transformed into a vector model to enable construction of a term-document matrix, which in turn facilitates text mining. ii. Text Cleaning and Tokenization: This corpus then undergoes a cleaning process in which stop words, numbers, punctuation, URLs, etc., are removed. Further, a non-comprehensive list of synonyms is identified for replacement in order to consolidate the frequent terms; this is done by identifying the most common synonyms in the documents. For instance, common terms like 'medical', 'medicines', 'med', etc., have been replaced by the term 'medicine'. This also filters out redundant terms which may have similar meanings. Once the corpus has been cleaned, it is represented by a collection of relevant words (tokens). iii. Construct Term Document Matrix (TDM): A TDM is constructed in such a way that the document IDs represent the columns and the terms from each of the documents represent the rows. This two-dimensional matrix structure is appropriate for finding the most frequent terms in the corpus. iv. Remove Sparse Terms from TDM: One of the issues with a TDM is that most of the terms will be highly sparse, as they appear in only one or very few documents. Since these terms are not frequent, they are not relevant, and hence they are removed from the TDM to make it more compact. During implementation, we removed those terms having a sparsity factor of 0.99, that is, terms not appearing in 99% of the documents. v. Mining Strongly Associated Terms with Domain Keywords: Every domain has keywords which are frequently used in that domain. For instance, in the training and test datasets, which are based on the medical domain, some of the domain-specific keywords we identified were 'medicine', 'symptoms', 'doctor', 'disease', 'treatment', etc. We then used the 'assoc' function provided in the 'tm' package in R to find terms strongly associated with these keywords. We used the statistical measure 'confidence' to extract the strong association rules and set a confidence threshold of 0.35, after calibrating it on the training documents; the higher the confidence, the more strongly associated the terms. vi. Evaluate the Novelty of Unseen Documents: An unseen document also undergoes the same cleaning process. Prior to novelty detection, we examine whether this document is relevant for the user. This is done through a similarity index, by matching the words of this document against a set of domain-specific keywords. If the number of matches satisfies the minimum threshold set for the similarity index, the document can be considered relevant for the user; this filters out documents which may be novel but are outside the user's domain of interest. Further, the classifier implements a novelty threshold to identify the novelty of this document: only if the number of novel terms in the document is equal to or more than the novelty threshold is the document classified as 'novel'; otherwise, it is classified as 'redundant'. We have defined the novelty threshold as a function of the size of the document to circumvent the influence of document size on the number of novel terms. Note that this threshold is calibrated on a random sample from the training dataset before being applied over the entire set of test documents.
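The chapter implements these steps with the 'tm' package in R; purely as an illustrative sketch, an equivalent pipeline — term-document matrix construction, sparse-term removal, and a simple keyword association by co-occurrence correlation — can be written in Python as below. The toy documents and the keyword 'medicine' only mirror the description above; they are not the actual corpus.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "doctor prescribed medicine for the disease symptoms",
    "treatment of the disease requires medicine and rest",
    "patient reported symptoms to the doctor",
]  # stand-ins for the 200 medical training documents

dtm = CountVectorizer(stop_words="english")
X = dtm.fit_transform(docs).toarray()                 # documents x terms
terms = np.array(dtm.get_feature_names_out())

sparsity = 1.0 - (X > 0).mean(axis=0)                 # fraction of documents lacking the term
keep = sparsity < 0.99                                # mirrors the 0.99 sparsity factor
X, terms = X[:, keep], terms[keep]

k = int(np.where(terms == "medicine")[0][0])          # one of the domain keywords
corr = np.array([np.corrcoef(X[:, k], X[:, j])[0, 1] for j in range(len(terms))])
print(terms[np.argsort(corr)[::-1]][:5])              # terms most associated with the keyword
```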
Mathematically, the number of novel words in the test document can be formulated as

$$N_{nw}(d_i \mid d_1, \ldots, d_{i-1}) = \Bigl|\, W_{d_i} \cap \overline{\textstyle\bigcup_{j=1}^{i-1} W_{d_j}} \,\Bigr|$$

where $W_{d_i}$ is the set of words contained in document $d_i$. This has been extended from the simple new-word count measure, which was one of the best-performing novelty measures in the TREC 2002 novelty track [10]. While the original version was used for predicting the novelty of sentences, we have extended it to documents. vii. Evaluating the Classifier: The results of the classifier are compared with the tags prepared by the human assessors for the corresponding test documents. Popular metrics for classification tasks, viz. precision, recall, F-score, specificity, and accuracy, have been used to determine the effectiveness of this classifier (Fig. 3).
Input:
d: input document
k: domain keywords, defined by subject matter experts
W: strongly associated words, derived from training documents
NT: novelty threshold
Output:
dtag: the predicted class (tagging of d)
Method:
1. Generate 'wd' tokens from d, after the cleaning and tokenization process
2. if Nw < S, where Nw is the number of words in d and S is the similarity index such that S ⊆ k, then
3.    d is irrelevant
4. else
5.    Compare wd with W and compute Nnw
6.    if Nnw < NT, where Nnw is the number of novel words in d, then
7.       dtag = 'redundant'
8.    else
9.       dtag = 'novel'

9% [6]. Kalyani et al. studied the association between diabetes and other chronic diseases and the relationship between HbA1C and functional disability in older adults. The difficulty in performing physical activities was assessed using the patients' vitals, blood pressure, A1C, smoking, past stroke, heart disease, chronic obstructive pulmonary disease, and cancer. This study showed that elderly people with diabetes developed higher risks of comorbidities and complications than those without diabetes [7]. A study was conducted on brain structure and cognitive ability in middle-aged to elderly patients with type 2 diabetes, observing the impact of T2D on cognitive performance, verbal declarative memory performance, hippocampal and prefrontal volumes, and hypothalamic-pituitary-adrenocortical (HPA) axis feedback control [8]. An extensive phenotype cohort study on 10,000 people with T2D and its comorbidities concluded that several macro- and micro-vascular diseases get worse with T2DM [9]. A retrospective study based on the Quintiles electronic medical record was conducted to quantify the prevalence and co-prevalence of comorbidities among patients with diabetes; it concluded that hypertension, obesity, hyperlipidemia, CKD, and CVD are the most common complications, with 97.5% of diabetics having at least one of the comorbidities and 88.5% having at least two [10]. To understand the progression of CVD in T2DM, a progress
pattern was adopted between the occurrence of T2DM and CVD in the Australian population [11]. A hybrid model with fuzzy c-means clustering and SVM was developed to predict the risk of CVD in T2D patients; the risk factors considered were BMI, weight, waist circumference, blood pressure, lipid profile, glucose, HbA1C, fibrinogen, and ultrasensitive C-reactive protein [12]. A naïve Bayes-based classifier was used to predict heart disease in T2D patients, with age, sex, weight, family history, blood pressure, fasting and PP sugar, HbA1C, and total cholesterol as the attributes considered for classification; with the use of data mining, the correlations between the attributes were retrieved efficiently [13]. Comorbidities and their associations with T2DM were also studied using an association rule-based mining approach: the maximum association was observed between T2D and hypertension, and among 18 comorbid diseases, gastritis and duodenitis, senile cataract, disorders of lipoprotein metabolism, retinal disorders, cerebral infarction, and angina pectoris had the highest frequencies of occurrence [14].
3 Data Acquisition A few diabetic databases are available online, namely the Pima Indians Diabetes Database and the UCI Diabetes Database; however, these databases do not consider Indian ethnicity, so preparing a structured dataset of Indian T2D patients is essential. A retrospective cohort of 5000 T2D patients (age group 20–80 years) of Indian origin was collected from the outpatient department of the hospital. The specifications of the data are given in Table 2. After manually screening thousands of health files, we constructed the first-level features obtained from various sources, as depicted in Table 3. From Table 3, we constructed the second-level features given in Table 4 for preparing the dataset. The features depicted in Table 4 are essential for identifying associated complications in type 2 diabetes.
Table 2 Specifications table
Type | Explanation
Type of data | Textual parameters
Subject area | Machine learning, deep learning, artificial intelligence
Source of data | GD Hospital and Diabetes Institute, Kolkata, India
Data format | Demographic information and laboratory parameters
Table 3 First-level features constructed from various sources
Source | Type | Full form
Patient report | Patient vitals | Age, sex, height, weight, marital status, BMI, blood pressure, arterial pulse upper and lower limbs
Communication report | Past history of diseases | Hypertension, dyslipidemia, CVD, CAD, CKD, retinopathy, cerebral stroke, gestational diabetes, diabetic foot disease, and thyroid disorder
 | Tobacco and alcohol consumption | Number of cigarettes/day and duration, alcohol consumption in units/day, habits of tobacco chewing
Patient report | Family history | Family members with diabetes
 | Present symptoms | Fatigue, chest pain, breathlessness, and peripheral vascular disease
 | Diagnosis | Diabetic retinopathy, coronary artery disease, renal dysfunction, peripheral vascular disease, diabetic neuropathy, cataract, and vision impairment
 | Medication | Sodium-glucose co-transporter 2 (SGLT2) inhibitors, dipeptidyl peptidase inhibitors (DPP-IV), insulin, compounds of biguanides and sulfonylureas, and meglitinides
 | Pathological investigation | Blood glucose, HbA1C, complete blood count, liver function test, sodium, potassium, urinary ACR, creatinine, lipid profile, thyroid function, ECG, echocardiogram
 | Intervention | Angiogram, angioplasty, CABG
3.1 Complex Issues in Data Acquisition In the process of data acquisition, we came across some challenges that inspired us to prepare a structured e-dataset which can be used to develop patient-centered, personalized healthcare solutions. • Missing features: Lack of documentation and inaccurate entries often lead to over- or under-estimation of diabetic status [15]. The secondary caregivers are often responsible for committing such errors. Some algorithmic measures can
Table 4 Second-level features constructed from first-level features
Feature type | Feature name
Patient vitals | Age, sex, height, weight, BMI
Family history of diabetes | FH
Duration of diabetes | Duration
Tobacco intake | Smoking
Blood pressure in lying and standing | SBP and DBP
Blood sugar | FBS, PPBS, HbA1C
Lipid profile | HDL, LDL, TG, TC, non-HDL, apolipoprotein
Urinary albumin creatinine ratio | Urinary ACR
Foot infection | Foot infection
Past cardiovascular disease | CVD
be undertaken to handle such missing values, but imputing missing values in medical records is not desirable from the clinical perspective. • Irregularity: Irregular follow-up, incomplete laboratory tests, and non-responsiveness to clinical questionnaires affect the quality of health data and subsequently influence clinical decision-making. • Inconsistency: Diagnostic coding is used as a part of clinical coding that captures a patient's state of illness, injuries, chronic disease, clinical symptoms, and diagnosis during outpatient care and at the time of inpatient admission. But in many cases, inconsistencies in diagnostic coding are encountered due to differences between hospital medical coding and physician medical coding. • Bias: In general, patients often exhibit biases in selecting healthcare organizations for treatment, and they visit different clinics at different times [15]. Thus, the clinical profile of a patient cannot be consistently maintained and preserved. Due to the lack of clinical data warehouses that link public and private data, sharing patients' health data across different organizations is not practiced. This means that patient records are maintained internally, and thus there is data mismatch or incompleteness in the health records of different organizations. • Lack of e-records: The inadequacy of e-records makes the data acquisition process lengthy and error-prone, as it involves the painstaking task of scanning clinical hand-notes and manually entering them in the digital domain. Legibility is a major concern, and the notes are hard to interpret. It is often observed that vital information is lost, since the patient's medical registry is subject to inconsistent layout and poor maintenance.
4 Data Preprocessing The patient's medical registry is composed of heterogeneous data, and the data retrieved from the registry are diverse, redundant, and inconsistent, which affects the mining result to a large extent. Therefore, the patient's health data must be preprocessed to ensure accuracy, consistency, and completeness [16]. Before preprocessing the medical records, the relationships between several laboratory markers, the clinical profile, and the demographic data are thoroughly investigated to identify the associated complications in T2D. The broad steps of data preprocessing are depicted in Fig. 1. In this study, we adopted the following ways to clean and preprocess the inconsistent data before it could be used for predictive analysis. • Unit conversion and missing value derivation: Most of the parameters in the laboratory reports present data in diverse measuring units like mmol/L or mg/dL, and thus the representation becomes inconsistent. To maintain uniformity, all the clinical values are converted to mg/dL. The lipid-profile test is composed of four parameters, namely HDL-C, LDL-C, TG, and TC. In the patient's medical registry, the missingness of any one of the four parameters is very common; in such cases, the Friedewald formula shown in Fig. 2 is used to derive the unknown value from the given parameters. Table 5 presents the laboratory markers after unit conversion and application of the Friedewald formula given in Fig. 2. • Feature labeling: Type 2 diabetes (T2D) is often associated with several complications, as discussed in Sect. 1, and identification of such complications is crucial to understanding the disease progression. To identify the occurrence of past cardiac events, a peripheral ulcer, foot gangrene, a past cerebral attack, macroalbuminuria > 300 (urinary ACR), or abnormal ECG readings are considered. In this dataset, the feature CVD is labeled as 0/1, where 0 means a patient with no past cardiac
Fig. 1 Preprocessing steps on source data
Fig. 2 Friedewald formula for missing value computation in lipid-profile
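A minimal sketch of the derivation step referred to in Fig. 2, assuming the commonly used Friedewald relation TC = LDL-C + HDL-C + TG/5 for values in mg/dL (the exact form used by the authors is in the figure and is not reproduced here); the numeric example is hypothetical:

```python
def complete_lipid_profile(tc=None, hdl=None, ldl=None, tg=None):
    """Fill in exactly one missing lipid value (mg/dL) via the Friedewald relation."""
    if ldl is None:
        ldl = tc - hdl - tg / 5.0
    elif tc is None:
        tc = ldl + hdl + tg / 5.0
    elif hdl is None:
        hdl = tc - ldl - tg / 5.0
    elif tg is None:
        tg = 5.0 * (tc - hdl - ldl)
    return {"TC": tc, "HDL-C": hdl, "LDL-C": ldl, "TG": tg}

print(complete_lipid_profile(tc=180, hdl=45, ldl=None, tg=150))  # LDL-C = 105.0
```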
Table 5 Laboratory markers
FBS | PPBS | HbA1C | HDL-C | LDL-C | TG | TC | Urinary ACR
149.53 | 255.56 | 7.6 | 40.83 | 550.91 | 112.64 | 108.82 | 0
259 | 289 | 8.8 | 43 | 127 | 91 | 180 | 0
189 | 385 | 10.68 | 45 | 117 | 157 | 197 | 327.3
282 | 358 | 11.2 | 40 | 97 | 144 | 173 | 0
193 | 398 | 10 | 31 | 93 | 304 | 185 | 0
events and 1 means the incidence of past cardiac events. This feature labeling helps to determine the positive and negative cases of past CVD in T2D. Likewise, the feature foot infection is labeled as 0/1, where an individual afflicted with foot diseases such as foot gangrene or a recurrent foot ulcer is labeled 1 and otherwise 0. The labeled features are shown in Table 6. • Pattern of tobacco intake: Tobacco intake is an important risk factor, as it escalates the disease progression. In this study, we considered smoking habits as the tobacco intake. It is measured by the number of cigarettes consumed per day, and the risk is scored accordingly: consumption ranging from 0 to more than 20 cigarettes/day is mapped to a severity score, where the y-axis measures the severity score and the x-axis measures cigarette consumption per day, as shown in Fig. 3.
Table 6 Features labeled for CVD and foot infection
PID | Foot Infection | CVD
P1 | 1 | 1
P2 | 0 | 0
P3 | 0 | 0
P4 | 0 | 0
P5 | 0 | 0
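A short pandas sketch of this 0/1 labeling, using hypothetical records and only the criteria mentioned above (past cardiac events or macroalbuminuria for CVD; foot gangrene or a recurrent foot ulcer for foot infection):

```python
import pandas as pd

records = pd.DataFrame({
    "PID": ["P1", "P2", "P3"],
    "past_cardiac_event": [True, False, False],
    "urinary_acr": [350.0, 20.0, 12.0],          # macroalbuminuria if > 300
    "foot_gangrene": [True, False, False],
    "recurrent_foot_ulcer": [False, False, True],
})

records["CVD"] = (records["past_cardiac_event"] |
                  (records["urinary_acr"] > 300)).astype(int)
records["Foot Infection"] = (records["foot_gangrene"] |
                             records["recurrent_foot_ulcer"]).astype(int)
print(records[["PID", "CVD", "Foot Infection"]])
```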
Fig. 3 Risk score of cigarette consumption
• Risk score of family history of diabetes: With a family history of diabetes, an individual is prone to develop T2D, its correlated risks, and metabolic abnormalities over his or her life span. T2D has strong associations with family history, and several studies [17, 18] have shown that genetics plays a very strong role in developing T2D. In the health records, family history is represented as a free-text string such as 'both parents having diabetes' or 'father/mother having diabetes'. To incorporate such categorical information as a feature in the dataset, it needs to be converted into numeric form. In Fig. 4, no parent, a single parent, and both parents are scored in the range 0–2, where the x-axis represents the number of parents with diabetes and the y-axis measures the risk score.
Fig. 4 Risk score of family history of diabetes
Table 7 Patient vitals
PID | Age | Sex | Height (cm) | Weight (kg) | BMI | SBP | DBP | Duration | FH | Smoking
P1 | 55 | F | 150 | 79 | 35.10 | 160 | 80 | 12 | 0 | 0
P2 | 45 | F | 146 | 62 | 29.10 | 130 | 80 | 14 | 2 | 0
P3 | 40 | F | 158.5 | 66 | 26.40 | 140 | 80 | 1 | 0 | 0
P4 | 60 | M | 166 | 61 | 22.10 | 136 | 80 | 22 | 0 | 2
P5 | 46 | M | 174 | 81 | 26.80 | 140 | 80 | 1 | 1 | 1
In Table 7, the demographic data, smoking score, and family history are presented.
4.1 Discussion on Feature Association In this section, we present and analyze the correlations among some clinical features for the comorbidity study in T2D. To understand the associations between cardiovascular disease and poor glycemic control, some laboratory results of HbA1C, HDL-C, LDL-C, and TG values are plotted in Figs. 5, 6 and 7.
Fig. 5 Correlation between HbA1C and HDL-C
Fig. 6 Correlation between HbA1 C and LDL-C
Fig. 7 Correlation between HbA1 C and triglycerides (TG)
Fig. 8 Distribution of FBS
In Figs. 5, 6 and 7, the x-axis measures HbA1C values and the y-axis represents HDL-C, LDL-C, and TG values, respectively. HbA1C is an important predictor variable for screening diabetic patients with a high chance of developing CVD. It is a well-known fact that rising HDL-C lowers the onset of heart disease, which is why it is termed good cholesterol; conversely, a lower HDL-C value is a known marker of CVD. In Fig. 5, with HbA1C > 7.0%, HDL-C values keep decreasing. This implies that individuals with HDL-C < 35 mg/dL have an 8-fold increased risk of CVD compared with those with HDL-C > 65.0 mg/dL; thus, HDL-C has an inverse relation with cardiac disease in T2D. In Figs. 6 and 7, individuals with HbA1C > 8.0% exhibit a significant increase in LDL-C and TG, a clear indication of higher cardiovascular morbidity and mortality with poor glycemic control. With HbA1C > 10.0%, TG values approach more than 200.0 mg/dL, which denotes that an individual is at higher cardiac risk. It can be inferred that the occurrence of cardiovascular events can be significantly reduced by improving glycemic control. The distribution of FBS, PPBS, and HbA1C is represented in Figs. 8, 9, and 10, respectively. In Fig. 8, about 45% of the diabetic population has FBS in the range 100–150, which is alarming, and more than 30% of people have PPBS levels in the range 100–300, as shown in Fig. 9. Figure 10 depicts that the average HbA1C is around 8.0%, which is higher than the target level of
Fig. 6 Algorithm for detecting artificially ripen mango
If the spot factor value is greater than or equal to 3.80, the mango is naturally ripened. Otherwise, the mango is artificially ripened (Fig. 6).
5 Result and Discussion In this section, the result produced by the algorithm is compared with existing technology, and the efficiency and accuracy measurements are discussed.
5.1 Accuracy of Result In the proposed system, 52 labeled artificially and naturally ripened mangoes were tested. The classification worked correctly for 49 mangoes, so the accuracy of the algorithm stands at 94.23%. In Fig. 7, the spot factor values for each mango are plotted as bars. The blue line parallel to the X-axis represents the threshold value: mangoes having values above the blue line are naturally ripened, and the rest are ripened using chemicals.
Fig. 7 Spot factor comparison
5.2 Comparison with Existing Technologies The existing technologies for detecting artificially ripened fruits are mainly based on the RGB or CMYK color values of the image. In the case of mango, the color may change depending on the type and maturity level. The proposed method does not depend only on color features; it also extracts the spots and patches of the mango.
5.3 Limitations The model cannot read the image from a bunch of mangoes. An isolated background is required to segment the image.
6 Conclusion and Future Works Classification of artificially and naturally ripened mangoes is important, as it has a direct impact on public health. This research discusses an effective method for the detection of artificially ripened mangoes. The model works efficiently on all types of mangoes, independent of color. The key features of this technique are:
Fig. 8 Image of mango in different stages of processing (input image, filtered image, image edges, background subtraction mask, background subtracted, threshold image) for samples 1–5; the computed spot factors shown are 1.66360, 6.82177, 5.80964, 4.36894, and 7.62632
1. Increased efficiency
2. Simpler implementation
3. Faster detection
The proposed method can classify the ripened mangoes with high accuracy, but the accuracy can be pushed close to 100% by adding more samples. Instead of using a global threshold, each variety of mango can have its own threshold value; these values may be predicted using learning algorithms. Data collected from the features can be fed into a machine learning model to approximate the result. Acknowledgements This research has been conducted with the cooperation of Project EAS-16, Ministry of Science and Technology (MOST) and Patuakhali Science and Technology University.
Chapter 61
Intelligent Negotiation Agent Architecture for SLA Negotiation Process in Cloud Computing Rishi Kumar, Mohd Fadzil Hassan, and Muhamad Hariz M. Adnan
1 Introduction Cloud computing is an evolving field of information technology (IT). It provides solutions and development environments as software, platform, and infrastructure services over the Internet [1]. Cloud computing has created new trends in business and has also sped up the pace of scientific research. Software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS) provide applications to users (consumers), hosted by a cloud service provider (CSP) over the Internet [1]. The use of these applications produces business for the CSP in a flexible manner of unlimited use and pay-per-demand access to the services. IaaS provides computation, storage space, and communication services on demand. PaaS offers a programming environment for developing applications, with the facility of processors, memory, and supporting tools. SaaS deals with software applications, and users can use any application online [2]. Cloud computing has provided plenty of opportunity for CSPs to do business, and flexibility for users to access IT infrastructure by adjusting the amount of resources and paying only for the resources used [3] (Fig. 1).
R. Kumar, M. F. Hassan: Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar, Malaysia. M. H. M. Adnan: Computing Department, Faculty of Arts, Computing, and Creative Industry, Universiti Pendidikan Sultan Idris, Tanjung Malim, Malaysia.
Fig. 1 Components of cloud computing paradigm [1]
and medium business ventures [3]. Every single service is accompanied by the service level agreement (SLA), which describes the requirement of resources and obtained services guarantees offered by CSP to its users. SLA also advertised new schemes to attract new customers and offers multiple resources to promote their services. SLAs are defined between cloud providers, cloud consumers, and network carriers [3]. Sim [4] defines cloud computing systems as a collection of virtualized and connected computers for computing resources through negotiated service level agreements (SLAs), and cloud computing is an on-demand resources facility and has on-demand self-service which consumer can request for computing capabilities (e.g., server time and network storage) without human interaction [4, 5]. Cloud is rapidly elastic, where it can be rapidly, elastically, or automatically provisioned to the consumer and appears as unlimited. The cloud resource usage must be able to be monitored, controlled, and reported [6]. To reach final agreements between CSP and consumer, there is a negotiation between both parties, on-demand resources, price, availability, etc. At present, there is a substantial gap in the standing service level agreement (SLA) negotiation process, which neither defines the responsibilities of CSPs at the time of a malicious incident [3].
2 Background The SLA negotiation process between a cloud consumer and a CSP can be carried out in several ways. In Fig. 2, the author of [6] explains the simple one-to-one negotiation process; the agent contains three major components: a proposal generator, a proposal evaluator, and a decision-maker. The proposal generator generates proposals according to the requirements, and both participants assign values to each negotiation issue; distinct negotiation approaches are used by the negotiator to finalize the agreement. The proposal evaluator evaluates the proposals in order to satisfy both consumer and provider according to their individual demands. The decision-maker protocol decides to accept or reject a proposal on the basis of the evaluation, requirements, availability, price, reliability, and security. Figure 2 represents the one-to-one process model, which is the key prototype of negotiation between two parties. In the current cloud management system, only semi-customized service-provisioning SLAs are available. Given future requirements and recent technology changes, forecasts predict that consumers with specific quality of service (QoS) requirements may not be satisfied by CSP resource providers intent on maximizing revenue; furthermore, a CSP cannot provide explicit services on demand to consumers with particular quality expectations [7]. As demands on the cloud keep increasing, and cloud computing, fog computing, and Internet of Things (IoT) data are all in demand, cloud management systems with intelligent agents have been discussed in various papers to fulfill consumer and provider SLA requirements [1]. There is a huge demand for an intelligent negotiation agent which works at the service provider layer of cloud computing and provides a stable agreement between consumer and provider according to the QoS requirements. In view of the heavy demand for cloud, fog, and IoT services by consumers, the negotiation agent should be able to support SLA requirements as well as their management [8]. This situation motivates researchers to work toward the development of an intelligent negotiation agent which can handle the requirements, management, and dynamic changes of SLAs during cloud services [9]. An automated negotiation agent in cloud computing must also consider other factors such as service discovery,
Fig. 2 Interaction process for one-to-one negotiation [6]
scaling, monitoring, and decommissioning operations in a dynamic environment [10]. Cloud services and their resources dynamically allot virtual machines for task execution as per the SLA requirements. These resource distributions need to be well negotiated and optimally allocated to the consumer by the service provider using an intelligent mediator agent [11, 16]. The respective cloud service provider agency may introduce or act as an agent for upholding dynamic allocation, monitoring, and reconfiguration of the cloud services on behalf of the user [12, 17]. The SLA negotiation participants, such as the service consumer, the broker, and the service provider, are self-regulating bodies with different necessities, policies, and objectives. These demands raise the need for an intelligent agent to resolve the differences among all participants in cloud services [8]. In real-time e-commerce negotiation problems, broker-based negotiation frameworks built on other computing prototypes, such as the grid and the cluster, are limited by resource constraints due to the higher complexity of the negotiation process [13, 18]. To overcome the problem of cloud resource allocation while respecting QoS constraints, an intelligent agent must be introduced into the cloud environment which maintains the flexibility and dynamic behavior of the demand and supply of cloud services [11, 19]. In Fig. 3, an intelligent negotiation agent (INA) is proposed; its major components are a negotiation layer supported by an SLA interpreter, a decision support system, a cognition layer, and an information retrieval layer, all connected to a centralized database with an automated learning agent which forms the strategy for the best negotiation policy between the consumer and the cloud service provider [14].
Fig. 3 Architecture of intelligent negotiation agent for SLA negotiation in cloud computing
3 Proposed Architecture The proposed intelligent negotiation agent (INA) architecture analyzes the behavior of each consumer and CSP. After this analysis, with the help of a learning algorithm, the INA helps the consumer find an efficient, reliable, and scalable CSP which provides the required resources with maximum utility, the best cost, and minimum response time without any latency delay. Architecture of the intelligent negotiation agent (INA): In Fig. 3, the INA is described by a 7-tuple {CE, NL, IRL, CL, KB, ALA, DSS}, where CE is the communication environment, NL the negotiation layer, IRL the information retrieval layer, CL the cognition layer, KB the knowledge base, ALA the automated learning agent, and DSS the decision support system. The INA architecture is inspired by [15]. It enables a system which analyzes the information obtained from the CSP, the consumer, and the database for negotiation, and provides information that supports the decision-making process for a final agreement between both parties. The INA architecture is a multi-layered system, which helps the agent go through all possibilities for negotiation and, based on the layer-wise analysis, helps consumers select a suitable CSP. The INA has several layers for analyzing and interpreting SLAs. Negotiation layer (NL): This layer is responsible for collecting the SLAs of the consumer. The SLA interpreter breaks down the resource requirements demanded by the consumer. The INA sends a request to the respective CSPs and asks for proposals; each CSP analyzes the resource requirements and produces an offer with respect to the user requirements, and the SLA interpreter analyzes all offers produced by the CSPs. The negotiation process is started by the negotiator engine. The NL sends information about the SLAs to the other layers for analysis, to find the best possible match for the consumer through learning methods; the decision support system then helps to decide based on the reports produced by the learning agent. Information retrieval layer (IRL): The IRL collects information about the requirements of the consumer and the offers made against them by the various CSPs. Its information-analysis component extracts how many resources are required by users, their availability, the prices offered, and, with respect to demand, the reliable and secure CSPs shortlisted for negotiation. Fusion management is responsible for merging the information as designed for negotiation, and directory management builds a catalog for the knowledge base for further learning analysis and decision support. All information and analysis are communicated to the knowledge base and the learning agent. Cognition layer (CL): The CL is a very complex and important layer of the INA architecture. It takes information from the knowledge base and the analysis reports from the IRL. The translator process converts all data into the required informational format. The inference engine and inference mechanism help derive conclusions based on the
analysis made by the learning agents from previous data and reasoning. The resulting conclusion report helps the negotiation process find the most suitable match for both parties. Knowledge base (KB): The KB stores all the information required for negotiation and builds a database for the future learning process of the INA. Monitoring: After the SLA negotiation process, monitoring of the exchanged resources is a key factor. The monitoring agent analyzes the consumption of resources together with security and reliability factors; according to the SLAs, the communication, security, and reliability are monitored, which later supports the final billing. Automated learning agent (ALA): The ALA plays an important role in the negotiation process. It holds all the behavioral records of clients and service providers, i.e., those related to price, availability, reliability, security, etc. If the client or CSP is new to negotiation, it first saves all the records in the KB, and it also analyzes the negotiation process and monitors the service process for better results in the future. The ALA helps the cognition layer in the process of negotiation. Negotiations between the consumer and the CSPs, i.e., one-to-many negotiations, take place, and the ALA produces better matches with the help of match results produced by machine learning algorithms, i.e., reinforcement learning and deep learning algorithms. Decision support system (DSS): The DSS helps both parties decide, based on the results produced by the other layers. If both parties accept the proposal, the service process continues; if the negotiation is rejected, the SLA is redesigned and the same process continues, together with a performance report of the negotiator. This is the overall architecture of the INA model for the SLA negotiation process in cloud computing, which is expected to give a better solution than the existing models. This model gives consumers and CSPs solutions through which their patterns of resource requirements and negotiation behavior can be understood, guiding CSPs and consumers toward better matches and efficient resource management decisions. An agent architecture with intelligent behavior analysis allows services to be obtained at a minimum price, resumed in less negotiation time, with proper association and availability of resources as per demand between reliable parties.
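As a purely illustrative sketch of the proposal generator / proposal evaluator / decision-maker interaction described in Sect. 2 (the utility weights, price normalization, and acceptance threshold below are assumptions, not part of the cited architecture):

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    price: float         # cost per unit of service
    availability: float  # promised availability in [0, 1]
    response_ms: float   # promised response time in milliseconds

def consumer_utility(p: Proposal) -> float:
    # Cheaper, more available, and faster offers score higher (weights are assumed)
    return (0.5 * (1 - min(p.price / 100.0, 1.0)) +
            0.3 * p.availability +
            0.2 * (1 - min(p.response_ms / 1000.0, 1.0)))

def decide(offers, threshold=0.7):
    """Evaluate all CSP offers and accept the best one above the threshold."""
    best = max(offers, key=consumer_utility)
    return ("accept", best) if consumer_utility(best) >= threshold else ("reject", None)

offers = [Proposal(80, 0.999, 200), Proposal(40, 0.95, 350), Proposal(60, 0.99, 150)]
print(decide(offers))
```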
4 Conclusion This research studies the INA and AIA for the SLA negotiation process for services in cloud computing between the client and the CSP. A cloud SLA negotiation process with an INA will understand the requirements of the client and produce the best match from a massive number of CSPs, with a better price, availability, reliability, and security. The proposed research is to develop an intelligent agent mediator for the negotiation process. The INA also focuses on the dynamic environment of the cloud market and user demand. The INA will help speed up the negotiation process, reduce the number of rounds, provide an effective negotiation solution for users, and assist users in reaching the right service provider so that they achieve their goals and objectives. Further, a machine learning algorithm will be developed to justify and simulate the model. Monitoring of service delivery can also be added to give the agent prominent features. Security checks of both consumer and CSP can be added to the agent for secure delivery of services and to prevent data breaches. Blockchain can also provide a better solution for service delivery and support of the negotiation process. In micro- and macro-cloud services, there is a huge demand for SLA negotiation, which can further extend to fog and edge computing.
also assist user to reach them perfect service provider so that it will achieve its goals and objectives. Further, machine learning algorithm will be developed to justify and simulate the model. Monitoring of services delivery can also be added to give agent prominent features. Security check of both consumer and CSP can be added to the agent for secure delivery of services and data breaching. Blockchain can also provide a better solution for service delivery and support negotiation process. In micro- and macro-cloud services, there is huge demand of SLA negotiations and can further extend to fog and edge computing.
References 1. Bahsoon R et al (2018) A manifesto for future generation cloud computing. ACM Comput Surv 51(5) 2. Voorsluys RBW, Broberg J (2011) Cloud computing : principles and paradigms table of contents 3. Iyer GN (2016) Cloud testing: an overview 4. Sim KM (2012) Agent-based cloud computing. IEEE Trans Serv Comput 5(4):564–577 5. Sim KM (2010) Towards complex negotiation for Cloud economy. Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol 6104, LNCS, pp 395–406 6. Shojaiemehr B, Rahmani AM, Qader NN (2018) Cloud computing service negotiation: a systematic review. Comput Stand Interf 55:196–206 7. Sudhakar S, Nithya NS, Radhakrishnan BL (2019) Fair service matching agent for federated cloud. Comput Electr Eng 76:13–23 8. De la Prieta F, Rodríguez-González S, Chamoso P, Corchado JM, Bajo J (2019) Survey of agent-based cloud computing applications. Futur Gener Comput Syst 100:223–236 9. Hsu CY, Kao BR, Ho VL, Li L, Lai KR (2016) An agent-based fuzzy constraint-directed negotiation model for solving supply chain planning and scheduling problems. Appl Soft Comput J 48:703–715 10. Rajavel R, Iyer K, Maheswar R, Jayarajan P, Udaiyakumar R (2019) Adaptive neuro-fuzzy behavioral learning strategy for effective decision making in the fuzzy-based cloud service negotiation framework. J Intell Fuzzy Syst 36(3):2311–2322 11. Armstrong DJ et al (2019) The opportunities cloud service providers should pursue in 2020. Futur Gener Comput Syst 7(1):1 12. Elhabbash A, Samreen F, Hadley J, Elkhatib Y (2019) Cloud brokerage: a systematic survey. ACM Comput Surv 51(6):1–28 13. Wu L, Garg SK, Buyya R, Chen C, Versteeg S (2013) Automated SLA negotiation framework for cloud computing. In: Proceedings of 13th IEEE/ACM international symposium cluster, cloud and grid computing. CCGrid 2013, pp 235–244 14. Sim KM (2018) Agent-based approaches for intelligent InterCloud resource allocation. IEEE Trans Cloud Comput 1 15. Vallejo D, Castro-Schez JJ, Glez-Morcillo C, Albusac J (2020) Multi-agent architecture for information retrieval and intelligent monitoring by UAVs in known environments affected by catastrophes. Eng Appl Artif Intell 87:103243 16. Jain K, Choudhury T, Kashyap N (2017) Smart vehicle identification system using OCR. In: 2017 3rd international conference on computational intelligence & communication technology (CICT), pp 1–6 17. Bhatnagar HV, Kumar P, Rawat S, Choudhury T (2018) Implementation model of Wi-Fi based smart home system. In: Proceedings on 2018 international conference on advances in computing and communication engineering, ICACCE 2018. doi: https://doi.org/10.1109/ICACCE.2018. 8441703
778
R. Kumar et al.
18. Tomar R (2019) Maintaining trust in VANETs using blockchain. Ada User J 40(4) 19. Tomar R, Sastry HG, Prateek M (2020) Establishing parameters for comparative analysis of V2V communication in VANET
Chapter 62
An Efficient Pneumonia Detection from the Chest X-Ray Images Rajdeep Chatterjee, Ankita Chatterjee, and Rohit Halder
1 Introduction
Corona-viruses, being a family of crown-shaped viruses, are responsible for creating acute respiratory symptoms, kidney failures, and even death with common flu-like symptoms. The typical flu symptoms confuse the doctors between the occurrence of regular flu or an infection by the crown-shaped viruses until it is too late and life-threatening. Another prominent symptom of the infection is Pneumonia. Pneumonia is itself life-threatening unless appropriately treated. In this infection, the air sacs are filled with fluids, and the patient has difficulty breathing. Pneumonia was reckoned to account for the death of almost 808,694 children under the age of 5 years in 2017. Pneumonia is caused by bacteria, fungi, or even viruses. Diagnosis of Pneumonia includes blood tests, Chest X-Ray, pulse oximetry, and a sputum test. However, analyzing the X-Ray images needs skilled professionals and is a time-consuming task. Especially during the outbreak of Pneumonia-related diseases, the process takes a much longer time due to the massive workload of the professionals and the precautionary measures implemented in hospitals and clinics. Several artificial intelligence-based algorithms are used for accurate analysis of the X-Ray images to assist the radiology experts [1, 2]. Previously, Regression Neural Networks and Vector Quantization were used to analyze and investigate chest diseases like pneumonia [3]. However, these algorithms failed to provide a desirable accuracy. Deep learning
techniques have been employed to provide a quality diagnosis of Pneumonia due to their implicit feature extraction and robust feature selection schemes.
Roadmap: Our paper has been organized as follows. In Sect. 2, we describe the related approaches. Section 3 gives the conceptual idea and the model architecture of our proposed approach. Section 4 explains the experimental set-up of our study. The detailed results are discussed in Sect. 5. Finally, we conclude in Sect. 6.
2 Related Works
Deep Learning models can play a vital role in analyzing medical health and can even detect tumors and lesions from medical images with human-level accuracy [4, 5]. However, the use of artificial intelligence in the medical domain was introduced in the nineteenth century. The detection of lymph nodes in the presence of low-variance surrounding structures from computer tomography scan images was addressed using a convolutional neural network [6], followed by interstitial lung disease classification, detection of thoracoabdominal lymph nodes [7], and analysis of patients from spinal lumbar magnetic resonance imaging (MRI) [8]. The Convolutional Neural Networks brought about a revolution in the Deep Learning paradigm [9]. Neural network architectures like VGG-16, VGG-19, and AlexNet have been developed for efficient multi-class classification problems [10, 11]. Transfer Learning became widely popular in medical image classification and data augmentation, which helps the model enhance performance even with a smaller and more complex dataset [12, 13]. Researchers have implemented a transfer learning-based approach for Pneumonia detection using chest X-ray images [14]. However, the quality of X-Ray images may vary at the time of input to the automated diagnosis applications for many reasons: a small number of available samples, the inexperience of the technician who takes the X-Ray image, device problems, etc. The transfer learning-based models used so far are also inefficient in terms of time and space complexity. Researchers feel motivated to explore different tools and techniques to design an efficient Chest X-Ray image classifier using new networks or by fine-tuning the existing networks. In this paper, we have shown a comparative study between the major transfer learning algorithms and put forward a lightweight, efficient model for pneumonia detection using chest X-rays. The few best-performing X-Ray image classification techniques have been given in Table 1.
Table 1 Results obtained from binary pneumonia chest X-Ray image classification

References | Methodology | Used dataset
[14] | CNN with transfer learning | Guangzhou women and children's medical center dataset
[15] | A 121-layer deep convolutional network | Chest X-ray14 dataset
[16] | Convolutional neural network | CheXpert
[17] | Faster region-based CNN (F-RCNN) | PASCAL VOC 2007 and 2012
[18] | Mask R-CNN | RSNA pneumonia dataset
[19] | Unsupervised fuzzy c-means classification learning | Chest X-ray

3 Conceptual Background
Deep learning-based algorithms, like CNNs (Eqs. 1 and 2), are used to solve a large variety of problems, such as image classification, object detection, and image segmentation. Building one's own deep neural network is preferable because one can understand its components with more clarity. However, it is not always feasible to implement for different reasons: (i) poor understanding of the different aspects of the network; (ii) inadequate data for training (learning) a model perfectly; (iii) hardware resource constraints for executing new networks.

x(t) ∗ h(t) = y(t)   (1)
X(f) H(f) = Y(f)   (2)
The alternative solution is to use pre-defined and validated deep neural networks. It addresses the first drawback, but these networks again have large numbers of parameters to train on a new task or problem. Also, they do not perform well with a small amount of training data. In this scenario, the appropriate solution is “Transfer Learning.” In transfer learning, we use pre-defined and pre-trained networks on a similar but new task. We discuss transfer learning in a later section. AlexNet is an improvement on the traditional convolutional neural networks. It was proposed in 2012. Since then, many deep nets have been developed to solve various types of problems. Each of the deep nets has its own merits and demerits. In this line, VGG is the successor of AlexNet, and different variants of VGG are available in practice. However, ResNet, MobileNet, and the latest EfficientNet are a few top-rated image classification deep neural networks with fewer hyper-parameters.
3.1 VGG16
VGG16 is a prevalent convolutional neural network architecture developed by the Visual Geometry Group at Oxford [20]. It was used to win the ILSVRC (ImageNet) competition in 2014 and is also known as OxfordNet. The advantage of VGG16 over AlexNet is that it focuses on convolution layers with 3 × 3 filters and a stride of 1, uses “same” padding, and applies max pooling layers with 2 × 2 filters and a stride of 2. It allows the
combination of convolution and max pool layers throughout the whole architecture. In the end, it has 2 FC (that is, fully connected layers), followed by a sigmoid/softmax for the output. In VGG16, the number “16” refers to 16 layers contributing to the weights.
3.2 VGG19 One of the widely used variants of the VGG family is VGG19. Again in VGG19, the “19” indicates 19 trainable layers, including 16 convolution layers and 3 fully connected layers.
3.3 ResNet50
Day by day, the problems to be solved become more complex. Deeper neural networks are recommended for such applications but are more challenging to train. Most deep nets experience the vanishing gradient problem. The Residual Network, commonly called ResNet [21, 22], is a suitable alternative to the problem mentioned above. ResNet50 has 50 layers and was used to win the ImageNet competition in 2015.
3.4 MobileNetV1
MobileNets are depthwise separable convolution architectures used to downsize the model size and the overall complexity of the network [23]. They are beneficial for mobile and embedded device-based applications. The authors have introduced two specific global hyper-parameters, which make an efficient balance between latency and accuracy. These hyper-parameters allow the developers to select an appropriately sized model for their application considering the problem constraints.
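To make the parameter saving concrete, the sketch below contrasts a standard convolution with the depthwise-plus-pointwise pair that depthwise separable architectures rely on, written with Keras layers from recent TensorFlow releases; the input shape and filter counts are arbitrary assumptions, not values from the paper.

```python
# Sketch comparing a standard convolution with a depthwise separable one
# (input shape and filter counts are assumed, not taken from the paper).
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(224, 224, 3))

# Standard convolution: each of the 64 filters spans all input channels at once.
standard = layers.Conv2D(64, kernel_size=3, padding="same", activation="relu")(inputs)

# Depthwise separable convolution: a per-channel 3x3 spatial filter followed by a
# 1x1 (pointwise) convolution that mixes channels, cutting parameters and multiply-adds.
depthwise = layers.DepthwiseConv2D(kernel_size=3, padding="same", activation="relu")(inputs)
separable = layers.Conv2D(64, kernel_size=1, activation="relu")(depthwise)

model = tf.keras.Model(inputs, [standard, separable])
model.summary()  # the summary lists both branches, so the parameter counts can be compared
```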
3.5 EfficientNetB3
In 2019, Google published a series of optimized deep neural network architectures that maximize accuracy and minimize computational complexity [24]. Generally, convolutional neural networks provide results using limited resources. The network is then scaled up to obtain higher accuracy with more resources. On the other hand, the EfficientNet family of architectures is developed through optimal balancing between the network depth, width, and resolution. Eventually, it leads the network to perform better, using fewer hyper-parameters and a low computational cost.
Table 2 Experimental configuration of binary pneumonia chest X-Ray image classification (Experiment-I)

Models | #Parameters (Millions) | Step_Size | #Epoch | Activation function | Optimizer
VGG16 | 134.26 | 100 | 100 | Sigmoid | Adam
VGG19 | 139.57 | 100 | 100 | Sigmoid | Adam
ResNet50 | 24.90 | 100 | 100 | Sigmoid | Adam
MobileNetV1 | 5.33 | 100 | 100 | Sigmoid | Adam
EfficientNetB3 | 13.40 | 100 | 100 | Sigmoid | Adam
The developers proposed a new uniform scaling technique that applies to all the network depth, width, and resolution dimensions to obtain higher-performing versions of the same network architecture. The baseline network EfficientNet “B0” has been scaled up to create EfficientNet “B1”, a system with roughly twice the processing power. The same scaling is applied successively to create EfficientNet B2, B3, B4, B5, B6, and B7. We have used EfficientNetB3 (EffNetB3) as it is the most optimal architecture in the EfficientNet series.
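As a rough numerical illustration of this compound scaling idea, the snippet below scales a placeholder baseline by the compound coefficients reported in the EfficientNet paper (α = 1.2, β = 1.1, γ = 1.15, so that α·β²·γ² ≈ 2); the baseline depth, width, and resolution values are assumptions, not the actual B0 configuration.

```python
# Illustrative compound scaling; alpha/beta/gamma are the coefficients reported in the
# EfficientNet paper, while the baseline numbers below are placeholders, not the real B0 config.
alpha, beta, gamma = 1.2, 1.1, 1.15  # depth, width, resolution multipliers (alpha*beta^2*gamma^2 ~ 2)

def compound_scale(depth, width, resolution, phi):
    """Scale network depth, width, and input resolution together by the coefficient phi."""
    return (round(depth * alpha ** phi),
            round(width * beta ** phi),
            round(resolution * gamma ** phi))

baseline = (18, 32, 224)  # assumed baseline layers, channels, and input size
for phi in range(4):      # phi = 0 keeps the baseline; each step roughly doubles the FLOPs
    print(f"B{phi}: depth, width, resolution =", compound_scale(*baseline, phi))
```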
4 Experimental Set-Up
4.1 Used Dataset
The used dataset1 has a total of 5232 chest X-ray images, which include 3883 pneumonia and 1349 normal images. Again, the 3883 pneumonia images are divided into 1345 viral and 2538 bacterial Pneumonia images. These 5232 chest X-ray images have been used for training the model. Furthermore, 624 chest X-ray images have been used for testing the model. Out of the 624 test images, the normal and pneumonia images are 234 and 390 (including 148 viral and 242 bacterial), respectively. We have employed 5 image augmentation techniques [25] to improve the numbers before training. The augmentation types are rotation (40°), width shift (0.2), height shift (0.2), zoom (0.2), and horizontal flip (True). We have implemented two separate experiments: (a) Experiment-I is a normal versus Pneumonia chest X-Ray image classification problem, and (b) Experiment-II is a normal versus bacterial Pneumonia versus viral Pneumonia chest X-Ray image classification problem. The experimental set-up for both experiments is given in Tables 2 and 3.
1 https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia.
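A minimal sketch of the augmentation settings listed above is given below, assuming Keras' ImageDataGenerator and a hypothetical directory layout with one sub-folder per class; the folder path, image size, and batch size are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch of the five augmentation settings listed above (folder path, image size,
# and batch size are hypothetical; this is not the authors' exact pipeline).
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rotation_range=40,        # rotation (40 degrees)
    width_shift_range=0.2,    # width shift (0.2)
    height_shift_range=0.2,   # height shift (0.2)
    zoom_range=0.2,           # zoom (0.2)
    horizontal_flip=True)     # horizontal flip (True)

train_generator = train_datagen.flow_from_directory(
    "chest_xray/train",       # assumed folder with NORMAL/ and PNEUMONIA/ sub-folders
    target_size=(224, 224),   # assumed input size
    batch_size=32,            # assumed batch size
    class_mode="binary")      # Experiment-I; Experiment-II would use class_mode="categorical"
```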
Table 3 Experimental configuration of ternary pneumonia chest X-Ray image classification (Experiment-II)

Models | #Parameters (Millions) | Step_Size | #Epoch | Activation function | Optimizer
VGG16 | 134.27 | 100 | 100 | Softmax | Adam
VGG19 | 139.58 | 100 | 100 | Softmax | Adam
ResNet50 | 26.73 | 100 | 100 | Softmax | Adam
MobileNetV1 | 5.33 | 100 | 100 | Softmax | Adam
EfficientNetB3 | 13.41 | 100 | 100 | Softmax | Adam
4.2 Transfer Learning
We have employed the transfer learning technique in building the classification models. The transfer learning technique jump-starts the learning process for a new problem by transferring knowledge from a related and already learnt problem [26]. The prime aim of transfer learning is to speed up the training process and boost the classification performance. The speedup comes from the prior knowledge gained in learning similar patterns or tasks. The previously learnt knowledge helps the current learning when the model is trained on smaller training samples. The inadequate data for training can be compensated by the transfer of knowledge acquired from a similar but massive dataset [27, 28]. Commonly, the loss output layer, which is the final layer used to make predictions, is replaced with a new loss output layer for our problem. The fine-tuning of this node is done to determine the training penalties. Penalties are the deviations between the actual class labels and the predicted output. However, one can also re-train more than the last layer if the problem requires more fine-tuning.
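One common way to realize this head-replacement recipe in Keras (available in recent TensorFlow releases) is sketched below, using EfficientNetB3 with ImageNet weights as the frozen base and a sigmoid output layer with the Adam optimizer, as in Table 2; this is a plausible reconstruction under those assumptions, not the authors' exact training code.

```python
# Plausible transfer-learning sketch (not the authors' exact code): a frozen pre-trained
# base with a new sigmoid output layer, matching the activation and optimizer in Table 2.
import tensorflow as tf
from tensorflow.keras import layers

base = tf.keras.applications.EfficientNetB3(
    include_top=False, weights="imagenet", input_shape=(300, 300, 3))  # input size assumed
base.trainable = False  # keep the transferred (pre-trained) weights fixed at first

model = tf.keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),  # replacement loss/output layer for the binary task
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# If the problem needs more fine-tuning, some top layers of `base` can be unfrozen
# and training continued with a small learning rate.
```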
4.3 System Configuration
The paper is implemented using Python 3.6 and Tensorflow (GPU) 1.14 on an Intel(R) Core(TM) i7-9750H CPU (9th Gen.) 2.60 GHz, 16 GB RAM and 6 GB NVIDIA GeForce RTX 2060 with a 64-bit Windows 10 Home operating system.
5 Result and Analysis
Five popular deep neural network architectures have been implemented on the used dataset in two separate Experiments, I and II. The latest, EfficientNetB3, outperforms the other used alternatives under the same experimental configurations. We have used
Table 4 Results obtained from binary pneumonia chest X-Ray image classification (Experiment-I)

Models | Accuracy (%) | f1-score | AUC
VGG16 | 89.90 | 0.88 | 0.87
VGG19 | 90.22 | 0.89 | 0.88
ResNet50 | 89.90 | 0.89 | 0.87
MobileNetV1 | 92.47 | 0.92 | 0.91
EfficientNetB3 | 93.00 | 0.92 | 0.92
Table 5 Results obtained from ternary pneumonia chest X-Ray image classification (Experiment-II)

Models | Accuracy (%) | f1-score | AUC
VGG16 | 81.09 | 0.79 | 0.85
VGG19 | 86.70 | 0.85 | 0.90
ResNet50 | 75.16 | 0.65 | 0.77
MobileNetV1 | 85.58 | 0.84 | 0.88
EfficientNetB3 | 88.78 | 0.87 | 0.90
the holdout training scheme for our model building. The training-set and test-set partition of the chest X-ray images is as per the source repository recommendation. Our study follows the anticipated direction mentioned in the literature. This empirical study observes that the EfficientNet model is suitable for complex image classification problems such as biomedical image classification and makes a subtle balance between accuracy and computational cost. The obtained results provided consistent and stable performance and were validated using three different quality metrics: accuracy, f1-score, and AUC (area under the curve). In all the cases, EfficientNetB3 gives better results than the other used classification models (see Table 4). The performance of EfficientNetB3 is slightly better than that of MobileNetV1, which is also a lightweight architecture. However, the difference in performance is visible in Experiment-II (classification between normal, bacterial, and viral pneumonia chest X-Ray images), where EfficientNetB3 significantly surpasses the results obtained from the VGG16, VGG19, ResNet50, and MobileNetV1 models (refer to Table 5). The results show that EfficientNetB3 is well suited for more complex image classification problems and performs better while using the same experimental configuration. A few samples of the chest X-Ray image classification are shown in Fig. 1. The classification accuracies in both Tables 4 and 5 are in percentage for better understanding. However, the obtained accuracies are scaled to 0–1 to maintain uniformity with the f1-score and AUC metric values in Figs. 2 and 3. The mean accuracies obtained from Experiments I and II are 91.10% and 83.46%, respectively. Figure 4 shows the comparison between the accuracy and model sizes (number of parameters in millions) obtained from the used deep learning models in this paper.
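For reference, the three quality metrics can be computed with scikit-learn as in the sketch below; the label and probability arrays are placeholders, not results from the paper.

```python
# Sketch of computing the three reported metrics with scikit-learn (arrays are placeholders).
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

y_true = np.array([0, 1, 1, 0, 1])            # ground-truth labels (placeholder)
y_prob = np.array([0.2, 0.9, 0.7, 0.4, 0.6])  # predicted probabilities (placeholder)
y_pred = (y_prob >= 0.5).astype(int)          # thresholded class predictions

print("Accuracy (%):", accuracy_score(y_true, y_pred) * 100)
print("f1-score    :", f1_score(y_true, y_pred))       # binary f1, as in Experiment-I
print("AUC         :", roc_auc_score(y_true, y_prob))  # area under the ROC curve
```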
Fig. 1 Sample prediction results from the chest X-Ray image classification (Experiment-I)
Fig. 2 Graphical presentation of the results obtained from Experiment-I (Table 4)
Again, if we consider the trade-off among all five deep neural network architectures, EfficientNetB3 leads the classification accuracy chart with a minimal number of model parameters in both experiments.
6 Conclusion
AI-enabled medical diagnosis is the future of clinical diagnosis. As it deals with human lives, an automated diagnosis could be considered a supplementary opinion for medical experts. However, the development of a lightweight but highly accurate
Fig. 3 Graphical presentation of the results obtained from Experiment-II (Table 5)
Fig. 4 Comparison between model size (number of parameters) and accuracy for Experiments-I and II
medical image classifier is a challenge. We have used different popular transfer learning (deep) networks to build an efficient predictor for chest X-Ray images. The latest EfficientNet (B3) variant performs well on the used dataset and achieves 93% and 88.78% test accuracies for the binary and ternary chest X-Ray image classification experiments, respectively. To the best of our knowledge, the obtained results on pneumonia versus normal are reportedly the best and are an improvement over the source research work.2 In the future, we will examine the damaged region of the lungs using suitable image segmentation, which can help doctors pinpoint the gravity of the infection.
2 https://www.cell.com/cell/fulltext/S0092-8674(18)30154-5.
References 1. Avni U, Greenspan H, Konen E, Sharon M, Goldberger J (2010) X-ray categorization and retrieval on the organ and pathology level, using patch-based visual words. IEEE Trans Med Imag 30(3):733–746 2. Pattrapisetwong P, Chiracharit W (2016) Automatic lung segmentation in chest radiographs using shadow filter and multilevel thresholding. In: 2016 international computer science and engineering conference (ICSEC). IEEE, pp 1–6 3. Er O, Yumusak N, Temurtas F (2010) Chest diseases diagnosis using artificial neural networks. Exp Syst Appl 37(12):7648–7655 4. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, Van Der Laak JA, Van Ginneken B, Sánchez CI (2017) A survey on deep learning in medical image analysis. Med Imag Anal 42:60–88 5. Brunetti A, Carnimeo L, Trotta GF, Bevilacqua V (2019) Computer-assisted frameworks for classification of liver, breast and blood neoplasias via neural networks: a survey based on medical images. Neurocomputing 335:274–298 6. Roth HR, Lu L, Seff A, Cherry KM, Hoffman J, Wang S, Liu J, Turkbey E, Summers RM (2014) A new 2.5D representation for lymph node detection using random sets of deep convolutional neural network observations. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 520–527 7. Shin HC, Roth HR, Gao M, Lu L, Xu Z, Nogues I, Yao J, Mollura D, Summers RM (2016) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imag 35(5):1285–1298 8. Jamaludin A, Kadir T, Zisserman A (2016) Spinenet: automatically pinpointing classification evidence in spinal MRIs. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 166–175 9. Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188 10. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: International conference on learning representations 11. Iandola FN, Han S, Moskewicz MW, Ashraf K, Dally WJ, Keutzer K (2016) SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360

g(x, y) = 1, if f(x, y) > γ; 0, Otherwise   (5)
The global thresholding value is found from the defined algorithm. Then the threshold value is applied so that each pixel takes only one of two values, 0 or 1. Binarization is applied for the reduction of complexity (Fig. 4).
2.3 Closing Operation
Structuring elements with eight-connected components are applied to close edges in the binary edge image [11]. By using four- or six-connected components, the system cannot find an accurate result. The dilated image is filtered using an erosion technique on
Fig. 5 Closing operation on edges
Fig. 6 Text region labeling
smaller connected components. The output, which contains the text candidate regions, will be used in the next step (Fig. 5).
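A small OpenCV sketch of this closing-and-filtering step is given below, assuming a binary edge image from the previous stage; the file name, kernel size, and minimum-area threshold are illustrative assumptions rather than values from the paper.

```python
# Illustrative closing operation on a binary edge image (file name, kernel size,
# and area threshold are assumptions).
import cv2
import numpy as np

edges = cv2.imread("binary_edges.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input file

# Dilation followed by erosion (a morphological closing) joins broken edge fragments.
kernel = np.ones((3, 3), np.uint8)  # 8-connected structuring element
closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)

# Drop small connected components that are unlikely to be text candidates.
n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(closed, connectivity=8)
filtered = np.zeros_like(closed)
for label in range(1, n_labels):              # label 0 is the background
    if stats[label, cv2.CC_STAT_AREA] >= 50:  # assumed minimum area
        filtered[labels == label] = 255
```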
2.4 Text Regions Labeling
We have used an algorithm which can find characters from connected components when the following characteristics are met. The connected components of the edge image are partitioned according to the following conditions (Fig. 6).
1. Center width of the bounding box will be in between 0.01 and 0.95.
2. Bounding box's center height will be in between 0.2 and 0.8.
3. Width vs. height ratio must be less than 10.
4. Bounding box's width: 0.3 image height and 30 pixels.
2.5 Character Segmentation for Localization
The image is scanned from top to bottom [12]. If the summation of the pixels in a row is zero, then a line is extracted. This process is continued until the last line. After extracting the lines, every line is scanned from left to right. If the summation of the pixels in a column is zero, then every character is extracted from each line. Our proposed system only extracts lines, but the characters are localized for the recognition.
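The row and column scanning described above amounts to projection profiling; a hedged NumPy sketch is shown below, assuming a binary image in which text pixels are 1 and background pixels are 0.

```python
# Projection-profile sketch of the line/character segmentation described above
# (assumes `binary` contains 1 for text pixels and 0 for background).
import numpy as np

def split_on_empty(profile):
    """Return (start, end) index pairs of runs whose projection sum is non-zero."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                    # a text run begins
        elif value == 0 and start is not None:
            runs.append((start, i))      # an all-zero row/column ends the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs

binary = np.zeros((100, 300), dtype=np.uint8)  # placeholder image

lines = split_on_empty(binary.sum(axis=1))     # top-to-bottom scan: row sums
for top, bottom in lines:
    chars = split_on_empty(binary[top:bottom].sum(axis=0))  # left-to-right scan: column sums
    # each (left, right) pair in `chars` localizes one character inside the text line
```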
2.6 Comparison of Characters Using Template Matching Technique
Templates that contain the car license plate information are stored in document image format. The template is usually visualized as a line of text, and the line is partitioned into several regions for reducing complexity. Every extracted line is serially compared with the stored line template. Template lines are localized in regions of characters. If every region is almost the same as the template, then the system concludes that the exact car plate name is found.
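One way to realize this comparison is normalized cross-correlation template matching, as sketched below with OpenCV; the file names and the 0.8 acceptance score are assumptions rather than values from the paper.

```python
# Hedged template-matching sketch (file names and the 0.8 score threshold are assumptions).
import cv2

region = cv2.imread("plate_line_region.png", cv2.IMREAD_GRAYSCALE)       # localized text region
template = cv2.imread("stored_line_template.png", cv2.IMREAD_GRAYSCALE)  # stored template region

# Normalized cross-correlation: values close to 1 mean the region matches the template.
result = cv2.matchTemplate(region, template, cv2.TM_CCOEFF_NORMED)
_, max_score, _, _ = cv2.minMaxLoc(result)

if max_score >= 0.8:
    print("Region matches the stored template; plate name found.")
```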
3 Experimental Results
A total of 60 car plate images are captured through digital devices in JPG or PNG format. They are stored in our dataset along with some available images owned by Larxel to apply our proposed methodology. The car plate images were not all of the same type. Some were distorted, and those types of images were restored. Table 1 shows the result summary. The components are detection of edge, text regions labeling, character extraction, character segmentation, and overall performance, as well as the accuracy rate. The accuracy rate can be found using the following equation:

Accuracy rate = (Number of correctly processed samples / Number of all test samples) × 100%   (6)
The suggested method can precisely extract texts though the car license plate images are in varied positions, colors, illumination conditions, alignment modes, and character sizes.

Table 1 Performance analysis of proposed technique

Component | Accuracy (%)
Detection of edge | 97
Text regions labeling | 99
Character extraction | 97.5
Character segmentation for localization | 98
Overall performance | 98.5
4 Comparison with Other Text Extraction Techniques
The proposed system has already been compared with existing systems [13]. The first method focused only on finding inner, outer, and inner-outer corners. The second method extracts text by identifying edges at different alignments, i.e., 0, 45, 90, and 135 degrees, and assembling these at different heights. The difficulty is to identify the edges at different alignments. By using connected component variance (CCV), the problem can be reduced.
5 Conclusion
In this paper, a method has been proposed which is appropriate for obtaining the car license plate name. The input images can have different colors, sizes, and different alignment modes. The overall accuracy is increased in contrast with other methods. While the car is moving, the car plate image gets distorted. In our future work, we will focus on extracting the car plate name from different positions of the car with the help of multiple cameras.
References 1. Dalarmelina NDV, Teixeira MA, Meneguette RI (2020) A real-time automatic plate recognition system based on optical character recognition and wireless sensor networks for ITS. Sensors 20(1):55 2. Rasheed S, Naeem A, Ishaq O (2012) Automated number plate recognition using hough lines and template matching. In: Proceedings of the world congress on engineering and computer science, vol 1, pp 24–26 (October, 2012) 3. Saleem N, Muazzam H, Tahir HM, Farooq U (2016) Automatic license plate recognition using extracted features. In: 2016 4th international symposium on computational and business intelligence (ISCBI), pp 221–225. IEEE, 2016 4. Silva SM, Jung CR (2017) Real-time brazilian license plate detection and recognition using deep convolutional neural networks. In 2017 30th SIBGRAPI conference on graphics, patterns and images (SIBGRAPI), pp 55–62. IEEE, 2017 5. Bhatnagar HV, Kumar P, Rawat S, Choudhury T (2018) Implementation model of Wi-Fi based smart home system. In: Proceedings on 2018 international conference on advances in computing and communication engineering, ICACCE 2018. Doi: https://doi.org/10.1109/ICACCE.2018. 8441703 6. Shree R, Choudhury T, Gupta SC, Kumar P (2017) KAFKA: the modern platform for data management and analysis in big data domain. In: 2017 2nd international conference on telecommunication and networks (TEL-NET), pp 1–5 7. Tomar R (2019) Maintaining trust in VANETs using Blockchain. Ada User J 40(4) 8. Tomar R, Sastry HG, Prateek M (2020) Establishing parameters for comparative analysis of V2V communication in VANET 9. Samarabandu J, Liu X (2006) An edge-based text region extraction algorithm for indoor mobile robot navigation. Int J Sign Proces 3(4):273–280
10. Wolf C, Jolion JM, Chassaing F (2002) Text localization, enhancement and binarization in multimedia documents. In: Object recognition supported by user interaction for service robots, vol 2, pp 1037–1040. IEEE, August, 2002 11. Agnihotri L, Dimitrova N (1999) Text detection for video analysis. In: Proceedings IEEE workshop on content-based access of image and video libraries (CBAIVL’99), pp 109–113. IEEE 12. Zhang PY, Li CH (2009) Automatic text summarization based on sentences clustering and extraction. In: 2009 2nd IEEE international conference on computer science and information technology, pp 167–170. IEEE, 2009 13. Sriyanto S, Nguyen PT, Siboro BAH, Iswanto I, Rahim R (2020) Recognition of vehicle plates using template matching method. J Critic Rev 7(1):86–90
Chapter 64
Impact of Deep Learning on Arts and Archaeology: An Image Classification Point of View Rajdeep Chatterjee, Ankita Chatterjee, and Rohit Halder
1 Introduction
The past few decades have witnessed the destruction of archaeological monuments due to pollution and human-made activities. These are subjected to irreversible damage. Archaeologists often discover new sites; however, it is hard for them to correctly predict in which regime the architecture was built. It is often hard for the archaeologists to manually visit and excavate a site due to the rough terrain. Computer vision can play a significant role in analyzing architectural designs and hidden patterns, and in mapping a similarity score against the architectures of different regimes. This can really help predict which regime (ancient time) the newly discovered architecture falls under. Apart from that, modules attached to drones can be beneficial in studying an archaeological site. Computer vision algorithms can also help develop a 3-dimensional restoration (replica) of the damaged monuments and architectures. The government can later use these models for proper maintenance and reconstruction of the monuments. Arts and antiquities worth over 6 billion dollars are illegally traded around the globe without the notice of agencies [1–3]. The United Nations and the world governments have formulated various laws, but those policies are not foolproof and effective [4, 5]. The development of analytical methods in the past few decades opened the gateway to capture information relating to the past and draw proper insights for a useful
contribution to the analysis, capitalization, conservation, and circulation of different works of artists over the ages [6–9]. Apart from that, the proper classification of artistic images and antiquities could contribute to less smuggling of such valuable items. A mobile application can be used to fetch sufficient information about any artistic creation or antique, or to check whether any report has been filed regarding the pilfering of such items. This paper has been organized as follows: in Sect. 2, we describe the related techniques and approaches. In Sect. 3, the conceptual background is discussed. Section 4 explains the used datasets and the experimental setup of our work. The results are discussed in Sect. 5. Finally, we conclude the paper in Sect. 6.
2 Related Works
Digital image processing is multidimensional. Researchers worldwide have been working on different algorithms in various domains for proper classification of images. Architecture and art analysis is a protean field of work involving algorithms related to computer vision. History speaks of using sophisticated imaging followed by digital imaging, especially in the field of arts. The 19th century witnessed the use of X-rays [10], infrared photography, and reflectography to extract features based on pigment composition and underdrawings [11, 12]. The soft computing algorithms that have been developed can be interpreted as modifications of previously working systems, where the main alterations revolved around the use of features in spatial domains. Classification of artistic images and their analysis was predominantly based on problems regarding forgery detection [13, 14] and painting styles [15]. Herik, in 2000, worked on the extraction of the color and texture features corresponding to an artistic image. The methodologies included analyzing color histograms, Fourier spectra, Hurst coefficients, and statistics relating to image intensity. The potential of neural networks was used for the classification of impressionistic paintings [16]. In 2006, Yelizaveta proposed a new domain of classification of artistic images based on the use of brushstrokes of the artists [17]. In 2013, G. Carneiro opened a new dimension in automating the identification of visual classes in an artwork [18].
3 Conceptual Background
Deep learning-based algorithms are used to solve a large variety of problems, such as image classification, object detection, and image segmentation. Building one's own deep neural network is preferable because one can understand its components with more clarity. However, it is not always feasible to implement for different reasons: (i) poor understanding of the different aspects of the network; (ii) inadequate data for training (learning) a model perfectly; (iii) hardware resource constraints for executing new networks.
The alternative solution is to use pre-defined and validated deep neural networks. It addresses the first drawback, but these networks again have large numbers of parameters to train on a new task or problem. Also, they do not perform well with a small amount of training data. In this scenario, the appropriate solution is “Transfer Learning.” In transfer learning, we use not only pre-defined but also pre-trained networks on a similar but new task. We discuss transfer learning in the next section.
3.1 Transfer Learning
We have employed the transfer learning technique in building the classification models. The transfer learning technique jump-starts the learning process for a new problem by transferring knowledge from a related and already learnt problem [19]. The prime aim of transfer learning is to speed up the training process and boost the classification performance. The speedup comes from the prior knowledge gained in learning similar patterns or tasks. The previously learnt knowledge helps the current learning when the model is trained on smaller training samples. The inadequate data for training can be compensated by the transfer of knowledge, which is acquired from a similar but massive dataset [20, 21]. Commonly, the loss output layer, which is the final layer used to make predictions, is replaced with a new loss output layer for our problem. The fine-tuning of this node is done to determine the training penalties. Penalties are the deviations between the actual class labels and the predicted output. However, one can also re-train more than the last layer if the problem requires more fine-tuning.
3.2 MobileNetV1
MobileNets are depthwise separable convolution architectures used to downsize the model size and the overall complexity of the network [22]. They are beneficial for mobile and embedded device-based applications. The authors have introduced two specific global hyper-parameters, which make an efficient balance between latency and accuracy. These hyper-parameters allow the developers to select an appropriately sized model for their application considering the problem constraints.
3.3 MobileNetV2 MobileNet is a lightweight model with fewer parameters to deal with resource constraints. The MobileNetV2 architecture uses an inverted residual structure where the
input and output of the residual block are thin bottleneck layers, in contrast to traditional residual models [23]. MobileNetV2 is an improvement over its predecessor MobileNetV1 and enhances state-of-the-art mobile vision applications, including image classification, object detection, and semantic image segmentation.
3.4 EfficientNetB0
In 2019, Google published a series of optimized deep neural network architectures that maximize accuracy and minimize computational complexity [19]. Generally, convolutional neural networks provide results using limited resources. The network is then scaled up to obtain higher accuracy with more resources. On the other hand, the family of EfficientNet architectures is developed through an optimal balancing between the network depth, width, and resolution. Eventually, it leads the network to perform better, using fewer hyper-parameters and a low computational cost. The developers proposed a new uniform scaling technique that applies to all the dimensions of the network depth, width, and resolution to obtain higher-performing versions of the same network architecture. The baseline network EfficientNet “B0” has been scaled up to create EfficientNet “B1”, a system with roughly twice the processing power. The same scaling is applied successively to create EfficientNet B2, B3, B4, B5, B6, and B7. We have used EfficientNetB0 (EffNetB0) as it has the minimum number of parameters in the EfficientNet series.
4 Datasets Preparation and Experiments 4.1 Datasets We have used two different datasets for two separate experiments: Experiment-I and Experiment-II. In Experiment-I, we have prepared our sculpture dataset where images are drawn from the web. It includes a total of 207 train images and 54 images for both test and validation. However, the image test-set is different from the validation-set. The Experiment-I dataset contains three distinct types of sculptures: (a) Egyptian, (b) Indic, and (c) Roman (see samples in Fig. 1). The Experiment-II is implemented using the dataset available at Kaggle.1 It has about 9000 images into five categories: (a) drawings, (b) engraving, (c) iconography, (d) painting, and (e) sculpture (see samples in Fig. 2). We have used 7721, 1107 and 265 images for the training-set, validation and test, respectively.
1 Art-Image dataset: https://www.kaggle.com/thedownhill/art-images-drawings-painting-sculpture-engraving.
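Assuming the sculpture images are arranged in one folder per class and split into train, validation, and test directories, the small script below checks the split sizes stated above; the folder names and file extension are hypothetical.

```python
# Hypothetical check of the sculpture dataset split sizes (folder layout is assumed).
from pathlib import Path

root = Path("sculpture_dataset")          # assumed: train/, validation/, test/ sub-folders
classes = ["egyptian", "indic", "roman"]  # the three sculpture categories

for split in ["train", "validation", "test"]:
    counts = {name: len(list((root / split / name).glob("*.jpg"))) for name in classes}
    print(split, counts, "total:", sum(counts.values()))  # expected: 207 train, 54 val, 54 test
```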
Fig. 1 Sample images from the experiment-I dataset
Fig. 2 Sample images from the experiment-II Kaggle arts dataset
4.2 Image Pre-processing and Augmentation
A few of the images collected from the web for Experiment-I are in PNG format, which has 4 channels: RGB and an alpha channel. We have converted those PNG formatted images to 3-channel JPEG formatted images. In Experiment-II, there are no such issues. As the number of images for Experiment-I is very small, we have employed 5 image augmentation techniques to improve the numbers before training. The augmentation types are rotation (40°), width shift (0.2), height shift (0.2), zoom (0.2), and horizontal flip (True). Also, we have done a similar type of image augmentation, such as re-scaling (1./255), rotation (40°), and zoom (0.2), for Experiment-II [24].
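A minimal sketch of this channel conversion with Pillow is given below; the folder names are placeholders and error handling is omitted.

```python
# Sketch of converting 4-channel PNG images to 3-channel JPEGs with Pillow (paths are placeholders).
from pathlib import Path
from PIL import Image

source_dir = Path("experiment1_png")   # hypothetical folder of downloaded PNG images
target_dir = Path("experiment1_jpeg")
target_dir.mkdir(exist_ok=True)

for png_path in source_dir.glob("*.png"):
    image = Image.open(png_path).convert("RGB")            # drops the alpha channel
    image.save(target_dir / (png_path.stem + ".jpg"), "JPEG")
```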
4.3 System Configuration
The paper is implemented using Python 3.6 and Tensorflow (GPU) 1.14 on an Intel(R) Core(TM) i7-9750H CPU (9th Gen.) 2.60 GHz, 16 GB RAM and 6 GB NVIDIA GeForce RTX 2060 with a 64-bit Windows 10 Home operating system.
5 Results Discussion and Analysis
Our actual aim is to develop a mobile app to assist common people in detecting different types and origins of the arts. In that direction, we have focused on lightweight deep learning models such as MobileNets and EfficientNetB0. The total numbers of used parameters are 5,331,119, 4,622,403 and 6,413,983 for MobileNetV1, MobileNetV2 and EfficientNetB0, respectively. The models are trained for 50 epochs with a step size of 100. We have evaluated the results based on Accuracy [%], mean f1-score [0–1] and AUC [0–1]. The values for these metrics are considered better as they tend to 100, 1, and 1, respectively. The results are given in Tables 1 and 2. It is observed that all three lightweight image classification models are competitive with one another. However, EfficientNetB0 provides a better result for both the Experiment-I dataset and the Kaggle arts dataset
Table 1 Results obtained from ternary Egyptian/Indic/Roman sculptures classification (Experiment-I)

Models | Accuracy (%) | f1-score | AUC
MobileNetV1 | 92.59 | 0.92 | 0.95
MobileNetV2 | 94.44 | 0.94 | 0.96
EfficientNetB0 | 98.15 | 0.98 | 0.99
Table 2 Results obtained from Kaggle Arts: drawings/engraving/iconography/painting/sculpture classification (Experiment-II)

Models | Accuracy (%) | f1-score | AUC
MobileNetV1 | 96.98 | 0.94 | 0.97
MobileNetV2 | 96.98 | 0.95 | 0.97
EfficientNetB0 | 97.36 | 0.96 | 0.98
Fig. 3 Confusion matrices obtained from experiment-I (left) and experiment-II (right)
with a 5-class classification problem. The confusion matrices and the accuracy plots are shown in Figs. 3 and 4 for Experiment-I and Experiment-II, respectively. The validation accuracies obtained from both experiments are very high from the first epoch. Experiment-I needs more data related to sculptures from different origins; it is a data problem. The Experiment-II dataset has many classes that have a close resemblance (e.g., drawings and engraving, paintings and iconography). We have tuned our model to the best of our ability. Some sample predictions have been given in Figs. 5 and 6.
Fig. 4 Accuracy plots obtained from experiment-I (left) and experiment-II (right)
Fig. 5 Sample predictions from experiment-I
Fig. 6 Sample predictions from experiment-II
6 Conclusion
Computer vision can help in monitoring and analyzing different arts and sculptures based on their categories and origin. Lightweight mobile applications empowered with deep learning capability can detect and recognize different arts and sculptures in real time without the help of any subject specialists. It helps the law enforcement agencies to monitor and prevent the illegal trade of arts. Again, it can assist the local administration of an excavation site in identifying the origin of a piece of sculpture. This paper aims to draw the computer vision community's attention to developing more suitable applications for arts and archaeology-related problems. In this paper, our model achieves 98.15% and 97.36% test accuracies for Experiment-I and Experiment-II, respectively. These results are reportedly the best for the used datasets. The possibilities of deep learning-based computer vision applications are massive in arts and archaeology. In the future, we will extend the work to the semantic and instance segmentation of objects in a given sculpture, wall and cave painting, etc.
References 1. Awakening AC (2019) Art crime: exposing a panoply of theft, fraud and plunder. The Palgrave Handbook on Art Crime p 1 2. Martsiushevskaya E, Ostroga V (2017) Smuggling as a crime of international character: concept, characteristics, qualifications 3. Zubrow EB (2016) Archaeological cultural heritage: a consideration of loss by smuggling, conflict or war. In: The artful economist. Springer, pp 215–226 4. Ollus N (2018) The united nations protocol to prevent, suppress and punish trafficking in persons, especially women and children: a tool for criminal justice personnel. Resour Mater Ser 62 5. Allain J (2015) No effective trafficking definition exists: domestic implementation of the palermo protocol. In: The law and slavery. Brill Nijhoff, pp 265–294 6. Gaffney V (2017) In the kingdom of the blind: visualization and e-science in archaeology, the arts and humanities. In: The virtual representation of the past. Routledge, pp 125–133 7. Mitchell J, Odio S, Garcia DH (2015) Computer-vision content detection for sponsored stories. US Patent 9135631 8. Albrecht CM, Fisher C, Freitag M, Hamann HF, Pankanti S, Pezzutti F, Rossi F (2019) Learning and recognizing archeological features from lidar data. In: 2019 IEEE international conference on big data (Big Data). IEEE, pp 5630–5636 9. Khaloo A, Lattanzi D (2015) Extracting structural models through computer vision. Struct Congr, 538–548 10. Pelagotti A, Del Mastio A, De Rosa A, Piva A (2008) Multispectral imaging of paintings. IEEE Sig Process Mag 25(4):27–36 11. Barni M, Pelagotti A, Piva A (2005) Image processing for the analysis and conservation of paintings: opportunities and challenges. IEEE Sig Process Mag 22(5):141–144 12. Berezhnoy I, Postma E, van den Herik J (2007) Computer analysis of van gogh’s complementary colours. Pattern Recognit Lett 28(6):703–709 13. Li J, Yao L, Hendriks E, Wang JZ (2011) Rhythmic brushstrokes distinguish van gogh from his contemporaries: findings via automated brushstroke extraction. IEEE Trans Pattern Anal Mach Intell 34(6):1159–1176
14. Johnson CR, Hendriks E, Berezhnoy IJ, Brevdo E, Hughes SM, Daubechies I, Li J, Postma E, Wang JZ (2008) Image processing for artist identification. IEEE Sig Process Mag 25(4):37–48 15. Graham DJ, Friedenberg JD, Rockmore DN, Field DJ (2010) Mapping the similarity space of paintings: image statistics and visual perception. Vis Cognit 18(4):559–573 16. van den Herik HJ, Postma EO (2000) Discovering the visual signature of painters. In: Future directions for intelligent systems and information sciences. Springer, pp 129–147 17. Yelizaveta M, Tat-Seng C, Ramesh J (2006) Semi-supervised annotation of brushwork in paintings domain using serial combinations of multiple experts. In: Proceedings of the 14th ACM international conference on Multimedia, pp 529–538 18. Carneiro G, Da Silva NP, Del Bue A, Costeira JP (2012) Artistic image classification: an analysis on the printart database. In: European conference on computer vision. Springer, pp 143–157 19. Tan C, Sun F, Kong T, Zhang W, Yang C, Liu C (2018) A survey on deep transfer learning. In: International conference on artificial neural networks. Springer, pp 270–279 20. Huh M, Agrawal P, Efros AA (2016) What makes imagenet good for transfer learning? arXiv preprint arXiv:1608.08614 21. Zoph B, Yuret D, May J, Knight K (2016) Transfer learning for low-resource neural machine translation. arXiv preprint arXiv:1604.02201 22. Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 23. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen LC (2018) Mobilenetv2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4510–4520 24. Shorten C, Khoshgoftaar TM (2019) A survey on image data augmentation for deep learning. J Big Data 6(1):60
Author Index
A Adnan, Muhamad Hariz M., 771 Agarwal, Ayush, 111 Aggarwal, Archit, 385 Aggarwal, Garima, 385 Agrawal, Nishchay, 117 Ahluwalia, Anmol Singh, 649 Ahmed, Shameem, 345 Ali Hossain, Md., 791 Al Sefat, Abdullah, 163 Ansari, Azim Uddin, 111
B Balabantaray, Rakesh Chandra, 635 Balakrishnan, Kannan, 457 Barman, Anwesha Ujjwal, 495 Barthwal, Meghna, 285 Barthwal, Varun, 153 Bepery, Chinmay, 759 Bharathi Kannan, B., 295 Bhute, Avinash, 465 Biswas, Md. Alif, 759 Breja, Manvi, 129, 139
C Chahande, Manisha, 255, 299 Chakraborty, Partha, 507 Chakraborty, Runu, 199 Chakraborty, Swarnendu Kumar, 55 Chatterjee, Ankita, 779, 801 Chatterjee, Rajdeep, 779, 801 Chatterjee, Tanusree, 607 Chauhan, Rahul, 521 Choudhury, Tanupriya, 1, 13, 225
D Daniel, A., 295 DasBit, Sipra, 607 Das, Niharika, 267 Datta, Goutam, 399 Desai, Nishq Poorav, 495 Dey, Raghunath, 635 Dhanasekaran, S., 595 Dhawan, Sanjeev, 329 Dubey, Ashwani Kumar, 255, 299 Dumka, Ankur, 213 Dutta, Abinash, 621 Dutta, Suparna, 713
G Garg, Keshav, 285 Garg, Varsha, 581 Gaur, Deepak, 553 Gaur, Nitin, 667 Gautam, Aishwarya Singh, 55 Gautam, Anjul, 111 Goel, Ruchi, 111 Gorai, Apurba Kumar, 607 Goswami, Mrinal, 285 Gupta, Kusum, 399 Gupta, Shashikant, 737 Gupta, Shruti, 649 Gupta, Sugandha, 693
H Habib, Sobia, 357 Halder, Rohit, 779, 801 Hasan, Nazmul, 727 Hassan, Mohd Fadzil, 771 Hazra, Sudipta Kumar, 199
I Imran, Faisal, 791 Islam, Md. Majedul, 727 Islam, S. M. Taohidul, 759
J Jaglan, Vivek, 737 Jain, Rachna, 13, 667 Jain, Richa, 255, 299 Jain, Sanjay Kumar, 129 Jain, Sourabh, 421 Jana, Susovan, 713 Johri, Prashant, 667 Joshi, Nisheeth, 399 Joshi, R. C., 521
K Kapoor, Rajiv, 357 Kashyap, Kanchan Lata, 267, 495 Kaushik, Baijnath, 483 Khan, Yusera Farooq, 483 Khapre, Shailesh Pancham, 567 Kotak, Akshay, 539 Krishna Narayanan, S., 595 Kumar, Amit, 111 Kumar, Arvind, 667 Kumar, Divyansh, 73 Kumar, Manoj, 241 Kumar, Rishi, 771 Kumar, Sachin, 553 Kumar, Shubham, 213 Kushwaha, Pradeep Kumar, 313
L Laddha, Aarti, 27 Lohani, Bhanu Prakash, 313
M Mahajan, Amit, 539 Majumdar, Rana, 13 Majumdar, Sujoy, 713 Malakar, Samir, 345 Malhotra, Sona, 329 Malik, Neha, 737 Mall, Adarsh, 567 Mal, Sandip, 267 Mehedi, S. K. Tanzir, 163 Mehta, Jahnavi M., 745 Mishra, Kamta Nath, 35 Mishra, Ved Prakash, 35
Author Index Mishra, Vijaya, 255, 299 Mohan, Karnati, 445 Mohapatra, Puspanjali, 681 Mukherjee, Saswati, 713 Munson, Andreas, 27 Muthu, T. Sudalai, 1
N Nag, Medha, 713 Nahar, Jebun, 727 Nair, Aswin Ramachandran, 27 Nair, Binesh, 409 Narayan, Neetu, 73 Narender, 329 Nawjis, Nafiul, 163 Negi, Sarita, 83
P Pal, Chirasree, 607 Palkar, Bhakti, 539 Pandey, Dev Kumar, 295 Pant, Suman, 117 Panwar, Neelam, 83 Parihar, Ashish Singh, 55 Patel, Amit S., 99 Patel, Nishit, 539 Patel, Vaishali P., 99 Patil, Rakesh S., 465 Patil, Suchitra, 745 Paul, Binu, 457 Paul, Debdeep, 433 Petiwala, Fatima Farid, 173 Petwal, Hemant, 187 Piri, Jayashree, 681 Prasad, Devendra, 55 Prasad, Sonu, 433 PrashantJohri, 295 Punna, Sharath Chandra, 621
R Rabby, A. K. M. Shahariar Azad, 727 Rahman, Fuad, 727 Rahman, Md Mahbubur, 759 Rahman, Saifur, 507 Rani, Rinkle, 187 Rasiqul Islam Rasiq, G. M., 163 Rastogi, Om, 275 Rauthan, Man Mohan Singh, 83, 153 Rawat, Manoj Kumar, 99 Ray, Susmita, 241 Rohini, A., 1
Author Index S Sahoo, Anita, 581 Salankar, Nilima, 693 Salauddin, Molla, 199 Salunkhe, Hrishikesh S., 745 Sandilya, Avanish, 495 Sanghai, Shreyas, 465 Sanodiya, Rakesh Kumar, 433 Sarkar, Ram, 345 Sarkar, Tanmay, 199 Sathe, Sanketkumar, 465 Saxena, Vikas, 581 Seal, Ayan, 445 Sebastian, Anil, 27 Shah, Hitansh V., 745 Shah, Kritika, 495 Shakeel, Ayesha, 35 Shamim, Namra, 369 Shankar, Achyut, 567 Sharma, Vedna, 421 Shaw, Sayan Surya, 345 Shukla, Manoj Kumar, 357 Shukla, Vinod Kumar, 35, 173 Singh, Prajitesh, 567 Singh, Rohit Pratap, 621 Sondhi, Anukarsh, 649 Srivastava, Arjun Vaibhav, 313
813 Srivastava, Ashutosh, 667 Srivastava, Pramod Kumar, 225 Srivastava, Sandeep, 225 Sruthi, S., 457
T Thute, Ashutosh, 567 Tomar, Ravi, 13, 225 Tyagi, Suryansh, 313
V Vaisla, Kunwar Singh, 83 Varma, Rohan, 153 Vasudevan, V., 595 Vijarania, Meenu, 737 Vyas, Sonali, 173
Y Yadav, Ashok Kumar, 225 Yadav, Dileep Kumar, 241 Yadav, Monika, 139 Yadav, Prajakta, 465 Yao, Leehter, 433 Yogesh, 369