Advances in Intelligent Systems and Computing 1189
Manoj Kumar Sharma · Vijaypal Singh Dhaka · Thinagaran Perumal · Nilanjan Dey · João Manuel R. S. Tavares Editors
Innovations in Computational Intelligence and Computer Vision Proceedings of ICICV 2020
Advances in Intelligent Systems and Computing Volume 1189
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Nikhil R. Pal, Indian Statistical Institute, Kolkata, India Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba Emilio S. Corchado, University of Salamanca, Salamanca, Spain Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil Ngoc Thanh Nguyen , Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. ** Indexing: The books of this series are submitted to ISI Proceedings, EI-Compendex, DBLP, SCOPUS, Google Scholar and Springerlink **
More information about this series at http://www.springer.com/series/11156
Manoj Kumar Sharma · Vijaypal Singh Dhaka · Thinagaran Perumal · Nilanjan Dey · João Manuel R. S. Tavares
Editors
Innovations in Computational Intelligence and Computer Vision Proceedings of ICICV 2020
Editors
Manoj Kumar Sharma, Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India
Vijaypal Singh Dhaka, Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India
Thinagaran Perumal, Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
Nilanjan Dey, Department of Information Technology, Techno India College of Technology, Kolkata, West Bengal, India
João Manuel R. S. Tavares, Faculdade de Engenharia da Universidade do Porto, Porto, Portugal
ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-981-15-6066-8 ISBN 978-981-15-6067-5 (eBook) https://doi.org/10.1007/978-981-15-6067-5 © Springer Nature Singapore Pte Ltd. 2021 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
This volume contains papers presented at the International Conference on Innovations in Computational Intelligence and Computer Vision (ICICV 2020), organized by the Department of Computer and Communication Engineering and the Department of Computer Science and Information Technology, Manipal University Jaipur, during January 17–19, 2020. It provided an open platform for young researchers across the globe to present cutting-edge research findings, exchange ideas, and review submitted and presented single- and multi-disciplinary research. The research articles presented at the conference cover systems and paradigms spanning computational intelligence and computer vision in a broad sense. ICICV 2020 received an overwhelming response, with a high volume of submissions from domains related to advanced computing, artificial intelligence and computer vision, image processing and video analysis, innovative practices, and interdisciplinary research areas. The scope of the conference covered deep learning, soft computing, machine learning, and image and video processing as applied to the solution of real-world problems in industry, the environment and the community. It also focused on the knowledge-transfer methodologies and innovation strategies employed to make this happen effectively. The combination of intelligent systems tools and a broad range of applications introduces a need for a synergy of disciplines from science, technology, business and the humanities. A rigorous peer-review process was adopted to filter the submissions, and with the help of external reviewers and the programme committee, a few high-quality submissions were accepted for publication in this volume of the Advances in Intelligent Systems and Computing series of Springer. Several special sessions were offered by eminent professors on many cutting-edge technologies, and several eminent researchers and academicians delivered talks addressing the participants in their respective fields of proficiency. Our thanks are due to Shri Sandip Datta, IOT Platform & Solutions Country Leader, IBM India Private Limited; Dr Robin T. Bye and Dr Ottar from Norwegian University of Science and Technology, Alesund, Norway; Dr Dharam Singh from University of Namibia; Dr Swagatam Das from Indian Statistical Institute, Kolkata; Dr Manu Pratap Singh from Dr. Bhimrao Ambedkar University, Agra, India; Dr Nilanjan
Dey from Techno India College of Technology, Kolkata, India; Dr K.V. Arya, ABV-IIITM Gwalior, India; and Mr. Aninda Bose, Springer India, for their valuable talks for the benefits of the participants. We would like to express our appreciation to the members of the program committee for their support and cooperation in this publication. We are also thankful to the team from Springer for providing a meticulous service for the timely production of this volume. Our heartfelt thanks to our Honourable President, Manipal University Jaipur, Dr. G. K. Prabhu; Honourable Pro-president, Manipal University Jaipur, Dr. N. N. Sharma; Honourable Registrar, Manipal University Jaipur, Dr. Ravishankar Kamath; Honourable Dean FoE, Dr. Jagannath Korody. Without their support, we could never have executed such a mega event. Special thanks to all special session chairs, track managers, and reviewers for their excellent support. Last but not least, our special thanks go to all the participants who had brightened the event with their valuable research submissions and presentations. Jaipur, India
Manoj Kumar Sharma
Contents
Advanced Computing
Stochastic Investigation of Two-Unit Redundant System with Provision of Different Repair Schemes . . . 3
Monika Saini, Ashish Kumar, and Kuntal Devi
A Workflow Allocation Strategy Under Precedence Constraints for IaaS Cloud Environment . . . 10
Mirza Azeem Beg, Mahfooz Alam, and Mohammad Shahid
Artificial Neural Network Analysis for Predicting Spatial Patterns of Urbanization in India . . . 18
Arpana Chaudhary, Chetna Soni, Chilka Sharma, and P. K. Joshi
Satellite Radar Interferometry for DEM Generation Using Sentinel-1A Imagery . . . 26
Chetna Soni, Arpana Chaudhary, Uma Sharma, and Chilka Sharma
Cyber Espionage—An Ethical Analysis . . . 34
Somosri Hore and Kumarshankar Raychaudhuri
Optimized Z-Buffer Using Divide and Conquer . . . 41
Nitin Bakshi, Shivendra Shivani, Shailendra Tiwari, and Manju Khurana
Data Mining in Cloud Computing: Survey . . . 48
Medara Rambabu, Swati Gupta, and Ravi Shankar Singh
Optimal Nodes Communication Coverage Approach for Wireless Sensor Network Path Planning . . . 57
Usha Soni Verma and Namit Gupta
Reliability Enhancement of Wireless Sensor Network Using Error Rate Estimation (ERE) Method . . . 69
Usha Soni Verma and Namit Gupta
Channel Capacity in Psychovisual Deep-Nets: Gaussianization Versus Kozachenko-Leonenko . . . 77
Jesus Malo
pARACNE: A Parallel Inference Platform for Gene Regulatory Network Using ARACNe . . . 85
Softya Sebastian, Sk. Atahar Ali, Alok Das, and Swarup Roy
Information Fusion-Based Intruder Detection Techniques in Surveillance Wireless Sensor Network . . . 93
Anamika Sharma and Siddhartha Chauhan
Evaluation of AOMDV Routing Protocol for Optimum Transmitted Power in a Designed Ad-hoc Wireless Sensor Network . . . 100
Suresh Kumar, Deepak Sharma, Payal, and Mansi
A New Automatic Query Expansion Approach Using Term Selection and Document Clustering . . . 109
Yogesh Gupta and Ashish Saini
Investigating Large-Scale Graphs for Community Detection . . . 122
Chetna Dabas, Gaurav Kumar Nigam, and Himanshu Nagar
Big Data Platform Selection at a Hospital: A Rembrandt System Application . . . 130
Sudhanshu Singh, Rakesh Verma, and Saroj Koul
Analysis of PQ Disturbances in Renewable Grid Integration System Using Non-parametric Spectral Estimation Approach . . . 141
Rajender Kumar Beniwal and Manish Kumar Saini
Noise Density Range Sensitive Mean-Median Filter for Impulse Noise Removal . . . 150
Prateek Jeet Singh Sohi, Nikhil Sharma, Bharat Garg, and K. V. Arya
Artificial Intelligence and Computer Vision
Foodborne Disease Outbreak Prediction Using Deep Learning . . . 165
Pranav Goyal, Dara Nanda Gopala Krishna, Divyansh Jain, and Megha Rathi
Convolutional Elman Jordan Neural Network for Reconstruction and Classification Using Attention Window . . . 173
Sweta Kumari, S. Aravindakshan, Umangi Jain, and V. Srinivasa Chakravarthy
A Multimodal Biometric System Based on Finger Knuckle Print, Fingerprint, and Palmprint Traits . . . 182
Chander Kant and Sheetal Chaudhary
Image Segmentation of MR Images with Multi-directional Region Growing Algorithm . . . 193
Anjali Kapoor and Rekha Aggarwal
Countering Inconsistent Labelling by Google's Vision API for Rotated Images . . . 202
Aman Apte, Aritra Bandyopadhyay, K. Akhilesh Shenoy, Jason Peter Andrews, Aditya Rathod, Manish Agnihotri, and Aditya Jajodia
Computed Tomography Image Reconstruction Using Fuzzy Complex Diffusion Regularization . . . 214
Manju Devi, Sukhdip Singh, and Shailendra Tiwari
Predicting the Big-Five Personality Traits from Handwriting . . . 225
Nidhi Malik and Ashwin Balaji
A Novel Framework for Autonomous Driving . . . 238
Ishita Joshi, Satyabrata Roy, Krishna Kumar, Ravinder Kumar, and Deepak Sinwar
Concurrent Multipath Transfer Using Delay Aware Scheduling . . . 247
Lal Pratap Verma, Neelaksh Sheel, and Chandra Shekhar Yadev
Implementation of Recommender System Using Neural Networks and Deep Learning . . . 256
Akshita Gupta and Anand Sharma
NER Tagging of Free Text Queries to Search Data for Developing Autonomous Driving System . . . 264
P. V. Veena, Jitendra Kumar, and Devadatta Prasad
Image Processing and Video Analysis
Predictive Modeling of Brain Tumor: A Deep Learning Approach . . . 275
Priyansh Saxena, Akshat Maheshwari, and Saumil Maheshwari
Estimation of View Size Using Sampling Techniques . . . 286
Madhu Bhan and K. Rajanikanth
Packet Priority-Based Routing Approach for Vehicular Delay Tolerant Network . . . 294
Vishakha Chourasia, Sudhakar Pandey, and Sanjay Kumar
Video Tagging and Recommender System Using Deep Learning . . . 302
Varsha Garg, Vidhi Ajay Markhedkar, Shalvi Sanjay Lale, and T. N. Raghunandan
Aadhaar-Based Authentication and Authorization Scheme for Remote Healthcare Monitoring . . . 311
Deepshikha and Siddhartha Chauhan
DAN: Breast Cancer Classification from High-Resolution Histology Images Using Deep Attention Network . . . 319
Ritabrata Sanyal, Manan Jethanandani, and Ram Sarkar
Automated Surveillance Model for Video-Based Anomalous Activity Detection Using Deep Learning Architecture . . . 327
Karishma Pawar and Vahida Attar
Analysis of Gait and Face Biometric Traits from CCTV Streams for Forensics . . . 335
Vishwas Rajashekar and Shylaja S. S.
Multiple Digital Image Watermarking Using SWT, FWHT and R, G, B Channel . . . 343
Anand Ghuli, Dayanand G. Savakar, and Shivanand Pujar
Face Liveness Detection to Overcome Spoofing Attacks in Face Recognition System . . . 351
Unnati Koppikar, C Sujatha, Prakashgoud Patil, and P. S. Hiremath
Analysis of MRI Image Compression Using Compressive Sensing . . . 361
Vivek Upadhyaya and Mohammad Salim
A Transfer Learning Approach for Drowsiness Detection from EEG Signals . . . 369
S. S. Poorna, Amitha Deep, Karthik Hariharan, Rishi Raj Jain, and Shweta Krishnan
Classification and Measuring Accuracy of Lenses Using Inception Model V3 . . . 376
Shyo Prakash Jakhar, Amita Nandal, and Rahul Dixit
Detection of Life Threatening ECG Arrhythmias Using Morphological Patterns and Wavelet Transform Method . . . 384
Shivani Saxena and Ritu Vijay
A New Approach for Fire Pixel Detection in Building Environment Using Vision Sensor . . . 392
P. Sridhar, Senthil Kumar Thangavel, and Latha Parameswaran
Innovative Practices
Computer Assisted Classification Framework for Detection of Acute Myeloid Leukemia in Peripheral Blood Smear Images . . . 403
S. Alagu and K. Bhoopathy Bagan
An Efficient Multimodal Biometric System Integrated with Liveness Detection Technique . . . 411
Chander Kant and Komal
A Random Walk-Based Cancelable Biometric Template Generation . . . 423
Fagul Pandey, Priyabrata Dash, and Divyanshi Sinha
Influence of Internal and External Sources on Information Diffusion at Twitter . . . 430
Mohammad Ahsan and T. P. Sharma
Pedestrian Detection: Unification of Global and Local Features . . . 437
Sweta Panigrahi, U. S. N. Raju, R. Pranay Raj, Sindhu Namulamettu, and Vishnupriya Thanda
Logistic Map-Based Image Steganography Using Edge Detection . . . 447
Aiman Jan, Shabir A. Parah, and Bilal A. Malik
Smart Vehicle Tracker for Parking System . . . 455
Ishita Swami and Anil Suthar
An Efficient Technique to Access Cryptographic File System over Network File System . . . 463
Umashankar Rawat, Satyabrata Roy, Saket Acharya, and Krishna Kumar
Hybrid Feature Selection Method for Predicting the Kidney Disease Membranous Nephropathy . . . 472
K. Padmavathi, A. V. Senthılkumar, and Amit Dutta
Feasibility of Adoption of Blockchain Technology in Banking and Financial Sector of India . . . 479
Anuja Agarwal, Mahendra Parihar, and Tanvi Shah
Interdisciplinary Areas
Machine Learning Techniques for Predicting Crop Production in India . . . 491
Sarthak Agarwal and Naina Narang
Navier–Stokes-Based Image Inpainting for Restoration of Missing Data Due to Clouds . . . 497
Deepti Maduskar and Nitant Dube
Perception of Plant Diseases in Color Images Through Adaboost . . . 506
Cheruku Sandesh Kumar, Vinod Kumar Sharma, Ashwani Kumar Yadav, and Aishwarya Singh
SWD: Low-Compute Real-Time Object Detection Architecture . . . 512
Raghav Sharma and Rohit Pandey
Guided Analytics Software for Smart Aggregation, Cognition, and Interactive Visualisation . . . 521
Aleksandar Karadimce, Natasa Paunkoska (Dimoska), Dijana Capeska Bogatinoska, Ninoslav Marina, and Amita Nandal
A Comparison of GA Crossover and Mutation Methods for the Traveling Salesman Problem . . . 529
Robin T. Bye, Magnus Gribbestad, Ramesh Chandra, and Ottar L. Osen
Comparison-Based Study to Predict Breast Cancer: A Survey . . . 543
Ankit Grover, Nitesh Pradhan, and Prashant Hemrajani
Software Quality Prediction Using Machine Learning Techniques . . . 551
Somya Goyal and Pradeep Kumar Bhatia
Prognosis of Breast Cancer by Implementing Machine Learning Algorithms Using Modified Bootstrap Aggregating . . . 561
Peeyush Kumar, Ayushe Gangal, and Sunita Kumari
Localizing License Plates in Real Time with RetinaNet Object Detector . . . 570
Ritabrata Sanyal, Manan Jethanandani, Gummi Deepak Reddy, and Abhijit Kurtakoti
Decision Support System for Detection and Classification of Skin Cancer Using CNN . . . 578
Rishu Garg, Saumil Maheshwari, and Anupam Shukla
An Attribute-Based Break-Glass Access Control Framework for Medical Emergencies . . . 587
Vidyadhar Aski, Vijaypal Singh Dhaka, and Anubha Parashar
Faster and Secured Web Services Communication Using Modified IDEA and Custom-Level Security . . . 596
Jitender Tanwar, Sanjay Kumar Sharma, and Mandeep Mittal
Reputation-Based Stable Grouping Strategy for Holistic Content Distribution in IoV . . . 604
Richa Sharma, T. P. Sharma, and Ajay Kumar Sharma
About the Editors
Dr. Manoj Kumar Sharma Associate Professor, Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, India. Expertise: Computational Intelligence, Artificial Intelligence, Machine Learning, Cryptography. Dr. Manoj Kumar Sharma is an enthusiastic and motivating technocrat with 14 years of academic experience, known for his sharp wit, creativity, leadership style and passion for students' development. He is strong at technical fixes and impresses with his ability to design solutions. Fostering innovation and providing a nurturing eco-system for budding entrepreneurs is his key thrust area. He lays strong emphasis on automation projects for society and industry. He has published more than 46 research papers in national/international journals and conferences, has been granted 3 copyrights and has published 2 patents. His research interests include machine learning, artificial intelligence, image processing, pattern recognition and cryptography. He organized 3 international conferences in 2017, 2018 and 2020, has served as Organising Secretary and TPC member for a number of conferences, including IEEE and Springer events, and is a reviewer for reputed journals from Springer, Elsevier, IEEE and many more. Dr. Sharma has been invited to national/international conferences in India and abroad to deliver guest lectures. He has guided a number of M.Tech./Ph.D. theses. He is working
on some R&D and industrial projects and he has organized number of National/International conferences/workshops etc. Dr. Vijaypal Singh Dhaka Professor, Computer Science, Head, Department of Computer and Communication Engineering; Director, Innovations, Manipal University Jaipur, Jaipur, India. Expertise: Computational Intelligence; Artificial Intelligence, Machine Learning, Web Technologies. Dr. Dhaka is an enthusiastic and motivating technocrat with 15 years of industry and academic experience. With his sharp wits, creativity, leading style and passion for students' development, he has created an atmosphere of project development in University. He is strong at technical fixes and impresses by his ability to design solutions. Fostering innovation and providing nurturing eco-system for budding entrepreneurs is his key thrust area. In his able leadership students have won many prizes at National level e.g. Aero India Drone Competition, Pravega at IISc Bengaluru and Innovative Research prize at IIT Bombay are some to mention. He lays strong emphasis on automation projects for University use, developed by students and has been successful in deploying 6 such projects in University. He has served on various key positions including Director-Innovations, Dean-Academics, Chief Editor for International Journal and Head of Department. He has more than 80 publications in Journals of great repute in his name and guided 10 research scholars to earn PhD. His research interests include, Machine Learning, Artificial Intelligence, Image processing and pattern recognition. He has organized 6 International conferences supported by IEEE, ACM, Springer and Elsevier including SIN-2017. He has served as Organising Secretary and is TPC member for several IEEE Conferences. He has been invited to deliver keynote addresses and plenary talk in various universities and institutions. Dr Dhaka has successfully executed Solar Photo-Voltaic Efficiency Prediction and Enhancement Project funded by SERB (DST, Govt of India). 9 student research projects funded by DST Govt of Rajasthan, India, have been guided by him. He received “World
Eminence Awards 2017” for Leading Research Contribution in ICT for the Year 2016, at WS-4 in London on 15th Feb 2017. Dr. Thinagaran Perumal Chair, IEEE Consumer Electronics Society Malaysia; Senior Lecturer, Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia; Head, Cyber-Physical System Unit, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia. Expertise: Internet of Things; Cyber-Physical Systems, Smart Homes and Interoperability and Ambient Intelligence. Thinagaran Perumal is the recipient of 2014 Early Career Award from IEEE Consumer Electronics Society for his pioneering contribution in the field of consumer electronics. He completed his PhD at Universiti Putra Malaysia, in the area of smart technology and robotics. He is currently a Senior Lecturer at the Department of Computer Science, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia. He is also currently appointed as Head of Cyber-Physical Systems in the university and also been elected as Chair of IEEE Consumer Electronics Society Malaysia Chapter. His research interests are towards interoperability aspects of smart homes and Internet of Things (IoT), wearable computing, and cyber-physical systems. Thina is also heading the National Committee on Standardization for IoT (IEC/ISO TC / G/16) as Chairman since 2018. Some of the eminent works include proactive architecture for IoT systems; development of the cognitive IoT frameworks for smart homes and wearable devices for rehabilitation purposes. He is an active member of IEEE Consumer Electronics Society and its Future Directions Committee on Internet of Things. He has been invited to give several keynote lectures and plenary talk on Internet of Things in various institutions and organizations internationally. He has published several papers in IEEE Conferences and Journals and is serving as TPC member for several reputed IEEE conferences. He is an active reviewer for IEEE Internet of Things Journal, IEEE Communication Magazine, IEEE Sensors Journal, and IEEE Transaction for Automation Science and Engineering, to name a few.
He was elected as General Chair for the IEEE International Symposium on Consumer Electronics 2017 (ISCE'17) held in Kuala Lumpur. Dr. Nilanjan Dey is an Assistant Professor in the Department of Information Technology at Techno India College of Technology, Kolkata, India. He is a visiting fellow of the University of Reading, UK, and a Visiting Professor at Wenzhou Medical University, China, and Duy Tan University, Vietnam. He was an honorary Visiting Scientist at Global Biomedical Technologies Inc., CA, USA (2012–2015). He was awarded his Ph.D. from Jadavpur University in 2015. He has authored/edited more than 45 books with Elsevier, Wiley, CRC Press and Springer, and published more than 300 papers. He is the Editor-in-Chief of the International Journal of Ambient Computing and Intelligence, IGI Global, and Associate Editor of IEEE Access and the International Journal of Information Technology, Springer. He is the Series Co-Editor of Springer Tracts in Nature-Inspired Computing, Springer Nature, Series Co-Editor of Advances in Ubiquitous Sensing Applications for Healthcare, Elsevier, and Series Editor of Computational Intelligence in Engineering Problem Solving and Intelligent Signal Processing and Data Analysis, CRC. His main research interests include medical imaging, machine learning, computer-aided diagnosis and data mining. He is the Indian Ambassador of the International Federation for Information Processing (IFIP) Young ICT Group. Recently, he was recognized as one of the top 10 most published academics in the field of Computer Science in India (2015–17).
João Manuel R. S. Tavares graduated in Mechanical Engineering at the Universidade do Porto, Portugal in 1992. He also earned his M.Sc. degree and Ph.D. degree in Electrical and Computer Engineering from the Universidade do Porto in 1995 and 2001, and attained his Habilitation in Mechanical Engineering in 2015. He is a senior researcher at the Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial (INEGI) and Associate Professor at the Department of Mechanical Engineering (DEMec) of the Faculdade de Engenharia da Universidade do Porto (FEUP). João Tavares is co-editor of more than 55 books, co-author of more than 50 chapters, 650 articles in international and national journals and conferences, and 3 international and 3 national patents. He has been a committee member of several international and national journals and conferences, is co-founder and co-editor of the book series “Lecture Notes in Computational Vision and Biomechanics” published by Springer, founder and Editor-in-Chief of the journal “Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization” published by Taylor & Francis, Editor-in-Chief of the journal “Computer Methods in Biomechanics and Biomedical Engineering” published by Taylor & Francis, and co-founder and co-chair of the international conference series: CompIMAGE, ECCOMAS VipIMAGE, ICCEBS, and BioDental. Additionally, he has been (co-)supervisor of several M.Sc. and Ph.D. thesis and supervisor of several post-doc projects and has participated in many scientific projects both as researcher and as scientific coordinator. His main research areas include computational vision, medical imaging, computational mechanics, scientific visualization, human–computer interaction, and new product development.
Advanced Computing
Stochastic Investigation of Two-Unit Redundant System with Provision of Different Repair Schemes Monika Saini(B) , Ashish Kumar, and Kuntal Devi Department of Mathematics and Statistics, Manipal University Jaipur, Jaipur, Rajasthan 303007, India [email protected], [email protected], [email protected]
Abstract. The foremost aim of the present work is to investigate a two-dissimilar-unit system stochastically by adopting the notion of priority in repair activities. Here, two different types of priority are considered and three stochastic models are developed. The first model is the basic one in which no priority is given; in the second model the original unit's repair gets preference over the duplicate unit's repair activities, while in the third model the original unit's repair gets preference over preventive maintenance of the duplicate unit. A full-time repairman always remains with the system, and PM timings are specified in advance. All time-dependent random variables are Weibull distributed, with a common shape parameter and distinct scale parameters. Expressions for numerous measures of system effectiveness have been obtained and shown graphically to highlight the significance of the work.
Keywords: Semi-Markov process · Weibull distribution · Priority in repair activities · Preventive maintenance and repair
1 Introduction
A lot of studies have been carried out for analysing the reliability and performance of complex industrial systems. The complexity of any system decreases its productivity. To enhance availability and reliability, several researchers have suggested that redundancy is an effective technique. Aircraft, textile manufacturing plants and carbon reclamation units in the fertilizer manufacturing industry achieve high reliability using various redundancy approaches, especially cold standby. Chandrasekhar et al. [1] supported this theory by a study on a two-unit cold standby system in which repair rates are Erlangian distributed. Many researchers, such as Malik and Deswal [8], Chhillar et al. [2] and Kumar and Saini [4], used the concepts of preventive maintenance and priority in repair discipline with constant failure and repair rates for identical-unit redundant systems. Zhang and Wang [11] developed a geometric process model using the concept of priority in operation and repair for a repairable system having cold standby redundancy. Moghassass et al.
[10] analysed reliability measures of a redundant system using the concept of shut-off rules. Kumar and Malik [9] proposed some reliability models for computer systems using ideas of priority. All the studies referred to above discussed various reliability problems of cold standby structures having identical units under different sets of assumptions. However, due to financial constraints, situations often do not permit keeping an identical unit in reserve. Furthermore, most of the work carried out so far is based on the hypothesis of constant failure and repair rates, which does not seem realistic for many engineering systems. Some researchers, such as Gupta et al. [3], Kishan and Jain [5], and Barak et al. [7], advocated reliability models for cold standby repairable systems of non-similar units with arbitrary failure and repair laws. Kumar et al. [6] reported the performance investigation of redundant systems with priority in various repair disciplines and Weibull laws for failure and repair. Keeping in mind the above facts, the present study is designed to investigate non-identical redundant systems stochastically using the concept of priority in various situations of system failure. For this purpose, two different types of priority are considered, and three stochastic models are developed. The first model is the basic one in which no priority is given; in the second model the original unit's repair gets preference over the duplicate unit's repair activities, while in the third model the original unit's repair gets preference over PM of the duplicate unit. A full-time repairman always remains with the system, and PM timings are specified in advance. All failure and repair time distributions follow the Weibull distribution, with a common shape parameter and distinct scale parameters. Expressions for numerous measures of system effectiveness have been obtained and shown graphically to highlight the significance of the study.
State Description
The possible states of the system (following the notation of Kumar et al. [6]) are as follows:
Common to all models: S0 = (O, Dcs), S1 = (Pm, Do), S2 = (Fur, Do), S3 = (O, DFur), S4 = (O, DPm), S6 = (FUR, DFwr), S8 = (DPM, WPm), S9 = (PM, DWPm), S10 = (PM, DFwr), S12 = (DFUR, WPm)
Model I: S5 = (DPM, Fwr), S7 = (FUR, DWPm), S11 = (DFUR, Fwr)
Model II: S5 = (Fur, DWPm), S11 = (DFwr, Fur)
Model III: S7 = (Fwr, DPm)
Transition Probabilities
According to Kumar et al. [6], by probabilistic arguments, we have

P01 = α/(α + β), P02 = β/(α + β), P10 = γ/(α + γ + h), P19 = α/(α + γ + h), P1.10 = h/(α + γ + h),
P20 = h/(α + k + h), P26 = k/(α + k + h), P27 = α/(α + k + h),
P30 = l/(α + β + l), P3.11 = β/(α + β + l), P3.12 = α/(α + β + l),
P40 = γ/(α + β + γ), P45 = β/(α + β + γ), P48 = α/(α + β + γ),
P52 = P63 = P74 = P81 = P94 = P10.3 = P11.2 = P12.1 = 1,
p13.10 = h/(α + γ + h), p14.9 = α/(α + γ + h),
p23.6 = h/(α + k + h), p24.7 = α/(α + k + h),
p31.12 = α/(α + β + l), p32.11 = β/(α + β + l),
p41.8 = α/(α + β + γ), p42.5 = β/(α + β + γ).

Note that p32.11 and p42.5 are not available in Model II; P54 = 1, P11.3 = 1.
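As a quick sanity check on these expressions (our addition, not part of the paper), the outgoing transition probabilities of each regenerative state should sum to one; the short sketch below verifies this numerically for arbitrary illustrative parameter values.

```python
# Consistency check: outgoing transition probabilities of each state sum to 1.
# The parameter values are arbitrary illustrative choices, not taken from the paper.
alpha, beta, gamma, h, k, l = 2.0, 0.05, 5.0, 0.009, 1.5, 1.4

outgoing = {
    "S0": [alpha / (alpha + beta), beta / (alpha + beta)],                      # P01 + P02
    "S1": [gamma / (alpha + gamma + h), alpha / (alpha + gamma + h),
           h / (alpha + gamma + h)],                                            # P10 + P19 + P1.10
    "S2": [h / (alpha + k + h), k / (alpha + k + h), alpha / (alpha + k + h)],  # P20 + P26 + P27
    "S3": [l / (alpha + beta + l), beta / (alpha + beta + l),
           alpha / (alpha + beta + l)],                                         # P30 + P3.11 + P3.12
    "S4": [gamma / (alpha + beta + gamma), beta / (alpha + beta + gamma),
           alpha / (alpha + beta + gamma)],                                     # P40 + P45 + P48
}

for state, probs in outgoing.items():
    assert abs(sum(probs) - 1.0) < 1e-12
    print(state, "-> sum of outgoing probabilities:", round(sum(probs), 12))
```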
Mean Sojourn Times
Let T denote the time to system failure. The mean sojourn time at state Si is given by μi = E(T) = ∫₀^∞ P(T > t) dt. Hence,

μ0 = Γ(1 + 1/η)/(α + β)^(1/η), μ1 = Γ(1 + 1/η)/(α + γ + h)^(1/η), μ2 = Γ(1 + 1/η)/(α + k + h)^(1/η),
μ3 = Γ(1 + 1/η)/(α + β + l)^(1/η), μ4 = Γ(1 + 1/η)/(α + β + γ)^(1/η), μ5 = μ11 = Γ(1 + 1/η)/f^(1/η),

μ1′ = Γ(1 + 1/η)[1/(α + γ + h)^(1/η) + (α + h)/((α + γ + h) γ^(1/η))],
μ2′ = Γ(1 + 1/η)[1/(α + k + h)^(1/η) + (α + h)/((α + k + h) k^(1/η))],
μ3′ = Γ(1 + 1/η)[1/(α + β + l)^(1/η) + (α + β)/((α + β + l) l^(1/η))],
μ4′ = Γ(1 + 1/η)[1/(α + β + γ)^(1/η) + (α + β)/((α + β + γ) γ^(1/η))],
μ3′′ = Γ(1 + 1/η)[1/(α + β + l)^(1/η) + α/((α + β + l) l^(1/η))],
μ4′′ = Γ(1 + 1/η)[1/(α + β + γ)^(1/η) + α/((α + β + γ) γ^(1/η))].
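For numerical work, each of these expressions reduces to the Weibull mean Γ(1 + 1/η)/λ^(1/η), where λ is the sum of the rates leaving the state. The helper below is a minimal sketch of that computation; the parameter values are illustrative only and are not taken from the paper.

```python
from math import gamma as Gamma

def weibull_mean(rate_sum: float, eta: float) -> float:
    """Mean of a Weibull variable with survival function exp(-rate_sum * t**eta)."""
    return Gamma(1.0 + 1.0 / eta) / rate_sum ** (1.0 / eta)

# Illustrative values only: alpha = 2, beta = 0.05, shape parameter eta = 0.5
alpha, beta, eta = 2.0, 0.05, 0.5
mu0 = weibull_mean(alpha + beta, eta)   # mean sojourn time in state S0
print(round(mu0, 4))
```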
By applying a similar approach, the availability and mean sojourn times of the third model have been obtained.
Investigation of Availability
By probabilistic arguments, the recurrence relations for the system availability Xi(t) are derived as follows:

Xi(t) = Ci(t) + Σ_j qi,j^(n)(t) © Xj(t)    (1)
where i and j represent regenerative states. Taking the Laplace transform of relation (1) and solving for X0*(s), the steady-state availability of the system is given by

X0(∞) = lim_{s→0} s X0*(s)    (2)
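Equation (2) is simply the final-value theorem applied to the Laplace transform of the availability. The toy example below (with a made-up transform, not one derived from these models) shows the mechanics of taking that limit.

```python
import sympy as sp

s = sp.symbols('s', positive=True)

# Toy transform: X0(t) = 0.9 + 0.1*exp(-2t) has Laplace transform 0.9/s + 0.1/(s + 2),
# so the steady-state availability should come out as 0.9.
X0_star = sp.Rational(9, 10) / s + sp.Rational(1, 10) / (s + 2)

steady_state = sp.limit(s * X0_star, s, 0)   # X0(infinity) = lim_{s->0} s * X0*(s)
print(steady_state)                          # prints 9/10
```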
Investigation of Repairman's Busy Period
By probabilistic arguments, the recurrence relations for the busy period of the repairman due to repair and due to preventive maintenance, Di^r(t) and Di^Pm(t), are derived as follows:

Di^r(t) = Yi(t) + Σ_j qi,j^(n)(t) © Dj^r(t)
Di^Pm(t) = Yi(t) + Σ_j qi,j^(n)(t) © Dj^Pm(t)    (3)

where i and j represent regenerative states. Taking the Laplace transform of relations (3) and solving for D0^{*r}(s) and D0^{*Pm}(s), the busy-period results are obtained as D0^r = lim_{s→0} s D0^{*r}(s) and D0^Pm = lim_{s→0} s D0^{*Pm}(s).
Investigation of Expected Number of Repairs, PMs and Visits by the Repairman
By probabilistic arguments, the recurrence relations for the expected numbers of repairs, PMs and visits by the repairman, Ei^R(t), Ei^Pm(t) and Ni(t), are derived as follows:

Ei^R(t) = Σ_j Qi,j^(n)(t) ® [δj + Ej^R(t)]
Ei^Pm(t) = Σ_j Qi,j^(n)(t) ® [δj + Ej^Pm(t)]
Ni(t) = Σ_j Qi,j^(n)(t) ® [δj + Nj(t)]    (4)

Here δj = 1 if j is a regenerative state where the server starts his work afresh, and δj = 0 otherwise; i and j represent regenerative states. Taking the Laplace–Stieltjes transform of relations (4) and solving for the transforms of E0^R, E0^Pm and N0, the expected numbers per unit time are given by

E0^R(∞) = lim_{s→0} s Ẽ0^R(s), E0^Pm(∞) = lim_{s→0} s Ẽ0^Pm(s), N0(∞) = lim_{s→0} s Ñ0(s)    (5)
(6)
K 0 Represents the income per entity up-time K i Represent the cost per unit time.
Graphical Results See Figs. 1, 2, 3 and 4.
0
-0.001
Availability
-0.002 -0.003 -0.004 -0.005
Variation in Availability with respect to β for shape parameter η=0.5 0.01
0.02
0.03
0.04
0.05
0.06
0.08
0.09
0.1
α=2.4 h=0.01
α=2, h=0.009, l=1.4, k=1.5, γ=5 k=1.7
-0.006 -0.007
0.07
γ=7
l=2
Failure Rate
Fig. 1. Availability change (M:I–M:II) versus failure rate (β) for η = 0.5
8
M. Saini et al. Variation in Profit with respect to (β) for shape parameter η=0.5
0
0.01
-10
0.02
0.03
0.04
0.05
0.06
0.07
0.08
0.09
0.1
Profit
-20 -30 -40 -50
h=0.01
-60 -70
γ=7
h=0.009,k=1.5,l=1.4,α=2,γ=5
-80
k=1.7 α=2.4
Failure Rate l=2
Fig. 2. Profit change (M:I–M:II) versus failure rate (β) for η = 0.5 Variation in Availability with respect to (β) for shape parameter η=0.5
0 -0.002
0.01
-0.004
0.02
0.03
0.04
0.05
0.06
0.07
0.08
0.09
α=2.4
h=0.01
0.1
Availability
-0.006 -0.008 -0.01
γ=7
-0.012
-0.014 -0.016
h=0.009,k=1.5,l=1.4,α=2,γ=5
-0.018
Failure Rate
k=1.7 l=2
Fig. 3. Availability change (M:I–M:III) versus failure rate (β) for η = 0.5
0 -10
Variation in Profit with respect to (β) for shape parameter η=0.5 0.01
0.02
0.03
0.04
0.05
0.06
0.07
Profit
-40 -50
0.09
γ=7
-20 -30
0.08
l=2 h=0.01 k=1.7
-60 -70 -80
α=2.4
l=1.4,k=1.5,h=0.009,α=2,γ=5
Failure Rate
Fig. 4. Profit change (M:I–M:III) versus failure rate (β) for η = 0.5
0.1
Stochastic Investigation of Two-Unit Redundant System
9
2 Conclusion
The availability and profit analysis of all the models has been carried out for a specific situation by assigning the shape parameter value η = 0.5 to all the random variables. The values of all other parameters are kept constant, as shown in Figs. 1, 2, 3 and 4 for Models I, II and III. It is observed that availability and profit rise with an increase in the preventive maintenance rate and repair rate, while they decline with an increase in the maximum operation time and failure rate in all three models. From Figs. 1, 2, 3 and 4, it is observed that
• giving preference to the repair of the original unit over preventive maintenance and repair of the duplicate unit is always beneficial and profitable, and
• giving preference to the repair of the original unit over preventive maintenance of the duplicate unit is profitable.
Hence, the use of priority in repair activities is recommended on the basis of the present study.
References
1. P. Chandrasekhar, R. Natarajan, V.S.S. Yadavalli, A study on a two unit standby system with Erlangian repair time. Asia-Pac. J. Oper. Res. 21(03), 271–277 (2004)
2. S.K. Chhillar, A.K. Barak, S.C. Malik, Reliability measures of a cold standby system with priority to repair over corrective maintenance subject to random shocks. Int. J. Stat. Econ. 13(1), 79–89 (2014)
3. R. Gupta, P. Kumar, A. Gupta, Cost-benefit analysis of a two dissimilar unit cold standby system with Weibull failure and repair laws. Int. J. Syst. Assur. Eng. Manag. 4(4), 327–334 (2013)
4. A. Kumar, M. Saini, Cost-benefit analysis of a single-unit system with preventive maintenance and Weibull distribution for failure and repair activities. J. Appl. Math. Stat. Inform. 10(2), 5–19 (2014)
5. R. Kishan, D. Jain, Classical and Bayesian analysis of reliability characteristics of a two-unit parallel system with Weibull failure and repair laws. Int. J. Syst. Assur. Eng. Manag. 5(3), 252–261 (2014)
6. A. Kumar, M. Saini, K. Devi, Performance analysis of a redundant system with Weibull failure and repair laws. Revista Investigacion Operacional 37(3), 247–257 (2016)
7. M.S. Barak, D. Yadav, S. Kumari, Stochastic analysis of a two-unit system with standby and server failure subject to inspection. Life Cycle Reliab. Saf. Eng. (2017). https://doi.org/10.1007/s41872-017-0033-5
8. S.C. Malik, S. Deswal, Stochastic analysis of a repairable system of non-identical units with priority for operation and repair subject to weather conditions. Int. J. Comput. Appl. 49(14), 33–41 (2012)
9. A. Kumar, S.C. Malik, Reliability modeling of a computer system with priority to s/w replacement over h/w replacement subject to MOT and MRT. Int. J. Pure Appl. Math. 80, 693–709 (2012)
10. R. Moghassass, M.J. Zuo, J. Qu, Reliability and availability analysis of a repairable k-out-of-n system with repairmen subject to shut-off rules. IEEE Trans. Reliab. 60, 658–666 (2011)
11. Y.L. Zhang, G.J. Wang, A geometric process repair model for a repairable cold standby system with priority in use and repair. Reliab. Eng. Syst. Safety 94, 1782–1787 (2009)
A Workflow Allocation Strategy Under Precedence Constraints for IaaS Cloud Environment Mirza Azeem Beg1 , Mahfooz Alam2(B) , and Mohammad Shahid3 1 Department of Computer Science & Engineering, Institute of Technology & Management,
Aligarh, India [email protected] 2 Department of Computer Science, Al-Barkaat College of Graduate Studies, Aligarh, India [email protected] 3 Department of Commerce, Aligarh Muslim University, Aligarh, India [email protected]
Abstract. A big challenge for the adoption of cloud computing in the scientific community remains the efficient allocation and execution of compute-intensive scientific workflows so as to reduce the complexity of the workflow application, the turnaround time, and the size of migrated data. The allocation of scientific workflows on public clouds can be described through a variety of perspectives and parameters. The workflow allocation problem in the cloud environment has been proved to be NP-complete. This paper presents a new approach for workflow allocation that manages the precedence constraints on heterogeneous virtual machines in an IaaS cloud while aiming to minimize execution time. An illustration is presented to demonstrate the strategy for a workflow with a small number of tasks. A brief comparative performance study has been conducted on the basis of the results obtained for the proposed strategy and HEFT on the considered parameters. The study reveals the better performance of the proposed strategy over HEFT on total execution time (TET) for the considered set of tasks.
Keywords: Cloud computing · IaaS cloud · Workflow allocation · DAG scheduling · Execution time
1 Introduction
Cloud computing is a commission-based, service-oriented approach that is used to access various resources in the manner of hired services. Cloud computing is designed to be very fast, always available and secure, and its infrastructure has to be flexible, scalable, and able to deliver services intelligently [1]. Therefore, the workflow management system (WMS) becomes crucial for cloud computing, as it allows the cloud service to improve resource allocation efficiency, scalability, and fault tolerance. In cloud computing, virtualization is one of the main components used to create a hardware
platform that is agnostic to both the user workload and the operating system, so that different applications can run on it. In this technology, various security threats arise, such as mixed-trust-level virtual machines (VMs), inter-VM attacks, and communication blind spots. A workflow is a managed chain or sequence of steps to achieve a definite objective in a computing environment; maintaining the steps in a certain order ensures efficiency, fast execution, and other benefits. Workflow applications are allocated for processing especially on IaaS clouds. There are many issues in the implementation of workflows, viz. machine failures, communication loss, and network congestion [2]. As the pool of heterogeneous VMs becomes very large, the computational requirements for running complex applications comprise batches of tasks that must be handled in order to cater to the parameters under consideration. The process of assigning the tasks onto multiple VMs is known as scheduling or workflow allocation. Workflow applications represented using a Directed Acyclic Graph (DAG) have always received a lot of attention [3, 4]. A schedule for a DAG is an assignment which specifies the mapping of tasks to VMs and the expected start time of each task on the mapped machine in the given set of machines, with the key objective of minimizing TET. Workflow allocation problems can be broadly categorized into two directions, namely single workflows and multiple workflows. Several single- and multiple-workflow approaches have been proposed to deal with HDSs, such as ETF [5], DLS [6], LMT [7], HEFT [8], LBSIR [9], and many more. Most of the scheduling algorithms for workflow task allocation require the entry and exit tasks to be single vertices. So, if there is more than one entry/exit task, they are connected to a pseudo-entry/exit (called T_entry/T_exit) with zero time and communication, which does not affect the allocation and the schedule. In this paper, a new approach for workflow allocation is presented that manages the precedence constraints on heterogeneous virtual machines in an IaaS cloud with the aim of minimizing the execution time of a workflow represented by a DAG [10]. The strategy consists of two phases, viz. schedule generation and VM selection. In the first phase, an out-degree-of-vertex-based scheme is used for sorting all tasks in the workflow so as to satisfy the precedence constraints. The VM selection phase is the same as in HEFT [8]. Further, an illustration is presented to demonstrate the strategy for a small workflow. A brief comparative performance study has been conducted on the basis of the results obtained for the proposed strategy and HEFT on the considered parameters. The paper is organized as follows. Section 2 presents the problem formulation for the proposed strategy and Sect. 3 presents the proposed strategy for workflow allocation. Sections 4 and 5 present the illustration and the performance study, respectively. Finally, we conclude the work in Sect. 6.
2 Problem Formulation
The workflow allocation problem for heterogeneous virtual machines in cloud computing is formulated as the decision of a mapping f of the set of tasks in the workflow (ψ) onto the set of VMs (V), with the aim of minimizing the total execution time (TET) while the task precedence requirements are satisfied:

f : ψ → V    (1)
A workflow is a composition of tasks subject to precedence constraints. First, we address a deterministic, non-preemptive, single-workflow batch-of-tasks parallel
allocation problem on a set of virtual machines, where m tasks ψ = {Ti : 1 ≤ i ≤ m} must be assigned to parallel VMs (sites) V = {Vj : 1 ≤ j ≤ n}. Each edge represents a precedence constraint between tasks, such that a predecessor (pred) must be completed prior to the execution of its successor (succ). Before presenting the objective function, we define the EST and EFT attributes of a prepared schedule. EST(Ti, Vj) and EFT(Ti, Vj) are the earliest start time and earliest finish time of Ti on Vj, respectively. For the entry task, EST(T_entry, Vj) = 0. The EST and EFT are calculated recursively, beginning from T_entry, as in Eqs. (2) and (3); they are computed in the same way as in HEFT [8]:

EST(Ti, Vj) = max{ avail[Vj], max_{Tm ∈ pred(Ti)} (AFT(Tm) + Cm,i) }    (2)

EFT(Ti, Vj) = Eij + EST(Ti, Vj)    (3)
where avail[Vj] is the time at which Vj becomes available for the task, pred(Ti) is the set of immediate predecessors of task Ti, and AFT(Tm) is the actual finish time (AFT) of task Tm. Moreover, Cm,i and Eij are, respectively, the communication cost and the execution time of Ti on Vj, computed as in [8]. After all tasks in the DAG are scheduled, the TET is the AFT of the exit task T_exit, i.e. the total execution time taken up to the last task of the workflow, defined as:

TET = max{AFT(T_exit)}  ∀n    (4)
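The recurrences (2)–(4) translate almost directly into code. The sketch below is ours, not the authors'; it assumes, as in the worked example of Sect. 4, that the communication cost is zero when a task and its predecessor share a VM, and it ignores insertion into idle gaps.

```python
def est(task, vm, avail, aft, vm_of, pred, comm):
    """Eq. (2): earliest start time of `task` on `vm`, given already scheduled predecessors."""
    ready = max((aft[p] + (0 if vm_of[p] == vm else comm[(p, task)]) for p in pred[task]),
                default=0.0)
    return max(avail[vm], ready)

def eft(task, vm, exec_time, avail, aft, vm_of, pred, comm):
    """Eq. (3): earliest finish time = execution time on `vm` + EST."""
    return exec_time[task][vm] + est(task, vm, avail, aft, vm_of, pred, comm)

def total_execution_time(aft, exit_tasks):
    """Eq. (4): TET is the latest actual finish time among the exit tasks."""
    return max(aft[t] for t in exit_tasks)
```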
3 Proposed Workflow Allocation Strategy
In this paper, a new approach is presented for workflow allocation that manages the precedence constraints on heterogeneous virtual machines for a single workflow, with the aim of minimizing the execution time of the workflow in an IaaS cloud. The proposed strategy is an application of DAG scheduling for a bounded number of virtual machines. The strategy has two phases, a schedule-generation phase and a VM-selection phase, the latter selecting the machine that minimizes the execution time of the task.
A Workflow Allocation Strategy Under Precedence Constraints
13
3.2 VM-Selection Phase VMs-selection phase, for selecting best machine which minimizes execution time of the task, is selected and allocated to the corresponding tasks for the execution and moreless same VM selection in HEFT [8]. This phase is based on as insertion-based policy. For computing mean, computation and communication cost is also same as HEFT. The objective of this allocation strategy is to minimize TET. The allocation steps for proposed strategy are as follows: Begin 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13.
Assume Inputs : m, C m, i, n Generate ETC matrices Move Top-to-Bottom (Entry to Exit task), left-to-right Generate schedule // Allocation order as per odmax of the task in DAG Do while // until unscheduled tasks are exist in schedule Choose task T i from the Schedule // as generated in step-4 For each V j do Compute EST(T i, V j) and EFT(T i, V j) // based on insertion policy End for Assign T i on V j // having minimum EST(T i, V j) and EFT(T i, V j) End while Compute TET as per equation (4) End
4 Illustration In this section, the proposed allocation strategy and HEFT with single DAG have been explained with workflow having five tasks for better understanding of the strategy. The virtual machines considered in the system configuration are three for illustration and all machines are available at initial state. In Fig. 1, T 1 , T 2 and T 3 having out degree 2, 2, and 1, respectively, whereas T 4 and T 5 show the exit tasks. The average communication costs (C ij ) connected with tasks are given with edge labels.
Fig. 1. A sample of DAG with five tasks
Here, illustration explains the proposed strategy and HEFT with five tasks as depicted in Fig. 1. The execution time of each task onto all VMs, (E ij ), UPr and allocation of
14
M. A. Beg et al.
task are given in Table 1. The calculation of UPr (i.e., task selection phase) and VMs selection phase of task using HEFT [8] is shown stepwise manner in given below: Table 1. Task information, UPr and Or for five task Task V 1 V 2 V 3 E ij
UPr
Or
T1
14 16 09 13.00 80.00 I
T2
13 19 18 16.67 46.67 III
T3
15 11 14 13.33 55.00 II
T4
18 13 20 17.00 17.00 IV
T5
21 07 16 14.67 14.67 V
UPr (T5 ) = Texit =E ij = 14.67 and UPr (T4 ) = Texit = E ij = 17. UPr (T 2 ) = 16.67 + max{(13 + 17), (14 + 14.67)} = 46.67, UPr (T 3 ) = 13.33 + max{(27 + 14.67)} = 55, UPr (T 1 ) = 13 + max{(18 + 46.67), (12 + 55)} = 80. After calculation of UPr , the allocation list (order) is prepared as per the value of upward rank as follows: T 1 , T 3 , T 2 , T 4 and T 5 . Now, we begin with VMs selection phase, T 1 will be executed firstly as per allocation list. The avail[V j ] of considered VMs is 0,0,0. Now, EST(T 1 , V 1 ) = EST(T 1 , V 2 ) = EST(T 1 , V 3 ) = 0 and EFT(T 1 , V 1 ) = 14, EFT(T 1 , V 2 ) = 16, EFT(T 1 , V 3 ) = 9. So, T 1 will assign on V 3 as it offer minimum E ij , i.e., min(14, 16, 9) = 9. After assigning T 1 , the next task is T 3 and avail[V j ] of VMs is 0, 0 and 9. Now, we calculate EST(T 3 , V 1 ) = max{0, max(9 + 12)} = 21, EST(T 3 , V 2 ) = max{0, max(9 + 12)} = 21and EST(T 3 , V 3 ) = max{9, max(9 + 0)} = 9, then EFT(T 3 , V 1 ) = 15 + 21 = 36, EFT(T 3 , V 2 ) = 11 + 21 = 32, EFT(T 3 , V 3 ) = 14 + 9 = 23. The minimum EST and EFT of T 3 is onto V 3 , so T 3 will also be assign onto V 3 . Similarly, avail[V j ] for T 2 on VMs is updated as is 0, 0 and 23. Again, we will compute EST and EFT on all VMs for T 2 . EST(T 2 , V 1 ) = max{0, max(9 + 18)} = 27, EST(T 2 , V 2 ) = max{0, max(9 + 18)} = 27 and EST(T 2 , V 3 ) = max{23, max(9 + 0)} = 23 then EFT(T 2 , V 1 ) = 13 + 27 = 40, EFT(T 2 , V 2 ) = 19 + 27 = 46, EFT(T 2 , V 3 ) = 18 + 23 = 41. Thus, T 2 will be assign on V 1 . Again, avail[V j ] for T 4 is 40, 0, and 23. So, EST(T 4 , V 1 ) = max{40, max(40 + 0)} = 40, EST(T 4 , V 2 ) = max{0, max(40 + 13)} = 53 and EST(T 4 , V 3 ) = max{23, max(40 + 13)} = 53, then EFT(T 4 , V 1 ) = 18 + 40 = 58, EFT(T 4 , V 2 ) = 13 + 53 = 66, EFT(T 4 , V 3 ) = 20 + 53 = 73. T 4 will also be assign on V 1 . The updated avail[V j ] for last task T 5 is 58, 0 and 23. EST(T 5 , V 1 ) = max{58, max[(40 + 0), (23 + 27)]} = 58, EST(T 5 , V 2 ) = max{0, max[(40 + 11), (23 + 27)]} = 51 and EST(T 5 , V 3 ) = max{23, max[(40 + 11), (23 + 0)]} = 51 then EFT(T 5 , V 1 ) = 21 + 58 = 79, EFT(T 5 , V 2 ) = 7 + 51 = 58, EFT(T 5 , V 3 ) = 16 + 51 = 67. Finally, T 5 is assigning on V 2 . The EST, EFT, avail[V j ], and VM allocation for all tasks are presented in Table 2. TET is 58 units for HEFT. The same workflow is allocated on the same set of VMs with previously considered configuration by proposed strategy as follows: the first phase is schedule generation that provides the order in which tasks are executed. As per proposed strategy, in this phase, first, we will find tasks with maximum out degree (odmax ) from top-to-bottom and
Table 2. Task allocation as per HEFT

Task | EST (V1, V2, V3) | EFT (V1, V2, V3) | avail[Vj] (V1, V2, V3) | VM allocation
T1   | 0, 0, 0          | 14, 16, 09       | 0, 0, 09               | V3
T3   | 21, 21, 09       | 36, 32, 23       | 0, 0, 23               | V3
T2   | 27, 27, 23       | 40, 46, 41       | 40, 0, 23              | V1
T4   | 40, 53, 53       | 58, 66, 73       | 58, 0, 23              | V1
T5   | 58, 51, 51       | 79, 58, 67       | 58, 58, 23             | V2

TET = 58
The same workflow is now allocated on the same set of VMs, with the previously considered configuration, by the proposed strategy as follows. The first phase is schedule generation, which provides the order in which tasks are executed. In this phase, we first find the tasks with the maximum out-degree (odmax), scanning the given workflow from top to bottom and left to right. From Fig. 1, the schedule is prepared as (T1, T2, T3, T4, T5) while preserving the precedence constraints. Moreover, since T4 and T5 have no out-degree, the larger task is selected first, as mentioned earlier in Sect. 3. After the completion of the first phase, the second phase, viz. VM allocation, selects an appropriate VM and allocates it to the corresponding task for execution; it is more or less the same as in HEFT.

T1 is executed first as per the schedule order. EST(T1, V1) = EST(T1, V2) = EST(T1, V3) = 0 and EFT(T1, V1) = 14, EFT(T1, V2) = 16, EFT(T1, V3) = 9. So, T1 is assigned to V3 as it offers the minimum Eij, i.e., min(14, 16, 9) = 9. After assigning T1, the next task is T2, and avail[Vj] of the VMs is 0, 0 and 9. Now we calculate EST(T2, V1) = max{0, max(9 + 18)} = 27, EST(T2, V2) = max{0, max(9 + 18)} = 27 and EST(T2, V3) = max{9, max(9 + 0)} = 9; then EFT(T2, V1) = 13 + 27 = 40, EFT(T2, V2) = 19 + 27 = 46, EFT(T2, V3) = 18 + 9 = 27. So, T2 is also assigned to V3. Similarly, avail[Vj] for T3 is updated to 0, 0 and 27. Again, we compute EST and EFT on all VMs for T3: EST(T3, V1) = max{0, max(9 + 12)} = 21, EST(T3, V2) = max{0, max(9 + 12)} = 21 and EST(T3, V3) = max{27, max(27 + 0)} = 27; then EFT(T3, V1) = 15 + 21 = 36, EFT(T3, V2) = 11 + 21 = 32, EFT(T3, V3) = 14 + 27 = 41. Thus, T3 is assigned to V2. T4 is larger than T5, so T4 is executed first. Again, avail[Vj] for T4 is 0, 32, and 27. So, EST(T4, V1) = max{0, max(32 + 13)} = 45, EST(T4, V2) = max{32, max(32 + 0)} = 32 and EST(T4, V3) = max{27, max(32 + 13)} = 45; then EFT(T4, V1) = 18 + 45 = 63, EFT(T4, V2) = 13 + 32 = 45, EFT(T4, V3) = 20 + 45 = 65. T4 is also assigned to V2. The updated avail[Vj] for the last task T5 is 0, 45, and 27. EST(T5, V1) = max{0, max[(27 + 14), (32 + 27)]} = 59, EST(T5, V2) = max{45, max[(27 + 14), (32 + 0)]} = 45 and EST(T5, V3) = max{27, max[(27 + 0), (32 + 27)]} = 59; then EFT(T5, V1) = 21 + 59 = 80, EFT(T5, V2) = 7 + 45 = 52, EFT(T5, V3) = 16 + 59 = 75. Finally, T5 is assigned to V2. The EST, EFT, avail[Vj] and VM allocation for all tasks are presented in Table 3. The TET is 52 units for the proposed strategy.
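A minimal sketch of one plausible reading of the schedule-generation phase is given below. The tie-breaking rules (DAG position for equal non-zero out-degrees, larger average execution time for exit tasks) are assumptions inferred from the walkthrough above, not a definitive statement of the authors' algorithm.

```python
# Sketch of the proposed schedule-generation phase (first phase): order ready
# tasks by decreasing out-degree, scanning top-to-bottom / left-to-right, and
# break ties among exit tasks by the larger average execution time.
# The tie-break details are assumptions inferred from the worked example.

out_degree = {"T1": 2, "T2": 2, "T3": 1, "T4": 0, "T5": 0}
position   = {"T1": 0, "T2": 1, "T3": 2, "T4": 3, "T5": 4}  # top-to-bottom, left-to-right
avg_exec   = {"T1": 13.00, "T2": 16.67, "T3": 13.33, "T4": 17.00, "T5": 14.67}
preds      = {"T1": [], "T2": ["T1"], "T3": ["T1"], "T4": ["T2"], "T5": ["T2", "T3"]}

schedule, done = [], set()
while len(schedule) < len(out_degree):
    ready = [t for t in out_degree
             if t not in done and all(p in done for p in preds[t])]
    # higher out-degree first; then earlier DAG position; exit-task ties
    # (out-degree 0) fall back to the larger average execution time
    ready.sort(key=lambda t: (-out_degree[t],
                              -avg_exec[t] if out_degree[t] == 0 else position[t]))
    schedule.append(ready[0])
    done.add(ready[0])

print(schedule)   # -> ['T1', 'T2', 'T3', 'T4', 'T5']
```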
5 Performance Study

In this section, a performance comparison between the proposed strategy and HEFT is presented with the data obtained from the illustration in Sect. 4, where both strategies were worked out stepwise on the same data for better understanding; the results are presented graphically. The illustration assumes a fully heterogeneous cloud environment.
Table 3. Task allocation as per the proposed strategy

Task | EST (V1, V2, V3) | EFT (V1, V2, V3) | avail[Vj] (V1, V2, V3) | VM allocation
T1   | 0, 0, 0          | 14, 16, 9        | 0, 0, 9                | V3
T2   | 27, 27, 9        | 40, 46, 27       | 0, 0, 27               | V3
T3   | 21, 21, 27       | 36, 32, 41       | 0, 32, 27              | V2
T4   | 45, 32, 45       | 63, 45, 65       | 0, 45, 27              | V2
T5   | 59, 45, 59       | 80, 52, 75       | 0, 52, 27              | V2

TET = 52
In Fig. 2, the results of the proposed strategy and HEFT are compared on the performance metric TET for workflows with the following input parameters: number of tasks (m) = 5–50, number of virtual machines (n) = 3, communication cost (Cm) = 9–31, out-degree (Od) = 1–10, and average execution cost (Eij) = 10–30.
Fig. 2. Performance study for the HEFT and proposed strategy
Observations:
• TET increases with the number of tasks for both strategies, as expected and as depicted in Fig. 2.
• The proposed strategy is clearly better than HEFT when the number of tasks is 5, and its performance remains slightly better than HEFT as the number of tasks grows from 10 to 50.
• The proposed strategy gains improvements of 10.34%, 5%, 6.93%, and 9.64% for 5, 10, 25, and 50 tasks executed, respectively.
• The average improvement over HEFT across all four cases is 8.49% on TET.
6 Conclusion

Effective workflow management requires an efficient allocation of tasks to resources over time and is currently the subject of many research projects. An additional layer of complexity arises when single or multiple workflows must be handled. Task allocation in a workflow depends not only on the precedence constraints and properties of the workflow but also on the unpredictable workload generated by other workflows in the heterogeneous cloud environment. In this work, a new approach for workflow allocation that manages precedence constraints on heterogeneous virtual machines in an IaaS cloud and minimizes the execution time of the workflow has been presented. The proposed strategy has two phases, i.e., schedule generation and VM selection. In the first phase, an out-degree-of-vertex-based scheme is used to sort all tasks in the workflow while satisfying the precedence constraints. The VM selection phase finds the best virtual machine for the execution of each task. A performance analysis has been conducted on the basis of the results obtained for the proposed strategy and HEFT on the considered parameter TET. The study reveals the better performance of the proposed strategy (on average 8.49%) over HEFT on TET for the considered set of workflows. In the future, the work can be extended to a large number of workflow tasks with a simulation study, and the same can be tested against other state-of-the-art strategies for better suitability. The work can also be extended to estimate and optimize other QoS parameters.
References 1. I. Foster, Y. Zhao, I. Raicu, S. Lu, Cloud computing and grid computing 360-degree compared. arXiv:0901.0131 (2008) 2. P. Mell, & T. Grance, The NIST definition of cloud computing (2011) 3. E. Deelman, D. Gannon, M. Shields, I. Taylor, Workflows and e-Science: an overview of workflow system features and capabilities. Future Gener. Comput. Syst. 25(5), 528–540 (2009) 4. W. Zheng, R. Sakellariou, Stochastic DAG scheduling using a Monte Carlo approach. J. Parall. Distrib. Comput. 73(12), 1673–1689 (2013) 5. C.H. Papadimitriou, M. Yannakakis, Towards an architecture-independent analysis of parallel algorithms. SIAM J. Comput. 19(2), 322–328 (1990) 6. Y.-K. Kwok, I. Ahmad, Dynamic critical-path scheduling: an effective technique for allocating task graphs onto multiprocessors. IEEE Trans. Parallel Distrib. Syst. 7(5), 506–521 (1996) 7. J.J. Hwang, Y.C. Chow, F.D. Anger, C.Y. Lee, Scheduling precedence graphs in systems with interprocessor communication times. SIAM J. Comput. 18(2), 244–257 (1989) 8. H. Topcuoglu, S. Hariri, M. Wu, Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Trans. Parallel Distrib. Syst. 13(3), 260–274 (2002) 9. M. Shahid, Z. Raza, M. Sajid, Level based batch scheduling strategy with idle slot reduction under DAG constraints for computational grid. J. Syst. Softw. 1(108), 110–133 (2015) 10. V. Dhaka, A review on workload prediction of cloud services. Int. J. Comput. Appl. 109(9), 1–4 (2015)
Artificial Neural Network Analysis for Predicting Spatial Patterns of Urbanization in India Arpana Chaudhary1(B) , Chetna Soni1 , Chilka Sharma1 , and P. K. Joshi2 1 Banasthali Vidyapith, Banasthali 304022, India [email protected], [email protected], [email protected] 2 Jawaharlal Nehru University, Delhi, India [email protected]
Abstract. Modeling with an integration of artificial neural network (ANN) and geographical information system (GIS) is one of the newly emerging fields in spatial science. This ensures the analysis of temporal datasets for trend analysis and forecasting. This article elaborates on the effectiveness of ANN in a GIS environment to predict locations of urban change at the country level by testing the land transformation model (LTM). The LTM is a land use land cover change (LULC) change model that simulates local-scale patterns and uses a series of batch routines to learn spatial patterns in the data. We used the multi-temporal night-time light (NTL) data provided by the US Air Force Defence Meteorological Satellites Program/Operational Line Scan (DMSP/OLS) system. An iterative unsupervised classification method was applied to the data for the years 1995 and 2013 to map urbanization dynamics in India. The analysis includes information on transportation (roads, highways, and railways), built-up (residential area), water bodies (rivers, lakes), and slope as factors influencing urbanization patterns in the country. The results present a forecast of urban growth for the years 2025 and 2050. While running these processes, percent correct metric (PCM) and kappa statistics are being generated and a collective map of urban growth probability is obtained. The multi-level influences of LULC changes from which future urban patterns can be predicted and are observed. This shows that the integration of the ANN and GIS environment can create a plethora of opportunities for the decision makers in improving urban planning. Finally, the results obtained have been assimilated and amalgamated. Keywords: Artificial neural network · Land transformation model · Land use land cover change · DMSP/OLS · Night-time light · Urban growth
1 Introduction

Among all the living species on the planet Earth, human beings are the only species to have modified the planet’s terrestrial surface to such a great extent. This modification as
a land use land cover (LULC) change is known to occur in a very haphazard manner [1]. There are various factors that affect LULC changes, and the ultimate result is a complicated interaction of all these factors. Due to such changes, the sustainability of urban centers and their development has recently become a main focus [2]. The present article demonstrates and helps in understanding and analyzing the impacts of the various parameters/variables that affect and control the urbanization process and patterns of growth in India. These parameters have driving variables which have been used as input in the land transformation model (LTM). However, urban development forecasting remains challenging and could benefit from further research on spatial metrics and their integration into the model calibration process [3]. With the shift toward the rapidly developing market economies of India, the old growth and development paradigm has given birth to new possibilities and challenges as far as the rural–urban divide is concerned [4]. Seemingly unimportant areas have come up and, in some stages, have taken center stage, as the growth of urban areas is confronted with the inadequate and antiquated infrastructure base of the country [5, 6].

Table 1. PCM and kappa values for each state

S. No. | State            | PCM       | Kappa    | S. No. | State       | PCM       | Kappa
1      | Delhi & Haryana  | 58.544955 | 0.570215 | 10     | Orissa      | 40.170940 | 0.570215
2      | Punjab           | 38.335358 | 0.360655 | 11     | Gujarat     | 25.317528 | 0.360655
3      | HP & Uttaranchal | 42.211055 | 0.419179 | 12     | MP 1        | 41.666667 | 0.419179
4      | Uttar Pradesh    | 32.759807 | 0.323455 | 13     | MP 2        | 36.796537 | 0.323455
5      | Rajasthan        | 29.101958 | 0.287951 | 14     | Maharashtra | 40.382572 | 0.287951
6      | WB               | 85.035971 | 0.162383 | 15     | AP          | 48.382688 | 0.162383
7      | Bihar            | 34.708738 | 0.345840 | 16     | Karnataka   | 38.181818 | 0.345840
8      | North East       | 32.042254 | 0.318214 | 17     | Tamil Nadu  | 38.987972 | 0.318214
9      | Chhattisgarh     | 42.927632 | 0.426970 | 18     | Kerala      | 35.449735 | 0.426970
Artificial neural networks (ANN) are very significant in computing as well as in modeling multi-faceted behavior and patterns [7]. Urban evolution is no doubt an intricate phenomenon in which multiple variables interact nonlinearly with each other at the micro level; consequently, the application of ANN to model urban development is quite appropriate. Unlike the commonly used analytical approaches, the ANN offers many advantages [8, 9]. Here, the land transformation model developed by the HEMA laboratory of Purdue University was used. The multilayer perceptron (MLP) uses algorithms to compute weights for each input value at their respective nodes as the information is fed into the system in a “feed-forward” style [10]. Each time, data are fed forward and then back-propagated iteratively through the system (known as a cycle)
so that errors are minimized [11]. The concluding phase, called testing, is accomplished by running the neural network with the trained network files, without known output data, to gauge the efficiency of the network’s pattern learning. The urban outputs are binary values that show change (1) and no change (0). Input values are fed through hidden layers consisting of nodes equivalent to the inputs [12, 13]. The delta rule is then applied to all values across succeeding passes of the data; these passes are known as cycles, and when these cycles run continuously, the process is called training [14, 15], through which observed changes over time are analyzed [16]. With rapid economic growth and the emergence of the service sector as a more favored and diverse employer compared to the agricultural and production sectors of the economy, the trend toward urban living has increased rapidly since the late 1990s [17]. The overarching aim of the article is to monitor cities/towns and areas, as they are important for analyzing development patterns and their impacts on surrounding city growth. This eases proper planning, the updating of earlier plans, and corrective measures for any natural or planning shortfall. The article illustrates how the combined use of geographical information system (GIS) and artificial neural network (ANN)-based spatial analysis can aid in studying the nature and quantification of urban expansion and changing land-use characteristics.
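As a rough illustration of the feed-forward/back-propagation scheme described above, the sketch below trains a small MLP to output the binary change/no-change label from six predictor variables. It is only a hedged sketch: the layer size, solver settings and the synthetic data are assumptions made for illustration, not the actual LTM configuration.

```python
# Hedged sketch of an MLP predicting binary urban change (1) / no change (0)
# from six predictor variables; layer size and data are illustrative only.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((1000, 6))                          # six predictor rasters flattened to rows
y = (X[:, 0] + 0.5 * X[:, 1] > 0.9).astype(int)    # synthetic change labels

# One hidden layer with six nodes (mirroring the "nodes equivalent to the
# inputs" description); trained iteratively over many cycles (epochs).
mlp = MLPClassifier(hidden_layer_sizes=(6,), max_iter=1000, random_state=0)
mlp.fit(X, y)

prob_change = mlp.predict_proba(X)[:, 1]           # per-cell urban-growth probability
print("training accuracy:", mlp.score(X, y))
```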
2 Study Area

For this work, our study area is India, located at 20.59° N latitude and 78.96° E longitude in Asia. The total extent of India measures 3214 km from north to south and 2933 km from east to west. The total population recorded in the study area was 1.14 billion in 2005, which increased to 1.23 billion in 2010 and, most recently, to 1.36 billion in 2019. Some studies have pointed out that in the next few decades, up to 90% of India’s population could be residing in purely urban and suburban areas while depending upon the service sector as the major employer [18] (Fig. 1).
3 Methodology

The present study is intended to develop a holistic approach for the spatial planning of urban and surrounding areas and to develop an action plan. For this purpose, a series of working methodologies is adopted for urban development [19]. The use of the ANN and LTM models depends upon the predictor variables defined by the modeler [20]. These variables can be considered with certain rules and limitations [21] (Fig. 2). The time-series data of Defence Meteorological Satellite Program/Operational Linescan System (DMSP/OLS) stable night-time light (NTL) imagery (spanning the years 1992–2013) were acquired from the National Oceanic and Atmospheric Administration (NOAA) National Geophysical Data Center. These image products provide gridded, cell-based, annual cloud-free composited NTL with a digital number (DN) ranging from 0 to 63 (6-bit data). The DMSP/OLS NTL data of 1995, 2005 and 2013 were obtained by three individual sensors: F12 (1994–1999), F15 (2000–2007) and F18 (2010–2013). The DMSP/OLS NTL images have a spatial resolution of 30 arc-seconds under the World
Fig. 1. Study area map
Geodetic System (WGS-84) coordinate reference system. The preparation of the variables for such analysis needs executable (*.exe) files like createnet (network file), createpattern (pattern file), batch files, etc. The network file gives the network structure to the neural network, the pattern file reads the pattern of the input data and gives the result accordingly, and the batch file provides training to the data. Input files, syntax: createnet 6 6 1 ltm.net; pattern, syntax: createpattern.6.5 inputfile.txt v. Batchman files have been used as input layers for the analysis, and input files have been used for forecasting. The real change map has been created by the NN in ASCII format and further converted into a supported ArcGIS format. Distance from water bodies, such as the various streams, rivulets, drains, and a few major rivers passing through the area of interest, has been taken. Distance to the urban areas of 1995, 2005, and 2013 captures newer settlements that are to develop as suburbs; they require infrastructure-based services like power, sewage management systems, etc. Distance from roads represents the backbone of connectivity infrastructure, and distance from railways the backbone of long-distance and heavy bulk transport over land within the country. The slope of an area determines the gradient of the land; an excessive slope implies that the terrain is hilly and hence unsuitable for large-size urban development. The predictor variables are shown in Fig. 3. There are some locations where, due to some constraints (e.g., reserved parks or green areas), urban expansion is not possible or allowed. By using LTM, we can block out such locations. Exclusionary layers have been created by excluding the existing urban area and open water bodies.
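Since several of the predictor variables are distance surfaces (distance to roads, railways, water bodies and existing urban cells), a small sketch of how such a raster can be derived is given below; the Euclidean distance transform, the example array and the cell size are illustrative assumptions, not the exact GIS workflow used in the study.

```python
# Hedged sketch: deriving a "distance to feature" predictor raster from a
# binary feature mask (1 = road/river/urban cell, 0 = background).
import numpy as np
from scipy.ndimage import distance_transform_edt

feature_mask = np.zeros((8, 8), dtype=np.uint8)
feature_mask[3, :] = 1                      # e.g., a road running east-west

# distance_transform_edt measures distance to the nearest zero, so invert
# the mask to get distance to the nearest feature cell (in pixel units;
# multiply by the cell size to convert to metres).
cell_size_m = 1000.0                        # assumed ~1 km DMSP/OLS-like grid
dist_pixels = distance_transform_edt(feature_mask == 0)
dist_metres = dist_pixels * cell_size_m
print(dist_metres[:, 0])                    # distances down one column
```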
Fig. 2. Methodology chart
Fig. 3. Maps of six predictor variables used for processing
4 Validation

For each simulation epoch, LTM generates computed outcomes, i.e., the PCM and kappa statistics; the PCM is computed as the ratio of the number of true-positive cells (TP) to the total number of cells changing over the study period (RP). The values from the contingency table are used to calculate kappa using the following equation:

K = (P(A) − P(E)) / (1 − P(E))    (1)
In Eq. (1), P(A) is the fraction of agreement (sensitivity coefficient) and P(E) is the expected fraction of agreement given the observed distribution. The assessment of sites and the number of cells falling into each specific class of predictor variables, along with their true depiction on the satellite imagery, represents a very location-specific ground situation which could have influenced the way the model behaved [22]. True positives mostly fall inside solid urban parts, false positives tend to fall at the borders of townships, and false negatives tend to be spread all over the country along urban corridors. True positives (TP): most TP cells fall deep within already urbanized areas, i.e., towns and villages. False positives (FP): cells that show change when, in fact, the change did not actually occur; the results reveal false-positive cells falling into thin sectors along the margins of already developed areas. False negatives (FN): the exact opposite of false positives, these are areas where urban growth did occur but the model failed to predict it (Fig. 4).
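A minimal sketch of how PCM and kappa can be computed from a 2 × 2 contingency table of predicted versus observed change is shown below; the cell counts are made-up values, and the PCM definition follows the TP/RP ratio described above.

```python
# Hedged sketch: PCM and kappa from a 2x2 contingency table of
# predicted vs. observed urban change (illustrative counts only).
import numpy as np

# rows: observed (no change, change); columns: predicted (no change, change)
table = np.array([[800, 60],
                  [70, 70]], dtype=float)

total = table.sum()
tp = table[1, 1]                      # correctly predicted change cells
rp = table[1, :].sum()                # cells that actually changed (RP)
pcm = 100.0 * tp / rp                 # percent correct metric, as a percentage

p_a = np.trace(table) / total         # observed fraction of agreement P(A)
p_e = (table.sum(axis=0) * table.sum(axis=1)).sum() / total**2  # expected agreement P(E)
kappa = (p_a - p_e) / (1 - p_e)       # Eq. (1)

print(f"PCM = {pcm:.2f}%, kappa = {kappa:.3f}")
```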
Fig. 4. Contingency table comparing LTM with real change using prediction categories from Table 1 and the map showing LTM predictor categories
5 Results and Conclusions

Firstly, the neural network generated the real change map in ASCII format; to read that file, we convert the ASCII file into a raster by using the ArcGIS
conversion tool. This file shows the increase in urban area from 1995 to 2005 and from 2005 to 2013, and hence tells the neural network that it has to exclude that (urban) area while processing and predict urban growth beyond it, as no urban growth can take place in an already urbanized area. Then, the training process starts and generates a number of ASCII files which are used for further processing (Fig. 5).
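The ASCII-grid-to-raster conversion mentioned above can also be scripted; the snippet below is a hedged sketch using GDAL rather than the ArcGIS conversion tool used in the study, and the file names are placeholders.

```python
# Hedged sketch: converting an ASCII grid produced by the neural network
# into a GeoTIFF raster (file names are placeholders; the study used the
# ArcGIS conversion tool instead of GDAL).
from osgeo import gdal

src = "ltm_real_change.asc"     # ASCII grid written by the NN / LTM batch run
dst = "ltm_real_change.tif"

gdal.UseExceptions()
gdal.Translate(dst, src, outputSRS="EPSG:4326")  # assign WGS-84, as in the study
```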
Fig. 5. Map showing predicted urban for the year 2025 and 2050 categories
The inability of the model to predict alteration in these areas suggests to a planner/modeler that many new drivers can be added as per the requirements and the data available. The LTM has also generated a value map, known as a categorical error map, showing how accurately the model has predicted, by representing correctly predicted cells as “true” (blue and green) and incorrectly predicted cells as “false” (orange and red). Various PCM and kappa values are obtained while running the LTM; these values are produced after every 100th cycle of the neural network, and we consider only the file (training cycle) which gives the best PCM and kappa values. Major urban growth is also a component of expansion along roads: most of the new urban growth has taken place near major connecting roads and already urbanized areas. Although the model showed a satisfactory capability to identify the transformation expected to occur in zones enclosed by true-positive rings, in all likelihood it will never be able to flawlessly forecast the site of a big expansion using the processing units incorporated in this study. Hence, according to the model results, agricultural land can suffer the extreme impact of urbanization. Although there can be many other environmental effects of rapid and large-scale urban development, the most obvious and immediate effect would be land-use change.
References 1. B.C. Pijanowski, A. Tayyebi, J. Doucette, B.K. Pekin, D. Braun, J. Plourde, A big data urban growth simulation at a national scale: configuring the GIS and neural network based land transformation model to run in a high performance computing (HPC) environment. Environ. Model Softw. 51, 250–268 (2014) 2. H. Omrani, A. Tayyebi, B. Pijanowski, Integrating the Multi-Label Land Use Concept and Cellular Automata with the ANN-based Land Transformation Model, pp. 208–210
3. M. Herold, N.C. Goldstein, K.C. Clarke, The spatiotemporal form of urban growth: measurement, analysis and modeling. Remote Sens. Environ. 86(3), 286–302 (2003) 4. A. Amin, S. Fazal, Quantification of land transformation using remote sensing and GIS techniques. Am. J. Geogr. Inf. Syst. 1(2), 17–28 (2012) 5. H. Omrani, A. Tayyebi, B. Pijanowski, Integrating the multi-label land-use concept and cellular automata with the artificial neural network-based land transformation model: an integrated ML-CA-LTM modeling framework. GISci. Remote Sens. 54(3), 283–304 (2017) 6. E.G. Irwin, J. Geoghegan, Theory, data, methods: developing spatially explicit economic models of land use change. Agric. Ecosyst. Environ. 85(1–3), 7–23 (2001) 7. G. Grekousis, P. Manetos, Y.N. Photis, Modeling urban evolution using neural networks, fuzzy logic and GIS: the case of the athens metropolitan area. Cities 30(1), 193–203 (2013) 8. M.L. Fitzpatrick, D.T. Long, B.C. Pijanowski, Exploring the effects of urban and agricultural land use on surface water chemistry, across a regional watershed, using multivariate statistics. Appl. Geochem. 22(8), 1825–1840 (2007) 9. L.R. Jitendrudu, Modeling Dynamics of Urban Growth using Remote Sensing and Geographical Information Systems, in GIS Eng. Al-Salam Ind. Trading Establ. PO 17176 Jeddah 21484, KSA (2010) 10. B.C. Pijanowski, S. Pithadia, B.A. Shellito, K. Alexandridis, Calibrating a neural networkbased urban change model for two metropolitan areas of the Upper Midwest of the United States. Int. J. Geogr. Inf. Sci. 19(2), 197–215 (2005) 11. A. Tayyebi, B.C. Pijanowski, A.H. Tayyebi, An urban growth boundary model using neural networks, GIS and radial parameterization: an application to Tehran, Iran. Landsc. Urban Plan. 100(1–2), 35–44 (2011) 12. A. Tayyebi, B.K. Pekin, B.C. Pijanowski, J.D. Plourde, J.S. Doucette, D. Braun, Hierarchical modeling of urban growth across the conterminous USA: developing meso-scale quantity drivers for the Land Transformation Model. J. Land Use Sci. 1–21 (2012) 13. R.G. Pontius, D. Huffaker, K. Denman, Useful techniques of validation for spatially explicit land-change models. Ecol. Modell. 179, 445–461 (2004) 14. K. Alexandridis, B.B. Pijanowski, Spatially-explicit bayesian information entropy metrics for calibrating landscape transformation models. Entropy 15, 2480–2509 (2013) 15. O. Oyebode, Application of GIS and land use models—artificial neural network based land transformation model for future land use forecast and effects of urbanization within the Vermillion River Watershed Olaniyi Oyebode 9 (2007) 16. R.G. Pontius Jr., K. Batchu, Using the relative operating characteristic to quantify certainty in prediction of location of land cover change in India. Trans. GIS 7(4), 467–484 (2003) 17. B.T. Bestelmeyer et al., Desertification, land use, and the transformation of global drylands. Front. Ecol. Environ. 13(1), 28–36 (2015) 18. R.B. Bhagat, UNITED NATIONS EXPERT GROUP MEETING ON Migration and Urban Transition in India : implications for development migration and urban transition in india : implications for development, Sept 2017 19. D.K. Ray, J.M. Duckles, B.C. Pijanowski, The impact of future land use scenarios on runoff volumes in the Muskegon River Watershed. Environ. Manag. 46(3), 351–366 (2010) 20. A. Tayyebi, B.C. Pijanowski, Modeling multiple land use changes using ANN, CART and MARS: Comparing tradeoffs in goodness of fit and explanatory power of data mining tools. Int. J. Appl. Earth Obs. Geoinf. 28(1), 102–116 (2014) 21. A. 
Radmehr, S. Araghinejad, Developing strategies for urban flood management of Tehran City Using SMCDM and ANN. J. Comput. Civ. Eng. 28(6), 05014006 (2014) 22. A.G. Yeh, X. Li, ACSG Table of contents Table des matières Authors index Index des auteurs Urban Simulation using neural networks and cellular automata for land use planning. Neural Networks
Satellite Radar Interferometry for DEM Generation Using Sentinel-1A Imagery Chetna Soni(B) , Arpana Chaudhary, Uma Sharma, and Chilka Sharma Banasthali Vidyapith, Banasthali 304022, India [email protected], [email protected], [email protected], [email protected]
Abstract. Interferometry technique generates an elevation model using interferometric image pair acquired by synthetic-aperture radar (SAR). The present article has investigated the potential of Sentinel-1 SAR imageries for topographic analysis. Interferometric SAR utilizes phase difference information from complexvalued interferometric SAR images captured at two different imaging positions. Extracted topographic information is highly useful in various applications like crustal deformation, glacial movement, deformation studies, and topographical analysis. Various satellite systems such as RADARSAT, ERS, TerraSAR-X, ALOS PALSAR, and Sentinel-1 acquire interferometric images. The present article examines the DEM generation using the interferometry method. Sentinel-1A satellite datasets have been used to generate digital elevation model (DEM) for Tonk district and surrounding area. The study area comprises various land cover features including built-up, agriculture land, water bodies, barren land, and scrubland. SNAP toolbox has been used to generate the DEM using Sentinel-1A interferometric wide swath (IW) in single look complex (SLC) image format. The DEM generation process includes baseline estimation, co-registration, interferogram generation, coherence, interferogram filtering, flattening, phase unwrapping, phase to height conversion, orbital refinement, and geocoding followed by generation of digital elevation model. Visual interpretation of derived DEM has been carried out using Google Earth. Coherence influences the accuracy of generated DEM. The quality of coherence depends on the baseline, wavelength, and temporal resolution of the interferometric pair. Pixels having coherence values greater than 0.5 have shown elevation values near to SRTM DEM values. Keywords: Interferometry · Sentinel-1A · Interferometric wide swath · Digital elevation model · Terrain observation by progressive scans synthetic aperture radar (TOPS SAR)
1 Introduction

Microwave remote sensing covers a wide range of the electromagnetic spectrum, from 1 mm to 1 m. In active microwave remote sensing, the sensor transmits a signal through the atmosphere and receives the backscattered signal from surface materials [1]. Microwave remote sensing
serves all-weather capabilities. All-day and night imaging is possible with microwave sensors. The microwave spectrum range penetrates through clouds, smoke, and haze. Due to the availability of a wide range of wavelengths, penetration into the surface like vegetation, sand, snow cover surface is possible. Microwave wavelength is able to measure surface roughness, dielectric constant, and moisture content. Airborne- and space-bornebased microwave remote sensing sensors are available with multi-frequency, polarimetric, and high-resolution features. TerraSAR-X, ENVISAT ASAR, SIR-C/X-SAR, ALOS PALSAR, RADARSAT-1 and 2, RISAT-1, RISAT-2, and Sentinel-1 capture the earth surface in the microwave range. Active microwave sensors are widely used in observations of land features such as agriculture, forest, built-up, ocean, snow, and ice. A digital terrain model (DTM) is a digital representation of the elevation of the earth’s surface. Topographic analysis can also be carried out using radargrammetry and interferometry techniques with SAR datasets [2]. The launch of ERS-1 satellite in 1991 significantly promoted the development of InSAR techniques and its applications [3]. Interferometry is a technique to generate high-quality elevation models using SAR imageries. In SAR image, phase information is available along with the backscattered signal. The interferometry process uses that phase information to derive a 3-D dimension. InSAR images can be acquired either from single-pass SAR interferometry in which two antennas on the same space platform are separated perpendicularly to the flight direction (azimuth direction) or from repeat-pass interferometry in which two images are acquired in consecutive passes of the same SAR antenna. The generation of SAR interferometry requires the coherent combination of the echoes recorded by the SAR sensor in order to properly focus all the targets within the image [4, 5]. The interferometry process is accomplished using at least two images (master and slave image), An interferogram image is acquired by interfering with the phase of two SAR images of the same terrain [6]. The difference between phase angles of two coherent images gives phase inference [7, 8]. To get the phase difference from two coherent images, the difference in path length is to be found out from the target. A comparison of two images of the same region using this inference can provide latitude, longitude, and altitude (height) of any point in the 3-D dimension. The phase information of two coherent images give phase difference, an interferogram in which a fringe pattern appears. [9] developed two algorithms to correct mis-registration in squint image. One algorithm corrects the phase ramp and the second one estimates the mis-registration between two interferometry images. InSAR technique is widely used in applications such as surface displacements, land topography, land subsidence/uplift, water levels, soil moisture, snow accumulation analysis, and stem volume of forest. Many researchers have used interferometry process on applications such as terrestrial and atmospheric mapping, crop height mapping, Polar Regions, soil water relationships, building height mapping [10–13]. Most of the researchers have given interferometry principles and processing steps using different datasets, i.e., ERS-1/2, TANDEM-X, and software such as DORIS InSAR processing, SARscape, Sentinel ToolBox (SNAP) [14–16]. 
Various investigation shows the use of polarimetric interferometry in different applications to derive the height of objects [17– 19]. Processing is an important aspect to generate the DEM using the interferometry method. Various platforms are available to generate DEM. The accuracy of DEM generated from the interferometry process depends on the quality of coherence. Extreme
events such as heavy rainfall during SAR data acquisition affect the backscattering signal due to variation in moisture content. Baseline, wavelength, and time span control the quality of coherence. Weather condition such as rainfall and wind conditions affects the coherence. An accuracy assessment of DEM generated from interferometry pair in spotlight mode has been carried out by using different multi-looking parameters [21]. Testing of the interferometry pair of TANDEM-X over Mumbai and surrounding areas has been carried out by generating DEM. An accuracy assessment of the derived DEM has been carried out by comparing the elevation values with Cartosat-1 stereo optical image [22]. The quality of the generated DEM gets affected by extreme weather events. Weather effects on DEM derived from InSAR process have been observed [8]. The present article utilizes Sentinel-1A satellite datasets to generate the topography of the study area through interferometry technique by using open-source software SNAP. The prospective of the Sentinel-1A satellite images into the generation of the DEM has been explored.
2 Study Area and Data Used Rajasthan is the biggest state of India in terms of area and also known for the Thar Desert. The study area covers the Tonk district and its surroundings. The study area comprises land cover features like agriculture lands, scrublands, built-up areas, etc. The average altitude of the Tonk district above the mean sea level is between 260 and 300 m. The average rainfall and temperature of Tonk are 647 mm and 26 °C, respectively. Sentinel1 mission is a collaborative initiative of the European Space Agency (ESA) and the European Commission (EC). The Sentinel-1A space mission was launched on April 3, 2014, by the European Space Agency of the Copernicus Programme. The mission is European Radar Observatory for the Copernicus and the constellation has two satellites, Sentinel-1A and Sentinel-1B [23]. The satellites are intended to work in pre-programed, global coverage, and conflict-free operation mode while sharing the same orbit. Satellite is designed in C-band to capture earth surface in four imaging modes with varying resolution and coverage area that provides reliable and repeated wide-area monitoring datasets. The mission has the capability of imaging in dual-polarization with short revisit (6/12 days). The Sentinel-1 space mission provides wider coverage than existing SAR sensors [24, 25]. The present article has used Sentinel-1A data in IW mode with single look complex (SLC) format. Datasets have been acquired on October 23, 2017, and November 4, 2017, for region lying in Tonk district, Rajasthan. The study area has been shown in Fig. 1.
3 Methodology The article presents the generation of DEM using an interferometry technique. In a single-pass or one-time pass, the satellite has two aperture antennas on the same platform. Shuttle Radar Topographic Mission SRTM has used two antennas on the same platform. In the repeat pass, the temporal visit of the satellite is achieved. TerraSAR-X, ERS is across-track repeat-pass satellites that operate in ping pong mode. The separation in repeat pass is reserved to capture the imaging geometry. Sentinel-1 satellites have 12 days
Fig. 1. Study area map
with its own sensor and 6 days with the Sentinel-1 (A/B) pair. To extract meaningful elevation values of the landscape, phase unwrapping is accomplished. Phase unwrapping starts from the near swath edge because the minimum phase difference is present around those pixels. Across the range, the phase difference is tracked, and whenever a 2π phase jump or discontinuity is encountered, 2π is added or subtracted. After unwrapping, the unwrapped phase is related to the actual topography of the landscape. Ground control points (GCPs) are used to measure the phase associated with elevation and to calibrate the phase in terms of elevation. The methodology for the interferometry process is depicted in Fig. 2.
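The 2π-jump correction can be illustrated with a one-dimensional toy example; the profile below is synthetic and only demonstrates the unwrapping rule, not the SNAP unwrapping applied in this study.

```python
# Toy illustration of phase unwrapping: wrapped phase jumps of 2*pi are
# removed so the profile grows smoothly (synthetic 1-D data only).
import numpy as np

true_phase = np.linspace(0, 6 * np.pi, 50)        # smoothly increasing phase
wrapped = np.angle(np.exp(1j * true_phase))       # wrapped into (-pi, pi]
unwrapped = np.unwrap(wrapped)                    # add/subtract 2*pi at jumps

print(np.allclose(unwrapped, true_phase))         # -> True for this synthetic profile
```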
Fig. 2. Methodology chart
4 Results and Discussions

Range and azimuth show a shift during the co-registration of the master and slave orbits. The Doppler centroid difference (fD) gives the difference between the centroids of the master and slave orbits. The shift in the Doppler centroid also defines the suitability of the master and slave pair for the interferometric process; when the side-look angle of the satellite is 90°, it gives a zero Doppler centroid difference. The co-registration process is carried out before the generation of the complex interferogram. The master and slave image pairs have been registered with sub-pixel accuracy using the satellite orbital parameters. To reduce noise in the interferogram, multi-looking has been done with a 6 × 1 m (range × azimuth) resolution. The master and slave images are shown in Fig. 3.
Fig. 3. Master (October 23, 2017) and slave image (November 4, 2017)
The interferogram is generated by multiplying the complex pixels of one image with the complex conjugate of the corresponding pixels in the other co-registered image. The interferogram generation process requires the master image, the slave image, and a reference DEM; the SRTM 90 m DEM has been taken as the reference for the interferometry process. The range of the interferogram phase is −π to +π. Interferogram flattening is done to remove other factors and low-frequency components, such as noise, from the interferogram; in the flattened interferogram, the fringes are reduced. Flattening takes the phase difference between the constant phase and the phase of the known topography. Once the interferogram and coherence have been estimated, the interferogram is filtered to reduce the phase noise introduced by temporal or baseline-related de-correlation; the Goldstein filtering method has been adopted for this purpose. The generated interferogram and the deburst interferogram are shown in Fig. 4. Coherence is generated while smoothing the phase in the interferogram, and incoherent areas are filtered more rigorously than coherent areas. The quality of the interferogram is derived from the coherence information. Coherence is a measure of the pixel-to-pixel SNR; its value is very low for cropland and water bodies. The coherence generated from the filtered interferogram is shown in Fig. 5.
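The complex-conjugate multiplication and the coherence estimate can be sketched in a few lines of array code; the small random patches below stand in for co-registered master and slave SLC tiles, and the 5 × 5 estimation window is an assumption rather than the SNAP default.

```python
# Hedged sketch: interferogram formation and coherence estimation from two
# co-registered complex (SLC) patches; random data stand in for real tiles.
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(1)
shape = (64, 64)
master = rng.normal(size=shape) + 1j * rng.normal(size=shape)
slave = rng.normal(size=shape) + 1j * rng.normal(size=shape)

# Interferogram: master times the complex conjugate of slave; its angle is
# the wrapped phase difference in (-pi, pi].
interferogram = master * np.conj(slave)
phase = np.angle(interferogram)

# Coherence: |<m s*>| / sqrt(<|m|^2> <|s|^2>) over a local window (5x5 here).
def local_mean(a, size=5):
    return uniform_filter(a.real, size) + 1j * uniform_filter(a.imag, size)

num = np.abs(local_mean(interferogram))
den = np.sqrt(uniform_filter(np.abs(master) ** 2, 5) *
              uniform_filter(np.abs(slave) ** 2, 5))
coherence = num / den
print(phase.shape, float(coherence.mean()))   # low coherence for random data
```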
Fig. 4. Interferogram (left) and deburst interferogram (right)
Fig. 5. Coherence image
SNAP supports phase unwrapping for Sentinel-1 datasets. Conversion of phase to height has been accomplished in order to generate the DEM. Parameters such as the coherence threshold, interpolation method, interpolation window size, mean window size, number of wavelet levels, and grid size play an important role in the phase-to-height conversion. The accuracy of the generated DEM is influenced by coherence. Built-up and hilly areas have shown high coherence values, whereas crops, fallow land, and water bodies have low coherence values (0.1–0.4), resulting in elevation accuracy of up to 30 m. The DEM is geocoded after the phase-to-height conversion. The accuracy of the generated DEM depends on the coherence image. High coherence values can be achieved by opting for the minimum revisit interval, since a short temporal baseline reduces the chances of de-correlation in the master and slave interferometric pair. The generated pair has shown very low coherence due to the presence of crops in the study area; the images were acquired at a 12-day interval. The generated DEM has shown smaller elevation errors in high-coherence areas. The coherence of the master and slave images has been correlated with soil moisture, and the backscattering values have shown variation with changes in soil moisture content. The elevation value is highly variable in low-coherence areas. Though the temporal baseline was only 12 days, the elevation accuracy is still not good in all areas. Google Earth has been used for visual
interpretation of the DEM. The visual interpretation of the generated DEM has been carried out and is shown in Fig. 6.
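For orientation, the textbook repeat-pass relation between unwrapped topographic phase and height, h ≈ λ R sin θ · Δφ / (4π B⊥), can be evaluated as below; the wavelength is Sentinel-1's C-band value, while the slant range, incidence angle and perpendicular baseline are illustrative assumptions, not parameters of the actual scene pair used here.

```python
# Hedged sketch of the standard repeat-pass phase-to-height relation
#   h = lambda * R * sin(theta) * d_phi / (4 * pi * B_perp)
# Wavelength is Sentinel-1 C-band; the other parameters are assumed values.
import math

wavelength = 0.0555              # m, Sentinel-1 C-band
slant_range = 850e3              # m, assumed slant range
incidence = math.radians(39.0)   # assumed mid-swath incidence angle
b_perp = 100.0                   # m, assumed perpendicular baseline
d_phi = 2 * math.pi              # one full unwrapped fringe

height_per_fringe = (wavelength * slant_range * math.sin(incidence) * d_phi
                     / (4 * math.pi * b_perp))
print(f"height per 2*pi fringe ~ {height_per_fringe:.1f} m")
```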
Fig. 6. Resultant DEM with zoomed view on Google Earth
The Sentinel-1 mission-based satellite imagery has given an opportunity to explore the data for various applications. The present article has explored the Sentinel-1A TOPSAR IW mode to generate the DEM of the Tonk district, Rajasthan. Though the resultant DEM has shown variation in elevation values, it can be further corrected. Apart from the globally available DEMs such as SRTM and ASTER, Sentinel-1 (A/B) satellite-based imagery can be the next best option for generating a DEM. Both satellites can be used in an integrated way to obtain better topography where changes are very frequent, as there is a 6-day revisit between Sentinel-1A and Sentinel-1B.
References 1. T.M. Lillesand, R.W. Kiefer, Remote Sensing and Image Interpretation, p. 287 (1999) 2. M.A. Richards, A beginner’s guide to interferometric SAR concepts and signal processing [AESS tutorial IV]. IEEE Aerosp. Electron. Syst. Mag. 22(9), 5–29 (2007) 3. Z. Lu, O. Kwoun, R. Rykhus, Interferometric synthetic aperture radar (InSAR): its past, present and future by how InSAR works. Photogramm. Eng. Remote Sens. 73(May), 217–221 (2007) 4. R. Bamler, P. Hartl, Synthetic aperture radar interferometry. Inverse Probl. 14, 1–54 (1999) 5. M. Eineder et al., SAR interferometry with TERRASAR-X. Eur. Sp. Agency Special Publ. ESA SP 550, 289–294 (2004) 6. O.H. Sahraoui, B. Hassaine, C. Serief, Radar Interferometry with Sarscape Software, pp. 1–10 (2006)
7. X. Huang, H. Xie, T. Liang, D. Yi, S. Antonio, C.S. Branch, Estimating vertical error of SRTM and map-based DEMs using ICESat altimetry data in the eastern Tibetan Plateau. Int. J. Remote, 37–41 (2011) 8. J.H. Yu, X. Li, L. Ge, H. Chang, S.I. Systems, Radargrammetry and Interferometry Sar for Dem, in 15th Australasian Remote Sensing & Photogrammetry Conference, pp. 1212–1223 (2010) 9. M. Bara, S. Member, R. Scheiber, A. Broquetas, A. Moreira, S. Member, Interferometric SAR signal analysis in the presence of squint 38(5), 2164–2178 (2000) 10. R.F. Hanssen, R. Klees, Applications of SAR interferometry in terrestrial and atmospheric mapping. Work. Proc. Eur. Microw. Conf. Amsterdam 1, 43–50 (1998) 11. D.C.E.C.X. Zhou, M.S. Liao, Application of SAR interferometry on DEM generation of the grove mountains. Photogramm. Eng. Remote Sensing 70(10), 1145–1149 (2004) 12. B.T Brake, R.F. Hanssen, M.J. van der Ploeg, G.H. de Rooij (2013) Satellite-based radar interferometry to estimate large-scale soil water depletion from clay shrinkage: possibilities and limitations. Vadose Zo. J. 12(3) 13. S. Lippl, S. Vijay, M. Braun, Automatic delineation of debris-covered glaciers using InSAR coherence derived from X-, C- and L-band radar data: a case study of Yazgyl Glacier. J. Glaciol. 64(247), 811–821 (2018) 14. K. Padia, D. Mankad, S. Chowdhury, K.L. Majumder, Digital elevation model generation using cross-track SAR-interferometry technique (2002) 15. F.I. Okeke, InSAR operational and processing steps for DEM generation. FIG Reg. Conf. 1–13 (2006) 16. V.K. Singh, P.K. Champati Ray, A.T. Jeyaseelan, Digital elevation model (DEM) generation using InSAR: Garhwal Himalaya, Uttarakhand. Int. J. Earth Sci. Eng. 3(1), 20–30 (2010) 17. S. Guillaso, L. Ferro-Famil, A. Reigber, E. Pottier, Building characterisation using L-band polarimetric interferometric SAR data. IEEE Geosci. Remote Sens. Lett. 2(3), 347–351 (2005) 18. H.S. Srivastava, P. Patel, R.R. Navalgund, Application potentials of synthetic aperture radar interferometry for land-cover mapping and crop-height estimation. Curr. Sci. 91(6), 783–788 (2006) 19. M. Joseph et al., Satellite radar interferometry for monitoring subsidence induced by longwall mining activity using Radarsat-2, Sentinel-1 and ALOS-2 data. Int. J. Appl. Earth Obs. Geoinf. 61, 92–103 (2018) 20. P. Taylor, R. Gens, J.L.V.A.N. Genderen, Review article SAR interferometry—issues, techniques, applications. Int. J. Remote Sens. 37–41 (2007) 21. N.D. Davila-hernandez, Mapping of flooded areas in Acapulco de Juárez, Guerrero-México, using TanDEM-X radar images Mapeo de áreas inundadas utilizando imágenes de radar TanDEM-X en Acapulco de Mapping of flooded areas in Acapulco de Juárez, GuerreroMéxico, using TanDEM-X, May 2018 22. R. Deo, S. Manickam, Y.S. Rao, S.S. Gedam, Evaluation of interferometric SAR DEMs generated using TanDEM-X data. Int. Geosci. Remote Sens. Symp. 2079–2082 (2013) 23. M. Bourbigot, H. Johnsen, R. Piantanida, Sentinel-1 (2016) 24. L. Veci, Interferometry tutorial, Mar 2015, pp. 1–20 (2016) 25. S.L. Ullo, C.V. Angelino, L. Cicala, N. Fiscante, P. Addabbo, Use of differential interferometry on sentinel-1 images for the measurement of ground displacements. Ischia earthquake and comparison with INGV data. Int. Geosci. Remote Sens. Symp. 2216–2219 (2018)
Cyber Espionage—An Ethical Analysis Somosri Hore1(B) and Kumarshankar Raychaudhuri2 1 Department of Philosophy, University of Delhi, Delhi 110007, India
[email protected] 2 LNJN National Institute of Criminology and Forensic Science, Ministry of Home Affairs,
Govt. of India, Rohini, Delhi 110085, India [email protected]
Abstract. In June 2018, a cyber espionage campaign by Chinese group of hackers targeting two United States-based satellite firms was exposed, whose fundamental motive was to seize military and civilian communications of the victim nations. The hackers intentionally infected the systems, along with regulating the satellite to change the positions of the orbiting devices and disrupt data traffic. This case study may directly be pointing to instantiations of cyber-crime or unethical practices performed by misuse of technology in the ambit of cyber space, but the vital question which arises in this context is, ‘can such practices be justified on moral or socio-political grounds by any means?’ The implication of this case study is highly convoluted. On one hand, hacking is considered offensive as one try to intrude in other’s personal space without taking a legitimate moral consent, whereas on the other hand, an agent might be doing the same in order to ensure national security. Hence, the intention behind committing such a crime is dubious. The motive behind taking such a measure might be threat instead of intention to harm, which in turn, gives rise to few dilemmas, to ascertain whether such unethical acts can be justified for attaining an ethical end. This research work is an attempt to reconcile two disciplines. Skimming by the arguments of moral philosophers such as Immanuel Kant, J. S Mill, Aristotle a rigorous study is done, and logicoethical arguments are presented to seek a solution for these complex yet extremely relevant issues. Keywords: Cyber espionage · Morality · Universal · Consequence · Non-contradiction · Ethical hacking · Means and ends · Ethical analysis
1 Introduction

The evolution of computers from large, heavy, and complex machines to user-friendly and interactive devices, together with Internet technology, has made communication, information transmission, and other activities easier, more organized, and more flexible in cyberspace [1]. While enhancing human lives, it has also paved the way for some serious threats in the form of cyber-crimes or cyber-attacks. This negative side of cyberspace can become very threatening and dangerous if exploited to its full potential. Simply put, a cyber-attack can be defined as a type of criminal activity in which computers or other digital
devices such as mobile phones and PDAs can be used either as a tool, target or medium for executing the attack [1, 2]. Cyber espionage is one such type of attack, wherein secret and confidential information from individuals, organizations and/or government establishments is stolen by the perpetrator by using malicious software or spywares such as Trojans. This type of attack is committed without the permission of the information holder and is also termed as ‘cyber-spying’. In April–May 2017, a stolen transcript of a phone call conversation between US president Donald Trump and Philippine President Rodrigo Duterte was uploaded on Internet through malicious email attachments. The phone call between them was based on Trump congratulating his counterpart on his murderous drug war. These sensitive documents were obtained by a hacker group named APT32, linked to the Vietnamese Government. The documents were stolen from the surveillance of Philippine Govt. The leak seemed bigger than the revelations of just a few documents. [3] Although, at the very first glimpse, these kind of instances may seem to be free from any kind of dilemma as they seem to be out rightly flouting the moral or legal law with their act of personal unconsented intervention which is thus termed as a cyber-crime, but on careful analysis, it is deduced that such an absolute statement cannot be necessarily implied. An act of espionage might not always be committed with a malicious intention. A vision to ensure national security of the country can be one of the major factors, resulting in performing espionage by secret agents of government. A few dilemmas which arise in this context are as follows: a. Can intention be the deciding criteria for evaluating whether a certain code of conduct can be categorized as crime or not? b. Can a justified motive which may be threat or self-defence be justified to process an unjustified end? c. In case, we are able to prove the first, is there any way to determine which acts of cyber-crime are practiced with what intention. On the basis of such a doubt, can we draw a universal principle? d. Can unethical means be justified for attaining an ethical end? Therefore, in this research work, our objective is to study both the disciplines, i.e. cyber espionage and philosophy and using the arguments given by moral philosophers such as JS Mill, Immanuel Kant and Aristotle present logico-ethical arguments to seek a solution to these complicated yet extremely relevant issues. The research article is organized into different sections as follows: Sect. 2 gives an overview of cyber espionage and describes its greyish nature, i.e. both positive and negative aspects. In Sect. 3, we talk about the different techniques, which are generally used for acquiring sensitive and confidential information, i.e. performing espionage in the cyber space. This is followed by Sect. 4, which produces a philosophical analysis of the fundamental problem. Section 5 presents Jaina theory of Non- absolutism as an alternative, followed by Sect. 6, which gives the Deontological Theory established by Immanuel Kant. The article concludes with Sect. 7, along with the future scope in this direction.
2 Cyber Espionage—Its Grey Nature Cyber espionage can be defined as those computer operations, which might be performed in order to gather intelligence and data from the target or any adversary computer system [4]. The information can be gathered either by using remote intrusions, malicious attachments or honey trapping as well [5]. Hacking is one of the techniques of remotely intruding into an information system. Not only hacking, but other techniques can also be used for gathering secret and sensitive information about a country’s infrastructure and security establishments. Honey trapping or social engineering is another passive attacking technique, wherein the attacker tries to acquire information from the victim by developing trust and relationship, which in turn, is done by masquerading to be a known individual or a woman [6]. This has been the new espionage outfit in trapping defence personnels of Indian Army and Indian Air Force by ISI-based operatives in the last few years [4, 7, 8]. The modus operandi in majority of the cases, as discovered by military intelligence, has been Pakistan-based Facebook IDs. These agents get in touch with military personnels, who are posted in sensitive locations of the country. These fake accounts (posed as women) coax the army personnel, through sexual chat and photos, and lure them into sending the photographs of their locations, military exercises they are a part of, and intel about other defence establishments of the country [7, 8]. Therefore, we can sum up the fact that in today’s world, espionage has gained a new dimension in social media. Popular social media platforms such as Facebook, Instagram, Twitter are used in honey trapping and subsequent espionage attack [6, 7]. As per the reports published by European Union Agency for Network and Information Security (ENISA) [2], cyber espionage ranks among the top 15 cyber threats worldwide since 2014. As we talk of espionage to be one of the top threats in the cyberspace, it is important to understand that such an attack might not always be lethal to the nation and does not result in permanent damage to physical objects [9, 10]. Many forms of malware, which are also considered strict tools of espionage, do not directly cause damage to the targeted nation’s information system or cause damage via these systems [9]. One of the primary motives of espionage, as discussed earlier, is to gather intelligence by different government agencies of the country in order to ensure security of the entire nation. This is in the interest of the entire country. From this point-of-view, cyber espionage can be regarded as an ethical act, which in other words is also known as ethical hacking. Although the nation which is doing an act of cyber espionage for securing its critical infrastructures is doing it for the right cause, however, on the other hand, the country which is being the victim of espionage might not perceive it to be an appropriate act. This throws light on the fact that cyber espionage can neither be considered as an ethical nor an unethical act, which gives a hint towards its greyish nature that is our main focus of this research work. Traditionally, mere espionage has not been viewed as a casus belli (customary or legitimate reason for going to a war) but may bring non-military retaliation, resulting in expulsion of diplomats or limiting of foreign aid or commerce [9]. Cyber espionage has not been considered as an act relating to moral considerations with reference to going to war or conduct in war. 
The ethical considerations in intelligence-gathering activities like cyber espionage operations are but one of the several traditionally neglected aspects of the morality of war [9]. The Information Technology (Amendment) Act enacted in
2008 has introduced several new sections and seems to be a step forward in reducing instances of data theft, identity theft, cyber-crimes, etc., and thereby act of espionage [11]. Therefore, the ethical considerations and unethical implications of espionage and other intelligence-gathering operations give a grey touch to this type of cyber-attack.
3 Techniques and Methods of Cyber Espionage An act of espionage in the cyber space is generally done using exploitation methods on the Internet, network or individual computers through the use of various techniques such as phishing, social engineering, malicious attachments, spywares like trojan and keyloggers, etc. [12]. In this section, we would describe some of the common techniques used for gathering information secretively in the cyber space. a) Phishing: Phishing is a technique where sensitive information from the target victim can be obtained by using fake and forged Web sites or emails. The Web sites setup by the source appears genuine and original to the target, where on providing sensitive information such as username, password or other financial details is acquired by the source in the background and an error page is displayed in front of the target victim. Similarly, fake emails can also be sent posing to be banks, in order to acquire confidential information. b) Social Engineering: Social engineering is another technique of acquiring sensitive information from the victim by means of using socializing humanitarian skills. The one trying to acquire information might be known to the victim, a close aid of the victim, working in the same organization as the victim, etc., which makes it much easier to gain trust of the target. Once trust and mutual relationship are established, the required information is obtained gradually. c) Trojans and Keyloggers: Trojans and keyloggers are spywares, which when installed in the computer system of the target victim, steal data and transmit it to the attacker over the network. Trojans and keyloggers can be installed through backdoors into the victim’s computer system. Keyloggers are the software applications, which record the keystrokes of the user and store them in its database. These applications enable the attacker to view usernames, passwords and anything else which the victim might type or click. d) Malicious Attachments: Malicious attachments sent by the attacker over the emails can contain links of fake Web sites or other malicious applications, where the victim’s sensitive data might get recorded and leaked to the attacker. e) Honey Traps: Social media platforms are the best technique for performing espionage attack, where the attacker sets up fake user accounts in order to lure and trap their victims and gain confidential information including photographs and videos. Social media monitoring is also done by intelligence agencies to record and passively observe unusual activities by terrorist organizations or enemy nations.
4 Philosophical Analysis of the Fundamental Problem
The moment we associate any action with the word crime, it automatically becomes immoral without requiring any further justification. Since the word crime is necessarily followed by punishment, any act of cyber-crime or any other crime should
definitely lead to a punishment. However, can cyber-crime actually be termed a crime when it is performed to prevent a further crime, and that too one of a massive scale? So, penetrating deeper into the problem of espionage, can we call it justified when it is performed for the purpose of national security? The proponents of utilitarian philosophy such as John Stuart Mill and Jeremy Bentham would definitely call it justified, as for them the rightness of a moral act is directly dependent on the consequences it yields. For utilitarian philosophers, morality is guided by the 'Consequentialist Principle' [13]. The Consequentialist Principle states that an act is morally right if and only if it leads to the 'Greatest Happiness of the Greatest Number' [14]. Thus, applying the same principle to our current situation, we can infer that, since no happiness can outweigh the happiness received by the entire nation in return for national security, cyber espionage can definitely be justified. For the proponents of the Consequentialist theory, the motive behind performing an action is irrelevant; the only relevant criterion is the desired consequences. Thus, irrespective of whatever our intention is, if data theft leads to, say, financial loss for an organisation, then the same act of hacking or honey trapping will be unjustified and immoral, whereas in cases where it leads to positive results which do not harm but rather benefit a large mass of people, it is morally justified. Although the utilitarian argument sounds strong and logical, it unfortunately flouts an important principle of logic. The moment we refer to the term 'ethical hacking', it translates into what we philosophically take to be an 'oxymoron'. 'Hacking' itself is an unethical practice. Hence, can we ever practice an unethical act ethically? Saying so directly flouts the 'Principle of Non-Contradiction' as propounded by Aristotle in his Metaphysics. According to the Principle of Non-Contradiction, 'it is impossible for the same thing to belong and not to belong at the same time to the same thing and in the same respect' [15]. The principle of non-contradiction can be written logically as ¬(P ∧ ¬P). The conjunction of two mutually exclusive or opposite terms is not permissible according to this law. In simple words, we cannot say that the Earth is round and not round at the same time. By the same logic, we cannot affirm that hacking is unethical and done ethically at the same time.
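To see that this is not a mere figure of speech, the principle can be verified mechanically: the formula ¬(P ∧ ¬P) is true under every truth assignment. A minimal check (our own illustration, not part of the cited sources):

```python
# Truth-table check of the Principle of Non-Contradiction: not (P and not P)
for P in (True, False):
    assert not (P and not P)   # holds for every truth value of P
print("¬(P ∧ ¬P) is a tautology")
```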
5 Jaina Theory of Non-Absolutism as an Alternative
The Jaina theory of Syadvada affirms the many-sidedness of reality, wherein affirming both sides of the same coin is not inconsistent. This is because 'reality has infinite aspects which are all relative, conditional and we know only some of these aspects' [16]. However, this argument still remains incompetent in resolving our fundamental question. The reason is that if ethics becomes relative in nature, then each person will devise his or her own rules, which will be detrimental to society as a whole. Secondly, the utilitarian view does not hold, for even if we take 'intention' to be the moral criterion for judging a moral act, it is impossible for us to differentiate or draw a line between which act of espionage is done for private intervention and which one is done only to safeguard national security.
6 Deontological Theory to the Rescue
The Deontological theory of ethics was propounded by the renowned philosopher Immanuel Kant. Kant holds that ethics is universal, unconditioned and absolute, since it is governed by reason, which is common to all human beings. Thus, any ethical principle which we apply to one situation must be categorically applied to all. For Kant, morality is not determined by consequences or any other condition. Morality ought to be governed only by our duty [17]. It is only this kind of moral theory that can give absolute justice to any situation, as the first and foremost virtue it adheres to is 'equal treatment of all', which is crucial for justice. Kant states, 'act only according to that maxim whereby you can at the same time will that it should become a universal law' [15]. He also states, 'act in such a way that you treat humanity, whether in your own person or in the person of any other, never merely as a means to an end but always at the same time as an end' [15].
7 Conclusion and Future Scope
In this research work, our primary objective is to perform an ethical analysis of cyber espionage by using the logico-ethical arguments of various philosophers such as Aristotle, Mill and Immanuel Kant. The analysis determines whether an act of cyber espionage is ethical, unethical or neither. On the basis of the various philosophical arguments presented in this research work, if we try to analyse our present situation through the lens of Kant's maxims, cyber espionage would not be justified. This is because, on the one hand, we justify it in the case of national security, where secret information acquired by intelligence helps in determining the planning, strategies and other details of the opponent and in turn ensures the security of the country, whereas on the other hand, when another country uses the same means to ensure its own national security, we call it immoral and unjustified. How can the same act be moral as well as immoral in merely different contexts? This argument appears to be vulnerable. Also, we are using an unjustified or immoral means, perhaps a crime, to reach a justified end. This cannot be permissible in the eyes of morality, as a crime or an attack, even in cyberspace, can never lead to safety. Instead of using espionage as a means of ensuring national security, nations should focus on strengthening their existing computer and information systems, such that it is not possible for any perpetrator, hacker or criminal to access secret and classified data illegally. Attempts should be made to create technological inventions wherein, the moment an unauthorized individual tries to retrieve data from the computer systems, some sort of alarm is triggered, resulting in the trapping of the perpetrator. Therefore, based on all the arguments, it would be justified to conclude that instead of active participation in espionage, a nation or organization should aim for self-defence and passive retaliation. In future, we can extend this research work by exploring other ethical or moral dimensions of cyber-crime, or specifically cyber espionage, by narrowing down to more specific and nuanced case studies. The study can be taken up from various wide perspectives, not only limited to ethics but also socioeconomically relevant to the present. Since
it is an area of research which is empirically relevant in our day-to-day life, constant upgradation is required with the passage of time and the ongoing technological and moral transformation of society.
References 1. M. Uma, G. Padmavathi, A survey on various cyber attacks and their classification. Int. J. Netw. Secur. 15(5), 390–396 (2013) 2. H. Kettani, P. Waiwnright, On the top threats to cyber systems, 2nd International Conference on Information and Computer Technologies (IEEE, 2019), pp. 175–179. https://doi.org/10. 1109/infoct.2019.8711324 3. M. Hjortdal, China’s use of strategic warfare: espionage meets strategic deterrence. J. Strateg. Secur. 4(2), 1–24 (2011). https://doi.org/10.5038/1944-0472.4.2.1 4. Group Captain Arun Marwah was honey-trapped, blackmailed by ISI, https://www.telegr aphindia.com/india/iaf-officer-held-over-honey-trap-suspicion/cid/1334278, last accessed 2019/09/22 5. ISI, honey trap and the 23-year old Indian: Pakistan’s dirty spy game exposed, https:// www.business-standard.com/article/current-affairs/isi-honey-trap-the-23-year-old-indianpakistan-s-dirty-spy-game-exposed-118041601099_1.html, last accessed 2019/09/24 6. Honey Trap: The new Espionage Outfit, https://www.innefu.com/blog/honey-trap-the-newespionage-outfit/, last accessed 2019/09/22 7. Dozens of Indian Army jawans under lens for ‘falling prey to Facebook honey trap’, https://theprint.in/defence/indian-army-jawan-detained-after-being-honey-trapped-byfake-facebook-account/177252/, last accessed 2019/09/24 8. Indian soldier who fell victim to ‘honey-trap’ arrested for passing on secrets to Pakistani intelligence, https://www.telegraph.co.uk/news/2019/01/15/indian-soldier-fell-victim-honey-traparrested-passing-secrets/, last accessed 2019/09/24 9. R. Dipert, The ethics of cyber warfare. J. Mil. Ethics 9(4), 384–410 (2010). https://doi.org/ 10.1080/15027570.2010.536404 10. D. Weissbrodt, Cyber conflict, cyber crime and cyber espionage. Minn. J. Int. 347–387 (2013) 11. H. Sinha, Corporate espionage and the information technology (amendment) act, 2008. Indian Law J. 2(3), 1–2 (2018) 12. A. Aggarwal, CERT-In.: Cyber Espionage, Infiltration and Combating Techniques. Secfence Technologies (2013) 13. Aristotle on Non-Contradiction, www.plato.stanford.edu/entries/aristotle-noncontradiction/, accessed on 2019/09/25 14. M.J. Stuart, On liberty, 4th edn. (Longman, Roberts & Green, London, 1869) 15. I. Kant, H.J. Gregor, C.M. Korsgaard, Groundwork of the Metaphysics of Moral Quotes (Cambridge University Press, United Kingdom, 1998) 16. Syavada-theory of non-absolutism, www.jainworld.com/scripture/prasamarati, accessed on 2019/09/25 17. S. Hore, L. Saxena, Kant’s categorical imperative and medical ethics. J. Adv. Res. Dyn. Control Syst. 10(6), 1323–1326 (2018)
Optimized Z-Buffer Using Divide and Conquer Nitin Bakshi(B) , Shivendra Shivani, Shailendra Tiwari, and Manju Khurana Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala, Punjab 147004, India [email protected]
Abstract. Hidden surface determination is one of the major problems faced in the fields of computer vision, computer graphics and game design. Several algorithms have been proposed over the years to determine hidden surfaces. Z-buffer is one of the most common algorithms used for determining hidden surfaces. This paper presents a method for hidden surface determination by optimizing the already proposed Z-buffer algorithm. Z-buffer demands quite an amount of memory for storing depth values, hence maintaining a buffer. We propose an optimized version of the algorithm which uses divide and conquer along with Z-buffer to reduce the number of pixels that need to be checked to determine which surfaces need to be rendered and which do not. This helps reduce the time complexity as well as the space complexity of the algorithm, resulting in fewer pixels being computed. Keywords: Z-buffer · Divide and conquer · Visual surface detection · View frustum · Occlusion · Hidden surface
1 Introduction
1.1 Hidden Surface Determination
Hidden surface determination is a process in which only the surfaces that are visible to the user are rendered, and the surfaces that are occluded by other objects or surfaces are prevented from rendering [1]. A rendering engine's main responsibility is to render very large world spaces (even those tending to infinity in size) in constant time. Even though high computational power is available, algorithms are required that are advanced enough to pull off such large renders in constant time. Hence, optimizing algorithms is necessary to ensure that the least number of resources is allocated toward the rendering of surfaces and that surfaces that are not visible are eliminated in the least amount of time [2].
1.2 Z-Buffer
Z-buffer or depth buffer [3] is a technique used in the field of computer graphics to determine whether an object, or a part of some object, is occluded or not. It calculates
the z-value or depth value of pixels corresponding to a fixed (x, y) value and saves it in a two-dimensional array. This algorithm uses the image-space method for the determination of hidden surfaces. The z-value can be defined as the perpendicular distance from a pixel present on the projection plane to the corresponding 3D co-ordinate on a given polygon. The dimension of a Z-buffer is the same as that of the screen buffer. A schematic representation of the Z-buffer can be seen in Fig. 1 [3, 4].
Fig. 1. Schematic diagram of Z-buffer
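As an illustration of the basic technique described above (a minimal sketch of a generic depth test, not the optimized algorithm proposed later in this paper; the fragment format is an assumption):

```python
import numpy as np

def zbuffer_render(fragments, width, height):
    """Keep, per (x, y) pixel, only the fragment with the smallest depth value.

    `fragments` is assumed to be an iterable of (x, y, z, color) tuples with integer
    pixel coordinates, where z is the perpendicular distance from the projection
    plane to the surface point."""
    depth = np.full((height, width), np.inf)            # Z-buffer, same dimension as the screen buffer
    frame = np.zeros((height, width, 3), dtype=float)   # screen (frame) buffer

    for x, y, z, color in fragments:
        if 0 <= x < width and 0 <= y < height and z < depth[y, x]:
            depth[y, x] = z        # a nearer surface was found at this pixel
            frame[y, x] = color    # render it; farther (occluded) fragments are discarded
    return frame, depth
```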
1.3 Divide and Conquer
Divide and conquer is an algorithmic paradigm which works on the principle of multi-branched recursion. It works by breaking down a problem recursively into multiple sub-problems of similar type until they are simple enough to solve directly. The solution to the original problem is then obtained by combining the solutions to the sub-problems. Divide and conquer is used as a basis in several efficient algorithms for different kinds of problems; we see it in areas like sorting, computing the discrete Fourier transform, finding the closest pair of points, etc. [5, 6]. One of its most important applications is optimization, where the search space is reduced by some factor so that the problem becomes relatively simpler to solve, resulting in less computation time.
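For instance, merge sort follows exactly this pattern (a standard textbook sketch, not specific to this paper):

```python
def merge_sort(values):
    """Classic divide-and-conquer: split, solve the sub-problems recursively, combine."""
    if len(values) <= 1:               # base case: simple enough to solve directly
        return list(values)
    mid = len(values) // 2
    left = merge_sort(values[:mid])    # divide
    right = merge_sort(values[mid:])
    merged, i, j = [], 0, 0            # combine: merge the two sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```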
2 Literature Survey
Sutherland et al. [7] proposed an approach describing the basic concept behind hidden surface detection and its visualization as a sorting of elements. They discussed how coherence
is one of the key factors for applying algorithms on hidden surfaces. Scan-line, frame coherence, edge coherence, area coherence and depth coherence are the forms discussed. Algorithms are further classified as object-space algorithms, image-space algorithms and list-priority algorithms. Catmull et al. [1] proposed another approach addressing the hidden surface problem. According to them, the modified Newell algorithm sorts the polygons in order of Z-value and paints the polygons in that order into a frame buffer. They further discussed the Z-buffer, describing it as an extension of the frame-buffer idea to Z-values. Nigel Stewart and Geoff Leach [3] presented an improved version of the Z-buffer-based CSG rendering algorithm using Z-buffer parity-based surface clipping, requiring O(kn) complexity. Theoharis et al. [4] provided a survey of the key applications of the Z-buffer. A number of applications have been implemented using it; hence, they discuss its key usage in several fields such as rendering, modeling and computer vision.
3 Proposed Approach
Our proposed approach is depicted as a flowchart in Fig. 2. The inputs are a set of objects and the view frustum.
3.1 Used Variables Nomenclature
The following variables are used in our proposed algorithm, and the nomenclature is provided initially to help understand the variables better.
N: number of objects
V: view point (camera)
Oi: ith object
Lvm: line joining the view point
Oi(x, y): (x, y) co-ordinate of the ith object
Z: distance from V to (x, y)
Zmin: minimum distance between V and (x, y)
Va and Vb: lines enclosing the viewing frustum with θ (angle) in between them, where θ depends on the viewing angle of the camera
So: linked list containing the objects in the viewing frustum
Oix: x co-ordinate of the intersection of Lvm and Oi(x, y)
Oiy: y co-ordinate of the intersection of Lvm and Oi(x, y)
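Algorithm 1 is truncated in this copy of the text, so purely as an illustration of the nomenclature above (and not as a reproduction of the authors' Algorithm 1), a view-frustum filtering step using these variables might look as follows; the object representation and the viewing direction are assumptions:

```python
import math

def build_frustum_list(objects, view_point, theta):
    """Hypothetical sketch: collect objects whose (x, y) position falls inside the
    viewing frustum (opening angle theta) into the list S_o, and record the minimum
    distance Z_min from the view point V."""
    S_o, z_min = [], math.inf
    for obj in objects:                       # obj is assumed to expose obj.x and obj.y
        dx, dy = obj.x - view_point[0], obj.y - view_point[1]
        z = math.hypot(dx, dy)                # Z: distance from V to (x, y)
        angle = abs(math.atan2(dy, dx))       # angle w.r.t. the viewing direction (assumed along +x)
        if angle <= theta / 2:                # inside the lines Va and Vb enclosing the frustum
            S_o.append(obj)
            z_min = min(z_min, z)             # Z_min: nearest candidate for rendering
    return S_o, z_min
```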
Fig. 2. Flow diagram of the proposed approach
Algorithm 1 search (Lvm,So) Input: Set of objects, View frustum Output: for (i 1; i hall > outdoor (see Table 3).
5 Conclusion
An enhanced error rate estimation method is proposed by considering moderate to extreme conditions. In the proposed method, reliability is measured from the error estimation method. The real-time simulation was carried out in three typical environmental conditions: indoor, hall and open air. The outcome of the experimental setup validates and illustrates the reliable functionality of the proposed work. As compared to other existing
methods, the reliability of the proposed system is improved under similar working conditions. To further improve the reliability, a few more error rate estimation methods need to be evaluated. This will be a key area of future research. Reliability offers a trust-enabled system to wireless sensor networks. The proposed work incorporates hardware, software, and mathematical techniques.
Channel Capacity in Psychovisual Deep-Nets: Gaussianization Versus Kozachenko-Leonenko Jesus Malo(B) Image Processing Lab, Universitat de Valencia, Valencia, Spain [email protected] http://isp.uv.es/
Abstract. In this work, we quantify how neural networks designed from biology using no statistical training have a remarkable performance in information theoretic terms. Specifically, we address the question of the amount of information that can be extracted about the images from the different layers of psychophysically tuned deep networks. We show that analytical approaches are not possible, and we propose the use of two empirical estimators of capacity: the classical Kozachenko-Leonenko estimator and a recent estimator based on Gaussianization. Results show that networks purely based on visual psychophysics are extremely efficient in two aspects: (1) the internal representation of these networks duplicates the amount of information that can be extracted about the images with regard to the amount of information that could be obtained from the input representation assuming sensors of the same quality, and (2) the capacity of internal representation follows the PDF of natural scenes over the chromatic and achromatic dimensions of the stimulus space. This remarkable adaptation to the natural environment is an example of how imitation of biological vision may inspire architectures and save training effort in artificial vision. Keywords: Spatio-chromatic information · Psychophysically tuned neural networks · Entropy · Gaussianization · Kozachenko-Leonenko estimator
1 Introduction
The early stages of deep-nets trained to solve visual classification problems develop units that resemble the sensors found in biological vision systems [1]. One could save substantial effort figuring out the proper architecture and training by using the existing models of early human vision which already have the linear+nonlinear architecture of deep-nets. In this work, we show a specific example of the above by quantifying the performance of a biological network in accurate information-theory units.
In particular, we use the linear+nonlinear architecture which is standard in visual neuroscience through associations of filterbanks and the divisive normalization inhibitory interaction [2,3]. The model we consider here consists of a series of layers that reproduce the following perceptual facts: spectral integration at linear LMS sensors [4], nonlinear normalization of LMS signals using VonKries adaptation [5], linear transform to chromatic opponent ATD channels [6], saturation of the achromatic, red-green, and yellow-blue signals [7], linear bank of local-oriented filters and achromatic/chromatic contrast sensitivity weights [8], nonlinear divisive normalization of the spatio-chromatic local-frequency filters [3,9], related to classical neural field models [10]. Following the tradition of use of divisive normalization to improve JPEG and MPEG image coders [11,12] and image quality metrics [13,14], the current state-of-the-art in these image processing problems is achieved using similar linear+nonlinear architectures [15,16]. The novelty in the current approaches is that they use the biological model as a starting point and refine it using the automatic differentiation techniques fitting psychophysical databases.
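As a concrete reference for the kind of nonlinearity involved, a toy sketch of canonical divisive normalization is given below (our own illustration; the uniform interaction kernel and parameter values are placeholders rather than the psychophysically fitted ones used in the model):

```python
import numpy as np

def divisive_normalization(z, beta=0.1, gamma=2.0, W=None):
    """Canonical divisive normalization: each linear filter response z_i is
    exponentiated and divided by a pooled signal from its neighbours.
    W is the interaction kernel; beta is the semisaturation constant."""
    z = np.asarray(z, dtype=float)
    if W is None:
        W = np.ones((z.size, z.size)) / z.size   # uniform pooling, for illustration only
    num = np.abs(z) ** gamma
    den = beta + W @ num
    return np.sign(z) * num / den

responses = divisive_normalization([0.2, 1.0, -0.5, 3.0])
```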
2 Measuring Channel Capacity in Biological Networks
In a sensory system where the input, r, undergoes certain deterministic transform, S, but the sensors are noisy:

$$ r \;\xrightarrow{\;S\;}\; x = s(r) + n \qquad (1) $$
the transmitted information about r available from the response x (i.e. the mutual information, I(r, x)) here will be referred to as the capacity of the channel S. This definition of capacity differs from the standard [17]. In this work, we are interested in comparing the performance of the system at different locations of the space of images with the distribution of natural scenes over that space. Moreover, this comparison should be done for different layers of the visual pathway to analyze their relative contribution to the information transmission. Ideally, we would like to describe the trends of the performance from the analytical description of the response. However, that is not straightforward. Note that the amount of transmitted information can be written in terms of the entropies, h, of the input and the noise, and of the Jacobian of the transform; or alternatively in terms of the total correlation, T, aka multi-information [18], that represents the redundancy within the considered vector (or layer). Specifically, in [19] I derive these two equations that are relevant to discuss capacity:

$$ I(r,x) = h(r) + E_r\{\log_2 |\nabla_r S|\} - h(n) + E_n\{ D_{KL}\left( p(s(r)) \,\|\, p(s(r)+n) \right) \} \qquad (2) $$

$$ I(r,x) = \sum_i h(x_i) - T(x) - h(n) \qquad (3) $$
where $E_v\{\cdot\}$ stands for expected value over the random variable v, and $D_{KL}(p\|q)$ stands for the Kullback-Leibler divergence between the probabilities p and q. Equation 3 identifies univariate and multivariate strategies for information maximization. When trying to assess the performance of a sensory system, reduction of the multivariate total correlation, T(x), seems the relevant term to look at because univariate entropy maximization (the term depending on h(x_i)) can always be performed after joint PDF factorization through a set of (easy-to-do) univariate equalizations, and the noise of the sensors is a restriction that cannot be changed. Then, the reduction in redundancy, ΔT(r, x) = T(r) − T(x), is a possible measure of performance: the system is efficient in regions of the space of images where ΔT is big. Interestingly, this performance measure, ΔT, can be written in terms of univariate quantities and the Jacobian of the mapping S [18]. Generalizing the expression given in [18] to response models that do not preserve the dimensionality, we have:

$$ \Delta T(r,x) = \Delta h_m(r,x) + \tfrac{1}{2}\, E_r\{\log_2 |\nabla_r S^{\top} \cdot \nabla_r S|\} \qquad (4) $$
Equation 4 is good for our purposes for two reasons: (1) in case the marginal difference, Δhm , is approximately constant over the space of interest, the performance is totally driven by the Jacobian of the response, so it can be theoretically studied from the model, and (2) even if Δhm is not constant, the expression is still useful to get robust estimates of ΔT because the multivariate contribution may be get analytically from the Jacobian of the model and the rest reduces to a set of univariate entropy estimations (which do not involve multivariate PDF estimations). In the results Sect. 3, estimates of ΔT using Eq. 4 are referred to as theoretical estimation (as opposed to model-agnostic empirical estimates purely based on samples) because of this second reason. In previous works, Eq. 4 has been used to describe the communication performance of divisive normalization [3] and Wilson-Cowan interaction [20] on achromatic scenes exclusively from the analytical expressions of the corresponding Jacobian. In both cases, these studies used Eq. 4 to analyze the performance at a single layer, and Δhm was explicitly shown to be constant over the considered domain. Therefore, the considerations on the analytical Jacobian certainly explained the behavior of the system. However, Δhm may not be constant in general, and hence, the trends obtained from the Jacobian of the model can be counteracted by the variation of Δhm . Similar considerations can be made with Eq. 2: we also find this transformdependent term, ∇r S, whose behavior can be successfully analyzed over the considered image space [3,20]; however, there is no guarantee that the other terms are constant and can be disregarded in the analysis, particularly dealing with comparisons between multiple layers. Moreover, the situation seems worse in Eq. 2 because the terms that should be constant are multivariate in nature, and hence in principle, more difficult to estimate.
Therefore, since the intuition from the analytical response (or from ΔT ) is conclusive only in restricted situations, there is a need for empirical methods to estimate the capacity directly from sets of stimuli and the responses they elicit. Here, we use a recently proposed estimation of capacity [21] based on a Gaussianization technique, the so-called Rotation-Based Iterative Gaussianization (RBIG) [22], that reduces the problematic (multivariate) PDF estimation problem involved in naive estimation of I to a set of easy (univariate) marginal PDF estimations. We compare the accuracy of RBIG with theoretical results of ΔT and with classical Kozachenko-Leonenko estimator [23], and variations [24].
Fig. 1. Agreement between estimations of redundancy reduction (ΔT , in bits) between the LMS input and the internal representation in V1, estimated via the theoretical approach of Eq. 4, computed as in [3] (left), the classical Kozachenko-Leonenko estimator (second plot), and the offset corrected Kozachenko-Leonenko (third plot), and the RBIG estimator (right). The green surfaces represent the absolute difference between the theoretical estimation and the different estimates. These results imply that the estimation of I (for which there is no theoretical reference to compare with) can be trusted for the RBIG [21] and the modified Kozachenko-Leonenko estimator [24], but not for the classical Kozachenko-Leonenko estimator [23]
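For concreteness, the classical Kozachenko-Leonenko estimator can be sketched as follows (a generic nearest-neighbour implementation of the standard formula, not the exact code used in this work):

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kl_entropy(x, k=3):
    """Kozachenko-Leonenko differential entropy estimate (in nats) from samples.

    x: (n_samples, dim) array. Uses the distance to the k-th nearest neighbour:
    h ≈ psi(n) - psi(k) + log(V_d) + (d/n) * sum_i log(r_i),
    where V_d is the volume of the d-dimensional unit ball."""
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    tree = cKDTree(x)
    # k + 1 because the nearest neighbour of each point is the point itself
    r = tree.query(x, k=k + 1)[0][:, -1]
    log_vd = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)
    return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(r + 1e-12))
```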
3 Experiments and Results
In the experiments, 19 × 10^6 image patches from the IPL color-calibrated database [25] were characterized according to their chromatic contrast, achromatic contrast and mean luminance, and were injected through the perceptual network to get the responses and compute the information theoretic measures.1 First, we computed redundancy reduction in the inner visual representation because we have a theoretical reference, Eq. 4, we can compare with. We computed ΔT with three estimators: the one based on Gaussianization [22], the classical Kozachenko-Leonenko estimator [23], and an improved version of Kozachenko-Leonenko [24]. These results are shown in Fig. 1.
Model at http://isp.uv.es/code/visioncolor/vistamodels.html, data at http://isp. uv.es/data calibrated.html, and Gaussianization estimator at http://isp.uv.es/rbig. html.
Then, we analyze the transmitted information in two ways: (1) we compare the amount of information that can be extracted from images at the cortical representation with the PDF of natural images, see results in Fig. 2; and (2) we plot the amount of information available from different layers of the psychophysical network assuming sensors of the same signal-to-noise ratio at every layer. On the one hand, Fig. 2 shows that, despite using no statistical training, the perceptual network is more efficient in the more populated regions of the image space. And, on the other hand, Fig. 3 shows how the series of transforms along the neural pathway progressively increase the information about the input which is available from the corresponding layer. In this regard, note that spatial transforms have a bigger contribution to the increase in available information than chromatic transforms.
Fig. 2. Information available at the cortical representation after divisive normalization at different regions of the image space (top) compared to the PDF of natural images (bottom). Note that images with smooth achromatic variation are more frequent than sharp chromatic patterns, and the cortical representation captures more information exactly in those regions
4 Discussion
Results imply that (1) the biggest contribution to improve transmission is the analysis of opponent images through local-oriented filters and divisive normalization (about 70%) as opposed to the 30% that comes from the previous chromatic transforms, (2) the capacity is remarkably well adapted to the PDF of natural
Fig. 3. Information about the scene available from different layers of the visual pathway. Results are shown over the achromatic contrast and luminance space for two fixed chromatic contrasts: the minimum (zero, on the top) and the maximum in our set (on the bottom). This result implies that using sensors of equivalent quality (5% of signal deviation), the cortical representation is more appropriate because it doubles the amount of information captured from the input
images, and (3) the internal representation captures substantially more information about the images than the trivial representation at the photoreceptor domain. This efficiency is inspiring for artificial systems, particularly considering that no statistical training was required here. Examples of the consequences in image processing include the generalization of the Visual Information Fidelity (VIF) [26] concept. VIF is an original approach to characterize the distortion introduced in an image which is based in comparing the information about the scene that a human could extract from the distorted image with respect to the information that he/she could extract from the original image. Our results have two kinds of implications in VIF. First, one may improve the perceptual model and noise schemes in VIF because the non-parametric RBIG estimation is insensitive to the complexity of the model. Second, original VIF made crude approximations on the PDF of the signals to apply analytical estimations of I, which may be too biased. Better measures of I not subject to approximated models could certainly improve the results.
5 Conclusions
In this work, we quantified how neural networks designed from biological models and using no statistical training have a remarkable performance in information
theoretic terms. Specifically, using two empirical estimators of mutual information [22,23], we computed the transmission capacity at different layers of standard biological models [3,19,20]. From the technical point of view, we found that Gaussianization-based estimations of total correlation [22] are substantially more accurate than the original Kozachenko-Leonenko estimator [23], and its performance is similar to more recent (offset corrected) Kozachenko-Leonenko estimators [24]. Regarding the behavior of the considered visual network, we found three interesting results: (1) progressively deeper layers have bigger capacity (assuming the same quality of the sensors at every layer) indicating that biological transforms may be optimized to maximize transmitted information. (2) the internal representation of these networks duplicates the amount of information that can be extracted about the images with regard to the amount of information that could be obtained from the input representation, and (3) the capacity of internal representation follows the PDF of natural scenes over the chromatic and achromatic dimensions of the stimulus space. This remarkable adaptation to the natural environment is an additional confirmation of the efficient coding hypothesis [20,27,28], and an additional example of how imitation of biological vision may inspire architectures and save training effort in artificial vision. Acknowledgments. Partially funded by DPI2017-89867-C2-2-R and GrisoliaP/ 2019/035.
References 1. A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, in In 25th Neural Information Processing Systems, NIPS’12, USA (Curran Associates Inc, 2012), pp. 1097–1105 2. M. Carandini, D.J. Heeger, Normalization as a canonical neural computation. Nature Rev. Neurosci. 13(1), 51–62 (2012) 3. M. Martinez, P. Cyriac, T. Batard, M. Bertalm´ıo, J. Malo, Derivatives and inverse of cascaded L+NL neural models. PLOS ONE 13(10), 1–49 (2018) 4. A. Stockman, D.H. Brainard, OSA Handbook of Optics, 3rd ed. (McGraw-Hill, New York, 2010), pp. 147–152 (chapter Color vision mechanisms) 5. M.D. Fairchild, Color Appearance Models, The Wiley-IS&T Series in Imaging Science and Technology (Wiley, 2013) 6. L.M. Hurvich, D. Jameson, An opponent-process theory of color vision. Psychol. Rev. 64(6), 384–404 (1957) 7. J. Krauskopf, K. Gegenfurtner, Color discrimination and adaptation. Vision Res. 32(11), 2165–2175 (1992) 8. K.T. Mullen, The CSF of human colour vision to red-green and yellow-blue chromatic gratings. J. Physiol. 359, 381–400 (1985) 9. A.B. Watson, J.A. Solomon, Model of visual contrast gain control and pattern masking. JOSA A 14(9), 2379–2391 (1997) 10. J. Malo, J.J. Esteve-Taboada, M. Bertalm´ıo, Divisive normalization from WilsonCowan dynamics. Quant. Biol. Arxiv: 1906.08246 (2019)
11. J. Malo, J. Guti´errez, I. Epifanio, F.J. Ferri, J.M. Artigas, Perceptual feedback in multigrid motion estimation using an improved DCT quantization. IEEE Trans. Image Process. 10(10), 1411–1427 (2001) 12. J. Malo, I. Epifanio, R. Navarro, E.P. Simoncelli, Nonlinear image representation for efficient perceptual coding. IEEE Trans. Image Process. 15(1), 68–80 (2006) 13. A.B. Watson, J. Malo, Video quality measures based on the standard spatial observer, in IEEE International Conference on Image Processing, vol. 3, pp. III–41 (2002) 14. V. Laparra, J. Mu˜ noz-Mar´ı, J. Malo, Divisive normalization image quality metric revisited. JOSA A 27(4), 852–864 (2010) 15. J. Ball´e, V. Laparra, E.P. Simoncelli, End-to-end optimized image compression, in 5th International Conference on Learning Representative, ICLR 2017 (2017) 16. V. Laparra, A. Berardino, J. Balle, E.P. Simoncelli, Perceptually optimized image rendering. JOSA A 34(9), 1511–1525 (2017) 17. T.M. Cover, J.A. Thomas, Elements of Information Theory, 2nd ed. (WileyInterscience, 2006) 18. M. Studeny, J. Vejnarova, The Multi-information Function as a Tool for Measuring Stochastic Dependence (Kluwer, 1998), pp. 261–298 19. J. Malo, Spatio-chromatic information available from different neural layers via Gaussianization. Quant. Biol. ArXiv: 1910: 01559 (2019) 20. A. Gomez-Villa, M. Bertalmio, J. Malo, Visual information flow in Wilson-Cowan networks. Quant. Biol. ArXiv: 1907.13046 (2019) 21. J.E. Johnson, V. Laparra, R. Santos, G. Camps, J. Malo, Information theory in density destructors, in 7th ICML 2019, Workshop Invertible Normal Flows (2019) 22. V. Laparra, G. Camps-Valls, J. Malo, Iterative gaussianization: from ICA to random rotations. IEEE Trans. Neural Networks 22(4), 537–549 (2011) 23. L.F. Kozachenko, N.N. Leonenko, Sample estimate of the entropy of a random vector. Probl. Inf. Trans. 23, 95–101 (1987) 24. I. Marin, D.H. Foster, Estimating information from image colors: Application to digital cameras and natural scenes. IEEE Trans. PAMI 35(1), 78–91 (2013) 25. V. Laparra, S. Jim´enez, G. Camps-Valls, J. Malo, Nonlinearities and adaptation of color vision from sequential principal curves analysis. Neural Comput. 24(10), 2751–2788 (2012) 26. H.R. Sheikh, A.C. Bovik, Image information and visual quality. IEEE Trans. Image Process. 15(2), 430–444 (2006) 27. H. Barlow, Redundancy reduction revisited. Network: Comp. Neur. Syst. 12(3), 241–253 (2001) 28. J. Malo, V. Laparra, Psychophysically tuned divisive normalization approximately factorizes the pdf of natural images. Neural Comput. 22(12), 3179–3206 (2010)
pARACNE: A Parallel Inference Platform for Gene Regulatory Network Using ARACNe Softya Sebastian, Sk. Atahar Ali, Alok Das, and Swarup Roy(B) Network Reconstruction & Analysis (NetRA) Lab, Department of Computer Applications, Sikkim University, 6th Mile, Tadong, Sikkim 737102, India [email protected], [email protected], [email protected], [email protected]
Abstract. Accurate inference of gene regulatory networks from genome-scale expression data is of great significance. Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) is one of the popular inference tools that aim to accomplish this goal. Our target is to execute ARACNe in a parallel environment without redesigning its internal structure. In this paper, we present a parallel execution platform for ARACNe, pARACNE, that takes advantage of parallel computing without altering the original algorithm. We use Dialogue on Reverse Engineering Assessment and Methods (DREAM) challenge networks for evaluation and assessment of pARACNe. We observe that our parallel platform is able to infer networks close to the network inferred by ARACNe in significantly less amount of time. On comparing with the gold standard, pARACNe performs relatively better than ARACNe. Our idea is simple yet effective in producing networks similar to the outcomes of ARACNe in a cost-effective way. Keywords: ARACNe · Large regulatory network · Scalability · Parallel inference · DREAM challenge
1 Introduction
A huge amount of transcriptome data is generated regularly using high throughput technologies. It is essential to analyze such bulk volumes of data using fast and accurate computational tools to reveal previously unknown biological knowledge. Inference of gene regulatory networks (GRN) from expression profiles is important in understanding regulatory interaction patterns of genes inside a cell in different experimental conditions. Identifying the activity of gene(s) in a disease condition may help in elucidating genetic biomarkers responsible for the disease [1]. It is therefore important to infer regulatory networks in silico, to identify (computationally) such key genes or groups of genes playing important roles in any disease. Several inference methods are available [2–5] that infer such networks from micro-array gene expression data. However, due to heavy computational costs, they are limited in handling genome-scale network inference. A few attempts have been made to provide alternative solutions to overcome the above bottleneck by utilizing the power of parallel or distributed computing. Most of
the parallel inference methods are the extension of ARACNe [3], due to the simplicity of its algorithmic structure. The original ARACNe infers weighted co-expression networks [2] using information theory [6], followed by a pruning step, called data processing inequality (DPI). To support a computationally efficient inference of genome-wide gene regulatory networks, great efforts have been put into re-implement the sequential ARACNe into parallel versions. ARACNe-AP [7] is an extension of ARACNe that uses an effective adaptive partitioning (AP) to calculate MI concurrently in different computing units using JAVA multi-threading. Although ARACNe-AP achieves faster execution time than ARACNe, it consumes high memory space while handling large datasets [8]. GPU-ARACNe [8] puts forth an accelerated parallel implementation of ARACNe using the NVIDIA CUDA framework to take advantage of multi-core and multi-level parallelism. GPU-ARACNe too only parallelizes the performance-sensitive parts of the ARACNe algorithm. There is no guarantee that both of them will be able to infer networks close to original ARACNe. There may be false positive edges while generating the final adjacency matrix. Very recently, Casal et al. [9] implemented ARACNe in Finis Terrae 2 (FT2) supercomputer. It uses a hybrid Open MP/MPI platform to run ARACNe concurrently in multiple processing units. This implementation is not suitable for a lower configured computing device with relatively limited memory space. Concurrent execution of few computationally expensive steps of ARACNe to produce results like what the original ARACNe would output, in shorter execution time may not guarantee similar results with ARACNe. While some claim that the network is close enough, it is not supported with proper evidence. As discussed above, executing ARACNe in its entirety requires large memory requirements. We try to propose a parallel version of ARACNe that works on relatively moderate computing systems. Our effort is to run ARACNe parallely to produce networks close to the original ARACNe with relatively less overuse of the memory by dividing the data matrix among the concurrent processing units. Next, we discuss our parallel execution scheme.
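Before doing so, we recall the DPI step referred to above. The sketch below is an illustrative re-implementation of the published rule (not the code of ARACNe or of the parallel tools just mentioned): for every fully connected gene triplet, the weakest mutual-information edge is removed, since its dependence can be explained by the indirect path.

```python
import numpy as np
from itertools import combinations

def dpi_prune(mi, tolerance=0.0):
    """Illustrative DPI pruning on a symmetric mutual-information matrix."""
    mi = np.array(mi, dtype=float)
    n = mi.shape[0]
    present = mi > 0
    to_remove = set()
    for i, j, k in combinations(range(n), 3):
        if present[i, j] and present[j, k] and present[i, k]:
            edges = {(i, j): mi[i, j], (j, k): mi[j, k], (i, k): mi[i, k]}
            weakest = min(edges, key=edges.get)
            others = [v for e, v in edges.items() if e != weakest]
            # remove the weakest edge of the triangle (up to a tolerance)
            if edges[weakest] < min(others) * (1.0 - tolerance):
                to_remove.add(weakest)
    pruned = mi.copy()
    for i, j in to_remove:
        pruned[i, j] = pruned[j, i] = 0.0
    return pruned
```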
2 A New Parallel Inference Engine: pARACNE We propose a parallel computing framework to infer large scale co-expression networks using original ARACNe, termed as pARACNE (Parallel ARACNe) (Fig. 1). We use ARACNe in its original form parallelly in the N partition of large expression matrix. The idea is to infer sub-networks from the manageable data chunks parallelly using ARACNe followed by the merging of the sub-networks to generate the global or final co-expression network. It is worth to mention here that our objective is not to improve the overall quality of inference, rather it is to infer the network parallelly so as to generate a network that is close to the result of original ARACNe. We discuss next the different steps involved in the pARACNE workflow. 2.1 Splitting of Expression Matrix The overall intention of the work is not to redesign the original algorithm for parallelism. Instead, execute the original implementation in the split data matrix to generate
Fig. 1. The workflow of pARACNE. a Splitting of the input expression matrix into subsets. b Run ARACNe. c Sub-networks obtained after running ARACNe. d Finding hub genes in each sub-network. e Running ARACNe on each of the sub-networks along with the hub genes of other sub-networks. f Final sub-networks obtained. g The global network after combining all the sub-networks
sub-networks. At first, pARACNE considers the expression data matrix as input. The rows contain expression profiles of all the genes and the columns indicate the genes. Next, it divides the matrix horizontally into N sub-matrices. While splitting the matrix, we partition the matrix into non-overlapping N/p divisions. Figure 1a, depicts the step pictorially.
2.2 Sub-network Generation Using ARACNe Each subset obtained from the first step is then dealt with parallelly in the different concurrent execution units. ARACNe is executed parallelly on all the split subsets as shown in Fig. 1b in order to obtain sub-networks. This step includes pairwise mutual information (MI) calculation, DPI-based pruning and the generation of the adjacency matrix. 2.3 Finding Hub Genes Inspired by the idea of preferential attachment we try to re-link different sub-networks [10] produced by the different processing units. The rationale behind it is that real-world networks are usually scale-free networks. According to Barabasi and Albert [11], the more connected a node (hub) is, the more likely it is to receive new links, i.e., “richergetmore-richer.” To recover the missing links between the sub-networks, pARACNE considers the higher degree nodes with a view that hub nodes have stronger ability to link with nodes from other sub-networks and hence act as possible bridging nodes between the sub-networks. pARACNE, therefore, finds the hub genes in each sub-networks and takes the top two heavily connected nodes in each sub-network (step in Fig. 1d). Two possibilities are considered while deciding to choose the top "two" highly connected nodes in each subset. First, genes that may have been highly connected if considered in the original network (actual hub genes) may not be the most highly connected node in the subsets. This is due to the possible loss of links during partition. Secondly, the most highly connected node in a particular sub-network may not necessarily have any links to the other nodes in the other subsets because it is actually a simple node when considered in the original network and hence may not prove useful to bridge the sub-networks. Therefore, the top two highly connected nodes are considered to bridge sub-networks as discussed next. 2.4 Rewiring and Merging of Sub-networks To rewire the different sub-networks to form the global merged network, ARACNe is executed for the second time (parallelly) (Fig. 1e) on each of the sub-networks along with the hub gene set of the other sub-networks to get refined sub-networks (Fig. 1f). This ensures that some of the missing links are re-established and hence enhance the quality of the final inferred network by merging all the sub-networks. When this is completed, all the sub-networks are merged to obtain the final global adjacency matrix (network) (Fig. 1g).
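A simplified sketch of this split-infer-rewire-merge workflow is shown below; `infer_network` stands in for an ARACNe run on a sub-matrix. The actual implementation uses the R packages minet and doParallel, so this Python code and its helper names are only illustrative.

```python
from concurrent.futures import ProcessPoolExecutor

def paracne(expression, n_parts, infer_network, top_hubs=2):
    """Simplified pARACNE workflow sketch.

    expression: mapping gene -> expression profile.
    infer_network: picklable, module-level callable that runs an ARACNe-like
    inference on a sub-matrix and returns {gene: set(neighbours)}."""
    genes = list(expression)
    chunk = max(1, len(genes) // n_parts)
    parts = [genes[i:i + chunk] for i in range(0, len(genes), chunk)]  # split rows

    def submatrix(gene_subset):
        return {g: expression[g] for g in gene_subset}

    with ProcessPoolExecutor() as pool:
        # First pass: infer one sub-network per partition in parallel.
        subnets = list(pool.map(infer_network, [submatrix(p) for p in parts]))

    # Hub genes: the two most connected nodes of each sub-network.
    hubs = [sorted(net, key=lambda g: len(net[g]), reverse=True)[:top_hubs]
            for net in subnets]

    with ProcessPoolExecutor() as pool:
        # Second pass: re-run inference on each partition plus the other partitions' hubs.
        enriched = []
        for idx, p in enumerate(parts):
            extra = [g for j, hs in enumerate(hubs) if j != idx for g in hs]
            enriched.append(submatrix(list(p) + extra))
        refined = list(pool.map(infer_network, enriched))

    # Merge all refined sub-networks into the global adjacency structure.
    merged = {}
    for net in refined:
        for g, nbrs in net.items():
            merged.setdefault(g, set()).update(nbrs)
    return merged
```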
3 Results and Discussion Evaluating the performance of any inference method is arduous. Different methods of inference are proposed owing to the difficulty involved in directly measuring the regulatory relationships between various genes. We use synthetic expression matrices for assessment of inference quality. We show that pARACNE is scalable, at the same time capable of producing networks similar to the original ARACNe.
3.1 Dataset Used
Dialogue on Reverse Engineering Assessment and Methods (DREAM) [12] challenge is a major initiative in order to benchmark inference methods. We generate synthetic DREAM networks using Gene Net Weaver (GNW) [13] to assess the performance of pARACNE. The dataset used is given in Table 1.

Table 1. Expression dataset used for experimentations

Dataset   # Genes   # Interactions   # Time points
Ecoli-1   100       157              210
Ecoli-2   500       1322             210
Ecoli-3   1000      2019             210
Ecoli-4   1500      3594             210
Yeast-1   2000      5256             210
Yeast-2   3000      8029             210
Yeast-3   4000      11,323           210
3.2 Experimental Environment
Intel(R) Core(TM) i5-8500, 3.00 GHz CPU-based system with six (06) cores and 8 GB RAM is used for all experiments. The experiments are run in Ubuntu 18.04.3 LTS, which provides six concurrent threads of execution. We use the R parallel package doParallel,1 for implementing the proposed pARACNE, and the R minet package available at Bioconductor2 for implementing ARACNe.

1 https://cran.r-project.org/web/packages/doParallel/index.html.
2 https://www.bioconductor.org/.

3.3 Scalability Analysis
We test the scalability of pARACNE in terms of handling larger networks (number of nodes). We used a maximum 4000 × 210 sized data matrix as input for comparing the execution time consumed by both pARACNE and ARACNe. We also experimented with pARACNE using a varying number of concurrent execution units or data matrix partitions (2, 6, 10, 20, 50, 100). Figure 2 shows the scalability trend of both the methods. Due to the additional steps of splitting and merging, pARACNE consumes more time in comparison to ARACNe for smaller networks. However, it outperforms ARACNe with respect to execution time while handling larger networks, which was the main objective to be achieved. Further, it is also interesting to observe that pARACNE consumes more time for a larger number of partitions or sub-matrices. This possibly occurs because the system that we use to run pARACNE consists of six (06) cores only. Hence, increasing the logical divisions of the task more than the
physical concurrent execution units may not improve the overall execution time consumption. However, the execution time requirement drastically decreases with the size of the network, even with the varying number of concurrent execution units.
Fig. 2. Scalability of pARACNE and ARACNe with the varying network size. P indicates the number of partitions used
3.4 Qualitative Assessments
We compare pARACNE with the outcome of ARACNe to observe whether the proposed scheme is effective in producing results similar to the original ARACNe. For that, we infer networks using both methods and compare their accuracy using the receiver operating characteristic (ROC) curve, which is obtained by plotting the true positive rate (TPR) against the false positive rate (FPR). We report the Area Under the ROC (AUROC) score obtained by pARACNE for different sized networks with a varying number of concurrent units of execution (partitions) in Fig. 3a. From the figure, it can be observed that the performance of pARACNE improves significantly with the increase in the number of partitions. We even achieve 100% agreement with ARACNe for small-sized datasets with smaller partitions. Hence, the performance of pARACNE is proportional to the number of partitions. This may be due to the fact that finer partitions may produce more accurate sub-networks. The results of pARACNE and ARACNe are also compared individually against the gold standard networks to see the quality of the network inferred in comparison to the gold network (Fig. 3b). It can be seen that the AUROC score obtained by pARACNE against the gold network is marginally better in comparison to ARACNe.
(a) pARACNE with ARACNe
(b) pARACNE and ARACNe against gold networks
Fig. 3. Accuracy assessment of pARACNE with ARACNe and gold networks
4 Conclusion
Unlike the existing parallel versions of ARACNe, we presented a novel parallel framework for executing ARACNe without redesigning the actual algorithm. We used the actual implementation of ARACNe as a black box and plugged it into our proposed platform for handling large network inference. Interestingly, our framework is not specific to any particular algorithm; rather, it is generic enough to handle any suitable inference algorithm. The current implementation runs on a CPU architecture and can handle a few thousand (~10 K) genes effectively. Work is ongoing to implement pARACNE in a GPGPU environment in order to handle much larger genome-scale networks (>20 K) effectively, in a comparatively smaller amount of time, in the future. Funding. This research is supported by the Department of Science and Technology (DST), Govt. of India under the DST-ICPS Data Science program [DST/ICPS/Cluster/Data Science/General], carried out at NetRA Lab, Sikkim University.
References 1. H.N. Manners, S. Roy, J.K. Kalita, Intrinsic-overlapping co-expression module detection with application to alzheimer’s disease. Comput. Biol. Chem. 77, 373–389 (2018) 2. S. Roy, D.K. Bhattacharyya, J.K. Kalita, Reconstruction of gene co-expression network from microarray data using local expression patterns. BMC Bioinformatics 15(7), S10 (2014) 3. J.J. Faith, B. Hayete, J.T. Thaden, I. Mogno, J. Wierzbowski, G. Cottarel, S. Kasif, J.J. Collins, T.S. Gardner, Large-scale mapping and validation of escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol. 5(1), e8 (2007) 4. A.A. Margolin, I. Nemenman, K. Basso, C. Wiggins, G. Stolovitzky, R. Dalla Favera, A. Califano, Aracne: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7, S7 (2006) 5. P.E. Meyer, K. Kontos, F. Lafitte, G. Bontempi, Information-theoretic inference of large transcriptional regulatory networks. EURASIP J. Bioinf. Syst. Biol. 2007, 8 (2007)
6. T.M. Cover, J.A. Thomas, Elements of information theory (Wiley, New York, 2012) 7. A. Lachmann, F.M. Giorgi, G. Lopez, A. Califano, Aracne-ap: gene network reverse engineering through adaptive partitioning inference of mutual information. Bioinformatics 32(14), 2233–2235 (2016) 8. J. He, Z. Zhou, M. Reed, A. Califano, Accelerated parallel algorithm for gene network reverse engineering. BMC Syst. Biol. 11(4), 83 (2017) 9. U. Casal, J. González-Dominguez, M.J. Martin, Parallelization of aracne, an algorithm for the reconstruction of gene regulatory networks. In: Multidisciplinary Digital Publishing Institute Proceedings, vol. 21 (2019), p. 25 10. V. Karyotis, M. Khouzani, Malware diffusion models for modern complex networks: theory and applications. Morgan Kaufmann (2016) 11. A.L. Barabási, R. Albert, Emergence of scaling in random networks. Science 286(5439), 509–512 (1999) 12. D. Marbach, J.C. Costello, R. Küffner, N.M. Vega, R.J. Prill, D.M. Camacho, K.R. Allison, A. Aderhold, R. Bonneau, Y. Chen et al., Wisdom of crowds for robust gene network inference. Nature Methods 9(8), 796 (2012) 13. T. Schaffter, D. Marbach, D. Floreano, Genenetweaver: in silico benchmark generation and performance profiling of network inference methods. Bioinformatics 16, 2263–2270 (2011)
Information Fusion-Based Intruder Detection Techniques in Surveillance Wireless Sensor Network Anamika Sharma(B)
and Siddhartha Chauhan
Computer Science and Engineering Department, National Institute of Technology Hamirpur, Hamirpur 177005, India {anamika,sid}@nith.ac.in
Abstract. Intruder detection is a crucial application of surveillance wireless sensor networks (SWSN). The processing of detection information to derive a valid conclusion is the fundamental issue. The implementation of information fusion rules on the intruder detection information is the most advantageous solution. This paper presents a survey on the various information fusion-based intruder detection techniques and their related parameters. This paper also lists the various challenges that arise while designing an algorithm for intruder detection using SWSN. Keywords: Detection probability · Information fusion · Intruder detection · Fusion rules
1 Introduction The surveillance wireless sensor network is advantageous in detecting the presence of an intruder inside critical areas. The purpose of this network is to detect and report any unauthorized access to the base station with minimum delay and maximum detection probability. There are various categories of intruder detection algorithms. These algorithms can be classified as: hierarchal tree structure intruder detection [1], intruder detection based on the detection probability [2] and information fusion-based intruder detection [5–11]. This categorization of protocols is shown in Fig. 1. Coverage provided by the sensor nodes is an important parameter for the successful detection of an intruder [12]. This paper presents the various techniques that are classified under information fusion-based intruder detection category. There are several advantages of using information fusion techniques in intruder detection application of SWSN such as reduction in the size of data, removal of unwanted noise from the sensing data, reduce data traffic and generate inference about the sensed information [3]. The inference about the presence of the intruder is generated by the information fusion center. At first, the sensor nodes deployed inside any sensing region are grouped together, and each group consists of a few numbers of sensor nodes. Each group is considered as a cluster. The sensor nodes present inside the cluster detect the intruder © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_12
Fig. 1. Categories of intruder detection algorithms
based on the received signal strength and then compute the detection probability. Each sensor node transmits its local detection result, i.e., detection probability to the cluster head. Each cluster head performs computation on the local detection results and transmits the aggregated data to the fusion center. The information fusion center generates an inference about the presence of an intruder inside the sensing region. This inference is further transmitted to the base station for necessary decision making. Figure 2 represents the model of information fusion in surveillance wireless sensor networks.
Fig. 2. Information fusion model in surveillance wireless sensor networks
2 Information Fusion-Based Intruder Detection This section explains the various information fusion-based intruder detection techniques and their related parameters. Table 1 shows the parametric comparison between the information fusion-based intruder detection techniques.

Table 1. Parametric comparison between the information fusion-based intruder detection techniques

Information fusion-based intruder detection technique | Pd | τ | Fr | S | RSS | η
Chair–Varshney fusion rule for distributed detection [6] | ✓ | ✓ | ✓ | ✕ | ✓ | ✓
Local vote decision fusion for intruder detection [7] | ✓ | ✓ | ✓ | ✕ | ✓ | ✓
Fusion of threshold rules for intruder detection [8] | ✓ | ✓ | ✓ | ✕ | ✓ | ✓
Generalized Rao test for decentralized detection of an uncooperative intruder [9] | ✓ | ✓ | ✓ | ✕ | ✓ | ✓
Information fusion of passive sensor for mobile intruder detection [10] | ✓ | ✓ | ✓ | ✓ | ✓ | ✕
Counting rule for distributed detections [11] | ✓ | ✓ | ✓ | ✕ | ✓ | ✓
2.1 Parameters The parameters define the state and behavior of a particular technique or algorithm. On the basis of these parameters, the performance of one technique can be differentiated from another. The various parameters used in this category of intruder detection are explained below. Detection Probability (Pd) It defines the possibility of successful detection of the mobile intruder. The value of detection probability lies between 0 and 1 (0 < Pd ≤ 1). While evaluating the performance of an intruder detection algorithm, the detection probability must be greater than or equal to the threshold value τ (Pd ≥ τ). The detection probability is also analyzed using Poisson and Gaussian distributions. The cumulative detection probability Pd is represented by Eq. (1), where pi is the detection probability of the ith sensor node.

Pd = 1, if Pd ≥ τ
Pd = 1 − ∏(i = 1..N) (1 − pi), if Pd < τ        (1)
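As a quick numerical illustration of Eq. (1) (a sketch, not code from the paper), the cumulative detection probability of N nodes with individual probabilities pi can be computed as follows; the probabilities and the threshold are invented values.

def cumulative_detection_probability(p_list, tau=0.9):
    prod_miss = 1.0
    for p in p_list:
        prod_miss *= (1.0 - p)   # probability that every node misses the intruder
    pd = 1.0 - prod_miss         # probability that at least one node detects it
    return 1.0 if pd >= tau else pd

print(cumulative_detection_probability([0.4, 0.5, 0.6], tau=0.9))  # 0.88, below tau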
Threshold Value (τ ) This value sets a limit for the computation and analysis of detection probability and false alarm rate. Moreover, threshold value also helps the information fusion center to generate an inference about the presence of the intruder.
False Alarm Rate (Fr ) It is defined as the ratio of the number of false detections to that of the total number of detections. The false alarms can be of two types: critical false alarm and trivial false alarm. Critical false alarm occurs when an intruder is actually present inside the sensing area and is not detected by the sensor nodes. On the other hand, the trivial false alarm occurs when the intruder is not present inside the sensing area and still the sensor nodes are generating data regarding the presence of an intruder. False alarm rate is an important parameter that defines the efficiency and reliability of the intruder detection algorithm. Data Size (S) It is defined as the total amount of data generated by the sensor nodes about the presence of an intruder. This data needs to be processed firstly at cluster head and then at the information fusion center. The size of the data must be optimum. Because the large size of data needs more computation and communication power. The sensor nodes are resource constraint devices and large consumption of power during computation and communication leads to the premature failure of the network. On its contrary, if the size of the data is small, then the inference about the presence of the intruder can be inaccurate. Received Signal Strength (RSS) The intruder detection algorithm computed the detection probability Pd on the basis of signal strength received from the mobile intruder. The signal strength deteriorates as the distance between the sensor node and intruder increases. The received signal strength is defined as [4]. Ri =
a / |X − Si|^α + ηi        (2)
where Ri is the received signal strength at ith sensor nodes, a is the actual signal strength, |X − S i | = d is the Euclidian distance between the intruder and sensor nodes, α is the path loss exponent (0 < α ≤ 1) and ηi is the white Gaussian noise received at node i. Noise Statistics (η) It is defined as the total amount of unwanted disturbance present inside the received signal strength at node i. Noise is an influential parameter that requires prior analysis. Noise can disrupt the inference generated by the information fusion center about the presence of an intruder.
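The RSS model of Eq. (2) can be evaluated directly; in the sketch below the actual signal strength, node positions, path-loss exponent and noise level are invented example values, not parameters from any cited work.

import math, random

def received_signal_strength(a, intruder_xy, sensor_xy, alpha=0.9, noise_std=0.05):
    d = math.dist(intruder_xy, sensor_xy)   # Euclidean distance |X - Si|
    eta = random.gauss(0.0, noise_std)      # white Gaussian noise at node i
    return a / (d ** alpha) + eta

print(received_signal_strength(a=10.0, intruder_xy=(0, 0), sensor_xy=(3, 4)))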
2.2 Information Fusion-Based Intruder Detection Techniques The results about the presence of an intruder are generated after the local decisions from multiple sensor nodes are fused together at the information fusion center [5]. The various techniques for intruder detection are classified as below and their parametric comparison is represented in Table 1. Chair–Varshney Fusion Rule for Distributed Detection Niu et al. [6] have proposed a decision fusion algorithm for distributed detection. The proposed algorithm has generated the hypothesis based on the local detection results of sensor nodes. The statistical inference about the presence of the intruder is based on the total number of detection results made by the active sensor nodes where the intruder is detected at first. A sensor
node generates detection result on the basis of received signal strength. The decision fusion results must be greater than the threshold value. Accuracy and precision in the detection results are the main advantages of this fusion algorithm. However, considering and transmitting the detection results of each sensor node impose extra communication burden on the network. Hence, it reduces the network lifetime. Local Vote Decision Fusion for Intruder Detection According to the fusion algorithm of Katenka et al. [7], each individual sensor makes local decision about the detection of an intruder. An intruder emitted some signals which are sensed by the sensor node i. The RSS at ith sensor node also comprises of some noise. The detection results are further modified and validated according to the detection results received from the ith neighboring nodes. Local vote decision fusion algorithm uses distance parameter, i.e., distance between the sensor node and intruder and detection results from number of neighboring nodes to generate an inference about the presence of an intruder. At each sensor node, the aggregation of the neighboring nodes’ detection results with own results increases the reliability, accuracy and precision about the presence of an intruder. However, to aggregate the results from neighboring sensor nodes, it requires extra communication among nodes and hence require more power consumption. Fusion of Threshold Rules for Intruder Detection Zhu et al. [8] have proposed a centralized threshold-OR fusion rule to generate the intruder detection results. The Chebyshev’s inequality is applied to generate the inference. This inequality uses hit probability and false probability to derive out the valid conclusion. The authors have generated a binary hypothesis where the detection results which is greater than the threshold value of each sensor node are fused together. After applying the Monte Carlo simulation method, the authors have analyzed that this fusion algorithm has helped in achieving maximum hit probability and minimum false alarm probability. Generalized Rao Test for Decentralized Detection of an Uncooperative Intruder Ciuonzo et al. [9] have proposed Rao test-based information fusion algorithm for intruder detection. The main advantage of this algorithm is to detect an unknown intruder with unknown locations and then fuse the quantized detection results from multiple sensor nodes at the fusion center to generate global decisions. The detection results are based on the received signal strength. This fusion rule is efficient in generating reliable and robust detection result as compared to the other fusion rules in terms of performance and computational complexity. Information Fusion of Passive Sensor for Mobile Intruder Detection Li et al. [10] have proposed an information fusion algorithm which extracted the features of the timeseries received signals from multiple sensor nodes using the feature extraction tool. These features are grouped together at cluster head to generate the intruder detection results. The feature-level intruder detection technique is more robust and minimized the communication among nodes as compared to the data-level and decision-level information fusion techniques. Counting Rule for Distributed Detections According to the counting rule algorithm [11], the N sensor nodes transmit their binary detection decision to the fusion center.
The number of positive detections must be greater than the predefined threshold value. On the basis of number of positive detections, the fusion center infers about the presence of the intruder. This fusion rule is more reliable and accurate in terms of generating intruder detection inference, but generates huge amount of detection data. It requires a large amount of communication and computation power to process the detection data. Hence, this fusion rule decreases the network efficiency and lifetime. Table 1 shows that most of the information fusion algorithm does not consider the size of the data while fusing the detection results. This is an important parameter that needs to be considered while designing an information fusion algorithm for intruder detection.
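Before moving to the design challenges, a minimal sketch of the counting rule described above is shown; the decision vector and the threshold are illustrative only.

def counting_rule(local_decisions, threshold):
    positives = sum(1 for d in local_decisions if d)  # number of positive detections
    return positives >= threshold                     # fusion centre declares an intruder

print(counting_rule([1, 0, 1, 1, 0, 1], threshold=3))  # True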
2.3 Challenges for Information Fusion-Based Intruder Detection Algorithm The various challenges for the designing of the algorithm are listed below. 1. The size of the data generated by N sensor nodes is very large. The resource constrained sensor nodes require more computational power to process the large amount of data. 2. Generation of the huge amount of data causes network congestion and contention. 3. The presence of noise in the received signals need to be removed while generating an inference about the presence of an intruder. 4. Deployment of heterogeneous sensor nodes inside the sensing region. 5. Network coverage and connectivity maintenance for efficient intruder detection and transmitting the results to the cluster head. 6. The sensor nodes have limited battery power. Therefore, a proper node schedulingbased intruder detection algorithm is required.
3 Conclusion The intruder detection application of surveillance wireless sensor networks is defined as to detect the presence of an intruder with maximum detection probability. The intruder detection algorithm can be classified as: hierarchal tree structure for intruder detection, intruder detection based on detection probability and intruder detection based on fusion rules. This paper has reviewed the various information fusion-based intruder detection techniques and their related parameters. This paper has also presented the various challenges arises while designing the algorithm.
References 1. W. Zhang, G. Cao, DCTC: dynamic convoy tree-based collaboration for target tracking in sensor networks. IEEE Trans. Wireless Commun. 3(5), 1689–1701 (2004) 2. Y. Hu, M. Dong, K. Ota, A. Liu, M. Guo, Mobile target detection in wireless sensor networks with adjustable sensing frequency. IEEE Syst. J. 10(3), 1160–1171 (2014) 3. E.F. Nakamura, A.A. Loureiro, A.C. Frery, Information fusion for wireless sensor networks: methods, models, and classifications. ACM Comput. Surveys (CSUR) 39(3), 9 (2007)
4. C. Liu, D. Fang, Z. Yang, H. Jiang, X. Chen, W. Wang, T. Xing, L. Cai, RSS distribution-based passive localization and its application in sensor networks. IEEE Trans. Wirel. Commun. 15, 2883–2895 (2016) 5. A. Abrardo, M. Martalò, G. Ferrari, Information fusion for efficient target detection in largescale surveillance wireless sensor networks. Inf. Fusion 38, 55–64 (2017) 6. R. Niu, P.K. Varshney, Q. Cheng, Distributed detection in a large wireless sensor network. Inf. Fusion 7(4), 380–394 (2006) 7. N. Katenka, E. Levina, G. Michailidis, Local vote decision fusion for target detection in wireless sensor networks. IEEE Trans Signal Process 56, 329–338 (2008) 8. M. Zhu, S. Ding, R.R. Brooks, Q. Wu, N.S.V. Rao, Fusion of threshold rules for target detection in sensor networks (2010) 9. D. Ciuonzo, P.S. Rossi, P. Willett, Generalized Rao test for decentralized detection of an uncooperative target. IEEE Signal Process. Lett. 24(5), 678–682 (2017) 10. Y. Li, D.K. Jha, A. Ray, T.A. Wettergren, Information fusion of passive sensors for detection of moving targets in dynamic environments. IEEE Trans. Cybern. 47(1), 93–104 (2016) 11. N. Sriranga, K.G. Nagananda, R.S. Blum, A. Saucan, P.K. Varshney, Energy-efficient decision fusion for distributed detection in wireless sensor networks. In: 2018 21st International Conference on Information Fusion (FUSION) (pp. 1541–1547). IEEE (2018) 12. A. Sharma, S. Chauhan, Target coverage computation protocols in wireless sensor networks: a comprehensive review. Int. J. Comput. Appl. 1–23 (2019). https://doi.org/10.1080/1206212x. 2019.1663382
Evaluation of AOMDV Routing Protocol for Optimum Transmitted Power in a Designed Ad-hoc Wireless Sensor Network Suresh Kumar1(B) , Deepak Sharma2 , Payal1 , and Mansi3 1 Department of Electronics and Communication Engineering, University Institute of
Engineering & Technology, Maharshi Dayanand University, Rohtak, Haryana, India [email protected], [email protected] 2 Department of Physics & Electronics, A.I.J.H.M. College, Rohtak, Haryana, India [email protected] 3 Software Engineer YMSI Limited, Faridabad, India [email protected]
Abstract. The sensor nodes in ad-hoc wireless sensor networks (AWSN) are battery powered which is one of the limiting factors in their performance. However, seamless and longer duration connectivity depends upon the lifetime of individual sensor nodes. Selection of the routing protocol and quantum of power the sensor node must use for transmission are the critical design issues. The present research work involves the evaluation of ad-hoc on-demand multipath distance vector (AOMDV) routing protocol on a designed AWSN network scenario involving configuration of 25, 50 and 75 nodes. In order to conserve the energy, the optimum power required for transmission by the sensor nodes for efficient and seamless working has been obtained. The parameters selected for evaluation are packet delivery fraction (PDF), normalized routing load (NRL) and average energy consumption (AEC) per node as performance metrics using random waypoint mobility model (RWMM) and constant bit rate (CBR) as traffic application. The optimum performance has been obtained for a maximum number of nodes with a transmitted power level of 2.5 dBm by the sensor nodes. Keywords: RWMM · CBR · PDF · NRL · AEC · Dynamic source routing (DSR) · Zone-based energy efficient routing protocol (ZEERP)
1 Introduction AWSN is a dynamic topology network having random placement of sensor nodes the position of which changes rapidly on need basis. These networks are ad-hoc in nature due to lack of any pre-existing infrastructure like routers in access points or wired networks. The technique by which the sensor nodes are connected is also dependent on the power consumed by the nodes and their location that may intermittently vary in the network. The communication between the sensor nodes is maintained by transmitting packets containing data over a common wireless channel which thereby limits the radio © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_13
coverage [1]. The sensor nodes in AWSNs have limited power capabilities due to which the neighboring sensor nodes get limited in terms of resources. The topology of network changes due to change in mobile sensor node location, resulting in the network scenario becoming complex [2, 3]. The higher operating frequencies suffer from interference and fading in an urban environment, which uphold the links unreliable. Therefore, effective and accurate power aware routing techniques are required for effective network design in AWSNs. During operation of sensor nodes over a period of usage, when the battery power of any sensor node degrades or fails, it effects the entire network communication. Thus, the present research work is related to optimization of transmitted power by the sensor nodes in the network and selection of suitable routing protocol thereby ensure lifetime enhancement, which is certainly a challenging task for researchers. The authors in [4] have discussed several important factors such as QoS, network connectivity, power usage and management, scaling ability, fault tolerance, congestion, production cost and latency. The termite–hill protocol outperforms all other protocols having 807.78 throughput and 0.982 success rate with least dissipation of energy. The authors in [5] have presented a broad evaluation of clustering and cluster-based routing protocols for energy consumption in AWSN’s. In [6], the authors have proposed a novel approach that improves the network management with extended lifetime and even distribution of load across the network. Stable version of ZEERP (SZEERP) for load balancing has also been evaluated. The authors in [7] have analyzed the performance of delay tolerant network routing protocols under the influence of mobility models. In [8], the authors have proposed schemes for minimizing end-to-end delay in WSN. Further, the present research paper organization is as follows: The routing protocols are briefed in Sect. 2. Section 3 presents the simulation parameters in the network design. The results obtained have been analyzed in Sect. 4. The overall outcome has been summarized in Sect. 5.
2 AWSNs Routing Protocol The routing protocol (RP) is the defined set of regulations and standards that provide a method of communication and mechanism of route selection among the nodes while maintaining high QoS standards [9, 10]. The RP’s are classified in two categories as shown in figure below namely: unipath routing protocols (URP) and multipath routing protocols (MRP). The former transmits data packets via a single route while the later uses multiple routes for communication between sender and receiver nodes. Figure 1 gives the basic classification of RP’s in AWSNs. The nodes in proactive RP, i.e., DSDV, provide routing information about the network topology up to date and routes are created between them before they are required by the network. The reactive RP, i.e., AODV and DSR, has no predefined routes. In such RP, the sensor nodes establish their routes on demand dynamically. Once the route is established, the packets of data are communicated to all or its intermediate neighbor nodes. Hybrid RP (ZRP) is a non-uniform RP which utilizes adaptive and minimal overhead control to optimize the network performance. With the mechanism of route discovery, the scalability of the network is also increased by proactive route management [11, 12].
Fig. 1. Basic classification of RP’s
AODV is a RP with a capability of supporting unicast as well as multicast routing. The routes are established between source and destination nodes on demand dynamically and are constantly maintained as long as their utility is essential and required by the sources. AODV is thus considered as an on-demand algorithm. The multipath extension to AODV protocol is AOMDV. The AOMDV protocol uses an alternate route in case of network failure besides the multiple routes created between the source and destination pairs. AOMDV establishes loop free on-demand routes while maintaining connectivity and enables efficient failure recovery. AOMDV provides an improvement over URPs in handling the network load efficiently, avoiding congestion which increases consistency of the network.
3 Simulation Setup For carrying out simulations, an AWSN scenario has been created with nodes (25, 50 and 75) randomly placed over 1500 * 1500 meters terrain size using QualNet Simulator 7.3.1. Using this sensor node configuration, we have firstly evaluated the performance of AODV and AOMDV protocol for the performance metrics—throughput and average end-to-end delay (AEED). Based on the superior performance of AOMDV protocol, the performance of the designed network scenario has been further evaluated with varying transmitted power for AOMDV protocol for qualitative performance metrics—PDF, NRL, AEC per node. The mobility model used in our network scenario is RWMM and CBR as the traffic application. The data packet size chosen is 512 bytes with 10 m/s speed per node and data rate of 2 Mbps. Table 1 provides a list of simulation parameters and the values used in the designed network scenario. The simplified radio power model is used for simulation in which the energy spent for data packet transmission depends on two factors—(i) Radio range, (ii) number of hops required to reach destination. The free space model assumes that there exists only one route for communication between sender and receiver nodes. However, the signal generally spreads through two paths, a LOS path, and the path through which the wave reflects. The mathematical expression is given as follows: Pr =
Pt Gt Gr Ht^2 Hr^2 / (D^4 L)        (1)
Table 1. Parameters with their mentioned values used in simulation

Parameter | Value
No. of nodes | 25, 50, 75
Terrain size | 1500 * 1500 m
MAC protocol | IEEE 802.15.4
RP | AOMDV, AODV
Model | RWMM
Transmission power | 1–4 dBm
Maximum speed of node | 10 m/s
Data traffic | CBR
Data packet size | 512 bytes
where
Pt = transmitted signal power,
Pr = received signal power,
Gt, Gr = gain (transmitted and received),
D = distance,
L = length,
Ht, Hr = height of the transmitter and receiver.
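A direct evaluation of Eq. (1) is straightforward; the parameter values below are arbitrary examples, not settings taken from the simulator.

def received_power(pt, gt, gr, ht, hr, d, loss=1.0):
    # Two-ray ground style model: Pr = Pt*Gt*Gr*Ht^2*Hr^2 / (D^4 * L)
    return (pt * gt * gr * ht**2 * hr**2) / (d**4 * loss)

print(received_power(pt=0.1, gt=1.0, gr=1.0, ht=1.5, hr=1.5, d=100.0))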
When a data packet is transmitted between the source and destination, the total energy consumed can be expressed as:

Etotal = k [Et + Er] + Ed        (2)

where k is the multiplying factor, Et and Er denote the energy consumed in forwarding a data packet to the next hop and in receiving a data packet from the previous hop, respectively, and Ed denotes the energy consumption by the overhearing node. Further, the energy consumed during transmission can be expressed as:

Et = (Etb + Ed r^n) B        (3)

where Etb denotes the energy consumed by the transceiver in transmitting one bit of data, Ed denotes the dissipated energy, r is the range of transmission, B is the transmission bit rate and n denotes the power index for path loss in the channel. The energy consumed during reception is expressed as:

Er = Erb B        (4)
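The per-packet energy bookkeeping of Eqs. (2)-(4) can be sketched as below; the constants are illustrative and the grouping of Etb and Ed·r^n in Eq. (3) follows the reconstruction above, not a formula quoted verbatim from the paper.

def tx_energy(e_tb, e_d, r, n, b):
    return (e_tb + e_d * r**n) * b   # Eq. (3); b is the transmission bit rate per the text

def rx_energy(e_rb, b):
    return e_rb * b                  # Eq. (4)

def total_energy(k, e_t, e_r, e_overhear):
    return k * (e_t + e_r) + e_overhear   # Eq. (2)

et = tx_energy(e_tb=50e-9, e_d=100e-12, r=50, n=2, b=2e6)
er = rx_energy(e_rb=50e-9, b=2e6)
print(total_energy(k=1, e_t=et, e_r=er, e_overhear=0.0))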
The transmission range must be short and several intermediate nodes can be used to reduce the energy consumption, increase the transmission range and reliability of the nodes. The various performance metrics for evaluating performance are:
(i) PDF: It is expressed as

PDF = No. of data packets delivered to the destination / No. of data packets generated at CBR sources

(ii) NRL: It is expressed as

NRL = No. of routing control packets transmitted by the nodes / No. of packets received by all destinations

(iii) AEC per node: It is expressed as

AEC = Sum of transmitted and received energy consumed by active nodes / No. of nodes present in the network
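Each metric is a simple ratio of counters collected during a simulation run; the sketch below uses made-up counter values.

def pdf(delivered, generated):
    return delivered / generated                    # packet delivery fraction

def nrl(routing_control_pkts, received_pkts):
    return routing_control_pkts / received_pkts     # normalized routing load

def aec_per_node(total_tx_rx_energy, num_nodes):
    return total_tx_rx_energy / num_nodes           # average energy consumption per node

print(pdf(940, 1000), nrl(1350, 1000), aec_per_node(1025.25, 25))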
4 Result and Discussion We have used QualNet Simulator 7.3.1 to design the AWSN scenario with node configurations (25, 50 and 75) involving random placement of the nodes over 1500 * 1500 m terrain size. The nodes are changing location at a speed of 10 m/s, the mobility model as RWMM, the traffic application type as CBR; we have evaluated the designed network scenario for AODV and AOMDV protocols. Further, they are compared based on performance metrics— throughput and AEED. The graph depicting throughput with nodes varying from 25 to 75 is shown in Fig. 2.
Fig. 2. Throughput (bits/s) with node configuration (25, 50 and 75 nodes) for AODV and AOMDV
From Fig. 2, the value of throughput for AODV is 6875, 5970, 4750 (bits/s) and for AOMDV is 7570, 6800 and 5455 (bits/s) for 25, 50 and 75 nodes, respectively. AOMDV being a multipath routing protocol provides higher throughput as compared to AODV.
Further due to increase in network congestion, the value of throughput decreases with nodes increasing from 25 to 75. From Fig. 3, the AEED for AODV is 0.23, 0.26 and 0.43 (in seconds) and for AOMDV is 0.18, 0.21 and 0.36 s for 25, 50 and 75 nodes, respectively. AEED exhibits an increasing trend with increase in number of nodes for both AODV and AOMDV protocols. However, AOMDV performs better with smaller AEED than AODV. From the graphical illustrations, it is evident that in a dynamic AWSN scenario, AOMDV outperforms AODV protocol.
Fig. 3. AEED (in seconds) with node configuration (25, 50 and 75 nodes) for AODV and AOMDV
After having established the superiority of the AOMDV protocol, it is important to find out the optimum transmitted power for the sensor nodes that keeps them alive longer in the network. The evaluations have been done by varying the transmission power from 1 to 4 dBm for this protocol. The variation of PDF with transmission power for 25, 50 and 75 nodes is shown in Fig. 4.
Fig. 4. PDF with transmission power (1–4 dBm) for 25, 50 and 75 nodes
From Fig. 4, it can be observed that when the transmission power increased from 1 dBm to 2.5 dBm, the PDF increases for each 25, 50 and 75 pairs of sensor nodes. Further, the PDF decreases with increase in transmission power beyond 2.5 dBm. It is also evident that the network congestion increases with the increase in number of sensor
nodes and the PDF thereby decreases significantly. This signifies that 2.5 dBm is the optimum transmitted power. Figure 5 depicts a bar graph showing the performance metric NRL with transmission power varying from 1 to 4 dB.
Fig. 5. NRL with transmission power (1–4 dBm) for 25, 50 and 75 nodes
From Fig. 5, for 25 nodes, the value of NRL is 1.9 at 1 dBm. The value of NRL reduces to 0.53 with transmission power varying from 1 to 4 dBm due to the reduced number of hops involving the source–destination pairs. Similar variations in NRL are observed for 50 and 75 nodes. It is evident that with variation in transmitted power up to 2.5 dBm, the NRL decreases linearly from its maximum to its minimum value for 75, 50 and 25 nodes, respectively. However, beyond 2.5 dBm, the NRL is not linear, especially when the nodes are increased from 50 to 75. The variation of AEC per node with varying transmission power for the network scenario is shown in Fig. 6.

Fig. 6. AEC per node (mWHr) with transmission power (1–4 dBm) for 25, 50 and 75 nodes
Figure 6 shows that AEC per node increases with transmission power increasing from 1 to 4 dB. The value of AEC is 38.21, 40.62 and 41.32 at 1 dBm transmission power for 25, 50 and 75 nodes, respectively. It is also evident from Fig. 6 that when nodes are increased from 25 to 75, the AEC increases due to large number of intermediary
hops and the data packets require a large transmission power to reach the destination. From the results, it can be concluded that the designed AWSN network scenario using AOMDV protocol performs optimally even for a maximum number of 75 nodes at a node transmission power of 2.5 dBm.
5 Conclusion The novelty of this research work lies in determining the optimum transmitted power that the sensor nodes should use so that they remain alive and available for longer, seamless communication even when the number of nodes in the network is increased. Overall, the present work involves the performance evaluation of the AOMDV routing protocol on a designed AWSN scenario with 25, 50 and 75 nodes. AOMDV performs better than AODV for the several performance metrics evaluated in the present research work. Based on this outcome, further evaluation of AOMDV is carried out for the qualitative performance metrics—PDF, NRL and AEC per node—with the transmission power used for efficient and seamless transmission varied from 1 to 4 dBm. For evaluation, CBR has been used as the traffic application and RWMM as the mobility model. From the results, the AOMDV protocol performs efficiently at an optimum transmitted power of 2.5 dBm, with PDF 0.94, 0.82, 0.55, NRL 1.35, 1.57, 2.54 and AEC per node 41.01, 42, 42.38 for configurations of 25, 50 and 75 nodes, respectively. This simulation work will facilitate the hardware designer in selecting the components for various submodules of the nodes to be used in a fielded network.
References 1. K. Mor, S. Kumar, D. Sharma, Ad-hoc wireless sensor network based on IEEE 802.15.4 theoretical review. Int. J. Comput. Sci. Eng. 6(3), 220–225 (2018). https://doi.org/10.26438/ ijcse/v6i3.220225 2. S. Kumar, S. Kumar, Analysis of SEAHN demand distance vector in wireless mobile ad-hoc network. Int. J. Enhanced Res. Sci. Tech. Eng. 4(7), 273–281 (2015) 3. S. Kaushik, S. Kumar, Intrusion detection in homogenous and heterogeneous wireless sensor networks. Int. J. Emerging Trends Tech. Comput. Sci. (IJETTCS) 2(3), 225–232 (2013) 4. L.K. Ketshabetswe, A.M. Zungeru, M. Mangwala, J.M. Chuma, B. Sigweni, Communication protocols for wireless sensor networks: a survey and comparison. Heliyon 5(5), e01591 (2019). https://doi.org/10.1016/j.heliyon.2019.e01591 5. F. Fanian, M.K. Rafsanjani, Cluster-based routing protocols in wireless sensor networks: a survey based on methodology. J. Netw. Comput. Appl. 142, 111–142 (2019). https://doi.org/ 10.1016/j.jnca.2019.04.021 6. R. Sharma, M. Sohi, N. Mittal, Zone-based energy efficient routing protocols for wireless sensor networks. Scalable Comput. Pract. Exp. 20(1), 55–70 (2019). 10.12694/scpe.v20i1.1432 7. M.D. Sharif Hossen, M.S. Rahim, Analysis of delay-tolerant routing protocols using the impact of mobility models. Scalable Comput. Pract. Exp. 20(1), 17–26 (2019). https://doi. org/10.12694/scpe.v20i1.1450 8. A. Capone, Y. Li, M. Pióro, D. Yuan, Minimizing end-to-end delay in multi-hop wireless networks with optimized transmission scheduling. Ad Hoc Netw. 89, 236–248 (2019). https:// doi.org/10.1016/j.adhoc.2019.01.004
9. K. Mor, S. Kumar, Evaluation of QoS metrics in ad-hoc wireless sensor networks using zigbee. Int. J. Comput. Sci. Eng. 6(3), 92–96 (2018). https://doi.org/10.26438/ijcse/v6i3.9296 10. S. Khurana, S. Kumar, D. Sharma, Performance evaluation of congestion control in MANETs using AODV, DSR and ZRP protocols. Int. J. Adv. Res. Comput. Sci. Softw. Eng. (IJARCSSE) 7(6), 398–403 (2017) 11. A.Z. Ghanavati, D. Lee, Optimizing the bit transmission power for link layer energy efficiency under imperfect CSI. IEEE Trans. Wireless Commun. 17(1), 29–40 (2018). https://doi.org/ 10.1109/twc.2017.2762301 12. D. Sharma, S. Kumar, A comprehensive review of routing protocols in heterogeneous wireless networks. Int. J. Enhanced Res. Manag. Comput. Appl. 4(8), 105–121 (2015)
A New Automatic Query Expansion Approach Using Term Selection and Document Clustering Yogesh Gupta1(B) and Ashish Saini2 1 Manipal University Jaipur, Rajasthan, India [email protected] 2 Dayalbagh Educational Institute, Agra, UP, India [email protected]
Abstract. The paper focus on managing term selection task for PRF-based automatic query expansion. The proposed automatic query expansion approach is based on document clustering along with usage of new fuzzy inference system designed to combine various term selection measures. An integration of efficient document clustering algorithm and hybrid term selection method makes this approach better than other AQE methods. The performance of proposed approach and other AQE methods is tested on sets of queries for CACM and CISI datasets. The higher values obtained for different information retrieval performance measures prove superiority of proposed AQE approach over other AQE methods. Keywords: Automatic query expansion · Term selection · Fuzzy logic · Document clustering · F-measure
1 Introduction Query expansion technique has been an efficient approach to enhance the performance of document retrieval system. Few approaches related to query expansion are also reported in literature [1, 2]. The reported work is based on pseudo-relevance feedback (PRF). But, PRF-based query expansion may suffer with “drift” away problem. This problem may be overcome by document clustering. In clustered documents, top retrieved documents contain less noise in comparison to un-clustered document-based query expansion using PRF. Therefore, a new document clustering-based automatic query expansion (AQE) approach is introduced in this paper. Term selection is one of the AQE technique which determines a set of most suitable terms for expansion. There are several term selection methods to catch the appropriate terms for PRF-based query expansion such as Jaccard coefficient, Dice coefficient and others and no method is perfect. Therefore, a new term selection approach is proposed in this paper, which overcomes the weaknesses of individual term selection methods and utilizes their strengths. The proposed approach determines the most suitable terms after combining the weights of each unique term © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_14
of top-ranked documents using fuzzy rules. These unique terms are the outcomes of each term selection method. Further, document clustering enhances the performance of proposed approach. It is observed from literature survey that query term selection methods can be categorized into three categories as: co-occurrence information based [3], class based [4] and corpus based [5]. Each method determines the importance of expansion terms by calculating weighting scores and ranked them according to their individual score. It is not necessary that all important terms are presented in top-ranked documents; these may be somewhere in the middle ranked documents. Therefore, many important terms, those are not in top-ranked terms, cannot be used by using individual term selection methods for AQE. Thus, it is natural to combine these term selection methods using fuzzy logic to improve performance. It is obvious that not all obtained terms are suitable to user given query and become essential to filter the noisy terms to avoid query drifting problem. Therefore, a semantic filtering approach [6] is used in this paper, which fulfills the required purpose. The document clustering-based proposed AQE approach is compared with original query, Parapar et al. [7] AQE approach and proposed AQE approach without document clustering. Two benchmark datasets CACM and CISI are taken to perform all the experiments. The reason of using document clustering is that the most relevant document for a query may have many relevant documents nearby. These documents may be dominant for this query. The paper is organized as follows: In Sect. 2, preliminaries and related work is discussed briefly. Section 3 describes proposed approach in detail. Section 4 presents result analysis and discussion. Finally, Sect. 5 draws the conclusion of paper.
2 Preliminaries and Related Work This section discusses the theoretical foundation of pseudo-relevant feedback process and term selection using co-occurrence-based methods and corpus-based methods. This section also describes the related work of document clustering. 2.1 Document Selection Using PRF In pseudo-relevant feedback process, the possible expanded terms are taken from topranked documents. Suppose, initially top n documents are selected from total retrieved documents. This initial retrieval of documents is fully dependent on proper selection of similarity function. Okapi-BM25 is used in this work for the same purpose. After submitting query to an IR system, relevant documents are extracted and then, top n documents are taken. These documents are used to identify unique terms to form a term pool and these terms can be ranked by anyone of the several term selection methods. These term selection methods are presented in following subsections.
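The initial PRF retrieval step only needs a similarity function and a cut-off; a compact Okapi-BM25 scorer is sketched below with a toy corpus and the common k1/b defaults, which are assumptions rather than settings reported by the authors.

import math
from collections import Counter

def bm25_rank(docs, query, k1=1.2, b=0.75, top_n=6):
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    n_docs = len(docs)
    scores = []
    for idx, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for q in query.lower().split():
            if q not in tf:
                continue
            idf = math.log(1 + (n_docs - df[q] + 0.5) / (df[q] + 0.5))
            score += idf * tf[q] * (k1 + 1) / (tf[q] + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append((score, idx))
    return [i for _, i in sorted(scores, reverse=True)[:top_n]]

docs = ["query expansion with fuzzy logic",
        "document clustering for retrieval",
        "fuzzy inference for term selection"]
print(bm25_rank(docs, "fuzzy term selection", top_n=2))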
2.2 Co-occurrence Coefficient-Based Term Selection Method This method uses co-occurrence to determine the relationship among candidate terms and query terms [8]. The mathematical expressions of some co-occurrence coefficients related to such method are given below.

Dice_co(ti, tj) = 2 dij / (di + dj)        (1)
Jaccard_co(ti, tj) = dij / (di + dj − dij)        (2)
Cosine_co(ti, tj) = dij / √(di dj)        (3)
where d i and d j represent the number of documents containing t j and t j , respectively, whereas d ij denotes the number of documents containing both t i and t j together. 2.3 Term Selection Method Based on Term Distribution in Corpus This section describes two methods as follows: Kullback Leibler Divergence (KLD). The central idea of this method is relying on term distribution in pseudo-relevant document and in the entire dataset. Robertson Selection Value (RSV). This method was proposed by Robertson [9]. According to this method, if the weight of candidate term is wt , then the class (relevant document/non-relevant document) containing this term will add wt to the values of their matching function. The abovementioned term selection methods for PRF-based AQE have certain demerits as discussed above. A suitably designed fuzzy logic-based term selection method for query expansion may be more effective in comparison to abovementioned co-occurrence based and statistics-based term selection methods. Therefore, such type of term selection-based AQE approaches are proposed in this work. 2.4 Related Work The major feature of Vector Space Model is multi-dimensional representation of documents. This is a challenging task for clustering algorithms because the efficiency of these algorithms go down in high-dimensional feature spaces [10]. Li et al. [10] proposed two text document clustering approaches. These approaches were based on word’s sequential patterns in documents unlike VSM. The topics of the documents are represented by the sequence of frequent words. The documents are clustered together, which have the same sequences. Li et al. [11] also proposed another clustering approach based on feature selection. This clustering process gives cluster label information that is again used for the feature selection. Lee et al. [18] presented the use of document clustering in PRF-based query expansion. They considered a document is pseudo--relevant only when it has many similar features as other documents have and low or no similar feature with neighbor documents. Therefore, existing clustering approaches do not provide exact distribution of documents for PRF.
3 Proposed Document Clustering-Based AQE Approach The work done in this paper can be divided into two parts: firstly, term selection-based AQE approach and secondly, document clustering to improve the efficiency of proposed AQE. For this purpose, EFPSOKHM clustering approach which is developed and recently reported [12] by authors is used along with proposed new term selection-based AQE approach (TSBAQE) in this paper. Figure 1 presents the block diagram of proposed approach and Table 1 describes the notations used in Fig. 1. First, required parameters are initialized such as number of relevant documents (m), number of query expansion terms (n) and a counter (t) is set to zero. Then after, all documents and queries are converted into vectors. Suppose D = (d 1 , d 2 , …d n ) denotes the collection of documents. T = (t 1 , t 2 ,…, t m ) represents the terms occurred in data set D. Each document to be clustered is represented in the form of a vector. In this work, each document Di is considered as a point in m-dimensional vector space. Further all documents are clustered using EFPSOKHM clustering approach. Afterward, distance of query vector Q is computed from each cluster. The document cluster which has minimum distance from query vector will be most relevant. All unique terms are selected from these top m documents. For each unique term, Jaccard coefficient, frequency coefficient, cosine coefficient, KLD and RSV are calculated. These computed values are used as input for fuzzy inference system, which gives a weighted list of terms as output. Further, top n terms are chosen for query expansion. Again this modified query is converted into vector and ranking function is applied to get a ranked list of documents. This process is repeated for five times as shown in Fig. 1. Fuzzy inference system used in TSBAQE. The FIS used in this work is composed of three fuzzy logic controllers (FLC) as shown in Fig. 1. FLC co deals with three cooccurrence coefficients and gives an output for each term, i.e., term weight wco based on fuzzy rules. Similarly, FLC sta gives term weight wsta for each term as output after fuzzyfying two statistical distribution term selection measures (KLD and RSV ). The weights wco and wsta along with TFIDF score of each term are used as inputs for FLC main , which gives final weight wfinal for each term. Therefore, FIS gives a weight signifying cumulative effect of co-occurrence coefficients, statistical distribution term selection measures and TFIDF measure for better term selection. Then, semantic filter is used to filter less-suitable terms from obtained list. At the last, top m terms are selected to expand the query as per their similarity. A fuzzy rule base is formed for all the three fuzzy logic controllers of FIS to control output variables (wco , wfinal and wsta ). The knowledge used in this paper to construct fuzzy rules is tabulated in Table 2. All the fuzzy rules carry equal weight. The AND operator is used to obtain a fuzzy set representing antecedent in canonical form of that particular fuzzy rule in this FIS. Implication is performed by applying fuzzy operator, which gives an output fuzzy set for consequent part of each fuzzy rule. In this work, centroid method [13] is used as defuzzification method.
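To give a feel for how such a rule base maps several normalised term scores onto one weight, a deliberately tiny two-rule, weighted-average fusion is sketched below. It is only a toy stand-in: the actual FIS uses three controllers, 63 rules and centroid defuzzification as described above.

def mu_low(x):
    return max(0.0, 1.0 - x)   # membership of "low" on [0, 1]

def mu_high(x):
    return max(0.0, x)         # membership of "high" on [0, 1]

def fuse(scores):
    # Rule 1: if all inputs are low  -> output weight near 0.2
    # Rule 2: if all inputs are high -> output weight near 0.8
    fire_low = min(mu_low(s) for s in scores)
    fire_high = min(mu_high(s) for s in scores)
    if fire_low + fire_high == 0:
        return 0.5
    return (fire_low * 0.2 + fire_high * 0.8) / (fire_low + fire_high)

print(fuse([0.9, 0.7, 0.8]))   # about 0.73: term leans towards "important"
print(fuse([0.1, 0.2, 0.05]))  # about 0.24: term leans towards "unimportant"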
4 Experimental Analysis and Discussion Two different benchmark datasets CACM and CISI with different characteristics are used for fair comparison of the performances of proposed approaches in experiments. Totally
[Fig. 1 depicts the overall flow: the data corpus is converted into document vectors and clustered with EFPSOKHM; the submitted query (with m, n initialized and counter t = 0) is converted into a query vector; the minimum-distance cluster is selected and a ranking function produces a ranked list of retrieved documents; unique terms from the top m documents form a term pool; for each term the Jaccard, Frequency, Cosine, KLD and RSV coefficients are computed and passed, together with TFIDF, through the FLCco, FLCsta and FLCmain controllers of the fuzzy inference system; after semantic filtering and reweighting, the top n expansion terms form the new query, and the process repeats until t = 5, when the result is displayed.]
Fig. 1. Block diagram of document clustering-based TSBAQE query expansion approach
fifty queries are selected randomly from these datasets. Two different types of analysis are done in this work. The results are compared with original query, Parapar et al. approach [7] and proposed TSBAQE without clustering approach so that the improvements in IR performances can be measured after document clustering. The proposed term selection-based AQE approach is based on PRF. It is much required to select the best values of n and m. In this paper, the values for n and m are set empirically. However, in present work, n is set to 10 and m is set to 6 for obtaining good results as reported in reference [6]. In proposed approach, first top n documents are taken from initial run and then all unique terms are listed. Further, the weight is computed for each term using proposed approach. In this work, different weights are given to original query terms and supposed to be added terms for expansion.
Table 1. Details of entities and variables used in the FIS

Notation | Description
m | Number of top retrieved documents
n | Number of top terms for query expansion
t | Counter to check iteration
Jaccard_Co | The Jaccard coefficient score value of a term
Frequency_Co | The frequency coefficient score value of a term
Cosine_Co | The cosine coefficient score value of a term
TFIDF | TFIDF score value of a term
KLD | Score value computed by the KLD method for a term
RSV | Score value computed by the RSV method for a term
wco | Intermediate output representing term weight for FLCco
wsta | Intermediate output representing term weight for FLCsta
wfinal | Output of FLCmain representing the final term weight
4.1 Query Specific Analysis We have calculated F-Measure to analyze the performance of each query of proposed document clustering-based TLBAQE approach and compared the results with other approaches. F-Measure values are computed at three cut-offs: top 10, 30 and 50 retrieved documents. Better F-measure values are obtained using proposed approach in comparison to other approaches for both the datasets as shown in Figs. 2, 3, 4, 5, 6 and 7. Figure 2 shows the results of top ten cut-off for CACM. This figure evidently shows that TSBAQE performs better after document clustering in comparison to Parapar et al. approach [7] and query expansion without clustering for forty-two queries out of fifty queries. Figure 3 illustrates a comparison at top ten documents cut-off for CISI. This figure depicts the superiority of TSBAQE over Parapar et al. approach and TSBAQE without clustering for forty-six queries. Figures 4 and 5 show the results at top thirty cut-off for both the datasets. These figures demonstrate that TSBAQE gets better Fmeasure after clustering for forty-two queries and forty-six queries in case of CACM and CISI respectively. Similarly, Figs. 6 and 7 also illustrate that TSBAQE query expansion approach performs better after document clustering at top fifty cut-off for both datasets. The query wise performance is also analyzed. In Figs. 8 and 9, the length of each bar shows the variation in precision of document clustering-based approach over without clustering approach at top fifty cut-off. These figures clearly show that the proposed approach TSBAQE gets better precision values after document clustering as compared to without clustering. Precision and recall values are also determined and compared for randomly selected three queries as shown in Tables 3 and 4 for CACM and CISI respectively.
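The per-query scores plotted in Figs. 2-7 follow directly from precision and recall at a cut-off k; a small sketch with invented inputs is given below.

def f_measure_at_k(ranked_ids, relevant_ids, k):
    retrieved = ranked_ids[:k]
    hits = len(set(retrieved) & set(relevant_ids))
    precision = hits / k
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f_measure_at_k([3, 7, 1, 9, 4], {1, 4, 8, 9}, k=5))  # about 0.67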
Table 2. The knowledge used to construct fuzzy rules

FLC | Domain knowledge | No. of fuzzy rules framed
FLCco | If a term has low Jaccard_Co, low Frequency_Co and low Cosine_Co values, then wco will be low; if a term has high Jaccard_Co, high Frequency_Co and high Cosine_Co values, then wco will be high | 27
FLCsta | If a term has low KLD score and low RSV score, then wsta will be low; if a term has high KLD score and low RSV score, then wsta will be medium | 9
FLCmain | If a term has low wco, low wsta and low TFIDF, then wfinal is likely to be low; if a term has high wco, high wsta and high TFIDF, then wfinal will be high | 27

Total no. of fuzzy rules: 63
4.2 Overall Performance Analysis MAP values are determined to check the overall performance of proposed approach before clustering and after clustering. The results are tabulated in Table 5. Table 5 presents the comparison of MAP values. As discussed earlier that EFPSOKHM clustering approach is run for five times and very small variation in MAP can be seen in results. Therefore, average value of MAP is taken to compare different query expansion approaches as shown in following table. It is evident from this table that document clustering-based TSBAQE query expansion approach obtains higher values of MAP in comparison with out document clustering query expansion approaches for CACM and CISI. Precision-Recall curves are also drawn to test the robustness of TSBAQE query expansion approach after clustering for both datasets. Figures 10 and 11 depict that TSBAQE query expansion approach shows superiority at all levels of recall after document clustering as compared to Parapar et al. approach and TSBAQE without clustering for both the datasets.
Fig. 2. Comparison of F-measure at top ten cut-off for CACM
Fig. 3. Comparison of F-measure at top ten cut-off for CISI
From above results, it can be observed that TSBAQE query expansion approaches perform better after document clustering and improve the performance of IR system.
Fig. 4. Comparison of F-measure at top thirty cut-off for CACM
Fig. 5. Comparison of F-measure at top thirty cut-off for CISI

5 Conclusion Query specific analysis and overall performance analysis for the queries presented in the previous section demonstrate that the proposed AQE approach dominates the other approaches. The better information retrieval (IR) results are possible to achieve through the proposed AQE approach, first, due to avoiding noise while obtaining the ranked list of relevant documents
through new document clustering algorithm and secondly, due to using new fuzzy logicbased combination of co-occurrence coefficients, statistical distribution term selection measures and TFIDF measure for term selection. The reweighting of expanded query terms after semantic filtering is also an additional reason of IR improvement. Other suitable extension forms of this approach may be developed and tried in future to retrieve more relevant information from large and complex data repositories.
Fig. 6. Comparison of F-measure at top fifty cut-off for CACM
Fig. 7. Comparison of F-measure at top fifty cut-off for CISI
Fig. 8. Query-wise performance variation of document clustering-based TSBAQE approach with TSBAQE approach without clustering as baseline for CACM
Fig. 9. Query-wise performance variation of document clustering-based TSBAQE approach with TSBAQE approach without clustering as baseline for CISI
Table 3. Comparison of recall (R) and precision (P) values for CACM dataset

Q. No. | Original query (R, P) | Parapar et al. approach (R, P) | TSBAQE without clustering (R, P) | TSBAQE using clustering (R, P)
14 | 0.4545, 0.1762 | 0.7954, 0.2802 | 0.7954, 0.3051 | 0.8363, 0.3354
26 | 0.3667, 0.2136 | 0.7667, 0.3409 | 0.8000, 0.3613 | 0.8000, 0.3907
63 | 0.2500, 0.2780 | 0.5000, 0.4012 | 0.6250, 0.4289 | 0.6750, 0.4769
Table 4. Comparison of recall (R) and precision (P) values for CISI dataset

Q. No. | Original query (R, P) | Parapar et al. approach (R, P) | TSBAQE without clustering (R, P) | TSBAQE using clustering (R, P)
2 | 0.0384, 0.1000 | 0.2692, 0.1792 | 0.3076, 0.2112 | 0.3627, 0.2459
12 | 0.3076, 0.1201 | 0.6154, 0.1824 | 0.6154, 0.2043 | 0.6723, 0.2286
34 | 0.3421, 0.1583 | 0.5526, 0.2547 | 0.5789, 0.2982 | 0.6432, 0.3247
Table 5. Comparison of MAP of document clustering-based TSBAQE with original query, Parapar et al. approach and TSBAQE without clustering

Data set | Original query | Parapar et al. approach | TSBAQE without clustering | TSBAQE using clustering
CACM | 0.1873 | 0.2752 | 0.2818 | 0.2897
CISI | 0.1586 | 0.2429 | 0.2516 | 0.2607
Fig. 10. Precision–recall curves on CACM
Fig. 11. Precision–recall curves on CISI
References 1. B. Kim, J. Kim, J. Kim, Query term expansion and reweighting using term co-occurrence similarity and fuzzy inference, in 9th IFSA world congress and 20th NAFIPS international conference (Vancouver, Canada, 2001), pp. 715–720 2. Y. Chang, S. Chen, C. Liau, A new query expansion method based on fuzzy rules, in 7th joint conference on AI, Fuzzy system, and Grey system (Taipei, Republic of China, 2003) 3. J. Singh, A. Sharan, Context window based co-occurrence approach for improving feedback based query expansion in information retrieval. Int. J. Inf. Retrieval 5(4), 31–45 (2015) 4. J. Aguera, L. Araujo, Comparing and combining methods for automatic query expansion. Adv. Natural Lang. Process. Appl. Res. Comput. Sci. 33, 177–188 (2008) 5. C. Carpineto, G. Romano, A survey of automatic query expansion in information retrieval. ACM Comput. Survey 44(1), 1–50 (2012) 6. Y. Gupta, A. Saini, A novel Fuzzy-PSO term weighting automatic query expansion approach using semantic filtering. Knowl. Based System 136, 97–120 (2017) 7. J. Parapar, M. Presedo-Quindimil, A. Barreiro, Score distributions for pseudo relevance feedback. Inf. Sci. 273, 171–181 (2014) 8. J. Swets, Information retrieval systems. J. Sci. 141, 245–250 (1963) 9. S. Robertson, On term selection for query expansion. J. Document. 46(4), 359–364 (1990) 10. Y. Li, C. Luo, S. Chung, Text clustering with feature selection by using statistical data. IEEE Trans. Knowl. Data Eng. 20(5), 641–652 (2008) 11. Y. Li, S. Chung, J. Holt, Text document clustering based on frequent word meaning sequences. Data Knowl. Eng. 64(1), 381–404 (2008) 12. Y. Gupta, A. Saini, A new swarm-based efficient data clustering approach using KHM and fuzzy logic. Soft. Comput. 23(1), 145–162 (2019) 13. C. Lee, Fuzzy logic in control systems: Fuzzy logic controller, Parts I and II. IEEE Trans. Syst. Man Cybern. 20, 404–435 (1990)
Investigating Large-Scale Graphs for Community Detection Chetna Dabas(B) , Gaurav Kumar Nigam, and Himanshu Nagar Department of CSE & IT, Jaypee Institute of Information Technology, Noida, India {chetna.dabas,gaurav.nigam}@jiit.ac.in, [email protected]
Abstract. This research paper performs comparison and performance evaluation of large-scale graph techniques such as Walk Trap, Edge Betweenness, Random Walk Fast Greedy, Infomap and Leading Eigen Vector. The performance metrics like Modularity, Normalized Mutual Information, Adjusted Rand Index and Rand Index are used to investigate the performance. The datasets utilized are Yeast Protein Network, US Airport Network, Macaque Network, UK Faculty Network and Enron Email Network. The results are presented in the form of figures and tables. Keywords: Investigation · Large-scale graph · Community detection
1 Introduction
With the wide adoption and usability of the World Wide Web across the globe, web graphs have become important in research and analysis. Apart from social networks, the applications of large graphs include newspaper articles, healthcare manuals and transportation routes, to name a few. In addition, there are many other computing problems of practical significance in the large-scale graph scenario, such as connected components and minimum cut, where the pre-processing and processing of large graphs is highly challenging. Specifically, a number of classical graph algorithms do not work well with large-scale graphs, which exhibit varying degrees of parallelism and poor memory-access locality, to name two issues. Further, there are challenges such as locality concerns arising from partitioning across various computing systems, which raise the chances of machine failures during the course of execution. Although large-scale graphs come with such difficulties, their commercial significance cannot be ignored. Considering the present computing scenario worldwide, large-scale graphs enjoy wide popularity, visibility and adoption due to their practical importance. These large-scale graphs are composed of nodes that are trillions in number and edges that are millions in number. A highly popular approach in this arena is to visualize this kind of huge data by modeling it as a large-scale graph and to utilize the model for prediction. In spite of the huge popularity
of large-scale graphs, there exists an entire series of problems which directly or indirectly affect the efficiency of the systems and require attention. Some of these problems, associated with the rising volume of such graphs, relate to their processing and analysis in the context of designing scalable computing systems. Other challenges include the lack of sufficient available information and the NP-hard character of some of the clustering problems. Recently, a huge amount of data has been generated in pervasive healthcare and many other sectors. This has led to issues related to partitioning vertices into groups in large graphs [1, 2]. To address these issues, many community detection techniques are being explored by researchers at present, but the existing algorithms are designed with disjoint communities in mind [3–7]. Two crucial metrics, namely the Rand Index (RI) and the Adjusted Rand Index (ARI), are defined in [8]: the degree of overlap between two partitions is measured with the Rand Index, while the Adjusted Rand Index measures the similarity between two clusterings and represents the amount of coincidence that exists between two partitions. In the context of classification, an ARI value of 0 indicates failure of the corresponding method, while a value of 1 (or close to 1) signifies that the method is able to distinguish among the various data classes. Both RI and ARI are used for cluster evaluation [8]. Another metric is Normalized Mutual Information (NMI): a value of 0 signifies two distinct sets of clusters, whereas a value of 1 signifies that the sets of clusters are essentially identical [9, 10].
2 Related Work
For large networks such as those in healthcare, the authors of [11] suggested a method for finding communities. Within a network, a hierarchical method was proposed for detecting groups, and an Amazon dataset was utilized in the algorithm implementation. The authors claim that their strategy achieves lower computational complexity in comparison with other state-of-the-art techniques. The authors of [8, 12] have highlighted important research issues, namely the validation and detection of communities in large networks. The maximum Modularity was evaluated and the performance of their algorithm was compared with respect to the available ground-truth data. Further, the computational complexity was evaluated for graph detection, and Markov clustering work was carried out. Significant contributions to the limited random walk algorithm have been made in [6, 12–14]. Another recent work [3] presented the random walk (fast greedy) mechanism. Every node was initialized as a starting node in order to perform the random walk. The number of walks was to be input by the user of the algorithm, and community detection was calculated by aggregating the walks while computing the similarity matrix. The similarity
between two nodes in a walk is reflected by the corresponding entry of this matrix. A hierarchical method is then used to merge nodes iteratively; this merging depends upon the similarity of the nodes, and as the iterations progress the starting steps become lower in terms of similarity. The authors of [15] proposed a limited random walk algorithm. They used this algorithm to calculate attractor vertices in the graph, which are utilized as features for clustering the vertices. Experimental results of the proposed algorithm were obtained both on real-world and simulated big graph data. According to their results, the authors claim that their algorithm outperforms the other existing methods, where the chosen fitness function is used to evaluate the clusters. In [16], a technique for extracting the community structure of large networks is presented. The authors propose a heuristic method based on Modularity optimization. The work further presents the identification of language communities in a Belgian mobile phone network of more than a million customers. Their algorithm also works well for ad hoc networks which are modular in nature, and the quality of detection, assessed with Modularity, is found to be impressive. Procedures and techniques for the handling of samples were developed and investigated in [17] to offer reasonable error rates. The proposed model worked with a DNA biobank produced from an electronic medical record (EMR) system combined with phenotypic data. The benefits of this work were the diversity of the phenotypes and the rate of sample acquisition. In a recent work [18], Fog computing is used in an Internet of Things-based big data framework, where sensors are utilized to gather health data. This work helps healthcare professionals transfer data for seamless interactions on the cloud, and it also reveals that patients' medical data need an environment of privacy and security, which further promotes the usage and adoption of the Internet of Things in healthcare. The authors of [19] presented an empirically controlled large-scale mapping framework to create a high-quality, high-throughput Y2H dataset. This dataset covers approximately 20% of all the yeast binary interactions. In comparison with the existing allied complex-interaction models, this binary map is enriched for signaling interactions which are transient in nature; it also contains inter-complex connections and crucial clustering among the essential proteins. Various community detection techniques were tested by the authors of [20], who incorporated the Lancichinetti–Fortunato–Radicchi benchmark graph. The authors evaluated the accuracy of the various algorithms along with their computing time and were able to suggest the most appropriate community detection algorithm for a given network. In a similar context, this research paper aims to compare and evaluate different community detection algorithms, namely Walk Trap, Edge Betweenness, Random Walk Fast Greedy, Infomap, and Leading Eigen Vector. These evaluations have been done on
diverse datasets such as the Yeast Protein Community Network, UK Faculty Community Network, US Airport Community Network, Macaque Community Network and Enron Email Network. Performance metrics, namely Normalized Mutual Information, Modularity and Rand Index, have been investigated in detail. The research paper is organized as follows: related work is presented in the next section, and testing parameters are discussed after that. The datasets utilized in this work are presented in Sect. 4. Section 5 describes the techniques under study, followed by Sect. 6, which presents the results and analysis. The paper is concluded by Sect. 7, the conclusion and future work section.
3 Testing Parameters
There are significant metrics which contribute to the performance evaluation of large-scale graphs. These are Rand Index, Normalized Mutual Information, Adjusted Rand Index and Modularity, and they are discussed here.
3.1 RI (Rand Index)
William M. Rand proposed the Rand Index in 1971 [9]. It signifies the similarity between two clusterings. Its value varies from 0 through 1.
3.2 ARI (Adjusted Rand Index)
Hubert and Arabie proposed the Adjusted Rand Index in 1985 [9]. It is a version of the Rand Index which is adjusted for the chance grouping of elements. In other words, ARI is based on the similarity of pair-wise comparisons between clusterings relative to those generated by a random model.
3.3 NMI (Normalized Mutual Information)
Normalized Mutual Information reflects the agreement between the ground truth and the obtained clustering, with each partition treated as a distribution. The value of Normalized Mutual Information lies in the range of 0 to 1. NMI = 1 indicates identical clusters, whereas an NMI value of 0 indicates distinct clusters [9].
3.4 Modularity
Modularity measures the density of interactions within groups in comparison with the connections expected under random wiring [9].
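As a concrete illustration of how these metrics are computed in practice, the short sketch below scores a detected partition against a ground-truth partition. Python with scikit-learn is assumed purely for illustration (the experiments reported later in this paper were implemented in R), and rand_score requires a recent scikit-learn release.

```python
# Illustrative evaluation of a detected community assignment against a
# reference partition using the metrics defined above (RI, ARI, NMI).
from sklearn.metrics import (adjusted_rand_score,
                             normalized_mutual_info_score,
                             rand_score)

ground_truth = [0, 0, 0, 1, 1, 1]   # reference communities of six vertices
detected     = [0, 0, 1, 1, 1, 1]   # communities returned by some algorithm

print("RI :", rand_score(ground_truth, detected))                    # 1 means identical partitions
print("ARI:", adjusted_rand_score(ground_truth, detected))           # chance-corrected, ~0 for random labels
print("NMI:", normalized_mutual_info_score(ground_truth, detected))  # 0 = independent, 1 = identical
```

Modularity, by contrast, is computed from the graph itself (edges inside versus between groups) rather than from two label vectors, which is why it appears separately in the results section.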
4 Datasets Utilized Table 1 presents the dataset specifications utilized in this work.
Table 1. Datasets utilized

Dataset (large-scale graph)        | Node count | Edge count | Nature of graph (directed/undirected)
Yeast protein interaction (Yeast)  | 2617       | 11,855     | UD
US airport                         | 755        | 23,473     | D
UK faculty                         | 81         | 817        | D
Macaque                            | 45         | 463        | D
Enron email network                | 184        | 125,409    | D
5 Techniques Used The community detection techniques used for this work include the Random Walk (Fast Greedy version), composed of hierarchical merging and an overlapping module as described in [3], where it is compared with the other existing mechanisms.
6 Results and Analysis
In the present work, different datasets, directed or undirected and with a large number of nodes, are utilized, and their performance is evaluated for different algorithms, namely Random Walk Fast Greedy, Leading Eigen Vector, Walk Trap, Infomap and Edge Betweenness. The performance comparison of these algorithms on datasets such as the Yeast Protein Network, Macaque Network, UK Faculty Network and Enron Email Network has been carried out in terms of maximum Modularity, NMI, RI and ARI metrics. The result graphs are depicted in the subsections below. In the proposed work, edges were extracted keeping in mind the maximum centrality metric, and this process is repeated until no more edges are left. The details of the datasets utilized are presented in Table 1 in Sect. 4; these are benchmark community detection datasets. The language R (version 3.3.2) has been used for the implementation of the entire proposed work, and the igraph and cab libraries were used extensively. The computing machine on which the proposed work has been done has an Intel Core i3 processor with 4 GB of RAM. Dataset-wise performance evaluation of the different community detection algorithms is presented next. The result plot of the number of nodes versus Modularity is presented in Fig. 1. This comparison has been established for the algorithms Edge Betweenness, WalkTrap, Infomap, Fast Greedy (Random Walk) and Leading Eigen Vector. The NMI comparison plot of the results, in the form of the number of nodes versus Normalized Mutual Information, is presented in Fig. 2. This comparison has been established for the algorithms under consideration.
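The paper's implementation is in R with igraph; the sketch below shows, under the assumption that the python-igraph package is available, how the same five algorithms can be run and compared. It is an illustrative stand-in, not the authors' code, and the bundled Zachary karate-club graph merely substitutes for the benchmark datasets of Table 1.

```python
# Hedged sketch: run the five community detection algorithms discussed above
# and report modularity plus agreement metrics, using python-igraph.
import igraph as ig

g = ig.Graph.Famous("Zachary")   # stand-in for the Table 1 datasets

methods = {
    "Edge Betweenness":     lambda: g.community_edge_betweenness().as_clustering(),
    "Walk Trap":            lambda: g.community_walktrap().as_clustering(),
    "Fast Greedy":          lambda: g.community_fastgreedy().as_clustering(),
    "Infomap":              lambda: g.community_infomap(),
    "Leading Eigen Vector": lambda: g.community_leading_eigenvector(),
}

reference = g.community_infomap()   # any fixed partition to compare the others against
for name, run in methods.items():
    clustering = run()
    nmi = ig.compare_communities(clustering, reference, method="nmi")
    ari = ig.compare_communities(clustering, reference, method="adjusted_rand")
    print(f"{name:22s} modularity={clustering.modularity:.3f} NMI={nmi:.3f} ARI={ari:.3f}")
```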
Fig. 1. Modularity comparisons
Fig. 2. NMI comparisons
The Rand Index comparison graph plot of the results, in the form of the number of nodes versus Rand Index, is presented in Fig. 3. This comparison has been carried out for the various algorithms under consideration.
Fig. 3. Rand index comparisons
7 Conclusion and Future Work
In the earlier part of this research work [3], a Random Fast Greedy algorithm was presented by the authors. In this paper, that technique is evaluated and compared with the existing algorithms for large-scale graphs used for community detection. The authors have carried out the evaluation and comparison of the algorithms Walk Trap, Random Walk (Fast Greedy), Leading Eigen Vector (LEV) and Infomap for community detection on crucial parameters such as Normalized Mutual Information, Rand Index and Modularity. This evaluation has been carried out on large-scale graph data, considering the datasets Yeast Protein Network, US Airport Network, UK Faculty Network, Macaque Network and Enron Email Network, which consist of a large number of edges as well as nodes. The authors have utilized the agglomerative hierarchical technique in their work. The results are interesting and are presented in the form of graphs and tables. It was noted during the experimentation that the Random Fast Greedy algorithm offers enhanced performance in comparison with the other state-of-the-art techniques as the data size rises to higher numbers. Future work will consist of testing these techniques and datasets on multi-core machines, exploring vertical scalability.
References 1. P. De Meo, E. Ferrara, G. Fiumara, A. Provetti, Mixing local and global information for community detection in large networks. J. Comput. Syst. Sci. 80(1), 72–87 (2014) 2. J.D. Wilson, J. Palowitch, S. Bhamidi, A.B. Nobel, Community extraction in multilayer networks with heterogeneous community structure. J. Mach. Learn. Res. 18(149), 1–49 (2017)
3. C. Dabas, H. Nagar, G.K. Nigam, Large scale graph evaluation for find communities in big data. Procedia Comput. Sci. 132, 263–270 (2018) 4. D.B. Larremore, A. Clauset, A.Z. Jacobs, Efficiently inferring community structure in bipartite networks. Phys. Rev. E 90(1), 012805 (2014) 5. Y. Ruan, D. Fuhry, S. Parthasarathy, Efficient community detection in large networks using content and links, in Proceedings of the 22nd international conference on World Wide Web (2013, May), pp. 1089–1098. ACM 6. H. Zhang, J. Raitoharju, S. Kiranyaz, M. Gabbouj, Limited random walk algorithm for big graph data clustering. J. Big Data 3(1), 26 (2016) 7. X. Zhang, G. Cao, X. Zhang, G. Cao, Transient community detection and its application to data forwarding in delay tolerant networks. IEEE/ACM Trans. Netw. (TON) 25(5), 2829–2843 (2017) 8. A.M. Krieger, P.E. Green, A generalized Rand-index method for consensus clustering of separate partitions of the same data base. J. Classif. 16(1), 63–89 (1999) 9. A.F. McDaid, D. Greene, N. Hurley, in Normalized mutual information to evaluate overlapping community finding algorithms. arXiv preprint arXiv:1110.2515 (2011) 10. J. Yang, J. McAuley, J. Leskovec, Community detection in networks with node attributes, in (ICDM), 2013 IEEE 13th international conference on data mining series, (2013, December), pp. 1151–1156. IEEE 11. J. Leskovec, L.A. Adamic, B.A. Huberman, The dynamics of viral marketing. ACM Trans. Web (TWEB) 1(1), 5 (2007) 12. K. Steinhaeuser, N.V. Chawla, Identifying and evaluating community structure in complex networks. Patt. Recogn. Lett. 31(5), 413–421 (2010) 13. Y.Y. Ahn, J.P. Bagrow, S. Lehmann, Link communities reveal multiscale complexity in networks. Nature 466(7307), 761 (2010) 14. O. Batarfi, R. El Shawi, A.G. Fayoumi, R. Nouri, A. Barnawi, S. Sakr, Large scale graph processing systems: survey and an experimental evaluation. Cluster Comput. 18(3), 1189– 1213 (2015) 15. X. Wang, P. Cui, J. Wang, J. Pei, W. Zhu, S. Yang, Community preserving network embedding, in AAAI (2017, February), pp. 203–209 16. F.M.D. Marquitti, P.R. Guimarães, M.M. Pires, L.F. Bittencourt, MODULAR: software for the autonomous computation of modularity in large network sets. Ecography 37(3), 221–224 (2014) 17. D.M. Roden, J.M. Pulley, M.A. Basford, G.R. Bernard, E.W. Clayton, J.R. Balser, D.R. Masys, Development of a largescale identified DNA biobank to enable personalized medicine. Clin. Pharmacol. Ther. 84(3), 362–369 (2008) 18. C. Thota, R. Sundarasekar, G. Manogaran, R. Varatharajan, M.K. Priyan, Centralized fog computing security platform for IoT and cloud in healthcare system, in Exploring the convergence of big data and the internet of things (2018), pp. 141–154. IGI Global 19. H. Yu, P. Braun, M.A. Yildirim, I. Lemmens, K. Venkatesan, J. Sahalie, T. Hao, High-quality binary protein interaction map of the yeast interactome network. Science 322, 104 (2008) 20. Z. Yang, R. Algesheimer, C.J. Tessone, A comparative analysis of community detection algorithms on artificial networks. Sci. Rep. 6, 30750 (2016)
Big Data Platform Selection at a Hospital: A Rembrandt System Application Sudhanshu Singh1, Rakesh Verma1, and Saroj Koul2(B)
1 NITIE, Mumbai, Maharashtra, India [email protected], [email protected] 2 OP Jindal Global University, NCR Delhi, Delhi, India [email protected]
Abstract. Hospitals form a vital component of investment in the healthcare of any nation. A hospital can operate on a standalone basis or as a component in a network of providers of healthcare. Big data in hospitals is created through interactions and transactions among and between health service providers and patients, whether on a first or a recurring visit, and gets captured in real time or afterwards through channels such as notes, machine-to-machine interaction and social media. There is limited literature available on the application of multi-criteria decision aiding tools for healthcare IT infrastructure, especially in the context of a country like India. This paper exhibits an implementation of the REMBRANDT system for the selection of a big data platform in a healthcare context in India. Keywords: Big data · REMBRANDT method · Multi-criteria decision-making · Hospital IT infrastructure
1 Introduction
People need to be in good health to contribute productively. There are high returns from investing in health [1], owing to a decrease in mortality rates, and such investment can add up to 11% to the growth of low- and middle-income economies. Patients, by and large, receive care from multiple service providers on account of mobility requirements, and all services are not available at a single location [2]. Different types of decisions, in routine situations and emergency conditions, allow inefficiencies to creep into hospitals if demands are not adequately met [3]. On a close look, there is little transparency [4] in such decisions, and they therefore need a systematic approach that can facilitate an assessment of the benefits that arise from them. In India, the literature on the adoption of big data platforms is scarce. As each hospital is unique in the way it manages its systems and processes for the delivery of healthcare services, and hence is highly context-dependent, the choice of a big data platform requires a systematic approach for its selection. While the requirements of data [5] remain the same across hospitals, they differ in its upkeep. The Rembrandt system is a unique multi-criteria decision-making method that does not cause rank reversals on account of the way it is structured [6], and the use
of internally and externally generated data [7] can lead to the creation of new models for doing business in organizations; therefore, is taken up for implementation in this analysis.
2 Theoretical Background
Data is generated from decisions and leads to conclusions. Aggregated data of all types and forms used in diagnosis, treatment or the well-being of the patient population contributes to 'big data' [8]. The data comes from personal notes as well as electronic forms, data sources on the web, social media, and machine-to-machine interaction. It can be a hybrid of internally acquired data (e.g. electronic health records) and externally picked up data (e.g. insurance transactions). Business analytics is both an art and a science of solving problems, making decisions and finding insights [9]. Organizations head for higher responsiveness and more interconnectivity [6] as more data gets generated, over which different types of analytics are possible. Analytics has undergone development over the decades [10], as summarized in Table 1.

Table 1. The four phases of analytics

Phase of analytics | Attributes of the phase
Analytics 1.0      | Static, structured and small data; analysis of back office; slow and painstaking; decisions are internal; hypothesis by humans
Analytics 2.0      | Fast moving, unstructured and big data; data scientists gaining more acceptance; online companies having data products; open source; spread of Hadoop; visual analytics
Analytics 3.0      | All types of data in combinations; decisions and products can be internal or external; analytics as a core capability; movement in both scale and speed; contains predictive as well as prescriptive analytics
Analytics 4.0      | Automated and embedding of analytics; cognitive technologies; performing digital tasks through robotic process automation; not just automation but augmentation too
Caregivers' time is estimated by data-driven methods [11]. Hospitals require investment in IT infrastructure to manage and realize the benefits of big data. They also seek to make the right decisions using a structured methodology for the selection of a big data platform to manage their growing data intake. There is asymmetric sharing of information between patients and providers, with one party knowing more than the other [12]. We frequently come across situations wherein similar data is utilized to generate different levels of insight. This digital divide has an impact on managerial work and policies [13] and therefore requires procedures to bridge the gap between the haves and the have-nots. Lack of standardization and incompatible data generated in legacy IT systems are challenges faced by healthcare organizations [8]. There have been many applications of operations research and multi-criteria decision aiding methods in healthcare. The queuing models [14], the buy-in and ownership from
stakeholders [15], the 'preference elicitation methods' which improve medical decision-making by measuring benefit and value [16], and the CRITIC and WASPAS methods that manage time/attendance in hospitals [17] are several existing usages. The "Ratio Estimation in Magnitudes or Decibels to Rate Alternatives which are Non-Dominated" (REMBRANDT) system, developed in 1990 [18], takes care of the rank-reversal, aggregation and scale-related flaws of the "Analytic Hierarchy Process" (AHP). A few pertinent studies using this system include a comparison of AHP with the Rembrandt system [19], the EFQM Rembrandt excellence model [20], and decision support during negotiations for proposal quality [21]. However, there has not been much work in the context of Indian healthcare systems. We have taken the alternatives as per the literature by Raghupathi [8] in the implementation of the Rembrandt method. The practices of sustainable performance in businesses can be influenced by big data analytics [22]. A healthcare facility that is sustainable should ideally adapt itself to the local conditions [23]. The National Health Service (NHS), UK, set up the Sustainable Development Unit in the year 2008 and later expanded it to incorporate the healthcare system in the year 2013 [24]. The triple bottom line in sustainability consists of the economic dimension, social dimension and environmental dimension [23].
3 Methodology
3.1 Setting
In this paper, the Rembrandt system [18–21] is applied at a leading tertiary hospital in India to explore the options available to select an appropriate big data platform. The participants included senior management and representatives from the IT department.
3.2 Model Formulation
The model is formulated by a group of decision-makers (DMs) who are asked to evaluate the criteria and rank the alternatives based upon this evaluation. All the DMs are expected to perform all the comparisons as needed. The method adopts a numerical (geometric) scale for comparison, uses the geometric mean for calculating the impact scores, and uses one hierarchical level for both the criteria and the alternatives [19, 20]. The key parameters and equations in the formulation of the method [18–21] are given in Annexure 1; they form the basis of utilizing geometric means for the normalization of criteria and alternatives when ranking options in the Rembrandt system. A small numerical illustration of the geometric scale conversion is given below.
3.3 Criteria and Alternatives
In this paper, we have considered availability (Avail), continuity (Cont), usability (Use), scalability (Scal), granular manipulation (Gran), privacy and security enablement (PriSec), and quality assurance (QuAs) as the criteria for selecting a big data platform from among the twelve alternatives, i.e., "Hadoop, MapReduce, PIG, HIVE, JAQL, Zookeeper, Hbase, Cassandra, Oozie, Lucene, Avro and Mahout" [8]. The criteria and alternatives were compiled from the literature and through discussions of their applicability in the given hospital's context. These criteria were shortlisted based upon detailed discussions regarding their relevance in the hospital.
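The paper does not give code for the geometric scale; the following minimal Python sketch (an illustration added here, not part of the original study) shows how a gradation index from Table 2 maps to a preference ratio under the two scale parameter values quoted in Annexure 1.

```python
import math

gamma_criteria     = math.log(math.sqrt(2))   # scale parameter used for the criteria
gamma_alternatives = math.log(2)              # scale parameter used for the alternatives

# A "definite preference" (gradation index +4) between criteria becomes a ratio of 4,
# matching the entries that appear later in Table 6:
print(math.exp(gamma_criteria * 4))       # 4.0
# A "weak preference" (+2) between alternatives also maps to a ratio of 4 (cf. Table 8):
print(math.exp(gamma_alternatives * 2))   # 4.0
```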
3.4 Data Collection and Analysis
Parameters and equations used during calculations are at Annexure 1. The values and weights of the gradation index for each criterion in percentages are at Tables 2, 3 and 4.

Table 2. Scale used in the Rembrandt system [19, 21]

Comparative judgement (C_j versus C_k) | Gradation index
Very strong preference for C_k         | −8
Strong preference for C_k              | −6
Definite preference for C_k            | −4
Weak preference for C_k                | −2
Indifference between C_k and C_j       | 0
Weak preference for C_j                | 2
Definite preference for C_j            | 4
Strong preference for C_j              | 6
Very strong preference for C_j         | 8
Table 3. Comparative judgement of criteria

        | Avail | Cont | Use | Scal | Gran | PriSec | QuAs
Avail   |  0    |  0   |  4  |  6   |  4   |  4     |  4
Cont    |  0    |  0   |  0  |  2   |  4   |  4     |  4
Use     | −4    |  0   |  0  |  2   |  4   |  4     |  4
Scal    | −6    | −2   | −2  |  0   |  4   |  4     |  4
Gran    | −4    | −4   | −4  | −4   |  0   |  4     |  4
PriSec  | −4    | −4   | −4  | −4   | −4   |  0     |  4
QuAs    | −4    | −4   | −4  | −4   | −4   | −4     |  0
The process is repeated for the alternatives for each of the above criteria. Here, we have taken each of the requirements and compared the alternative platforms such as Hadoop, PIG, and HIVE to each other. Similarly, calculations are carried out for other criteria, i.e. continuity, usability, scalability, granular manipulation, privacy and security enablement and quality assurance. Table 5 shows the rankings based on the aggregation of judgments for each alternative and considering all the criteria.
Table 4. Weights calculated for the criteria

S.No. | Criteria                                 | Weight (%)
1     | Availability (Avail)                     | 32.50
2     | Continuity (Con)                         | 21.90
3     | Usability (Use)                          | 17.90
4     | Scalability (Scal)                       | 12.15
5     | Granular manipulation (Gran)             | 7.40
6     | Privacy and security enablement (PriSec) | 5.00
7     | Quality assurance (QuAs)                 | 3.30
Table 5. Ranking of the big data platforms

Big data platform | Percentage (%) | Rank
Hadoop            | 33.1           | 2
MapR              | 35.4           | 1
PIG               | 31.6           | 3
HIVE              | 20.7           | 4
JAQL              | 16.4           | 5
Zookeeper         | 13.0           | 6
Hbase             | 10.4           | 7
Cassandra         | 8.2            | 8
Oozie             | 6.5            | 9
Lucene            | 5.2            | 10
Avro              | 4.1            | 11
Mahout            | 3.3            | 12
4 Results and Analysis From the calculations obtained at this hospital, availability, usability, continuity and privacy, and security enablement were the main criteria and accounted for 76.6% weight. Granular manipulation and quality assurance, each carry a 9% weight. The application of the Rembrandt system for deciding on platform selection has led to the choice of MapReduce with a weight of 35.4% among all the platforms under consideration. Thus, MapReduce is found to be the most appropriate for the hospital under consideration at 35.4%, followed closely by Hadoop at 33.1%. The hospital decided to explore both the MapReduce and Hadoop big data platform IT systems for final selection, and an option to implement them together; MapReduce being run under Hadoop.
This is expected to bring together the best of both the Hadoop and MapReduce platforms, which are also currently being used by many organizations in various sectors.
5 Limitations and Future Directions
This study was limited to one tertiary hospital in India. During the investigation, a varied level of understanding among the participants was found, and hence participants had to be made aware of the existing and competing technologies during the discussions. The level of education and awareness in IT may vary widely from hospital to hospital. Significant variation in the levels of IT infrastructure may also be visible. Furthermore, we are now routinely witnessing the availability of new alternatives in big data platforms, both paid and open source. This research could also be enriched by taking these new developments and technologies into consideration. Healthcare organizations may explore customization of the criteria in both qualitative and quantitative terms based upon their own contextual or local requirements. Hospitals are seen to be continually making new investments in upcoming information and communication technologies and can expect to gain more from these investments if they are made using a structured approach. The decisions are expected to be more readily acceptable when they consider many criteria and alternatives.
Annexure 1 Nomenclature g—group of decision makers (g ≥ 1); m—criteria under the evaluation (m ≥ 1); Vi —subjective value of criteria C i (i = 1,… m), summation of V i equals 1; d—pairwise comparison of judgements on a category scale; δjld —a gradation index that is integer-valued and arrived by use of scale as shown in Table 3; √ ƴ—scale\parameter and taken as log 2 for criteria and log 2 for the alternatives [19]; rjld —a numeric estimate of the preference ratio V j /V l as made by DM. Assumption: The Criteria (C i , i = 1…m) have an unknown subjective values (V i ) common to all decision makers in the group. Steps: (1) Each DM is asked to consider pairs of criteria (as in Table 2) as C i and C j and to record their indifference between any two criteria. The compared values could lie between indifference level to very strong on these criteria. (2) The pairwise comparison judgments are picked up from DMs ‘category scale’ (Table 2) for verbal responses and converted into an integer-valued gradation index δ jld.
136
S. Singh et al.
comparisons of pairs for the m criteria (3) The evaluation between (m − 1)and m m−1 2 (Table 3). (4) The subjective weights of the criteria undergo normalization (Table 6) based on a The calculations involved in the normalization taking scale ratio, such that V i = 1. √ parameter value as log 2; use Eqs. 1 and 2. Table 6. Normalization of criteria Avail Cont
Use
Scal
Gran
PriSec QuAs Before After normalization normalization
Avail
1.000 1.000 4.000 8.000 4.000 4.000
4.000 2.972
0.325
Cont
1.000 1.000 1.000 2.000 4.000 4.000
4.000 2.000
0.219
Use
0.250 1.000 1.000 2.000 4.000 4.000
4.000 1.641
0.179
Scal
0.125 0.500 0.500 1.000 4.000 4.000
4.000 1.104
0.121
Gran
0.250 0.250 0.250 0.250 1.000 4.000
4.000 0.673
0.074
PriSec 0.250 0.250 0.250 0.250 0.250 1.000
4.000 0.453
0.050
QuAs
1.000 0.305
0.033
9.147
1.000
0.250 0.250 0.250 0.250 0.250 0.250
Sum
i. The gradation index δ_jld is converted into a value on a geometric scale, characterized by a scale parameter γ:

    r_{jld} = e^{\gamma \delta_{jld}}, \quad j, l = 1, \ldots, m, \; d = 1, \ldots, g \qquad (1)

ii. The weights of the criteria v_j are calculated as geometric means using Eq. 2 and shown in Table 4:

    v_j = \left( \prod_{l=1}^{m} \prod_{d=1}^{g} r_{jld} \right)^{1/(gm)}, \quad j = 1, \ldots, m \qquad (2)
(5) Once the criteria weights are established (Table 4), all the alternatives are assessed against each criterion. Table 7 shows a sample comparison of alternatives taking 'availability' as the criterion, and Table 8 shows the converted values after taking the scale parameter value as log 2. The ranking is arrived at by taking all criteria and alternatives and subjecting them to the geometric mean and normalization process.
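The paper reports only the resulting numbers; as an illustration of Eqs. (1)–(2), the following Python sketch (an assumption of this edit, not part of the original study) recomputes the criteria weights of Tables 4 and 6 from the Table 3 gradation indices with a single decision maker (g = 1).

```python
# Illustrative computation of REMBRANDT criteria weights (g = 1 decision maker),
# reproducing the "before/after normalization" columns of Table 6.
import numpy as np

criteria = ["Avail", "Cont", "Use", "Scal", "Gran", "PriSec", "QuAs"]
delta = np.array([            # gradation indices from Table 3
    [ 0,  0,  4,  6,  4,  4,  4],
    [ 0,  0,  0,  2,  4,  4,  4],
    [-4,  0,  0,  2,  4,  4,  4],
    [-6, -2, -2,  0,  4,  4,  4],
    [-4, -4, -4, -4,  0,  4,  4],
    [-4, -4, -4, -4, -4,  0,  4],
    [-4, -4, -4, -4, -4, -4,  0],
])

gamma = np.log(np.sqrt(2))                      # scale parameter for criteria
r = np.exp(gamma * delta)                       # Eq. (1): preference ratios (Table 6, left block)

geo_mean = r.prod(axis=1) ** (1.0 / r.shape[1]) # Eq. (2): "before normalization"
weights = geo_mean / geo_mean.sum()             # "after normalization"

for name, g_m, w in zip(criteria, geo_mean, weights):
    print(f"{name:7s} {g_m:6.3f} {w:6.3f}")
# Expected output (cf. Table 6): Avail 2.972 0.325, Cont 2.000 0.219, ...
```

The same conversion with γ = log 2 is applied to the alternative judgements of Table 7, after which the per-criterion scores are aggregated with the criteria weights to yield the ranking of Table 5.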
−2
−2
−2
−2
−2
−2
−2
−2
−2
−2
−2
−2
−2
−2
−2
Hbase
Cassandra
Oozie
Lucene
Avro
Mahout
−2
HIVE
−2
−2
−2
PIG
Zookeeper
2
−2
−2
MapR
JAQL
2
MapR
0
Hadoop
Hadoop
Availability
−2
−2
−2
−2
−2
−2
−2
−2
−2
0
2
2
PIG
−2
−2
−2
−2
−2
−2
−2
−2
0
2
2
2
HIVE
−2
−2
−2
−2
−2
−2
−2
0
2
2
2
2
JAQL
−2
−2
−2
−2
−2
−2
0
2
2
2
2
2
Zookeeper
−2
−2
−2
−2
−2
0
2
2
2
2
2
2
Hbase
−2
−2
−2
−2
0
2
2
2
2
2
2
2
Cassandra
Table 7. Comparative judgements of alternatives on availability
−2
−2
−2
0
2
2
2
2
2
2
2
2
Oozie
−2
−2
0
2
2
2
2
2
2
2
2
2
Lucene
−2
0
2
2
2
2
2
2
2
2
2
2
Avro
0
2
2
2
2
2
2
2
2
2
2
2
Mahout
0.25
0.25
0.25
Lucene
Avro
Mahout
0.25
Zookeeper
0.25
0.25
JAQL
Oozie
0.25
HIVE
0.25
0.25
PIG
0.25
0.25
MapR
Cassandra
1
Hadoop
Hbase
Hadoop
Availability
0.25
0.25
0.25
0.25
0.25
0.25
0.25
0.25
0.25
0.25
4
4
MapR
0.25
0.25
0.25
0.25
0.25
0.25
0.25
0.25
0.25
1
4
4
PIG
0.25
0.25
0.25
0.25
0.25
0.25
0.25
0.25
1
4
4
4
HIVE
0.25
0.25
0.25
0.25
0.25
0.25
0.25
1
4
4
4
4
JAQL
0.25
0.25
0.25
0.25
0.25
0.25
1
4
4
4
4
4
Zookeeper
0.25
0.25
0.25
0.25
0.25
1
4
4
4
4
4
4
Hbase
0.25
0.25
0.25
0.25
1
4
4
4
4
4
4
4
Cassandra
Table 8. Calculation on alternatives (e0.693δjl )
0.25
0.25
0.25
1
4
4
4
4
4
4
4
4
Oozie
0.25
0.25
1
4
4
4
4
4
4
4
4
4
Lucene
0.25
1
4
4
4
4
4
4
4
4
4
4
Avro
1
4
4
4
4
4
4
4
4
4
4
4
Mahout
References 1. D.T. Jamison, L.H. Summers, G. Alleyne, K.J. Arrow, S. Berkley, A. Binagwaho, Global health 2035: a world converging within a generation. Lancet 382(9908), 1898–1955 (2013) 2. B. Almoaber, D. Amyot, Barriers to successful health information exchange systems in Canada and the USA: a systematic review. Int. J. Healthcare Inf. Syst. Inform. 12(1), 44–63 (2017) 3. F. Guerriero, G. Miglionico, F. Olivito, Location and reorganization problems: the Calabrian healthcare system case. Eur. J. Oper. Res. 250, 3 (2016) 4. A. Mühlbacher, A. Kaczynski, Making good decisions in healthcare with multi-criteria decision analysis: the use, current research and future development of MCDA (2016) 5. S. Sachdeva, S. Batra, S. Bhalla, Evolving large scale healthcare applications using open standards, health policy and technology (2017) 6. H. Schildt, Big data and organizational design—the brave new world of algorithmic management and computer augmented transparency, innovation (2016) 7. A. Sorescu, Data-Driven Business Model Innovation. J. Prod. Innov. Manag. 34(5), 691–696 (2017) 8. W. Raghupathi, V. Raghupathi, Big data analytics in healthcare: promise and potential. Health Inf. Sci. Syst. 2(1), 1–10 (2014) 9. D. Delen, S. Ram, Research challenges and opportunities in business analytics. J. Bus. Anal. 1(1), 2–12 (2018) 10. T. Davenport, From analytics to artificial intelligence. J. Bus. Anal. 1(2), 1–8 (2018) 11. S. Yalçında˘g, A. Matta, E. Sahin, ¸ G. Shanthikumar, The patient assignment problem in home healthcare: using a data-driven method to estimate the travel times of caregivers. Flex. Serv. Manuf. J. 28(1–2), 304–335 (2016) 12. A. Berland, Using the Johari Window to explore patient and provider perspectives. Int. J. Health Govn. 22(1), 47–51 (2017) 13. S. Purkayastha, J. Braa, Big data analytics for developing countries—using the cloud for operational in health. Electr. J. Inf. Syst. Dev. Countr. 59(1), 1–17 (2013) 14. R. Agarwal, B. Singh, An analytical study of queues in the medical sector. OPSEARCH 55(2), 268–287 (2018) 15. A. Tako, K. Kotiadis, PartiSim: a multi-methodology framework to support facilitated simulation modelling in healthcare. Eur. J. Oper. Res. 244(2), 555–564 (2015) 16. M. Weernink, S. Janus, J.V. Til, D. Raisch, J.V. Manen, M. Ijzerman, A systematic review to identify the use of preference elicitation methods in healthcare decisionmaking. Pharmaceut. Med. 28(4), 175–185 (2014) 17. A. Tu¸s, E. Adalı, The new combination with CRITIC and WAS-PAS methods for the time and attendance software selection problem. OPSEARCH 56(2), 528–538 (2019) 18. F.A. Lootsma, T.C.A. Mensch, F.A. Vos, Multi-criteria analysis and budget reallocation in long-term research planning. Eur. J. Oper. Res. 47(3), 293–305 (1990) 19. D. Olson, G. Fliedner, K. Currie, Comparison of the REMBRANDT system with analytic hierarchy process. Eur. J. Oper. Res. 82(3), 522–539 (1995) 20. M. Tavana, A. Yazdi, M. Shiri, J. Rappaport, An EFQM Rembrandt excellence model based on the theory of displaced ideal. Benchmark. Int. J. 18(5), 644–667 (2011) 21. R.C. Van den Honert, F.A. Lootsma, Assessing the quality of negotiated proposals using the REMBRANDT system. Eur. J. Oper. Res. 120(2000), 162–173 (2000) 22. Y. Asi, C. Williams, The role of digital health in making progress toward sustainable development goal (SDG) 3 in conflict-affected populations. Int J Med Inform 114(890), 114–120 (2018)
23. E. Pantzartzis, F. Edum-Fotwe, A. Price, Sustainable healthcare facilities: Reconciling bed capacity and local needs. Int. J. Sustain. Built. Environ. 6(1), 54–68 (2017) 24. D. Pencheon, Making health care more sustainable: the case of the English NHS. Public Health 129(10), 1335–1343 (2015)
Analysis of PQ Disturbances in Renewable Grid Integration System Using Non-parametric Spectral Estimation Approach Rajender Kumar Beniwal1(B) and Manish Kumar Saini2 1 Electrical Engineering Department, Sobhasaria Group of Institutions, Sikar 332001, India
[email protected] 2 Electrical Engineering Department, D. C. R. University of Science and Technology, Sonepat
131039, India [email protected]
Abstract. This paper presents non-parametric spectral estimation of power quality disturbances occurring in photovoltaic and wind-integrated systems. The primary non-parametric technique for spectral estimation is periodogram which suffers from a main limitation of offside lobe leakage due to finite signal length. Therefore, this work has proposed power spectral density estimation of voltage signals using Welch method that is a modified version of periodogram. Welch spectrum shows peaks only at the frequencies present in the power quality signal. Thus, it offers correct frequency estimation of non-stationary voltage signals and consequently helps in detection of power quality disturbances. The distributed generation model consisting of solar and wind energy integrated with grid is developed in MATLAB and three-phase disturbance signals are taken from point of common coupling for being segmented further. Three types of power quality disturbances, i.e., harmonics, transient and harmonics with transient are simulated for validating the efficacy of Welch method. Keywords: Power quality · Harmonics · Transients · Welch method · Spectrum analysis
1 Introduction
With the continuous growth in industrial infrastructure, the modernization of cities and the rising living standard of human beings, the demand for electrical energy has drastically increased. To meet the soaring demand for electricity, more generating units are needed, as the fossil fuel-based conventional sources are depleting continuously and are also a cause of environmental pollution. So, the main thrust is on renewable energy resources like fuel cells, photovoltaic (PV) systems, wind energy, etc. Among these renewable resources, PV and wind energy systems are in vogue. However, these renewable energy sources are dependent on environmental conditions such as changes in temperature, wind speed, and solar radiation [1]. These environmental conditions affect the generating capacity of the PV
and the wind energy systems and consequently cause problem of grid stability and other power quality issues like sag, swell, harmonics, flicker, transient, notch, interruption, etc. Beside environmental variations, there are many other sources of power quality issues, e.g., power system faults, heavy load, arc furnace, non- linear load, power electronic devices, etc. [2]. Power quality (PQ) is a term which is related to the quality of the power supplied in terms of supply voltage and frequency. As per the IEEE standard 1100, PQ is defined as “the concept of powering and grounding sensitive equipments in a matter that is suitable to the operation of the equipment” [3]. The electric power which is delivered to the customers by the utilities has some fixed parameters like supply frequency, amplitude, and phase of the voltage or current signal [4]. Some international bodies like IEEE, IEC former CIGRE, etc., provide some standards which define the PQ in terms of these parameters and give their maximum and minimum permissible values. Since the instantaneous frequency of the signal is an indicator of power quality, its estimation becomes the foremost step of any algorithm designed for power quality analysis. The literature abounds with numerous techniques for power quality analysis [5]. Different approaches have been proposed by the researchers, for instance, fast Fourier transform [6], wavelet transform [7], multiwavelet transform [8], fractional delay wavelet [9, 10], time--time transform [11], S-transform [12], modified potential function [13], independent component analysis [14], modified EMD [15] etc. For detection and tracking of power quality disturbances, spectral estimation methods are very useful. These methods are prominently used for the analysis of harmonics, inter-harmonics, transients, etc. [16]. Spectral estimation methods are categorized as parametric and non-parametric. Parametric approach is based on signal model. Popular parametric methods are estimation of signal parameters via rotational invariance technique (ESPIRIT), multiple signal classification (MUSIC), and Kalman filters. In [17], spectrum analysis has been performed by MUSIC algorithm for harmonics caused by the two-level inverter of PV system in distributed generation. Transient analysis and their categorization have been done by ESPRIT method [18]. In [19], a comparison of different windows used in spectral analysis with Kalman filter technique has been made for estimating power quality indices. Despite various applications, parametric techniques lag somewhere owing to dependence on correctness of parameter approximation in signal modeling. Moreover, MUSIC and ESPIRIT method need signal pre-filtering to remove the fundamental frequency and prespecified information about signal frequency is required in Kalman filters. Therefore, non-parametric methods like fast Fourier transform (FFT), short time Fourier transform (STFT), discrete wavelet transform (DWT), periodogram, Welch method, etc., are preferred for PSD estimation of power signals. Non-parametric methods perform the estimation of power spectral density (PSD) of signals from the signals itself rather than their models. The basic non-parametric method is FFT. In [6], discrete FFT is used for spectral analysis of harmonics signals which are generated by the induction generation in transient state of wind turbine. 
Fourier analysis-based spectral estimation of harmonics which are produced due to switching converter and resonant link inverter are presented in [20, 21], respectively. In FFT, there are some limitations also like frequency resolution, spectral leakage, and assumption of signal stationarity [22]. Another non-parametric technique is periodogram which gives
a scaled and magnitude-squared discrete FFT plot of the input signal. But, there is one significant disadvantage of periodogram, i.e., the impact of offside lobe leakage due to finite signal length. To address this problem, Welch modified the method of periodogram and proposed another non-parametric spectral estimation technique known as Welch method. Periodogram- and Welch-based spectral estimations have been used to analyze the power spectrum of a signal in [23]. These methods calculate autocorrelation of the measured signals and then compute Fourier transform for estimation of power spectrum [24]. In [25], spectral analysis of transient signal has been presented and using Welch and Yule-Walker AR method with their comparison. Authors have computed cross-power spectrum density and auto-power spectrum density using Welch method on uniform grids of frequencies [26]. Adamo et al. have proposed a spectral estimation method for analyzing the frequency- and amplitude-modulated power line signal [27]. In [28], STFT, Welch, and windowing techniques have been presented for computing power quality indices. In this paper, power spectral estimation based on Welch method has been proposed for the PV and wind-integrated system. Particularly, two power quality issues of PV and wind-integrated system, i.e., transient and harmonics, are considered in this work. The case of simultaneous occurrence of transient and harmonics is also analyzed here. For spectral estimation, voltage signal is taken from the point of common coupling (PCC) in the PV and wind-integrated model simulated in MATLAB. Section 2 discusses the theoretical concept of Welch method and spectral analysis of voltage signal using Welch method is presented in Sect. 3. Section 4 concludes this paper.
2 Welch Method Welch method is a kind of non-parametric method for power spectrum estimation. In the non-parametric method, the PSD is directly estimated from the signal using techniques like periodogram, Welch, etc. Periodogram method is one of the easiest methods of PSD estimation based on FFT. In this method, the input signal is divided into many frames which are having length in the power of 2. To estimate power spectrum of the signal, signal is divided into a number of segments which can be overlapped and then FFT of each segment is computed and finally, the average of these spectrums is estimated [29]. However, this method suffers from offside lobe leakage problem due to finite length of signal. Addressing this drawback, Welch [30] proposed modification in this method and the modified spectrum is known as Welch spectrum. In Welch spectrum, the input signal is divided into segments. Overlapping of segments may/may not be done. The segmented data is fitted into the window for edge smoothening. Then, periodogram spectrum is estimated from each window and all the estimates are further averaged to obtain the overall Welch spectrum. Advantage of Welch method is that it reduces the side lobe leakage and gives stability of spectral estimate in statistical sense. For further increasing spectrum stability, the number of segments is increased. Since data length is limited, this is made possible by overlapping the segments. Due to averaging of multiple estimates, estimation variance reduces. But in parallel, redundancy increases which can be limited by using non-rectangular window to
give less weightage to the overlapping end samples. The ith modified periodogram is given as

    P_{xx}^{i}(f) = \frac{S_f}{A\,N} \left| \sum_{k=0}^{N-1} x_i(k)\, w(k)\, e^{-j 2\pi f k} \right|^2 \qquad (1)

where f is the frequency normalized by the sampling frequency f_s, i.e., in cycles per sample, S_f is a scaling factor, N is the length of the input signal x(k), and w(k) is the window function. A is a normalization constant given as

    A = \frac{1}{N} \sum_{k=0}^{N-1} w^2(k) \qquad (2)

and the final PSD is estimated as

    P_{xx}^{W}(f) = \frac{1}{U} \sum_{i=0}^{U-1} P_{xx}^{i}(f) \qquad (3)

where U is the number of averaged segments.
Though, Welch method gives accurate power spectral estimation, it demands cautious selection of window functions and number of segments of the signal. This is because small data length of segments and non-rectangular window make the resolution of Welch method somewhat less. On the other hand, it reduces estimation variance. So, Welch method has trade-off between variance reduction and resolution [24].
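The sketch below is one way to realize Eqs. (1)–(3) directly, by averaging windowed and normalized periodograms over the segments. Python/NumPy/SciPy is assumed purely for illustration (the paper's experiments were carried out in MATLAB), the scaling follows the equations above rather than any particular library convention, and scipy.signal.welch is called only as a rough cross-check.

```python
# Direct rendering of Eqs. (1)-(3): average of windowed, normalized periodograms.
import numpy as np
from scipy import signal

def welch_psd(x, fs, seg_len, overlap=0.5):
    step = int(seg_len * (1 - overlap))
    w = np.hanning(seg_len)                      # non-rectangular window w(k)
    A = np.mean(w ** 2)                          # Eq. (2): normalization constant
    periodograms = []
    for start in range(0, len(x) - seg_len + 1, step):
        seg = x[start:start + seg_len] * w       # windowed segment x_i(k)*w(k)
        X = np.fft.rfft(seg)                     # DFT of the segment
        periodograms.append(np.abs(X) ** 2 / (A * seg_len * fs))  # Eq. (1), scaled to a density
    freqs = np.fft.rfftfreq(seg_len, d=1 / fs)
    return freqs, np.mean(periodograms, axis=0)  # Eq. (3): average over the U segments

# Synthetic 50 Hz signal with 5th and 7th harmonics, loosely mimicking the harmonics case
fs = 6400
t = np.arange(0, 1, 1 / fs)
x = np.sin(2*np.pi*50*t) + 0.2*np.sin(2*np.pi*250*t) + 0.1*np.sin(2*np.pi*350*t)

f1, p1 = welch_psd(x, fs, seg_len=640)
f2, p2 = signal.welch(x, fs=fs, nperseg=640)     # library reference; scaling conventions differ slightly
for target_hz in (50, 250, 350):
    idx = np.argmin(np.abs(f1 - target_hz))
    print(f"{target_hz} Hz bin: manual={p1[idx]:.3e}  scipy={p2[idx]:.3e}")
```

Increasing the number of overlapped segments lowers the variance of the estimate, while the fixed 640-sample segment length caps the frequency resolution, which is the trade-off noted above.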
3 Spectral Estimation of PQ Signal This section presents spectral estimation of power quality disturbances in PV and windintegrated grid using Welch method. Transient and harmonics have been analyzed separately and their simultaneous occurrence has also been considered for spectral estimation. For this purpose, a Simulink model has been developed in MATLAB which consists of PV array and wind generator (DFIG) integrated with main grid, as shown in Fig. 1. The fundamental frequency of the system is 50 Hz and sampling frequency is selected as 6.4 kHz. Three types of power quality disturbance signals are generated and measured from point of common coupling (PCC). Their Welch spectrum is then analyzed for estimation of signal frequency. Transient is a temporary phenomenon in PV and wind-integrated system. Mostly, it occurs due to dynamic operations of power system such as switching of capacitor bank. Harmonics mainly occur owing to the presence of nonlinear loads in the system. Nonlinear loads generate harmonics of multiple orders. Therefore, in this work, for generating the disturbance signals, capacitor bank and rectifier as nonlinear load are used. Transient event is generated by switching on the capacitor bank and harmonics are generated by connecting three-phase rectifier load. Harmonics with transient is generated using both capacitor bank and rectifier loads. Figure 2 shows the waveform of each type of generated disturbances.
Fig. 1. Single-line diagram of Simulink model developed for PV and wind-integrated system
Fig. 2. Three-phase waveforms of PQ disturbances
Three-phase signal captured from PCC is segmented into single phases and then normalized. Window length of 640 samples has been used for each phase. Figure 3 shows all three phases of voltage signal with harmonics and their corresponding Welch spectrums. In Fig. 3, the Welch spectrum of phase-A signal shows first peak at point 0.01563, which presents the base frequency of 50 Hz. Second peak is at 0.07813 which corresponds to fifth-harmonic component at 250 Hz. Third peak is at 0.1094 which shows
frequency of 350 Hz. Fourth peak at 0.1719 and fifth peak at 0.2031 show frequency components of 550 and 650 Hz. Sixth and seventh peaks are at 0.2656 and 0.2969, which relate to the 850 and 950 Hz frequencies present in the harmonics signal. The same frequency components are observed in phase-B and phase-C, as apparent from Fig. 3.
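The peak abscissae quoted above appear to be normalized to the Nyquist frequency (half of the 6.4 kHz sampling rate), so each value can be converted back to Hz as in the short check below; this is an illustrative calculation added here, not part of the original analysis.

```python
# Converting the normalized peak locations read from the Welch spectra into Hz,
# assuming the plots are normalized to the Nyquist frequency fs/2 = 3200 Hz.
fs = 6400
nyquist = fs / 2
for peak in (0.01563, 0.07813, 0.1094, 0.1719, 0.2031, 0.2656, 0.2969):
    print(f"{peak:.5f} -> {peak * nyquist:.0f} Hz")
# 0.01563 -> 50 Hz, 0.07813 -> 250 Hz, 0.10940 -> 350 Hz, 0.17190 -> 550 Hz, ...
```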
Fig. 3. Single-phase waveforms of voltage signal with harmonics and their corresponding Welch spectrums
Figure 4 shows the voltage signal segmented in three single phases during transient event and their corresponding Welch spectrums. The Welch spectrums provide the power spectrum density estimates. As seen from the Fig. 4, the transient is present in phase-A and phase-C. The Welch spectrum shows two peaks in both phase-A and phase-C. The first peak at 0.01563 corresponds to base frequency of 50 Hz. Second peak at 0.6094 estimates the transient frequency of 1950 Hz. In phase-B, there is only one peak at 0.01563 that is related to 50 Hz as there is no transient event-related component in phase-B. Figure 5 shows the occurrence of two disturbances, i.e., harmonics and transient simultaneously with their single-phase waveforms and resultant Welch spectrums. In Fig. 5, harmonics with transient are present in phase-A and phase-C. In phase-B, only harmonics are present, no transient is seen. So, the phase-B signal and its Welch spectrum are same as that in Fig. 3. In phase-A, the first peak is of 50 Hz frequency and second peak is of 250 Hz. Third, fourth, and fifth peaks show frequency components at 350, 450, and 650 Hz, respectively. The last peak which is at 0.3909 corresponds to 1250 Hz frequency. Welch spectrum of phase-C is same as phase-A spectrum. In other words, phase-C is having the same spectral components as that of phase-A. Thus, the PSD estimation has been effectively addressed for all three PQ disturbances in PV and wind-integrated system. Welch spectrum in all three cases demonstrate their effectiveness in accurate estimation of different frequency components present in the voltage signals.
Fig. 4. Single-phase waveforms of voltage signal with transient and their corresponding Welch spectrums
Fig. 5. Single-phase waveforms of voltage signal with harmonics and transient and their corresponding Welch spectrums
4 Conclusion This work presented Welch method-based non-parametric approach for spectral estimation of power quality disturbances in PV and wind-integrated system. Three different PQ disturbances, i.e., harmonics, transient, and harmonics with transient are simulated in MATLAB/Simulink and analyzed using spectral analysis method. Three phase signals are segmented into three single phases. Welch power spectrum density plots of each phase are plotted and the frequency components present in the disturbance signal are estimated from these plots. Welch spectrums have different shapes for different power quality events depending on the signal frequency. Accurate spectral analysis obtained
from Welch method portrays the effectiveness of Welch method in assessment of power quality disturbances occurring in PV and wind-integrated systems.
Noise Density Range Sensitive Mean-Median Filter for Impulse Noise Removal Prateek Jeet Singh Sohi1 , Nikhil Sharma1 , Bharat Garg1(B) , and K. V. Arya2 1 Thapar Institute of Engineering and Technology, Patiala, Punjab 147001, India
[email protected], [email protected], [email protected] 2 ABV-Indian Institute of Information Technology and Management, Gwalior 474015, India [email protected]
Abstract. A new noise density range sensitive algorithm for the restoration of images that are corrupted by impulse noise is proposed. The proposed algorithm replaces the noisy pixel by mean, median or pre-processed values based on noise density of the image. The proposed filter uses a unique approach for recovering images corrupted with very high noise densities (over 85%). It also provides significantly better image quality for different noise densities (10–90%). Simulation results show that the proposed filter outperforms in comparison with the other nonlinear filters. At very high noise densities, the proposed filter provides better visual representation with 6.5% average improvement in peak signal-to-noise ratio value when compared to state-of-the-art filters. Keywords: Median filters · Salt and pepper noise · Mean filters · Noise density-based filter · Image processing
1 Introduction Salt and pepper noise, also known as impulse noise, is often introduced during the transmission of an image through a noisy medium. It is caused due to sudden disturbance in the image during transmission and electromagnetic interference in the environment. The pixels attain extreme values of either 0 or 255 in case of a greyscale image [1]. Filters are effective tools for de-noising the image. They could be a piece of hardware or software that performs an algorithm on the input signal to produce de-noised image. Linear and nonlinear filtering [3–9] approaches are one of the most prevalent filtering techniques. Various approaches such as interpolation-based filters [6] are also presented for denoising images, but their performance is only good when they get a sufficiently large amount of data. Other techniques like weighted median approach [7, 8] mostly divide information into two or more groups, and based on the weighting factor, it decides which group to pick for processing. Although at high noise density, these filters fail catastrophically due to the lack of original pixels in created groups. Many trimmed © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_18
median filters [4] also exist that provide satisfying solutions only in specific noise density ranges. Several linear and nonlinear filters are proposed which are good at a certain limit of noise density, but after that, they consequently lead to a loss in edge detail and blurring of image [9]. However, it is often ignored that different noise densities need range specific solutions. To overcome these problems, a new noise density range sensitive mean-median filter (NRSMF) is proposed, which is a combination of both linear and nonlinear filters that provide range-specific solutions for de-noising images.
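As a simple illustration of the impulse-noise model just described (not the authors' code), the following sketch corrupts a greyscale image with a chosen noise density by driving pixels to the extreme values 0 or 255:

```python
import numpy as np

def add_salt_and_pepper(image: np.ndarray, noise_density: float, rng=None) -> np.ndarray:
    """Corrupt a greyscale image with impulse noise of the given density.

    A fraction `noise_density` of the pixels is set to 0 (pepper) or
    255 (salt) with equal probability; the rest are left untouched.
    """
    rng = np.random.default_rng(rng)
    noisy = image.copy()
    mask = rng.random(image.shape) < noise_density   # pixels to corrupt
    salt = rng.random(image.shape) < 0.5             # half salt, half pepper
    noisy[mask & salt] = 255
    noisy[mask & ~salt] = 0
    return noisy

# Example: corrupt a ramp image with 90% impulse noise
clean = np.tile(np.arange(256, dtype=np.uint8), (256, 1))
noisy = add_salt_and_pepper(clean, 0.9, rng=0)
print("estimated noise density:", np.mean((noisy == 0) | (noisy == 255)))
```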
2 Related Work Various filters are proposed for detecting and removing the impulse noise [2]. The mean and median filters are proposed at first [3–5]. In mean, a 3 × 3 window is constructed across the pixel, and mean of all pixels in the window replaces the processed pixel. The main problem with mean filters is their inability to preserve edges because no original pixel values are restored in the process. For this reason, exploration in the field of median filters [4, 5] started where the corrupted pixel is replaced by the median of 3 × 3 window. Though this filter performs excellently on lower noise densities, for high noise densities, most of the pixels in the 3 × 3 window are corrupted. So, most of the pixels will remain corrupted after processing, which results in a poor image quality. Many nonlinear filers are introduced to overcome the problems of mean and median filters, such as decision-based median filter (DBAMF) [6]. In DBAMF, a specific type of sorting is done on the 3 × 3 window, so that the resultant matrix has the median as its central pixel of the window. The processed pixel is then replaced by the centre pixel. In case the median turns out to be noisy, the pixel being processed is replaced by the nearest non-noisy pixel. The major drawback of this filter is that, under high noise density condition, the median generally turns out to be noisy. This results in repetition of the same pixel, again and again, causing a blurring effect. To improve the performance of DBAMF, new filters like unsymmetric trimmed mean filter (UTMF) [7] and unsymmetric trimmed midpoint filter (UTMP) are introduced. The UTMF performed better on lower noise densities, whereas UTMP is good at higher noise densities. In UTMF, the mean of non-extreme values in a 3 × 3 window is taken for every pixel. On the other hand, in UTMP the mid-point of non-extreme values is taken. The main problem with these two algorithms is that even though they work great in their own domain, they fail catastrophically when considered individually. Another nonlinear filter, modified decision-based un-symmetric trimmed median filter (MDBUTMF) [9] overcame these problems. In which, a fixed 3 × 3 window is considered across the noisy pixel and median of the uncorrupted values in the window is used to replace the noisy pixel. If the window contains only corrupted pixels, the pixel is replaced by the mean of all the pixels in the selected window. This process also performs well at lower noise densities, but at higher noise densities the noisy pixel is replaced by the mean of the corrupted pixels in the 3 × 3 matrix. This causes a streaking effect throughout the image. To overcome this issue, fast switching-based median–mean filter (FSBMMF) [10] is introduced. It uses the mean of the previously processed pixels in the 3 × 3 window when the median turns out to be noisy. This results in lower streaking.
This method uses a unique approach for the corner and last row/column pixels. These pixels are replaced by the previous pixel in their respective row or column. For the corner pixel, the mean of uncorrupted values of the 5 × 5 window is considered. However, the main problem with FSBMMF is that corrupted pixels are taken into consideration in the 3 × 3 window while taking the median, which causes inaccuracies and corruption in the estimated values. A different approach has also been introduced, which includes the usage of interpolation for replacing the corrupted pixel with the processed pixel; one such example is recursive cubic spline interpolation filter (RSIF) [11]. The only benefit of this technique is that interpolation is a better approximation technique referring mathematically. During higher noise densities, it is not as useful because it does not get ample amount of data to make accurate predictions. Moreover, this algorithm is just like MDBUTMF, where the median is used instead of interpolation. Also, this filter takes a considerable amount of computational time than other existing filters. Thus, a new three-value weighted median (TVWA) [12] with variable size window is proposed. The idea is to divide the non-corrupted values of the processing window into three groups. The first group contains all the non-corrupted pixels closer to the maximum value; the second group contains all the non-corrupted pixels closer to the middle value and the third group with pixels closer to the minimum value. So, based on the weighting factor of the group, the corrupted pixel is replaced by the product of the weighting factor of each group and respective maximum, minimum and middle value. However, if the window in consideration only had corrupted values, the window size is increased until an uncorrupted pixel is encountered. This approach is an improvement over other existing adaptive median filters, which use variable size windows during high noise density that resulted in lower details in the image. Another method to cure the above-mentioned issues is adaptive switching weighted median filter (ASWMF) [13]. In this method, every pixel should have to pass a certain set of conditions to make the decision whether it is a corrupted pixel or not. So, if the pixel is not 0 or 255 and the mean of the pixel in the 3 × 3 window is not 0 or 255, respectively, then the pixel is noise-free. For the processing, this algorithm uses the concept of repeating the pixel in the variable size window a specific number of times and then taking the median. Also, for median calculation, it is first checked whether the length of the window is even or odd, which further has certain repeating concepts to process them, respectively. The major problem with the filter is that it takes a lot of computational time due to its complexity when compared to other filters. Also, at higher noise densities the pixel is replaced by nearest non-noisy pixel when 3 × 3 and 5 × 5 windows fail to execute, which results in blurring and loss of details in the image. One of the latest approaches in context with resolving salt and pepper noise is different applied median filter (DAMF) [14]. The method first creates a binary image from the corrupted image by setting the extreme pixels as 1 and others with 0. Further, these values are used to determine whether a pixel is corrupted or not. So, if the pixel is corrupted, then a 3 × 3 window is constructed across it. 
The window size is increased until one non-noisy pixel is encountered in the inspection window. However, the window size is limited to 7 × 7. Once a non-noisy pixel is encountered in the window, median of all the non-extreme pixels is used to replace the corrupted pixel. Once all the corrupted
pixels have been operated upon, the resultant image is again converted into a binary format as before. If a corrupted pixel is encountered, then a 3 × 3 window is formed around it again, and a median of the uncorrupted pre-processed values replaces the pixel. The benefit of this method is that pre-processed pixels are used to calculate higher noise density conditions. Also, rather than zero padding, the edges are symmetrically padded, thus reducing the probability of creating higher noise density windows on the edges. The major disadvantage of this filter is that under very high noise density 5 × 5 and 7 × 7 windows are used for median calculation, which results in loss of image quality. It can be observed from the above-mentioned techniques that their performance at higher noise density is not up to the mark. The algorithm presented in the next section overcomes the above-mentioned problems.
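To make the trimming idea shared by UTMF, MDBUTMF and related filters concrete, here is a small illustrative sketch (not taken from any of the cited papers) that computes the mean and median of the non-extreme pixels in a window:

```python
import numpy as np

def trimmed_window_stats(window: np.ndarray):
    """Return (trimmed mean, trimmed median) of a window, excluding the
    extreme values 0 and 255, in the spirit of UTMF/MDBUTMF-style filters.
    Returns (None, None) when every pixel in the window is extreme."""
    vals = window.ravel()
    kept = vals[(vals != 0) & (vals != 255)]
    if kept.size == 0:
        return None, None
    return float(np.mean(kept)), float(np.median(kept))

w = np.array([[255,   0, 120],
              [255, 118,   0],
              [  0, 255, 125]], dtype=np.uint8)
print(trimmed_window_stats(w))   # statistics of the non-extreme pixels only
```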
3 Proposed NRSMF Algorithm
This section first presents the proposed NRSMF algorithm flowchart with a comprehensive explanation of each step. Then, various cases of the proposed algorithm are explained with the help of suitable examples.
3.1 Proposed Algorithm Flowchart
The flowchart of the proposed NRSMF algorithm is shown in Fig. 1. It de-noises the given image by performing the following steps:
Step 1: Calculate the approximate noise density (ND) of the image.
Step 2: If a pixel (P) is either 0 or 255, consider it a noisy pixel (P_n).
Step 3: If ND ≤ 25%, consider a 3 × 3 window (W_{3×3}) across P_n. Evaluate the noise-free window (W^{nf}_{3×3}) by eliminating all 0s and 255s. Calculate the median of W^{nf}_{3×3}; if the median is non-noisy, go to Step 2 for processing of the next pixel, else go to Step 6.
Step 4: If ND > 25%, consider W^{nf}_{3×3} and calculate the mean; if the mean is non-noisy, go to Step 2.
Step 5: If the mean is noisy, consider W^{nf}_{5×5}. If the number of non-noisy pixels left in W^{nf}_{5×5} is at least 2, take the mean of W^{nf}_{5×5}.
Note: Since Step 5 uses W_{5×5}, it is quite possible that P_n is a corner pixel in the window, which has the least amount of correlation with the corrupted pixel. This led to the conclusion that it is better to have at least two non-noisy pixels in W_{5×5} so as to have more information about P_n.
Step 6: Check for a corner or boundary pixel.
(a) If P_n is a corner pixel, replace P_n by the median of W^{nf}_{5×5}.
(b) If P_n is a boundary pixel, replace P_n by the previously processed pixels in its respective row/column.
Step 7: After processing, if P_n is still noisy, do the following:
(a) If ND ≤ 85%, replace P_n by the mean of the previously processed pixels in W_{3×3}.
Fig. 1. The proposed algorithm flowchart
*Note: It is observed that the previously processed pixels are already close to the actual values. Hence, they produce better estimates during mean calculation.
(b) If ND > 85%, replace P_n by the mean of the previously processed pixels in W_{5×5}.
Note: For ND greater than 85%, most of the processed pixels have been estimated from few original values. That is why W_{5×5} is taken instead of W_{3×3}, as it increases the chance of reaching the value of the original pixel. It can be observed that all the noisy pixels are computed from non-extreme pixels, which should provide better results. The next subsection explains the above-mentioned algorithm with the help of a few examples.
3.2 Examples Related to the Proposed Algorithm
In Figs. 2, 3 and 4, examples of images corrupted with noise density less than 25%, between 25 and 85%, and above 85%, respectively, are shown. For Fig. 2, to evaluate the corner pixel, first W_{3×3} is considered, but all the pixels in the window are noisy, so the condition for a corner pixel is applicable, i.e., a W_{5×5} is considered, where non-noisy pixels are found. Then, the corrupted pixel is replaced by the median of W^{nf}_{5×5}.
Fig. 2. Noise density is less than 25%
For Fig. 3, the corrupted pixel in the 1st row, 5th column is replaced by the previously processed pixel, because the number of uncorrupted pixels is less than 2 in W_{5×5}. So, according to the algorithm, the unprocessed boundary pixel is replaced by the previously processed row/column pixel. In Fig. 4, for the pixel in the 5th row, 4th column, a W_{3×3} is considered, but all the pixels in W_{3×3} turn out to be noisy. So, we check W^{nf}_{5×5}. But now, the length of W^{nf}_{5×5} is less than 2; thus, the pixel is replaced by the mean of the previously processed pixels in W_{5×5}.
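A simplified Python sketch of this noise-density-driven replacement logic is given below. It is not the authors' implementation: the corner/boundary handling of Step 6 and the exact fallback of Step 7 are approximated, and the window helpers are illustrative assumptions.

```python
import numpy as np

def estimate_noise_density(img):
    """Step 1: fraction of extreme-valued (0/255) pixels."""
    return np.mean((img == 0) | (img == 255))

def window(img, r, c, half):
    """Square window around (r, c), clipped at the image border."""
    return img[max(r - half, 0):r + half + 1, max(c - half, 0):c + half + 1]

def noise_free(win):
    """Pixels of the window with the extreme values 0 and 255 removed."""
    v = win.ravel()
    return v[(v != 0) & (v != 255)]

def nrsmf_sketch(noisy):
    """Simplified NRSMF-style pass over a greyscale image."""
    nd = estimate_noise_density(noisy)
    out = noisy.astype(np.float64)
    rows, cols = noisy.shape
    for r in range(rows):
        for c in range(cols):
            if noisy[r, c] not in (0, 255):        # Step 2: keep clean pixels
                continue
            nf3 = noise_free(window(noisy, r, c, 1))
            if nd <= 0.25 and nf3.size > 0:        # Step 3: median of W3x3^nf
                out[r, c] = np.median(nf3)
            elif nd > 0.25 and nf3.size > 0:       # Step 4: mean of W3x3^nf
                out[r, c] = np.mean(nf3)
            else:
                nf5 = noise_free(window(noisy, r, c, 2))
                if nf5.size >= 2:                  # Step 5: mean of W5x5^nf
                    out[r, c] = np.mean(nf5)
                else:                              # Step 7: fall back on already-
                    half = 1 if nd <= 0.85 else 2  # processed neighbours (3x3 / 5x5)
                    prev = window(out, r, c, half)
                    usable = prev[(prev != 0) & (prev != 255)]
                    out[r, c] = usable.mean() if usable.size else prev.mean()
    return out.astype(np.uint8)
```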
4 Simulation Results and Analysis The performance of the NRSMF and other existing filters namely DBAMF, MDBUTM, FSBMMF, RSIF, TVWA, ASWMF and DAMF are tested on different greyscale images
Fig. 3. Noise density is between 25 and 85%
Fig. 4. Noise density is greater than 85%
like Boat (512 × 512), Zelda (512 × 512), Lena (512 × 512) and coloured images, e.g., the peppers image (512 × 512). These images are corrupted with impulse noise. The value of peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) [15] is calculated with varying noise densities. The SSIM is a method for predicting the perceived quality of digital images, and PSNR (dB) is the ratio between the maximum possible signal power and the power of the noise in the corrupted image. Mathematically,

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{\mathrm{Max}^2}{\mathrm{MSE}}\right) \tag{1}$$

$$\mathrm{MSE} = \frac{1}{MN}\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\left[x(i,j) - y(i,j)\right]^2 \tag{2}$$
where MSE is the mean square error, and x and y represent the input and output images, respectively. The parameters M and N represent the dimensions of the image, and Max is the maximum pixel value of the image.

$$\mathrm{SSIM}(x, y) = \frac{\left(2\mu_x\mu_y + c_1\right)\left(2\sigma_{xy} + c_2\right)}{\left(\mu_x^2 + \mu_y^2 + c_1\right)\left(\sigma_x^2 + \sigma_y^2 + c_2\right)} \tag{3}$$

where $\mu_x$, $\sigma_x^2$ and $\mu_y$, $\sigma_y^2$ are the mean and variance of x and y, respectively, $\sigma_{xy}$ represents the covariance of x and y, and $c_1$, $c_2$ are constants. The subsections below provide the analysis of the proposed algorithm at higher noise densities, followed by the analysis over a wide range of noise densities in terms of PSNR, SSIM and the visual representation of the image.

4.1 Analysis on Higher Noise Density
In Table 1, the average PSNR and SSIM values are shown for the NRSMF and existing filters, when the Boat, Zelda and Lena images, each of 512 × 512, are considered. The noise density is varied from 85 to 99%. The simulation results show that at higher noise densities, NRSMF has superior performance in comparison with the other filters, even above 95%. Figures 5 and 6 provide the plots for the same. The trend clearly demonstrates that the PSNR and SSIM of NRSMF are better than those of the other existing filters. The simulation results verify that the proposed NRSMF filter is very efficient for high-density impulse noise removal.

4.2 Analysis on Wide Noise Density
Table 2 provides the PSNR values for the peppers (coloured) image (512 × 512) with noise densities varying from 10 to 90%. The values clearly show that there is an improvement in PSNR for a wide range of noise densities. Figure 8 provides the plot between PSNR and noise density for the peppers image, and the result of NRSMF is better than that of the other filters. Figure 7 shows the visual representation of the output image when the image is corrupted by 90% impulse noise. The proposed filter provides better image quality by retaining the edges, with less blurring and a low streaking effect.
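For reference, a minimal Python sketch of how PSNR and SSIM values such as those in Tables 1 and 2 can be computed from Eqs. (1)–(3) is given below. It evaluates SSIM globally over the whole image rather than with the local windows of [15], and the constants c1 and c2 follow a common default choice; both are assumptions, not the paper's settings.

```python
import numpy as np

def mse(x, y):
    """Eq. (2): mean squared error between input x and restored output y."""
    x = x.astype(np.float64); y = y.astype(np.float64)
    return np.mean((x - y) ** 2)

def psnr(x, y, max_val=255.0):
    """Eq. (1): peak signal-to-noise ratio in dB."""
    m = mse(x, y)
    return float("inf") if m == 0 else 10.0 * np.log10(max_val ** 2 / m)

def ssim_global(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Eq. (3) evaluated over the whole image (no local windows)."""
    x = x.astype(np.float64); y = y.astype(np.float64)
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```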
5 Conclusion A new NRSMF filter for removing high-density impulse noise is proposed. The proposed NRSMF filter efficiently estimates pixel based on noise density of the image. The NRSMF accurately makes decision according to information in the particular window. Unlike other existing filters, the proposed NRSMF makes sure that there are at least two information pixels in the window for the restoration. Also, a unique technique is used for restoring images corrupted with very high noise densities (over 85%), i.e., the use of pre-processed pixels in the 5 × 5 window for the mean calculation process. The NRSMF shows stable and consistent performance across a wide range of noise densities varying from 10 to 90%. Also, the NRSMF filter performs exceptionally well under very high noise density conditions (85–90%). In addition, the proposed NRSMF provides a good restoration on a large set of greyscale and coloured images.
Table 1. Average PSNR and SSIM values of various filters on Boat, Zelda and Lena images, each of size 512 × 512, with varying noise densities (85–99%)

Filters        Metric   Noise density (%)
                        85       87       89       91       93       95       97       99
DBAMF [6]      SSIM     0.732    0.693    0.65     0.5791   0.509    0.401    0.277    0.126
MDBUTM [9]     SSIM     0.407    0.354    0.305    0.251    0.203    0.152    0.113    0.063
FSBMMF [10]    SSIM     0.864    0.848    0.827    0.79     0.757    0.697    0.615    0.467
RSIF [11]      SSIM     0.819    0.798    0.765    0.714    0.671    0.593    0.499    0.345
TVWA [12]      SSIM     0.905    0.881    0.85     0.771    0.625    0.416    0.201    0.044
ASWMF [13]     SSIM     0.9      0.885    0.863    0.827    0.749    0.619    0.433    0.197
DAMF [14]      SSIM     0.907    0.896    0.88     0.856    0.807    0.698    0.455    0.117
Proposed       SSIM     0.908    0.897    0.883    0.855    0.825    0.773    0.692    0.514
DBAMF [6]      PSNR     21.21    20.66    19.66    18.87    17.91    16.54    15.07    12.09
MDBUTM [9]     PSNR     17.99    17.16    16.33    15.52    14.65    13.87    13.19    12.35
FSBMMF [10]    PSNR     25.35    25.06    24.64    23.55    22.77    21.25    19.55    16.7
RSIF [11]      PSNR     23.7     23.39    22.79    21.62    20.86    19.53    18.16    15.84
TVWA [12]      PSNR     27.24    26.41    25.32    22.83    19.71    16.15    12.46    8.23
ASWMF [13]     PSNR     26.6     26.11    25.57    24.57    23.08    21.07    18.67    15.17
DAMF [14]      PSNR     27.35    26.87    26.36    25.34    23.65    20.47    15.45    8.83
Proposed       PSNR     27.41    27.07    26.69    25.64    24.7     23.17    21.4     18.16
Fig. 5. Average values of PSNR for Boat, Zelda and Lena images (512 × 512) with varying noise densities from 85 to 99%
Fig. 6. Average values of SSIM for boat, Zelda and Lena images (512 × 512) with varying noise densities from 85 to 99%
Fig. 7. Simulation results on peppers image (512 × 512) corrupted with 90% salt and pepper noise. a original image, b noisy image, and restored images from c DBAMF [6], d MDBUTM [9], e FSBMMF [10], f RSIF [11], g TVWA [12], h ASWMF [13], i DAMF [14] and j NRSMF filters

Table 2. PSNR values of various filters on peppers image (512 × 512) with varying noise densities (10% to 90%)

Filters        Noise density (%)
               10       20       30       40       50       60       70       80       90
DBAMF [6]      30.52    28.81    27.91    26.42    25.51    24.37    22.98    21.22    18.19
MDBUTM [9]     30.94    29.38    27.82    26.26    24.66    22.77    20.23    17       13.55
FSBMMF [10]    31.69    30.56    29.84    28.26    27.56    26.7     25.66    24.2     21.56
RSIF [11]      31.08    29.87    28.93    27.99    27.01    26.21    25.04    23.5     20.66
TVWA [12]      31.48    30.83    30.13    29.44    28.69    27.51    26.04    23.77    19.88
ASWMF [13]     31.29    30.37    29.36    28.24    27.14    26.03    24.75    23.24    20.9
DAMF [14]      30.87    29.88    28.87    27.77    26.77    25.84    24.55    23.04    20.93
Proposed       31.75    31.07    30.13    29.44    28.71    27.87    26.51    25.72    23.71
Fig. 8. PSNR values of peppers (512 × 512) image with varying noise densities from 10 to 90%
References 1. C. Gonzalez Rafael, E. Woods Richard, Digital image processing, 2nd edn. (Prentice Hall, Englewood Cliffs NJ, 2002) 2. Mafi, M.; Martin, H.; Cabrerizo, M.; Andrian, J.; Barreto, A.; Adjouadi, M., “A comprehensive survey on impulse and Gaussian denoising filters for digital images”. Signal Process, 2018 3. K.M. Singh, P.K. Bora, S.B. Singh, Rank-ordered mean filter for removal of impulse noise from images, in IEEE International Conference on Industrial Technology (2002) 4. I. Pitas, A.N. Venetsanopoulos, Nonlinear Digital Filters Principles and Applications (Kluwer, Norwell, MA, 1990) 5. S. Zhang, M.A. Karim, A new impulse detector for switching median filters. IEEE Signal Process. Lett. 9(11), 360–363 (2002) 6. K.S. Srinivasan, D. Ebenezer, A New Fast and Efficient Decision- Based Algorithm for Removal of High-Density Impulse Noises. IEEE Signal Process. Lett. 14(3), 189–192 (2007) 7. K. Aiswarya, V. Jayaraj, D. Ebenezer, A new and efficient algorithm for the removal of high density salt and pepper noise in images and videos, in Proceedings in Inter Conference on Computer Modeling and Simulation (2010), pp. 409–413 8. S. Deivalakshmi, P. Palanisamy, Improved tolerance based selective arithmetic mean filter for detection and removal of impulse noise, in Proceedings of 5th IEEE International Conference on Industrial and Information Systems (ICIIS) (2010), pp. 309–13 9. S. Esakkirajan, T. Veerakumar, A.N. Subramanyam, C.H. Premchand, Removal of high density salt and pepper noise through modified decision based un-symmetric trimmed median filter. IEEE Signal Process. 18(5), 287–290 (2011) 10. V.R. Vijaykumar, G. Santhana Mari, D. Ebenezer, Fast switching based median-mean filter for high density salt and pepper noise removal. AEU Int. J. Electr. Commun. 68(12), 1145–1155 (2014)
11. T. Veerakumar, S. Esakkirajan, I. Vennila, Recursive cubic spline interpolation filter approach for the removal of high density salt-and-pepper noise. Int. J. Sig. Image Video Process. 8(1), 159–168 (2014) 12. C.T. Lu, Y.Y. Chen, L.L. Wang, C.F. Chang, Removal of salt-and pepper noise in corrupted image using three-values-weighted approach with variable-size window. Patt. Recogn. Lett. 80, 188–199 (2016) 13. Osama S. Faragallah, Hani M. Ibrahem, Adaptive switching weighted median filter framework for suppressing salt-and-pepper noise. AEU—Int. J. Electron. Commun. 70(8), 1034–1040 (2016) 14. U. Erkan et al., Different applied median filter in salt and pepper noise. Comput. Electr. Eng. (2018). https://doi.org/10.1016/j.compeleceng.2018.01.019 15. Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, Image quality assessment: from error measurement to structural similarity (IEEE Trans, Image Processing, 2004)
Artificial Intelligence and Computer Vision
Foodborne Disease Outbreak Prediction Using Deep Learning Pranav Goyal(B)
, Dara Nanda Gopala Krishna , Divyansh Jain, and Megha Rathi
Department of Computer Science and Engineering, Jaypee Institute of Information Technology, Noida, India [email protected], [email protected], [email protected], [email protected]
Abstract. In this paper, we have discovered patterns in foodborne disease outbreaks using deep learning approach. We have implemented artificial neural networks for the outbreak prediction of foodborne diseases. Data set is collected and gathered from the centre for diseases control (CDC) database of foodborne disease outbreaks from the year 1998–2015 to predict the meaningful patterns among food consumed and its location. This paper introduces a new methodology that predicts the status of illness; that is, on consuming a particular food, illness of a person is confirmed or suspected. Our methodology also predicts patterns in food which is the major factor to decide the status of the illness. It is very difficult to completely prevent the outbreaks of foodborne diseases but predicting these foodborne diseases could help public health officials to adopt control measures for not occurring of these foodborne diseases. Keywords: Foodborne diseases · Deep learning · Data preprocessing · Classification · Visualization · Prediction
1 Introduction Foodborne disease is a serious issue that needs to be addressed because of its effects for the last two decades. To measure one of the global problem foodborne diseases, an initiative was launched by the World Health Organization (WHO) in 2006 along with its partners. To lead the initiative, Foodborne Disease Burden Epidemiology Reference Group (FERG) was established in the year 2007. Since then WHO is constantly working to provide better estimates about the global burden of foodborne diseases. For avoiding any health-related problems like illnesses, allergy, etc., people are required to consume hygienic food. It also helps to prevent consumers from risks of food poisoning and even death. Consumers expect hygienic food, but it is not always possible. Therefore, developing approaches for diseases prediction based on the consumption of food, its location and other factors related to the illness of consumers is very necessary. Several countries adopt advanced techniques and monitoring systems in order to achieve food safety which plays a major role in healthy life but still outbreaks of the © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_19
foodborne disease remain a common problem. Consumption of contaminated food could lead to the outbreak of the foodborne diseases. Foodborne infections can occur due to many reasons like disease-causing micro-organisms or pathogens. In addition to these, food items which are doped with poisonous chemicals also lead to foodborne diseases. But contamination of food by chemicals, micro-organisms, etc., can occur at any stage before the food reaches the consumer. Therefore, it is time-consuming to verify every stage to find if the food is contaminated or not. Hence, we use previous data to predict which illness can be caused by consuming which food. Implementation of the control and prevention measures for foodborne diseases in the food industry should be the primary task of health officials. In the same way, the adoption of advanced techniques at every level of food from its production to consumption will be helpful to strengthen food safety. Next, to find out the reason for the outbreak helps us to gain information about the factors which triggered foodborne diseases. Finally, observing previous data will lead to find out patterns in the outbreaks and helps to implement prevention measures for foodborne diseases. This collective information could guarantee the changes which result in improving food safety. In the proposed work, we focus on two issues. First is to find the food that causes foodborne disease outbreak also referred to as food vehicle and second is to find a place where foodborne disease outbreak is most likely to occur, i.e., food, location (e.g. home, hospital, restaurant, etc.) [1, 2]. For preventing outbreaks, it is very important to identify the food that is most frequently linked to a particular outbreak. To analyse food type and location, some work has been done in the past, but no prior work has been done to find hidden patterns in the relationship between location, food type and disease outbreak. For addressing this gap, we have suggested the use of some deep learning and data mining algorithms so that we can extract some patterns for foodborne diseases from CDC Outbreak Surveillance Data.
2 Related Work The main purpose of storing data related to outbreaks of foodborne diseases is for better public health. A sufficient amount of data regarding agents which are responsible for the outbreak, vehicle of transmission and disease symptoms is collected by public health officials in all Organization for Economic Co-operation and Development (OECD) countries which are utilized to decrease the disease outbreaks. But still, the outbreak of foodborne diseases occurs and remained as one of important problem to be addressable. Very rare foodborne diseases are prevented such as typhoid fever and hepatitis because many vaccines are under research. The difficult part is to adopt the best strategy which helps us to fight against these foodborne diseases and safety measures at all the levels of food. Frameworks to investigate metrics that accelerate foodborne diseases and real-time food consumption patterns are provided by authors in [3] in their research. Similarly to prevent the foodborne diseases, some methods are introduced like Hazard Analysis Critical Control Points (HACCP) [4], effective tracing system in which model was developed by the combination of alphanumerical codes and Radio-Frequency Identification (RFID) benefits for both producer and consumer [5]. Food and drug administration (FDA) also
takes measures to prevent health risks due to foodborne diseases [6]. Data mining techniques like association rule mining and decision trees are introduced by the authors in [1, 2] for discovering hidden patterns and studying the relationship between different food combinations in the data from the Centers for Disease Control. Contaminated food products [7] are identified by using likelihood-based methods that use artificially generated outbreak scenarios and a real-world dataset of food sales. The authors in [8] have introduced ways to implement some new algorithms as well as several important measurements for the statistical model, concluding that foodborne disease outbreak data could be used to reduce the number of ill or dead people, reduce industrial loss and target public health. The authors in [8, 9] have discussed detecting early vineyard disease using a deep learning approach. In another recent study, the authors have focused on processing hyperspectral data [10] so that they can provide winemakers and farmers tools that can help in the early detection of threats. Similarly, for detecting banana leaf disease, a deep learning-based approach has been discussed by the authors in [11, 12].
3 Proposed Methodology Our studies examine the illness caused due to foodborne diseases based on data collected from CDC (1998–2015). We imported the data using R programming and then declared the target class attribute as status. The data has gone through different data preprocessing techniques which are explained in a detailed manner. 3.1 Data Description The data used for the prediction of foodborne diseases were collected from centres for disease control and Prevention Center for Diseases Control (CDC). There are 12 different attributes and 19,119 observations, described in Table 1 in the USA. In the data, 162 different locations, 3128 different food types, observations are from the year 1998 to 2015, 23 different types of status, 202 different types of species. From 19,119 observation, some aetiologies are confirmed, and some are suspected. Data also contained missing entries which will be handled later using data preprocessing techniques [13]. 3.2 Data Preprocessing To understand raw data, data mining technique known as data preprocessing is used so that data can be transformed and understood properly. Raw data or real-world data often lacks in certain trends or behaviours, like inconsistent, incomplete and may contain errors. To discover knowledge from such noisy, irrelevant or unreliable data during the training phase is very difficult and gruesome. To resolve these issues, data is preprocessed which helps in getting a more accurate final outcome. Rule-based applications and customer relationship management are some of the database-driven applications of data preprocessing. Dataset obtained from CDC is not fully appropriate for training deep learning techniques. So, different types of preprocessing techniques are taken into consideration to
Table 1. Description of dataset

Attribute            Type      Characteristics
Year                 Numeric   Observation from 1998–2015
Month                Text      Observation of every month
State                Text      State where the outbreak was observed
Location             Text      Location where the outbreak occurred
Food                 Text      Food due to which the outbreak occurred
Ingredient           Text      Types of ingredient in food
Species              Text      Types of species
Serotype/genotype    Text      Types of serotype/genotype
Status               Text      Status of the outbreak
Illnesses            Numeric   Number of illnesses observed
Hospitalization      Numeric   Number of hospitalizations observed
Fatalities           Numeric   Number of fatalities
train the deep learning techniques on the data. The attributes illnesses, hospitalization, and fatalities have values in numerical form with many empty entries. We replace these empty values by the mean of the respective attribute. The data should be converted into a numerical format for applying deep learning techniques to predict the status of illness, i.e., whether it is 'Confirmed' or 'Suspected'. Missing values are also replaced with a particular value; for this particular value, we use NA. The same process is applied to food, status, and the rest of the attributes, wherever necessary, in the entire data. The target attribute that we have considered in the CDC data, i.e., status, has 23 unique values. There should be only two unique values, as it is a binary class attribute. So we replaced the values other than 'Confirmed' and 'Suspected' with the value having the highest frequency in the data, i.e., 'Confirmed', including the empty values present in the status attribute. Now the status attribute remains with two values, i.e., 'Confirmed' and 'Suspected', which are again replaced by 1 and 0, respectively. Similarly, the 'NA' values, which had replaced the empty values in Food, Location, Species, Ingredient, and Serotype/genotype, are again replaced by the value with the highest frequency in that particular attribute.

3.3 Prediction
The data which is obtained from the CDC has ambiguity. We used preprocessing techniques to remove the ambiguity present in the data. After performing the preprocessing techniques mentioned above, the preprocessed data is suitable for making predictions and training deep learning models. The prediction could be made considering different attributes. For example, if we consider performing predictions based on only the location attribute, then the status of illness which we want to predict will be the highest-frequency value of the status values for the corresponding location already present in the preprocessed data. But if we
decided to predict based on only Food attribute, then the status of illness which we want to predict will be the highest frequency value of the status values for the corresponding food present in the preprocessed data. We assumed that location and food attribute both in data affect the value of status. For this assumption, we have to match the location and food with corresponding location and food in data, respectively. Now the value of status will be predicted as the value of status with maximum frequency corresponding with that particular location and food value in data.
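A minimal pandas sketch of the preprocessing and frequency-based prediction described in Sects. 3.2–3.3 is given below; the file name and column names are hypothetical stand-ins for the CDC export and may differ from the actual dataset:

```python
import pandas as pd

# Hypothetical file and column names based on Table 1; the real CDC export may differ.
df = pd.read_csv("cdc_outbreaks_1998_2015.csv")

# Numeric attributes: fill missing entries with the column mean
for col in ["Illnesses", "Hospitalization", "Fatalities"]:
    df[col] = df[col].fillna(df[col].mean())

# Text attributes: fill missing entries with the most frequent value
for col in ["Food", "Location", "Species", "Ingredient", "Serotype/genotype"]:
    df[col] = df[col].fillna(df[col].mode().iloc[0])

# Collapse the 23 raw status labels into a binary target: Confirmed=1, Suspected=0
df["Status"] = df["Status"].where(df["Status"].isin(["Confirmed", "Suspected"]),
                                  "Confirmed")
df["Status"] = (df["Status"] == "Confirmed").astype(int)

# Frequency-based prediction for a (location, food) pair: take the majority status
def predict_status(location, food):
    subset = df[(df["Location"] == location) & (df["Food"] == food)]
    return int(subset["Status"].mode().iloc[0]) if len(subset) else None

print(predict_status("Restaurant", "Chicken"))  # illustrative query only
```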
4 Result The preprocessed data is now passed as an input in the deep learning model to get trained. We declared class attribute as status and assumed location, food, month, state, ingredient, serotype/genotype as other attributes which affect the values of status. In deep learning model, 80% of data is used for training of model and the remaining 20% of data is used for testing purpose. Now, the trained deep learning model is evaluated based on results of class attribute status in test data. To evaluate the performance of deep learning model on the test data, we have used the confusion matrix. Confusion matrix has two rows and two columns which has values as a count of correct and incorrect predictions for correct and incorrect input data. Accuracy, precision and recall are the three parameters used for evaluating the deep learning models through the confusion matrix generated by them. Now, the accuracy is calculated from the confusion matrix generated from deep learning model according to the Eq. 1. Fig. 1 shows different food combinations that are more prone to a foodborne disease outbreak. The confusion matrix in Fig. 2 has row index values 0 and 1 which represents correct and incorrect data trained in the model, respectively. Similarly, it has column index values 0 and 1 represents correct and incorrect predictions made by the model, respectively.
Fig. 1. Prediction of food combinations having confirmed status
In Eq. 1 true positive (TP) is a count of correct predictions for correct input data, true negative (TN) is a count of incorrect predictions for incorrect input data, and the total population is the sum of all the values in the confusion matrix as shown in Fig. 2. Precision and recall are also calculated according to Eqs. 2 and 3, respectively. In Eq. 2, true positive (TP) is a count of correct predictions for correct input data, and false positive (FP) is a count of correct predictions for incorrect input data. In Eq. 3, true positive (TP) is a count of correct predictions for correct input data; false negative (FN) is a count of incorrect predictions for correct input data. Table 2 consists of accuracy, precision and recall of different models such as K-nearest neighbour (KNN), support
vector machine (SVM) and artificial neural network (ANN) of our preprocessed data. The preprocessed data trained in the above deep learning algorithms has attributes such as year, illnesses, hospitalization, fatalities, status, month, state, location, and food. All the values of attributes in processed data are converted into numeric type which will be flexible to train the deep learning models. We have considered the binary attribute, i.e., status in our processed data as our target variable in all the three deep learning models.
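The following scikit-learn sketch illustrates how the three classifiers compared in Table 2 could be trained and evaluated on such numeric-encoded data; the feature matrix, hyperparameters and train/test split are placeholders rather than the authors' exact configuration:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(0)
X = rng.random((1000, 8))           # stand-in for the numeric-encoded attributes
y = rng.integers(0, 2, 1000)        # stand-in for the binary status target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
    "ANN": MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(name, accuracy_score(y_test, pred))
    print(confusion_matrix(y_test, pred))   # rows: true class, cols: predicted class
```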
Fig. 2. Confusion matrix

Table 2. Comparison of model

Model   Accuracy (in %)   Precision (in %)   Recall (in %)
KNN     77                67                 62
SVM     77                38                 50
ANN     80                73                 55
$$\mathrm{Accuracy} = \frac{TP + TN}{\text{Total Population}} \tag{1}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}$$
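A small sketch of Eqs. (1)–(3) applied to a 2 × 2 confusion matrix is shown below; it assumes scikit-learn's [[TN, FP], [FN, TP]] layout, and the counts are illustrative only:

```python
import numpy as np

def metrics_from_confusion(cm: np.ndarray):
    """Compute accuracy, precision and recall (Eqs. 1-3) from a 2x2
    confusion matrix laid out as [[TN, FP], [FN, TP]]."""
    tn, fp, fn, tp = cm.ravel()
    accuracy = (tp + tn) / cm.sum()
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

cm = np.array([[1500, 300],
               [400, 1600]])        # illustrative counts only
print(metrics_from_confusion(cm))
```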
We have considered status as our target attribute from the CDC data. The original CDC data contained 23 different values in the status attribute of illness, due to which we were unable to apply deep learning techniques directly. The target attribute is expected to be binary, which results in better prediction. So we replaced the multiple values of status with the most occurring value in the status attribute. Further, we replaced those values with binary values, making the target class a binary target class. We represented the model accuracy using the receiver operating characteristic (ROC) curve shown in Fig. 3.
5 Conclusion In this research, we have implemented data mining and deep learning techniques to analyse the status of foodborne disease outbreaks using the CDC data. Our results predict
Fig. 3. ROC curve of model
the confirmation or suspicion of illness on consuming particular foods at particular locations. We have preprocessed the CDC data by replacing the empty values with the most frequent values in each attribute and converted the complete data into numeric form, which makes it easy to train deep learning models. We have designed a model which predicts the illness state of a particular person based on either consuming only a particular food, consuming only at a particular location, or consuming a particular food at a particular location. Further, we have predicted the status of illness of a person from the CDC data based on the pattern of consuming food at a particular location. In our model, we have focused mainly on food and location as they majorly influence the disease outbreak, but other aspects can also be looked at in further studies. We have trained the preprocessed CDC data with deep learning models such as KNN, SVM, and ANN. The confusion matrices generated through these models help to evaluate the prediction model. Better prevention techniques for different types of foods and illnesses can be created using the knowledge gained from our model. Customized training methods for food safety can be designed with the above knowledge for all stakeholders.
References 1. M. Thakur, S. Olafsson, J.-S. Lee, C.R. Hurburgh, Data mining for recognizing patterns in foodborne disease outbreaks. J. Food Eng. 97(2), 213–227 (2010) 2. M. Sharma, A framework for big data analytics as a scalable systems. Int. J. Adv. Netw. Appl. (IJANA) 25, 72–82 (2015) 3. D. Doerr, K. Hu, S. Renly, S. Edlund, M. Davis, J. H. Kaufman, J. Lessler, M. Filter, A. Käsbohrer, B. Appel, Accelerating investigation of food-borne disease outbreaks using pro-active geospatial modeling of food supply chains, in Proceedings of the First ACM SIGSPATIAL International Workshop on Use of GIS in Public Health. ACM (2012), pp. 44–47 4. H. Marvin, G. Kleter, L. Frewer, S. Cope, M. Wentholt, G. Rowe, A working procedure for identifying emerging food safety issues at an early stage: implications for european and international risk management practices. Food Control 20(4), 345–356 (2009) 5. A. Regattieri, M. Gamberi, R. Manzini, Traceability of food products: general framework and experimental evidence. J. Food Eng. 81(2), 347–356 (2007)
6. FDA, Recalls, Outbreaks Emergencies. https://www.fda.gov/Food/RecallsOutbreaksEmergencies/default.htm, online. Accessed 30 Mar 2019 7. J. Kaufman, J. Lessler, A. Harry, S. Edlund, K. Hu, J. Douglas, C. Thoens, B. Appel, A. Käsbohrer, M. Filter, A likelihood-based approach to identifying contaminated food products using sales data: performance and challenges. PLoS Comput. Biol. 10(7), 1003692 (2014) 8. J. Hrůška, T. Adão, L. Pádua, P. Marques, E. Peres, A. Sousa, R. Morais, J.J. Sousa, Deep learning-based methodological approach for vineyard early disease detection using hyperspectral data, in IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium (July, 2018), pp. 9063–9066 9. T. Adão, E. Peres, L. Pádua, J. Hrůška, J.J. Sousa, R. Morais, UAS-based hyperspectral sensing methodology for continuous monitoring and early detection of vineyard anomalies, in Proceedings of the Small Unmanned Aerial Systems for Environmental Research, Vila Real, Portugal (2017), pp. 28–30 10. X. Ma, J. Geng, H. Wang, Hyperspectral image classification via contextual deep learning. EURASIP J. Image Video Process. 2015(1), 20 (2015) 11. J. Amara, B. Bouaziz, A. Algergawy et al., A deep learning-based approach for banana leaf diseases classification, in BTW (Workshops) (2017), pp. 79–88 12. J. Schmidhuber, Deep learning in neural networks: an overview. Neural Networks 61, 85–117 (2015) 13. CDC, Foodborne Illnesses and Germs, https://www.cdc.gov/foodsafety/foodborne-germs.html, online. Accessed 15 Feb 2019
Convolutional Elman Jordan Neural Network for Reconstruction and Classification Using Attention Window Sweta Kumari, S. Aravindakshan, Umangi Jain, and V. Srinivasa Chakravarthy(B) Laboratory of Computational Neuroscience, Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, IIT Madras, Chennai, Tamil Nadu, India [email protected], [email protected], [email protected], [email protected]
Abstract. In deep learning-based visual pattern recognition systems, typically the entire image is presented to the system for recognition. However, the human visual system often scans a large visual object by sequential shifts of attention, which is integrated for visual classification. Even in artificial domains, such sequential integration is particularly useful when the input image is too large. Some previous studies based on Elman and Jordan networks have explored only with fully connected layers using full image as input but not with convolutional layers using attention window as input. To this end, we present a novel recurrent neural network architecture which possesses spatiotemporal memory called Convolutional Elman Jordan Neural Network (CEJNN) to integrate the information by looking at a series of small attentional windows applied over the full image. Two variations of CEJNN with some modifications have been developed for two tasks: reconstruction and classification. The network is trained on 48 K images and tested on 10 K images of MNIST handwritten digit database for both tasks. Our experiment shows that the network captures better correlation of the spatiotemporal information by providing the result with a mean square error (MSE) of 0.012 for reconstruction task and also claiming the classification with 97.62% accuracy on the testing set. Keywords: Attention · Human visual system · Elman · Jordan · CNN · Classification · Reconstruction · LSTM
1 Introduction Human beings do not process a whole scene in its entirety at once [1]. Instead, they selectively focus on the most informative and characteristic parts of a visual scene [2–6] and combine information from different viewpoints over time to build up an internal representation of the scene [7]. The goal of this paper is to imitate how visual image of a scene is constructed by the human visual system and testing this model for image reconstruction and classification. The retina is designed in such a way that high acuity color vision is rendered only in the central 2 degrees of the visual field attended © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_20
by the fovea. Outside of this central field, the image formed is of very low resolution. For instance, in the visual cortex, the space allocated for representation of the central vision is three to six times larger than the space allocated for representation of peripheral vision [8]. This way, the amount of space needed at any instant for representation of the visual field is much smaller compared to when the whole visual field is covered at high resolution. Despite the tiny visual window used by the visual system, high acuity color images of the whole scene are rendered at any instant. This is because even when we are scanning a single object, our eyes are constantly moving about the object image. These eye movements make sure that all the parts of the object are covered by the central 2 degrees of our visual field. Our visual images are constructed by repeated foveation of different regions of the scene, using short-term memory and the predictive properties of the visual system to connect all the regions that are foveated. Different approaches have been tried out in the deep learning community to mimic human visual attention [9]. Such sequential problems in deep learning have been synonymous with recurrent neural networks (RNNs) [10]. RNNs attentively memorize longer sequences in a way that humans do by recursively composing each input with its previous memory, until the meaning of the whole series of inputs has been derived [11]. In practice, however, the basic RNN architectures are usually difficult to train, preclude parallelization [12], and have limited span of attention [13], and the outputs quickly degenerate to random guessing as sequence length grows [14]. Even more elaborate RNN architectures, like long short-term memory (LSTM) [15] and gated recurrent unit (GRU) [16] use a lot of memory to store partial results for their multiple cell gates [12] and the effective history stored in these models is significantly lower than simple convolutional networks like temporal convolutional network (TCN) [14, 17]. In this paper, we develop a Convolutional Elman Jordan Neural Network which addresses these limitations while replicating foveation, memory, and the predictive properties of the human visual system.
2 Model Description
In this work, we propose a novel neural architecture to construct a long spatiotemporal memory by integrating random, sequential shifts of attention windows of size S × S over the image. We call this network the Convolutional Elman Jordan Neural Network. Two variations of this architecture are designed for two tasks: reconstruction and classification of the image. We define the number of time steps as the total number of attention windows. In our network, each convolutional layer saves its previous output at time step 't − 1' in a context layer, which gets convolved in the same convolutional layer at time step 't', like an Elman network [18], as shown in Eq. 2. Only the last convolutional layer saves its previous output at time step 't − 1' in a context layer which gets convolved in the first convolutional layer at time step 't', like a Jordan network [19], as shown in Eq. 1. The content of the context layers gets reset for every new image. The Elman connections are designed to preserve the spatial and temporal memories locally in each convolutional layer, while the Jordan connection preserves the combined global spatial and temporal memory by connecting the last convolutional layer to the first convolutional layer.
2.1 Reconstruction The network takes an input of the region in the image which is covered inside the attention window of size 7 × 7. The network then learns the integration of such attentional windows with time to produce the full image at last time step. The proposed convolutional neural network (CNN) with Elman and Jordan loops for reconstruction task is able to answer two questions together: ‘what to write’ meaning the content of the attention window and ‘where to write’ meaning the location of the attention window. The network of CEJNN for reconstruction of the full image uses 30 attention windows (i.e., total 30 time steps). The network has a simple architecture with two convolutional layers and two local response normalization layers (Fig. 1).
Fig. 1. Architecture of the CEJNN for reconstruction, where input is one attention window of size 7 × 7 over the image whose location changes with the time randomly shown in yellow color and the output is full image which has been produced after integrating all of the attentions with respect to the time step
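The masked input described above and shown in Fig. 1 can be constructed as in the following sketch; the random digit stands in for an MNIST image and is an assumption for illustration:

```python
import numpy as np

def masked_glimpse(image: np.ndarray, top: int, left: int, size: int = 7) -> np.ndarray:
    """Return a 28 x 28 input in which only a size x size attention window keeps
    the image content and everything outside the window is black (zero)."""
    glimpse = np.zeros_like(image)
    glimpse[top:top + size, left:left + size] = image[top:top + size, left:left + size]
    return glimpse

rng = np.random.default_rng(0)
digit = rng.random((28, 28)).astype(np.float32)   # stand-in for an MNIST digit
r, c = rng.integers(0, 28 - 7, size=2)            # random window location
x_t = masked_glimpse(digit, r, c, size=7)         # network input at time step t
```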
The first hidden layer, with both the Elman and Jordan loops, is computed using Eq. 1, and the remaining hidden layers, with only an Elman loop, are computed using Eq. 2, as shown below.

$$H_t^i = f\left(W_{xi} * \chi_t + W_{ii} * H_{t-1}^i + W_{ki} * Y_{t-1}^c + b_i\right) \tag{1}$$

$$H_t^j = f\left(W_{ij} * H_t^{j-1} + W_{jj} * H_{t-1}^j + b_j\right) \tag{2}$$
$$Y_t = g\left(W_{jk} * H_t^j + W_{kk} * Y_{t-1} + b_k\right) \tag{3}$$
where i denotes the 1st hidden layer, j the 2nd hidden layer, and k the output layer. Here, $\chi_t$ is the input image of size 28 × 28, in which only the attention window of size 7 × 7 carries information and the region outside the attention window is black, as shown in Fig. 1. H is used to denote the output feature maps of the convolutional and deconvolutional hidden layers. $W_{(.)}$ and $b_{(.)}$ are used to denote the weights and biases of the network. f and g indicate activation functions. $Y^c_{t-1}$ is a 32-times copy of $Y_{t-1}$, which gets convolved into the first convolutional layer. The ReLU [20] and sigmoid activation functions have been used for each hidden layer and the output layer, respectively. The network is trained by calculating the mean square loss between the integrated reconstruction output from the network and the corresponding full image given in the dataset. The backpropagation [21] algorithm with the mini-batch Adam optimizer [22] is used to minimize the loss.

2.2 Classification
The network of CEJNN for classification takes an input of size 28 × 28, where the inside of a 12 × 12 attention window carries information from the image and the outside of the attention window does not. The network outputs classification probabilities at each attention window for that image. The network design consists of four convolutional layers, one maxpool layer, three local response normalization layers, and two fully connected layers (shown in Fig. 2). The same set of equations shown in Sect. 2.1 has been used to compute the feature maps of the hidden layer states, the only difference being that $Y^c_{t-1}$ is equal to $Y_{t-1}$. The ReLU and softmax activation functions have been used for the hidden layers and the output layer, respectively. The cross-entropy loss is computed between the predicted class probabilities from the network and the actual class probabilities available in the dataset. The network is trained by minimizing the loss using a backpropagation algorithm with the mini-batch Adam optimizer.
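A compact PyTorch sketch of the recurrences in Eqs. (1)–(3) for the reconstruction variant is given below; the layer widths, kernel sizes and padding are illustrative assumptions and do not reproduce the exact architecture of Fig. 1:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CEJNNReconstructionSketch(nn.Module):
    """Two-convolutional-layer sketch of the Elman/Jordan recurrences in
    Eqs. (1)-(3); layer sizes are illustrative, not the paper's."""
    def __init__(self, channels=32):
        super().__init__()
        # Eq. (1): input glimpse + Elman context (layer's own previous output)
        #          + Jordan context (previous network output copied to `channels` maps)
        self.conv1 = nn.Conv2d(1 + channels + channels, channels, 3, padding=1)
        # Eqs. (2)/(3): output layer with its own previous-output context
        self.conv2 = nn.Conv2d(channels + 1, 1, 3, padding=1)
        self.channels = channels

    def forward(self, glimpses):
        # glimpses: (T, B, 1, 28, 28) sequence of masked attention windows
        T, B, _, H, W = glimpses.shape
        h1 = torch.zeros(B, self.channels, H, W)   # Elman context of layer 1
        y = torch.zeros(B, 1, H, W)                # previous output (Jordan context)
        for t in range(T):
            y_c = y.repeat(1, self.channels, 1, 1)              # "32 copies" of Y_{t-1}
            h1 = F.relu(self.conv1(torch.cat([glimpses[t], h1, y_c], dim=1)))
            y = torch.sigmoid(self.conv2(torch.cat([h1, y], dim=1)))
        return y                                                 # integrated image

model = CEJNNReconstructionSketch()
seq = torch.rand(30, 4, 1, 28, 28)      # 30 attention windows, batch of 4
out = model(seq)                        # (4, 1, 28, 28) reconstruction
```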
3 Results and Discussions The experiments were conducted on MNIST handwritten digit database which has 10 classes. The database contains 48 K images in the training set, 12 K images in the validation set, and 10 K images in the testing set for both tasks: classification and reconstruction. The reconstruction and classification networks are trained for 40 and 20 epochs respectively. The weights of the entire network are initialized using random normal initializer. For training both the networks, the value of learning rate, regularization
Fig. 2. Architecture of CEJNN for classification, where input is attention window of size 12 × 12 over the image shown in yellow colored and output is class probabilities of that attention window
factor, and batch size is chosen to be 0.0001, 0.00001, and 32, respectively. We also check the length of the preserved memory for both tasks by creating four types of testing sets with four different ratios: 0.4, 0.5, 0.6, and 1. These ratios define the number of attention windows with actual values in each pixel (information from the image is passed) taken from the image and the number of attention windows with 0 values in each pixel (no information from the image is passed). For example, the testing set with ratio 0.4 has actual attention windows in 40% of the total time steps (i.e., 12 time steps) and blank attention windows in 60% of the total time steps (i.e., 18 time steps). All of the four testing results have been reported below for both the experiments. 3.1 Reconstruction Results The reconstruction results show that validation loss reduces as the training loss reduces with subsequent epochs and the absolute validation loss on the last epoch is 0.012 (Fig. 3a). The loss is almost constant from the time step where blank attention windows are fed as inputs till the last time step in all of the four testing sets with ratio (0.4, 0.5, 0.6, and 1) which has 40, 50, 60, and 100% actual attention windows and 60, 50, 40, and 0% blank attention windows, respectively, in Fig. 3b. The reconstructed output with ratio 1 at each attention window or time step is shown in Fig. 4. 3.2 Classification Results In case of classification, the accuracy of testing sets with ratio less than 1 (i.e., 0.4, 0.5, and 0.6) has been found to be maintained more than 95% (Fig. 6a) and the accuracy of testing set with ratio 1 has been found to be 97.62%, which is shown in Table 1. We also have shown the classification speed during testing, which is the average minimum number of attention windows required for the classification of the handwritten digit (Fig. 6b). As the training of the weights happens with epoch, the average number of required attention windows reduces to 1.2 to output the correct classification in testing
Fig. 3. a Absolute training and validation loss versus Epochs, b absolute test loss (y-axis) of all four testing data with 1, 0.6, 0.5, and 0.4 ratio at each of the attention window location (x-axis) in last (40th) epoch is shown in this graph
set. The average cross-entropy loss (absolute) and the accuracy (in percentage) of the training and validation sets are shown in Fig. 5.
4 Conclusions The model proposed in this paper emulates the construction of visual field as done in the human visual system. It executes foveation with the help of attention windows. It
Fig. 4. Reconstructed output images with ratio 1, where all 30 actual random attention windows (size 7 × 7) and no blank windows were present in the testing set. The attention window, shown as a red box, is the part of the input (size 28 × 28) that carries information at time step 't', and the full image shown in each cell is the integrated output of the network up to that time step
Fig. 5. a Training and validation accuracy in percentage versus Epochs, b training and validation loss (absolute) versus Epochs
uses a recurrent neural network to memorize spatial as well as temporal features, which is demonstrated by integrating all attention windows to reconstruct the entire image with a mean squared loss of 0.012. The proposed model is also capable of classifying digits using attention windows with an accuracy of 97.62%. An average of 1.2 attention windows is required for making a correct classification prediction. We conclude that the model shows the required predictive properties. A CNN with three convolutional layers gives 99.6% accuracy on MNIST by looking at the whole image [23], whereas CEJNN gives a slightly lower accuracy (97.62%) by looking at only a 12 × 12 attention window over the image. In future work, we can test our model on other classification datasets. This model can be modified and used for other applications such as multiple object detection, text recognition, action recognition, and 3D world reconstruction. So far, the size and number of attention windows are manually fixed by us, which could be further extended to allow the network to predict the optimal trajectory and size of the attention windows over the input images.
Fig. 6. a Accuracy in percentage for all of the four testing sets with 1, 0.6, 0.5, and 0.4 ratios in last epoch versus attention locations (time steps), b average number of required attention windows versus Epochs. So if the required attention window is lower, classification speed is higher and vice versa

Table 1. Test accuracy of each testing set with their corresponding ratios 1, 0.6, 0.5, and 0.4 is listed in the table where the inputs of 12 × 12 attention window size were presented

Method                        Test accuracy (%)
CEJNN, 12 × 12 (ratio 1)      97.62
CEJNN, 12 × 12 (ratio 0.6)    96.33
CEJNN, 12 × 12 (ratio 0.5)    95.73
CEJNN, 12 × 12 (ratio 0.4)    95.15
References 1. V. Mnih, N. Heess, A. Graves, Recurrent models of visual attention, in Advances in neural information processing systems (2014), pp. 2204–2212 2. J.E. Hoffman, C.W. Eriksen, Temporal and spatial characteristics of selective coding from visual displays temporal and spatial characteristics of selective encoding from visual displays*. Percept. Psychophys. 12, 201–204 (1972). https://doi.org/10.3758/BF03212870 3. A.M. Treisman, G. Gelade, A feature-integration theory of attention. Cogn. Psychol. 12, 97–136 (1980) 4. C. Koch, S. Ullman, Shifts in selective visual attention: towards the underlying neural circuitry, in Matters of intelligence. Springer (1987), pp. 115–141 5. C.E. Connor, H.E. Egeth, S. Yantis, Visual attention: bottom-up versus top-down (Curr, Biol, 2004) 6. Q. Lai, W. Wang, S. Khan, et al., Human versus machine attention in neural networks: a comparative study (2019) 7. L. Itti, C. Koch, E. Niebur, A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20, 1254–1259 (1998) 8. M.J. Tovée, An introduction to the visual system. Cambridge University Press (1996) 9. H.T. Siegelmann, E.D. Sontag, On the computational power of neural nets. J. Comput. Syst. Sci. 50, 132–150 (1995). https://doi.org/10.1006/jcss.1995.1013
10. D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back-propagating errors. Nature 323, 533–536 (1986). https://doi.org/10.1038/323533a0 11. J. Cheng, L. Dong, M. Lapata, Long short-term memory-networks for machine reading (2013) 12. A. Vaswani, Attention is all you need (2017) 13. B. Singh, T.K. Marks, M. Jones, O. Tuzel, A multi-stream bi-directional recurrent neural network for fine-grained action detection 1961–1970 (1961) 14. S. Bai, J.Z. Kolter, V. Koltun, An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv Prepr arXiv180301271 (2018) 15. S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9, 1735–1780 (1997) 16. J. Chung, Gated feedback recurrent neural networks 37 (2015) 17. C. Lea, M.D.F. Ren, A. Reiter, G.D. Hager, Temporal convolutional networks for action segmentation and detection, pp. 156–165 18. J.L. Elman, Distributed representations, simple recurrent networks, and grammatical structure. Mach. Learn. 7, 195–225 (1991). https://doi.org/10.1023/A:1022699029236 19. M.I. Jordan, Serial order: a parallel distributed processing approach (University of California, San Diego Inst Cogn Sci, 1986), p. 8604 20. R.H. Hahnloser, R. Sarpeshkar, M.A. Mahowald, R.J. Douglas, H.S. Seung, Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature 442, 947–951 (2000) 21. Y. LeCun, B. Boser, J.S. Denker, et al. Lecun-89E. Neural Comput (1989) 22. D.P. Kingma, J. Ba, Adam: a method for stochastic optimization (2014), pp. 1–15 23. Recognition of handwritten digits using artificial neural networks
A Multimodal Biometric System Based on Finger Knuckle Print, Fingerprint, and Palmprint Traits Chander Kant(B) and Sheetal Chaudhary Department of Computer Science and Applications, K.U., Kurukshetra, Haryana, India {ckverma,sheetalkuk}@rediffmail.com
Abstract. Biometric systems are nowadays gaining significant attention due to their ability to uniquely authenticate a person on the basis of his/her unique personal body features. However, most of the deployed biometric systems still use biometric information from a single biometric trait for authentication or recognition purposes. It is well known that biometric systems have to deal with large population coverage, noisy sensor data, different deployment platforms, susceptibility to spoof attacks, and challenging recognition performance requirements. Unimodal biometric systems find it difficult to overcome these problems. Multimodal biometric systems are able to meet these demands by integrating information from multiple biometric traits, multiple algorithms, multiple sensors, multiple instances, and multiple samples. Hence, a new multimodal biometric system is proposed in this paper which combines finger knuckle print, fingerprint, and palmprint at match-score level. The resulting match score is used to state whether the person is genuine or an imposter. The experimental results illustrate the effectiveness of the proposed multimodal biometric system in terms of False-Accept-Rate (FAR), False-Reject-Rate (FRR), and Genuine-Accept-Rate (GAR). Keywords: Fingerprint · Finger knuckle print · Palmprint · Biometric recognition system · Biometric template · Fusion at match-score level
1 Introduction Biometrics is the science of quantifying personal body features, for example, iris, fingerprint, face, finger knuckle print, palmprint, retina, hand geometry, voice, gait, signature, etc., for the purpose of personal authentication or an individual's identity recognition [1]. Biometric technology is also getting significant attention in the field of security. This is because the necessity for techniques which reliably authenticate a user has drastically increased owing to growing security demands and the rapid development of networking. Uniquely identifying a person with full assurance is the most critical issue in many real-time applications (e.g., banking, airport passenger clearance, access control systems, etc.) [2].
The majority of biometric-based systems installed in real-time applications are unibiometric/unimodal, i.e., they use data from a single biometric source for the purpose of authentication. Unimodal biometric systems are susceptible to various problems (e.g., noisy data, intra-class variation, inter-class similarity, non-universality and spoofing attacks, etc.) which cause noticeably high FAR and FRR, restricted discrimination ability, and lack of permanence. These problems faced by unimodal biometric systems could be avoided by employing multimodal biometric systems which incorporate multiple sources of biometric information to establish the unique identity of a person. Depending on the type of biometric information available, multibiometric systems could be divided into various categories, i.e., multisensor system, multialgorithm system, multiinstance system, multisample system, multimodal system, and hybrid system [3]. Fusion could be performed at four levels (i.e., sensor, feature extraction, matcher, and decision) in multimodal biometric systems [4]. At sensor level, raw data acquired from the biometric sensors is integrated. The problem of noisy data is faced at this level. At feature-extraction level, feature sets which correspond to multiple biometric traits are consolidated. The problems of incompatibility among unlike feature representations and the need for a good quality classifier for the high-dimensional concatenated feature vector are faced at this level. At match-score level, matching scores produced by different biometric systems are integrated. Fusion performed at matching score level is a widely accepted choice in multimodal biometric systems because it is comparatively simple to access and fuse the match scores produced by different matcher modules. Fusion at match-score level could be further categorized into three types: transformation-based, density-based, and classifier-based score-level fusion. At decision-level fusion, the results output by individual biometric systems are combined. The architecture of a multimodal biometric system performing fusion at the decision level is somewhat loosely coupled, as each subsystem performs like an individual biometric system [5]. This paper presents a new multimodal biometric authentication system which fuses finger knuckle print, fingerprint, and palmprint at match-score level. Experimental results showed a considerable improvement in the recognition performance rate as compared to single biometric traits. The remaining paper is structured as follows. The related works for multimodal biometric systems are briefly discussed in Sect. 2. The proposed work is described in Sect. 3 along with the individual biometric systems for the finger knuckle print, fingerprint, and palmprint traits, respectively, and the new fusion methodology for them. Experimental results are shown in Sect. 4. Section 5 presents the conclusion.
2 Related Works Many related research works have been discussed in the literature for unique personal verification/identification based on multimodal biometrics. This section describes a brief overview of some of these research works. Son and Lee [6] used a Daubechies wavelet transform to generate features from face and iris images. Feature concatenation was used to generate a combined feature-vector and the Euclidean distance was used to produce the matching score between extracted feature vectors. Eskandari et al. [7] used local and global feature-extraction methods to extract the feature sets from iris and face images. A transformation-based and a classifier-based
score-level fusion methodologies were employed to combine and classify the match scores. Rattani et al. [8] calculated scale invariant feature transform (SIFT) features from chimeric face and iris images and performed fusion of resulting feature vectors at the feature level. The match score is obtained using Euclidean distance by comparing the SIFT features between two feature vectors. Wang et al. [9] integrated the match scores of iris and face biometrics as a twodimensional feature-vector. Linear discriminant analysis (LDA) and neural network based on radial basis function (NNRBF) are used as classifiers. Lin et al. [10] generated the feature sets using speeded up robust features (SURF) and principal component analysis (PCA) methodologies. The feature classification stage consists of two steps where an algorithm k-means was used first to gather local descriptors and then local and global similarities were integrated to classify the images of face. The ORL face database was used for the evaluation of recognition performance. Nagesh kumar et al. [11] proposed a multimodal-based biometric system using face and palmprint biometric traits. The final decision is produced by performing fusion of resulting match scores at the matching-score level. The results have shown a considerable improvement as compared to individual biometric systems. Zhang et al. [12] used scale invariant feature transform (SIFT) for video-based human action recognition. Local appearance descriptors, local motion descriptors, and motion boundary histograms were calculated using SIFT flows. The recognition performance was compared using different classifier like bag of words approach, support vector machines, linear, and nonlinear. The proposed approach based on key points produced considerably improved results as compared to existing systems. Liau and Isa [13] proposed a multimodal biometric system using face and iris biometrics and performed fusion at match-score level using support vector machine(SVM) as classifier. In this paper, they proposed a feature-selection methodology to select an optimal subset of features. They also proposed a calculation-based speed up methodology for SVM. Results have shown that the proposed feature-selection methodology is capable to enhance the accuracy based upon total error rate (TER) and SVM-based fusion methodology also generated very considerable results. Azeem et al. [14] used hexagonal-scale invariant feature transform (H-SIFT) to extract feature set from face biometric trait. Hexagonal image processing highlights the areas with low contrast on face which constitutes the local features and also provides sharp edge response. The matching process is carried out using Fisher canonical correlation analysis (FCCA) that boosted the recognition performance accuracy. Experiments are performed on AR, ORL, Yale B, and FERET datasets which exposed better performance in terms of feature extraction.
3 Proposed Work An individual biometric trait is not always enough to fulfill the recognition performance requirements imposed by many large-scale authentication systems. Multimodal biometric authentication systems are often more reliable because of the presence of multiple, independent biometric traits and because they provide anti-spoofing measures. They
eradicate the problems encountered with unibiometric systems by combining the information obtained from multiple biometric sources. With this scenario, it also becomes difficult for intruders to spoof multiple traits concurrently. In this paper, a novel combination for multimodal biometric system has been proposed including finger knuckle print, fingerprint, and palmprint biometric traits. 3.1 Finger Knuckle Print (FKP) Feature Set Extraction Figure 1 represents the steps for extraction of finger knuckle print feature set. FKP recognition is based upon the skin outline and the skin creases found on the outer surface of finger knuckle. The various steps involved in finger knuckle print feature set extraction are image acquisition, image ROI extraction, image transformation, image enhancement, and feature extraction [15, 16].
Fig. 1. Extraction of finger knuckle print features
3.2 Fingerprint Feature Set Extraction Figure 2 represents the feature set extraction steps for an image of fingerprint. The fingerprint pattern consists of the ridges and valleys that are present on the surface of the finger. Ridges are the lines that create fingerprint pattern and valleys or furrows are the spaces between the ridges. The major steps involved in fingerprint feature set extraction are image acquisition, image enhancement, extraction of ridges, thinning of ridges, and minutiae points extraction [17, 18]. 3.3 Palmprint Feature Set Extraction Figure 3 represents the steps for extraction of palmprint feature set. Palmprint is reliable physical biometric trait because of its unique and stable features. Area of the palm is much wider than finger and thus palmprint is likely to be more distinct than fingerprint. A person is recognized on the basis of principal lines, ridges, and wrinkles that are present on the palm surface. The structure of these lines remain stable and unaltered during an individual’s entire life [19, 20].
Fig. 2. Extraction of fingerprint features
Fig. 3. Extraction of palmprint features
3.4 Proposed Approach Figure 4 represents the architecture of the proposed multimodal biometric recognition system consolidating finger knuckle print, fingerprint, and palmprint at match-score
level. Initially, preprocessing is applied to extract the region of interest (ROI) from each biometric image captured with appropriate biometric sensors. Each ROI is individually processed by corresponding feature extraction modules to generate query templates. Then, these query templates are compared with corresponding stored templates to generate individual matching scores depending upon the similarity or dissimilarity between them. These matching scores are given to the fusion module which combines them to generate a final and unique matching score. Depending upon this final matching score, decision module provides a final decision whether the user is genuine or imposter/ identified or rejected.
Fig. 4. Structural design of proposed approach
3.5 Proposed Algorithm The steps involved in the algorithm for the proposed multimodal biometric system are described as follows:
1. Capture images of finger knuckle print, fingerprint, and palmprint using appropriate sensors.
2. Generate feature set or template from finger knuckle print image.
3. Generate feature set or template from fingerprint image.
4. Generate feature set or template from palmprint image.
5. Compare finger knuckle print query template with stored template in database to generate matching score, i.e., M_FKP.
6. Compare fingerprint query template with stored template in database to generate matching score, i.e., M_Finger.
7. Compare palmprint query template with stored template in database to generate matching score, i.e., M_Palm.
8. Normalize the resulting match scores to convert them into normalized match scores.
9. Fuse the normalized scores of finger knuckle print, fingerprint, and palmprint to generate the final matching score using the sum rule.
10. Compare the fused matching score against a certain threshold value to declare the final decision as a genuine or an imposter person.

3.6 Fusion at Match-Score Level Fusion performed at the matching score level is based upon the integration of the matching scores produced by the individual matchers involved in the biometric system. Before integrating the scores, it is essential to normalize them to bring them to a similar numerical range [0, 1] [21]. Let M_FKP, M_Finger, and M_Palm be the matching scores output by the finger knuckle print, fingerprint, and palmprint biometric systems, respectively. The resulting matching scores are heterogeneous as they do not belong to the same numerical range. Let N_FKP, N_Finger, and N_Palm be the corresponding normalized matching scores obtained after applying the min–max normalization technique. These normalized scores are given below:

$N_{FKP} = \dfrac{M_{FKP} - \min_{FKP}}{\max_{FKP} - \min_{FKP}}, \quad N_{Finger} = \dfrac{M_{Finger} - \min_{Finger}}{\max_{Finger} - \min_{Finger}}, \quad N_{Palm} = \dfrac{M_{Palm} - \min_{Palm}}{\max_{Palm} - \min_{Palm}}$   (1)
where $[\min_{FKP}, \max_{FKP}]$, $[\min_{Finger}, \max_{Finger}]$, and $[\min_{Palm}, \max_{Palm}]$ are the minimum and maximum scores for the finger knuckle print, fingerprint, and palmprint biometric traits, respectively. The normalized match scores of finger knuckle print, fingerprint, and palmprint are integrated using the sum rule [22] to produce the final match score ($MS_{Final}$) as given below:

$MS_{Final} = N_{FKP} + N_{Finger} + N_{Palm}$   (2)
The final match score $MS_{Final}$ is used by the decision module to declare the person as genuine or an imposter depending upon a comparison made against a fixed threshold value.
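As an illustration of the fusion scheme just described, the following sketch applies min–max normalization (Eq. 1), the sum rule (Eq. 2), and a threshold decision. It is only a hedged example: the score values, ranges, threshold, and function names are invented for illustration and are not taken from the paper.

```python
def min_max_normalize(score, s_min, s_max):
    """Eq. 1: map a raw matcher score into the [0, 1] range."""
    return (score - s_min) / (s_max - s_min)

def fuse_and_decide(m_fkp, m_finger, m_palm, ranges, threshold):
    """Eq. 2: sum-rule fusion of the normalized scores, then a threshold decision."""
    n_fkp = min_max_normalize(m_fkp, *ranges["fkp"])
    n_finger = min_max_normalize(m_finger, *ranges["finger"])
    n_palm = min_max_normalize(m_palm, *ranges["palm"])
    ms_final = n_fkp + n_finger + n_palm              # final fused match score
    return "genuine" if ms_final >= threshold else "imposter"

# Hypothetical per-trait score ranges and query scores, only to show the flow
ranges = {"fkp": (10.0, 90.0), "finger": (0.0, 250.0), "palm": (5.0, 60.0)}
print(fuse_and_decide(72.0, 180.0, 41.0, ranges, threshold=1.8))
```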
4 Results and Discussion 4.1 Databases This section discusses the performance evaluation of proposed approach. It is implemented in MATLAB using the images of finger knuckle print, fingerprint, and palmprint
obtained from PolyU database [23–25]. Figures 5, 6, and 7 show the sample images of finger knuckle print, fingerprint, and palmprint biometric traits.
Fig. 5. Sample images for finger knuckle print
Fig. 6. Sample images for fingerprint
Fig. 7. Sample images for palmprint
4.2 Results The result analysis is based upon the performance measures shown below in Eqs. 3–6, where False-Accept-Rate (FAR) is the proportion of false acceptances, False-Reject-Rate (FRR) is the proportion of false rejections, and Genuine-Accept-Rate (GAR) is the percentage of genuine user acceptances.

FAR(%) = (No. of False Acceptances / Total Imposter Matchings) × 100%   (3)
FRR(%) = (No. of False Rejections / Total Genuine Matchings) × 100%   (4)
GAR(%) = 100 − FRR(%)   (5)
Accuracy = 100 − (FAR + FRR)/2   (6)
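These measures follow directly from counts of accepted and rejected attempts. The snippet below is a small, generic illustration of Eqs. 3–6; the counts used are hypothetical and not results from the paper.

```python
def evaluate(false_accepts, impostor_attempts, false_rejects, genuine_attempts):
    """Compute the performance measures of Eqs. 3-6 from raw counts."""
    far = false_accepts / impostor_attempts * 100.0    # Eq. 3
    frr = false_rejects / genuine_attempts * 100.0     # Eq. 4
    gar = 100.0 - frr                                   # Eq. 5
    accuracy = 100.0 - (far + frr) / 2.0                # Eq. 6
    return far, frr, gar, accuracy

# Hypothetical counts, only to show the calculation
far, frr, gar, acc = evaluate(12, 1000, 18, 1000)
print(f"FAR={far:.2f}%  FRR={frr:.2f}%  GAR={gar:.2f}%  Accuracy={acc:.2f}%")
```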
The receiver operating characteristic (ROC) curves of the unimodal biometric systems (i.e., finger knuckle print, fingerprint and palmprint) and the proposed multimodal system are depicted in Fig. 8, which show a considerably improved performance of the proposed system over the unimodal systems.
Fig. 8. Receiver operating characteristic (ROC) curves for proposed system
It is clear from Table 1 and Fig. 9 that the proposed multimodal biometric system is more efficient, and also the overall accuracy of the proposed system is higher than the individual finger knuckle print, fingerprint, and palmprint biometric recognition systems.

Table 1. Comparison of accuracy of biometric systems

Biometric system                                Accuracy (%)
Proposed multimodal system (FKP+Finger+Palm)    98.87
Finger knuckle print (FKP)                      98.69
Fingerprint                                     98.46
Palmprint                                       98.24
5 Conclusion In this paper, a new multimodal biometric recognition system using three biometric traits, i.e., finger knuckle print, fingerprint, and palmprint has been proposed. It consolidates
Fig. 9. Comparing accuracy of all systems with bar chart
the individual matching scores of the three biometric traits using the sum rule fusion methodology. The min–max normalization procedure has been used to normalize the matching scores of the involved biometric traits. The results obtained have been studied for the improved recognition performance of the proposed system. It shows considerable recognition performance with an ensured high security level and also outperforms unimodal biometric systems. Future work could focus on integrating liveness detection with the proposed multimodal biometric system.
References 1. A.K. Jain, A. Ross, S. Prabhakar, An introduction to biometric recognition. IEEE Trans. Circuits Syst. Video Technol. 14, 4–20 (2004) 2. A.K. Jain, A. Ross, S. Pankanti, Biometrics: A tool for information security. IEEE Trans. Inf. Foren. Secur. 1(2) (2006) 3. A.K. Jain, A. Ross, Multibiometric systems. Commun. ACM 47, 34–40 (2004). (Special Issue on Multimodal Interfaces) 4. C. Sanderson, K.K. Paliwal, Information Fusion and Person Verification Using Speech and Face Information. IDIAP Research Report 02-33 (2002) 5. A. Ross, A.K. Jain, Information fusion in biometrics. Pattern Recogn. Lett. 24(13), 2115–2125 (2003) 6. B. Son, Y. Lee, Biometric authentication system using reduced joint feature vector of iris and face, in Audio- and Video-Based Biometric Person Authentication Lecture Notes in Computer Science Vol. 3546 (Springer, Berlin/Heidelberg, 2005), pp. 513–522 7. M. Eskandari, A. Toygar, H. Demirel, A new approach for face-iris multimodal biometric recognition using score fusion. Int. J. Pattern Recogn. Artif. Intell. 27, 1356004 (2013) 8. A. Rattani, M. Tistarelli, Robust multi-modal and multi-unit feature level fusion of face and iris biometrics, in Advances in Biometrics, vol. 5558, Lecture Notes in Computer Science, ed. by M. Tistarelli, M. Nixon (Springer, Berlin/Heidelberg, 2009), pp. 960–969 9. Y. Wang, T. Tan, A.K. Jain, Combining face and iris biometrics for identity verification, in 4th International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA’03) (Springer, Berlin, Heidelberg, 2003), pp. 805–813 10. S.D. Lin, B.F Liu, J.H. Lin., Combining speeded-up robust features with principal component analysis in face recognition system. Int. J. Innov. Comput. Inf. Control 8(12), 8545–8556 (2012) 11. V.V.M. Nagesh kumar, P. Mahesh, M. Swamy, An efficient secure multimodal biometric fusion using palmprint and face image. Int. J. Comput. Sci. 1, 49–53 (2009)
12. J.T. Zhang, A.C. Tsoi, S.L. Lo, Scale invariant feature transform flow trajectory approach with applications to human action recognition, in Proceedings of International Joint Conference on Neural Networks, pp. 1197–1204 (2014) 13. H.F. Liau, D. Isa, Feature selection for support vector machine-based face-iris multi-modal biometric system. Exp. Syst. Appl. 38(9), 11105–11111 (2011) 14. A. Azeem, M. Sharif, J.H. Shah, M. Raza, Hexagonal scale invariant feature transform (HSIFT) for facial feature extraction. J. Appl. Res. Technol. 13(3), 402–408 (2015) 15. L. Zhang, L. Zhang, D. Zhang, H. Zhu, Online finger-knuckle-print verification for personal authentication. J. Pattern Recogn. 43(7), 2560–2571 (2010) 16. A. Kumar, Y. Zhou, Personal identification using finger knuckle orientation features. Electron. Lett. 45(20), 1023–1025 (2009) 17. L. Hong, Y. Wan, A.K. Jain, Fingerprint image enhancement: Algorithm and performance evaluation. IEEE Trans. Patt. Anal. Machine Intell. 20, 777–789 (1998) 18. A.K. Jain, S. Prabhakar, L. Hong, A multichannel approach to fingerprint classification. PAMI 21(4), 348–359 (1999) 19. G. Lu, D. Zhang, K. Wang, Palmprint recognition using Eigenpalm features. Pattern Recogn. Lett. 24(9–10), 1463–1467 (2003) 20. M. Behera, V.K. Govindan, Palmprint authentication using PCA technique. Int. J. Comput. Sci. Inf. Technol. 5(3), 3638–3640 (2014) 21. A.K. Jain, K. Nandakumar, A. Ross, Score normalization in multimodal biometric systems. J. Pattern Recogn. Soc. 38(12), 2270–2285 (2005) 22. S.C. Dass, K. Nandakumar, A.K. Jain, A principal approach to score level fusion in Multimodal Biometrics System, in Proceedings of ABVPA (2005) 23. PolyU Finger Knuckle Print Database. http://www.comp.polyu.edu.hk/~biometrics/FKP.htm 24. PolyU Fingerprint Database. http://www.comp.polyu.edu.hk/~biometrics/HRF.htm 25. PolyU Palmprint Database, http://www.comp.polyu.edu.hk/~biometrics/MSP.htm
Image Segmentation of MR Images with Multi-directional Region Growing Algorithm Anjali Kapoor1(B) and Rekha Aggarwal2 1 Guru Gobind Singh Indraprastha University, New Delhi, India
[email protected] 2 Amity School of Engineering and Technology, Amity Campus, Noida, India
[email protected]
Abstract. In medical image processing and analysis, segmentation is a most essential task. In this paper, a multi-directional region growing approach is presented which uses the concept of multiple seed selection to reduce the time consumption of the region growing segmentation technique. The multiple seed selection concept works on the basis of eight-connected neighboring pixels. The attractions of the approach include the ease of selecting the initial pixel, robustness to noise, and the order of pixel execution. In order to choose a suitable threshold, the concept of the neighboring difference transform (NDT) is presented, which reduces the threshold selection issue. Experimental outcomes show that this approach can achieve outstanding segmentation results; the results are also not affected in the case of noisy images. Keywords: Medical image segmentation · NDT · Multiple seed selection · Region growing segmentation approach
1 Introduction Image segmentation is a most essential process in medical imaging systems. An image segmentation technique divides a digital image into various segments by identifying the region of interest (ROI) or other useful statistics in digital images [1]. There are two basic image features on which image segmentation is based: the first is intensity and the second is similarity. The various segmentation techniques include global thresholding, k-means clustering, the fuzzy C-means algorithm, the watershed algorithm, morphological algorithms, the seed region growing method, and deformable models. Among them, the seed region growing method provides the best image segmentation results. The principle of image segmentation is to split an image into many segments, or to separate groups of contours from the image. Each of the pixels in a region is similar according to a few characteristics like color, intensity, or texture. Neighboring regions are notably distinct with respect to the same characteristics. Multiple seed selection is used with the multi-directional region growing algorithm to reduce the time consumption. The multiple seed selection concept works on the basis of eight-connected neighboring pixels.
The region growing algorithm was first presented by Adams et al. [2]. Xiaoli Zhang, Xiong Li, and Yuncong Feng proposed the abstract idea of the neighboring difference transform to select an appropriate threshold and proposed an MRG segmentation algorithm [3]. This algorithm may behave like a preprocessing step of the medical imaging system. It focuses on how to isolate the background region and the tissue region. The basic seed region growing algorithm is to first choose a set of seeds and then merge neighboring pixels with the seeds on the basis of similarity features. The seed set is updated continuously until the termination condition is satisfied.
2 Related Work Wu et al. [4] have presented automated seeded region growing segmentation of organs in abdominal MR images. First, find out the co-occurrence texture and semi-variogram texture features from the image and apply the seeded region growing on these features. The seed point is determined based on three homogeneity criteria. The benefit of this method is that it delivers a parameter-free environment. Khalid et al. [5] have proposed seed-based region growing (SBRG) segmentation algorithm for brain abnormalities. This method was developed for awareness of the abnormalities area of the brain. SBRG-segmented abnormality area was compared with the number of pixels of raw MRI brain images. The accuracy results of seed-based region growing segmentation is measured by true positive, true negative, false positive, and false negative. The outcome of SBRG is found to be very favorable for both light and dark abnormalities. In Deng et al. [6] have proposed an adaptive region growing method on the basis of gradients and variances. First, use the anisotropic diffusion filter for minimizing the noise, and then a recent method based on gradient and variances which focuses on the smoothness of boundary and consistency of region. The goal is to find out the great gradient in the boundary and small variance in homogeneous region. A modified region growing method was proposed by Kavitha et al. [7], which covered an orientation constraint in addition to normal intensity constraint. First processes the dataset with the help of Gaussian filter to eliminate the noise, then apply modified region growing method, and after that extract the features area, mean, covariance and correlation and is given it to feed forward neural network for classifying the image that is tumorous or normal. Wang et al. [8] have used bi-directional edge detection technique to resolve the conventional seed region growing sequence in the inceptive seeds stages. Chuin-Mu Wang et al. proposed a peak detection method to improve the seed region growing technique. Peak detection method was used for recognizing the seed within the region as distinction in the region to be growing with contemporary pixels on the basis of similarity features. Heena Hooda et al. have compared some segmentation techniques like k-means clustering, fuzzy c-means clustering, and region growing segmentation on the sample MRI images of brain which were taken from RGCI and RC Delhi, for detection of brain tumor [9]. The outcomes were assessed based on the error percentage. Ishmam Zabir et al. have proposed a technique for detecting and segmenting the glioma tumor from MRI images [10]. In this method, region growing approach and
distance-regularized level set method were used, where the outcome of the region growing approach is provided as the initial seed to the distance-regularized level set method. 3D reconstructions can be created using interpolation algorithms like marching cubes, with the help of the contours derived after image segmentation, when applied to a stack of images [11]. 2.1 Traditional Region Growing Algorithm Region growing is an image segmentation algorithm which is based on regions. A region-based segmentation algorithm segments the image by a pixel-based method that selects a first pixel as the initial seed. The algorithm scans the points adjacent to the initial seed points and evaluates whether the adjacent points should be merged into the region. In region growing, a seed set is chosen first. Selection of seed points can be based on pixels which lie in a definite grayscale range, pixels uniformly placed on a lattice, etc. [3]. The initial region starts from the exact position of the first pixel. The regions are then grown from the first seed points to neighboring points based on a similarity criterion such as pixel intensity, grayscale texture, or color. The image information itself is most important, and no information of the image should be lost while growing the regions on the basis of these criteria. For example, the criteria could be a pixel intensity threshold value together with knowledge of the histogram of the image, and, using any of these criteria, a satisfactory threshold value for the region membership criterion can be determined. This can be understood easily through a very simple example: here, we use four-connected or eight-connected adjacent points to grow from the seed points, and the criterion we create is the same as before. That is, we keep checking the neighboring pixels of the seed, and we add them to the seed points if the adjacent pixels have an intensity value similar to that of the seed points. This procedure is repeated until there is no change between two consecutive iterations. Of course, other criteria can be created, but the principle is to partition the image into regions based on likeness.
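The following sketch illustrates this basic seeded region growing idea with an 8-connected neighborhood and a fixed intensity-difference criterion relative to the seed. It is a generic illustration, not the code used by the authors; the NumPy-based implementation, the breadth-first traversal, and the threshold value are assumptions made for the example.

```python
import numpy as np
from collections import deque

def region_grow(img, seed, thresh):
    """8-connected seeded region growing: a neighbor joins the region when its
    intensity differs from the seed intensity by less than `thresh`."""
    h, w = img.shape
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    seed_val = float(img[seed])
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and not region[ny, nx]:
                    if abs(float(img[ny, nx]) - seed_val) < thresh:
                        region[ny, nx] = True
                        queue.append((ny, nx))
    return region

# Example: grow from the centre of a synthetic 64 x 64 image
mask = region_grow(np.random.rand(64, 64), seed=(32, 32), thresh=0.2)
```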
3 Modified Region Growing Algorithm (Multi-directional Region Growing) The authors present a modified region growing segmentation algorithm for MR medical images. This algorithm selects the threshold value based on the NDT approach. The algorithm focuses on how to isolate the background region and the tissue region, on the ease of seed selection, and on working regardless of whether the image is noisy or noise free. The principle of the seed region growing algorithm is to first choose a seed set and then merge neighboring pixels with the seed set on the basis of similarity features. The seed set is updated continuously until the termination condition is satisfied. The MRG algorithm includes two region growings: the first region growing is identical to traditional region growing. The result of the first region growing will contain many isolated points around the image edges. These points could be produced by noise or weak edges. To remove the isolated points from the image, we apply region growing a second
time. The problem of threshold selection is translated into an optimization issue with the help of the NDT matrix. For region growing, a seed is selected from the background region as the initial seed by the bi-directional region growing algorithm. Then, the second region growing is executed from a pixel selected at the center of the tissue region on the basis of the first segmentation result. The significant features of this algorithm are the order in which pixels are executed, that the background region and the object regions are both connected regions with no holes, that the segmentation outcomes are insensitive to noise, and that the selection of the seed is easy. The NDT is used for choosing the segmentation threshold automatically. The implementation of this project stems from the design of algorithms which are capable of detecting abnormalities on their own with almost zero error. There are many algorithms that can detect abnormalities through their own implementation and design. This algorithm permits the user to segment a region into a compact number of regions by ensuring that the outcome of the merge-split algorithm has blocks that are of a specified size. Without this characteristic, there is a potential for the merge-split process to return many small blocks. If these blocks are not integrated successfully by the region growing algorithm, undesirable outcomes are likely (Fig. 1). 3.1 Multiple Seed Selection Multiple seeds are selected over the brain region in different MRI images. First, the initial pixel points are selected, and they are equitably distributed over the brain region around the area of the initial seed point in each direction. In other words, regions are grown evenly on the basis of eight-connected neighboring pixels. The image intensity distribution information obtained by selecting multiple seed points is utilized for better tumor extraction from brain MRI images. 3.2 MRG Algorithm MRG contains two region growings. In the first region growing, the traditional region growing discussed in Sect. 2.1 is applied; its goal is to achieve an initial, uneven segmentation outcome. During analysis, it was noticed that mostly such segmentations have initial seed points that are connected, otherwise the segmentations are disconnected. This will construct numerous small secluded pixels around the image edges. Therefore, we perform region growing a second time to separate the secluded regions which were produced by noisy edges. The first region growing: let there be two points x and y, where x represents the present seed and y is any one element of x's adjacent point set. I(x) and I(y) represent the intensity values of x and y. x and y are in the same region if and only if the difference between the intensities of x and y is less than the first threshold; otherwise, they are in distinct regions. The segmentation outcome is that of traditional region growing. The region that contains the first seed is set to '1'; other regions are set to '0'. The second region growing: a second threshold value is used. If |I(x) − I(y)| is less than this threshold, x and y are in the same region; otherwise, x and y are in distinct regions. The segmentation outcome after this region growing represents the MRG result. The region that contains the initial seed is set to '0'; other regions are set to '1'.
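A rough sketch of this two-pass idea follows. It is not the authors' implementation: scikit-image's flood function is used as a stand-in for the region growing step, the two thresholds t1 and t2 are assumed inputs (the paper derives its threshold from the NDT), and the tissue seed is picked with a crude median-of-coordinates heuristic invented for the example.

```python
import numpy as np
from skimage.morphology import flood

def mrg_segment(img, background_seed, t1, t2):
    """Two-pass MRG sketch: pass 1 grows the background from a background seed;
    pass 2 grows from a pixel near the centre of the remaining (tissue) region
    and labels the grown region '0' and the rest '1'."""
    first = flood(img, background_seed, tolerance=t1)       # background mask
    ys, xs = np.nonzero(~first)                             # candidate tissue pixels
    tissue_seed = (int(np.median(ys)), int(np.median(xs)))  # crude centre pick
    second = flood(img, tissue_seed, tolerance=t2)
    return np.where(second, 0, 1).astype(np.uint8)          # final MRG labels
```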
Fig. 1. Flowchart of the proposed approach: select an MRI image, apply multiple seed selection, select the initial seed 'x' and mark it '1', select its neighbor 'y' and mark it '0', and apply MRG
Dr is a constant and its value is affected by r [3]. This indicates that most coefficients in a compressible vector have small values and only a few have large values. This type of model is good enough for signals which have wavelet coefficients: signals with high-frequency content have small coefficient values and vice versa. 3.2 Incoherence Incoherence is defined by the correlation of the components of the sensing matrix with the basis matrix. Assume the basis matrix Ψ is orthonormal; the sensing matrix $\Phi \in \mathbb{R}^{M \times N}$ is a submatrix
of $\bar{\Phi} \in \mathbb{R}^{N \times N}$, where M < N. $\bar{\Phi}$ is an orthogonal matrix, and the basis follows the equation $\bar{\Phi}^{T} \bar{\Phi} = N I$, in which I is an identity matrix. The matrix $\Phi$ can be written as $\Phi = \bar{\Phi}_{g}$, where g is a subset of {1, 2, …, N}. Then, A can be represented as $E = \bar{E}_{g}$, where $\bar{E}$ is an orthogonal matrix such that $\bar{E}^{T} \bar{E} = N I$. Let λ(E) denote the highest value over all entries,

$\lambda(E) = \max_{k,h} \left| \langle \Phi_k, \Psi_h \rangle \right|$   (2)
This is the mutual incoherence between $\Phi_k$ and $\Psi_h$. This incoherence property is used to recover the exact signal while compressing it. 3.3 Restricted Isometry Property (RIP) At the receiver side, it is required to recover the original signal from a small number of measurements, for which it is necessary that the sparse signal fulfills the RIP property. Consider a vector σ having an equivalent representation as S (the coefficients of A) with K nonzero values, and a small positive value α:

$(1 - \alpha)\,\|\sigma\|_2^2 \;\le\; \|E\sigma\|_2^2 \;\le\; (1 + \alpha)\,\|\sigma\|_2^2$   (3)

For the RIP property to hold, we take a vector σ and some α > 0 such that the Euclidean length of any signal which has a K-sparse representation is approximately preserved by E.
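As a quick numerical illustration of the incoherence measure in Eq. 2, the snippet below computes the largest inner product between the rows of a random Gaussian sensing matrix and the columns of an orthonormal DCT basis, the combination used later in the paper. The matrix sizes and the lack of row normalization are simplifications chosen only for this example.

```python
import numpy as np
from scipy.fft import dct

N, M = 64, 16
rng = np.random.default_rng(0)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)    # Gaussian sensing matrix
Psi = dct(np.eye(N), norm="ortho", axis=0)        # orthonormal DCT basis matrix
coherence = np.max(np.abs(Phi @ Psi))             # max_{k,h} |<Phi_k, Psi_h>|
print(coherence)
```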
4 Result and Analysis For analysis purposes, an MRI test signal of an ankle with pixel size 64 × 64 is considered. The analysis is done using the MATLAB simulation software, version R2016b. Our main motive is to find out the values of PSNR and SSIM with variation in the compression ratio (C/R). SSIM is used to measure the similarity index of the original image and the reconstructed image. 4.1 Peak Signal-to-Noise Ratio (PSNR) PSNR is the parameter by which we can easily find out the quality of a reconstructed signal. If the PSNR is high, then the quality of the image is also high; otherwise the quality is low. It directly indicates the efficiency of the reconstruction algorithm.

$\mathrm{PSNR} = 10 \times \log_{10} \dfrac{\sum_m \big(Q_i(m)\big)^2}{\sum_m \big[Q_i(m) - R_o(m)\big]^2}$   (4)
where Q_i denotes the original test signal values, R_o the reconstructed signal values, and m the iteration (sample) index. 4.2 Compression Ratio The compression ratio is another parameter which is very crucial for data compression and reconstruction. The compression ratio is the ratio of the number of frequency components taken for the reconstruction to the total number of frequency components in the signal. The compression ratio is directly proportional to the number of samples taken to reconstruct the original signal.

$C/R = \dfrac{S}{T}$   (5)
where S is the number of frequency components taken for the reconstruction and T is the total number of frequency components in the signal. The MRI signal considered in the analysis is first compressed using the Gaussian measurement matrix with DCT and DST basis matrices. Fundamentally, there is a noticeable difference in the PSNR and SSIM values calculated in this analysis. For the reconstruction purpose, the least square method is used. Some tables and plots denoting various parameters related to compression are given below (Figs. 1, 2, 3 and 4; Table 1).
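To make Eqs. 4 and 5 concrete, here is a small sketch of how PSNR and the compression ratio could be computed for a reconstructed image. This is only an illustration under assumed variable names (the paper's implementation is in MATLAB, and SSIM would typically come from a library such as scikit-image):

```python
import numpy as np

def psnr(original, reconstructed):
    """Eq. 4: 10*log10( sum(Q_i^2) / sum((Q_i - R_o)^2) )."""
    q = original.astype(float).ravel()
    r = reconstructed.astype(float).ravel()
    return 10.0 * np.log10(np.sum(q ** 2) / np.sum((q - r) ** 2))

def compression_ratio(kept_components, total_components):
    """Eq. 5: C/R = S / T."""
    return kept_components / total_components

# Example with the paper's 64 x 64 image size (pixel values here are synthetic)
img = np.random.rand(64, 64)
rec = img + 0.05 * np.random.randn(64, 64)
print(psnr(img, rec), compression_ratio(3500, 4096))
```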
5 Conclusion In this analysis, we proposed a scheme in which the MRI image is compressed and recovered using the least square method. PSNR and SSIM values are measured with the variation
Fig. 1. Plot for compression ratio versus SNR (Gaussian as measurement matrix, DCT and DST as basis matrix)
Fig. 2. Plot for compression ratio versus SSIM (Gaussian as measurement matrix, DCT and DST as basis matrix)
Fig. 3. Reconstructed MRI Image with compression ratio (500/4096) and DCT basis matrix
of the compression ratio. As the number of measurements increases, the PSNR and SSIM values also increase. The highest value of the compression ratio is
Fig. 4. Reconstructed MRI image with compression ratio (3500/4096) and DCT basis matrix
Table 1. PSNR and SSIM values for different compression levels (Gaussian measurement matrix)

Compression ratio   DCT (basis matrix)       DST (basis matrix)
                    PSNR        SSIM         PSNR        SSIM
500/4096            16.16491    0.168605     16.18374    0.135838
1000/4096           17.50731    0.215592     17.45887    0.208787
1500/4096           19.21488    0.295016     18.91492    0.290805
2000/4096           20.62365    0.365063     20.2278     0.352834
2500/4096           22.09713    0.434285     22.3343     0.42934
3000/4096           24.47251    0.500802     24.17786    0.486072
3500/4096           27.07902    0.601468     26.80713    0.573545
0.85 according to the analysis, for which we get good SSIM and PSNR values. That means that, using compressive sensing, we can obtain high quality by using a smaller number of measurements. Some concluding remarks are given below.
• When the same compressive sensing approach is applied with different basis and sensing matrices, both the PSNR and SSIM values are affected.
• The DCT basis matrix is more favorable than the DST due to its less complex nature.
• The scheme is lossy in nature.
If we compare our work with the analysis done by the authors in paper [14], the hand image chosen by those authors is of high quality and 460 × 720 pixels. In this work, we consider the same image with a 64 × 64 pixel size and obtained a PSNR of 22 dB for a compression level of 60%. For the work given in paper [14], at a 59% compression level, the value of PSNR is about 41 dB. As we considered an image about ¼ the size of that given in paper [14], if we do a comparative analysis, the result given by our algorithm is much higher.
6 Future Scope This paper is based on a single MRI test signal: a T1-weighted MRI image of a hand is considered as the test signal. The whole analysis is based on this single image, and further the analysis can be extended to 5–6 MRI signals. We can also divide the whole image into different sections and then perform the recovery analysis. Here, DCT and DST are taken as the basis matrices and a Gaussian random matrix is taken as the measurement matrix. For a broader analysis, different combinations of basis matrices like DWT, DFT, etc., and measurement matrices like binomial, exponential, chi-square, etc., can also be considered.
References 1. K. Kreutz-Delgado, J.F. Murray, B.D. Rao, K. Engan, T.W. Lee, T.J. Sejnowski, Dictionary learning algorithms for sparse representation. Neural Comput. 15(2), 349–396 (2003) 2. I.F. Gorodnitsky, B.D. Rao, Sparse signal reconstruction from limited data using FOCUSS: A re-weighted minimum norm algorithm. Signal Process. IEEE Trans. 45(3), 600–616 (1997) 3. I. Gorodnitsky, B. Rao, J. George, Source localization in magneto encephalagraphy using an iterative weighted minimum norm algorithm. in Proceedings of Asilomar Conference on Signals, Systems, and Computers (Pacinc Grove, CA, 1992 October) 4. B. Rao, Signal processing with the sparseness constraint. in Proceedings IEEE International Conference on Acoustic Speech, and Signal Processing (ICASSP) (Seattle, WA, 1998 May) 5. Y. Bresler, P. Feng, Spectrum-blind minimum-rate sampling and reconstruction of 2-D multiband signals. in Proceedings of IEEE International Conference on Image Processing (ICIP) (Zurich, Switzerland, 1996 September) 6. P. Feng, Universal spectrum blind minimum rate sampling and reconstruction of multiband signals. P.hD. thesis, University of Illinois at Urbana-Champaign1997 7. P. Feng, Y. Bresler, Spectrum-blind minimum-rate sampling and reconstruction of multiband signals. in Proceedings IEEE International Conference on Acoustics Speech, and Signal Processing (ICASSP) (Atlanta, GA, 1996 May) 8. R. Venkataramani, Y. Bresler, Further results on spectrum blind sampling of 2-D signals. in Proceedings of IEEE International Conference on Image Processing (ICIP) (Chicago, IL, 1998 October) 9. A. Beurling, Sur les integrales de Fourier absolument convergentes etleur application a une transformation fonctionelle (In Proc. Scandinavian Math. Congress, Helsinki, Finland, 1938) 10. B. Gözcü, R.K. Mahabadi, Y.H. Li, E. Ilıcak, T. Cukur, J. Scarlett, V. Cevher, Learning-based compressive MRI. IEEE Trans. Med. Imaging 37(6), 1394–1406 (2018) 11. M. Mardani, E. Gong, J.Y. Cheng, S.S. Vasanawala, G. Zaharchuk, L. Xing, J.M. Pauly, Deep generative adversarial neural networks for compressive sensing MRI. IEEE Trans. Med. Imaging 38(1), 167–179 (2018) 12. T. Kampf, V.J. Sturm, T.C. Basse-Lüsebrink A. Fischer, L.R. Buschle, F.T. Kurz, H.P. Schlemmer et al., Improved compressed sensing reconstruction for F magnetic resonance imaging. Mag. Res. Mater. Phys. Biol. Med. 32(1), 63–77 (2019) 13. G. Yang, S. Yu, H. Dong, G. Slabaugh, P.L. Dragotti, X. Ye, D. Firmin et al., DAGAN: deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction. IEEE Trans. Med. Imaging 37(6), 1310–1321 (2017) 14. Tariq Tashan, Maher Al-Azawi, Multilevel magnetic resonance imaging compression using compressive sensing. IET Image Proc. 12(12), 2186–2191 (2018) 15. Sumit Datta, Bhabesh Deka, Efficient interpolated compressed sensing reconstruction scheme for 3D MRI. IET Image Proc. 12(11), 2119–2127 (2018)
A Transfer Learning Approach for Drowsiness Detection from EEG Signals S. S. Poorna(B) , Amitha Deep, Karthik Hariharan(B) , Rishi Raj Jain, and Shweta Krishnan Department of Electronics and Communication Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India [email protected], [email protected], [email protected], [email protected], [email protected]
Abstract. There has been an alarming increase in the number of accidents that occur due to drowsiness. Research so as to analyze and rectify this state has become a necessity. Here, a system has been developed which would detect drowsiness with sufficient reliability using electroencephalogram (EEG) signals. The study was conducted on nine males and nine females, whose EEG waveforms are obtained with the help of a wireless neuro-headset, while subjects underwent a virtual driving experiment. Preprocessing of the raw EEG signals is done by Principal Component Analysis (PCA). The resultant signals are then normalized and smoothed. Subsequently, the preprocessed signal is segmented and the time-frequency features are extracted using Continuous Wavelet Transform (CWT). Transfer learning approach is adopted in this paper to discriminate between the 3 states, ‘Drowsy’, ‘Asleep’ and ‘Awake’. An existing pre-trained ResNet50 model, fine-tuned using the scalograms obtained from CWT, is utilized for evaluating the state of driver. Keywords: EEG · PCA · CWT · Transfer learning · ResNet-50 · Scalogram
1 Introduction Drowsiness during driving is one of the major causes of road accidents. According to the statistics released by the Centers for Disease Control and Prevention, drowsiness while driving resulted in around 72,000 road accidents, 44,000 injuries, and even 800 deaths in the US in 2013 [1]. Certain lifestyle choices such as long working hours can lead to drowsiness. Depression can increase drowsiness, as can stress, anxiety and boredom. Some medical conditions such as diabetes can also cause drowsiness. Many medications, particularly antihistamines, tranquilizers, and sleeping pills, list drowsiness as a possible side effect. Several methods have been proposed by researchers to accurately detect the onset of drowsiness, thereby alert the driver and avoid accidents, saving valuable human life. Drowsiness could be detected from the physical changes of the driver's face or head such as eye closures, eye blinks, yawn and head movements, or from the behavior of the
vehicle, such as the angular tilt of the steering wheel or the yaw angle. It could also be sensed from the driver's physiological signals such as EEG, EOG (Electrooculogram), ECG (Electrocardiogram), EMG (Electromyography), body temperature or respiration rate. The state-of-the-art techniques used in driver drowsiness detection can be seen in the survey papers by Ramzan et al. [2], Coetzer et al. [3] and Khan et al. [4].

Artanto et al. [5] developed a low-cost EMG detection technique, called Myoware, which measures the duration of eyelid closure to detect drowsiness. If the eyelid closes fewer than 5 times in a minute, the drowsiness level is low, and if it closes more than 10 times, the drowsiness level is high. This technique was found inappropriate, however, as it is subject-specific. A real-time driver drowsiness detection system based on steering wheel angle (SWA) measurements, obtained with the help of sensors, was proposed by Li et al. [6]. This system used Approximate Entropy (ApEn) features followed by a warping distance measure to determine the driver's drowsiness, employing a binary decision classifier. An average accuracy of 78.01% was attained by the system. In [7], drowsiness while driving was detected with the help of sensors capable of detecting the angular velocity of the steering wheel. At first, the time-indexed sequence from the sensor during the drowsy state was collected and kept as the reference frame. Later, a sliding window was used to examine the actual sensor output in real time. If no similarity with the reference frame was found in the window, the window advanced a step and the search resumed in the next frame of the time series. Otherwise, if similarity was observed, the specific time series in the window was compared against the driver's face captured during the respective trial and the outcomes were validated.

EEG data analysis helps in determining drowsiness and is more reliable than EOG based techniques. EEG can also render information related to both the movements of the eye muscles and the sleep stages of the brain. Various techniques have been put forth to recognize drowsiness from EEG using wavelet transforms. One such experiment, carried out in the time and frequency domains using wavelet transforms, can be seen in [8]. In that work, a 50 Hz notch filter was used to filter noise from the EEG. The filtered signals were then passed through discrete wavelet filters for isolating the EEG rhythms. Daubechies (db8) wavelets were applied and the signal was decomposed into eight levels in order to obtain the five EEG signal bands, namely delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), beta (12–30 Hz) and gamma (30–40 Hz), for the analysis. Open-source sleep data from PhysioNet [9] was utilized for their work. The band-isolated EEG signals were employed for feature extraction and for identifying the drowsy driver state. A similar analysis using Discrete Wavelet Transforms (DWT) for detection of driver drowsiness from EEG signals collected using a BrainSense headset is seen in [10]. From the acquired signals, alpha, delta and theta waves were extracted using 4 levels of DWT decomposition. For detecting the various drowsy conditions, a Support Vector Machine (SVM) was used and 95% accuracy was obtained. Studies carried out with the same EEG data set used in this paper can be seen in [11]. This study showed that precise detection of drowsiness can be achieved by preprocessing EEG signals using PCA. The feature set for analysis included two types: the first set characterizing the temporal characteristics of eye blinks, and the second set, in which eye blinks were excluded, containing the parameters of the EEG spectral bands (alpha, beta, delta, theta and gamma). Both feature sets were combined and evaluated using supervised
learning techniques. An accuracy of 85% was obtained using an Artificial Neural Network (ANN) classifier. This paper explores the effect of a CWT based feature, viz. the scalogram, for driver drowsiness detection. Transfer learning is adopted for distinguishing the state of the driver. The performance of the system is examined based on accuracy and loss values. The organization of this paper is as follows: Sect. 2 (Methodology) discusses preprocessing, feature extraction and classification in detail; Sect. 3 presents the results of the analysis; and the conclusion follows in Sect. 4.
2 Methodology This section elaborates on the data acquisition and explains how preprocessing and feature extraction are carried out. It also gives an outline of how classification is carried out using transfer learning.

2.1 Data Collection The data collected in [11] was used in our analysis. This study was conducted on 18 subjects in the age group 20–22; they were given VR headsets and asked to play a virtual driving game for two hours to simulate a close-to-real experience. An Emotiv EPOC headset was used to collect the data. The data was collected using 14 electrodes. These electrodes, placed on the scalp of the subject, are arranged around the following areas: frontal-parietal and frontal: F3, F4, F7, FC5, FC6, F8, AF3, AF4; temporal: T7, T8; and occipital-parietal and occipital: P7, P8, O1, O2. The sampling frequency of the device was 128 Hz. Recordings of EEG data were taken using the 10–20 international electrode positioning system. Dominant ocular pulses are seen only in the frontal and occipital electrodes, and hence only the channels O2, FC6, F4, F8, AF3, F7, F3, F6 and AF4 were considered for this work.

2.2 Principal Component Analysis and Preprocessing The channels provide data about the state of the participant, and hence the most appropriate data should be chosen from these channels. PCA has been utilized for merging the relevant channels. The normalized PCA output is used for further segmentation. A human blinks every 2–10 s. Hence, the EEG waveform was segmented into durations of 10 s with 2 s of overlap to avert data loss while segmenting. To remove baseline noise in the signal, the segmented smooth function [13] has been utilized. This resulted in a smooth signal, shown in Fig. 1a–c, which could be used for feature extraction.

2.3 Feature Extraction: Scalogram from Continuous Wavelet Transform For this study, we used Continuous Wavelet Transform (CWT) based features to detect drowsiness. The CWT is a powerful tool for analyzing non-stationary time series signals in the time-frequency domain. Wavelets are capable of analyzing the frequency variation of the signal over time. This is achieved using multiresolution analysis, where daughter wavelets are obtained from a mother wavelet by shifting or scaling. The CWT is given in
Fig. 1. Segmented EEG signal in a awake state b drowsy state c sleepy state d–f the respective scalograms
Eq. (1), where W_x(i, j) is the CWT, which gives the measure of similarity between the input signal x(t) and the complex conjugate of the mother wavelet ψ*(t), translated by j and scaled by i:

$$W_x(i, j) = \frac{1}{\sqrt{|i|}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - j}{i}\right) dt. \qquad (1)$$
The energy distribution of the CWT is given by the scalogram. This visual representation of the wavelet transform in the time-frequency domain makes hidden features appear more prominently. The scalogram S_x(i, j) is given in Eq. (2). Similar studies on automatic fatigue detection using CWT can be seen in [12].

$$S_x(i, j) = |W_x(i, j)|^2. \qquad (2)$$
In this paper, the scalograms corresponding to the three states, viz. 'Awake', 'Drowsy' and 'Asleep', are obtained after preprocessing and segmentation using the Morse wavelet. The segmented EEG signals in the respective states are shown in Fig. 1a–c. The corresponding scalograms obtained are shown in Fig. 1d–f. In Fig. 1a, an abrupt variation in amplitude due to an eye blink is seen, which appears as yellow patches across various frequencies in Fig. 1d. In the drowsy state, the eye droops down and the blink duration is longer, as shown in Fig. 1b; this may be noted in the scalogram as a lower-frequency variation in Fig. 1e, which shows a lengthier yellow patch than in the awake state. Finally, in Fig. 1f, due to the sleep state, the yellow patches are not as significant.
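As a rough illustration of this feature-extraction step, the sketch below generates a scalogram image from one pre-processed EEG segment using PyWavelets. It is not the authors' code: the synthetic segment, the Morlet wavelet (a stand-in for the Morse wavelet, which PyWavelets does not provide), the scale range and the output file name are all assumptions.

```python
import numpy as np
import pywt
import matplotlib.pyplot as plt

fs = 128                       # Emotiv EPOC sampling rate (Hz)
t = np.arange(0, 10, 1 / fs)   # one 10 s EEG segment
# Hypothetical segment: replace with a PCA-merged, baseline-corrected EEG window.
segment = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)

# CWT with a Morlet wavelet (stand-in for the Morse wavelet used in the paper).
scales = np.arange(1, 128)
coeffs, freqs = pywt.cwt(segment, scales, 'morl', sampling_period=1 / fs)

scalogram = np.abs(coeffs) ** 2           # Eq. (2): squared magnitude of the CWT
plt.imshow(scalogram, aspect='auto', extent=[0, 10, freqs[-1], freqs[0]])
plt.xlabel('Time (s)'); plt.ylabel('Frequency (Hz)')
plt.savefig('scalogram_awake_001.png', dpi=100)   # image later fed to the CNN
```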
2.4 Transfer Learning Using Pre-trained ResNet50 In the past couple of years, deep convolutional neural networks have driven many innovations in the areas of classification and image recognition. Recent research shows that deep learning based architectures have outperformed conventional machine learning techniques. The major constraint with these techniques is the unavailability of the huge amount of data required for training such algorithms. Transfer learning is a solution to this constraint. Transfer learning provides a framework to utilize previously acquired knowledge to solve new but similar problems much more quickly and efficiently. In contrast to earlier machine learning strategies, transfer learning strategies use the knowledge accumulated from data in auxiliary domains to facilitate predictive modelling of the data patterns in the current domain. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark dataset for object recognition on a large scale. ImageNet has more than a million images of dimension 256 × 256, categorized into thousands of different object classes, and each class has thousands of images. The dataset is publicly available [14]. Residual learning tries to resolve the issue of vanishing gradients in deep learning. In neural networks, each layer learns high- or low-level features that are trained for the task at hand. In residual learning, rather than attempting to learn features directly, the model tries to learn the residual mapping. ResNet50 [15] is a CNN trained on ImageNet. The network is fifty layers deep and can classify images into a thousand object classes. As a result, the network has learned feature representations for a wide variety of images.
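A minimal Keras sketch of this transfer-learning setup is given below. It follows the description in this paper (a frozen ResNet50 base, a three-class softmax head, SGD, categorical cross-entropy, batch size 4, 30 epochs) but is not the authors' code; the pooling layer, learning rate, momentum and the 'scalograms/train' and 'scalograms/val' directories are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers

# Pre-trained ResNet50 without its 1000-class top; freeze the convolutional base.
base = tf.keras.applications.ResNet50(weights='imagenet', include_top=False,
                                      input_shape=(224, 224, 3))
base.trainable = False

# New 3-way softmax head for the 'Awake', 'Drowsy' and 'Asleep' scalograms.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(3, activation='softmax'),
])
model.compile(optimizer=optimizers.SGD(learning_rate=1e-3, momentum=0.9),
              loss='categorical_crossentropy', metrics=['accuracy'])

# Hypothetical directories with one sub-folder per class; ImageNet-style
# input preprocessing is omitted here for brevity.
train_ds = tf.keras.utils.image_dataset_from_directory(
    'scalograms/train', image_size=(224, 224), batch_size=4, label_mode='categorical')
val_ds = tf.keras.utils.image_dataset_from_directory(
    'scalograms/val', image_size=(224, 224), batch_size=4, label_mode='categorical')
model.fit(train_ds, validation_data=val_ds, epochs=30)
```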
3 Results and Discussions In this paper, an existing pre-trained classifier is utilized for evaluating the state of the driver with the help of EEG signals obtained from the scalp. Transfer learning was adopted since the dataset under consideration had only a limited number of scalograms obtained from the PCA-processed EEG signals. The method involves fine-tuning a Convolutional Neural Network architecture trained on ImageNet data to discriminate between the three states 'Awake', 'Drowsy' and 'Asleep'. The ResNet50 model is pre-trained using ImageNet. The first 49 layers of this trained model are frozen and the last layer is modified into a fully connected softmax layer for the three classes. The model is then fine-tuned with the scalogram data corresponding to the three states. The dataset contained 343 images of dimension 224 × 224 that were split in a ratio of 50:50 for training and validation. The data set was trained with a batch size of 4 for 30 epochs. The loss function used is categorical cross-entropy, which improves the accuracy when compared to mean squared error. For fine-tuning, different optimizers were tried on the model. The Stochastic Gradient Descent (SGD) optimizer gave the best validation accuracy of 82.56%; a loss of 0.3593 and a training accuracy of 84.21% were also obtained with the same. Using the RMSprop and Adam optimizers gives poorer results, and the validation accuracy drops to 63% and 58%, respectively. Even though one more convolution layer was added to the architecture, an improvement in accuracy was not observed. The performance curves, i.e. training and validation accuracy versus epochs and training and validation loss versus epochs, are shown in Fig. 2a, b, respectively. It can be seen from the figure that at around 25 epochs the model gives maximum accuracy. An increase in the number
of epochs can make the model over-fit. The results were comparable with our previous studies carried out on the same dataset using conventional machine learning techniques [11].
Fig. 2. Performance curves a training and validation accuracies b training and validation loss
4 Conclusion The drowsiness detection system was implemented with reasonable accuracy with the help of the transfer learning method. The feature extraction technique described above shows a direct relation with the nature of the eye blinks and hence can be used with more reliability. The work could be extended to other pre-trained models as well. The same database cannot be used to train deep learning models from scratch, since only a limited amount of data is available. Future research is also planned to increase the data synthetically by using suitable deep learning techniques.
References 1. CDC Webpage. https://www.cdc.gov/features/dsdrowsydriving/index.html. Last accessed 13 Oct 2019 2. M. Ramzan, H.U. Khan, S.M. Awan, A. Ismail, M. Ilyas, A. Mahmood, A survey on state-of-the-art drowsiness detection techniques. IEEE Access 7, 61904–61919 (2019) 3. R.C. Coetzer, G.P. Hancke, Driver fatigue detection: a survey. AFRICON 1–6. IEEE (2009) 4. M.Q. Khan, S. Lee, A comprehensive survey of driving monitoring and assistance systems. Sensors 19(11), 2574 (2019) 5. D. Artanto, M.P. Sulistyanto, I.D. Pranowo, E.E. Pramesta, Drowsiness detection system based on eye-closure using a low-cost EMG and ESP8266, in 2017 2nd International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE), pp. 235–238. IEEE (November, 2017) 6. Z. Li, S. Li, R. Li, B. Cheng, J. Shi, Online detection of driver fatigue using steering wheel angles for real driving conditions. Sensors 17(3), 495 (2017)
7. G. Zhenhai, L. DinhDat, H. Hongyu, Y. Ziwen, W. Xinyu, Driver drowsiness detection based on time series analysis of steering wheel angular velocity, in 2017 9th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA), pp. 99–101. IEEE (2017) 8. P.C. Nissimagoudar, A.V. Nandi, H.M. Gireesha, EEG signal analysis using wavelet transform for driver status detection, in International Conference on Innovations in Bio-Inspired Computing and Applications (Springer, Cham), pp. 56–65 (2018) 9. PhysioNet Webpage. https://physionet.org/physiobank/database/sleep-edfx/. Last accessed 13 Oct 2019 10. K.V. Reddy, N. Kumar, Wavelet based analysis of EEG signal for detecting various conditions of driver, in 2019 International Conference on Communication and Signal Processing (ICCSP), pp. 0616–0620. IEEE (2019) 11. S.S. Poorna, V.V. Arsha, P.T.A. Aparna, P. Gopal, G.J. Nair, Drowsiness detection for safe driving using PCA EEG signals, in Progress in Computing, Analytics and Networking (Springer, Singapore), pp. 419–428 (2018) 12. J. Krajewski, M. Golz, S. Schnieder, T. Schnupp, C. Heinze, D. Sommer, Detecting fatigue from steering behaviour applying continuous wavelet transform, in Proceedings of the 7th International Conference on Methods and Techniques in Behavioral Research, p. 24. ACM (2010) 13. MATLAB central file exchange, segmented smooth function by Tom O’Haver. https:// www.mathworks.com/matlabcentral/fileexchange/60300-segmented-smooth-function. Last accessed 13 Oct 2019 14. J. Deng, W. Dong, R. Socher, L.J. Li, K. Li, L. Fei-Fei, ImageNet: a large-scale hierarchical image database, in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (20 June 2009) 15. T. Akiba, S. Suzuki, K. Fukuda, Extremely large minibatch SGD: training resnet50 on ImageNet in 15 minutes. arXiv preprint arXiv:1711.04325 (12 Nov 2017)
Classification and Measuring Accuracy of Lenses Using Inception Model V3 Shyo Prakash Jakhar1(B) , Amita Nandal1 , and Rahul Dixit2 1 Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur,
India [email protected], [email protected] 2 Department of CSE, Indian Institute of Information Technology Pune, Pune, India [email protected]
Abstract. Convolutional networks are widely used nowadays for most computer vision solutions. Deep convolutional networks have become mainstream and have been setting various benchmarks since 2014. Deep learning has shown tremendous results in the field of image processing. Artificial neural networks are inspired by biological neurons and allow us to map various complex functions at a very fast speed. These neural networks have to be trained on large sets of data and take a certain amount of time to do so. Eventually, they give immediate quality gains for most tasks in lens vision use cases and big data scenarios. In this paper, the Inception model V3 has been used, which aims at utilizing the added computation as efficiently as possible in the case of lens images. This model takes bulk images of lenses and classifies them into six different classes depending upon their properties. The Inception model achieves high accuracy in recognizing images across more than 1000 classes. A histogram shows the comparative study of the six different classes on the basis of classification rate and accuracy. A confusion matrix has been constructed which gives the correct predictions for the images of lenses. Finally, the results have been compiled in two parts. In the first part, we find the precision and recall of the lens images. In the second part, the overall accuracy of the model has been compiled, which comes out to be 95.9%. Keywords: Inception model V3 · Confusion matrix · Softmax regression
1 Introduction Deep convolutional neural networks (ConvNets/CNNs) are most widely used nowadays in image processing problems because a CNN preserves spatial relationships when filtering input images. The CNN is a feedforward neural network which takes input images and assigns weights to various objects in the image so that it can differentiate one image from another. The larger the dataset, the higher the accuracy, but also the greater the computation time. The Inception model works in two parts. In the first part, feature extraction is done from the input images with a convolutional network. Then, the classification part is completed based on those features or properties with fully connected and softmax layers.
Deep convolutional neural networks (ConvNets/CNNs) achieve great results on image processing problems. Deep learning models outperform shallow learning models like the support vector machine. They are feedforward neural networks with multiple layers.
2 Background We have used the Inception-V3 model, which is an advanced version of Inception-V1 and Inception-V2. Since the 2012 ImageNet competition [1] winning entry by Krizhevsky et al. [2], their network "AlexNet" has been successfully applied to a larger variety of computer vision tasks, for example, to object detection [3], segmentation [4], human pose estimation [5], video classification [6], object tracking [7], and super-resolution [8]. However, GoogLeNet is far better than AlexNet or VGGNet in terms of performance and computational cost. The main objective of the Inception model is to increase the accuracy and reduce the computational cost. There are two approaches to increase performance:

• The first method is to increase the size and depth of the model.
• The second method is to use sparsity between the layers of the model.

We have found an optimal solution between the two approaches above. All of these layers then undergo dimensionality reduction, ending up in 1 × 1 convolutions. For effective recognition of such variable-sized features, we need kernels of various sizes. Rather than simply going deeper in terms of the number of layers, the network also goes wider: kernels of various sizes are implemented inside the same layer. The Inception network architecture consists of a number of Inception modules of the following structure (as shown in Fig. 1).
Fig. 1. Inception module
The Inception model works in two sections. In the initial segment, feature extraction is done from the input images with a convolutional network. Then, the classification part is completed based on those features or properties with fully connected and softmax layers. The model then produces a histogram of these classes. Finally, a confusion matrix is built. It is a table with four different combinations of predicted and actual values.
378
S. P. Jakhar et al.
It is very handy for measuring recall, precision, accuracy, and the AUC-ROC curve. AUC stands for area under the curve and is a probability curve, and ROC stands for receiver operating characteristics and tells about the degree of separability between different classes. This curve plays an important role in the performance evaluation of a multiclass classification problem. The four outcomes and the formulae for recall and precision are as follows: True positive: you predicted positive and it is true. True negative: you predicted negative and it is true. False positive: you predicted positive and it is false. False negative: you predicted negative and it is false.
Recall: Recall = TP/(TP + FN)   (1)

Precision: Precision = TP/(TP + FP)   (2)
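To make these definitions concrete, here is a small, hypothetical sketch (not from the paper) that computes the confusion matrix and the per-class recall and precision with scikit-learn; the label vectors are invented for illustration, with classes 0–5 standing for the six lens categories.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical actual and predicted labels for a 6-class lens problem.
y_true = [0, 0, 1, 2, 3, 4, 5, 5, 1, 2]
y_pred = [0, 1, 1, 2, 3, 4, 5, 1, 1, 2]

cm = confusion_matrix(y_true, y_pred)                   # rows: actual, columns: predicted
recall = recall_score(y_true, y_pred, average=None)     # TP / (TP + FN) per class
precision = precision_score(y_true, y_pred, average=None, zero_division=0)  # TP / (TP + FP)
print(cm, recall, precision, sep='\n')
```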
The Inception model is a Google-based model, and the computational cost of the Inception model is much lower than that of VGGNet or its higher-performing successors [9, 10]. The Inception-V3 model is a widely utilized image recognition model that has been shown to achieve more than 78.1% accuracy on the ImageNet dataset [11, 12]. ImageNet is an image database organized according to the WordNet hierarchy, in which thousands of images are present. The Inception architecture of GoogLeNet is intended to perform well even under severe constraints on memory and computational expense. The outcomes of the Inception model are highly applicable in the CVCES system to improve prediction performance [9].
3 Inception-V3 Model A high-level diagram of the Inception-V3 model is shown below in Fig. 2. The Inception-V3 model has a 22-layer architecture which uses fully learned filters to generate a correlation statistic that is fed forward to the next layer. The bottleneck layer is responsible for the actual classification, and it is situated just before the final output layer. In each step, 15 images are chosen randomly to feed the bottleneck. As the Inception model has already learned from 1000 different classes as part of ImageNet, retraining the bottleneck layer is sufficient to get a better output. Two key functions given by the TPUEstimator API are train() and test(), used to prepare and assess the lens images
Fig. 2. Inception-V3 model architecture
individually. These are normally called somewhere in the main function. The model also has a pooling layer, a fully connected layer and a softmax layer. The last layer of the Inception-V3 model has been trained using softmax regression. Given an n-dimensional vector, the softmax function generates a vector of the same dimension with values ranging from 0 to 1. Before the model can be utilized to recognize images, it must be trained. Training an Inception-V3 model provides many opportunities, including:

• Trying different versions of this model in order to improve the accuracy of this image classification model.
• Comparison of different optimization algorithms and hardware architectures for training this model faster and improving its performance.
• Redefining the Inception-V3 model as a component of a larger network having the task of multi-model learning and image classification as well as object detection.

In the proposed system (as shown in Fig. 3), the research dataset is organized in the style of the ImageNet dataset. In this dataset, the images related to lenses of eyes are taken from a Singapore-based company, EV. The dataset is divided into six different classes, named as follows:
1. Excellent lens
2. Better lens
3. Good lens
4. Average lens
5. Poor lens
6. Damaged lens.
The quality of the lens depends on the image scenario, and the total count of images taken is random in each class of the dataset; the overall mixture of the dataset also varies accordingly. Then, Inception model V3 is applied to the classes and
Fig. 3. DFD of proposed framework
the final output in the form of recall and precision has been obtained. Also, the overall accuracy of the model has been found, which is higher than that of any other model.
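For readers who want to reproduce this kind of retraining, a minimal sketch using Keras' bundled Inception-V3 is shown below. It is not the authors' implementation: the frozen base, the six-way softmax head, the optimizer, the batch size (chosen to mirror the 15 images fed per step) and the 'lens_images/' directory layout are all assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Inception-V3 pre-trained on ImageNet, without its 1000-class top.
base = tf.keras.applications.InceptionV3(weights='imagenet', include_top=False,
                                         input_shape=(299, 299, 3))
base.trainable = False                      # retrain only the new "bottleneck" head

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(6, activation='softmax'),  # six lens classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# 'lens_images/' is a hypothetical folder with one sub-directory per lens class.
train_ds = tf.keras.utils.image_dataset_from_directory(
    'lens_images', image_size=(299, 299), batch_size=15, label_mode='categorical')
model.fit(train_ds, epochs=10)
```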
4 Result The results can be divided into two parts: 1. In the first part, we have found the confusion matrix and the histogram. The confusion matrix is used to evaluate the performance of a trained classifier model over a given dataset. Then, precision and recall are calculated for the different classes of lenses. The nomenclature used is as follows: P = positive, N = negative, T = true, F = false, P = predicted, A = actual. Precision: Precision is the ratio of the total number of correctly classified positive examples to the total number of predicted positive examples. High precision indicates that an example labeled as positive is truly positive. Precision = TP/(TP + FP)
(1)
Recall: Recall is the ratio of the total number of correctly classified positive examples to the total number of positive examples. High recall indicates the correctness of the class recognized.

Recall = TP/(TP + FN)   (2)
The recall and precision values for the different classes of lenses are shown in Table 1.

Table 1. Recall and precision values for different classes (Lens test report, BVe3)
Tested by: Dr. Amita Nandal; Image source: ATL MacMini; Resource by: S. P. Jakhar
Manual segregation vs. lens classification (release: 12/09/2019)

Class | Count | Excellent lens | Better lens | Good lens | Average lens | Poor lens | Damaged lens | Recall (%)
Excellent lens | 132 | 126 | 1 | 0 | 0 | 0 | 0 | 95.5
Better lens | 1831 | 0 | 1675 | 0 | 0 | 0 | 0 | 91.5
Good lens | 41 | 0 | 0 | 41 | 0 | 0 | 0 | 100.0
Average lens | 245 | 0 | 0 | 0 | 245 | 0 | 0 | 100.0
Poor lens | 8 | 0 | 0 | 0 | 0 | 8 | 0 | 100.0
Damaged lens | 200 | 0 | 101 | 0 | 0 | 0 | 99 | 49.5
Precision (%) | | 100.0 | 94.3 | 99.6 | 98.4 | 38.1 | 100.0 |
Finally, a histogram has been generated to show the results of the six different classes, as below (Fig. 4). 2. In the second part, the accuracy of the model has been found using the accuracy formula given below:
Accuracy = (TP + TN)/(TP + TN + FP + FN)

Here, the correct predictions as well as the wrong predictions have been calculated from the total number of data available, along with the accuracy of each class. The overall accuracy of the model comes out to be 95.9%. See Table 2.
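A tiny illustrative computation in the same spirit as Table 2 (class accuracy from correct and wrong prediction counts, plus the overall accuracy) is sketched below; the counts are made up and are not the values reported in the table.

```python
# Hypothetical (correct, wrong) prediction counts for a few lens classes.
counts = {'Excellent': (95, 5), 'Better': (88, 12), 'Good': (90, 10)}

total_correct = sum(c for c, _ in counts.values())
total = sum(c + w for c, w in counts.values())

for name, (correct, wrong) in counts.items():
    print(f'{name}: class accuracy = {100 * correct / (correct + wrong):.1f}%')
print(f'Overall accuracy = {100 * total_correct / total:.1f}%')
```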
Fig. 4. Histogram
Table 2. Accuracy table
Accuracy report: Tested by: Dr. Amita Nandal; Date: 12/9/2019; Build by: S. P. Jakhar; Inception model

Class/category | Total no. of data available | Data count (original) | Data count (augmented) | Correct predictions | Wrong predictions | Class accuracy (%) | Model accuracy (%)
Excellent lens | 1417 | 1275 | 8925 | 8923 | 9 | 100.0 |
Better lens | 107,607 | 8831 | 8831 | 8643 | 188 | 97.9 |
Good lens | 3273 | 2947 | 8841 | 8840 | 1 | 100.0 |
Average lens | 1373 | 1235 | 8645 | 8645 | 0 | 100.0 |
Poor lens | 192 | 174 | 5160 | 5160 | 0 | 100.0 |
Damaged lens | 9553 | 512 | 8704 | 6852 | 148 | 78.7 |
Overall | | | | | | | 95.9
5 Conclusion and Future Work We have proposed a system based on the Inception-V3 model, in which supervised learning has been used to train the model on different classes of lens images. The classification accuracy of the model is approximately 95.9% on the given dataset, which is higher than that of the other methods available for classification. The performance of the Inception-V3 model can be made much better by modifying the hyper-parameters as well as by using advanced embedded hardware platforms, so that it can outperform other models and the computational cost can also be reduced to a great extent. Considering its efficient use of computational power, it can be used in vast applications. The future work is to study and develop a more accurate model for image classification and for the detection of severe diseases at an early stage in the field of medical imaging.
References 1. K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition. Microsoft Research (2015) 2. A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, in Advances in neural information processing systems, pp. 1097–1105 (2012) 3. R. Girshick, J. Donahue, T. Darrell, J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2014) 4. J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431– 3440 (2015) 5. A. Toshev, C. Szegedy, Deeppose: human pose estimation via deep neural networks, in 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1653–1660. IEEE (2014) 6. A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, L. Fei-Fei, Large-scale video classification with convolutional neural networks, in 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1725–1732. IEEE (2014) 7. N. Wang, D.-Y. Yeung, Learning a deep compact image representation for visual tracking, in Advances in Neural Information Processing Systems, pp. 809–817 (2013) 8. C. Dong, C.C. Loy, K. He, X. Tang, Learning a deep convolutional network for image superresolution, in Computer Vision–ECCV 2014 (Springer), pp 184–199 (2014) 9. K. Gautam, V.K. Jain, S.S. Verma, A survey and analysis of clustered vehicular communication emergency system (CVCES), in press IEEE Conference (ICONC3), India (2019) 10. X. Xia, C. Xu, B. Nan, Inception-v3 for flower classification, in 2nd International Conference on Image, Vision and Computing (2017) 11. N.R. Gavai, Y.A. Jakhade, S.A. Tribhuvan, R. Bhattad, MobileNets for flower classification using TensorFlow, in International Conference on Big Data, IoT and Data Science (BID), pp. 154–158 (2017) 12. F.N. Iandola, M.W. Moskewicz, K. Ashraf, S. Han, W.J. Dally, K. Keutzer, Squeezenet: Alexnet-level accuracy with 50x fewer parameters and 1 mb model size. arXiv preprint arXiv: 1602.07360 (2016)
Detection of Life Threatening ECG Arrhythmias Using Morphological Patterns and Wavelet Transform Method Shivani Saxena(B) and Ritu Vijay Department of Electronics, Banasthali Vidyapith, Banasthali, Rajasthan, India [email protected], [email protected]
Abstract. The work in this paper investigates the detection of another group of arrhythmias which might not need immediate attention but may be life threatening and can lead to fatal cardiovascular diseases. The Electrocardiogram plays an imperative role in diagnosing such cardiac diseases. But the recorded ECG is often contaminated by various noises and artifacts. Hence, for accurate diagnosis of heart disease and characterization of normal rhythm versus arrhythmic waves, the first step is to obtain a clean ECG. The proposed method de-noises the ECG signal using the wavelet transform and extracts dynamic features and morphological patterns of ECG arrhythmias. Numerical simulation of statistical parameters proves the de-noising capability of the proposed method. Keywords: Cardiac arrhythmia · ECG · Wavelet transform · Standard deviation · Mean square error
1 Introduction Cardiac arrhythmia, a type of cardiovascular disease (CVD), is a malfunctioning of the heart in which the heart does not pump blood regularly. Late detection and/or incorrect interpretation may prove fatal. Globally, 17.3 million people died of CVDs in 2008, which was about 30% of all deaths in the world. It is estimated that such casualties may grow further to 23.3 million by 2030 [1]. The Electrocardiogram (ECG) is the best non-invasive primary diagnostic method to measure all the electrical activity of the heart. By analyzing the morphological pattern of the ECG, abnormality in cardiac activity can be detected easily, as it directly reflects cardiac health. These measurements are an important tool in medical instrumentation, as recommended by the Association for the Advancement of Medical Instrumentation (AAMI). Figure 1 shows a normal ECG waveform. As shown in Fig. 1, the ECG constitutes a sequence of P, Q, R, S (QRS complex) and T waves, accompanied by many intervals and segments. Each wave represents the electrical activity of the heart in one cardiac cycle. In a wave, the first upward deflection, termed the P wave, represents contraction of the right and left atria. The QRS complex gives information about the conduction paths in the ventricles. The T wave reflects the ventricular relaxation state. But ECG signals may be contaminated with high-frequency power line interference noise (50 Hz)
Fig. 1. Schematic of normal ECG signal waveform
due to the ECG recording leads, and low-frequency baseline wander noise or baseline drift (0.5 Hz) caused by patient movement while the ECG is being examined. Removing such physiological and instrumentation content from the ECG is the main challenge in medical engineering [2, 3]. Filtering is the simplest noise removal technique, but it alters the original signal morphology and gives unsatisfactory results for non-stationary signals. In the last decades, wavelet transform based methods have been widely used for de-noising of signals. They can effectively reduce noise components while preserving the characteristics of the signal [4].
2 ECG Arrhythmia The ECG, measured by placing ECG electrodes at fixed body locations, reflects the dynamic and morphological features of life-threatening arrhythmias, as summarized in Table 1. Hence, from the table it can be seen that, due to any physiological disturbance, a significant deviation from the normal functioning of the heart is found, and it is well reflected in the ECG waveform. In addition, the ECG recording specifies the areas of the heart which are functioning abnormally [5–7].
3 Wavelet Transform The wavelet transform is a time-frequency representation of a non-stationary signal. The DWT of a signal x[n] of length L, which is iteratively decomposed through low-pass and high-pass filters, produces various wavelet coefficients at each decomposition level, named the approximation A[n] and detail D[n] wavelet coefficients. It is followed by a down-sampler of 2 to remove redundancy and discontinuities in higher derivatives of the signal. The mathematical representation of the DWT constitutes two orthogonal functions, (i) the time dilation wavelet basis function ψ(t) and (ii) the shifting or compression scaling function ϕ(t), using the following equations:

$$A[n] = \sum_{k=0}^{L-1} h(n)\,\varphi(2n - k) \qquad (1)$$
Table 1. Dynamic and morphological features of ECG arrhythmia

Types of arrhythmia | Electrical events | ECG features and morphological pattern (downloaded from PhysioNet)
Atrial fibrillation | Un-synchronized electrical activity; heart beat >150 bpm | Absence of P wave, irregular R-R interval, narrow QRS complex
Atrial flutter | Rapid atrial contraction/depolarization; regular heart beat 240–360 bpm | Aberrancy in QRS complex, absence of T wave, saw-tooth shape of ECG wave
Premature atrial contraction | Other regions of the atria depolarize; irregular heart beat | Premature and different P wave morphology, prolonged R-R interval
Premature ventricular contraction | Conduction originates outside the ventricles; no atrial activity | Widened QRS complex deflected upward, broad S-wave
Bundle branch blocks (RBBB, LBBB) | Block in the conduction path; slow and abnormal de-polarization and re-polarization of the right ventricle in RBBB and of the left ventricle in LBBB | Widened QRS complex (>120 ms) deflected downward, inverted T wave
$$D[n] = \sum_{k=0}^{L-1} g(n)\,\psi(2n - k) \qquad (2)$$
Here h(n) and g(n) are the transfer functions of the low-pass and high-pass filters, respectively. Wavelet transform based signal de-noising consists of four basic steps: (i) decomposing the signal at level N; (ii) choice of the wavelet family; (iii) shrinkage of the wavelet coefficients using thresholding, for which Donoho and Johnstone (1994) proposed the universal threshold for a signal of sample length N:

$$T = \sqrt{2\log N} \times \sigma \qquad (3)$$

where σ is the estimated noise level in each frequency sub-band after applying the DWT [8, 9]; and (iv) the inverse wavelet transform (IDWT) to re-construct the signal.
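As a small illustration of step (iii) (not the authors' MATLAB code), the universal threshold of Eq. (3) can be computed and applied to one detail sub-band with PyWavelets; the 'db4' wavelet, the decomposition level and the MAD-based noise estimate are assumptions.

```python
import numpy as np
import pywt

x = np.random.randn(3600)                         # stand-in for a noisy 10 s ECG record at 360 Hz
detail = pywt.wavedec(x, 'db4', level=4)[-1]      # finest detail sub-band D1
sigma = np.median(np.abs(detail)) / 0.6745        # assumed noise-level estimate (MAD)
T = sigma * np.sqrt(2 * np.log(len(x)))           # universal threshold, Eq. (3)
shrunk = pywt.threshold(detail, T, mode='soft')   # step (iii): coefficient shrinkage
```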
4 Proposed Methodology The wavelet transform based de-noising procedure uses truncation and shrinkage of selected wavelet coefficients. The block diagram of the proposed method is shown below (Fig. 2); its stages are: noisy ECG signal (MIT-BIH arrhythmia); DWT at level 9 with extraction and zeroing of the A9 coefficients; extraction and zeroing of the D3 coefficients; thresholding of the wavelet detail coefficients and reconstruction with A9; extraction of dynamic features and morphological patterns; and computation of statistical parameters.

Fig. 2. Proposed method for de-noising of ECG signal
5 Experimentation and Discussions This experiment used a large set of ECG beats based on lead MLII of 10 s ECG recordings from the MIT-BIH Arrhythmia Database, simulated in the MATLAB 1-D Wavelet GUI toolbox, as shown in the following figures (Figs. 3 and 4). The simulation results of the proposed method, obtained by extracting a single cycle from the noiseless ECG waveform to illustrate the dynamic and morphological features, are shown in Fig. 5. Table 2 shows the detected abnormal ECG beats of clinical significance through their characteristics. Performance evaluation of the proposed method based on various statistical parameters, such as the standard deviation, mean square error and noise level (σ noise-variance) of the wavelet coefficients, is depicted in the table following (Table 3). The small values of standard deviation, mean square error and noise variance show the de-noising capability of the proposed method to a satisfactory level.
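A minimal Python sketch of the de-noising pipeline described above (level-9 DWT, zeroing of the A9 and D3 coefficients, thresholding of the remaining detail coefficients, reconstruction, and computation of simple statistics in the spirit of Table 3) is given below. It only approximates the MATLAB workflow; the wavelet family ('db6'), the use of the universal threshold from Eq. (3) for every retained detail band, and the synthetic record are assumptions.

```python
import numpy as np
import pywt

def denoise_ecg(record, wavelet='db6'):
    coeffs = pywt.wavedec(record, wavelet, level=9)       # [A9, D9, D8, ..., D1]
    coeffs[0] = np.zeros_like(coeffs[0])                  # zero the A9 (baseline) band
    coeffs[-3] = np.zeros_like(coeffs[-3])                # zero the D3 band
    for i in range(1, len(coeffs)):                       # threshold remaining detail bands
        if i == len(coeffs) - 3:
            continue
        sigma = np.median(np.abs(coeffs[i])) / 0.6745
        T = sigma * np.sqrt(2 * np.log(len(record)))
        coeffs[i] = pywt.threshold(coeffs[i], T, mode='soft')
    return pywt.waverec(coeffs, wavelet)[:len(record)]

# Stand-in for a 30 s MIT-BIH excerpt at 360 Hz (long enough for 9 decomposition levels).
noisy = np.random.randn(10800)
clean = denoise_ecg(noisy)
mse = np.mean((noisy - clean) ** 2)        # rough statistics analogous to Table 3
print(round(np.std(clean), 4), round(mse, 4))
```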
Fig. 3. Original simulated noise contaminated ECG signal #200
Fig. 4. De-noised ECG signal
6 Conclusions This paper designed a wavelet transform based algorithm to remove the noise content in the signal and to extract statistical features from it. Since the wavelet transform localizes small variations of the signal in the time-frequency domain, characterization of the ECG signal in terms of dynamic and morphological features is possible, through which various ECG arrhythmias are detected.
Fig. 5. Extracted various ECG beats: (a) #100, (b) #107, (c) #109, (d) #205, (e) #232, (f) #118, (g) #101, (h) #210, (i) #212, (j) #208, (k) #230, (l) #111, (m) #222, (n) #200, (o) #106
Table 2. Detected arrhythmia morphological features after applying DWT

Records | Dynamic and morphological features | Type of arrhythmia
100 | Normal P, QRS, T waves and segments | Normal sinus rhythm
107 | Large Q wave | PVC
109 | Large P wave; inverted, wider QRS complex | LBBB
205 | Absence of P wave; wider QRS complex; deformed S-T segment | Atrial fibrillation
232 | Large T wave; elevation of ST segment | PVC
118 | Absence of P wave; deformed QRS complex | Atrial fibrillation
101 | Small P wave; larger Q wave as compared to S wave | PAC
210 | Large PQ interval; inverted Q wave; deformed ST segment | RBBB
212 | Deformed ST segment; large T wave | RBBB
208 | Bizarre QRS complex; absence of T wave | Atrial flutter
230 | Small Q wave; prolonged ST | Atrial fibrillation
111 | Deformed/wider QRS complex; narrow S wave; large T wave | PVC
222 | Large P wave; inverted Q wave; missing ST segment | RBBB
200 | Large P wave; inverted QRS complex; large S wave; absence of T wave | LBBB
106 | Narrow QRS complex; large T wave | PVC
Table 3. Performance measuring parameters

Records | Standard deviation | σ noise-variance | MSE
100 | 0.212 | 0.0671 | 0.0508
101 | 0.117 | 0.0641 | 0.0409
107 | 0.1443 | 0.0638 | 0.0749
109 | 0.0231 | 0.0080 | 0.0351
205 | 0.1641 | 0.0487 | 0.0834
232 | 0.0979 | 0.0340 | 0.0560
118 | 0.3282 | 0.1332 | 0.1991
210 | 0.228 | 0.0708 | 0.1252
212 | 0.1221 | 0.0237 | 0.0530
208 | 0.0977 | 0.0151 | 0.0394
230 | 0.3378 | 0.1191 | 0.1929
111 | 0.1825 | 0.0816 | 0.1217
222 | 0.1085 | 0.0570 | 0.0668
200 | 0.4371 | 0.1408 | 0.2555
106 | 0.0835 | 0.0094 | 0.0386
References 1. N.H. Kamaruddin, M. Murugappan, M.I. Omar, Early prediction of cardiovascular diseases using ecg signal: review, in IEEE Student Conference on Research and Development (Malaysia), pp. 48–53 (2012)
2. S. Alomari, M. Shujauddin, V. Emamian, EKG signals—de-noising and features extraction. Am. J. Biomed. Eng. 6(6), 180–201 (2016) 3. C. Sawant, H.T. Patil, Wavelet based ECG signal de-noising, in First International Conference on Networks and Soft Computing (Guntur, India), pp. 20–24 (2014) 4. S. Luz et al., ECG-based heartbeat classification for arrhythmia detection: a survey. Comput. Methods Programs Biomed. 127, 144–164 (2014) 5. B. Subramanian, ECG signal classification and parameter estimation using multi-wavelet transform. Biomed. Res. 28(7), 3187–3193 (2017) 6. A.A. John et al., Evaluation of cardiac signals using discrete wavelet transform with MATLAB graphical user interface. Indian Heart J. 549–551 (2015) 7. P.S. Addison, Wavelet transforms and the ECG: a review. Physiol. Meas. 26, 155–199 (2005) 8. V.M. Kaviri, De-noising of ECG signals by design of an optimized wavelet. Circ. Syst. 3746–3755 (2016) 9. A. Mohsin, O. Faust, Automated characterization of cardiovascular diseases using wavelet transform features extracted from ECG signals. J. Mech. Med. Biol. 19(01), 1940009 (2019)
A New Approach for Fire Pixel Detection in Building Environment Using Vision Sensor P. Sridhar, Senthil Kumar Thangavel(B) , and Latha Parameswaran Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India [email protected], [email protected], [email protected]
Abstract. Computer vision based approaches are a significant area of research for detecting and segmenting anomalies in a building environment. Vision sensor approaches enable automation in detection and localization. Existing fire detection frameworks have overcome the constraints of conventional approaches such as threshold limits, environmental pollution, proximity to fire, etc. In this paper we propose a new method for fire detection in smart buildings with a vision sensor that is inspired by computer vision approaches. The proposed method identifies fire pixels using three steps: the first step is based on a Gaussian probability distribution; the second uses a hybrid background subtraction method; and the third is based on temporal variation. These three steps are essentially used to address distinct challenges such as Gaussian noise in frames and different resolutions of videos. Experimental results show good detection accuracy for video frames under various illuminations, and the method is robust to noise like smoke. Keywords: Gaussian probability distribution · Moving region · Luminance mapping
1 Introduction Fire is an event which emits light, heat and flame. The colour of a flame can vary based on the material burnt and the temperature, varying from white to red. The timeline of a fire spreading in a home or building and its effects, tabulated in Table 1, show that fire destroys human life and possessions within a short duration. The National Crime Records Bureau (NCRB) has filed a statistical report of fire deaths in India during 2015; according to that analysis, 42.1% of all fire deaths occurred in residential buildings [7]. Hence, fire detection systems are essential for residential as well as commercial buildings. Even though fire safety measures are important, early warning greatly reduces the fire spread. In the 1930s, the Swiss physicist Walter Jaeger accidentally found a transducer which was able to detect smoke; this led to the development of modern smoke sensors [1–5]. With conventional methods, a large number of sensors have to be deployed in order to cover larger areas; such fire detection systems may give unreliable data and have a reliability of around 80% [6].
Table 1. Fire and its effects in buildings [1]

Timeline (hr:min) | Effects
0:30 | The fire begins to grow rapidly
1:04 | The initial flame diffuses in the room and the smoke has been spreading
2:30–3:00 | The temperature of the room where the fire started goes above 400 °F, and the smoke begins to flow and spread through the other rooms of the house
3:20 | Rescuing those in the home will be pretty challenging
4:33 | The fire will have spread to the outside of the house; rescue of those in the house is not possible
2 Related Works Many researchers have developed algorithms and techniques for fire detection. A summary of the major contributions is presented here. Chen et al. [7] proposed chromatic and dynamic analysis of images for early fire detection in video sequences. The authors indicated that low-temperature fire region pixels show a spectrum in a certain range and that the flame tone changes with increasing temperature. Unfortunately, a few fire-like object colors are not clearly distinguished. Khatami et al. [4] proposed a color space based approach for flame detection in videos of fire incidents. Their approach searched the color space for flames by multiplication of a linear weight matrix with a sample image. The appropriate search space was found by Particle Swarm Optimization (PSO) with the support of K-medoids. PSO computes iteratively to obtain the optimum weight values of the conversion matrix. The obtained weight matrix is then used for flame pixel detection in an image without iterative computation. Empirical results of this approach show better detection accuracy and true positive rate over conventional and Video Fire Detection (VFD) approaches. Even though this method provides fast detection and high detection accuracy, high similarity between fire and fire-coloured regions caused detection failures. Horng et al. [8] proposed a live fire detection method using the HSI space. This method provided high detection accuracy, albeit the false positive rate was still not reduced. Celik et al. [9] proposed an algorithm for identification of fire pixels using two methods. Rules were set by visually verifying the fire regions in the training images. Even though this color model gives high accuracy, the false alarm rate is high.
3 The Proposed Fire Detection Framework Based on the characteristics of fire, a three-step method for fire region segmentation has been proposed. The first step is probability based fire pixel detection. The second step detects fire based on a modified hybrid background subtraction, and the third step is luminance mapping of frames, which boosts the magnitude of the fire region. The architecture of the proposed model is shown in Fig. 1. This approach contains a set of methods which are detailed in the following subsections.
Fig. 1. Architecture of the proposed technique
3.1 Fire Pixel Detection by Gaussian Probability Distribution Chen et al. [7] use a red plane threshold, which is vital in flame pixel detection and for saturation threshold computation. Even though this approach needs some thresholds for fire pixel detection, it cannot handle all the varying illumination of fire environments or a change of burning material [10–14]. Hence, in the proposed method fire pixel candidates are detected using an RGB probability distribution. Each channel uses a pixel distribution independently. This distribution is fitted with a Gaussian probability distribution for each channel independently, estimated as:

$$p_c = \frac{1}{\sqrt{2\pi}\,\sigma_c}\exp\!\left(-\frac{(I_c(x, y) - \mu_c)^2}{2\sigma_c^2}\right), \quad c \in \{R, G, B\} \qquad (1)$$

where I_c(x, y) indicates the pixel intensity of channel c in a frame, μ_c is the mean value of the color space and σ_c is the standard deviation of the fire region of the normal distribution. A hand segmentation technique has been applied to 120 RGB fire images, and 1,20,269 fire pixels were obtained for each channel. The global probability threshold value of a fire pixel is computed as the product of the individual channel probabilities. The threshold
value is set as 2.469 × 10⁻⁶.

$$p(I(x, y)) = \prod_{c \in \{R,G,B\}} p_c(I(x, y)); \quad \text{if } p(I(x, y)) > \tau: \text{fire pixel, else non-fire pixel} \qquad (2)$$
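A small NumPy sketch of this test is given below (not the authors' code); the per-channel means, standard deviations and the frame are placeholder values, and τ is the 2.469 × 10⁻⁶ threshold quoted above.

```python
import numpy as np

# Placeholder per-channel fire statistics (mu, sigma) learned from hand-segmented pixels.
mu = np.array([220.0, 150.0, 60.0])
sigma = np.array([30.0, 40.0, 35.0])
tau = 2.469e-6

frame = np.random.randint(0, 256, (256, 256, 3)).astype(float)   # stand-in RGB frame

# Eq. (1): per-channel Gaussian likelihoods; Eq. (2): their product thresholded by tau.
p_c = np.exp(-((frame - mu) ** 2) / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
fire_mask = p_c.prod(axis=-1) > tau
```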
3.2 Hybrid Background Subtraction for Moving Pixel Detection The motion of fire depends on the burning material used and the airflow. The background subtraction method is based on the differences between two adjacent frames of the video, which define the threshold; this method is not applicable in real time. In order to avoid aperture and ghosting effects, we need a hybrid background subtraction method for learning a background. This is shown in Eq. (3):

$$\mathrm{Background}[x,y]_{n+1} = \begin{cases} I[x,y]_{n+1}, & \text{if frame no.} = 1 \\ (1-\alpha)\, I[x,y]_{n+1} - \alpha\, I[x,y]_{n}, & \text{otherwise} \end{cases}$$

$$\text{Moving pixel detection} = \begin{cases} \text{fire pixel}, & \text{if } I_{n+1}[x,y] - \mathrm{uint8}(\mathrm{Background}[x,y]_{n+1}) > \tau \\ \text{non-fire pixel}, & \text{otherwise} \end{cases} \qquad (3)$$
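A rough NumPy sketch of Eq. (3) as printed (not the authors' code) follows; α = 0.8 and τ = 2 are taken from the next paragraph, the frame arrays are placeholders, and the clipping before the uint8 cast is an added safety assumption.

```python
import numpy as np

alpha, tau = 0.8, 2.0

def hybrid_background(curr, prev, frame_no):
    # Eq. (3) as printed: the first frame initialises the background,
    # later frames use the weighted combination of the two latest frames.
    if frame_no == 1:
        return curr.astype(float)
    return (1 - alpha) * curr - alpha * prev

prev = np.random.randint(0, 256, (256, 256)).astype(float)   # placeholder grayscale frames
curr = np.random.randint(0, 256, (256, 256)).astype(float)

bg = hybrid_background(curr, prev, frame_no=2)
bg_u8 = np.clip(bg, 0, 255).astype(np.uint8)                  # uint8(Background) in Eq. (3)
moving_mask = (curr - bg_u8) > tau                            # candidate moving fire pixels
```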
In Eq. (3), the alpha value varies between 0 and 1. In the proposed approach, the alpha value is chosen as 0.8 and the threshold value is set to vary from 2 to 4. These values are determined empirically for the test video dataset. It has been observed that the best detection result is obtained when the threshold value is set to 2.

3.3 Temporal-Luminance Filter for Removal of Non-flame Pixels Mapping the luminance of the fire region and calculating its standard deviation are important for identifying the fire region based on motion. To make the luminance feature map (L̃) of a frame, we use two filters of different sizes, 7 × 7 and 13 × 13. Before applying these two filters, the image is down-sampled by 2. The function of the filter is to replace the center pixel with the absolute difference between the center pixel and the nearby pixels of the window of size w. The output of the filter is known as the luminance mapping. In Eq. (4), L̃ is the luminance value, which is obtained by the summation of the two different-size filter outputs. This feature map is up-sampled by 2 and then a Gaussian filter is applied for elimination of noise. The luminance map output is shown in Fig. 2.

$$\tilde{L} = \frac{1}{2}\left(\sum_{w \in \{7\times 7,\, 13\times 13\}} L \otimes w\right) \qquad (4)$$
Finally, the luminance feature map L̃ is generated, and then the luminance variance at pixel location (x, y) over ten consecutive frames is calculated. The luminance variance is calculated using Eqs. (5) and (6). Based on the analysis of flame characteristics, the flicker frequency has been set to 10 Hz, independent of the characteristics of the burning material. It has been observed that if we consider more than 10 consecutive frames, the method is insensitive to noise but not applicable for real-time flame detection. Thus, a temporal window size of 10 for
Fig. 2. a 6791th frame of NIST dataset b luminance map c 690th frame of our dataset d luminance map
measurement of the luminance mapping variance has been set as an important parameter. In this work, the flame pixel variance is high as a result of the disorder characteristics of consecutive frames. Based on checking various video datasets, the variance threshold is set to L_τ, which identifies the fire pixels. The typical luminance threshold value is 0.008, which was set based on testing 20 videos. To find the mean and variance of the feature map image, the following equations are used:

$$\mu_{x,y}L = \frac{1}{\sum_{r=1}^{N} H_{xy}L(r)} \sum_{r=1}^{N} r\, H_{xy}L(r) \qquad (5)$$
where r is the vector which indicates the different intensity values of the 10 consecutive luminance images, H_{xy}L is the pixel-wise histogram of the 10 consecutive luminance images, and L is the luminance map.

$$\sigma_{x,y}L = \frac{1}{\sum_{r=1}^{N} H_{x,y}L(r)} \sum_{r=1}^{N} \left(r - \mu_{x,y}L\right)^2 H_{x,y}L(r); \quad \text{if } \sigma_{x,y}L > L_\tau: \text{fire pixel candidate, else non-fire pixel candidate} \qquad (6)$$
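A compact NumPy sketch of this temporal check (not the authors' code) is shown below: it computes the per-pixel variance of 10 consecutive luminance maps directly rather than through the histogram form of Eqs. (5)–(6), and the 0.008 threshold is the value quoted above while the frame stack is a placeholder.

```python
import numpy as np

L_tau = 0.008                              # luminance variance threshold from the paper
frames = np.random.rand(10, 128, 128)      # placeholder stack of 10 luminance maps in [0, 1]

variance = frames.var(axis=0)              # per-pixel variance over the temporal window
fire_candidates = variance > L_tau         # flickering (high-variance) pixels
print(fire_candidates.sum(), 'candidate fire pixels')
```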
The fire segmentation frames produced by the proposed work are shown in Fig. 3 and Table 2. The combination of all the steps produces the resultant frame, which makes it easy to distinguish the fire region. We apply several real-time challenges to the datasets, such as Gaussian noise, flipping the frame, fire-like objects, changing illumination and different resolutions. Figure 3 and Table 2 show that the proposed work provides reliable fire region segmentation under these various challenges. The test videos were collected from the Smart Space Lab, NIST, YouTube, Visifire and mivia. Test videos 1, 2, 5 and 6 are indoor environments. The fire colors in the test videos are yellow, orange, red and white. For validation, 100 test frames were taken from each test video at different time samples.
4 Results and Discussion This method has been used for detection of flame in real time with slowly spreading fire. The frame rate has been set to 3–10 frames/sec for experimentation. In order to evaluate the algorithm's performance, we have used standard and challenging video datasets such
Fig. 3. Fire pixel detection under different challenges applied to the input frames (flipping the frame, added Gaussian noise with variance 0.001, fire-like objects, night vision). Column (a): input frame. Column (b): fire pixels found using the Gaussian pdf. Column (c): moving fire pixels identified with the hybrid background subtraction algorithm. Column (d): fire pixels identified using the large-variance property of 10 consecutive luminance map frames. Column (e): logical AND operation of (b)–(d) and the resultant image
Table 2. Test frames of different resolutions and their corresponding outputs (output frames shown in the original figure)

S. No | Resolution of input frame
1 | 492 × 360
2 | 512 × 288
3 | 640 × 360
4 | 1280 × 720
as NIST, Visifire, mivia, Smart Space Lab (our dataset) and YouTube videos. In each video we used 100 frames at different timings. The test dataset can be seen in Table 3. The performance of the proposed approach has been compared with Khatami et al.'s method [4]. The proposed fire detection approach and Khatami et al.'s method [4] have been tested and compared on the same set of test videos, which have different challenges such as
frame flipping, Gaussian noise, fire-like objects, different illumination and different resolutions. Experimental results of the proposed work, and its comparison with Khatami et al.'s method [4] on the test videos, are tabulated in Table 4. This method gives an acceptable level of fire segmentation, with an overall detection accuracy of 91.83% and a true positive rate of 92.77%.

Table 3. Test videos description

Testing videos | Test frames | Description
Test video (1–6) | 100 (each) | Fire-accident set-up (Smart Space Lab); night club fire (dataset from NIST); Christmas tree fire in living room; forest fire (dataset from Visifire); forest fire (dataset from mivia)
Table 4. Detection accuracies of test videos

Testing videos | Amin Khatami method [4]: Accuracy | TPR | FPR | Proposed method: Accuracy | TPR | FPR
Test video 1 | 88 | 88.89 | 20 | 96 | 96.67 | 10
Test video 2 | 86 | 91.11 | 60 | 99 | 98.89 | 0
Test video 3 | 81 | 82.22 | 30 | 93 | 95.56 | 30
Test video 4 | 84 | 84.44 | 20 | 93 | 94.44 | 20
Test video 5 | 85 | 86.67 | 30 | 86 | 86.67 | 20
Test video 6 | 84 | 84.44 | 20 | 84 | 84.44 | 20
Average | 84.66 | 85.96 | 30 | 91.83 | 92.77 | 16.66
The proposed work combines colour model probability, background subtraction and temporal analysis, which helps to reduce the false positive rate. Results have been compared with Khatami et al.'s [4] approach since, based on the literature survey done, that method has the highest detection accuracy and TPR.
5 Conclusion In this proposed approach, we have developed a new method for fire detection in images or videos. In this work, the colour probability of a fire pixel is determined by a Gaussian probability density model. The hybrid background method updates the background of the current frame, which isolates the fast-moving fire object. The temporal luminance method uses a luminance filter to boost the magnitude of the fire region, and a variance threshold provides reliable fire pixel detection from a temporal perspective. This approach has been tested on various challenging video datasets and turned out to reliably detect fire regions with background noise in an image or video. Experimental results show high detection accuracy and a high true positive rate. In this work, the temporal luminance method takes a higher computation time, and hence future work can focus on improving the computation time for a real-time fire detection system. Acknowledgements. This proposed work is a part of the project supported by DST (DST/TWF Division/AFW for EM/C/2017/121) project titled “A framework for event modeling and detection for Smart Buildings using Vision Systems”.
References 1. S.G. Kong, D. Jin, S. Li, H. Kim, Fast fire flame detection in surveillance video using logistic regression and temporal smoothing. Fire Saf. J. 79, 37–43 (2016) 2. T. Celik, H. Demirel, Fire detection in video sequences using a generic color model. Fire Saf. J. 44, 147–158 (2009) 3. A.E. Cetin, K. Dimitropoulos, B. Gouverneur, et al., Video fire detection-review. Digit. Signal Process. 23(6), 1827–1843 (2013) 4. A. Khatami, S. Mirghasemi, A. Khosravi, C.P. Lim, S. Nahavandi, A new PSO-based approach to fire flame detection using K-medoids clustering. Experts Syst. Appl. 68, 69–80 (2017) 5. R.A. Sowah, A.R. Ofoli, S. Krakani, S. Fiawoo, Hardware design and web-based communication modules of a real-time multi-sensor fire detection and notification system using fuzzy logic. IEEE Trans. Ind. Appl. 53(1) (Jan/Feb, 2017) 6. K. Muhammad, J. Ahmad, Z. Lv, P. Bellavista, P. Yang, S.W. Baik, Efficient deep CNNbased fire detection and localization in video surveillance applications. IEEE Trans. Syst. Man Cybern. Syst 99, 1–16 (2018) 7. T.H. Chen, P.H. Wu, Y. Chiou, An early fire-detection method based on image processing, in 2004 International Conference on Image Processing, 2004. ICIP ‘04, (Singapore), vol. 3, pp. 1707–1710 (2004) 8. W.B. Horng, J.W. Peng, C.Y. Chen, A new image-based real-time flame detection method using color analysis, in Proceedings. 2005 I-EEE Networking, Sensing and Control, pp. 100– 105. IEEE (2005) 9. Turgay Celik, Fast and efficient method for fire detection using image processing. ETRI J. 32(6), 881–890 (2010) 10. S. Sruthy, L. Parameswaran, A.P. Sasi, Image fusion technique using DT-CWT, in 2013 International Multi-Conference on Automation, Computing, Communication, Control and Compressed Sensing (iMac4s), pp. 160–164. IEEE (2013) 11. T. Kumar, S.N. Senthil, G.P. Sivanandam, Akhila, Detection of car in video using soft computing techniques, in Global Trends in Information Systems and Software Applications (Springer, Berlin, Heidelberg), pp. 556–565 (2012)
12. P. Sridhar, R.R. Sathiya, Crypto-watermarking for secure and robust transmission of multispectral images, in 2017 International Conference on Computation of Power, Energy Information and Commuincation (ICCPEIC). IEEE (2017) 13. P. Sridhar, R.R. Sathiya, Noise standard deviation estimation for additive white gaussian noise corrupted images using SVD domain. Int. J. Innovative Technol. Explor. Eng. 8(11), 424–431 (2019) 14. R.R. Sathiya, Content ranking using semantic word comparison and structural string matching. Int. J. Appl. Eng. Res. 10(11), 28555–28560 (2015)
Innovative Practices
Computer Assisted Classification Framework for Detection of Acute Myeloid Leukemia in Peripheral Blood Smear Images S. Alagu(B) and K. Bhoopathy Bagan Anna University, Chennai, Tamil Nadu, India [email protected], [email protected]
Abstract. Acute Myeloid Leukemia (AML) affects myeloid cells in human blood. The manual detection of AML cells in peripheral blood smear images is a difficult task and also needs more time. A computer assisted classification framework for AML cells is proposed in this work. AML causes changes in the nucleus of myeloid cells. The microscopic images of myeloid cells are obtained from an online database. Pre-processing and segmentation are performed to obtain the nucleus and cell masks. The shape of the nucleus is irregular and its texture is also changed when a person is affected by acute myeloid leukemia. Separation of nucleus and cytoplasm is achieved through the k-means clustering algorithm. The morphological feature vectors are used as input for the classifiers. Classification is performed using different machine learning classifiers, and it is found that the Random Forest classifier gives an accuracy of 95.89%. Keywords: Acute myeloid leukemia (AML) · Blood smear images · K-means clustering and random forest (RF) classifier
1 Introduction
Leukemia is a cancer of the white blood cells of human beings. Acute and chronic leukemia are the basic types of leukemia, distinguished by how fast the cancer cells grow. Lymphocytic and myeloblastic leukemia are cancers of lymphoid and myeloid cells, respectively. The detection and classification of cancer in myeloid cells is discussed in this paper.
Overproduction of immature white blood cells is a major cause of AML. Immature myeloid cells in the bloodstream spread through the human body and affect vital organs. According to the reports of the National Cancer Institute of the USA, the estimated number of new AML cases in 2018 alone was about 0.02 million, with an estimated 10,670 deaths; in India, there are about 75,000 new cases per year. Detection of AML cells at an early stage reduces the death rate considerably. In the proposed work, machine learning classifiers are utilized for the detection of AML.
The organization of the paper is as follows. Section 1 introduces acute myeloid leukemia. Section 2 describes previous related work in the proposed area. Section 3 focuses on the working principle of the proposed system. The results and discussion of the proposed approach are presented in Sect. 4. Section 5 concludes the proposed work, and the key references are listed at the end of the paper.
2 Literature Review
Segmentation of leukocytes was the focus of Cao et al. [1], who proposed a concave–convex iterative algorithm for leukocyte segmentation. A stepwise averaging method was utilized for nucleus separation, and the fuzzy divergence value was minimized over interval-valued fuzzy sets to separate the cytoplasm of the leukocyte. Rawat and Singh [2] discussed an algorithm for the detection of ALL and AML cells; colour, texture and morphological features were analysed, and a support vector machine classifier with a radial basis function was adopted for classification. A computer-aided method for recognizing cancer cells in microscopic images was proposed in [3]: region properties were used for detecting cancerous tissue, the green plane of the image was recommended for leukemia cell segmentation, and vector quantization through the Linde–Buzo–Gray (LBG) algorithm was used for better accuracy. Detection of AML, CML, CLL and ALL cells using the watershed algorithm and a multiclass SVM was discussed in detail by Jagadev and Virani [4]. In [5], Hausdorff dimension and local binary pattern techniques were utilized for detecting leukemia cells, and a classification accuracy of 98% was obtained. Segmentation of the nucleus from blood smear images through thresholding and a gradient vector method was carried out in [6]; the leukemia detection accuracy was 82.9%. Viswanathan [7] adopted contrast enhancement techniques for nucleus separation of AML cells, with morphological contour segmentation and fuzzy c-means clustering optimization used for leukemia cell detection. Liu et al. [8] focused on the segmentation of WBCs; a Grab Cut algorithm based on dilation was applied iteratively to obtain more precise results.
3 Proposed Method
A simple and effective approach for AML cell detection is proposed in this paper. Microscopic images of AML cells and healthy white blood cells are collected from the online database of Isfahan University. The proposed method can be used as a second reference for pathologists. The block diagram of the work is shown in Fig. 1, and the methodology is discussed below.
Fig. 1. System architecture
Pre-processing
Pre-processing is the first step of any image processing pipeline. The microscopic images of various dimensions are converted into a uniform size of 256 × 256. The input images are in either .jpg or .tif file format. The blood smear image of a blood cell and its background vary greatly in colour and intensity due to camera settings, illumination and aging stain, so an adaptive colour space conversion procedure is carried out to overcome these variations. The RGB colour image is translated into the L*a*b colour space and the CMYK colour space, where L* denotes the lightness from black (0) to white (100),
a* ranges from green (−) to red (+), and b* from blue (−) to yellow (+), as defined by the International Commission on Illumination (CIE). The key reason for this conversion is that the nucleus of a white blood cell can be discriminated very well in the L*a*b colour space. Pre-processed images are given as input to the segmentation process.
Image Segmentation
The cell and nucleus masks are extracted by the k-means clustering algorithm. Cluster heads (centroids) are assumed initially, and each pixel [9–11] in the image is assigned to a cluster around a centroid. The Euclidean distance between each pixel and each centroid is calculated, and pixels with the smallest distance to a centroid form the same cluster. New cluster heads are then computed, and the procedure is repeated until the segmentation converges.
Feature Extraction
Prominent features are used to classify blast cells and normal cells. In the proposed work, regional feature vectors of cell area, nucleus area, orientation, form factor, eccentricity, perimeter and solidity are extracted.
Classification
Classification plays a vital role in this work. Linear discriminant analysis (LDA), logistic regression (LR), k-nearest neighbors (KNN), support vector machine (SVM), classification and regression trees (CART) and the random forest classifier are used to detect AML cells. The concepts behind these classifiers are discussed in this section.
Linear discriminant analysis works on the principle of projecting higher-dimensional features onto a lower-dimensional space. The aim is to maximize the between-class variance and to minimize the sample distance within each class. The linear discriminant model δk(x) is

δk(x) = x·μk/σ² − μk²/(2σ²) + log(πk)    (1)

where μk is the class mean, σ is the standard deviation from the mean and σ² is the variance.
Logistic regression is used to perform predictive analysis on a binary dependent variable. Assumptions on the dependent variable η and the selection of the model are important for logistic regression. The logistic function is

Logistic(η) = 1/(1 + exp(−η))    (2)

The k-nearest neighbors (KNN) algorithm works on the basis of similarity measured by distance functions; classification is performed based on the majority vote of the neighbours, and the selection of the value of K is important for classification accuracy. The support vector machine (SVM) is a supervised machine learning classifier that finds an optimal hyperplane separating the given data. The kernel trick is adopted as shown in Eq. 3, and complex regions within the feature space are created using a radial kernel:

K(x, xi) = (1 + sum(x ∗ xi))^d    (3)

where x is the input, xi is a support vector and gamma is the kernel parameter.
Classification and regression trees (CART) is a supervised learning algorithm based on decision trees and is well suited to mutually exclusive binary classification. Finally, the random forest classifier is adopted in the proposed work. The random forest algorithm picks subsets of features at random, whereas a single decision tree considers features one by one, so random forest allows different sets of features to take part in decisions. Bagging of trees reduces the correlation among the trees, thereby reducing the predictive error. Higher accuracy is obtained, and over-fitting is avoided due to the random feature selection.
Performance Evaluation
The proposed classification framework is evaluated by calculating different metrics from the confusion matrix. The elements of the confusion matrix are true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). The formulas for calculating the performance metrics are shown in Table 1.
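These metrics follow directly from the confusion-matrix counts. A minimal sketch in Python is given below; the function and variable names and the example counts are illustrative assumptions, not the authors' code, and the corresponding formulas are collected in Table 1.

```python
def performance_metrics(tp, tn, fp, fn):
    """Compute the metrics of Table 1 from confusion-matrix counts."""
    accuracy    = (tp + tn) / (tp + fp + fn + tn)
    sensitivity = tp / (tp + fn)          # also called recall
    specificity = tn / (fp + tn)
    precision   = tp / (tp + fp)
    f_measure   = 2 * tp / (2 * tp + fp + fn)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision,
            "f-measure": f_measure}

# Example with counts from a hypothetical confusion matrix
print(performance_metrics(tp=48, tn=45, fp=3, fn=2))
```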
Table 1. Calculation of performance metrics

Metric      | Formula
Accuracy    | (TP + TN)/(TP + FP + FN + TN)
Sensitivity | TP/(TP + FN)
Specificity | TN/(FP + TN)
Precision   | TP/(TP + FP)
F-measure   | 2·TP/(2·TP + FP + FN)

4 Results and Discussion
Image Acquisition
Two hundred and sixty healthy and AML blood smear images of single cells are obtained from the database provided by Isfahan University, Research Center of Medical Sciences, Medical Image and Signal Processing, as shown in Table 2.

Table 2. AML image database detail

Blood cell type | Healthy | Blast | Total
Images          | 130     | 130   | 260
Figure 2 shows images of a normal cell and an AML-affected cell, which differ in nucleus shape. The nucleus is surrounded by cytoplasm. In healthy blood, the nucleus is a nearly perfect circular region, whereas its shape becomes irregular and its texture changes when a person is affected by acute myeloid leukemia.
Fig. 2. Blood smear images: a healthy blood cell and b AML cell
Pre-processing
The raw images of various sizes were resized to a uniform size of 256 × 256. The images were converted into the L*a*b and CMYK colour spaces to enable segmentation of nucleus and cytoplasm. The proposed work is suitable only for single-cell images; in future, it will be extended to microscopic blood smear images with multiple cells by incorporating deep learning algorithms.
Image Segmentation
Segmentation is performed by the k-means clustering algorithm as discussed in Sect. 3. The Euclidean distance is calculated between the centroid and the individual pixels in order to form the clusters. The segmented results of the nucleus mask and cell mask are shown in Fig. 3.
Fig. 3. Segmentation of nucleus and cell mask
The nucleus mask and cell mask of both healthy and acute myeloid leukemia cells are shown in Table 3, and the variations in their shapes are noted to identify the AML cell.
Table 3. Segmentation of cell mask and nucleus mask (input image, cell mask and nucleus mask for healthy and AML cells)
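A minimal sketch of the k-means-based segmentation and the regional feature extraction described above is given below. The library choices (scikit-image, scikit-learn), the number of clusters and the rule used to pick the nucleus cluster are assumptions for illustration, not the authors' exact settings.

```python
import numpy as np
from skimage import io, color, transform, measure
from sklearn.cluster import KMeans

def segment_and_describe(path, n_clusters=3):
    # Pre-processing: resize to 256 x 256 and convert RGB -> L*a*b
    rgb = transform.resize(io.imread(path)[..., :3], (256, 256, 3), anti_aliasing=True)
    lab = color.rgb2lab(rgb)
    pixels = lab.reshape(-1, 3)

    # k-means clustering on the a*/b* chromaticity channels
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(pixels[:, 1:])
    labels = km.labels_.reshape(256, 256)

    # Assumption: the nucleus cluster is the one with the lowest mean lightness L*
    mean_L = [lab[..., 0][labels == k].mean() for k in range(n_clusters)]
    nucleus_mask = labels == int(np.argmin(mean_L))

    # Regional (morphological) features of the largest nucleus region
    regions = measure.regionprops(measure.label(nucleus_mask))
    r = max(regions, key=lambda reg: reg.area)
    form_factor = 4 * np.pi * r.area / (r.perimeter ** 2)
    return [r.area, r.perimeter, r.eccentricity, r.solidity, r.orientation, form_factor]
```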
Feature Extraction and Classification
Ten different morphological features are extracted from the segmented masks. A set of 260 blood smear images (130 normal, 130 blast) is taken as the training set, and 50 images are reserved as the test set. Classification is performed by various machine learning classifiers, and a confusion matrix is computed for every classifier. Performance metrics of the proposed system are calculated from the confusion matrix using the formulas shown in Table 1. The evaluated metrics are tabulated in Table 4.

Table 4. Performance metrics of different classifiers

S. No | Classifier                                  | Precision | Recall | F-score
1.    | Linear discriminant analysis (LDA)          | 0.97      | 0.66   | 0.79
2.    | Logistic regression (LR)                    | 0.86      | 0.77   | 0.81
3.    | K-nearest neighbors (KNN)                   | 0.79      | 0.98   | 0.87
4.    | Support vector machines (SVM)               | 0.81      | 1.00   | 0.89
5.    | Classification and regression trees (CART)  | 0.91      | 0.97   | 0.94
6.    | Random forest (RF)                          | 0.95      | 0.96   | 0.96

From Table 4, it is observed that LDA has a high precision value but a poor recall value. SVM has the best recall score among the classifiers but poor precision, and KNN also has a lower precision value. The CART algorithm gives better performance, but its accuracy is not the highest. Comparing all the performance measures, the random forest classifier gives the best performance with improved accuracy. Detection of acute myeloid leukemia is mainly concerned with accuracy. The accuracy of all the classifiers used in the work is drawn as a bar chart in Fig. 4. The random forest classifier gives a better accuracy of about 95.89%. Bagging of decision trees, pruning and prediction based on the maximum-voting concept lead to the better accuracy of the random forest classifier.
Fig. 4. Performance comparison of different classifiers
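The classifier comparison reported above can be reproduced in outline with scikit-learn. The sketch below assumes the ten morphological features have already been assembled into a feature matrix X with binary labels y (healthy = 0, blast = 1); the hyper-parameters shown are defaults rather than the authors' tuned values, and cross-validated accuracy is reported for simplicity instead of the paper's fixed test split.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

models = {
    "LDA":  LinearDiscriminantAnalysis(),
    "LR":   LogisticRegression(max_iter=1000),
    "KNN":  KNeighborsClassifier(n_neighbors=5),
    "SVM":  SVC(kernel="rbf", gamma="scale"),
    "CART": DecisionTreeClassifier(random_state=0),
    "RF":   RandomForestClassifier(n_estimators=100, random_state=0),
}

def compare_classifiers(X, y, cv=10):
    # 10-fold cross-validated accuracy for each classifier
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
        print(f"{name:5s} accuracy = {scores.mean():.4f} (+/- {scores.std():.4f})")

# compare_classifiers(X, y)   # X: (n_samples, 10) feature matrix, y: 0/1 labels
```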
5 Conclusion
The diagnosis of acute myeloid leukemia using image processing and machine learning techniques is carried out in this work. Segmentation is achieved by the k-means clustering technique, morphological features are extracted, and the classification framework is built with various machine learning classifiers. The random forest classifier gives the best performance for the detection of acute myeloid leukemia cells. In future, the analysis will be extended to multi-cell images by adopting deep learning techniques, and ensemble classifiers will be introduced to improve the detection accuracy. The proposed system is simple and efficient for diagnosing AML at an early stage.
References 1. H. Cao, H. Liu, E. Song, A novel algorithm for segmentation of leukocytes in peripheral blood. Elsevier, J. Biomed. Signal Proc. Control 45, 10–21 (2018) 2. J. Rawat, A.P. Singh, Computer assisted classification framework for prediction of acute lymphoblastic and acute myeloblastic leukemia. Elsevier, J. Bio-Cybern. Biomed. Eng. 37(4), 637–654 (2017) 3. F. Kazemi, Automatic recognition of acute myelogenous leukemia in blood microscopic images using K-means clustering and support vector machine. J. Med. Signals Sensors 6, 183–193 (2016) 4. P. Jagadev, H.G. Virani, Detection of leukemia and its types using image processing and machine learning, in International Conference on Trends in Electronics and Informatics (2017) 5. S. Begum, R. Sarkar, Data classification using feature selection and kNN machine learning approach, in International Conference on Computational Intelligence and Communication Networks, pp. 811–814 (2015) 6. S. Agaian, Automated screening system for acute myelogenous leukemia detection in blood microscopic images. IEEE J. Syst. 8(3) (September, 2014) 7. Viswanathan, Fuzzy C means detection of leukemia based on morphological contour segmentation. Elsevier Procedia Comput. Sci. 58, 84–90 (2015) 8. Y. Liu, F. Cao, Segmentation of white blood cells image using adaptive location and iteration. IEEE J. Biomed. Health Inf. 21(6), 1644 (2017) 9. V.P. Dhaka, An efficient segmentation technique for Devanagari offline handwritten scripts using the feedforward neural network. Neural Comput. Appl. 26, 1881–1893 (2015). https:// doi.org/10.1007/s00521-015-1844-9 10. V.P. Dhaka, Pixel plot and trace based segmentation method for bilingual handwritten scripts using feedforward neural network. Neural Comput. Appl. 27, 1817–1829 (2016). https://doi. org/10.1007/s00521-015-1972-2 11. V.P. Dhaka, Segmentation of english offline handwritten cursive scripts using a feedforward neural network. Neural Comput. Appl. 27, 1369–1379 (2016). https://doi.org/10.1007/s00 521-015-1940-x 12. Acute myeloid leukemia—surveillance, epidemiology, and end results (seer) program. http:// seer.cancer.gov/statfacts/html/amyl.html 13. K. Amin, Talebi, Oghli, Recognition of acute lymphoblastic leukemia cells in microscopic images using k-means clustering and support vector machine classifier. J. Med. Signals Sensors 5(1), 49–58 (2015) 14. Campana, C. Smith, Measurements of treatment response in childhood acute leukemia. Korean J. Hematol. 47(4), 245–254 (2012) 15. C. Reta, Leopoldo, R. Lobato, Segmentation and classification of bone marrow cells images using contextual information for medical diagnosis of acute leukemia. J. PLoS One (24 June, 2015 16. K. Archana, H.R. Kaliyal, Segmentation of blast using vector quantization technique. Int. J. Comput. Appl. 72 (2013) 17. M. Yan, K-means cluster algorithm based on color image enhancement for cell segmentation, in Proceedings on Biomedical Engineering and Informatics, IEEE (2012)
An Efficient Multimodal Biometric System Integrated with Liveness Detection Technique Chander Kant and Komal(B) Department of Computer Science and Application, KUK, Kurukshetra, Haryana, India [email protected], [email protected]
Abstract. Biometrics is an alternative to password-based recognition systems these days, but biometric recognition systems are vulnerable to spoof attacks. Hence, anti-spoofing techniques are attracting growing interest in this field. Liveness detection is an anti-spoofing technique whose aim is to distinguish between dead (fake) and living modalities presented at the time of authentication. This paper presents an efficient and robust multimodal biometric system integrated with a liveness detection technique. In the proposed system, face and fingerprint modalities are used at enrolment time, and only the face modality is used to authenticate a person. Here, the fingerprint is used to generate a fingerprint key using the K-means clustering algorithm, and the face feature set is then encrypted with the Advanced Encryption Standard (AES) using this fingerprint key. At authentication time, the liveness of a person is checked using a challenge–response method at the sensor level. If an intruder nevertheless passes the liveness test, he/she may still be detected at the matching-score level, as the template of the system is secured with AES encryption. Integrating the liveness detection technique with the multimodal biometric system also increases the speed of the system. Matrix Laboratory (MATLAB) 2017b and the multi-biometric integration (MUBI) tool are used for the experimental work, and the results show that the proposed system gives better performance compared to others. Keywords: Face recognition · Encrypted face template · Liveness detection · Multimodal biometric system
1 Introduction
Biometrics are used to authenticate persons based on the physiological and behavioural characteristics of a human, and biometric systems can authenticate people with assurance. An individual's biometrics are unique, but they are not secrets, i.e. they can be spoofed by an imposter: fingerprints can be collected from anything a person touches, face structure and iris patterns are visible to everyone, and voice can be recorded by an imposter. A biometric modality can be obtained directly through a spoofing attack, e.g. acquisition of the iris, face or fingerprint, or through replication of behavioural modalities. The digitized artefact is a duplicate of a genuine person, and the artefact presented to the authentication system misleads the system [1].
The most vulnerable point in an authentication system is the sensor level, where biometric samples are captured from an individual to prove his/her identity. The sensor module is the easiest target for imposters, who spoof the system by presenting fake samples. False data injection happens after data processing or before the feature extraction module, as shown in Fig. 1. Therefore, there is a need to detect whether the biometric sample is acquired from a live person or from another source. Liveness detection is a technique that helps in detecting whether the input sample is provided by a live person, and it prevents spoofing attacks [2].
Fig. 1. Attacks of fake biometrics on biometric system
Liveness detection techniques are categorized into two parts, as shown in Fig. 2: (1) hardware-based techniques and (2) software-based techniques.
Fig. 2. Classification of liveness detection techniques
Hardware-based techniques identify life signs from the input biometric at the sensor level. Extra hardware is used to obtain life signs from the input data; for example, hardware can measure temperature, odour, pulse oximetry, blood flow and spectral information of the biometric input [3]. However, the cost of the biometric system may increase by integrating such hardware, and a second limitation of this technique is that a fake biometric can be combined with a live one to forge the biometric system at this very first level. The second category extracts life signs from the pre-processed input image. Software-based techniques are further divided into two parts: (i) static techniques (using a single sample) and (ii) dynamic techniques (using multiple samples). Static techniques detect life signs by analysing skin perspiration in fingerprints, Fourier spectra of the face and iris, etc. [4]. Dynamic techniques use multiple biometric samples to detect life signs; they are based on the movement of the eyes, the movement of the mouth or iris, or the triggering of the pupils with light reflection.
In face recognition systems, a photograph of a legitimate person is most commonly used for spoofing the system, since one's facial photograph or image is easily captured with a camera without the person's knowledge or downloaded from the web. This spoofing attack is simple and cheap. The intruder can also fool the authenticating system by rotating, shifting and bending the legitimate person's photograph in front of the acquisition camera like a live human being. A recent survey [5] confirms that face recognition is the second-most used technique after fingerprint recognition. Facial authentication is currently used in many areas, e.g. surveillance, access control and human–robot interaction. The most commonly used attack on facial recognition is the spoofing attack, where an intruder tries his best to deceive the system using photographs/videos, etc., of the real identity. This type of attack can be prevented by using liveness detection techniques. In this paper, a challenge–response method based on eye and mouth movements is used to check the liveness of the biometric sample. In this method, the system generates a challenge in terms of mouth and eye movements; these challenges can only be completed by a live person, not by a photograph, and the response to the generated challenge is analysed. If the response matches the challenge, only then is the further process carried out; otherwise, access is denied.
This paper presents an efficient multimodal biometric system integrated with liveness detection. In the enrolment phase, two modalities (face and fingerprint) are used, but at the time of authentication only one modality (face) is used. The rest of the paper is organized as follows: Section 2 discusses the related work. Section 3 presents the proposed work, including the feature-set extraction process, the fingerprint key generation process and the architecture of the proposed system. Section 4 presents the experimental results and the related discussion. The last section concludes this research paper.
2 Related Work
Prabhakar et al. [6] presented a novel system that uses quality measures for liveness detection. The following properties are used for measuring image quality: ridge clarity, frame strength, ridge continuity and the estimated performance of authentication when using the hand appearance. For the measurement of these properties, several kinds of information are used: (i) the direction field, which provides angle information, (ii) the pixel intensity values of the greyscale image, (iii) the power spectrum and direction angle implemented with Gabor filters, and (iv) fingerprint feature quality, measured either by a holistic method or from non-overlapping local blocks of the image.
Galbally et al. [7] reviewed two case studies of attacks on face verification. The first case study examined a hill-climbing attack on an eigenface-based system, and the second examined an attack on a GMM parts-based system; these studies make it clear that attack performance depends on the selected parameter values. Nixon et al. [8] introduced liveness detection as a solution for direct attacks produced with fake or synthetic traits, which are very difficult to detect; the proposed system improved the security level of the system. Singh et al. [9] proposed a challenge–response-based liveness detection scheme in which a liveness module is added before the feature extraction module of the face recognition system to increase its security. Facial macro-features (mouth and eye movement) are used to generate a random challenge and to observe the user's response; if the user's response matches the generated challenge, only then does the user get access to the system, otherwise access is denied. Experimental results show that the proposed system is more efficient than existing systems. Pan et al. [10] proposed a liveness detection technique using eye blinks against face spoofing attacks. The eye-blinking technique is software based, so there is no need for extra hardware, but it does not work when the user wears sunglasses at authentication time. The authors of [11] reviewed liveness detection techniques for face, fingerprint and iris and also evaluated the performance of multimodal techniques and liveness detection techniques on publicly available databases. Naidu and Prasad [12] used image quality assessment (IQA) values for detecting spoof attacks in a face recognition system; IQA values help in detecting fake and real identities because a fake identity always has different values compared to a real identity, and this liveness detection technique is applied to a multimodal system to increase its efficiency and accuracy. Sonavane [13] proposed a software-based liveness detection technique for biometric systems in which 25 reference IQA values and 4 no-reference IQA values are used to check whether the input sample comes from a live identity or a fake identity; the proposed system is less complex than hardware-based liveness detection systems and more suitable for real-time situations. Wild et al. [15] proposed a challenge–response-based liveness detection technique integrated with a multimodal system to increase the accuracy of the system. Fingerprint and face modalities are used for the multimodal system, and the publicly available CASIA and Fingerprint Liveness Detection Competition (FLDC) databases are used for face and fingerprint, respectively.
3 Proposed Work
Multimodal biometric systems are more reliable than unimodal biometric systems and overcome the shortcomings present in unimodal systems. In this paper, an efficient multimodal biometric system integrated with liveness detection is developed. Face and fingerprint modalities are used in the system: the fingerprint modality is used to create the encryption key, and this key is used to encrypt the face template using the Advanced Encryption Standard (AES) algorithm. At authentication time, only the face is captured and its liveness is checked. If the person is live, only then does the system proceed to authenticate
the person; otherwise, access is denied. The proposed system is more reliable and accurate than related systems.
3.1 Feature-Set Extraction and Fingerprint Key Generation Process
(a) Face feature-set extraction
First, the face image is captured through an appropriate sensor and the region of interest is detected. Human face organs such as the eyes, ears, mouth and nose make up the human face [15], and the structure of every organ differs in size and shape. The geometrical distribution and characteristics of these organs are used to collect features from the human face: every human face is different because of these organs, and the distances between them form patterns. These patterns or features are called eigenvalues or principal components in a face recognition system. Principal component analysis (PCA) is used to extract the feature values, and the Euclidean distance between the extracted features and the stored template features is used to build the matching score [16]. Figure 3 shows the facial feature-set extraction process [17].
Fig. 3. Face feature-set extraction process
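A minimal sketch of the PCA-based face feature extraction and Euclidean matching described in (a) is given below. scikit-learn's PCA and the number of components are assumed choices made here for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_face_extractor(train_faces, n_components=50):
    """Fit PCA (eigenfaces) on pre-processed, flattened training face images."""
    return PCA(n_components=n_components, whiten=True).fit(train_faces)

def face_features(pca, face):
    """Project a flattened face image onto the principal components."""
    return pca.transform(face.reshape(1, -1))[0]

def matching_score(features, template):
    """Euclidean distance between query features and stored template (smaller = better)."""
    return np.linalg.norm(features - template)

# train_faces: array of shape (n_images, 256*256) of pre-processed faces (assumed available)
```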
(b) Fingerprint feature-set extraction
Valleys and furrows make up the fingerprint pattern. First, the fingerprint sample is captured through an appropriate sensor. The second step is to improve the quality of the sample and extract the ridge pattern. After that, a thinning process is applied, and finally the minutiae points of the fingerprint are extracted; minutiae points are mainly found at ridge endings or bifurcations [17, 18]. Figure 4 shows the whole process of fingerprint feature-set extraction.
Fig. 4. Fingerprint feature-set extraction process
(c) Fingerprint Key Generation Process
Here, the fingerprint key is generated using the K-means clustering algorithm. The fingerprint features (n) are grouped into i clusters, where i < n. Every feature in the feature set is a member of some cluster, and each cluster has a centroid value that is used to form the fingerprint key [19, 20]. A centroid model is used to extract the centroid value of each cluster. The steps of the K-means clustering algorithm are as follows (a code sketch is given after the list):
1. Extract minutiae points from the input sample.
2. Initialize the centroid values and clusters from the extracted minutiae points.
3. Find the Euclidean distance (ED) between the minutiae points and the centroids.
4. Assign each new minutiae point to the cluster with the minimum ED.
5. If the optimum number of clusters is reached, calculate the number of clusters and the centroid values; else go to step 2.
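The sketch below illustrates this key-generation step. It assumes the minutiae points have already been extracted as (x, y) coordinates, uses scikit-learn's KMeans in place of the hand-written loop above, and hashes the quantized centroids with SHA-256 to obtain a fixed-length 256-bit key; the hashing step is an assumption made here to obtain an AES-sized key and is not a detail taken from the paper.

```python
import hashlib
import numpy as np
from sklearn.cluster import KMeans

def fingerprint_key(minutiae_xy, n_clusters=8):
    """Derive a 32-byte key from the centroids of clustered minutiae points.

    minutiae_xy: array of shape (n_points, 2) with minutiae coordinates,
    assumed to come from an upstream minutiae extractor.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(minutiae_xy)
    # Sort and quantize the centroids so that small extraction jitter is tolerated
    centroids = np.round(km.cluster_centers_).astype(int)
    centroids = centroids[np.lexsort((centroids[:, 1], centroids[:, 0]))]
    return hashlib.sha256(centroids.tobytes()).digest()   # 32 bytes, usable for AES-256

# Example with synthetic minutiae coordinates
key = fingerprint_key(np.random.default_rng(0).integers(0, 256, size=(40, 2)))
```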
3.2 Architecture of Proposed System
(a) Enrolment Phase
The first step in this phase is to acquire the two modalities, face and fingerprint, through appropriate acquisition devices; extra noise is then removed using a median filter. The second step is to extract the feature sets of the fingerprint and the face. The third step is to derive the encryption key from the fingerprint and apply it to encrypt the face feature set using the AES algorithm. The next step is to store the encrypted features in the encrypted face template (EFT), encrypt the fingerprint key using a simple substitution method and store it as the encrypted key (EK). Finally, both the EFT and the EK are stored in the database. Figure 5 shows the whole enrolment process of the proposed system.
Fig. 5. Enrolment phase of proposed system
(b) Authentication phase
The authentication phase is divided into two parts: (i) the liveness detection module and (ii) the authentication module.
Liveness detection module: In this module, face liveness is checked using the challenge–response method shown in Fig. 6. Challenges are generated randomly in the form of movements of the eyes and mouth. If the generated challenge is equal to the calculated response, only then is the person declared a live person; otherwise, he/she is an imposter. Only a live person is sent on to the verification process. The challenge generation and response calculation processes of the liveness detection module are as follows.
Fig. 6. Liveness detection module
Challenge Generation: Challenge generation is used to check the liveness of the person. One could assume that if a person can move his face (left, right, up or down), then he is a live person, but this is not true in all cases: attackers can play a recorded video of the genuine user to defeat this assumption. Hence, the challenge is generated in such a way that only a real person can respond to it. A random sequence of mouth or eye movements (closeness/openness) is used to generate challenges, and no one can copy this random sequence of movements. Movement of the mouth can be measured by the teeth hue saturation value (HSV), and movement of the eyes is measured by the openness or closeness of the eyes. Both movements are made dependent on each other so that an intruder is not able to forge the system.
Response calculation: Movements of the mouth or eyes are calculated to check the responses. Eye and mouth movements are calculated by searching the eye and mouth regions, respectively: the mouth region is situated at 20–30% of the face height, and the eye region is situated at 65–75% of the face height, as shown in Fig. 7. The total numbers of eye and mouth movements are used to calculate the response. Eye movements are calculated by counting the openness or closeness of the eyes. Similarly, mouth movements are found by searching for teeth (hue saturation value) in the mouth region of the face; if teeth are present, the mouth is assumed to be open, else closed. If the calculated response is equal to the generated challenge, the person is declared a live person; otherwise, he/she is an imposter.
Fig. 7. Face fragmentation
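The challenge–response check described above can be outlined as follows. The sketch assumes that per-frame eye-openness and mouth-openness flags are already available from an upstream detector (e.g., the region and HSV analysis just described), so only the challenge generation and the response comparison are shown.

```python
import random

def generate_challenge(length=4):
    """Random sequence of requested actions, e.g. ['blink', 'open_mouth', ...]."""
    return [random.choice(["blink", "open_mouth"]) for _ in range(length)]

def count_events(flags):
    """Count rising edges (False -> True transitions) in a per-frame boolean signal."""
    return sum(1 for prev, cur in zip(flags, flags[1:]) if cur and not prev)

def is_live(challenge, eye_open_flags, mouth_open_flags):
    """Declare the user live only if the observed movements match the challenge."""
    expected_blinks = challenge.count("blink")
    expected_mouth = challenge.count("open_mouth")
    observed_blinks = count_events([not e for e in eye_open_flags])   # eye closures
    observed_mouth = count_events(mouth_open_flags)                   # mouth openings
    return observed_blinks == expected_blinks and observed_mouth == expected_mouth
```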
Authentication Module: In this module, only real (live) persons are allowed to proceed to authentication. The first step is to retrieve the EFT and the EK. The second step is to decrypt the key using the same substitution method and then decrypt the EFT with the AES technique using the same fingerprint key. The next step is to match the input face template with the stored template; if the matching score (S) is greater than the set threshold value (T), only then is the person declared genuine, otherwise an imposter. Figure 8 shows the authentication process of the proposed system.
Fig. 8. Authentication phase of proposed system
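A minimal sketch of the enrolment-time encryption of the face feature set and the decryption performed in this module is given below. PyCryptodome is an assumed library choice, AES in EAX mode is used purely for illustration, and fingerprint_key() refers to the key-derivation sketch in Sect. 3.1; in the proposed system, the stored EK would first be decrypted to recover this key before decryption of the EFT.

```python
import numpy as np
from Crypto.Cipher import AES   # PyCryptodome

def encrypt_face_template(face_features, key):
    """Enrolment: encrypt the face feature vector with the fingerprint-derived key."""
    data = np.asarray(face_features, dtype=np.float64).tobytes()
    cipher = AES.new(key, AES.MODE_EAX)
    ciphertext, tag = cipher.encrypt_and_digest(data)
    return {"nonce": cipher.nonce, "tag": tag, "ct": ciphertext}   # stored as the EFT

def decrypt_face_template(eft, key):
    """Authentication: recover the face feature vector for matching."""
    cipher = AES.new(key, AES.MODE_EAX, nonce=eft["nonce"])
    data = cipher.decrypt_and_verify(eft["ct"], eft["tag"])
    return np.frombuffer(data, dtype=np.float64)

# eft = encrypt_face_template(face_features, key)    # at enrolment
# features = decrypt_face_template(eft, key)         # at authentication, after the liveness check
```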
4 Experimental Results and Discussion
MATLAB 2017b is used to evaluate the performance and effectiveness of the proposed multimodal system. At enrolment time, a person enrols in the biometric system with the face and the fingerprint key, and both the encrypted face template and the encrypted fingerprint key are stored in the template database. At authentication time, the system first checks the liveness of the person and then decrypts the face template using the respective fingerprint key. If the matching score (S) between the input person and the stored template is greater than the set threshold (T), only then is the person recognized as genuine, otherwise as an imposter. An attacker or imposter can attack the system using the following attacks: eye imposter attack, photo imposter attack, mouth imposter attack, video imposter attack, and eye-and-mouth attack. Table 1 shows the status of each attack as it passes through the liveness detection module and the authentication module of the system.
Table 1. Attack detection at the liveness detection module and the authentication module

Attack                        | Liveness detection module | Authentication module
Photo imposter attack         | No                        | No
Eye imposter attack           | No                        | No
Mouth imposter attack         | No                        | No
Video imposter attack         | No                        | No
Eye and mouth imposter attack | Yes                       | Yes/No
The status of each attack is given as yes or no: yes means the attacker may pass the module with this type of attack, and no means he/she will not be able to pass that module. Attackers may be machine operated or may attack the system using the above-mentioned attacks. In these cases, the proposed system takes much less time to detect the attacker than biometric systems based only on the face or the fingerprint. Figure 9 shows the time versus biometric system graph for an imposter; the time taken by the face biometric is the highest, and the time taken by the proposed system is much lower than the others.
Fig. 9. Time versus biometric system graph for imposters (non-living)
In the case of genuine user authentication, the proposed system takes some extra time because the system first checks the liveness of the user, then the decryption process takes place before the matching module, and finally the decision is taken based on the matching score. Figure 10 shows the time versus biometric system graph for a genuine user.
Fig. 10. Time versus biometric system graph for genuine user
The proposed system is run for 10 rounds, and in each round 100 imposters (self-collected) who are not enrolled in the system are tested. These imposters may attack the system with the different types of attack given in Table 1. Figure 11 shows that in each round, the false accept rate (FAR) of the proposed system is better than that of the face and fingerprint biometric systems. FAR is the measure of falsely accepted persons in a biometric system. The horizontal axis of Fig. 11 shows the number of rounds of the system run, and the vertical axis shows the number of falsely accepted persons.
Fig. 11. System strength of proposed and existing systems
In each round, the FAR of the proposed system is much lower than that of the face and fingerprint biometric systems. The overall accuracy of a biometric system depends on the FAR: the lower the FAR, the higher the accuracy. Thus, the performance of the proposed system is much higher than that of the existing face and fingerprint biometric systems.
5 Conclusion
Liveness detection is a technique that helps in detecting whether the input sample is provided by a live person, and it prevents spoofing attacks; it is necessary because it ensures the physical presence of a live person. This paper presents an efficient multimodal biometric system integrated with a liveness detection module, with the AES cryptography technique used for securing the template database. In the proposed system, a challenge–response method is used to check the liveness of a person. The proposed system is more reliable as it can detect spoof attacks at the very first (sensor) level. If an intruder nevertheless passes the liveness test, the system may still detect the intruder at the matching-score level, as the template of the system is secured with AES encryption using the fingerprint key technique. Experimental results show that the proposed system is more reliable, robust and secure compared to existing systems integrated with liveness detection. Future work will focus on the integration of other security techniques with liveness detection in biometric systems.
References 1. S. Prabhakar, S. Pankanti, A.K. Jain, Biometric recognition: security and privacy concerns. IEEE Security Privacy 1(2), 33–42 (2013) 2. P. Kallo, I. Kiss, A. Podmaniczky, J. Talosi, Detector for recognizing the living character of a finger in a fingerprint recognizing apparatus. Dermo Corporation, pp. 64–70 (2001) 3. W. Kang, X. Chen, Q. Wub, The biometric recognition on contactless multi-spectrum finger images. Infrared Phys. Technol. 68, 19–27 (2015) 4. Y. Xu, F. Luo, Y.-K. Zhai, J.-Y. Gan, Joint iris and facial recognition based on feature fusion and biomimetic pattern recognition, in International Conference on Wavelet Analysis and Pattern Recognition (Tianjin, 2013), pp. 14–17 5. J.S. Bedre, S. Sapkal, Ivestigation of face recognition techniques: a review, in Emerging Trends in Computer Science and Information Technology (ETCSIT2012) Proceedings distributed in International Journal of Computer Applications (IJCA) (2012) 6. S. Prabhakar, S. Pankanti, A.K. Jain, Biometric recognition: security and privacy concerns. IEEE Security Privacy 1(2), 33–42 (2013) 7. J. Galbally, C. McCool, J. Fierrez, S. Marcel, J. Ortega-Garcia, On the vulnerability of face verification systems to hill-climbing attacks. Pattern Recogn. 43(3), 1027–1038 (2014) 8. K.A. Nixon, V. Aimale, R.K. Rowe, Spoof detection schemes, in Handbook of Biometrics. (Springer, New York, NY, USA, 2018), pp. 403–423 9. A.K Singh, P. Joshi, G.C Nandi, Face recognition with liveness detection using eye and mouth movement, in International Conference on Signal Propagation and Computer Technology (ICSPCT), pp. 592–597 (2014) 10. G. Pan, L. Sun, Z. Wu, S. Lao, Eyeblink-based anti-spoofing in face recognition from a webcamera, in Proceedings IEEE 11th International Conference on Computer Vision (ICCV), pp. 1–8 (2007) 11. Y.N. Singh, S.K Singh, Vitality detection from biometrics: state-of-the-art. World Congress on Information and Communication Technologies, pp. 106–111 (2018)
12. P.A. Naidu, C.G Prasad, Multi-mode detection and identification of biometric using multilevel scaler SVM. Int. J. Res. Appl. Sci. Eng. Technol. (IJRASET) 5(6), 1408–1416 (2017) 13. P.M. Sonavane, Fake biometric trait detection using image quality features. Int. J. Eng. Develop. Res. (IJEDR) 3(2), 339–344 (2015) 14. S. Manoj, A survey of thresholding techniques over images, vol. 3, No. 2, pp. 461–478 (2014) 15. P. Wild, P. Radu, L. Chen, J. Ferryman, Robust multimodal face and fingerprint tfusion in the presence of spoofing attacks, in Conference on Pattern Recognition, United Kingdom, pp. 17–25 (2016) 16. A. Admane, A. Sheikh, S. Paunikar, S. Jawade, S. Wadbude, M.J. Sawarkar, A Review on different face recognition techniques. Int. J. Sci. Res. Comput. Sci. Eng. Inform. Technol. 5(1), 207–213 (2019) 17. A.M. Mouad, V.H. Mahale, P. Yannawar, A.T. Gaikwad, Overview of fingerprint recognition system, in International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) (2016) 18. V Dhaka, An improved average Gabor wavelet filter feature extraction technique for facial expression recognition. Int. J. Innov. Eng. 2(4), 35–41 (2013) 19. W. Yang, S. Wang, H. Jiankun, C. Valli, Security and accuracy of fingerprint-based biometrics: a review, School of Engineering and Information Technology, University of New South Wales, Canberra, ACT 2600, Australia (2019) 20. M. Fatima, N. Safia, Privacy preserving K-means clustering: a survey research. Int. Arab J. Inform. Technol. 9(2), 194–200 (2012)
A Random Walk-Based Cancelable Biometric Template Generation Fagul Pandey1(B) , Priyabrata Dash1 , and Divyanshi Sinha2 1 IIT Kharagpur, Kharagpur, India [email protected], [email protected] 2 IIT Kanpur, Kanpur, India [email protected]
Abstract. The usage of fingerprint biometric-based authentication systems raises serious security concerns if the generated reversible feature vector is stored directly. For this purpose, irreversible or cancelable templates are generated either through random projection methods or by performing non-invertible transformations (one-way functions) over the feature vector. In this paper, we have utilized the randomness of the random walk concept for generating a non-invertible fingerprint template. In a random walk, the future step of an object is independent of the past pattern, which makes it unpredictable. Here, we have used this property to generate multiple vectors by defining the number of objects and steps based on the size of the feature template. Diehard tests are performed to ensure the randomness of the individual vectors. After further processing, the projection matrix was created, which generates the cancelable fingerprint template. The performance of the proposed method was assessed by determining the equal error rate (EER) on FVC 2004 databases. The results were found to be satisfactory. Keywords: Random walk · Cancelable template · Revocable · Diehard · Random projection
1 Introduction
Biometric authentication systems enable the recognition of individuals based on their biological or physiological characteristics (e.g., fingerprint, face, iris, voice). Biometrics are used in major applications, such as national identification and border control, and have replaced conventional PINs, tokens, and passwords because of their specificity. Fingerprint is the most commonly used biometric trait for authentication because of its ability to identify individuals uniquely, its cost-effective hardware, permanence, and acceptability among users.
Securely storing the biometric template is still a major concern in biometric-based applications. Once compromised, the fingerprint becomes useless across applications and leads to permanent loss of the user's identity. In all biometric-based applications, the template is generated from the sensed biometric modality; in the case of a fingerprint-based system, the template is mostly based on extracted minutiae features. If
an adversary gets access to the biometric template of a user, the security of the overall system is compromised. Thus, there is a need either to store the template securely or to transform it in such a way that even if the adversary gets hold of it, it is incomprehensible to him/her. Such a transformed template is called a cancelable template [1]. Cancelable templates can be obtained by performing non-invertible transformations over the original template or through random projection techniques [2]. In non-invertible transformation-based techniques, the original feature vector is transformed using some one-way function. In the random projection mechanism, the extracted feature set f ∈ R^N from the biometric modality is projected onto some random subspace S; the projected vector finally stored in the database is y, defined as y = fS. Biohashing is an extended version of the random projection concept and has been used extensively in the literature [3, 4]. The cancelable biometric templates obtained from the discussed mechanisms must satisfy the following criteria:
• Irreversibility: The stored biometric template must be non-invertible in nature, such that the template becomes useless for an adversary even if he/she gets access to it.
• Revocability: If the template is stolen or compromised, the existing template must be canceled and a new template generated.
• Non-linkability: Different instances of the biometric template must not be linkable even if they are derived from the same biometric modality of a particular user. This property preserves the privacy of the user and prevents cross-matching of templates across different applications.
• Performance: The transformations performed to obtain the cancelable template must not hamper the performance of the overall system.
In this paper, we have obtained the cancelable template using a random projection technique. A random walk is used to obtain the projection matrix: each entry s_ij of the random subspace S is an independent realization of the motion of an individual object, which is known to be unpredictable. Diehard tests are performed to check the randomness of the obtained projection matrix, and the performance of the proposed method is tested on the FVC2004 databases. The contributions of this article are:
• Introduction of the random walk concept for generating a cancelable fingerprint template.
• Highlighting the randomness of random walk vectors over chaotic-signal-based deterministic models.
• Introduction of a simple cancelable template generation scheme that can be extended to other biometric modalities as well.
The paper organization is as follows: state-of-the-art methods are surveyed and presented in Sect. 2. The random walk algorithm and the proposed method for cancelable template generation are discussed in Sect. 3. The experiments conducted and the experimental results are discussed in Sect. 4. Finally, the paper is concluded in Sect. 5.
2 Literature Survey
Addressing the vulnerabilities of existing biometric templates, Ratha et al. [5] proposed the concept of the cancelable template to add security to the system and preserve the privacy of the user by defining three types of transformations, namely functional, polar, and Cartesian. A minutia cylinder-code (MCC)-based cancelable template of fingerprint data was proposed by Ferrara et al. [6, 7]. Polar coordinates are rotation and translation invariant; this property was utilized by Ahmad et al. [8] for further processing and generating the template. Sandhya et al. [9] analyzed the properties of Delaunay triangles, the major one being the tolerance of the structure; a triangulation net was formed on the extracted minutiae points of the fingerprint image to obtain the cancelable template. Supriya et al. [10] proposed a cancelable biometric template obtained by transforming the original template using chaotic signals; the most popular chaotic map, the logistic map, was used to generate the chaotic sequence. Hsiao and Lee [11] presented a new multiple-chaos-based biometric image cryptosystem for fingerprint security. The encryption algorithm was constructed with four chaotic systems, consisting of two 1-D and two high-dimensional 3-D chaotic systems; the logistic map was used in this scheme as well. However, the logistic map is a deterministic model, that is, if the initial seed is the same, the exact path of the system is reproduced, so a formula determines the output of this model. In contrast, in a random model the initial seed does not reveal how the system will unfold; rather, it gives a distribution of the future outcomes. For this reason, we have used the concept of the random walk for generating the cancelable fingerprint template.
3 Proposed Method
3.1 Random Matrix Generation Using Random Walk Concept
A random walk can be viewed as a stochastic process in which the motion of an object is unpredictable: the path traversed by the object in the past cannot determine its next step, so the probability of moving in each of the directions is equal. We have used the 2-D extension of this concept. Let An be the path of the random walk in two dimensions, so that

An = B1 + B2 + · · · + Bn    (1)

where B1, B2, ..., Bn are random vectors defined as

Bi = (1, 0)  with probability 1/4
     (0, 1)  with probability 1/4
     (−1, 0) with probability 1/4
     (0, −1) with probability 1/4    (2)
The steps of the random walk process used in this paper are described below (a code sketch follows the list). Let vx = {−1, +1, 0, 0} and vy = {0, 0, −1, +1} be the vectors of possible step sizes in the x and y directions, and let dis[n][s] be the distance matrix, where n is the number of objects and s is the number of steps.
• Step 1: For each random object i ∈ {1, ..., n}, repeat Step 2.
• Step 2: For j ∈ {1, ..., s}, repeat Steps 3–5.
• Step 3: Determine a direction at random, d = randomnumber(r) % 4.
• Step 4: x = x + vx[d] and y = y + vy[d].
• Step 5: Calculate the distance from the initial position to the updated position of the object and update dis[i][j].
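A direct NumPy transcription of these steps is given below. The number of objects n and the number of steps s are chosen from the size of the feature template as described above (299 is used here only because the paper's feature vector has length 299); the Euclidean distance from the starting position is stored in dis. This is a sketch, not the authors' code.

```python
import numpy as np

def random_walk_matrix(n, s, seed=None):
    """dis[i][j]: distance of object i from its start after j+1 random steps."""
    rng = np.random.default_rng(seed)
    vx = np.array([-1, +1, 0, 0])
    vy = np.array([0, 0, -1, +1])
    dis = np.zeros((n, s))
    for i in range(n):                      # Step 1: each random object
        x = y = 0
        for j in range(s):                  # Step 2: each step
            d = rng.integers(0, 4)          # Step 3: random direction
            x += vx[d]                      # Step 4: move the object
            y += vy[d]
            dis[i, j] = np.hypot(x, y)      # Step 5: distance from the start
    return dis

dis = random_walk_matrix(n=299, s=299, seed=1)   # sized to the 299-D feature vector
```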
3.2 Cancelable Template Generation and Matching
The matrix obtained in the last step (dis) is orthogonalized to generate the random projection matrix (S), using which the cancelable fingerprint template (y) is generated, that is, y = fS. Furthermore, templates t1 and t2 are matched using the normalized distance

t = ‖t1 − t2‖ / (‖t1‖ + ‖t2‖)    (3)

and the match score is tmatch = 1 − t.
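A sketch of this template-generation and matching step is given below. Orthogonalization is done with a QR decomposition, which is one common way to orthogonalize a random matrix (the paper does not specify the method), and the match-score formula follows the normalized-distance form reconstructed in Eq. (3), whose exact normalization is itself an assumption.

```python
import numpy as np

def projection_matrix(dis):
    """Orthogonalize the random-walk matrix to obtain the projection matrix S."""
    q, _ = np.linalg.qr(dis)   # columns of q are orthonormal
    return q

def cancelable_template(f, S):
    """Project the fingerprint feature vector f onto the random subspace: y = f S."""
    return f @ S

def match_score(t1, t2):
    t = np.linalg.norm(t1 - t2) / (np.linalg.norm(t1) + np.linalg.norm(t2))
    return 1.0 - t             # accept when the score exceeds a threshold, e.g. 0.43

S = projection_matrix(random_walk_matrix(n=299, s=299, seed=1))   # from the sketch in Sect. 3.1
# y_enrol = cancelable_template(f_enrol, S); y_query = cancelable_template(f_query, S)
# accept = match_score(y_enrol, y_query) >= 0.43
```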
4 Experimental Setup and Results
For generating the fingerprint feature vector of length 299, the method outlined in [12] is used. The FVC 2004 databases (100 × 8 images) are used for testing purposes. The randomness of the generated vectors is assessed through the diehard tests [13], and the performance of the proposed method is determined with the help of the equal error rate (EER).
4.1 Accuracy Results
The accuracy of the proposed method is tested on the FVC 2004 database. The results for different threshold values of tmatch are stated in Table 1; the best results are obtained for tmatch = 0.43. EER curves are presented in Fig. 1.

Table 1. Accuracy results for different thresholds

Threshold | Verification accuracy (%) | Genuine acceptance rate (%) | FAR (%) | FRR (%) | EER (%)
0.34      | 99.48                     | 99.90                       | 0.61    | 0.10    | 0.35
0.36      | 99.92                     | 99.90                       | 0.08    | 0.10    | 0.09
0.39      | 99.97                     | 99.80                       | 0.00    | 0.20    | 0.10
0.43      | 99.98                     | 99.90                       | 0.00    | 0.10    | 0.05
0.45      | 99.88                     | 99.30                       | 0.00    | 0.70    | 0.35
Fig. 1. EER versus similarity score
4.2 Diehard Test Result
The diehard test suite is a collection of statistical tests that checks the randomness of a generator. Here, the randomness of the vectors generated through the random walk algorithm is tested using this test suite, and the results are stated in Table 2.

Table 2. Randomness checking with the diehard test suite

Statistical test                                            | p value    | Assessment
Diehard birthdays test                                      | 0.97063932 | PASSED
Diehard OPERM5 (overlapping permutation) test               | 0.21147305 | PASSED
Diehard 32 × 32 binary rank test                            | 0.26887712 | PASSED
Diehard 6 × 8 binary rank test                              | 0.76824656 | PASSED
Diehard bitstream test                                      | 0.76793168 | PASSED
Diehard OPSO (overlapping pairs sparse occupancy) test      | 0.79799364 | PASSED
Diehard OQSO (overlapping quadruples sparse occupancy) test | 0.71596844 | PASSED
Diehard DNA test                                            | 0.32705856 | PASSED
Diehard count the 1 s (stream) test                         | 0.02716524 | PASSED
Diehard count the 1 s (byte) test                           | 0.91128245 | PASSED
Diehard parking lot test                                    | 0.27185127 | PASSED
Diehard minimum distance (2d circle) test                   | 0.88042741 | PASSED
Diehard 3d sphere (minimum distance) test                   | 0.59806026 | PASSED
Diehard squeeze test                                        | 0.99123760 | PASSED
Diehard sums test                                           | 0.06850323 | PASSED
Diehard runs test                                           | 0.59383737 | PASSED
Diehard craps test                                          | 0.95474779 | PASSED
Marsaglia and Tsang GCD test                                | 0.79006537 | PASSED
STS monobit test                                            | 0.32021856 | PASSED
STS runs test                                               | 0.17441023 | PASSED
STS serial (generalized) test                               | 0.75728552 | PASSED
RGB bit distribution test                                   | 0.77057707 | PASSED
RGB generalized minimum distance test                       | 0.52776791 | PASSED
RGB permutations test                                       | 0.87556779 | PASSED
RGB lagged sum test                                         | 0.85416640 | PASSED
RGB Kolmogorov-Smirnov test                                 | 0.60178582 | PASSED
DAB byte distribution test                                  | 0.78990963 | PASSED
DAB DCT (frequency analysis) test                           | 0.69474525 | PASSED
DAB fill tree test                                          | 0.72571825 | PASSED
DAB fill tree 2 test                                        | 0.30374025 | PASSED
DAB monobit 2 test                                          | 0.90985527 | PASSED

5 Conclusion
Storing the biometric template directly is not recommended, as it may get compromised, leading to the complete loss of a user's identity across applications. The proposed method uses the random walk concept for the generation of a cancelable fingerprint template, so it is not possible to obtain the original biometric template from this non-invertible template. The proposed method was successfully tested on the FVC 2004 databases and gives an accuracy of 99.90% at threshold 0.43. The randomness of the intermediate random vectors was tested using the diehard tests, and the results were found to be satisfactory. The revocability requirement is satisfied by simply changing the projection matrix, the random walk ensures the irreversibility of the generated cancelable template, and the generated templates are highly differentiable. The desired properties of irreversibility, non-linkability, and revocability were obtained without affecting the performance of the overall system.
References 1. J.B. Kho, J. Kim, I.-J. Kim, A.B.J. Teoh, Cancelable fingerprint template design with randomized non-negative least squares. Pattern Recogn. 91, 245–260 (2019) 2. J.K. Pillai, V.M. Patel, R. Chellappa, N.K. Ratha, Secure and robust iris recognition using random projections and sparse representations. IEEE Trans. Pattern Anal. Mach. Intel. 33(9), 1877–1893 (2011) 3. Lu Leng, Jiashu Zhang, Palmhash code versus palmphasor code. Neurocomputing 108, 1–12 (2013) 4. L. Leng, A. Beng J. Teoh, M. Li, M.K. Khan, Analysis of correlation of 2dpalmhash code and orientation range suitable for transposition. Neurocomputing 131, 377–387 (2014) 5. N.K. Ratha, Jonathan H. Connell, R.M. Bolle, Enhancing security and privacy in biometricsbased authentication systems. IBM Syst. J. 40(3), 614–634 (2001) 6. M. Ferrara, D. Maltoni, R. Cappelli, Noninvertible minutia cylinder-code representation. IEEE Trans. Inf. Forensics Secur. 7(6), 1727–1737 (2012) 7. R. Cappelli, M. Ferrara, D. Maltoni, Minutia cylinder-code: a new representation and matching technique for fingerprint recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32(12), 2128– 2141 (2010) 8. T. Ahmad, H. Jiankun, S. Wang, Pair-polar coordinate-based cancelable fingerprint templates. Pattern Recogn. 44(10–11), 2555–2564 (2011) 9. M. Sandhya, M.V.N.K. Prasad, R. Rao Chillarige, Generating cancellable fingerprint templates based on delaunay triangle feature set construction. IET Biometrics 5(2), 131–139 (2016) 10. V.G. Supriya, R. Manjunatha, Logistic map for cancellable biometrics, in Materials science and engineering conference series, vol. 225, p. 012176 (2017) 11. H.-I. Hsiao, J. Lee, Fingerprint image cryptography based on multiple chaotic systems. Sig. Process. 113, 169–181 (2015) 12. Z. Jin, M.-H. Lim, A. Beng J. Teoh, B.-M. Goi, Y.H. Tay, Generating fixed-length representation from minutiae using kernel methods for fingerprint authentication. IEEE Trans. Syst. Man Cybern Syst. 46(10), 1415–1428 (2016) 13. G. Marsaglia, Diehard test suite. http://www.stat.fsu.edu/pub/diehard 8(01):2014 (1998)
Influence of Internal and External Sources on Information Diffusion at Twitter Mohammad Ahsan(B)
and T. P. Sharma
Computer Science and Engineering Department, National Institute of Technology, Hamirpur, Himachal Pradesh, India {ahsan,teek}@nith.ac.in
Abstract. Twitter is widely used by social media users to share their thoughts, experiences and breaking news. Due to its huge user base and quick delivery of information, posts of Twitter users reach millions of people within a few seconds. During extreme events and social crises, emergency responders utilize local information posted by Twitter users to combat the chaotic situation, and news media cite Twitter as their source of information a number of times. With these applications, it is necessary to understand how information reaches individuals in the Twitter network. There are mainly two ways by which Twitter users receive information: (i) through internal links within the Twitter network, and (ii) through sources outside the Twitter network, namely newspapers, radio and television. Existing research has mainly utilized internal links for modeling information diffusion at Twitter, and very few studies have considered external influence. One existing study quantifies the effects of internal and external sources on the mentions of URLs in users' tweets. In this paper, the authors utilize retweets, URLs and hashtags to quantify internal and external influence on information diffusion at Twitter. The analysis of 880 K tweets clearly shows that the diffusion of 79% of tweets is internally influenced and only the remaining 21% can be attributed to the influence of external sources. Keywords: Internal influence · External influence · Information diffusion · Twitter · URLs · Hashtags
1 Introduction There is an unprecedented amount of information which is being circulated at online social networks. The flexibility and openness of these platforms attract the attention of the people. A person having an Internet connected device can share his/her thoughts and experiences with peoples belonging to different parts of the world. Information posted by a social media user can reach millions of users in few seconds [1]. During extreme events, i.e., terrorist attacks and earthquake, social media platforms are highly used by the people to alleviate their tension and to comprehend the situation. These platforms do not require one to be a journalist in order to broadcast the information to others, as required in news channels. Due to this low cost of information exchange (an Internet © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_48
connected device) and fast delivery of information, these social networks are preferred by the people to share and consume the latest news. Online social networks allow their users to form links with other individuals in order to share the information. At Twitter, people follow/unfollow other users for creating and breaking links with them. By following a user, one gets subscription to the tweets posted by that user. If user ‘A’ posts an information, then his/her immediate followers receive that information on their timeline, if immediate followers retweet the received information, then their immediate followers receive retweeted information on their timeline, and so on. This is how the information very often gets diffused at online social networks (through links of the social media users). Most of the existing research works have examined information diffusion process by focusing on the internal links (edges) of concerned social network [2–4]. Although social media users mainly receive information through internal links, but sometimes, they also get information from external sources (i.e., newspapers, televisions and news sites) [5–9]. At social networks, every time user log in to their accounts, a list of posts is shown at their timeline. Social media allow its users to share and retweet the information from timeline by using the share and retweet functionality. Apart from retweeting the timeline information, social media users also post their own tweets. In retweeted information, it is clear that information gets diffused due to the internal influence, but in original tweets, it is not clear, and the content of the tweets is examined to decide the type of influence (internal or external). The spread of information at the social networking sites holds the ability to change the mind of the people. So, understanding the internal and external contribution in information diffusion at online social networks is very helpful to the companies, politicians and rumor combating agencies to effectively delivering the right information to the society.
2 Related Work Social media data have been examined in the previous research in order to quantify the internal and external influences on information diffusion in online social networks (i.e., Twitter, Sina Weibo). In 2012, Myers et al. [10] have analyzed the information diffusion process at Twitter. They identified that only about 71% of the content which circulates on Twitter can be attributed to the diffusion that occurs through links of the Twitter network, and remaining 29% appears due to external influence. This research work is limited to the analysis of URLs diffusion. After collecting 3 billion tweets, they extracted 18,186 URLs that were tweeted by at least 50 users and free from spamming behavior. For each URL, the total exposure received by the users during emergence of URL and fraction of this exposure that came from external sources are calculated. After averaging the exposures count of all URLs, the authors have identified about 71% URL mentions on Twitter as diffusion through network edges and 29% of mentions through influence of external sources. Liu et al. [11] have studied internal and external influence on the diffusion of information at Sina Weibo (a Chinese microblogging Web site). The internal and external influences are measured through counts of retweets and new tweets, respectively. They found that average number of retweet posts is much larger than average number of new
tweets at Sina Weibo. It indicates that the content diffusion at Sina Weibo mainly occurs through network edges, which is consistent with the findings on Twitter by Myers et al. [10]. Li et al. [12] have measured external influence by studying the cascades of information diffusion. A cascade is a chain of information sharing which get created as social media users post or reshared the information from their news feed. They found that 20% of links in the diffusion cascades are directly affected by the external sources and 30–50% links are affected indirectly. The existing research has examined the internal and external influences through URLs, new tweets and retweets. These features indicate the fraction of information that can be attributed to diffusion through network edges or influence of external sources. In this research paper, hashtags are added as a new feature into the list of existing features— URLs, new tweets and retweets. Hashtags are widely used by the social media users to link their posts with a topic [13, 14]. This feature helps in identifying the internal and external influences for the new tweets that contain no URL. In this research work, the hashtags of a tweet are checked at the timeline of the tweet author. The presence of tweet’s hashtags at the timeline of tweet author indicates the internal influence as the topic of the tweet (hashtag) has been already delivered to the author through network links.
3 Proposed Approach To examine the internal and external influences on information diffusion at Twitter, 880 K tweets are collected by using the Twitter API. The credentials required to collect Twitter data are described in Sect. 3.1. Next, in Sect. 3.2, the streaming API is used to collect tweets. Finally, in Sect. 3.3, the collected tweets are annotated with one of two labels of influence: I (internal) or E (external). 3.1 Credentials to Collect Twitter Data A set of credentials is required for making data collection requests to the Twitter APIs (i.e., API key, API secret key, access token and access token secret). The keys and tokens are generated by creating a developer app at Twitter (https://developer.twitter.com/en/apps). In each data collection request to the Twitter APIs, these credentials uniquely identify the app and the user who is making the request. 3.2 Data Collection Application programming interfaces (APIs) of the social networks are used to access data from them. The Twitter API is used to collect the content posted by users on this platform. Twitter mainly provides two APIs: the REST API and the streaming API. The REST API is used for collecting tweets published within the last week, whereas the streaming API is used to collect upcoming tweets. The authors passed their credentials to the streaming API and collected 880 K tweets over a period of 10 days, from July 29, 2019 to August 7, 2019.
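For concreteness, the collection step of Sects. 3.1–3.2 can be sketched as below. The sketch assumes the tweepy library (v3.x interface); the credential placeholders and the output file name are illustrative, not taken from the paper.

```python
import json
import tweepy

# Credentials generated from the Twitter developer app (placeholders).
API_KEY, API_SECRET = "API_KEY", "API_SECRET_KEY"
ACCESS_TOKEN, ACCESS_SECRET = "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET"

auth = tweepy.OAuthHandler(API_KEY, API_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)


class TweetCollector(tweepy.StreamListener):
    """Append every incoming tweet's metadata (JSON) to a file."""

    def on_status(self, status):
        with open("tweets.jsonl", "a", encoding="utf-8") as f:
            f.write(json.dumps(status._json) + "\n")

    def on_error(self, status_code):
        # Disconnect if Twitter signals rate limiting (HTTP 420).
        return status_code != 420


stream = tweepy.Stream(auth=auth, listener=TweetCollector())
stream.sample(languages=["en"])   # collect a live sample of public tweets
```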
3.3 Classification The classification of collected tweets is aimed to identify whether an information gets diffused through network edges or through influence of external sources. It includes two sub processes: features extraction and tweets categorization. Features Extraction. Twitter APIs return tweets metadata in a dictionary format. This metadata contains complete information about the tweet and the tweet author (a user who posts the tweet). The geographic location of a tweet, time when it was created, device which was used for posting it (mobile or desktop), tweet text, date and time when user account was created, screen name of the tweet author, etc., are contained in the metadata. Out of this metadata, some information is relevant to the study, and rest is not. The relevant information is extracted by following the steps discussed in Algorithm 1. Algorithm 1 Input: a set of collected tweets (T ) Output: a set of feature vectors extracted from T Step 1 from metadata of each tweet t ∈ T Step 2 extract tweet text, tweet_text ← t[‘text’] Step 3 extract the name of tweet author screen_name ← t[‘user’][‘screen_name’] Step 4 assign 1 to retweet variable if the tweet text starts with ‘RT’, otherwise assign 0 1, if tweet_ text starts with RT’ retweet = 0, otherwise Step 5 extract URLs url ← url[‘expanded_url’] for url in t[‘enteties’][‘url’]] Step 6 extract hashtags hashtags ←[hashtag[‘text’] for hashtag in t [‘entities’][‘hashtags’]] Step 7 store the features extracted in step 2 to step 6 in a processable file format (i.e., CSV file or a dataframe) Tweets Classification. The features extracted from the collected tweets are used here to classify tweets as internally and externally influenced information. Initially, tweets’ text is checked to identify whether it is a new tweet or a retweet. The tweets having text starting with ‘RT’ are considered as retweet, and rest are considered as new tweets. A retweet contains information which Twitter users receive from their followees, so it is considered as an information which gets diffused through internal channel. Next, the presence of URLs is checked in new tweets to identify their connection with Twitter and non-Twitter sources (i.e., news media). A new tweet with Twitter URL is an internally influenced information as their source of information is Twitter. If URLs of other sources
are present in a new tweet (i.e., www.thehindu.com, www.cnn.com) then this tweet is considered as externally influenced information as the information is received from non-Twitter sources. The classification of new tweets which contain no URL is done on the basis of hashtags. Hashtags are extracted from the tweets’ text. Here, screen name of the tweet author is used to collect tweets from his/her timeline. If the hashtags used in a new tweet were present at the user timeline when he/she posts the tweet, then it is considered as internally influenced diffusion. Because hashtags are the functionality offered by Twitter which links the tweets to a particular topic. If the topic of a new tweet matches with the topic of information received at a user timeline, then it is very clearly an internal influence. On other hand, if no hashtag or new hashtags are there in a new tweet, which are not present at the user timeline, then content of this tweet is framed through external influence.
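The feature extraction of Algorithm 1 and the labelling rules of Sect. 3.3 can be sketched as follows. The tweet is assumed to be the metadata dictionary returned by the Twitter API, and timeline_hashtags is assumed to have been collected separately using the author's screen name; these helper names are hypothetical.

```python
def extract_features(tweet):
    """Pull the fields used by Algorithm 1 out of one tweet's metadata dictionary."""
    text = tweet["text"]
    return {
        "text": text,
        "screen_name": tweet["user"]["screen_name"],
        "retweet": 1 if text.startswith("RT") else 0,
        "urls": [u["expanded_url"] for u in tweet["entities"]["urls"]],
        "hashtags": [h["text"].lower() for h in tweet["entities"]["hashtags"]],
    }


def classify_influence(features, timeline_hashtags):
    """Return 'I' (internal) or 'E' (external) following the rules of Sect. 3.3.

    timeline_hashtags: hashtags already present on the author's timeline
    at posting time (collected via the author's screen name).
    """
    if features["retweet"]:                       # retweets arrive over network edges
        return "I"
    if features["urls"]:                          # new tweet containing a URL
        internal = any("twitter.com" in u for u in features["urls"])
        return "I" if internal else "E"
    if any(h in timeline_hashtags for h in features["hashtags"]):
        return "I"                                # topic already seen on the timeline
    return "E"                                    # no URL and no previously seen hashtag
```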
4 Results This section contains the statistics about internal and external influence over information diffusion in Twitter. After analyzing the collected data, this paper has identified that out of the total tweets, 35% tweets are new tweets, and 65% tweets are retweets. Retweets represent the information that Twitter users receive from their followees through network edges and hence categorized as internal influence. But, to quantify the influence in new tweets, we analyze them on the basis of URLs’ and hashtags’ presence. New tweets are first checked on the basis of URLs and then hashtags. If the URL in a new tweet belongs to Twitter, then it is an internal influence, otherwise, external influence. In 312 K new tweets, 138 K tweets are found with URLs and 174 K tweets without URLs. After examining new tweets which contain URLs, it has been identified that 120 K tweets have Twitter URLs, and remaining 18 K tweets contain non-Twitter URLs. This represents that 87% of URLs mention in new tweets is attributed to network effects and only 13% occur due to external sources. In new tweets that have no URL, it has been analyzed that 13 K tweets have hashtags and 161 K tweets have no hashtags. In the tweets containing hashtags, it has been identified that hashtags of 7 K tweets are derived from the users’ timeline and remaining 6 K tweets have new hashtags (hashtags not present at the user timeline at tweeting time). Table 1 represents the statistics of internal and external influence on information diffusion process at Twitter.
5 Conclusion The process of information diffusion at online social networks is mainly examined by focusing on the edges of the concerned social network. However, apart from links of social media users, information also reaches individuals by jumping from external outof-network sources like news media: newspapers or televisions. This paper has examined the information diffusion at Twitter. After collecting the posts of Twitter users, authors analyzed whether a post is new tweet or retweet, whether the URL contained in a tweet belongs to some Twitter post or not, and, whether the hashtags used in the tweets are taken from the users’ timelines or not. In classifying the tweets to internal influence
Table 1. Statistics of internal and external influence at Twitter
Category                                    | Total number | Diffused through internal influence | Diffused through external influence
Retweets                                    | 568,482      | 568,482                             | –
New tweets with URLs                        | 138,377      | 120,306                             | 18,071
New tweets without URLs, with hashtags      | 12,674       | 6949                                | 5725
New tweets without URLs, without hashtags   | 161,012      | –                                   | 161,012
Tweets (new tweets + retweets)              | 880,545      | 695,737 (79.02%)                    | 184,808 (20.98%)
and external influence category, timelines of the Twitter users are also examined. After classification, it has been observed that the diffusion of 79% information at Twitter is internally driven and remaining 21% belongs to external influence. In future work, this research can be utilized as a ground-truth to directly infer the internal and external influence from tweets’ text with the help of deep learning models. Acknowledgements. This work was supported by the Ministry of Electronics and Information Technology (MEITY), Government of India. We would like to thank the anonymous reviewers for their insightful comments.
References 1. H. Allcott, M. Gentzkow, Social media and fake news in the 2016 election. J. Econ. Persp. 31(2), 211–236 (2017) 2. D. Yang, X. Liao, H. Shen, X. Cheng, G. Chen, Modeling the reemergence of information diffusion in social network. Phys. A Stat. Mech. Appl. 490, 1493–1500 (2018) 3. D. Centola, M. Macy, Complex contagions and the weakness of long ties. Am. J. Sociol. 113(3), 702–734 (2007) 4. D. Cosley, D. Huttenlocher, J. Kleinberg, X. Lan, S. Suri, Sequential influence models in social networks, in Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media (Washington, DC, 2010), pp. 26–33 5. R. Crane, D. Sornette, Robust dynamic classes revealed by measuring the response function of a social system. Proc. Nat. Acad. Sci. (PNAS) 105(41), 15649–15653 (2008) 6. S. Goel, D.J. Watts, D.G. Goldstein, The structure of online diffusion networks, in Proceedings of the 13th ACM Conference on Electronic Commerce (Valencia, Spain, 2012), pp. 623–638 7. E. Bakshy, I. Rosenn, C. Marlow, L. Adamic, The role of social networks in information diffusion, in Proceedings of the 21st ACM International Conference on World Wide Web (Lyon, France, 2012), pp. 519–528
8. P.A. Dow, L.A. Adamic, A. Friggeri, The anatomy of large facebook cascades, in Proceedings of the Seventh International AAAI Conference on Weblogs and Social Media (Massachusetts, USA, 2013), pp. 145–154 9. V. Arnaboldi, M. Conti, M. La Gala, A. Passarella, F. Pezzoni, Ego network structure in online social networks and its impact on information diffusion. Comput. Commun. 76, 26–41 (2016) 10. S.A. Myers, C. Zhu, J. Leskovec, Information diffusion and external influence in networks, in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Beijing, China, 2012), pp. 33–41 11. C. Liu, X.X. Zhan, Z.K. Zhang, G.Q. Sun, P.M. Hui, How events determine spreading patterns: information transmission via internal and external influences on social networks. New J. Phys. 17(11), 113045 (2015) 12. J. Li, J. Xiong, X. Wang, Measuring the external influence in information diffusion, in 16th IEEE International Conference on Mobile Data Management, vol. 2 (Pittsburgh, USA, 2015), pp. 92–97 13. A. Pandya, M. Oussalah, P. Monachesi, P. Kostakos, On the use of distributed semantics of tweet metadata for user age prediction, in Future Generation Computer Systems, vol. 102 (2019), pp. 437–452 14. R. Ma, X. Qiu, Q. Zhang, X. Hu, Y.G. Jiang, X. Huang, Co-attention memory network for multimodal microblog’s hashtag recommendation. IEEE Trans. Knowl. Data Eng. (2019)
Pedestrian Detection: Unification of Global and Local Features Sweta Panigrahi1(B)
, U. S. N. Raju1 , R. Pranay Raj2 , Sindhu Namulamettu2 , and Vishnupriya Thanda2
1 National Institute of Technology Warangal, Warangal, Telengana 506004, India
[email protected], [email protected] 2 Rajiv Gandhi University of Knowledge Technologies, Basar, Telengana 504107, India
[email protected], [email protected], [email protected]
Abstract. Effective and precise detection of pedestrian serves as a key to a number of applications in the domain of computer vision such as smart cars, video surveillance, robotics, and security. This paper presents the combination of feature extraction and classification. We present a thorough study on the type of features fit for pedestrian detection. The features are obtained by concatenating global shape feature histogram of oriented gradients (HOG) with global color and local texture features. We investigate our proposed method with respect to their receiver operator characteristics (ROC) and detection error trade-off (DET) performance. For classification part, we use the standard support vector machines (SVM) with linear kernel. We test our proposed method on the benchmark dataset for pedestrian detection: Institut National de Recherche en Informatique et en Automatique (INRIA) Pedestrian Dataset. The dataset contains pedestrians and non-pedestrians captured over a varying environment. Our proposed method performs best with respect to other algorithms presented in this study and gives a miss rate of 5.80%. Keywords: Pedestrian detection · Shape feature · Contrast rotation invariant local binary pattern (CLBP) · Miss rate
1 Introduction Effectively and precisely detecting pedestrians is of central importance for many applications in computer vision. In particular, there is a growing need to develop intelligent video surveillance systems. Although there exists a vast literature for object detection, a fit solution has not been found [1]. Pedestrian detection in images is one of the most challenging subsets of object detection, which arise due to large variations in clothing, poses, and cluttered background [2]. To solve pedestrian detection, two basic steps are followed: feature extraction and classifier construction. Features can be divided into global and local on the basis of their operation. Global features process the entire image, on the other hand, local features work on sub-regions of the image. For pedestrian detection, shape serves as a dominant © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_49
feature. Texture features have also been used for detection purposes [3]. This paper combines the shape feature (HOG) [4] with the local texture feature contrast rotation invariant local binary pattern (CLBP) [5] and the global color feature autocorrelogram [6, 7] (AutoCor), and achieves a better performance. HOG captures the shape of pedestrians, CLBP captures the textural pattern, and AutoCor identifies the color patterns of pedestrians, which further suppresses incorrect detections. In the next section, a brief description of all the algorithms is given. The rest of the paper is organized as follows: the methodology is described in Sect. 2, results and discussion are presented in Sect. 3, and the conclusion is given in Sect. 4. 1.1 Shape Feature HOG is the most widely used shape feature for pedestrian detection. In HOG, the distribution of gradient directions serves as the feature. Gradient information in the horizontal and vertical directions of an image is useful, as the magnitude of the gradients contains shape information. To improve accuracy, overlapping local contrast normalization is used. Figures 1 and 2 show an 8 × 8 cell and its nine-bin histogram, respectively.
Fig. 1. 8 × 8 cell
1047.70 807.65 640.90 1322.70 856.93 478.21 1055.40 457.00 609.98
Fig. 2. 1 × 9 bin histogram
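As a rough illustration of the cell histogram above, the sketch below computes a nine-bin orientation histogram for one 8 × 8 cell with NumPy, using unsigned gradients over 0–180°. It omits the bin interpolation and block normalization of the full HOG descriptor, so it is a simplification rather than the authors' implementation; the random cell stands in for the values shown in Fig. 1.

```python
import numpy as np

def cell_histogram(cell, n_bins=9):
    """Nine-bin orientation histogram of one cell, weighted by gradient magnitude."""
    gy, gx = np.gradient(cell.astype(float))            # vertical and horizontal gradients
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0       # unsigned orientation in [0, 180)
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 180.0), weights=magnitude)
    return hist

rng = np.random.default_rng(0)
cell = rng.integers(0, 256, size=(8, 8))                 # an 8 x 8 cell of pixel values
print(cell_histogram(cell))                              # 1 x 9 bin histogram
```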
1.2 Texture Feature A number of texture feature extraction algorithms are available in the literature. In this study, we use and describe the local binary pattern (LBP) [8], the uniform local binary pattern (ULBP) [8], and CLBP.
• LBP: The LBP operator thresholds the other eight pixels in a 3 × 3 neighborhood against the central pixel. The resulting binary values are weighted and summed to obtain the LBP label of the center pixel. Figure 3a–c shows the calculation for the central pixel; a minimal code sketch of this labeling is given after Fig. 4.
Fig. 3. a Original image. b Binary representation for center pixel ‘53’. c LBP label for central pixel. d ULBP label for central pixel
• ULBP: In uniform LBP, each uniform pattern has a separate output label as in LBP, and all non-uniform patterns are assigned to a single label. Figure 3d shows the ULBP label for the central element of Fig. 3a.
• CLBP: Because LBP is concerned only with whether the value of a neighboring pixel is greater than the central pixel or not, LBP sometimes gives the same result for two different (P × P) image windows. To solve this problem, a variant of LBP known as contrast rotation invariant LBP (CLBP) was introduced, in which the label depends on the difference between the central pixel and the other pixels. Figures 3c and 4b show the LBP labels for the two different images in Figs. 3a and 4a, respectively, and Fig. 4c, d shows the corresponding CLBP labels. The LBP labels obtained for the two different images are the same, whereas the CLBP labels are different.
Fig. 4. a Original image. b LBP label for central element ‘39’. c CLBP label for central element of Fig. 3a. d CLBP label for central element of Fig. 4a
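A minimal sketch of the basic LBP labeling described above is given below, assuming NumPy. The neighbor ordering is chosen so that the example of Fig. 3a reproduces the label 170 shown in Fig. 3c; this ordering is an assumption about the figure's convention. The CLBP variant used in the paper, which also accounts for the magnitude of the center–neighbor differences, is not reproduced here.

```python
import numpy as np

# Offsets of the eight neighbours, traversed clockwise starting from the top
# neighbour; each position k receives the weight 2**k.
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def lbp_label(window):
    """LBP label of the centre pixel of a 3 x 3 window."""
    centre = window[1, 1]
    label = 0
    for weight, (dy, dx) in enumerate(OFFSETS):
        if window[1 + dy, 1 + dx] >= centre:   # threshold neighbour against the centre
            label += 1 << weight
    return label

window = np.array([[123, 12, 100],
                   [ 35, 53,  47],
                   [ 65, 26, 112]])            # the 3 x 3 neighbourhood of Fig. 3a (centre 53)
print(lbp_label(window))                       # prints 170, the LBP label of Fig. 3c
```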
Color Feature. This feature is substantially robust relative to background clutter. It is also not dependent upon size of the image and orientation. Recently, this feature has acquired importance in pedestrian detection. When color features are used individually for pedestrian detection, it does not give a good performance. However, people have similar color among themselves, for example, the skin color is similar, and color of clothes also exhibits similar properties. So, when it is combined with other features, i.e., shape, it lowers the false detections of the system. To extract color feature, one of the popular algorithms is color correlogram [6], where spatial correlation of all possible pairs of pixel values is presented for a distance
value. As a result, the length of the feature vector becomes high. To solve this, the autocorrelogram is applied in this work. An autocorrelogram captures the spatial correlation between identical colors only, which reduces the feature vector length; an example is given in Fig. 5. The autocorrelogram can be applied either to a region or to the whole portion of the pedestrian. In this work, we apply it to the whole pedestrian to capture the similarity within the object.
Fig. 5. a Original image. b Autocorrelogram
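A small sketch of the autocorrelogram computation for a quantized image is shown below, assuming NumPy. For each color, it estimates the probability that a pixel at distance d from a pixel of that color has the same color; the chessboard distance and the normalization by the number of valid neighbor pairs are common choices and are assumptions, not details taken from the paper.

```python
import numpy as np

def autocorrelogram(img, n_colours, d=1):
    """P(colour c at distance d | colour c), for every colour c in the quantized image."""
    h, w = img.shape
    counts = np.zeros(n_colours)   # same-colour neighbour pairs per colour
    totals = np.zeros(n_colours)   # all neighbour pairs per colour
    offsets = [(dy, dx) for dy in (-d, 0, d) for dx in (-d, 0, d)
               if max(abs(dy), abs(dx)) == d]            # chessboard ring at distance d
    for y in range(h):
        for x in range(w):
            c = img[y, x]
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    totals[c] += 1
                    counts[c] += (img[ny, nx] == c)
    return counts / np.maximum(totals, 1)

rng = np.random.default_rng(0)
img = rng.integers(0, 4, size=(4, 4))    # a small image quantized to 4 colours
print(autocorrelogram(img, n_colours=4, d=1))
```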
2 Methodology The proposed method fuses local texture and global color features with the global shape feature for pedestrian detection. The texture feature is obtained from the grayscale transform of the RGB channels of the image, the color features are extracted from the HSI model of the image, and the shape information is derived from gradients, which give the direction of the change of intensity. In this way, a rich feature descriptor is obtained. The features are combined by concatenating the respective feature vectors, and the combined features are trained and tested on a classifier; for our study, we use a linear two-class SVM for classification. The detailed steps are given below in Algorithm 1, and the same is reflected in Fig. 6. After this feature extraction process, the next step is to train and test the model, which is described in the next section.
Algorithm 1:
1. Separate the red (R), green (G), and blue (B) channels of an image (I).
2. Obtain a grayscale image (GI) by using Eq. 1.
3. Apply texture feature extraction (CLBP) on GI.
4. Obtain hue (H) and saturation (S) from the R, G, and B channels by using Eqs. 2 and 3.
5. Apply color feature extraction (AutoCor) on H and S.
6. Calculate the horizontal (gx) and vertical (gy) gradients for each of the R, G, and B channels separately.
7. Compute the magnitude (M) and direction (θ) matrices for each of the three channels individually using Eqs. 4 and 5.
8. The magnitude (M) at a pixel is the maximum of the magnitudes of the three channels, and the direction (θ) is the direction corresponding to the maximum magnitude.
Fig. 6. Block diagram of the proposed method
9. Apply shape feature extraction (HOG) on the resultant M and θ matrix. 10. Concatenate the three feature vectors obtained from steps 3, 5, and 9 to yield the final feature vector.
GI = 0.2989 · R + 0.5870 · G + 0.1140 · B    (1)

H = θ if B ≤ G, and H = 360° − θ if B > G    (2)

where θ = cos⁻¹{ ½[(R − G) + (R − B)] / [(R − G)² + (R − B)(G − B)]^(1/2) }

S = 1 − 3 · min(R, G, B) / (R + G + B)    (3)

M = √(gx² + gy²)    (4)

θ = tan⁻¹(gy / gx)    (5)
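A sketch of the feature construction of Algorithm 1 is given below, assuming OpenCV and scikit-image. skimage's local_binary_pattern with the rotation-invariant 'ror' method stands in for the CLBP operator, HOG is computed on the grayscale image rather than on the max-channel gradient field, and autocorrelogram is the helper sketched after Fig. 5; it is therefore an approximation of the pipeline, not the authors' exact implementation.

```python
import cv2
import numpy as np
from skimage.feature import hog, local_binary_pattern

def extract_feature_vector(bgr_window):
    """HOG + texture + colour features for one 64 x 128 detection window."""
    gray = cv2.cvtColor(bgr_window, cv2.COLOR_BGR2GRAY)            # grayscale, cf. Eq. 1
    hsv = cv2.cvtColor(bgr_window, cv2.COLOR_BGR2HSV)              # hue/saturation planes

    shape_feat = hog(gray, orientations=9, pixels_per_cell=(8, 8),
                     cells_per_block=(2, 2), block_norm="L2-Hys")

    lbp = local_binary_pattern(gray, P=8, R=1, method="ror")       # rotation-invariant LBP
    texture_feat, _ = np.histogram(lbp, bins=256, range=(0, 256), density=True)

    colour_feat = np.concatenate([
        autocorrelogram(hsv[:, :, 0] // 16, n_colours=16),         # quantized hue
        autocorrelogram(hsv[:, :, 1] // 16, n_colours=16),         # quantized saturation
    ])
    return np.concatenate([shape_feat, texture_feat, colour_feat])
```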
3 Results and Discussion The proposed method is applied on the benchmark INRIA Pedestrian Dataset [9]. The features obtained by the proposed method are trained with SVM [10] using linear kernel. The SVM implementation is performed with LIBSVM [11] library.
3.1 Dataset: INRIA Pedestrian Dataset In INRIA Pedestrian dataset, the people are usually standing but appear in any orientation. The different characteristics of this image database are it contains pedestrians against a wide variety of background image including crowds. Many are bystanders taken from the image backgrounds, so there is no particular bias on their pose. • Train set contains 614 positive images containing 1239 persons; along with left and right reflection, a total of 2478 person images and 1218 negative, i.e., person-free images. A fixed set of 12,180 patches sampled randomly from 1218 person-free training photos provided the negative set. • Test set contains 288 positive images containing 566 persons, along with left and right reflection, 1132 images and 453 negative images. To get the negative windows for testing, the same sampling structure is followed, and 4530 negative images are obtained, ten from each image. • The positive images for training are cropped to 96 × 160 including a border of 16 pixels, whereas the positive images for testing are cropped to 70 × 140 including a border of three pixels on all sides to avoid biasing in classifier. The center 64 × 128 window is used for processing. The same size of 64 × 128 is maintained while extracting negative windows from both the train and test set. The window size is selected based on the aspect ratio of a pedestrian. Sample images are given in Figs. 7 and 8.
Fig. 7. Sample positive images containing pedestrians
Fig. 8. Sample negative images containing random background scenes
3.2 Confusion Matrix The features from train positive images, which contain pedestrians, are given label as 1, and the features of train negative images, which do not contain pedestrians, are given
label −1. The train feature matrix and train label vector are used to train SVM with linear kernel and regularization parameter as 0.01, which yields a model. When the model is given test images’ feature set (containing both positive and negative), it outputs a label which is either 1 or −1. Based on this predicted label, a 2 × 2 confusion matrix is created, which gives a summary of the correct and incorrect predictions in the system. This process is shown in Fig. 9.
Fig. 9. Process to obtain result
Confusion matrix gives the following information:
• True Positive: A pedestrian present in the test image which is predicted to be positive.
• False Positive: A pedestrian absent in the test image which is predicted to be positive.
• False Negative: A pedestrian present in the test image which is predicted to be negative.
• True Negative: A pedestrian neither present in the test image nor predicted to be positive.
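The training and evaluation step described above can be sketched as follows, using scikit-learn's linear SVM as a stand-in for the LIBSVM interface used in the paper, with the regularization parameter set to 0.01 as stated.

```python
from sklearn.svm import LinearSVC
from sklearn.metrics import confusion_matrix

def train_and_evaluate(X_train, y_train, X_test, y_test):
    """Train a linear SVM on labelled feature vectors and report the 2 x 2 confusion matrix."""
    model = LinearSVC(C=0.01)             # linear kernel, regularization parameter 0.01
    model.fit(X_train, y_train)           # labels: +1 (pedestrian), -1 (background)
    y_pred = model.predict(X_test)
    cm = confusion_matrix(y_test, y_pred, labels=[1, -1])
    scores = model.decision_function(X_test)   # thresholded to trace the ROC/DET curves
    return cm, scores
```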
With the confusion matrix, ROC curve [12] and DET curve [13] are obtained for performance comparison. • ROC Curve Receiver operating characteristic (ROC) curve shows the model’s ability to differentiate between pedestrian and non-pedestrian. In a ROC curve, the detection rate (true positive rate) is plotted in function of the false positive rate for different thresholds. The points on the ROC curve denote a true positive rate − false positive rate pair for a particular threshold. • DET Curve A detection error trade-off (DET) curve plots log-log error rates on both the axes. The xaxis scales false miss rate (false rejection rate) vs. false positive rate nonlinearly, yielding a performance measure curve that is more linear than ROC Curves. Performance metrics for INRIA dataset. ROC curve and DET curve are given in Figs. 10 and 11, respectively. It can be observed that proposed method, i.e., HOG + CLBP + AutoCor outperforms other methods presented in this study. The miss rates
given in Table 1 are obtained from the DET curve at 0.001 FPR. We have tested three texture feature extraction processes, i.e., LBP, ULBP, and CLBP, among which CLBP is the best performer. When HOG is concatenated with CLBP (HOG + CLBP), the miss rate improves by 1.52% compared to HOG alone. The miss rate further improves by 1.89% when HOG is combined with the texture feature CLBP and the color feature AutoCor (HOG + CLBP + AutoCor). This shows that, to detect pedestrians, both local texture and global color features are essential along with the global shape feature.

Table 1. Miss rate (%) for INRIA Pedestrian dataset at 0.001 FPR

Method               | Miss rate (%)
LBP                  | 99.25
ULBP                 | 99.47
CLBP                 | 59.15
AutoCor              | 92.51
HOG                  | 9.22
HOG + CLBP           | 7.70
HOG + CLBP + AutoCor | 5.81
4 Conclusion This paper has presented an extensive feature descriptor for pedestrian detection. Our proposed method combines global and local features in the form of shape, texture, and color. It is evaluated on the benchmark INRIA Pedestrian dataset. Linear SVM is used for classification. Three texture (LBP, ULBP, and CLBP) features are investigated. Local CLBP, which is found to be the best performing among them, is concatenated with global color (AutoCor) and global shape (HOG) features. The proposed method HOG + CLBP + AutoCor has given low miss rate of 5.81% at low false positive rate, which is a 3.41% improvement over traditional HOG. Thus, our proposed method to combine the global and local features is able to give a better performance.
Fig. 10. ROC curve for INRIA
Fig. 11. DET curve for INRIA
References
1. P. Viola, M.J. Jones, Robust real-time face detection. Int. J. Comput. Vision 57(2), 137–154 (2004). https://doi.org/10.1023/B:VISI.0000013087.49260.fb
2. S. Paisitkriangkrai, C. Shen, J. Zhang, Fast pedestrian detection using a cascade of boosted covariance features. IEEE Trans. Circuits Syst. Video Technol. 18(8), 1140–1151 (2008). https://doi.org/10.1109/TCSVT.2008.928213
3. D. Xia, H. Sun, Z. Shen, Real-time infrared pedestrian detection based on multi-block LBP, in 2010 International Conference on Computer Application and System Modeling (ICCASM), October 2010, vol. 12, pp. 139–142. https://doi.org/10.1109/ICCASM.2010.56221
4. N. Dalal, B. Triggs, Histograms of oriented gradients for human detection (2005). https://doi.org/10.1109/CVPR.2005.177
5. Y. Wang, Y. Zhao, Y. Chen, Texture classification using rotation invariant models on integrated local binary pattern and Zernike moments. EURASIP J. Adv. Signal Process. 1, 182–192 (2014). https://doi.org/10.1186/1687-6180-2014-182
6. J. Huang, S.R. Kumar, M. Mitra, W.J. Zhu, R. Zabih, Image indexing using color correlograms, in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (San Juan, Puerto Rico, USA, June 1997), pp. 762–768. https://doi.org/10.1109/cvpr.1997.609412
7. Y.D. Chun, N.C. Kim, I.H. Jang, Content-based image retrieval using multiresolution color and texture features. IEEE Trans. Multimed. 10(6), 1073–1084 (2008). https://doi.org/10.1109/tmm.2008.2001357
8. T. Ojala, M. Pietikäinen, T. Mäenpää, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 971–987 (2002). https://doi.org/10.1109/TPAMI.2002.1017623
9. INRIA download link. http://pascal.inrialpes.fr/data/human/. Accessed 18 Oct 2019
10. V. Vapnik, The Nature of Statistical Learning Theory, 2nd edn. (Springer Science & Business Media, 2000). https://doi.org/10.1007/978-1-4757-3264-1
11. C.C. Chang, C.J. Lin, LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 27:1–27:27 (2011). https://doi.org/10.1145/1961189.1961199
12. T. Fawcett, An introduction to ROC analysis. Pattern Recogn. Lett. 27(8), 861–874 (2006). https://doi.org/10.1016/j.patrec.2005.10.010
13. A. Martin, G. Doddington, T. Kamm, The DET curve in assessment of detection task performance. National Institute of Standards and Technology, Gaithersburg, MD (1997). 10.1.1.117.4489
Logistic Map-Based Image Steganography Using Edge Detection Aiman Jan1(B) , Shabir A. Parah1 , and Bilal A. Malik2 1 PG Department of Electronics & IT, University of Kashmir, Srinagar, J&K, India
[email protected], [email protected] 2 Institute of Technology, University of Kashmir, Srinagar, J&K, India
[email protected]
Abstract. Transferring data via an insecure network has become a challenge among researchers in this fast-growing technological world. In this paper, a doublelayer security framework based on cryptography and steganography has been developed and tested. Logistic maps have been used for encrypting data before embedding it into cover images. The encrypted data has been hidden into edge areas of the cover images to ensure better imperceptivity at a given payload. The scheme has been evaluated in terms of various objective parameters like peak signal-to-noise ratio (PSNR), normalized cross-correlation (NCC), and structural similarity index (SSIM). Besides, the strength of the cryptographic algorithm based on logistic map using number of changes per rate (NPCR), unified average changed intensity (UACI), and entropy has been computed. Our framework reports the average PSNR value of 44.61 dB for payload of around 1.72 bits per pixel (bpp). In addition, the NPCR value of about 100%, UACI value of 36.72, and entropy value of 7.96 depict that our scheme is capable of providing ample security to the data to be transmitted. Keywords: Cryptography · Image steganography · Canny edge detection · Chaotic maps · Logistic map · Least significant bit (LSB) · Random sequence
1 Introduction The innovation advancement in communication systems positively impacts interpersonal interaction. The technological advancement in communication system facilitates better utilization of electronic gadgets, like cell phones and tablets without time and spot concerns. In everyday life, as innovation keeps on improving, security of digital information transferring via network becomes a significant issue. The fundamental aspect of information security is to discourage unapproved replication, change in information, whether to put away in a capacity gadget or web [1]. Therefore, there is a need to take some precautionary measures for secure transmission of secret data. A few techniques, for example, cryptography, steganography, and watermarking are utilized to give security to the information [2]. These techniques are based on four basic characteristics, viz. payload, integrity, security, and robustness. © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_50
Cryptography is the process of changing the plain text into an encryption form that is sent via an insecure network [3]. Cryptography alone cannot be enough to withstand the attacks of unintended users. There is a need to provide additional security to the data being transferred. Steganography can be the better option for improving security to the information as its main job is to conceal the existence of information [4]. Both cryptography and steganography together provide double security to the secret information being communicated through the Internet. The least significant bit (LSB) method is the basic steganographic technique [5]. Mostly, three LSB bits are used for the embedding process. Extension to the number of bits causes degradation in the quality of an original image, and changes in the smooth region are easily perceivable to human eye. Embedding in the edge region by using edge detection technique can be the better option for increasing embedding capacity with a better quality of an image [6]. There are many edge detection techniques, viz. Robert, Sobel, Prewitt, and Canny. Among these, Canny is considered as the best technique for finding edges. In the proposed technique, the secret information has been embedded in edge areas using Canny edge detection technique to reduce the perceptivity of the cover image, to increase embedding capacity, and to maintain the image quality.
2 Related Work Nowadays, greater payload with the security of the algorithm has gained the attention of the researchers. For fulfilling the security need, enormous research papers [7] have been published on the hybrid of cryptography and edge detection-based steganography techniques, but the data concealed in the cover image cannot be that much secure that it can withstand attacks due to practicality of the cover image, resulting in the identification of the encrypted data. Implementation of standard encryption techniques for documentary data on digital media is unsafe with major issue being media encryption. Therefore, the encryption using standard methods, for instance, Data Encryption Standard (DES), Advanced Encryption Standard (AES), and Rivest–Shamir–Adleman algorithm (RSA) [8] requires a lot of time for encryption and is inefficient. Another issue with these methods is their key length constraint. Due to the huge proportion of picture data, the usage of confined length keys will increase propensity of attacks. Additionally, the information stream of the image may result in loss of the image specifications, and hence, making these methods less reliable for image encryption. In the present study, researchers use highly developed encryption methods like chaos theory to overcome the problem of standard methods. Chaos makes the system secure because of its properties like sensitive to initial conditions and control parameters. Logistic map is basic and simple technique for generating random sequence [9]. The map is dependent on two initial conditions r and x that varies from 3.57 to 4 and 0 to 1, respectively, and n is the number of iterations that vary from 0 to x-1. The equation for the logistic map is written as follows: xn+1 = r ∗ x ∗ (1 − xn )
(1)
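A short sketch of how such a logistic-map sequence can be turned into a keystream for XOR-based encryption of an image plane is given below, assuming NumPy; quantizing the chaotic values to bytes by scaling with 256 is one common choice and is an assumption, not a detail taken from the cited schemes.

```python
import numpy as np

def logistic_keystream(r, x0, length):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and quantize each value to a byte."""
    x = x0
    stream = np.empty(length, dtype=np.uint8)
    for n in range(length):
        x = r * x * (1.0 - x)
        stream[n] = int(x * 256) % 256      # assumed byte quantization
    return stream

def xor_plane(plane, r=3.99, x0=0.5):
    """Encrypt (or decrypt) one image plane by XOR with the chaotic keystream."""
    key = logistic_keystream(r, x0, plane.size).reshape(plane.shape)
    return np.bitwise_xor(plane, key)
```

Because XOR is its own inverse, calling xor_plane again with the same initial values r and x0 recovers the original plane, which is exactly the decryption step used at the receiver.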
Logistic map-based image steganography technique has been proposed [9] to enhance security for data encryption. In this technique, the choice of LSB or pixel
value differencing (PVD) approach has been used for embedding the secret data into the cover image that depends on a secret key. The scheme has been able to improve security with the randomness behavior of the logistic map. But, the trade-off between the quality of an image and payload has not been maintained to optimum. Also, the system has been tested only on grayscale images. Another, paper based on logistic map [10] has been proposed in which AES technique is utilized to encode the secret image and the 1D logistic map is used to cover up an encoded secret message into the host picture. The system provides good security but lacks in providing large capacity. One more, logistic map-based watermarking scheme [11] has been proposed to make the system secure against attacks. In this technique, DC coefficient modification of different blocks in the spatial domain has been done for embedding the watermarking bits, and the logistic map is used for improving security. The method has been able to enhance security but lacks in improving capacity. A new logistic map-based image encryption has been reported [12] for secure communication. Although, this technique is able to improve security than the simple cryptography techniques, but the scheme did not show better security values. In this paper, both the logistic map and edge detection-based steganography technique have been proposed. In this study, the logistic map is used to improve the security to the system, and edge detection technique is used for improving capacity for the secret message.
3 Proposed Algorithm In the proposed method, both the cryptography and steganography are combined to make the stego image secure for transmission from the sender side. At the recipient side, the entire encrypted message is extracted from the stego image, and thereafter, all the scrambled messages are decrypted. The encryption, embedding, extraction, and decrypting procedure appear in Fig. 1. In this technique, an attempt has been made to embed secret medical image of 256 × 256 size into any 512 × 512 dimensional color image. Canny edge detection has been used for finding edges of the green and blue planes. The proposed method secures medical information by embedding the medical image into the digital image. 3.1 Encryption and Embedding Process (i)
Take cover image and secret image (medical image) and divide them into three planes: red (R), green (G) and blue (B). (ii) Set initial values and generate a random sequence with logistic mapping as per the size of the plane of the secret image. (iii) Encrypt each plane of the secret image by doing XOR operation with a single bit of the generated sequence obtained from step ii. Repeat this step for every plane of the secret image. (iv) Perform Canny edge detection technique on G and B planes of the cover image to detect edge and non-edge pixels. Proposed technique produces binary data for both planes, where ‘0’ and ‘1’ represent non-edge pixel and edge pixel, respectively. (v) Calculate number of edge and non-edge pixels to estimate payload for (k, n) hiding technique, where ‘k’ represents non-edge pixel and ‘n’ represents edge pixel.
Fig. 1. Block diagram of the proposed technique
(vi) Use two LSB bits of the red plane to store the status of green and blue for embedding information. (vii) Embed the encrypted image into G and B planes as per the (k, n) hiding scheme. Here, we have taken n = 3. Repeat ‘vi’ and ‘vii’ steps until entire encrypted image pixels get embedded into the cover image to generate stego image. 3.2 Extraction and Decryption Process (i) Take stego image and separate its planes as red (R), green (G) and blue (B). (ii) Extract the message bits from G and B planes by checking the status of the first two bits of red plane. Repeat this step till all the message bits are extracted and encrypted image is formed. (iii) Compute R, G, B planes of the encrypted image. (iv) Decrypt every plane by doing XOR operation with the single bit of generated sequence with the same initial values. Repeat this step for every plane and concatenate them to form secret image.
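A simplified sketch of the edge-based hiding step is shown below, assuming OpenCV: Canny marks the edge pixels of a plane, and n = 3 secret bits are written into the three LSBs of each edge pixel. The (k, n) handling of non-edge pixels and the red-plane status bits of steps (vi)–(vii) are omitted, and the Canny thresholds are illustrative.

```python
import cv2
import numpy as np

def embed_in_edges(plane, secret_bits, n=3):
    """Hide secret_bits (a sequence of 0/1 values) in the n LSBs of the plane's edge pixels."""
    edges = cv2.Canny(plane, 100, 200)               # edge map: 255 on edges, 0 elsewhere
    stego = plane.copy()
    bits = list(secret_bits)
    pos = 0
    for y, x in zip(*np.nonzero(edges)):
        if pos >= len(bits):                          # all secret bits embedded
            break
        chunk = bits[pos:pos + n]
        pos += len(chunk)
        value = int(stego[y, x]) & ~((1 << n) - 1)    # clear the n LSBs
        for i, b in enumerate(chunk):
            value |= int(b) << i                      # write bits LSB-first
        stego[y, x] = value
    return stego, edges
```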
4 Experimental Results This section presents the experimental results of the proposed technique, implemented in MATLAB. For the experiment, we have considered color cover images of size 512 × 512 and a color secret image of 256 × 256 dimensions. The cover image and stego image are compared to verify the capacity, security, and quality of the image by calculating the mean square error (MSE), PSNR, NCC, normalized absolute error (NAE), SSIM, NPCR, UACI, and entropy. Comparison tables with already existing techniques are shown in Tables 1, 2, and 3.
Table 1. NAE and SSIM comparison with the presented methods

Parameters | Parah et al. [4] NAE | Parah et al. [4] SSIM | Proposed NAE | Proposed SSIM
Lena       | 0.0140               | 0.9558                | 0.0053       | 0.9992
Plane      | 0.0006               | 0.9529                | 0.0038       | 0.9909
Baboon     | 0.0136               | 0.9818                | 0.0055       | 0.9986
Table 2. PSNR and NCC comparison with already available techniques

Image (at payload)           |      | Lena   | Plane  | Baboon
Singh et al. [5] (0.032 bpp) | PSNR | 40.74  | –      | –
                             | NCC  | 0.9253 | –      | –
Patil et al. [10] (1 bpp)    | PSNR | 13.14  | –      | –
                             | NCC  | –      | –      | –
Parah et al. [4] (1 bpp)     | PSNR | 45.40  | 45.37  | 45.41
                             | NCC  | 1.0000 | 1.0000 | 1.0000
Parah et al. [11] (1.5 bpp)  | PSNR | 39.17  | 39.13  | 39.09
                             | NCC  | 1.0000 | 1.0000 | 1.0000
Prasad et al. [9] (2.27 bpp) | PSNR | 39.56  | 39.12  | 37.38
Proposed (1.72 bpp)          | PSNR | 44.95  | 44.98  | 43.91
                             | NCC  | 0.9963 | 0.9978 | 0.9976
Table 3. NPCR, UACI, and entropy comparison with the already existing techniques

Techniques        | Entropy | UACI  | NPCR
Nidhi et al. [12] | 6.48    | 33.31 | 98.85
Parah et al. [11] | 6.53    | 34.82 | 99.01
Proposed          | 7.96    | 36.72 | 100
4.1 Mean Square Error (MSE) It is the average squared difference between the original image and the stego image. This is mathematically expressed as follows:

MSE = (1/N) Σ_{i=1}^{N} (C_i − C′_i)²    (2)
where C_i and C′_i are the pixel values of the original image and the stego image, respectively, and N denotes the number of pixels in the images.

4.2 Peak Signal-to-Noise Ratio (PSNR) This is one of the most widely used quality measures, as it is very effective in estimating perceptual quality. PSNR is mathematically written as follows:

PSNR = 10 log₁₀(255² / MSE)    (3)

where MSE is the mean square error.

4.3 Normalized Cross-Correlation (NCC) Normalized cross-correlation measures the correlation between the cover image and stego image bits. It is represented as follows:

NCC = Σ_{i=1}^{m} Σ_{j=1}^{n} w_o(i, j) · w_x(i, j) / Σ_{i=1}^{m} Σ_{j=1}^{n} w_o(i, j)²    (4)

where w_o and w_x are the embedded and extracted message bits, respectively.

4.4 Normalized Absolute Error (NAE) This quality measure is used to calculate the error between the cover image and the stego image. It is expressed as follows:

NAE = Σ_{i=1}^{m} Σ_{j=1}^{n} |w_o(i, j) − w_x(i, j)| / Σ_{i=1}^{m} Σ_{j=1}^{n} w_o(i, j)    (5)

where w_o and w_x are the embedded and extracted message bits, respectively.

4.5 Structural Similarity Index (SSIM) The structural similarity index measures the resemblance between an original and a modified image. SSIM takes a value between 0 and 1, where 1 indicates 100% similarity and 0 indicates completely unrelated images. Hence, a good hiding scheme must have a high value of SSIM. It is computed as follows:

SSIM(x, y) = (2 μ_x μ_y + C₁)(2 σ_xy + C₂) / ((μ_x² + μ_y² + C₁)(σ_x² + σ_y² + C₂))    (6)

where μ_x, μ_y, σ_x, σ_y, and σ_xy are the mean intensities, standard deviations, and cross-covariance of images x and y, respectively.
4.6 Number of Changes Per Rate (NPCR) The NPCR test evaluates the effect of altering a single pixel of the original image; it measures the sensitivity of the encrypted image to the original image and to the initial values/secret key. The test is mathematically defined as follows:

NPCR = (1/(M·N)) Σ_{i,j=1}^{M,N} D(i, j) × 100%    (7)

where M and N are the dimensions of the image, D marks the unequal entries, and I_p and I_c are the plain image and cipher image, respectively:

D(i, j) = 0 if I_p(i, j) = I_c(i, j), and D(i, j) = 1 if I_p(i, j) ≠ I_c(i, j)    (8)
4.7 Unified Average Changed Intensity (UACI) The UACI evaluates the average intensity of the differences between the original image and the ciphered image. It is mathematically represented as follows:

UACI = (1/(M·N)) Σ_{i,j=1}^{M,N} (|I_p(i, j) − I_c(i, j)| / 255) × 100%    (9)

where I_p and I_c are the plain and ciphered images, and M and N are the dimensions of the image.

4.8 Entropy The entropy test is performed to scrutinize the randomness of the encrypted image. The mathematical representation of Shannon's entropy is defined as follows:

E(C) = − Σ_{i=1}^{n} P(c_i) log₂ P(c_i)    (10)
where C is a collection of symbols, c_i ∈ C, P(c_i) denotes the probability of c_i, and n is the number of symbols. Tables 1 and 2 show the quality comparison of the original and stego images; the existing techniques operate at around 1 or 2.27 bpp, whereas the proposed algorithm embeds 1.72 bpp with an average PSNR value of 44.61 dB. It is clear from the tables that the proposed scheme is better than the existing ones, as it has a higher PSNR, better SSIM, and lower NAE. Thus, the changes made in the image are less perceivable to an eavesdropper, and the scheme is therefore secure for communication. The security analysis test results are shown in Table 3. It is evident from the results that the proposed scheme is more secure than those of Nidhi et al. [12] and Parah et al. [11]. The proposed algorithm achieves an NPCR value of 100% and a UACI value of 36.72.
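For reference, the objective measures of Eqs. (2)–(3) and (7)–(10) can be computed with a few lines of NumPy, as sketched below for 8-bit grayscale planes.

```python
import numpy as np

def psnr(cover, stego):
    mse = np.mean((cover.astype(float) - stego.astype(float)) ** 2)   # Eq. (2)
    return 10 * np.log10(255.0 ** 2 / mse)                            # Eq. (3)

def npcr_uaci(plain, cipher):
    diff = plain.astype(int) != cipher.astype(int)
    npcr = diff.mean() * 100                                          # Eqs. (7)-(8)
    uaci = (np.abs(plain.astype(float) - cipher.astype(float)) / 255).mean() * 100   # Eq. (9)
    return npcr, uaci

def entropy(img):
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))                                    # Eq. (10)
```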
5 Conclusion The paper presents the steganography technique based on logistic mapping and edge detection technique. Here, the medical image is encrypted using logistic map. The initial values are used for information encryption, which increases security. At recipient end, initial values are used to decrypt the secret medical image. This complete encrypted image has been embedded using steganography through edge detection technique. Various tests have been done on cover image, stego image, secret medical image, and encrypted image. From the comparison of different parameters, it is clear that the proposed technique is better in quality, security, and capacity together.
References
1. S. Dogan, A reversible data hiding scheme based on graph neighbourhood degree. J. Exp. Theoret. Artif. Intell. 29(4), 741–753 (2017)
2. F. Ahad, S.A. Parah, J.A. Sheikh, N.A. Loan, G.M. Bhat, Information hiding in medical images: a robust medical image watermarking system for E-healthcare. Multimed. Tools Appl. (2015). https://doi.org/10.1007/s11042-015-3127-y
3. C. Biswas, U.D. Gupta, M.M. Haque, A hierarchical key derivative symmetric key algorithm using digital logic, in IEEE International Conference on Electrical, Computer and Communication Engineering (ECCE) 2017 (Cox's Bazar, Bangladesh, 2017), pp. 16–18
4. S.A. Parah, J.A. Sheikh, J.A. Akhoon, N.A. Loan, Electronic health record hiding in images for smart city applications: a computationally efficient and reversible information hiding technique for secure communication. Future Gener. Comput. Syst., Elsevier (2018). https://doi.org/10.1016/j.future.2018.02.023
5. S. Singh, T.J. Siddiqui, A security enhanced robust steganography algorithm for data hiding. IJCSI 9(1) (2012)
6. K. Amanpreet, K. Sumeet, Image steganography based on hybrid edge detection and 2k correction method. Int. J. Eng. Innov. Technol. (IJEIT) 1(2) (2012)
7. M. Kumari, S. Gupta, P. Sardana, A survey of image encryption algorithms. 3D Res. 8(37), Springer (2017)
8. Z. Hua, F. Jin, B. Xu, H. Huang, 2D logistic-sine-coupling map for image encryption. Signal Process. 149, 148–161, Elsevier (2018)
9. S. Prasad, A.K. Pal, Logistic map-based image steganography scheme using combined LSB and PVD for security enhancement, in Emerging Technologies in Data Mining and Information Security, Adv. Intell. Syst. Comput., vol. 814 (Springer). https://doi.org/10.1007/978-981-13-1501-5_17
10. N.C. Patil, V.V. Patil, Secure data hiding using encrypted secrete image. Int. J. Eng. Sci. Res. Technol. 4(10) (2015). http://www.ijesrt.com
11. S.A. Parah, N.A. Loan, A.A. Shah, J.A. Sheikh, G.M. Sheikh, A new secure and robust watermarking technique based on logistic map and modification of DC coefficient. Springer Science+Business Media B.V. (2018). https://doi.org/10.1007/s11071-018-4299-6
12. S. Nidhi, A new image encryption method using chirikov and logistic map. Int. J. Comput. Appl. 59, 2123–2129 (2013)
Smart Vehicle Tracker for Parking System Ishita Swami1(B) and Anil Suthar2 1 Department of Computer Engineering, LJ Institute of Engineering and Technology,
Ahmedabad, Gujarat, India [email protected] 2 Department of Electronics & Communication, LJ Institute of Engineering and Technology, Ahmedabad, Gujarat, India [email protected]
Abstract. In modern times in India, the population shift is taking place from villages to cities for better job opportunities. There is a rapid growth in population. Demand of vehicles is therefore increasing at a rapid rate that was never before. Therefore, it is strenuous and exorbitant to create more parking spaces because of the specific number of the free spaces in the cities. By employing embedded systems, there is a way to develop an application which can give a solution of parking problems. The suggested IOT-based smart parking system monitors and indicates the availability of each parking space. The reason for developing this project is to reduce smart city issues such as the traffic on roadside and reduce the pollution in the city and in the parking. This project aims in developing an automatic number plate recognition (ANPR) system which can extract the license plate number from the vehicles using image processing algorithms on Raspberry Pi to keep track of how many vehicles entered and left the parking area. Also, the use of database helps to automate the process of allowing those vehicles which are already stored in it. For the vehicles which are not stored in the database, the red LED glows to notify the parking manager, guard and that vehicle is allowed to be parked in visitor’s parking area. Keywords: Parking system · IOT · Open CV · Raspberry Pi · Python
1 Introduction For vehicle surveillance, the automatic number plate recognition (ANPR) technology has been very useful in past few years. It uses optical character recognition (OCR) on images to read vehicle registration plates. This technology can be used to store the images captured by the camera and text from the number plate. ANPR technology considers plate variations from place to place. It can be applied at parking lots of public places like shopping malls, corporate offices residential areas and automatic toll text collection and public safety agencies. The main aim of these systems is to recognize the characters and state on the number plate with high accuracy. For example, when vehicle enters in a shopping mall, they provide a card and charge fees accordingly. If a card gets damaged or misplaced, a heavy fine is levied. To avoid © Springer Nature Singapore Pte Ltd. 2021 M. K. Sharma et al. (eds.), Innovations in Computational Intelligence and Computer Vision, Advances in Intelligent Systems and Computing 1189, https://doi.org/10.1007/978-981-15-6067-5_51
such an issue, the proposed system uses number plate of the vehicle to automatically save the entry and exit time and calculate the parking fees accordingly. Also, a local database stores a list of preregistered vehicles that will be allowed automatically as it enters.
2 Literature Review Cassin et al. [1] created a prototype of a smart parking framework for reservation in an urban area using IoT, built on a Raspberry Pi integrated with the cloud. They further expanded the work using OCR and facial recognition to provide double security. Cynthia et al. [2] proposed an IoT-based smart parking system which integrates several physical devices to check parking slot availability using the Arduino IDE. An Android app is built for booking parking slots and payments. The application is used to find a free slot, and the user needs to specify the estimated time of arrival and the parking slot usage start and end times. After booking a free parking slot, when the vehicle enters the entrance gate, it is assumed that each car has a built-in RFID card, and an RFID reader verifies and authenticates the vehicle. The parking slot may be allotted for small and large vehicles. Agarwal et al. [3] designed a system through which the image of the number plate is captured using a USB camera when a vehicle crosses during a red light. The system uses a Raspberry Pi, which transmits the characters identified using image processing methods to the control room so that mandatory action can be taken against the driver. Jyothi Sravya et al. [4] found a solution in which, if any vehicle crosses when the red signal is on, a camera automatically starts, captures the image of the vehicle, extracts the number plate, sends it to the database, and sends the payment amount and link as an SMS (including date and time) to the vehicle owner. An ultrasonic sensor is used to calculate the distance between the zebra crossing line and the vehicles. Reddy et al. [5] proposed a system where an image of the number plate is taken and a Raspberry Pi is then used to obtain the plate number. If an unauthorized number plate is detected, the authorities are informed by the use of a buzzer; when an authorized vehicle is detected, the gate is opened.
3 Basic Method for Number Plate Recognition Recognition implies the conversion of image text to character, and the final result is the number of the license plate. The characters do not have the same size and thickness, and hence, are resized into a uniform size before recognition. After the image is captured by the camera, the image is preprocessed and then image is segmented and finally characters are recognized. There are various character recognition algorithms that are used in ANPR system. Mainly, character recognition techniques include template matching and optical character recognition (OCR) [6].
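Of the two recognition approaches named above, template matching is the simpler one to prototype with OpenCV. The sketch below is illustrative only and is not taken from the paper's implementation; the template folder, file naming, and resizing strategy are assumptions.

# Hypothetical template-matching recognizer for one segmented plate character.
# Assumes a folder of grayscale character templates named 'templates/0.png',
# 'templates/A.png', ..., roughly the same shape as the segmented glyphs.
import os
import cv2

def recognize_char(glyph_gray, template_dir="templates"):
    """Return (label, score) of the best-matching template for one character."""
    best_label, best_score = None, -1.0
    for fname in os.listdir(template_dir):
        label = os.path.splitext(fname)[0]
        templ = cv2.imread(os.path.join(template_dir, fname), cv2.IMREAD_GRAYSCALE)
        # Resize the template to the glyph size so matchTemplate returns a 1x1 score map
        templ = cv2.resize(templ, (glyph_gray.shape[1], glyph_gray.shape[0]))
        score = cv2.matchTemplate(glyph_gray, templ, cv2.TM_CCOEFF_NORMED)[0][0]
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score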
4 Proposed System
We have used optical character recognition in the proposed system to recognize the characters on the vehicle number plate, mostly for four-wheelers. The system deals with the utilization of a Raspberry Pi, an ESP8266 and an infrared sensor. The setup is mounted at the entry of a parking area, on a slope inside a glass enclosure so that vehicles can easily pass over it. The number plate of every entering vehicle is processed, and the red or green LED glows accordingly.
4.1 System Design
The following flowchart represents the extraction of the number plate by automatically reading number plates to recognize vehicles. The automatic number plate recognition (ANPR) system employs image processing and character recognition methods.
Algorithm:
1. IR sensor detects the object (vehicle).
2. Sensor sends the signal to the Pi camera.
3. Pi camera is activated, captures an image of the vehicle and sends it to the server.
4. Server preprocesses the image for noise removal.
5. Edges are detected, and each character is segmented.
6. Segmented characters are recognized using OCR.
7. If the recognized characters match the database, go to step 8; else go to step 9.
8. Glow the green LED and allow parking in the member parking area.
9. Glow the red LED and allow parking in the visitor parking area.
4.2 Working Principle
Initially, the image is captured in real time using the Pi camera V2 module. OpenCV is used at the number detection stage. Open-source computer vision (OpenCV) is a library mainly providing functions employed in real-time computer vision and image processing. Once the number plate is detected, a green contour appears on each character of the number plate and then each character is identified. For detecting the number plate, (a) sensor, (b) Raspberry Pi 4B, (c) Raspberry Pi 8MP camera and (d) LCD modules were used.
(a) Sensor: When a vehicle enters the parking area, the obstacle sensor detects its presence and sends a signal to the server to capture an image of the vehicle. The obstacle sensor detects the presence of a vehicle at a range of 2–80 cm; the range can be extended further.
(b) Raspberry Pi 4B: The Raspberry Pi acts as the server, and the ESP8266 microcontroller acts as a slave. The sensor, 16 * 2 LCD display, camera and LEDs are interfaced with the server. The ESP8266 extracts useful information (the presence of a vehicle) from the sensor and sends it to the server via Wi-Fi; further processing such as image preprocessing and license plate recognition is done on the server. Two LEDs are interfaced with the server to show the driver the status of the gate. A red light indicates that the vehicle number plate is not stored in the database, notifying the security guard to check the vehicle and allow it to park in the visitor parking area. A green light indicates that the vehicle is allowed to enter the parking area because it is stored in the database (Fig. 1).
Fig. 1. Flowchart for number plate recognition
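As an illustration of the flow in Fig. 1 and the LED logic described under (b), a rough server-side loop is sketched below. It is a simplification of the setup described above: the IR sensor is read directly from a GPIO pin instead of being relayed by the ESP8266 over Wi-Fi, the member database is a plain text file of plate numbers, the pin numbers and file names are arbitrary, and recognize_plate() here is only a placeholder for the pipeline of Sect. 4.3.

# Simplified Raspberry Pi decision loop (illustrative; pin numbers, file names
# and the direct sensor wiring are assumptions, not the paper's exact setup).
import time
import cv2
import pytesseract
import RPi.GPIO as GPIO
from picamera import PiCamera

IR_PIN, GREEN_LED, RED_LED = 17, 27, 22        # assumed BCM pin numbers

GPIO.setmode(GPIO.BCM)
GPIO.setup(IR_PIN, GPIO.IN)
GPIO.setup([GREEN_LED, RED_LED], GPIO.OUT, initial=GPIO.LOW)

camera = PiCamera()
with open("member_plates.txt") as f:           # local database of registered plates
    members = {line.strip().upper() for line in f if line.strip()}

def recognize_plate(path):
    # Placeholder: the full OpenCV pipeline of Sect. 4.3 would go here.
    return pytesseract.image_to_string(cv2.imread(path)).strip().upper()

try:
    while True:
        # Sensor polarity may be inverted depending on the IR module used
        if GPIO.input(IR_PIN):                 # vehicle detected by obstacle sensor
            camera.capture("/tmp/vehicle.jpg")
            plate = recognize_plate("/tmp/vehicle.jpg")
            led = GREEN_LED if plate in members else RED_LED
            GPIO.output(led, GPIO.HIGH)        # green: member area, red: visitor area
            time.sleep(5)
            GPIO.output(led, GPIO.LOW)
        time.sleep(0.1)
finally:
    GPIO.cleanup()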
(c) Raspberry Pi 8MP Camera: When the sensor detects the vehicle, it sends signal to Pi camera. Pi camera captures the image of vehicle and sends captured image for preprocessing (Fig. 2). (d) LCD:
Fig. 2. Raspberry Pi camera capturing the image of the vehicle
The 16 * 2 LCD display is interfaced with the server and displays the time when the vehicle enters and leaves the parking area.
4.3 Experimental Process
To identify the vehicle's number plate, the automatic number plate recognition (ANPR) system applies image processing and character recognition technology. The overall approach consists of the following three parts:
1. Plate detection
2. Segmentation and extraction of characters from the number plate
3. Optical character recognition of the extracted number plate.
(1) Plate detection.
Image preprocessing is necessary for noise elimination. It is a process which enhances the precision and interpretability of an image. Here, the captured image is first preprocessed. The following steps explain the preprocessing for number plate detection.
Step 1: Save the captured image (Fig. 3).
Fig. 3. Captured number plate
Step 2: To resize and grayscale the image (Fig. 4).
Fig. 4. Grayscaled image
The resolution of the captured image is generally large; to avoid the problems this causes, the image is resized. When gray scaling is performed, the image is transformed so that no color details remain, which speeds up the subsequent processing.
Step 3: Remove noise from the image using bilateral filtering (blurring) (Fig. 5).
Fig. 5. Blurring image
Every image contains unwanted details called noise. To remove these unwanted details from image, bilateral filtering (blurring) method is used. Except the number plate, other background details are blurred in the image. Step 4: To detect the edges (Fig. 6).
Fig. 6. Detected edges
For edge detection, the Canny edge method from OpenCV is used. Pixels with an intensity gradient above the maximum threshold are accepted as edges, pixels below the minimum threshold are rejected, and pixels between the two thresholds are accepted only if they are connected to strong edges. (2) Segmentation and extraction of characters from the number plate. Contour detection is an image segmentation technique: contours are continuous lines or curves that cover the boundary of an object in the image. Here, the boundary of each character is covered (Fig. 7). (3) Optical character recognition of the extracted number plate.
Fig. 7. Contour detection
This step of image processing reads the number plate information from the image. To read characters from image, Tesseract OCR engine is used. In Python, pytesseract package is used (Fig. 8).
Fig. 8. Recognized number plate
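The preprocessing-and-OCR chain described in Sect. 4.3 can be sketched in a few lines of Python with OpenCV and pytesseract. The parameter values (resize width, bilateral filter arguments, Canny thresholds, polygon-approximation epsilon, Tesseract page-segmentation mode) are typical illustrative choices rather than the exact values used in the paper, and imutils is an extra convenience library assumed here for version-independent contour handling.

# Illustrative ANPR pipeline: resize/grayscale -> bilateral filter -> Canny ->
# contour search for a 4-sided plate region -> Tesseract OCR on the crop.
import cv2
import imutils
import pytesseract

img = cv2.imread("captured_plate.jpg")
img = imutils.resize(img, width=620)                       # Step 2: resize
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)               # Step 2: grayscale
blur = cv2.bilateralFilter(gray, 11, 17, 17)               # Step 3: noise removal
edges = cv2.Canny(blur, 30, 200)                           # Step 4: edge detection

# Segmentation: keep the largest contours and look for a quadrilateral boundary
cnts = imutils.grab_contours(
    cv2.findContours(edges.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE))
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:10]

plate = None
for c in cnts:
    approx = cv2.approxPolyDP(c, 0.018 * cv2.arcLength(c, True), True)
    if len(approx) == 4:                                   # plausible plate outline
        x, y, w, h = cv2.boundingRect(approx)
        plate = gray[y:y + h, x:x + w]
        break

if plate is not None:
    text = pytesseract.image_to_string(plate, config="--psm 7")
    print("Recognized plate:", text.strip())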
5 Result and Discussion
Using the OCR technique, the text is converted into characters and displayed on the screen. Figure 8 shows the output window, displaying the number plate and the entry time of the vehicle once the server has detected the number plate.
6 Conclusion
The system identifies the characters on the number plate using the OCR technique. As can be seen from the setup, it is cost effective and can easily be installed and maintained. Thus, the system provides a low-cost and efficient solution to parking problems. Little human interaction is required, as every step is automated by the system. With just minor adjustments, the prototype can be implemented as a real-time solution. Further, a template matching algorithm will be used to match the recognized number plate with the database.
References
1. E. Cassin Thangam, M. Mohan, J. Ganesh, C.V. Sukesh, Internet of things (IoT) based smart parking reservation system using Raspberry Pi. Int. J. Appl. Eng. Res. 13(8), 5759–5765 (2018)
2. J. Cynthia, C. Bharathi Priya, P.A. Gopinath, IoT based smart parking management system. Int. J. Recent Technol. Eng. (IJRTE) 7(4S), 374–379 (2018)
3. A. Agarwal, A. Saluja, License plate recognition for remote monitoring of traffic. Int. J. Ind. Electron. Electr. Eng. 5(5), 31–34 (2017)
4. B. Jyothi Sravya, V. Naga Lakshmi, J. Rajasekhar, Recognition of vehicle number plate and measure the distance. Int. J. Recent Technol. Eng. (IJRTE) 7(6), 956–960 (2019)
5. K.V. Reddy, S. Sunkari, A new method of license plate recognition system using Raspberry Pi processor. Int. J. Comput. Sci. Inf. Eng. 3(4), 1–5 (2016)
6. S. Mahalakshmi, S. Tejaswini, Study of character recognition methods in automatic license plate recognition (ALPR) system. Int. Res. J. Eng. Technol. (IRJET) 4(5), 1420–1426 (2017)
An Efficient Technique to Access Cryptographic File System over Network File System Umashankar Rawat, Satyabrata Roy(B) , Saket Acharya, and Krishna Kumar Manipal University Jaipur, Jaipur, Rajasthan 303007, India [email protected], [email protected], [email protected], [email protected]
Abstract. Mounting a cryptographic file system (CFS) over a network file system (NFS) degrades the performance of remote file access. A user-space CFS implemented as a modified NFS server, as in CFS Unix and Extended CFS Unix, can act as a remote NFS server. This enables it to be accessed remotely without the requirement of an extra NFS mount. However, this is not a good approach on its own because of the many security issues involved, such as the transmission of unencrypted passwords and data over an insecure network. When these systems are mounted remotely as NFS servers, security attacks like interception, masquerade, and replay attacks can take place. In this paper, a secure protocol is developed and implemented for Extended CFS Unix using safe methods like mutual credential authentication and session establishment. This approach restricts the aforementioned attacks, enabling secure remote access. Besides, CFS Unix is modified to use NFS version 3 instead of version 2, for reliable write operations. In addition, a detailed comparative analysis of the remote access performance of Extended CFS Unix with the secure protocol against existing CFS mounted over NFS is presented.
Keywords: Cryptographic file system · Network file system · Public key infrastructure · Private key store
1 Introduction
Linux systems provide authentication, authorization, and access control services using policy language, pluggable authentication module (PAM), access control lists (ACLs), usage control model [1], etc. For confidentiality and integrity services, cryptographic file system (CFS) may be used that provides file encryption/decryption along with integrity mechanisms, in a secure, efficient and transparent manner to the user. A distributed CFS should also provide secure remote access and file sharing among multiple users.
Various encryption services offered by a CFS can be placed at the device layer or at the file system level. Encryption and decryption occur at the device layer in systems like Loopback CFS (Cryptoloop)1 and the device-mapper crypto target (DMCrypt) [2]. This enciphering occurs in kernel-space at the device layer using an infrastructure named the Linux kernel device mapper, which offers a generic method of creating virtual layers of block devices. Because encryption is done with a single secret key on the whole block device, file sharing among multiple users is restricted. These systems are not suitable for incremental backups, nor useful for mounting by non-privileged users. Besides, they are not accessible remotely over NFS.
A user-space CFS, in contrast, can be mounted by any user, either privileged or non-privileged, keeping the kernel intact. This facility has made it conveniently portable. However, repeated context switching between user-space and kernel-space and data replication make the performance poor, and it degrades drastically when such systems are accessed remotely over NFS. Encryption is done using a single key on the whole directory, which makes file sharing among multiple users infeasible.
Faster ciphers and appropriate file sharing support are used to improve the performance of Extended CFS Unix and extended EncFS [3]. They use a hybrid encryption strategy: each file is encrypted with a key specific to that individual file, and these keys are in turn encrypted with the public keys of the authorized users among whom the files are shared. The pluggable authentication module (PAM) and the GnuPG PKI module2 provide the required support for public key cryptography.
There are many works on multimedia security using intelligent agents [4], cellular automata-based security [5–8], and biometric security [9–12], but all of these works focus on providing security in an external framework using artificial intelligence or cellular automata-based random number generators.
CFS Unix and Extended CFS Unix can be used as remote NFS servers without any need of an additional NFS mount. However, this is not recommended because of the possibilities of various security attacks on the non-encrypted passwords and raw data transmitted through the network. In this paper, these issues are addressed through mutual authentication and session establishment, which make remote access secure. Apart from this, NFS version 3 instead of version 2 is used in Extended CFS Unix for secure write operations, as discussed in Sect. 2.
The rest of the paper is organized as follows. Section 2 demonstrates the secure protocol with the CFS daemon that acts as NFS server. Section 3 presents the secure protocol for Extended CFS Unix in detail. Section 4 discusses the implementation details. Section 5 presents the performance evaluation of the technique. At last, Sect. 6 concludes the work and provides some future research directions.
1 http://tldp.org/HOWTO/Cryptoloop-HOWTO/
2 www.gnupg.org
2 Extended CFS Unix with NFS V.3
NFS version 3 [13] and NFS version 4 [14] are more efficient than NFS version 2. They provide safe asynchronous writes and fewer attribute requests and lookup operations. The improved, more sophisticated operations in version 4 make it better than version 3 on networks with higher latency, while on a LAN, where latency is low, the performance of version 3 is found to be better than that of version 4. Version 4's reduced sensitivity to latency comes at the cost of substantial coding complexity, decoding operations at the client side, and increased error handling complexity. NFS version 2 was used in the implementation of Extended CFS Unix [3] in asynchronous mode. To obtain better performance with safe asynchronous writes, file locking improvements, and the other benefits of version 3, it has been customized to use version 3 remote procedure calls (RPCs). The security issues associated with its remote usage, especially the transmission of plaintext passwords and non-encrypted data over the network, are resolved using the secure protocol presented in Sect. 3.
3 Secure Protocol for Extended CFS Unix
Access of CFS Unix, or Extended CFS Unix [3] over insecure NFS leads to possibilities of different attacks mentioned in Sect. 1. In this section, a secure protocol is described which would enable the secure access of Extended CFS Unix files over NFS. Figure 1 shows the proposed model for accessing files over NFS.
Fig. 1. Model for accessing CFS files over NFS
Figure 2 shows the architecture of Extended CFS Unix combined with the CFS daemon to act as an NFS server placed at a remote location. Public key infrastructure (PKI) incorporation with the GnuPG PKI module and PAM is depicted through the private key store (PKS) and the PKS daemon installed
Fig. 2. Architecture of extended CFS Unix acting as remote NFS server with CFS daemon
on the user workstation. The login passphrase of a user is captured and stored in the session keyring by PAM. Public–private key pairs for all the system users are stored in the GnuPG keyring. The login passphrase is used by the GnuPG PKI module both to fetch the private key of the user saved in the GnuPG keyring and to perform decryption. This actually gives both the public and private key pair to the CFS daemon as soon as the user enters the system after credential authentication. To safeguard against different attacks, the user workstation and file server need mutual authentication and session key establishment for encryption before the commencement of any file operation. The session key is generated using uniform contributions of key components from either side. The detailed steps of the secure protocol are as follows:
1. When the CFS file server is mounted by a user remotely, a message consisting of a nonce (n1) and the server's session key part (k_fs) is constructed by the CFS daemon; the message is encrypted with the public key of the user and then transmitted to the PKS daemon using netlink sockets.
FS → WS : E^p_user(n1 || k_fs)   (1)
2. The private key of the user is used for decryption of the message. The decryption is done by the PKS daemon, which constructs a message consisting of the
received nonce (n1), another nonce (n2), and its session key part (k_ws). It then encrypts the message with the public key of the file server and forwards it to the CFS daemon.
WS → FS : E^p_FS(n1 || n2 || k_ws)   (2)
3. The message is then decrypted by the CFS daemon using its own private key. It verifies the nonce n1; on successful verification the user is authenticated, otherwise the protocol is aborted. The CFS daemon then produces a unique, random credential (user_credential) for this specific user session, and sends the user_credential and the workstation nonce (n2), encrypted with the public key of the user, to the PKS daemon.
FS → WS : E^p_user(user_credential || n2)   (3)
The session key Ks is now computed by the CFS daemon using the key ingredients (k_fs and k_ws). Then, it stores the state information of the particular session, consisting of user_credential, Ks, user id and workstation location, in a table for future reference.
4. The message is decrypted by the PKS daemon. It verifies the nonce n2 and, upon success, the file server becomes authenticated to the user; otherwise, the protocol is aborted. After that, the session key (Ks) is computed by the PKS daemon using the key ingredients (k_fs and k_ws), and it then passes user_credential and the session key (Ks) to the kernel of the workstation. Now, the user can start file operations through various system calls like open, read, and write. When the user performs an open operation on a file, the NFS client situated in the kernel of the workstation transmits the user_credential and the other essential parameters to the CFS daemon, encrypted with Ks. The message is decrypted by the CFS daemon using the session key, and then credential verification happens. If it is not verified, the request is not processed further by the file server. It also verifies the source against the lookup table holding the state information. Then, the CFS daemon sends the token, encrypted with the session key, to the PKS daemon after obtaining the workstation location from the table. This session key (Ks) is also used for secure communication by successive read and write operations.
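To make the message flow of Eqs. (1)–(3) concrete, the sketch below walks through the four steps using RSA-OAEP from the Python cryptography package. It is a toy, single-process illustration rather than the paper's implementation: the key pairs are generated in place instead of coming from the GnuPG keyring, the netlink transport is omitted, and deriving Ks as a SHA-256 hash of the two key halves is an assumption, since the paper does not state how the contributions are combined.

# Toy walk-through of the mutual-authentication protocol (Eqs. 1-3); not the
# paper's code. Key handling, transport and Ks derivation are assumptions.
import hashlib
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

user_priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
fs_priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
user_pub, fs_pub = user_priv.public_key(), fs_priv.public_key()

# Step 1: CFS daemon -> PKS daemon: E_user(n1 || k_fs)
n1, k_fs = os.urandom(16), os.urandom(16)
msg1 = user_pub.encrypt(n1 + k_fs, OAEP)

# Step 2: PKS daemon decrypts and replies: E_FS(n1 || n2 || k_ws)
plain1 = user_priv.decrypt(msg1, OAEP)
recv_n1, recv_kfs = plain1[:16], plain1[16:]
n2, k_ws = os.urandom(16), os.urandom(16)
msg2 = fs_pub.encrypt(recv_n1 + n2 + k_ws, OAEP)

# Step 3: CFS daemon verifies n1 (user authenticated), issues a per-session
# credential and returns: E_user(credential || n2)
plain2 = fs_priv.decrypt(msg2, OAEP)
assert plain2[:16] == n1, "nonce mismatch: abort protocol"
fs_n2, fs_kws = plain2[16:32], plain2[32:]
credential = os.urandom(16)
msg3 = user_pub.encrypt(credential + fs_n2, OAEP)

# Step 4: PKS daemon verifies n2 (file server authenticated); both ends derive
# the session key from the two halves contributed in steps 1 and 2.
plain3 = user_priv.decrypt(msg3, OAEP)
assert plain3[16:] == n2, "nonce mismatch: abort protocol"
Ks_workstation = hashlib.sha256(recv_kfs + k_ws).digest()
Ks_fileserver = hashlib.sha256(k_fs + fs_kws).digest()
assert Ks_workstation == Ks_fileserver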
4 Secure Protocol Implementation
Implementation needs strategic modifications in the PKS daemon, the NFS client on the user workstation, and the CFS daemon.
4.1 CFS Daemon Changes
CFS daemon is expanded to produce message as described before and then sends these to PKS daemon situated on workstation of user. CFS daemon saves
the required user details, unique credentials, and the session keys set up as a component of the secure protocol in a data structure that is shared between the two parties. It obtains the location of the workstation from that data structure in order to transmit the token to the PKS daemon. CFS Unix contains the file cfs_nfs.c, which is responsible for the modified NFS server functions, the access control policies and the RPC handlers. These files are reworked to adopt version 3 RPCs in place of version 2 RPCs. To work with NFS protocol version 3, the open-source nfs_prot.x (v 1.8) and mount.x (v 1.7) files are used in place of the nfs_prot.x (v 1.2) and mount.x (v 1.2) files used in CFS Unix and Extended CFS Unix.
4.2 PKS Daemon Changes
The PKS daemon needs to share the session key and the credentials of the user with the kernel of the workstation so that they can be assigned to processes. This is done using an installable device driver; the PKS daemon communicates with the kernel through the ioctl system call.
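For illustration only, the snippet below shows one way a user-space daemon can push a fixed-size record into such an installable character-device driver with the ioctl system call from Python. The device node, request code, and record layout are hypothetical and are not taken from the paper's driver.

# Hypothetical example of handing (credential, session key) to a kernel driver
# via ioctl; the request code, device name and struct layout are assumptions.
import fcntl
import struct

PKS_SET_SESSION = 0x40304B01        # made-up ioctl request number

def push_session(credential: bytes, session_key: bytes, dev: str = "/dev/pks0"):
    payload = struct.pack("16s32s", credential, session_key)   # fixed-size record
    with open(dev, "wb", buffering=0) as f:
        fcntl.ioctl(f.fileno(), PKS_SET_SESSION, payload)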
4.3 NFS Client Changes
During any file operation by the NFS client, the workstation kernel sends the credentials of the user and other essential parameters to the authentication module. This change is done by amending the NFS client code in the Linux kernel.
5 Performance Evaluation
In this section, a performance evaluation is presented for Extended CFS Unix acting as a secure remote NFS server (secure CFS) against existing user-space and kernel-space CFSs mounted over NFS. The user-space CFSs chosen for comparison are extended CFS Unix and extended EncFS [3]. CFS Unix and EncFS3 are not considered in the comparison, as they already have a large performance overhead, and their performance reduces further when they are mounted over NFS. The performance of the CFSs and the unencrypted Ext4 file system is evaluated by running IOZone,4 a well-known benchmarking tool able to perform synthetic read/write tests in order to determine system throughput. The server and client systems are 3 GHz Intel Core i3 machines with 2 GB RAM running Linux kernel 2.6.34, connected over a remote Fast Ethernet LAN of 100 MBPS. The iozone utility is run on the various file system mount points with the -a option (auto mode) to obtain the throughput for file sizes ranging from 64 KB to 512 MB:
"#iozone -a -i 0 -i 1 -b /home/output.xls"
where the -i option denotes the write and read tests (1 and 0, respectively), and the -b option signifies the output file used to save the obtained throughput after
3 www.encfs.googlecode.com/files/encfs-1.7.4.tgz
4 www.iozone.org
Fig. 3. Comparative analysis: (a) write overhead in CFS with respect to Ext4; (b) read overhead in CFS with respect to Ext4
successful command execution. The performance of write and read operations is computed from the achieved throughput of the various CFSs mounted over NFS, relative to the non-encrypted Ext4 file system; this is shown in Fig. 3 for write and read operations. The overheads in achieved read and write throughput decrease drastically in the proposed work compared with extended CFS Unix and extended EncFS [3] (the user-space CFSs) mounted over NFS, the reason being that an additional NFS mount is not needed for secure CFS. In the case of large files with a file size of more than 128 MB, secure CFS gives better performance than eCryptfs,
kernel-space CFS and ECFS [15] mounted on NFS. A gain of about 70% for write operation and about 50% for read operation of large files are recorded in secure CFS in comparison with eCryptfs. Secure CFS has achieved about 50% gain for write and 40% gain for read of large files as compared to ECFS.
6 Conclusion and Future Works
Secure CFS is developed and implemented by modifying Extended CFS Unix for trusted and secure use as a remotely placed NFS server, by incorporating the secure protocol and using NFS version 3. The secure protocol prevents various network attacks, whereas NFS version 3 gives improved performance along with safe asynchronous writes. A remarkable performance increase is observed in secure CFS when compared with other implementations of user-space CFS mounted on NFS. The performance of secure CFS is also far better than that of kernel-space CFS mounted over NFS for very large file sizes. In future, hardware devices like USB-connected disks or smart cards could be utilized in the proposed technique for saving the private key of a user, using openCryptoki PKCS#11 public key infrastructure5 support.
5 www.sourceforge.net/projects/opencryptoki
References
1. R. Teigao, C. Maziero, A. Santin, Applying a usage control model in an operating system kernel. J. Netw. Comput. Appl. 34(4), 1342–1352 (2011)
2. M. Broz, dm-crypt: Linux kernel device-mapper crypto target (2015)
3. U. Rawat, S. Kumar, Distributed encrypting file system for linux in user-space. Int. J. Comput. Netw. Inf. Sec. 4(8), 33 (2012)
4. N. Dey, V. Santhi, Intelligent Techniques in Signal Processing for Multimedia Security (Springer, 2017)
5. S. Nandi, S. Roy, S. Nath, S. Chakraborty, W.B.A. Karaa, N. Dey, 1-d group cellular automata based image encryption technique, in 2014 International Conference on Control, Instrumentation, Communication and Computational Technologies (ICCICCT) (IEEE, 2014), pp. 521–526
6. S. Nandi, S. Roy, J. Dansana, W.B.A. Karaa, R. Ray, S.R. Chowdhury, S. Chakraborty, N. Dey, Cellular automata based encrypted ECG-hash code generation: an application in inter human biometric authentication system. Int. J. Comput. Netw. Inf. Sec. 6(11), 1 (2014)
7. S. Roy, J. Karjee, U. Rawat, N. Dey et al., Symmetric key encryption technique: a cellular automata based approach in wireless sensor networks. Proc. Comput. Sci. 78, 408–414 (2016)
8. S. Roy, U. Rawat, J. Karjee, A lightweight cellular automata based encryption technique for iot applications. IEEE Access 7, 39782–39793 (2019)
9. S. Acharjee, S. Chakraborty, S. Samanta, A.T. Azar, A.E. Hassanien, N. Dey, Highly secured multilayered motion vector watermarking, in International Conference on Advanced Machine Learning Technologies and Applications (Springer, 2014), pp. 121–134
10. A.S. Ashour, N. Dey, Security of multimedia contents: a brief, in Intelligent Techniques in Signal Processing for Multimedia Security (Springer, 2017), pp. 3–14
11. S. Hore, T. Bhattacharya, N. Dey, A.E. Hassanien, A. Banerjee, S.B. Chaudhuri, A real time dactylology based feature extraction for selective image encryption and artificial neural network, in Image Feature Detectors and Descriptors (Springer, 2016), pp. 203–226
12. P. Rajeswari, S.V. Raju, A.S. Ashour, N. Dey, Multi-fingerprint unimodel-based biometric authentication supporting cloud computing, in Intelligent Techniques in Signal Processing for Multimedia Security (Springer, 2017), pp. 469–485
13. B. Callaghan, B. Pawlowski, P. Staubach, NFS version 3 protocol specification. Tech. rep., RFC 1813, Network Working Group (1995)
14. S. Shepler, B. Callaghan, D. Robinson, R. Thurlow, C. Beame, M. Eisler, D. Noveck, NFS version 4 protocol specification. RFC 3010, Network Working Group (2000)
15. U. Rawat, S. Kumar, ECFS: an enterprise-class cryptographic file system for linux. Int. J. Inf. Sec. Priv. (IJISP) 6(2), 53–63 (2012)
Hybrid Feature Selection Method for Predicting the Kidney Disease Membranous Nephropathy K. Padmavathi1(B), A. V. Senthilkumar2, and Amit Dutta3 1 Research Scholar, PG Research and Computer Applications, Hindusthan College of Arts and
Science, Coimbatore, India [email protected] 2 Director, PG Research and Computer Applications, Hindusthan College of Arts and Science, Coimbatore, India [email protected] 3 Deputy Director, AICTE, New Delhi, India [email protected]
Abstract. Membranous nephropathy (MN) is the most common cause of nephrotic syndrome in the adult population and is a chronic disease affecting the glomerulus. In the health sector, disease prediction has driven a great evolution that gave birth to new computer technologies. This evolution prompted researchers to use technical innovations such as big data, predictive analytics, and deep learning algorithms to extract useful information and facilitate decision making. Diagnosis of medical data is a challenging task that should be performed accurately and efficiently. The main goal of this work is to forecast the nephropathy using the best classification techniques and methods. Optimization-based feature selection using particle swarm optimization (PSO) and teaching–learning-based optimization (TLBO) is employed. Feature selection is the choice of a minimal feature subset of the original feature set that retains the detection accuracy of the original set. The disease is typically characterized by very high levels of protein in the urine, edema, hypoalbuminemia, and elevated serum lipids. In the feature selection stage, the teaching–learning-based optimization (TLBO) method is used to improve the prediction accuracy. The bagging classification technique is combined with the support vector machine to predict the disease.
Keywords: Membranous nephropathy (MN) · PSO · TLBO · Bagging method · SVM
1 Introduction
Membranous nephropathy, a disorder characterized by the accumulation of immune deposits on the outer aspect of the glomerular capillary basement membrane, is the most common cause of nephrotic syndrome in adults. One of the most frequent causes of nephrotic syndrome in adults is membranous glomerulonephritis (MGN); the disease is characterized by thickening of the glomerular basement membrane in the renal tissue.
In patients with primary MN, once this filtration barrier is disabled, it results in a vast loss of proteins in the urine (proteinuria), and it is termed glomerulonephritis. Membranous nephropathy is the result of immune complexes deposited on the glomerular capillary wall. Immune complexes occur when an antibody attaches to an antigen [1]. The immune disease is characterized by thickening of the glomerular basement membrane within the renal tissues, which causes the production of antibodies [2, 3]. Other conditions such as cancer, viral hepatitis, and hepatitis C can cause secondary MN. MN can recur after kidney transplantation, causing albuminuria, transplant dysfunction, and graft failure. Studies of the clinical presentation and outcomes of membranous nephropathy show that a massive quantity of data is available for a large range of patients [4]. The presence of anti-PLA2R antibodies was not assessed there; the PLA2R antibodies were analyzed through the described strategy [5]. In this age of advancing technology, data mining plays a vital part in prediction, providing valuable information to healthcare divisions. We have analyzed two classifiers, the voting and bagging methods, for the prediction of this kidney disease [6]. In this research work, it is preferred to analyze the performance and accuracy of the hybrid-based method. The feature selection process refers to reducing the inputs to a manageable set and finding the most meaningful inputs. Different types of feature selection algorithms are filter-based, wrapper-based, and hybrid algorithms [7, 8]. In this proposed work, we use hybrid algorithms to improve accuracy and performance.
2 Existing Method
PLA2R antibody monitoring is used to assist with the diagnosis and prognosis of the disease. The prediction of the illness was done on the basis of the significance level; the principle is to observe the PLA2R antibodies. The log-rank statistical approach is used to calculate the significance value of 0.05, which is taken in decimal form [9, 10]. Data mining plays a key part in the health sector, and the classification method is used to predict the disease. Bagging, also called the bootstrap aggregating algorithm, estimates a quantity from bootstrap samples and improves precision by reducing variance and overcoming the overfitting issue. The meta-classifier algorithms are applied on the dataset to identify the best classifier for the prediction of kidney illness for a patient. The dataset is stored in the Attribute-Relation File Format (.ARFF) because the data type of the attributes must be declared [11]. In this classification methodology, bagging and voting are considered. The WEKA toolkit 3.8.2 is used for generating the classification techniques and forecasting the result. The main purpose of using the classification techniques is to boost the precision and accuracy so that the prediction of the illness can be done. The bagging method performs better than the voting method in accuracy. The UCI repository kidney dataset is considered. The attributes comprise 12 numerical and 13 nominal values; in total, 25 attributes and 400 instances are considered to improve the performance level of the classification techniques [12].
3 Proposed Method
The feature selection methodology is chosen to create relevant features for building the classification model. Feature selection is one of the most vital techniques in data preprocessing for data mining [13]. In the proposed methodology, the feature selection process is employed to pick the most relevant information, reduce the features, and retain the closest extracted information so that the prediction of the illness will be highly accurate. The proposed work is carried out in MATLAB.
1. Optimization-based feature selection using the particle swarm optimization (PSO) and teaching–learning-based optimization (TLBO) methods is used.
2. We apply a modified TLBO-based feature selection which picks the most relevant information, reduces the features, and recognizes the closest extracted information so that the prediction measures are high.
A. Performance of the proposed system
The dataset includes four hundred samples and twenty-five attributes. The various attributes are given in Table 1. The attributes of the kidney disorder are taken from the UCI repository dataset.
Table 1. Attributes of the kidney disorder
1. Specific gravity            13. Pus cell clumps
2. Albumin                     14. Age
3. Sugar                       15. Blood
4. Red blood cells             16. Blood glucose random
5. Pus cell                    17. Blood urea
6. Bacteria                    18. Serum creatinine
7. Hypertension                19. Sodium
8. Diabetes mellitus           20. Potassium
9. Coronary artery disease     21. Hemoglobin
10. Appetite                   22. Packed cell volume
11. Pedal edema                23. WBC
12. Anemia                     24. RBC
Table 1 specifies the various characteristic features of the kidney disease. The attributes are very important in the prediction of the disease.
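As an illustration of the optimization-based feature selection described in this section, a minimal binary-PSO wrapper is sketched below; TLBO follows the same wrapper pattern with a teacher/learner update instead of the velocity rule. The fitness of a bit mask is taken as the cross-validated accuracy of an SVM on the selected attributes. Swarm size, iteration count, and coefficients are illustrative choices, and this is not the MATLAB code used in the paper.

# Minimal binary-PSO feature-selection wrapper (illustrative sketch).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def pso_feature_selection(X, y, n_particles=20, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    vel = rng.normal(0.0, 1.0, (n_particles, n_feat))
    pos = (rng.random((n_particles, n_feat)) > 0.5).astype(int)   # 1 = attribute kept

    def fitness(mask):
        if mask.sum() == 0:
            return 0.0
        clf = SVC(kernel="rbf", gamma="scale")
        return cross_val_score(clf, X[:, mask.astype(bool)], y, cv=5).mean()

    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()

    for _ in range(n_iter):
        r1 = rng.random((n_particles, n_feat))
        r2 = rng.random((n_particles, n_feat))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        prob = 1.0 / (1.0 + np.exp(-vel))                          # sigmoid transfer
        pos = (rng.random((n_particles, n_feat)) < prob).astype(int)
        fit = np.array([fitness(p) for p in pos])
        better = fit > pbest_fit
        pbest[better] = pos[better]
        pbest_fit[better] = fit[better]
        gbest = pbest[pbest_fit.argmax()].copy()

    return gbest.astype(bool)          # boolean mask over the 24 attributes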
4 Results
In the first step, optimization-based feature selection is performed using particle swarm optimization (PSO) and teaching–learning-based optimization (TLBO). The optimization-based feature selection process is done in MATLAB. Then the bagging algorithm is combined with the SVM method to form the hybrid-based algorithm HBASVM.
INPUT: The values of the chronic kidney dataset
Precondition: The input document is a text document
OUTPUT: Accuracy of the hybrid algorithm using the feature selection process
Reduced features (F1, F2, …, Fn); selected features F(x) = X(s1, s2, …, sn)
Begin
1) Initialize k solutions
2) Call the SVM algorithm to evaluate the k solutions
3) Initialize T = Sort(S1, …, Sk)
4) while classification accuracy ≠ 100% or number of iterations ≠ 10 do
5)   for i = 1 to m, check the condition of the incremental value i
6)   else select S according to its weight
7)   calculate T = Best(Sort(S1, …, Sk+m), k)
8) End
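The hybrid classifier evaluated in the tables below can be reproduced in outline with scikit-learn: bagging over SVM base learners, trained on the attributes retained by the feature-selection stage. The CSV file name, the assumption that the UCI records have already been cleaned and numerically encoded, and the reuse of the pso_feature_selection helper from the earlier sketch are illustrative choices rather than the paper's MATLAB setup.

# Outline of the hybrid bagging + SVM classifier on the selected attributes.
# Assumes a numerically encoded, imputed copy of the UCI CKD data in CSV form.
import pandas as pd
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

data = pd.read_csv("chronic_kidney_disease_clean.csv")   # assumed preprocessed file
X = data.drop(columns=["class"]).to_numpy()
y = data["class"].to_numpy()

mask = pso_feature_selection(X, y)                       # helper from the earlier sketch
X_tr, X_te, y_tr, y_te = train_test_split(X[:, mask], y, test_size=0.25, random_state=42)

hybrid = BaggingClassifier(SVC(kernel="rbf", gamma="scale"),
                           n_estimators=10, random_state=42)
hybrid.fit(X_tr, y_tr)
pred = hybrid.predict(X_te)
print("Accuracy:", accuracy_score(y_te, pred))
print(classification_report(y_te, pred))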
Table 2 describes the performance of the bagging method by using particle-based optimization. Then the SVM and bagging are combined to form a hybrid-based selection process to show their performance in accuracy and time. Based on the time factor, accuracy is calculated.
Table 2. Accuracy level of the hybrid-based algorithm without a feature selection process
Algorithm        Accuracy  Precision  Recall  F-measure  Time
PSO-BAGGING      90        87         86      93         4.3
MTLBO_SVM        94        88         92      94         3.9
MTLBO_HBASVM     95        91         95      95         2.7
Table 3 describes the performance of the bagging method by using particle-based optimization. Then the SVM and bagging algorithms are combined to form a hybrid-based selection process to show their performance in accuracy and time. Based on the time factor, accuracy is calculated (Fig. 1).
Table 3. Accuracy level of the hybrid-based algorithm with feature selection process
Algorithm        Accuracy  Precision  Recall  F-measure  Time
PSO-BAGGING      94        89         93      92         3.8
MTLBO_SVM        97        92         95      94         3.1
MTLBO_HBASVM     98.5      93         96      95         2.4
Fig. 1. Time period without feature selection process
5 Performance Chart
Performance of the hybrid-based algorithm without the feature selection process: the performance of the HBASVM methodology using the feature selection method is reported based on time (Fig. 2). The performance of the hybrid-based algorithm with the feature selection process is also reported.
6 Conclusion
Membranous nephropathy is a serious condition that can produce life-threatening injury when not treated properly. This analysis work addresses membranous nephropathy disease prediction with a data mining classification technique in conjunction with the feature selection process. The bagging method is combined with the SVM technique to form a hybrid method. It is evident from the results that bagging outperforms the plain support vector method, showing a large improvement in classification accuracy based on the time factors. Based on the simulations, the projected future work will attempt to target
Fig. 2. Performance of the HBASVM methodology using the feature selection method, based on the accuracy of the algorithm
Fig. 3. Performance of the HBASVM methodology using the feature selection method
Fig. 4. Time period with feature selection process
the survival analysis interpretation of the samples. The problematic survival analysis is
exemplified with the significance of the PLA2R antibody statement about the remedy of the illness.
References
1. A survey of feature selection and feature extraction techniques in machine learning. Available from https://ieeexplore.ieee.org/document/6918213. Accessed on 09 Oct 2014
2. A.S. De Vriese, R.J. Glassock, K.A. Nath, S. Sethi, F.C. Fervenza, A proposal for a serology-based approach to membranous nephropathy (2016). www.jasn.org
3. A.Q.B. da Silva, T.V. de Sandes-Freitas, J.B. Mansur, J.O. Medina-Pestana, G. Mastroianni-Kirsztajn, Clinical presentation, outcomes, and treatment of membranous nephropathy after transplantation. Int. J. Nephrol. (2018)
4. A.S. Bomback, C. Fervenza, Membranous nephropathy: approaches to treatment. Am. J. (2018)
5. T.S. Dabade, Recurrent idiopathic membranous nephropathy after kidney transplantation: a surveillance biopsy study. Am. J. Transp. (2008)
6. S. Murphy, Division of Nephrology and Hypertension, NC Kidney Center, Sept 2018
7. N. Krishnaveni, V. Radha, Feature selection algorithms for data mining classification: a survey
8. UCI Machine Learning Repository: Chronic_Kidney_Disease Dataset (n.d.). https://archive.ics.uci.edu/ml/datasets/Chronic_Kidney_Disease
9. A.V. Senthilkumar, K. Padmavathi, A proposed method for prediction of membranous nephropathy disease. Int. J. Innov. Technol. Expl. Eng. 8(10) (2019). ISSN: 22783075
10. A. Taherkhani, S. Kalantari, A. Arefi Oskouie, M. Nafar, M. Taghizadeh, K. Tabar, Network analysis of membranous glomerulonephritis based on metabolomics data. Mol. Med. Rep. 18(5), 4197–4212 (2018). https://doi.org/10.3892/mmr.2018.9477. Epub 2018 Sep 12
11. J. Kittler, M. Hatef, R.P.W. Duin, J. Matas, On combining classifiers. IEEE Trans. Pattern Anal. Mach. Intell. 20(3), 226–239 (1998)
12. https://en.wikipedia.org/wiki/P-value
13. P. Ghamisi, J.A. Benediktsson, Feature selection based on the hybridization of genetic algorithm and particle swarm optimization. IEEE Geosci. Remote Sens. Lett. 12(2), 309–313 (2015). https://doi.org/10.1109/LGRS.2014.2337320
14. S. Rakhlin, Bagging and boosting, 9.520 Class 10, 13 March 2006
Feasibility of Adoption of Blockchain Technology in Banking and Financial Sector of India Anuja Agarwal(B)
, Mahendra Parihar , and Tanvi Shah
Mukesh Patel School of Technology Management & Engineering, NMIMS University, Mumbai, India {anuja.agarwal,mahendra.parihar}@nmims.edu, [email protected]
Abstract. The purpose of this research paper is to analyze the feasibility of adoption of blockchain technology as a viable, transparent, and traceable solution for storing data and transactions in firms belonging to the banking and financial sector of India. This study examines the role of four key variables, namely technical knowledge, availability of resources, collaborations among firms, and usability; for successful adoption of the technology in financial functions. This research is based on the analysis of data acquired from these factors which influence the feasibility of blockchain technology as per the in-depth study of 30 firms belonging to the banking and financial sector of India. This study investigates the current penetration of technical knowledge, research, and development of use cases with available resources, preparedness level of firms for collaborations, and the challenges faced in the usability of implementing blockchain technology for financial applications. Keywords: Blockchain technology · Feasibility of blockchain · Blockchain in Indian banking and financial sector · Blockchain applications · Factors influencing blockchain
1 Introduction to Research
In this era of artificial intelligence and machine learning, the authenticity and security of data have become extremely critical; otherwise, there is a fatal chance of building analytical models on incorrect foundations. For the banking and financial sector of India, the numeric data is too sensitive to be handed over to any one firm to perform its algorithmic processes, as it runs the risk of data theft, corruption, or misuse. The data stored on blockchain is traceable, which ensures its integrity, transparency, and immutability; this makes artificial intelligence more coherent and understandable, enabling one to trace the reason as to why decisions are made in a machine learning environment [1]. India has a diversified growth of existing banking and financial services firms as well as several new financial entities entering the market. Blockchain is a form of distributed
ledger that keeps digital record of transactions and tracks the flow of payments by providing a secured channel of communication among a broad network of users. It can enhance interoperability of the current banking and financial system; setting the foundation for transferring financial services on an artificial intelligence platform in the near future.
1.1 Research Statement and Research Objectives
The study revolves around understanding the hurdles and challenges faced in the adoption of blockchain technology for the banking and financial sector of India. It aims to convey the economic, financial, and technical readiness of firms for implementing the technology in financial functions. The purpose can be proved by collecting post-performance data on financial applications of blockchain technology that eventually reveal its possibilities of producing better models in future. The primary research objectives of the evaluation are based on four key variables:
(i) To understand the penetration of technical knowledge in organizations pertaining to blockchain technology.
(ii) To identify the steps taken in research and development with current available resources in organizations that have been committed to make the technology work.
(iii) To analyze the preparedness level of firms for collaborations in order to be a part of the consortium to make blockchain technology work efficiently.
(iv) To scan the use of technology through various stakeholders in value chain process (Fig. 1).
Fig. 1. Conceptual framework of research (created by researchers)
1.2 Research Question and Hypothesis Statement
Research Question: What factors from the environment influence viability of blockchain technology in the banking and financial sector of India?
Hypothesis Statement:
H0A: Having knowledge about blockchain technology will not influence its adoption in the banking and financial institutions of India.
H1A: Knowledge about blockchain technology will play a crucial role in its adoption in the banking and financial institutions of India. Increased awareness will pull firms to deploy this technology for business advantages.
H0B: Investing in resources for research and development of blockchain technology will not influence its adoption in the banking and financial institutions of India.
H1B: Investing in resources for research and development of blockchain technology will play a vital role in its adoption in the banking and financial institutions of India.
H0C: Collaboration among firms to form a consortium will have no relevance with the adoption of blockchain technology in the banking and financial institutions of India.
H1C: Collaboration among firms to form a consortium will be critical in the successful adoption of the technology in banking and financial institutions of India.
H0D: Usability of blockchain technology has no significant role in its adoption in the banking and financial institutions of India.
H1D: Usability of blockchain technology plays a significant role in its adoption in the banking and financial institutions of India.
2 Literature Review 2.1 Banking and Financial Sector of India—Challenges and Opportunities For the past three decades, financial firms have been dramatically transforming the economic development of trade, commerce, and modern business in India. They continue to face concurrent challenges with rising customer expectations, diminishing consumer loyalty, and shrinking margins [2]. Former technology-intensive delivery channels have shown immense potential for market expansion [3]. Nevertheless, it calls for deep issues of alignment with global developments, adapting to new capital norms, asset quality issues, etc., demanding a high level of sophistication in risk management, information systems, and technology [4]. This is when an innovative solution comes live to acquire appropriate cost-effectiveness and interoperability in the current banking and financial system of India, setting the foundation for transferring financial services on an artificial intelligence platform in the near future. 2.2 Introduction to Blockchain Technology—Potential and Feasibility Blockchain is an incorruptible digital ledger created by blocks that contain transaction details connected in a chronological order to form a series of chain. It follows a decentralized authentication approach for clearing and settlement of transactions while covering Internet of things [5]. Leading incumbent banks are advancing through various stages of development, testing, and deployment of this technology solution to facilitate applications of remittance management, smart contracts, digitization of physical assets, etc. [6]. Employing blockchain technology in financial functions will suppress several manual ineffective processes [7]; introducing a higher technical and efficient database system [8] by increasing financial inclusion of the business model. This paper studies the implications of use cases of blockchain technology in the banking and financial organizations of India in 2019 to draw a vision of near future. Compulsive commercial processes of trade and remittance involve several partners to undergo large amount of paperwork and follow multiple compliance procedures repeatedly [9]. Blockchain abolishes third-party presence for verifying transactions which
enable real-time transfer and a consistent track of all data to eliminate frauds corresponding to lorry receipts minimize paperwork and shorten the trade period [10]. Enforcing blockchain solution in collaboration with banks across multiple branches on various geographies; starting with the execution of interoperable domestic trade; and later floating to an international platform can mark its scalability [11] (Fig. 2).
Fig. 2. Benefits of adopting blockchain technology (Source Euroclear and Oliver Wyman)
Leveraging the smart contract technology can revolutionize the entire lifecycle of a loan; from origin to settlement in secondary trading and later to market adoption [12]. The process demands participation from key agents; drawing clear benefits to loan investors, borrowers, and asset managers by authorizing them a direct access to the system of loan data. This vastly reduces the pace of loan market settlement time by lowering manual reviews, and systems reconciliation. It enables live transaction visibility delivering transparency and increased efficiency from current levels. Blockchain-based solutions open numerous possibilities for insurers to collaborate effectively by eliminating the dependency on data intermediaries. This reduces operating costs avoiding duplication of procedures and streamline approvals. Immutable and decentralized data with easy track of real-time records provides assistance to make better-informed decisions creating greater trust and accountability. It enables insurers to securely access customer policy details of due diligence and medical underwriting preventing documents and false billings from falling through the cracks [13]. 2.3 Scope of Research and Research Gaps Summarizing all the collected facts, the researchers realize that blockchain demands sharing of guarantee and trust to seamlessly integrate the core pain point solutions. The scope of this research is limited to understand the role of four key variables—knowledge, resources, collaboration, and usability—that contribute to the successful adoption of blockchain technology in the banking and financial sector of India.
i. The first gap that arises in the adoption of blockchain process as a whole is the issue of its acceptance followed by execution in collaboration. The existing use cases which are available are handful; closely looped within the organizations [14]. Greater part of studies have been restricting to the emergence of technology as a concept; calling for an urge to trace its adoption on a technology adoption lifecycle curve that can convey the economic, financial, social, and technical readiness of firms for validating blockchain technology in their banking and financial functions [15].
ii. The second gap is lack of research on usability and requires studying post-adoption data of the technology to reveal its possibilities of producing better models in future.
iii. The third gap poses critical questions that depict lack of trust on the decentralized nature of payment systems being built on the technology for financial functions.
iv. The fourth gap is found in low number of publications in journals. Majority of research is published in conferences; there is a predominant need for high-quality journals to convey results of implementing blockchain for financial applications [16].
3 Research Design The study is an exploratory research work which integrates the use of primary as well as secondary data. Heterogeneous purposive sampling was considered since the target respondents were mainly developers, technologists, and financial practitioners who could provide maximum insight of the phenomenon under examination. Primary data was collected by circulating a questionnaire through e-mails, forms, and LinkedIn and conducting semi-structured interviews with a couple of experts. Subsequently, all outliers were removed to derive a stipulated sample size of 30 respondents. Secondary data was collected from existing research articles, journals, global survey reports, etc. The survey questionnaire was designed in order of individual classification data followed by questions centered on each key variable [17–20]. A few reverse-coded questions were formulated to reduce the threat of biases and test validity of responses that were recorded on Likert scale, ordinal scale, ratio scale, or multiple choice bases. The responses are interpreted using narrative technique of qualitative data analysis.
4 Analysis and Results The researchers aimed to understand, (i) the influence of technical knowledge with respect to financial applications of blockchain technology, (ii) the influence of investment in resources for research and development of blockchain applications in financial institutions, (iii) the relevance of preparedness level of firms for entering into collaborations with other financial firms through blockchain platform to enhance interoperability in the current banking and financial system, and (iv) the relevance of simplicity of use of blockchain technology for its adoption in financial functions.
4.1 Knowledge of Blockchain Technology
The respondents were categorized according to their industry division: 16 belonged to banking institutions and 14 to non-banking financial services. The penetration of knowledge on blockchain technology and its relevance in financial functions proved to be correlational in accordance with the hierarchical structure of the organizations.
(i) 6 - top management, implied working with technology, among strategic priorities
(ii) 17 - middle management, learning stage, technology relevant but not a priority
(iii) 7 - first line management, not familiar with technology and its financial functions.
Respondents acquired knowledge particularly from conferences, newspapers, research journals, and global survey reports. The organizations subscribed to nearly 8 articles/reports on a monthly basis alongside conducting 5–6 seminars annually to create awareness among employees about such emerging technologies. The financial services that required maximum transparency were ranked in order of priority by the respondents on an ordinal scale of 1–5, with 1 being highest and 5 being lowest. About 51% of the respondents assigned 1st priority to trade finance, 2nd priority to smart contracts, 3rd priority to syndicated loans, 4th priority to mutual funds, and 5th priority to insurance. This proved to be true since 14 firms had developed use cases in trade finance and syndicated loans because of the desire for less price volatility there. Apparently, know your customer and data validation were financial functions likely to be made redundant by the adoption of blockchain technology.
4.2 Investment in Resources
In this section, questions were centered on the approximate investments made by financial firms for implementing blockchain technology in their business functions. A few global giants, being the pioneers, had invested hugely, with amounts summing up to 10 crores, in developing use cases, raising awareness among their employees as well as other firms and creating consortiums to collaborate with other organizations. Certain renowned banks had put in about 1 crore for research and development of use cases; concurrently, emerging banks unveiled massive interest in these technological advancements, aggregating investment funds of nearly 50 lakhs. Addressing the top non-banking financial institutions, moderate investments ranging between 30 and 50 lakhs have been made. The remaining followed a wait-and-watch policy, did not allot any fraction of their budget for the development of this technology and perhaps declined offers for collaborations from the innovators. Adoption of blockchain technology proved to be relevant for almost all financial institutions; 10 firms facilitated in-house development of this platform while 17 firms outsourced the service from technology providers like Infosys and Ripple.
4.3 Collaboration and Consortium Among Banking and Financial Firms
In this section, the questions centered on what policies should be regulated by the government in order to bind all financial organizations on a standard platform. Only 33%
of respondents suggested regulations:
(i) The National Payments Corporation of India should act as a regulator to build trust and transparency within consortiums.
(ii) The government should enforce the use of standard currencies as per fixed norms and initiate implementing blockchain for specific financial functions that reduce the risk of failure in collaborations, for instance, trade finance.
(iii) External organizations should be given the freedom to select technology providers as per their requirements.
In discussion with experts, the fact was raised that legal acceptance of blockchain technology and a better understanding of its interface played a vital role for validating it on a massive scale in financial firms. The willingness of organizations to enter into consortiums was recorded on a graphical rating scale of 1–5, with 5 being very high and 1 being very low. Out of 30 organizations, 27% had very low preparedness level; 13% were at a below-average mark; 20% could possibly give a try; 27% had very high preparedness level; and 13% were already engaged in collaboration activities.
4.4 Usability of Blockchain Technology
Blockchain becomes usable when there is interoperability right from the base to the end-user interface. Respondents faced difficulties to integrate with organizations; they identified low level of acceptability and lack of willingness among financiers to invest in blockchain technology without a proven use case. This became evident since 14 firms had not entered into any collaborations, whereas 16 firms were equally proportioned having collaborated with less than 35 firms and greater than 110 firms, respectively. Blockchain is still at an early reconciliation phase; few experts are developing use cases to provide the industry with a comprehensive research that assists them to divert firms on to this technology for their business advantages. The respondents put forth an estimation about the probable growth rate and profits that could be generated post-validation wherein 68% firms envisaged the business growth to go up by at least 1.5 times after adoption of blockchain technology and 53% firms envisaged the operating costs to come down by at least 0.5 times.
5 Major Findings and Implications of Research The study of 30 banking and financial institutions of India provided a clear end result, plotting blockchain technology in the chasm phase of the technology adoption lifecycle curve. The technology has not yet been adopted as mainstream, which is evident from the development of very few use cases and limited investments. The key barriers to adoption are (i) unclear regulatory policies by the government, (ii) unproven technology, and (iii) the complexity of updating past records, whereas the most significant advantages observed by firms are (i) elimination of frauds, (ii) real-time traceability, and (iii) negligible paperwork. The outcome derived from the analysis states that all four variables are significantly important and inter-dependent in testing the feasibility of the adoption of blockchain technology in the banking and financial system of India. Paradoxically, while 100% of firms are aware of the technology and 87% are actively working on it, 27% of them are not willing to join a consortium, which is a necessary condition for the successful
adoption of blockchain technology. Despite the technology being considered among the strategic priorities of top and middle management, no significant correlation was established between the investment made by firms and the use cases they produced implementing blockchain technology in financial functions. Thus, on the grounds of the existing correlation between external factors, the alternate hypothesis established during the initial stage of the research is accepted and the null hypothesis is rejected. It is inferred from the study that the majority of respondents recognized problems similar to those stated by the researchers in their secondary research. Blockchain technology has the potential to revolutionize the current banking and financial system over a span of 3–5 years, giving a competitive advantage to the financiers and practitioners in this field.
6 Limitations of Research Certain limitations need to be accounted for before taking this research as the basis for any further research. Primarily due to time constraints, the sample size was small, at only 30 responses. Most of the respondents were positioned in firms in Mumbai, so they were not geographically diverse; thus, the data collected cannot be extrapolated to the whole sector. The responses varied considerably at times due to differences in individual perception; a few questions pertained to the financial investments of the organizations, where the respondents were not keen on disclosing the true figures and might have deliberately given vague answers.
Interdisciplinary Areas
Machine Learning Techniques for Predicting Crop Production in India Sarthak Agarwal1 and Naina Narang1,2(B) 1 Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur,
Rajasthan 303007, India [email protected], [email protected] 2 Department of ECE, Microwave Imaging and Space Technology Application Laboratory, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand 247667, India
Abstract. The agricultural production analysis is an important study for a country like India where agriculture is one of the leading sectors in economic growth. The data analysis can lead to extraction of future patterns and accurate and precise predictions important for agronomy. This paper intends to design a supervised prediction system that can predict crop production based on previous statistics of the type of crop, area, and productivity. The dataset made available by Open Government Data (OGD) Platform India is used for performing the analysis. Keywords: Crop production · Data mining · Machine learning
1 Introduction Machine learning techniques are finding their way into solving diverse problems. Several major machine learning-based studies have previously been carried out for agricultural data analysis [1–3]. The ultimate intent of these studies is to identify various factors and their effect on production yields, prices, market growth, crop management practices, and quality. Different factors such as soil quality, weather, temperature, population, pollution, and the use of fertilizers are also being studied using machine learning methods [4–7]. Newer studies can also be found in the literature on quantifying the effects of climate change on production yield [8]. However, studies in the Indian context are very few and need the attention of data mining experts. The existing data and estimations are generally based on a single crop or at the state level [9, 10]. Also, a prediction of the overall impact of disruptive changes in yield trends is not available, to the best of our knowledge. In the present paper, preliminary results of production yield prediction are discussed. Regression techniques, namely linear regression (LR), automatic relevance determination (ARD) regression, and Theil–Sen regression, are implemented on the dataset available on the Open Government Data (OGD) Platform India for the years 1997 to 2014. The predicted values for the yields of major crops grown in India during the
time from 2011 to 2014 are presented in the paper and compared with the actual data of these years. The prediction accuracy is calculated and reported for the used techniques. The present paper is organized in the following structure: Sect. 2 deals with the explanation of data preprocessing. The used learning models and algorithms are explained in Sect. 3. Results for the major crops produced in India are discussed in Sect. 4. Finally, the conclusion of the study and future aspects are discussed in Sect. 5.
2 Data Preprocessing The dataset was obtained from the Open Government Data (OGD) Platform India. It contains 246,092 rows and 7 columns: ‘State_Name’, ‘District_Name’, ‘Crop_Year’, ‘Season’, ‘Crop’, ‘Area’, and ‘Production’. The majority of the instances have widely varying values; for example, the cultivated area for wheat ranges from 1 hectare (ha) to 422,000 ha. A total of 3524 values in ‘Production’ are 0, and 3729 null values are present. Due to this variance in the data, Pearson’s correlation graph shows no correlation of the attributes with the ‘Production’ attribute [9]. The data in this form are not amenable to statistical processing, so dimensionality reduction is performed by converting each crop into a separate dataset with columns ‘Year’, ‘Area’, and ‘Production’, in which ‘Year’ has unique values, ‘Area’ is the total area of the crop in that year, and ‘Production’ is the total production of the crop in that year. The complete dataset for a particular crop has not been used: this paper analyzes crops at the nationwide level of production and thus ignores parameters like State and District. The resulting outcome is a dataset based on just the area of a crop and its respective production.
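As an illustration of this preprocessing step, the sketch below aggregates the OGD records into the per-crop Year/Area/Production form described above. It assumes a pandas workflow and a local CSV file name, neither of which is specified in the paper, and it is not the authors' code.

```python
# Illustrative preprocessing sketch (not the authors' code).
import pandas as pd

df = pd.read_csv("crop_production.csv")   # assumed file name for the OGD export

# Handle the zero and null 'Production' entries noted above (one possible choice)
df = df.dropna(subset=["Production"])
df = df[df["Production"] > 0]

# Ignore State_Name/District_Name and aggregate nationwide totals per crop and year
wheat = (df[df["Crop"] == "Wheat"]
         .groupby("Crop_Year", as_index=False)[["Area", "Production"]]
         .sum()
         .rename(columns={"Crop_Year": "Year"}))
print(wheat.head())   # columns: Year, Area, Production
```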
3 Proposed Methodology A number of learning models and algorithms are used for implementing digital agriculture. In this work, we have used the regression model for machine learning. The crop production rate is first calculated using linear regression [10, 11]. The ARD regression [12] and Theil–Sen regression [13–16] are also applied for predicting the production values for subsequent years from 2011. Finally, for the evaluation of the results, the explained variance score is calculated as

$\text{Explained Variance} = 1 - \dfrac{\mathrm{Var}(y - \hat{y})}{\mathrm{Var}(y)}$   (1)

where $y$ is the actual output, $\hat{y}$ is the predicted output, and Var is the variance (the square of the standard deviation).
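A minimal scikit-learn sketch of this evaluation is given below, continuing the preprocessing sketch above. The year-based split (fit on 1997–2010, predict 2011–2014) mirrors the evaluation described in the paper, while the model settings are assumptions.

```python
# Hedged sketch: the three regressors and the explained variance score of Eq. (1).
from sklearn.linear_model import LinearRegression, ARDRegression, TheilSenRegressor
from sklearn.metrics import explained_variance_score

train = wheat[wheat["Year"] <= 2010]          # 'wheat' from the preprocessing sketch
test = wheat[wheat["Year"] >= 2011]
X_train, y_train = train[["Area"]].values, train["Production"].values
X_test, y_test = test[["Area"]].values, test["Production"].values

for model in (LinearRegression(), ARDRegression(), TheilSenRegressor(random_state=0)):
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # explained_variance_score implements 1 - Var(y - y_hat) / Var(y), i.e., Eq. (1)
    print(type(model).__name__, round(explained_variance_score(y_test, y_pred), 3))
```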
4 Results Figure 1 shows the predictions of production for wheat, maize, and Jowar. It is evident from the results that accurate predictions can be made for crop production with a simple
approach of linear regression. However, the linear dependency of crop production may change as issues such as climate change come into play [6, 8]. Here, the predictions are in agreement with the actual values, showing that disruptive changes did not occur in the yield of the three crops during the years 2011 to 2014. It is to be noted that, to boost the correlation between the dependent and independent variables, the most dynamic categorical data must be dropped, as it creates more variability in the data. For instance, the data for Assam are fewer than for Uttar Pradesh, and hence during training the Uttar Pradesh records will prevail over Assam; this impacts the model because Assam then only contributes outlier data. Figure 2 shows the variance score for the used techniques in the prediction of the three crop yields. For each crop, we were able to predict the value with an average accuracy of 71%. The crops that have a low variance score suffer from scarce and highly variable data. Theil–Sen regression gave a low explained variance score for the different crop yield predictions. On the other hand, LR and ARD regression gave appreciable predictions, with the highest explained variance score of 92%.
5 Conclusion An accurate crop production prediction system can help farmers, decision makers, and governing authorities toward a better agricultural approach. In the presented work, different machine learning regression techniques are used to analyze the OGD India dataset, and the predicted production values for major crops of India are discussed. The proposed prediction system uses a minimum number of features, and the LR and ARD regression techniques gave comparable accuracies. In our future work, we will try to study the correlation between the minimum support price and the predicted crop yield values.
Fig. 1. Actual and predicted production data of years from 1997 to 2014 of (a) wheat, (b) maize, and (c) jowar
Fig. 2. Explained variance score in the prediction of (a) wheat, (b) maize, and (c) jowar
References 1. X.E. Pantazi, Wheat yield prediction using machine learning and advanced sensing techniques. Comput. Electron. Agric. 121, 57–65 (2016) 2. S. Wolfert, Big data in smart farming—a review. Agric. Syst. 153, 69–80 (2017) 3. A. Kamilaris, Deep learning in agriculture: a survey. Comput. Electron. Agric. 147, 70–90 (2018) 4. Y. Liu, A comprehensive support vector machine-based classification model for soil quality assessment. Soil Tillage Res. 155, 19–26 (2016) 5. P.G. Oguntunde, Relationship between rice yield and climate variables in southwest Nigeria using multiple linear regression and support vector machine analysis. Int. J. Biometeorol. 62(3), 459–469 (2018) 6. D.K. Ray, Yield trends are insufficient to double global crop production by 2050. PLoS ONE 8(6), e66428 (2013) 7. A.O. Adesemoye, Plant growth-promoting rhizobacteria allow reduced application rates of chemical fertilizers. Microb. Ecol. 58(4), 921–929 (2009) 8. D.B. Lobell, Climate trends and global crop production since 1980. Science 1204531 (2011) 9. N. Gandhi, Rice crop yield prediction using artificial neural networks, in 2016 IEEE Technological Innovations in ICT for Agriculture and Rural Development (TIAR), Chennai (2016), pp. 105–110 10. R.B. Guruprasad, Machine Learning Methodologies for Paddy Yield Estimation in India: a case study, in IGARSS 2019-2019 IEEE International Geoscience and Remote Sensing Symposium. IEEE (2019), pp. 7254–7257 11. D. Sinwar, V.S. Dhaka, M.K. Sharma, G. Rani, AI-based yield prediction and smart irrigation, in Internet of Things and Analytics for Agriculture, Volume 2. Studies in Big Data, vol. 67 (2020), pp. 155–180 12. M.E. Tipping, Probabilistic principal component analysis. J. R. Statistical Soc. Ser. B (Statistical Method.) 61(3), 611–622 (1999) 13. C.E. Rasmussen, Gaussian processes in machine learning. In: Advanced Lectures on Machine Learning (2004), pp. 63–71 14. S. Manoj, A framework for big data analytics as a scalable systems. Int. J. Adv. Network. Appl. (IJANA) (2015), 72–82 15. D.P. Wipf, A new view of automatic relevance determination. In: Advances in Neural Information Processing Systems (Springer, Berlin, 2008), pp. 1625–1632 16. H. Peng, Consistency and asymptotic distribution of the Theil-Sen estimator. J. Statistical Plan. Inference 138(6), 1836–1850 (2008)
Navier–Stokes-Based Image Inpainting for Restoration of Missing Data Due to Clouds Deepti Maduskar(B) and Nitant Dube Space Applications Centre (Indian Space Research Organisation), Ahmedabad, India [email protected], [email protected]
Abstract. One of the challenges in the utilization of optical satellite images is cloud occlusion. The missing data due to clouds lead to issues in image interpretation and its utilization in different applications. For images where sparse clouds are present, single image-based restoration is possible; however, cloud-dominated images require multi-temporal images for removing clouds. The standard image interpolation-based technique for a single image uses only the neighboring pixels' information for restoration. The image restoration method based on image inpainting explores the local information of the entire image and propagates it into the missing gaps. In this paper, the Navier–Stokes-based inpainting algorithm is used for the restoration. The algorithm propagates the smoothness of the image via partial differential equations (PDEs) and at the same time preserves the edges at the boundary/features of the image. It is analyzed on different satellite images covering different types of terrains, and the results show improved statistical characteristics for the restored images. Keywords: Cloud occlusions · Image restoration · Interpolation · Image inpainting · Navier–Stokes · PDEs
1 Introduction
Satellites use electromagnetic radiation reflected or emitted from objects and captured by sensors to form optical images. Satellite images play a very important role in monitoring the current state of the earth, and time series satellite data help in studying long-term trends. Satellite images are used in numerous earth observation applications, the most popular of them being monitoring of natural resources, environmental studies, climate change assessment, natural hazard mitigation, and strategic applications. Missing values in satellite images are due to cloud occlusion, sensor failure/degradation, salt-and-pepper noise, and pixel dropouts due to communication [1]. The missing data in the images can be short, dispersed, or have long gaps. This missing data leads to issues in image
interpretation and its utilization in different applications. Image restoration techniques are used to recover this missing data. Various image restoration techniques may either use information from a single image or multiple images (time series data). There are two broad categories of techniques available for the restoration of these missing values in images as shown in Fig. 1. Images having large gaps use a time series model for restoration, whereas images that have sparse gaps use a single image for filling of missing values.
Fig. 1. Image restoration techniques
Multiple models based on machine learning, neural networks, and data mining exist for the restoration of images using time series data. Greeshma [2] attempted to restore images using kernel principal component analysis (KPCA). A hybrid approach of principal component analysis (PCA) and optical flow-based inpainting was used for the restoration of sea surface temperature [3]. Video inpainting and deep learning-based techniques are also used for restoration [4]. Models that use data assimilation techniques [5, 6] are also used for the restoration of missing data. Time series data acquired over long periods are known to be inconsistent due to seasonality and different illumination conditions, which leads to statistical errors in the restored images. Data assimilation-based techniques depend on model data and require large computational resources; hence, it is difficult to deploy these models for near real-time applications. Due to the requirement of near real-time restoration of satellite images for utilization in applications, a single image-based restoration technique has been considered in this paper. Restoration methods using a single image can be broadly divided into three categories: noise-based models, interpolation models, and image inpainting. Noise-based models use sensor characterization information and are used during
initial data processing for improving product quality. Various traditional interpolation methods such as mean imputation and nearest-neighbor interpolation exist, but they tend to introduce some bias in the data as they consider only the localized nearest neighbors' information [7, 8]. The paper is organized as follows: Sect. 2 contains the literature survey, Sect. 3 contains the description of the Navier–Stokes-based inpainting method, Sect. 4 describes the dataset used and showcases the results, and Sect. 5 concludes and discusses the future scope of this work.
2 Literature Review
Image inpainting refers to the restoration of the missing areas of an image without any prior knowledge about the missing pixels, which makes this technique most suitable for our requirement. Image inpainting can be divided into two classes [9]: the exemplar-based method and the diffusion-based method. Exemplar-based methods tend to restore the texture of the unknown region [10]. The goal is to substitute the unknown region with the best matching region from the known part of the image; the region may be substituted based on pixels or patches. Variants of texture-based methods exist, based on the order of filling or on the best patch matching algorithm. However, pixel-based substitution suffers from high computational costs and forms repetitive patterns which are unnatural. As satellite images have continuous variation, simply copying the best matching region cannot fill the gap accurately. Therefore, a diffusion-based inpainting algorithm is chosen in this paper. In diffusion-based inpainting, the pixels are propagated into the gap via differential equations or parametric models. The local image geometry is explored, and the structure of the image is formulated using a partial differential equation (PDE) or parametric data. There are two important parameters that define the structure of the image: isophotes and the diffusion coefficient. Isophotes are lines of constant intensity of an image and hence possess the smallest spatial changes; their direction, at a given pixel, is defined by the normal to the gradient vector, i.e., the vector defining the maximum spatial changes. The diffusion coefficient controls the diffusion process: it promotes diffusion along the isophotes while inhibiting diffusion near edges and boundaries. There exist variants of diffusion-based inpainting, based on the direction of propagation and on the linearity or nonlinearity of the model. The fast marching technique described in [11] is a contemporary method which computes the intensity of the unknown pixels using weighted means of the known pixels in the neighborhood. Basically, the method marches the boundary sets from the known region into the unknown region, filling every pixel in the corresponding boundary set. Since this is a well-established method from the literature, the performance of the Navier–Stokes-based inpainting algorithm is compared with it. The approach in this paper is a diffusion inpainting technique that relates the image properties with fluid dynamics and is based on the Navier–Stokes equations
500
D. Maduskar and N. Dube
as discussed in the next section. This technique propagates the local structures in the image from a known part to the unknown part using the PDEs [12].
3 Navier–Stokes-Based Inpainting Algorithm
In this approach, the missing gaps in the satellite images are inpainted by using the theoretical framework of the Navier–Stokes equation, which propagates the local structures from the known part to the unknown part using the PDEs by relating the image intensity with the stream function for a two-dimensional incompressible flow [12]. The smoothness of the image, given by its Laplacian, is related to the vorticity of the fluid. Hence, the direction of propagation is related to the vector field of the stream function. Table 1 shows the corresponding properties of fluid and image.

Table 1. Related properties of fluid flow and image inpainting
Fluid dynamics ⇒ Image inpainting
Stream function (Ψ) ⇒ Image intensity (I)
Fluid velocity (∇⊥Ψ) ⇒ Isophote direction (∇⊥I)
Vorticity (ΔΨ) ⇒ Smoothness (ΔI)
Fluid viscosity (ν) ⇒ Diffusion (ν)
A Navier–Stokes vorticity transport equation, Eq. (1), is formulated for solving w; the details can be found in [12]. Here, w is the smoothness of the image and g is the diffusion coefficient:

$\dfrac{\partial w}{\partial t} + v \cdot \nabla w = \nu \, \nabla \cdot \big( g(|\nabla w|) \, \nabla w \big)$   (1)

The image intensity I is obtained by solving Poisson's equation, Eq. (2):

$\Delta I = w, \qquad I|_{\partial \Omega} = I_0$   (2)
Algorithm 1: Navier–Stokes-based inpainting algorithm
1. Calculate w of the boundary region using the intensity of the known part of the image.
2. Formulate the vorticity transport equation [Eq. (1)] and solve for w.
3. Using the updated w, compute the intensity I by solving the Poisson equation [Eq. (2)].
4. Using I, recompute w.
5. Repeat steps 2–4 until a steady state is reached, i.e., the intensity does not change significantly.
Thus, the method continues the smoothness of the image from the known part into the unknown part while preserving the edges at the boundary of the image.
4 Results
4.1 Dataset Used
The Navier–Stokes-based inpainting algorithm is applied on different sets of images of the CARTOSAT-1 satellite, which is a high-resolution satellite of India, used for cartographic applications. The selected images cover the urban, vegetation, and desert terrains. These are the most prominent land use and land cover types. The variability in these terrains provides a fair estimation of the restoration performance. It should be noted that the vegetation areas have an even surface, urban areas have complex structures, and deserts have rough terrain and an uneven surface. Table 2 shows the details of the satellite images used in the study.

Table 2. Dataset information
Terrain | Satellite | Sensor | Date of pass | Resolution | Latitude, Longitude
Urban | CARTOSAT-1 | PAN | 29-04-2006 | 2.5 m | 73°24′33.04″E, 23°0′1.83″N
Vegetation | CARTOSAT-1 | PAN | 29-04-2006 | 2.5 m | 72°32′36.90″E, 23°2′13.53″N
Desert | CARTOSAT-1 | PAN | 02-03-2011 | 2.5 m | 76°35′5.60″E, 11°36′48.24″N

4.2 Quality of Restoration
The algorithm has been implemented in Python using the OpenCV library. In OpenCV, inpainting is implemented using the function "inpaint", which outputs the inpainted image based on the supplied mask; the nonzero pixels in the mask define the unknown region which is to be inpainted. Figure 2 shows a 512 × 735 urban area satellite image with artificial clouds introduced at three points (the white area shows the cloud-occluded region, which is to be inpainted), marked as (a), (b), and (c). Further, the three cloud patches with their corresponding inpainted patches are shown for the NS-based inpainting as well as Telea's algorithm. Similarly, Figs. 4 and 6 show the satellite images of the vegetation area and desert area, of size 456 × 234 and 691 × 751, respectively. As can be seen, the vegetation area is better restored than the urban area because of the difference in geometries. The quality of restoration is good for the desert area too. It can also be seen that patches with smaller clouds were inpainted with better edge preservation than patches with bigger clouds, due to the smoothing effect of the algorithm. To further examine the quality of restoration, histograms of the original image and the inpainted image are compared in Figs. 3, 5 and 7 for the urban, vegetation, and desert areas, respectively. As can be seen, the negligible difference between the histograms of the original and inpainted images implies good quality of restoration for the Navier–Stokes inpainting algorithm.
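A minimal usage sketch of this OpenCV call is shown below; the file names and the inpainting radius are assumptions, not values taken from the paper.

```python
import cv2

img = cv2.imread("cartosat_scene.png")                      # assumed input image
mask = cv2.imread("cloud_mask.png", cv2.IMREAD_GRAYSCALE)   # nonzero = region to inpaint

# Navier-Stokes-based inpainting (the method analysed in this paper)
restored_ns = cv2.inpaint(img, mask, 3, cv2.INPAINT_NS)

# Fast marching method of Telea [11], used as the baseline
restored_telea = cv2.inpaint(img, mask, 3, cv2.INPAINT_TELEA)

cv2.imwrite("restored_ns.png", restored_ns)
cv2.imwrite("restored_telea.png", restored_telea)
```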
Fig. 2. Original image and the inpainted image for urban area
Fig. 3. Histogram of the original image and the inpainted image for urban area
Fig. 4. Original image and the inpainted image for vegetation area
Fig. 5. Histogram of the original image and the inpainted image for vegetation area
Fig. 6. Original image and the inpainted image for desert area
Fig. 7. Histogram of the original image and the inpainted image for desert area
4.3 Statistical Assessment of the Satellite Data
To numerically measure the quality of the restoration of the inpainting algorithm, the root-mean-square error (RMSE) is used. The Navier–Stokes inpainting method has been compared to the inpainting method by Telea. Both algorithms are tested on the same sets of images and masks. The pixel intensities of the original image at the masked patch are compared against the pixel intensities at the same patch of the inpainted image. The RMSE results are shown in Table 3.

Table 3. Comparison of RMSE for the NS-based inpainting and Telea algorithm
Algorithm | Telea inpainting | NS-based inpainting
RMSE for urban area | 53.132 | 52.284
RMSE for vegetation area | 18.998 | 18.831
RMSE for desert area | 7.786 | 7.707
As can be seen from Table 3, the quality of the restored image using the Navier–Stokes based inpainting method is better.
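The masked RMSE used for Table 3 can be computed as in the short sketch below; this is a straightforward reading of the description above, not the authors' code.

```python
import numpy as np

def masked_rmse(original, inpainted, mask):
    """RMSE restricted to the occluded (nonzero-mask) pixels of the image."""
    region = mask > 0
    diff = original[region].astype(np.float64) - inpainted[region].astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

# e.g. compare masked_rmse(original, restored_ns, mask) with masked_rmse(original, restored_telea, mask)
```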
5 Conclusion and Future Scope
Single image-based restoration for filling missing data in satellite images has advantages in terms of its independence from other datasets and its utilization for near real-time applications. The analysis carried out in this paper shows that Navier–Stokes-based image inpainting provides better statistical results and also preserves the histogram of the restored images. Application of this inpainting technique on images of different terrains shows that vegetation areas have simple geometry and hence are easier to restore than the complex geometry of urban areas; thus, the quality of restoration is better for vegetation areas than for urban areas. As part of future work, this algorithm will be tested on different cloud-occluded images from different satellites. It will also be used for the restoration of missing data due to noise, pixel dropouts, and line losses.
References 1. Y. Julien, J.A. Sobrino, Comparison of cloud-reconstruction methods for time series of composite NDVI data. Remote Sens. Environ. 114(3), 618–625 (2010) 2. N.K. Greeshma, M. Baburaj, S.N. George, Reconstruction of cloud-contaminated satellite remote sensing images using kernel pca-based image modelling. Arab. J. Geosci. 9(3), 239 (2016) 3. S. Shibata, M. Iiyama, A. Hashimoto, M. Minoh, Restoration of sea surface temperature images by learning-based and optical-flow-based inpainting, in 2017 IEEE International Conference on Multimedia and Expo (ICME). IEEE (2017), pp. 193– 198
4. I. Pratama, A.E. Permanasari, I. Ardiyanto, R. Indrayani, A review of missing values handling methods on time-series data, in 2016 International Conference on Information Technology Systems and Innovation (ICITSI). IEEE (2016), pp. 1–6 5. H.G. Malamiri, I. Rousta, H. Olafsson, H. Zare, H. Zhang, Gap-filling of MODIS time series land surface temperature (LST) products using singular spectrum analysis (SSA). Atmosphere 9(9), 334 (2018) 6. J. Zhou, L. Jia, M. Menenti, Reconstruction of global MODIS NDVI time series: performance of harmonic analysis of time series (HANTS). Remote Sens. Environ. 163, 217–228 (2015) 7. M. Lepot, J.-B. Aubin, F. Clemens, Interpolation in time series: an introductive overview of existing methods, their performance criteria and uncertainty assessment. Water 9(10), 796 (2017) 8. G.E.P. Box, G.M. Jenkins, G.C. Reinsel, G.M. Ljung, Time Series Analysis: Forecasting and Control (Wiley, New York, 2015) 9. C. Guillemot, O. Le Meur, Image inpainting: overview and recent advances. IEEE Signal Process. Mag. 31(1), 127–144 (2014) 10. A. Criminisi, P. Pérez, K. Toyama, Region filling and object removal by exemplar-based image inpainting. IEEE Trans. Image Process. 13(9), 1200–1212 (2004) 11. A. Telea, An image inpainting technique based on the fast marching method. J. Graph. Tools 9(1), 23–34 (2004) 12. M. Bertalmio, A.L. Bertozzi, G. Sapiro, Navier–Stokes, fluid dynamics, and image and video inpainting, in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), vol. 1. IEEE (2001), p. I-I
Perception of Plant Diseases in Color Images Through Adaboost Cheruku Sandesh Kumar1(B), Vinod Kumar Sharma1, Ashwani Kumar Yadav1, and Aishwarya Singh2 1 ASET, Amity University, Jaipur, Rajasthan, India [email protected], [email protected], [email protected] 2 Amity University, Jaipur, Rajasthan, India [email protected]
Abstract. With the continuous evolution of technology, agriculture is a domain moving toward smart, sustainable models. Current traditional plant disease detection techniques are subject to human error, which leads to improper use of pesticides to treat pests; this use of pesticide is not customized to the needs of the plants. To decrease human intervention and error, image processing techniques combined with machine learning are introduced into the model, which first obtains various features of the diseased region of interest by implementing feature analysis through different algorithms, including K-means clustering, GLCM, and Otsu's threshold method. The resulting feature set is used to train the Adaboost classifier designed using the MATLAB software package, which lets the trained model predict the onset of disease. On comparison, the K-fold technique is found to give 85% accuracy, whereas the holdout technique provides an 83.3% accurate model. Machine learning is an advanced approach to handle big data and to adapt to a particular image. This model hopes to provide an efficient method to handle nonlinear data for a small dataset and to make it user-friendly with the help of a graphical user interface (GUI) application. Keywords: K-fold · GLCM · Otsu's threshold · Adaboost
1 Introduction Agriculture contributes 17% to India's nominal GDP and 7.68% of total global agricultural output; hence, proper care of food production is required [1–3]. Various factors such as rapid population growth, food security issues, and climate change have spurred the agro-industry to discover unique initiatives to protect and increase crop yield. Another vital factor is a lack of awareness about pests (viruses, fungi, and bacteria) which infect plants, resulting in losses of quality and quantity of production [4]. Low-income farmers suffer large losses at broader scales of production [5].
As a result, artificial intelligence combined with digital image processing is steadily emerging as part of the industry's technological evolution to boost sustainable agricultural production. This will minimize processing time and maximize the accuracy of different alternatives that decrease the application of hazardous chemicals to plants. Consequently, disease detection in plants becomes a critical issue. Monitoring plant diseases manually is strenuous, as it is a dynamic and time-intensive process [6]: it requires a tremendous amount of work to deal with such unstructured big data, expertise in plant diseases, and long processing times [7]. Digital image processing using software is a fast solution for the detection and classification of plant diseases; the aim is to design a generalized model that will detect and classify disease accordingly. The idea of identification and classification of plant disease using image processing is inspired by biomedical image processing performed on CAT and MRI scans using morphological operations for the detection of tumors [8]. This detection of disease will help farmers use appropriate pesticides accordingly. The proposed model has two key elements: identification of the diseased region, which is performed using K-means clustering followed by feature extraction via GLCM and Otsu's method, and Adaboost classification, which generates the output for an image by training on the feature parameters of diseased and healthy leaves. The GUI application of this model thus hopes to become a disease identification system for crops that provides ease to farmers and helps reduce the amount of unnecessary pesticide on crops and plants [9].
2 Overview In the proposed model, various algorithms are implemented to enhance and segment the image and to extract different texture features, on the basis of which classification techniques are used to find the disease type. The image processing model aims to decrease complexity. It uses a combination of different operations to process the digital image, which is then passed to the Adaboost classifier model. In this model, the training set, inherently RGB in nature, is acquired using a 326 PPI camera, on which preprocessing operations like resizing, contrast enhancement, and segmentation of the diseased region are performed [10]. Image segmentation is performed on the chromaticity *a and *b layers of the image using clustering methods like K-means to produce the best result, which helps in the bifurcation of the area of interest, as sketched in the example below. Next, histogram properties and texture extraction with the structured color co-occurrence matrix method are implemented to obtain information about the spatial color arrangement. Finally, an Adaboost classifier, trained to automate a decision support system, is used for post-processing. Through matching and a classification algorithm derived from an existing database, the simulation should be able to provide quick results on potentially diseased crops.
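The authors implemented the pipeline in MATLAB; the OpenCV/Python sketch below only mirrors the idea of clustering the *a/*b chromaticity values to isolate the diseased region, and the number of clusters and the file name are assumptions.

```python
import cv2
import numpy as np

img = cv2.imread("leaf.jpg")                       # assumed input image
img = cv2.resize(img, (270, 180))                  # resolution mentioned in Sect. 4

lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
ab = lab[:, :, 1:3].reshape(-1, 2).astype(np.float32)   # a* and b* chromaticity layers

# Cluster the chromaticity values into 3 groups (e.g., leaf, disease, background)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, labels, _ = cv2.kmeans(ab, 3, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
labels = labels.reshape(img.shape[:2])

# Write each cluster out so the diseased region of interest can be selected
for k in range(3):
    cv2.imwrite(f"cluster_{k}.png", img * (labels == k)[:, :, None].astype(np.uint8))
```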
3 Problem Gaps • Nonlinear illumination and poor contrast deteriorate the segmentation of the disease from the leaf, giving rise to a long, time-intensive training period.
• Nonlinear features are not used due to the lack of a method to reduce dimensions. • Multi-class classification models are not implemented for smaller datasets. • The accuracy of the system can be improved by adjusting the training ratio.
4 Proposed System and Feasibility The proposed model makes use of texture analysis for feature extraction and employs the extracted features to classify the image, for a small dataset only, with a maximum resolution of 270 × 180 pixels. The main objectives of the model are discussed below: • To develop a generalized model to handle linear and nonlinear training features by using image processing techniques [K-means, Otsu's thresholding, etc.] for detection and machine learning for classification of disease. • To explore various algorithms to acquire nonlinear features to detect the region of interest. • To process and implement multi-class classification of an image with the help of a GUI app. • To propose an accurate and time-effective model for small datasets.
5 Model Analysis and Design The first step is to design an image processing model to analyze a specific image for detection of the diseased region of interest (ROI) with the help of various features extracted from the image [11]. The design requires deciding what features are to be extracted for the collection of data, what image processing techniques to use, and how much raw data is needed for optimum training of the classifier [12]. The second step is to design a supervised multi-class classification model that uses machine learning (ML) for detecting disease by training it with the specific image's features as an input dataset, using the MATLAB software package [13, 14]. The third step is to develop and implement a graphical user interface (GUI) application window that runs the designed model in the background and provides an easy interface for the user [15, 16]. The last step is to cross-validate the training dataset to verify the reliability of the classifier by computing the accuracy of the designed model using the MATLAB software package. The project deals with implementing a model that looks for the ROI in an image and classifies the disease accordingly. The acquired dataset undergoes preprocessing in the first phase using MATLAB, which is discussed in detail later in the chapter. In the second phase, the GLCM method and Otsu's thresholding method are applied to the preprocessed data to extract different texture features, which are discussed further in the report [6]. The third phase requires the implementation of the Adaboost model, which is a supervised classification model [17, 18]; this also requires the selection of the training model and the kernel [19, 20]. Once the Adaboost has been trained, it works as a classifier [21]. After this, the project entails the design of a GUI application that works as an interface between the model and the user. The project block diagram is shown in Fig. 1.
Fig. 1. Work flow algorithm (block diagram: Phase 1, data preprocessing: image acquisition, contrast enhancement, resizing, segmentation of ROI; Phase 2, processing: (a) feature extraction by the co-occurrence method, (b) feature extraction using Otsu threshold; Phase 3, classifier: kernel selection, cross-validation; GUI application)
6 Extracted Features Are Shown Below See Table 1.

Table 1. Extracted features from three cases of the training set
Case | Auto-corr | Contrast | Correlation | Energy | Entropy | Mean | Std Dev | RMS
One | 6.6587 | 0.3005 | 0.9497 | 0.4956 | 3.1846 | 39.3622 | 68.9000 | 8.7007
Two | 5.7981 | 0.4928 | 0.9217 | 0.6323 | 2.9539 | 26.6068 | 61.4785 | 8.0683
Three | 4.9136 | 0.5611 | 0.8938 | 0.7388 | 1.5329 | 17.6926 | 53.2400 | 5.6152
Auto-corr—auto-correlation, Std Dev—standard deviation
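For readers who want to reproduce features of this kind outside MATLAB, the hedged Python sketch below computes a GLCM-based feature vector (a subset of the quantities in Table 1, with entropy, mean, and standard deviation added by hand) and indicates how an AdaBoost classifier could be trained on it; the training-data layout is an assumption, not part of the paper.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops   # scikit-image >= 0.19 spelling
from sklearn.ensemble import AdaBoostClassifier

def glcm_features(gray_patch):
    """GLCM-based feature vector for an 8-bit grayscale ROI patch."""
    glcm = graycomatrix(gray_patch, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    feats = [graycoprops(glcm, prop)[0, 0]
             for prop in ("contrast", "correlation", "energy", "homogeneity")]
    p = glcm[:, :, 0, 0]
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))      # entropy, as in Table 1
    feats += [entropy, gray_patch.mean(), gray_patch.std()]
    return np.array(feats)

# X: stacked feature vectors of diseased/healthy ROI patches, y: their class labels
# clf = AdaBoostClassifier(n_estimators=100).fit(X, y)
# prediction = clf.predict(glcm_features(new_patch).reshape(1, -1))
```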
7 Conclusion As the need for sustainable agriculture grows with changes in technology, the current project is at best a prototype. It can provide relief to farmers by offering a means of machine surveillance of crops for disease identification. The practical results were simulated on the datasets mentioned in this research, which gave the best classification accuracy. The future of this project holds many possibilities, such as real-time monitoring of fields with the help of drones. Artificial intelligence is a field of vast scope which, when combined with the plant disease detection model, provides a step further into the field of smart sustainable agriculture.
A larger database could be created and fed by a real-time acquisition application and processed by the classifier using server-grade computers.
References 1. N. Hemageetha, A survey on application of data mining techniques to analyze the soil for agricultural purpose, in IEEE 3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi (2016), pp. 3112–3117 2. S. Manoj, A framework for big data analytics as a scalable systems. Int. J. Adv. Network. Appl. (IJANA) 72–82 (2015) 3. S. Manoj, A survey of thresholding techniques over images. 3(2), 461–478 (2014) 4. A.A. Bharate, M.S. Shirdhonkar, in A review on plant disease detection using image processing. IEEE International Conference on Intelligent Sustainable Systems (ICISS), Palladam (2017), pp. 103–109 5. M. Parisa Beham, A.B. Gurulakshmi, Morphological image processing approach on the detection of tumor and cancer cells, in IEEE International Conference on Devices, Circuits and Systems (ICDCS), Coimbatore (2012), pp. 350–354 6. M.T. bin Mohamad Azmi, N.M. Isa, Orchid disease detection using image processing and fuzzy logic, in IEEE International Conference on Electrical, Electronics and System Engineering (ICEESE), Kuala Lumpur, 2013, pp. 37–42 7. B. Mishra, S. Nema, M. Lambert, S. Nema, Recent technologies of leaf disease detection using image processing approach—a review, in IEEE International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), Coimbatore (2017), pp. 1–5 8. C.S. Hlaing, S.M.M. Zaw, Plant diseases recognition for smart farming using model-based statistical features. In IEEE 6th Global Conference on Consumer Electronics (GCCE), Nagoya (2017), pp. 1–4 9. Y. Ma, G. Guo, Support Vector Machines Applications (Springer Science & Business Media, Berlin, 2014) 10. P.R. Rothe, R.V. Kshirsagar, Cotton leaf disease identification using pattern recognition techniques, in IEEE Journal International Conference on Pervasive Computing (ICPC), Pune (2015), pp. 1–6 11. V. Singh, Varsha , A.K. Misra, Detection of unhealthy region of plant leaves using image processing and genetic algorithm, in IEEE Journal International Conference on Advances in Computer Engineering and Applications, Ghaziabad (2015), pp. 1028–1032 12. S.R. Maniyath, et al., Plant disease detection using machine learning, in IEEE Journal, International Conference on Design Innovations for 3Cs Compute Communicate Control (ICDI3C), Bengaluru (2018), pp. 41–45 13. G.B. Souza, G.M. Alves, A.L.M. Levada, P.E. Cruvinel, A.N. Marana, A graph-based approach for contextual image segmentation, in IEEE Journal, 29th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), Sao Paulo (2016), pp. 281–288 14. R.G.G. Acuña, J. Tao, R. Klette. Generalization of Otsu’s binarization into recursive color image segmentation, in IEEE Journal, International Conference on Image and Vision Computing New Zealand (IVCNZ), Auckland (2015), pp. 1–6 15. A. Le Bris, F. Tassin, N. Chehata, Contribution of texture and red-edge band for vegetated areas detection and identification, in IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Melbourne, VIC (2013), pp. 4102–4105 16. R. Radha, S. Jeyalakshmi, An Effective Algorithm for Edges and Veins Detection in Leaf Images (Trichirappalli, World Congress on Computing and Communication Technologies, 2014), pp. 128–131
17. K.P. Ferentinos, Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 145, 311–318 (2018) 18. N. Petrellis, A smart phone image processing application for plant disease diagnosis, in IEEE 6th International Conference on Modern Circuits and Systems Technologies (MOCAST), Thessaloniki (2017), pp. 1–4 19. J. Francis, D. Anto Sahaya Dhas, B.K. Anoop, Identification of leaf diseases in pepper plants using soft computing techniques, in IEEE Conference on Emerging Devices and Smart Systems (ICEDSS), Namakkal (2016), pp. 168–173 20. D. Sinwar, V.S. Dhaka, M.K. Sharma, G. Rani, AI-Based Yield Prediction and Smart Irrigation. Internet of Things and Analytics for Agriculture, Volume 2. Studies in Big Data, vol. 67 (2020), pp. 155–180 21. M.R. Badnakhe, P.R. Deshmukh, Infected leaf analysis and comparison by Otsu threshold and k-means clustering. Int. J. Adv. Res. Comput. Sci. Software Eng. 2(3) 22. L. Jain, M.A. Harsha Vardhan, M.L. Nishanth, S.S. Shylaja, Cloud-based system for supervised classification of plant diseases using convolutional neural networks, in IEEE International Conference on Cloud Computing in Emerging Markets (CCEM), Bengaluru (2017), pp. 63–68 23. A.V. Goldberg, Amazon Technologies, Inc. (2018), (US10102056). Anomaly detection using machine learning
SWD: Low-Compute Real-Time Object Detection Architecture Raghav Sharma(B) and Rohit Pandey Hughes Systique Corporation, Gurugram, India {raghav.sharma,rohit.pandey}@hsc.com
Abstract. In recent years, deep learning-based object detection has been a research hot spot due to its powerful learning ability in dealing with occlusion, scale transformation, background switches, etc. Many state-of-the-art object detectors like SSD, SSH, YOLO, and RCNN have been invented in recent years. These architectures are highly complex, work on deep learning frameworks, and require high computing power, which restricts their practical adaptability for low-cost applications and low-compute devices. In this work, a novel real-time object detector architecture, the sliding window detector (SWD), based on a sliding window technique, is proposed. SWD works on a deep learning framework and can execute on a low-compute device. In the proposed SWD architecture, the classifier network is optimized: the fully connected layer of a classifier trained on N classes is replaced by a convolutional layer, which generates N heat-maps. These heat-maps are used to localize and classify the objects. SWD simulation on an Intel i5 CPU at 20 FPS showed a mAP of 0.85 on the PKLOT data-set. Keywords: Object detection · Sliding window · Car counting · SWD
1 Introduction Object detection [1, 2] is a technique to identify and localize each desired object in a given image. Each object has unique features based on which a human or a machine can identify and localize it. A human can easily extract the features of an object and identify it, but a machine first needs to find the probable regions to localize the object and then extract the unique features of the object to classify it. This process requires deep domain knowledge and is hard to design by hand; to overcome this problem, many object detectors based on deep learning frameworks have been introduced, like RCNN [3], fast RCNN [4], faster RCNN [5], YOLOv1 [6], YOLOv2 [7], YOLOv3 [8], SSD [9], SSH [10], and many more. Methods based on deep learning frameworks require a lot of computation and high computational power devices, which is not feasible in every situation. In most scenarios, it is desirable to use deep learning frameworks because of their simplicity and accuracy, but they should also be able to operate on a CPU, which is next to impossible for these architectures. Our proposed SWD object detector is based on a deep learning framework, so anyone can easily
design it, and it is optimized so that it performs well on a low-computational-power device. SWD, based on the sliding window technique, is discussed in Sect. 3. In this section, the motivation, challenges, and problem statement are discussed. 1.1 Motivation Nowadays, deep learning is at its peak. Researchers in every field, like computer vision, speech processing, and video processing, are trying to switch from conventional methods to deep learning frameworks due to their simplicity, accuracy, and efficiency. Along with these advantages, deep learning needs very high computational power devices, but such devices are not feasible in every situation, for example for smart cameras, mobile applications, and low power devices where cloud processing and bulky hardware are not possible. In the current scenario, if somebody wants to use deep learning frameworks in a smart device, either the device must be able to carry bulky hardware or it is not possible. If somebody wants to design a humanoid based on deep learning, it will look like a giant because of the bulky hardware needed. This work focuses on reducing the need for bulky hardware for deep learning, so that if somebody designs a humanoid, it will look like a human, not a giant, as shown in Fig. 1.
Fig. 1. Motivation of this work to reduce need of bulky hardware
1.2 Challenges Object detection is not a trivial task. To perform object detection using deep learning on the CPU, the proposed object detector should be able to handle the multiple variations present in an image, and it should be optimized so it can work on a CPU. The diversity present in an image can be due to various reasons, like an object in motion, different instances of the same class, deformation present in the image, illumination variation, a varying number of objects, and many more. As a new solution is introduced to handle various
challenges, it increases the complexity of the network. A highly complex network needs high computational power devices, so a highly accurate and robust object detector needs a high computational power device. An optimized object detector should be able to handle the various challenges without the need for a high computational power device. 1.3 Problem Statement Many challenges have been discussed in Sect. 1.2, like the diversity present in an image and the need for high computation. To handle the diversity of images, a complex network is required; to handle the complex network, a high computational power device is needed, which is not feasible in every situation. This work focuses on designing an optimized object detector that can handle all the variations of the data without the need for a high computational power device.
2 Literature Survey Object detection has endless applications; because of this, it has received the attention of many researchers since the early days of machine learning. In the initial period (from 1999), object detection depended on handcrafted features, which were challenging to design and application-specific, like SIFT [11], SURF [12–14], HOG [15], and bag of words. Object detection methods improved with the advancement of deep learning; deep learning-based detectors are easy to develop even for a beginner. There are many object detectors based on deep learning, like SSD [9], YOLO [6], and RCNN [3], which are more robust than the earlier methods. Figure 2 shows the timeline of object detection, starting from 1999 with SIFT and improving over time with different methods up to SSD, YOLO, and Mask-RCNN, which work accurately. The need for complex networks and high computational power devices increased as deep learning networks were increasingly used to improve the performance of object detection; because of this, they cannot be used on low power devices. Many algorithms have been proposed to handle this problem: for example, EOD [13] uses HOG features and a CNN so that it can work on low power computational devices, and "Real-time object detection on low power embedded platforms" [16], proposed in October 2019, performs object detection on a low power embedded platform. If the object size in the images is very small, as for objects in aerial images, then it is hard to perform real-time object detection even on a GPU; for example, OIR-SSD [17] achieved 5 FPS on aerial images. SWD can perform real-time object detection on aerial images using a low power computational device. There are some other works similar to this one, like DAC-SDC [18], algorithm-SoC co-design [19], and a system-level solution [20], but their resultant mean average precision (mAP) is low.
3 Proposed Methodology It has already been discussed that deep learning framework-based methods are easy to design and more accurate, but they need very high computational power, which is not feasible
Fig. 2. Improvement of object detection with respect to time
every time. Our proposed object detector is based on deep learning and is as reliable as other object detection methods, but unlike other deep learning networks, it does not need high computational power. The proposed object detector is based on a sliding window technique. In the traditional sliding window technique, all possible patches pass through a classifier, which performs classification and localization of objects in the image. A traditional classifier contains fully connected layers, and because of the fully connected layers it is limited to fixed-size input. Suppose a classifier is designed to classify an image of size 14 × 14 × 3; it will produce a one or a zero for a 14 × 14 × 3 input, but it cannot classify an image of size 18 × 18 × 3. If a convolutional layer replaces the fully connected layers, then the same classifier can process 18 × 18 × 3 images as well. If an 18 × 18 × 3 image is fed to this classifier, it will produce an output tensor of 2 × 2, i.e., four outputs corresponding to the four 14 × 14 × 3 patches present in the input, as shown in Fig. 3. Similarly, if a 28 × 28 × 3 image is fed to the classifier, it will produce an output of size 8 × 8. Now this classifier works as a heat-map generator; this approach is called the CNN sliding window approach. This is shown in Fig. 4, where Fig. 4a is the testing image and Fig. 4b is the heat-map for a human face. The white patch shows the area where the probability is high to
Fig. 3. Heat-map generation of image [21]
(a) Testing image taken by VOC data-set
(b) Heat-map of test image for human face.
Fig. 4. Heat-map
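To make the heat-map idea concrete, the sketch below shows one simple way to turn a heat-map such as Fig. 4b into bounding boxes by thresholding and grouping neighbouring high-probability cells; the threshold, stride, and window values are illustrative assumptions and not parameters taken from the paper.

```python
import numpy as np
from scipy import ndimage

def heatmap_to_boxes(heatmap, stride=16, window=64, threshold=0.7):
    """Convert a class heat-map into boxes in input-image coordinates.

    `stride` (input pixels per heat-map cell), `window` (receptive field of
    the classifier), and `threshold` are illustrative defaults.
    """
    mask = heatmap >= threshold                 # keep only confident cells
    labels, _ = ndimage.label(mask)             # group neighbouring cells into blobs
    boxes = []
    for blob in ndimage.find_objects(labels):
        ys, xs = blob
        # map heat-map coordinates back to the input image
        x1, y1 = xs.start * stride, ys.start * stride
        x2, y2 = xs.stop * stride + window, ys.stop * stride + window
        boxes.append((x1, y1, x2, y2, float(heatmap[blob].max())))
    return boxes
```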
The proposed object detector uses this technique to localize and classify the objects present in the image. In this work, the approach has been used to localize the cars present in the image. The test data-set is discussed in Sect. 4. 3.1 Proposed Network In this approach, the optimized network shown in Table 1 has been used to generate the heat-map of cars and to perform object detection. This network is optimized and very small, so it can work efficiently on the CPU.

Table 1. Network architecture

Type of layer    Filter size   No. of filters
Conv2D           3 × 3         16
Conv2D           3 × 3         16
Maxpooling2D     4 × 4         –
Conv2D           3 × 3         32
Conv2D           3 × 3         32
Maxpooling2D     4 × 4         –
Conv2D           4 × 4         128
Dropout (0.5)    –             –
Conv2D           1 × 1         No. of classes
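For illustration, the following is a minimal Keras sketch of the layer stack in Table 1. The activations and padding are assumptions (they are not specified in the paper), chosen so that a 64 × 64 training patch maps to a 1 × 1 output; because the network is fully convolutional, feeding a larger image produces a heat-map instead of a single prediction.

```python
from tensorflow.keras import layers, models

def build_network(num_classes=2):
    # Fully convolutional stack following Table 1. ReLU and "same" padding on
    # the 3x3 convolutions are assumptions; with them, a 64x64 patch reduces to
    # a 1x1 output (64 -> 16 -> 4 -> 1), and larger inputs yield a heat-map.
    return models.Sequential([
        layers.Conv2D(16, 3, padding="same", activation="relu",
                      input_shape=(None, None, 3)),   # any image size at test time
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(4),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(4),
        layers.Conv2D(128, 4, activation="relu"),      # 4x4 conv replaces a dense layer
        layers.Dropout(0.5),
        layers.Conv2D(num_classes, 1, activation="softmax"),  # per-location class scores
    ])
```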
3.2 Training and Testing To train this network, the segmented images provided by the PKLOT data-set [22] have been used, as shown in Fig. 6. Every patch is resized to 64 × 64. The training and validation accuracy of the proposed network is 97.30% and 98.10%, respectively. To test it as an object detector, full images have been used, as shown in Fig. 5.
Fig. 5. PKLOT data-set sample images
4 Data-Set To train and test the proposed object detector, the PKLOT [22] data-set has been used. The parking lot data-set contains 12,417 images, as shown in Fig. 5, and 695,899 image patches generated from them, as shown in Fig. 6. These segmented images can be categorized into
two categories: occupied (patches containing cars) and empty (patches containing only parking space, without vehicles). To train the proposed object detector, these segmented images have been used, where occupied images form the desired class and empty images the background class. The images were acquired at the Federal University of Parana and the Pontifical Catholic University of Parana.
Fig. 6. Segmented images from PKLOT data-set
5 Result and Discussion This technique does not need any complex network, and because of this it can also work on a CPU. It gives 0.85 mAP for a single class (car). Table 2 shows the comparison between different object detectors and the proposed network. To test this network, an i5 processor and 16 GB RAM have been used. The network is tested on the PKLOT data-set because we want to evaluate the algorithm on aerial data with a single class, for which the PKLOT data-set is best suited.

Table 2. mAP and speed comparison

Object detector   mAP    Speed (FPS) on CPU
SSD               0.90   0.2
YOLOv3            0.89   0.3
OURS (SWD)        0.85   ~20
Table 2 shows that the mAP of the proposed object detector is almost the same as that of the other object detectors, while its FPS is much higher. This shows that it can work on low-power computational devices with real-time processing. The results were obtained on 20% of the data-set, selected randomly.
6 Conclusion Due to the optimized network and algorithm, the proposed detector can process aerial images on a low-power device at 20 FPS. It has been trained on aerial images with a single class (car), but it has the potential to perform well on a multi-class data-set.
References 1. P.F. Felzenszwalb, R.B. Girshick, D. McAllester, D. Ramanan, Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010) 2. P. Viola, M. Jones et al., Robust real-time object detection. Int. J. Comput. Vision 57(2), 137–154 (2001) 3. R.B. Girshick, J. Donahue, T. Darrell, J. Malik, Rich feature hierarchies foraccurate object detection and semantic segmentation, in CoRR, vol. abs/1311.2524 (2013) 4. R.B. Girshick, Fast R-CNN, in CoRR, vol. abs/1504.08083 (2015) 5. S. Ren, K. He, R.B. Girshick, J. Sun, Faster R-CNN: towards real-time objectdetection with region proposal networks, in CoRR, vol. abs/1506.01497 (2015) 6. J. Redmon, S.K. Divvala, R.B. Girshick, A. Farhadi, You only look once: Unified, real-time object detection, in CoRR, vol. abs/1506.02640 (2015) 7. J. Redmon, A. Farhadi, YOLO9000: better, faster, stronger, in CoRR, vol. abs/1612.08242 (2016) 8. J. Redmon, A. Farhadi, in Yolov3: an incremental improvement. arXiv preprint arXiv:1804. 02767 (2018) 9. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S.E. Reed, C. Fu, A.C. Berg, SSD: single shot multibox detector, in CoRR, vol. abs/1512.02325 (2015) 10. M. Najibi, P. Samangouei, R. Chellappa, L.S. Davis, SSH: single stage headless face detector, in CoRR, vol. abs/1708.03979 (2017) 11. D.G. Lowe, Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60(2), 91–110 (2004) 12. Dhaka Vijaypal, Offline language-free writer identification based on speeded-up robust features. Int. J. Eng. (IJE) Trans. A: Basics 28(7), 984–994 (2015) 13. V.P. Dhaka, Offline scripting-free author identification based on speeded-up robust features. IJDAR 18, 303–316 (2015). https://doi.org/10.1007/s10032-015-0252-0 14. V.S. Dhaka, Segmentation of handwritten words using structured support vector machine. Pattern Anal. Appl. (2019). https://doi.org/10.1007/s10044-019-00843-x 15. N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in International Conference on computer vision & Pattern Recognition (CVPR’05), vol. 1 (IEEE Computer Society, 2005), pp. 886–893 16. G. Jose, A. Kumar, S. Kruthiventi, S. Saha, H. Muralidhara, Real-time object detection on low power embedded platforms, in The IEEE International Conference on Computer Vision (ICCV) Workshops, Oct 2019 17. R. Sharma, R. Pandey, A. Nigam, Real time object detection on aerial imagery, in Computer Analysis of Images and Patterns—18th International Conference (CAIP 2019), Salerno, Italy, September 3–5, 2019, Proceedings, Part I (2019), pp. 481–491 18. X. Xu, X. Zhang, B. Yu, X.S. Hu, C. Rowen, J. Hu, Y. Shi, Dac-sdc low powerobject detection challenge for UAV applications. IEEE Trans. Pattern Anal. Mach. Intell. 1 (2019) 19. Y. Zhu, A. Samajdar, M. Mattina, P.N. Whatmough, Euphrates: algorithmsoc co-design for low-power mobile continuous vision, in CoRR, vol. abs/1803.11232 (2018)
20. F. Li, Z. Mo, P. Wang, Z. Liu, J. Zhang, G. Li, Q. Hu, X. He, C. Leng, Y. Zhang, et al., A system-level solution for low-power object detection, in Proceedings of the IEEE International Conference on Computer Vision Workshops (2019) 21. A. Ng, K. Katanforoosh, Y.B. Mourri, Convolutional implementation of sliding windows 22. P.R. De Almeida, L.S. Oliveira, A.S. Britto Jr., E.J. Silva Jr., A.L. Koerich, Pklot—a robust dataset for parking lot classification. Expert Syst. Appl. 42(11), 4937–4949 (2015)
Guided Analytics Software for Smart Aggregation, Cognition, and Interactive Visualisation Aleksandar Karadimce1(B) , Natasa Paunkoska (Dimoska)1 , Dijana Capeska Bogatinoska1 , Ninoslav Marina1 , and Amita Nandal2 1 University of Information Science and Technology “St. Paul the Apostle”, Ohrid, North
Macedonia {aleksandar.karadimce,natasa.paunkoska,dijana.c.bogatinoska, Rector}@uist.edu.mk 2 Department of Computer and Communication Engineering, Manipal University Jaipur, Jaipur, India [email protected]
Abstract. The development of tools that improve efficiency and inject intelligent insights into social media businesses through guided analytics is crucial for consumers, prosumers, and business markets. These tools enable contextualised socially aware and spatial-temporal data aggregation, knowledge extraction, cognitive learning about users’ behaviour, and risk quantification for business markets. The proposed tools for analytics and cognition framework will provide a toolset of guided analytics software for smart aggregation, cognition, and interactive visualisation with a monitoring dashboard. The aggregation, monitoring, cognitive reasoning, and learning modules will analyse the behaviour and engagement of the social media actors, diagnose performance risks, and provide guided analytics to consumers, prosumers, and application providers to improve collaboration and revenues, using the established Pareto-trust model. This framework will provide a seamless coupling with distributed blockchain-based services for early alert, real-time tracking and updated data triggers for reach and engagement analysis of events. Moreover, this will allow users to analyse, control, and track their return on investment to enhance monetary inclusion in collaborative social media. Keywords: Guided analytics · Data aggregation · Geospatial · Temporal · Augmented cognitive · Microservices · Social media
1 Introduction Social media platforms, such as Facebook, YouTube, WhatsApp, WeChat, Instagram, LinkedIn, and others, are very attractive these days. Besides the main aim of their existence, they have the potential to shape and mobilise communication patterns, practices of exchange and business, creation, learning and knowledge acquisition. One drawback
of these social media platforms is their centralisation. Generally, they represent a centralised entity with a single proprietary organisation controlling the network. This feature brings critical trust and governance issues for the content created and propagated in that environment. A particular concern is that centralised intermediaries are frequently involved in data breaches. Therefore, there is a necessity for a new, innovative solution that will address the centralised content issue and will facilitate global reach, improved trust, and decentralised control and ownership of the underlying social media environment. The idea is that the next generation social media ecosystem must bring together a range of diverse and fragmented social media actors, such as individuals, startups, small and medium enterprises (SME), and providers, under one systemic umbrella with decentralised ownership, allowing them to participate in collaborative decision-making and the sharing economy. Such an ongoing project funded by the European Commission under the Horizon 2020 programme is ARTICONF (‘smART socIal media eCOsystem in a blockchaiN Federated environment’).1 The main purpose of the project is to create a decentralised and federated social media ecosystem, supported by an underlying blockchain technology that will ensure portable intra-platform and cross-platform social media data, interpretable in a range of different contexts for preservation, analysis and visualisation via four frameworks: (i) trust and integration controller (TIC), (ii) co-located and orchestrated network fabric (CONF), (iii) semantic model with self-adaptive and autonomous relevant technology (SMART), and (iv) tools for analytics and cognition (TAC). The four frameworks are an integrated part of the ARTICONF social media platform. The TIC framework provides a mobile/Web app-enabled backend system for the end-users integrated with a decentralised blockchain platform; CONF consists primarily of a suite of microservices to customise virtual infrastructures; the SMART framework comprises a suite of software services for semantic linking and decision-making in a collaborative network of social objects; and TAC provides a guided analytics software toolset with smart aggregation, cognition, and interactive visualisation with a monitoring tool. This paper focuses on the development of the TAC framework. This framework is most important for the consumers, prosumers, and application providers in the direction of enhancing collaboration and revenues. The tool includes guided analytics with social and predictive models for improving efficiency and injecting intelligent insights into operational and mission-critical social media businesses. Hence, the TAC tool deals with contextualised socially aware and spatial-temporal data aggregation, knowledge extraction, cognitive learning about users’ behaviour, and risk quantification for business markets. The contributions of this paper are as follows: 1. We propose a TAC framework to enable the social media service owner to have a reach and engagement analysis of the performance of the particular service he/she offers. 2. We implemented a guided analytics dashboard as a proof of concept, using a car-sharing use case. The output of this tool is a visualisation microservice that will (i) provide a better user experience for the social media service users and (ii) optimise 1 https://articonf.eu/.
the business, reduce the costs, and increase the revenue of the social media service owners. The remainder of the paper is organised as follows. Section 2 gives an overview of the related work. Section 3 explains in detail the TAC architecture. Section 4 provides TAC implementation. Finally, Sect. 5 concludes the paper.
2 Related Work A social media network can be seen as an umbrella that covers all Web-based applications that allow content generation by individuals [1]. Through active collaboration, social media makes people active participants and engages them in the exchange of information. Another way of looking at a social network is as an online group that brings together people with common interests [2]. Most social networking sites offer integrated and accessible tools to simplify and speed up the information publication process. Nowadays, companies have already moved from a collaboration approach focused on individuals within a single company to more sophisticated ones that facilitate social sessions across organisations. Indeed, the business sector makes the most effective use of these collaboration tools [3]. The existence of different social media services leads to the production of heterogeneous data. The data has various formats, such as images, videos, maps, and geolocation data; moreover, it comes from various data sources and has inconsistent file formats. Thus, the aggregation process becomes difficult to manage. One alternative for data management is using large-scale storage and multidimensional data management in a single integrated system [4, 5]. The convergence of GIS and social media has made progress in collecting spatial and temporal data from social media, which shifts the emphasis to the dynamic process of time-critical or real-time monitoring and decision-making [6]. Further, the interpretation of spatial analysis actualises concepts like proximity or access, isolation or exposure, neighbourhoods and boundaries, neighbourhood effects, and diffusion [7]. The social media industry provides computer-mediated tools, which enable people or companies to create, share, or exchange information [8]. It tries to catch or target users at optimal times in ideal locations, with the ultimate aim of conveying information or content that is in line with the consumer’s mindset [9–11]. Social media fundamentally change the communication between businesses, organisations, communities, and individuals. Measuring the impact of the social media integration and of specific use case performances in the ARTICONF project is a way of providing guided analytics to consumers, prosumers, and application providers, and of improving collaboration and revenues. The impact assessment of using different tools for member collaboration and team performance is usually done by quantitative or qualitative measurements [12]. The return on investment (ROI) remains the most common indicator, measuring cost avoidance and reduction, optimising the business, and enabling faster business decisions.
3 TAC Architecture In this section, we give a description of the proposed TAC framework, as well as how this framework interacts with other frameworks described above (TIC, SMART, and CONF) within the ARTICONF project. TAC as a fundamental part of the decentralised social media ecosystem enhances business productivity by tracking updated data triggers from diverse social media events. The TAC tool seamlessly interacts with TIC and provision socially contextual geospatial and temporal data aggregation to gain intelligent insights and prediction through augmented cognition and reasoning. SMART initiates the TAC configuration providing aggregation, monitoring, cognitive reasoning, and learning modules that analyse the behaviour and engagement of the application and social media actors, diagnose performance risks, and provide guided analytics to consumer prosumers, and application providers to improve collaboration and revenues. TAC interacts with CONF to intelligently provision services based on abstract social media application requirements, operational conditions at the infrastructure level, and timecritical event triggering. TAC initially collects and aggregates smart data in accordance with guidelines provided by the TIC, SMART, and CONF recommendation engines. The extracted cross-contextual cognitive inferences together with the established Pareto-trust SLA preferences act as input to the TAC guided analytics engine, which provides visuals results on the ARTICONF graphical user interface. The proposed TAC framework architecture is shown in Fig. 1.
Fig. 1. TAC framework architecture
API gateway provides a RESTful interface to query different microservices provided by the module. Message broker facilitates communication between microservices.
Augmented cognition data model consists of four microservices: • Geospatial microservice handles the gathering, display, and manipulation of global positioning system (GPS) data, satellite photography, geotagging, and historical data, usually represented in terms of geographic coordinates, or implicitly in terms of a street address, postal code, or forest stand identifier. • Temporal microservice offers support to analyse complex social networks providing users with actionable insights even against a large amount of data over a short time period, coupled with visualisation to uncover influential actors. • Return of investment (ROI) monitoring microservice provides business insights and measurable return on collaboration (ROC) metrics. • Social-contextual cognitive reasoning and cross-contextual cognitive learning microservices reduces uncertainty, double-checks the validity of information and their sources in a hostile environment. Guided analytics dashboard consists of four microservices: • Key performance indicators (KPI) metrics microservice runs the testing, measurement, and evaluation of the KPI success metrics of the ecosystem, calculated for every use case objective, further exploited according to the social media application provider needs and requirements. • Risk prediction microservice analyses the data received from the ROI monitoring microservice. • Guided analytics microservice provisions the process of guiding the social network users through the workflow and analyses the parameters of its interest through a use case recommendation, moving the analysis beyond reporting shallow summary data to acquire strong and actionable insights from users. • Use case rating microservice notifies and visualises results from the KPI metrics microservice and the ROC metric from the risk prediction microservice. TAC input interface consists of a TAC ingestion microservice acting as an input interface for the TIC, SMART, and CONF tools for pushing all openly available and anonymised semantic data.
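As a rough illustration of how the ingestion interface could be exposed, the sketch below shows a minimal REST endpoint written with Flask; the route name, port, and payload handling are hypothetical placeholders rather than the actual ARTICONF API.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
EVENT_BUFFER = []  # stand-in for the message broker / downstream data store

@app.route("/tac/ingest", methods=["POST"])  # hypothetical route name
def ingest():
    """Accept anonymised semantic data pushed by TIC, SMART, or CONF."""
    event = request.get_json(force=True)
    EVENT_BUFFER.append(event)  # in practice: publish to the message broker
    return jsonify({"accepted": True, "buffered": len(EVENT_BUFFER)}), 202

if __name__ == "__main__":
    app.run(port=5000)
```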
4 TAC Implementation This section will first provide a short overview of the open-source software product ELK stack, as an implementation solution for the proposed TAC framework. Thereafter, the implementation of the proof of concept will be explained using a car-sharing use case scenario. The implementation of the TAC is done using a collection of three opensource products: Elasticsearch, Logstash, and Kibana, also known as ELK stack. The main benefit of choosing the ELK stack as an implementation solution for the TAC framework is a platform that can ingest and process data from different data sources, then store that data in one central data store that can vertically scale as data grows. Finally, TAC will provide a set of tools to analyse the data. The process of collection of
the different types of data will be handled by edge hosts with lightweight Beats agents installed, shipping into the ELK stack. The TAC framework runs the data preprocessing and aggregation using the Logstash tool, which can collect data from various input sources: the ARTICONF platform tools (TIC, CONF, SMART, and TAC) and the use case partners’ data-set traces. Logstash can apply different transformations and enhancements to the selected data and then ships it for parsing into the Elasticsearch database. The open-source Elasticsearch search and analysis engine is based on the Apache Lucene search engine; it uses document-oriented index entries, which can be associated with a schema, and combines full text-oriented search options for text fields with more precise search options for other types of fields, like date + time fields, geolocation fields, etc. Finally, Kibana is a Web-based visualisation layer that works on top of Elasticsearch, mainly used for exploring and visualising data. The modern browser-based interface (HTML5 + JavaScript) provides the end-users with the ability to analyse and visualise the data. For the purposes of the project, two servers are installed and configured. The first one is an Ubuntu server that contains the ELK stack installed using a simple Docker installation, and the second one is an MS Windows server which contains the open-source search (OSS) versions, so we can compare features and have two different testing environments to decide upon. To demonstrate and validate the ARTICONF ecosystem, four diverse and carefully selected complementary social media use cases are selected. They include (i) crowd journalism with news verification, (ii) car sharing, (iii) co-creation of financial value with video, and (iv) smart energy. ARTICONF targets a broad and diverse set of potential customers not limited to its pilot case studies. For the purpose of our proof of concept, the car-sharing use case is selected. The car-sharing use case, as an example of the sharing economy concept, allows the users to rent and share a car at any place and time through smart contracts based on blockchain. That way, a social network of users is created, so the users can interact with others and report issues directly to the company, vehicle owner or other users. This collaborative consumption will save money, reduce pollution, and increase the quality of life in general. The platform will collect and store all the information generated by the anonymous user (e.g. geolocation, social network interaction, external events like a traffic jam). The SMART framework within ARTICONF will aggregate and optimise the collected data and send them to the TAC framework. TAC analyses user data in order to classify users and set strategies to target them. Moreover, TAC will provide information about patterns, behaviours, and events for further predictive analytics. The ultimate goal for this particular use case is to boost user experience and the economic benefits for users and vehicle providers involved in the model. Using the car-sharing database schema provided by the ARTICONF use case partner, we created two demo datasets with randomly generated data, similar to the real data for this use case. In this scenario, using the simulated data, we have created a simplified dashboard, visualising some pieces of information important for the car-sharing companies, private vehicle owners and/or users. Different types of information are visualised using different types of graphs (see Fig. 2).
Some of the information that we have visualised is, for instance, statistical information on the kilometres travelled; other information relates to the most popular rating score per travel amongst the consumers. The interesting analysis
was to observe the customers’ preference for the car-sharing service: whether they prefer to rent the vehicles per kilometre or have a time-based preference, and whether they actually travel more kilometres on average. The average price balance changes per week during the year, and this trend is visible in the horizontal bars on the TAC dashboard. Furthermore, we can show external influencing factors, such as weather conditions, that affect vehicle usage and users’ habit of using the car-sharing service. Any other information can be visualised on demand.
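As an illustration of how such a weekly trend can be computed server-side, the sketch below runs a date-histogram aggregation with the Elasticsearch Python client (7.x-style request body); the index name and the field names travel_date and price are hypothetical placeholders for the demo data-set, not the project’s actual schema.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Average travel price per week over the past year (illustrative field names).
query = {
    "size": 0,
    "aggs": {
        "per_week": {
            "date_histogram": {"field": "travel_date", "calendar_interval": "week"},
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        }
    },
}
response = es.search(index="carsharing-travels", body=query)
for bucket in response["aggregations"]["per_week"]["buckets"]:
    print(bucket["key_as_string"], bucket["avg_price"]["value"])
```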
Fig. 2. TAC dashboard car-sharing implementation
The TAC car-sharing dashboard provides the following insights (see Fig. 2): • A gauge showing the average satisfaction rating score for travel; • Metrics for average km travelled compared between customers that prefer a charge of price for Km (false) or charge of price for time (true); • Four gauges showing the average, median, minimum, and maximum price for KM for travels in the past year; • Area plot showing the average, median, minimum, and maximum price for a time in the past year; • Horizontal bar charts showing the average travel prices balance per week in the past year. If the stakeholders are satisfied with this proof of concept, the development will continue with the full implementation.
5 Conclusion and Future Work The main aim of the tools for analytics and cognition (TAC) framework is to uncover which strategy for data aggregation data and group recommendation generation is most appropriate when dealing with groups that are made up of users within a specific use case. Due to a large number of activities, which users carry out as part of a group rather than individually, the TAC framework will improve collaboration amongst intelligently defined communities elaborating over the shared knowledge acquisition and learning. The ARTICONF ecosystem will develop robust tools for monitoring and reasoning social and cognitive states, which will provide social media consumers with enhanced cognitive abilities, especially under complex collaborative participation scenarios using active and automated learning methods. In this way, the augmented cognition reasoning model will improve collaboration amongst intelligently defined communities in a network of social objects elaborating over the shared knowledge, acquisition, and learning.
References 1. J. Waters, L. Lester, The everything guide to social media: all you need to know about participating in todays most popular online communities, in Karen Cooper (2010) 2. H. Harrin, Social media for project managers. Project Management Institute (2010) 3. C. Weise, The return on collaboration: assessing the value of todays collaboration (Cis. Sys., San Jose, CA, 2010) 4. K. Boulil, S. Bimonte, F. Pinet, Conceptual model for spatial data cubes: a UML profile and its automatic implementation. Comp. Stand. Inter. 38, 113–132 (2015) 5. M. Sarwat, Interactive and scalable exploration of big spatial data—a datamanagement perspective, mobile data management (MDM), In 2015 16th IEEE International Conference, Vol 1, Pittsburgh (2015), pp. 263–270 6. D. Sui, M. Goodchild, The convergence of GIS and social media: challenges forGIScience. Int. J. Geogr. Inf. Sci. 25(11), 17371748 (2011) 7. J.R. Logan, W. Zhang, H. Xu, Applying spatial thinking in social scienceresearch. Geo J. 75(1), 1527 (2010) 8. Fliphodl, in Social Media Alternatives Series, What You NEED to Know (2018) 9. K. Hwang, M. Chen, Big-Data Analytics for Cloud, IoT and Cognitive Computing, 1st ed (Wiley, London, 2017) 10. S. Manoj, A framework for big data analytics as a scalable systems. Int. J. Adv. Network. Appl. (IJANA) 72–82 (2015) 11. S. Manoj, A survey of thresholding techniques over images. A survey of thresholding techniques over images. 3(2), 461–478 (2014) 12. B. Gholami, S. Murugesan, Global IT project management using Web 2.0. Int. J. Inf. Technol. Proj. Manag. 2(3), 3052 (2011)
A Comparison of GA Crossover and Mutation Methods for the Traveling Salesman Problem Robin T. Bye(B), Magnus Gribbestad, Ramesh Chandra, and Ottar L. Osen Cyber-Physical Systems Laboratory, Department of ICT and Natural Sciences, NTNU—Norwegian University of Science and Technology, Postboks 1517, NO-6025 Ålesund, Norway [email protected]
Abstract. The traveling salesman problem is a very popular combinatorial optimization problem in fields such as computer science, operations research, mathematics and optimization theory. Given a list of cities and the distances between any city to another, the objective of the problem is to find the optimal permutation (tour) in the sense of minimum traveled distance when visiting each city only once before returning to the starting city. Because many real-world problems can be modeled to fit this formulation, the traveling salesman problem has applications in challenges related to planning, routing, scheduling, manufacturing, logistics, and other domains. Moreover, the traveling salesman problem serves as a benchmark problem for optimization methods and algorithms, including the genetic algorithm. In this paper, we examine various implementations of the genetic algorithm for solving two examples of the traveling salesman problem. Specifically, we compare commonly employed methods of partially mapped crossover and order crossover with an alternative encoding scheme that allows for single-point, multipoint, and uniform crossovers. In addition, we examine several mutation methods, including Twors mutation, center inverse mutation, reverse sequence mutation, and partial shuffle mutation. We empirically compare the implementations in terms of the chosen crossover and mutation methods to solve two benchmark variations of the traveling salesperson problem. The experimental results show that the genetic algorithm with order crossover and the center inverse mutation method provides the best solution for the two test cases. Keywords: TSP · Genetic algorithm Permutations · Inversion sequence
· Crossover · Mutation ·
This research was supported by the European Research Consortium for Informatics and Mathematics (ERCIM).
1 Introduction
The traveling salesman problem (TSP) is a very popular combinatorial optimization problem in fields such as computer science, operations research, mathematics, and optimization theory. In its most basic description, the TSP consists of a list of cities and the distances between any city to another, where the objective of the problem is to find the optimal permutation (tour) in the sense of minimum traveled distance when visiting each city only once before returning to the starting city. Moreover, the decision version of the TSP, where one must be deciding whether there exists any shorter tour than a given tour with some distance, belongs to the class of NP-complete problems, meaning that “it is possible that the worst-case running time for any algorithm for the TSP increases superpolynomially (but no more than exponentially) with the number of cities.”1 Many real-world problems can be modeled to fit the TSP formulation, and hence, the problem has applications related to planning, routing, scheduling, manufacturing, logistics, and many other domains. Moreover, the traveling salesman problem serves as a benchmark problem for optimization methods and algorithms, including the genetic algorithm (GA). The GA is an evolutionary algorithm for solving search and optimization problems and is inspired by elements in natural evolution, such as inheritance, mutation, selection, and crossover (e.g. see [10]). A GA is robust, easy to implement, and easily applicable for multi-objective optimization problems, e.g. [1,2,11]. The GA is generally attributed to Holland [15], with subsequent popularization by Goldberg [8] and is still a very popular optimization tool across many different disciplines. In our Cyber-Physical Systems Laboratory (CPS Lab),2 we have many years of experience utilizing the GA for a variety of purposes, including adaptive locomotion of a caterpillar-like robot [18], autonomous ships and dynamic positioning of tug vessels along the Norwegian coast (e.g., [3,6,7]), and as a core part of a generic software framework for intelligent computer-automated design (CautoD) [4], which was successfully applied for optimized CautoD, or virtual prototyping, of offshore cranes and winches (e.g. [5,13,14]). In this paper, we focus on solving the TSP using a GA implemented with a set of different combinations of crossover and mutation methods. For crossover, we use the partially mapped crossover (PMX), order crossover (OX), single-point crossover, multipoint crossover, and uniform crossover methods. We also test several mutation methods, namely Twors mutation, center inverse mutation (CIM), reverse sequence mutation (RSM), and partial shuffle mutation (PSM). Each combination of crossover and mutation methods are implemented and evaluated for two benchmark problems called Western Sahara (29 cities) and Djibouti (38 cities) obtained from the TSP website of the University of Waterloo, California, USA.3 1 2 3
Wikipedia: https://en.wikipedia.org/wiki/Travelling salesman problem. https://www.ntnu.no/blogger/cpslab/. http://www.math.uwaterloo.ca/tsp.
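For concreteness, a minimal helper for evaluating a candidate tour (the quantity the GA minimises) could look as follows; the Euclidean distance and the closed tour returning to the start follow the problem statement above.

```python
import math

def tour_length(cities, tour):
    """Total distance of a closed tour.

    `cities` is a list of (x, y) coordinates and `tour` a permutation of
    their indices; the route returns to the starting city.
    """
    total = 0.0
    for i in range(len(tour)):
        x1, y1 = cities[tour[i]]
        x2, y2 = cities[tour[(i + 1) % len(tour)]]   # wrap around to the start
        total += math.hypot(x2 - x1, y2 - y1)
    return total

# Example: a unit square visited in order has perimeter 4.
print(tour_length([(0, 0), (0, 1), (1, 1), (1, 0)], [0, 1, 2, 3]))  # 4.0
```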
We empirically evaluate the performance of the different GA implementations for solving the two TSP benchmark problems. The experimental results show that OX crossover with CIM mutation outperforms the other implementations. The paper is organized as follows. Section 2 describes related work. Section 3 presents the various crossover and mutation methods used in the implementation of the GA for solving the two TSPs. Section 4 presents the results and an analysis of these. Section 5 discusses the results and the analysis. Finally, Sect. 5.1 concludes the paper and provides possible future directions.
2 Related Work
Several common approaches exist for solving TSP problems, e.g. employing a GA [21], ant colony optimization [24], or artificial neural networks [20]. The TSP has real-world applications; e.g., the author in [12] developed a GA-based method for the TSP to find the optimal route for the Istanbul Electricity Tram and Tunnel Operations (IETT). Variations of the GA have proved to be very successful in obtaining good results for solving TSPs (e.g. [17]). The authors in [11] employed a GA-based approach using several recombination operators, while Goldberg et al. [9] proposed an approach to improve the GA using a PMX operator. In [22], the authors present a novel GA-based approach by introducing a new recombination operator to produce new offspring, whereas the authors in [19] proposed a new hybrid GA in which the crossover operator is improved by utilizing a local search strategy. In yet another study [23], the authors used an improved GA combining random crossover and dynamic mutation that provided better results compared to the conventional genetic algorithm for the TSP. A more recent paper by Abid Hussain et al. [16] modifies the crossover operator, examining PMX, OX, and a newly proposed operator; the authors apply the three crossover operators to TSP datasets with 42, 53, and 170 cities. Finally, in [25], Üçoluk proposed a method for alternative chromosome encoding that allows the TSP to be solved with a GA without using permutation crossover methods. According to the author, this method is supposed to perform slightly worse than other methods in terms of the solution quality but to be many times faster [25].
3 Crossover and Mutation Methods
In this paper, we have implemented two different types of crossover methods: (i) two conventional crossover methods for permutation problems (PMX and OX) and (ii) three ordinary crossover methods (single-point, two-point, and uniform crossover) normally used for non-permutation problems, enabled by the alternative encoding scheme suggested by Üçoluk [25].
3.1 Conventional Crossover Methods for Permutation Problems
We have implemented the PMX crossover method and the OX crossover method, both of which are standard crossover methods for GAs solving TSPs. Both methods introduce measures to avoid duplicates, which are not allowed in permutations. PMX Crossover The PMX crossover method is used for crossover in permutation problems. Being somewhat complex to explain purely in words, we resort to an example of PMX crossover provided by Üçoluk in his paper [25] and reproduced here in Fig. 1.
Fig. 1. Example of PMX crossover [25]
First, a single crossover point is randomly selected for both parents, 5713642 and 4627315, corresponding to two permutations of cities 1 through 7. Next, beginning with a copy of the first parent, the cities 462 before the crossover point in the second parent will take up the same gene positions in the first child. However, merely copying these genes (cities) might create duplicates, which are not allowed in a permutation. Hence, if there is a duplicate in the child occurring after the crossover point, this gene is replaced by the original city of the child occurring before the crossover point. In the example, putting city 4 in position 1 of the first child will lead to a duplicate because 4 also occurs in position 6. Hence, city 5, originally at position 1, is moved to position 6, and likewise for city 6 in position 2 and city 2 in position 3. The process repeats for the second child, but with the cities before the crossover point from the first parent (cities 571) taking up the same gene positions of the second child and the same replacement procedure to avoid duplicates.
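A short sketch of this single-point PMX variant, following the swap-based repair described above (classical PMX is usually defined with two crossover points), is given below; it reproduces the Fig. 1 example.

```python
import random

def pmx_single_point(parent1, parent2, point=None):
    """Single-point PMX as illustrated in Fig. 1.

    The child starts as a copy of parent1; the genes before the crossover
    point are taken from parent2, and each displaced city is swapped into the
    position where the incoming city previously sat, keeping a valid permutation.
    """
    size = len(parent1)
    if point is None:
        point = random.randint(1, size - 1)
    child = list(parent1)
    for i in range(point):
        incoming = parent2[i]
        j = child.index(incoming)          # where the incoming city currently lives
        child[i], child[j] = child[j], child[i]
    return child

# Fig. 1 example: parents 5713642 and 4627315, crossover point after gene 3.
print(pmx_single_point([5, 7, 1, 3, 6, 4, 2], [4, 6, 2, 7, 3, 1, 5], point=3))
```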
OX Crossover Order 1 crossover (often referred to as OX or order crossover) is also a conventional crossover method for permutation problems. This method is based on randomly selecting a section of genes within the parents, for example, the 4 middle genes. Child 1 will then directly inherit these 4 middle genes from parent 1 (into the same position in the child), while child 2 will inherit from parent 2. The remaining genes are then filled with values from the other parent. Again, since chromosomes represent permutations, it is important that there are no duplicate genes (values) in the child. Therefore, creating the child starts with looking at the index of the first non-assigned gene in the other parent. If this gene value does not exist in the child, it is copied into the child. If the value already exists in the child, the procedure continues to check the next gene of the other parent. The process can be illustrated with the example in Fig. 2.
Fig. 2. Example of order 1 crossover (OX)
Having generated one child, the process repeats with parent 1 becoming parent 2 and vice versa.
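A minimal sketch of this OX procedure, filling the unassigned positions left to right with the other parent’s cities in order while skipping duplicates, is shown below; the example cut points are arbitrary.

```python
import random

def order_crossover(parent1, parent2, cut1=None, cut2=None):
    """OX crossover: keep a slice of parent1 in place, then fill the remaining
    positions with the cities of parent2 in the order they appear, skipping
    any city already present in the child."""
    size = len(parent1)
    if cut1 is None or cut2 is None:
        cut1, cut2 = sorted(random.sample(range(size), 2))
    child = [None] * size
    child[cut1:cut2 + 1] = parent1[cut1:cut2 + 1]     # inherited section
    fill = [g for g in parent2 if g not in child]     # parent2 order, no repeats
    for i in range(size):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

# Second child: swap the roles of the parents.
p1, p2 = [1, 2, 3, 4, 5, 6, 7, 8], [8, 6, 4, 2, 7, 5, 3, 1]
print(order_crossover(p1, p2, cut1=2, cut2=5))
```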
3.2 Ordinary Crossover Methods Enabled by Alternative Encoding
While standard crossover methods such as n-point crossover or uniform crossover do not take measures to avoid duplicate genes, they can still be used if chromosomes are encoded (transformed) using the encoding scheme presented by Üçoluk [25]. Here, we first give a short revision of the alternative encoding scheme, which enables use of the three ordinary crossover methods we have examined here, namely single-point crossover, two-point crossover, and uniform crossover. Alternative Chromosome Encoding As explained by Üçoluk [25], a permutation can be represented by its inversion sequence.4 Most conveniently, and unlike the permutation itself, there are no restrictions on having duplicates in the inversion sequence. As a consequence, ordinary crossover operations can
E.g., see Wikipedia: https://en.wikipedia.org/wiki/Inversion (discrete mathematics).
be applied to chromosomes encoded as inversion sequences, as long as the inversion sequences are decoded back to permutations for which the tour distance can be calculated. An example provided by Üçoluk [25] is presented in Fig. 3, which shows this alternative chromosome encoding combined with single-point crossover.
Fig. 3. Example of single-point crossover for chromosomes encoded as inversion sequences (adapted from [25])
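The following sketch shows one way to convert between a permutation and its inversion sequence, following the definition above; cities are assumed to be numbered 1..n, and the round trip permutation → inversion sequence → permutation is the identity.

```python
def to_inversion_sequence(perm):
    """inv[i] counts how many values larger than (i+1) appear to the left of
    (i+1) in the permutation; duplicates are allowed in the result."""
    n = len(perm)
    inv = [0] * n
    for value in range(1, n + 1):
        pos = perm.index(value)
        inv[value - 1] = sum(1 for v in perm[:pos] if v > value)
    return inv

def from_inversion_sequence(inv):
    """Decode an inversion sequence back into the unique permutation it encodes."""
    n = len(inv)
    positions = [0] * n
    # Work from the largest value down: each value must have inv[i] larger
    # values to its left, which fixes its position among the remaining slots.
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            if positions[j] >= inv[i]:
                positions[j] += 1
        positions[i] = inv[i]
    perm = [0] * n
    for value, pos in enumerate(positions, start=1):
        perm[pos] = value
    return perm

p = [5, 7, 1, 3, 6, 4, 2]
assert from_inversion_sequence(to_inversion_sequence(p)) == p
```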
Single-Point Crossover Single-point crossover is a standard method for non-permutation problems, i.e., where duplicates are allowed. A random index point in the two parent chromosomes is chosen and both parents are split into two sections at this point. Child 1 takes the first section from parent 1 and the second section from parent 2. Child 2 takes the first section from parent 2 and the second section from parent 1. An example of single-point crossover is depicted in Fig. 4.
Fig. 4. Example of single-point crossover
Two-Point Crossover Two-point crossover is similar to single-point crossover, but for this method two points are selected. This means that each parent is divided into three sections. The first child will then inherit the first and last section from parent 1, and the middle section from parent 2. The second child will inherit the first and last sections from parent 2 and the middle section from parent 1 [1]. The procedure is illustrated in Fig. 5.
Fig. 5. Example of two-point crossover
Uniform Crossover Uniform crossover is different from the other methods so far. This method goes through every gene, and determines if it should be inherited from parent 1 or 2. If the probability is set to 0.5, each gene would have equal probability of being from parent 1 or parent 2 [1]. An example is shown in Fig. 6, where crossover is executed with a 0.5 probability.
Fig. 6. Example of uniform crossover
3.3 Mutation Methods
For the mutation operator, we have implemented four different mutation methods, namely Twors mutation, center inverse mutation (CIM), reverse sequence mutation (RSM), and partial shuffle mutation (PSM). Twors Mutation Twors mutation is a mutation method that also can be referred to as swap. Two genes are randomly chosen, and their positions are swapped [1]. An example of Twors mutation is shown in Fig. 7, where gene numbers 3 and 5 are randomly chosen.
Fig. 7. Example of Twors mutation
Center Inverse Mutation (CIM) The CIM method chooses one random point, which divides a chromosome into two sections. The two sections are flipped [1]. An example is shown in Fig. 8, where the random point is selected between genes 3 and 4.
Fig. 8. Example of CIM method for mutation
Reverse Sequence Mutation (RSM) The RSM method chooses two random points in the chromosome and selects the section of genes in between. The gene sequence inside the selected section is then reversed [1]. An example is shown in Fig. 9, where indexes 3 and 6 are randomly chosen.
Fig. 9. Example of RSM method for mutation
Partial Shuffle Mutation (PSM) The PSM method iterates through each gene in a chromosome. Each gene uses the mutation probability to determine if the gene should be swapped with another. If the gene is determined to be swapped, it will be swapped with another randomly chosen gene [1]. An example is shown in Fig. 10, where genes numbered 2 and 6 were determined to be swapped with genes 4 and 5, respectively.
Fig. 10. Example of PSM method for mutation
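For reference, the four mutation operators described above can be sketched as follows; the mutation probability used in PSM is an illustrative default.

```python
import random

def twors(tour):
    """Swap two randomly chosen cities."""
    a, b = random.sample(range(len(tour)), 2)
    tour = list(tour)
    tour[a], tour[b] = tour[b], tour[a]
    return tour

def cim(tour):
    """Center inverse mutation: split at a random point and flip both parts."""
    p = random.randint(1, len(tour) - 1)
    return list(reversed(tour[:p])) + list(reversed(tour[p:]))

def rsm(tour):
    """Reverse sequence mutation: reverse the cities between two random points."""
    a, b = sorted(random.sample(range(len(tour)), 2))
    return tour[:a] + list(reversed(tour[a:b + 1])) + tour[b + 1:]

def psm(tour, p_mut=0.05):
    """Partial shuffle mutation: each gene is swapped with a random one
    with probability p_mut."""
    tour = list(tour)
    for i in range(len(tour)):
        if random.random() < p_mut:
            j = random.randrange(len(tour))
            tour[i], tour[j] = tour[j], tour[i]
    return tour
```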
4 Results and Analysis
The optimal tour distances for the two datasets, Western Sahara (29 cities) and Djibouti (38 cities), are 27603 and 6656, respectively. Both datasets are freely available at the TSP website of the University of Waterloo, Canada [3]. For simplicity, we have used the same GA settings for all methods, which we found empirically to work well: maximum number of generations: 1000, stall generation limit: 200, population size: 300, crossover probability: 0.7, mutation probability: 0.05, and elitism ratio: 0.20. We ran the GA on an ordinary laptop with each combination of crossover and mutation methods presented previously, and each combination was run 10
times for each dataset to obtain the best tour distance (b.distance) and its percentage deviation from the optimal distance (b.deviation), the best running time in seconds (b.time), and the average percentage deviation (avg.deviation) and average running time in seconds (avg.time). Tables 1 and 2 summarize the results for the Western Sahara dataset, whereas Tables 3 and 4 summarize the results for the Djibouti dataset. Moreover, results using the conventional crossover methods for TSP, namely PMX and OX, are shown in Tables 1 and 3, whereas results using the alternative chromosome encoding suggested by Üçoluk [25], which enables single-point (1p), two-point (2p), and uniform crossovers, are shown in Tables 2 and 4.

Table 1. Western Sahara dataset with PMX and OX crossover

Crossover   Mutation   b.distance   b.deviation   b.time   avg.deviation   avg.time
PMX         TWORS      28,864.92    4.37          32       19.68           24
PMX         CIM        27,603       0             36       2.80            30
PMX         PSM        27,603       0             28       15.78           32
PMX         RSM        27,603       0             34       1.80            28
OX          TWORS      27,748.71    0.52          29       11.52           23
OX          CIM        27,603       0             22       0.93            24
OX          PSM        27,603       0             20       8.05            30
OX          RSM        27,748.71    0.52          25       2.36            19
Table 2. Western Sahara dataset with single-point, two-point, and uniform crossovers

Crossover   Mutation   b.distance   b.deviation   b.time   avg.deviation   avg.time
1p          TWORS      28,972.52    4.727         67       15.47           76
1p          CIM        27,603       0             118      8.32            85
1p          PSM        27,603       0             61       7.42            62
1p          RSM        27,603       0             51       1.95            56
2p          TWORS      28,373.51    2.7156        42       13.24           54
2p          CIM        28,256.09    2.3113        63       10.59           79
2p          PSM        27,748.71    0.525         66       16.8            69
2p          RSM        27,748.71    0.525         43       2.67            53
Uniform     TWORS      28,189.75    2.0814        67       8.84            63
Uniform     CIM        27,748.71    0.5251        134      8.55            89
Uniform     PSM        27,603       0             89       4.87            110
Uniform     RSM        27,603       0             89       2.45            110
Table 3. Djibouti dataset with PMX and OX crossovers

Crossover   Mutation   b.distance   b.deviation   b.time   avg.deviation   avg.time
PMX         TWORS      8236.96      19.19         40       28.7            41
PMX         CIM        6758.28      1.51          49       5.31            60
PMX         PSM        7605.59      12.48         71       25.42           60
PMX         RSM        6656         0.0           48       6.92            46
OX          TWORS      7290.80      8.71          54       20.06           37
OX          CIM        6656         0             54       1.80            42
OX          PSM        7532.99      11.64         46       23.02           49
OX          RSM        6656         0             34       3.26            36
Table 4. Djibouti dataset with single-point, two-point, and uniform crossovers

Crossover   Mutation   b.distance   b.deviation   b.time   avg.deviation   avg.time
1p          TWORS      7415.21      10.24         109      25.97           112
1p          CIM        9011.86      26.14         163      34.06           132
1p          PSM        8134.72      18.18         166      24.31           144
1p          RSM        6656         0             156      5.96            132
2p          TWORS      7415.61      10.24         149      23.13           119
2p          CIM        7594.54      12.36         143      24.46           150
2p          PSM        7363.26      9.61          139      21.28           146
2p          RSM        6656         0             119      3.01            140
Uniform     TWORS      7688.32      13.43         154      21.87           119
Uniform     CIM        7421.81      10.32         165      21.42           141
Uniform     PSM        7313.06      8.98          166      21.70           161
Uniform     RSM        6656         0             161      3.89            149
4.1 Analysis
For the Western Sahara dataset with 29 cities and conventional crossover methods for TSP (Table 1), PMX crossover with all mutation methods but the Twors methods resulted in the optimal solution, whereas OX crossover resulted in the optimal solution for the CIM and PSM mutation methods. The average computational running time ranged from 24 to 32 s for these combinations of crossover and mutation methods. Using the alternative chromosome encoding (Table 2), single-point crossover with all mutation methods apart from Twors resulted in the optimal solution, as did uniform crossover with the PSM and RSM mutation methods. The average computational time ranged from 56 to 110 seconds for these combinations of crossover and mutation methods. For the Djibouti dataset with 38 cities and conventional crossover methods for TSP (Table 3), PMX and OX crossover both resulted in the optimal solution
when combined with the RSM mutation method. The average computational running time was 46 and 36 seconds, respectively, for these two combinations. Using the alternative chromosome encoding (Table 4), single-point, two-point, and uniform crossovers all resulted in the optimal solution when combined with the RSM mutation method. The average computational time was 132, 140, and 149 seconds, respectively, for these three combinations of crossover and mutation methods. The only combinations of crossover and mutation methods that found the optimal solution for both datasets were (i) PMX, (ii) single-point, and (iii) uniform crossover with RSM mutation, and (iv) OX crossover with CIM mutation. These four combinations had average running times of (i) 28 and 46 s, (ii) 56 and 132 s, (iii) 110 and 149 s, and (iv) 24 and 42 s. Their average percentage deviation from the optimal distance for the 10 runs was (i) 1.80 and 6.92, (ii) 1.95 and 5.96, (iii) 2.45 and 3.89, and (iv) 0.93 and 1.80. Hence, for the aforementioned GA settings used here, taking into account the ability both to find the optimal solution and to find near-optimal solutions, as well as running time, the OX crossover with CIM mutation appears to be the best choice. Using PMX with RSM mutation is slightly worse when comparing running times, and particularly also when comparing the average deviation for the largest dataset, Djibouti. The two combinations using the alternative encoding scheme have much longer running times than the two best combinations (around 2–4 times higher), and also higher average deviations for the Western Sahara dataset. For the Djibouti dataset, these two alternative encoding combinations yield better results than PMX with RSM for average percentage deviation, but worse than OX with CIM. The best routes found by OX crossover and CIM mutation with corresponding cost function convergence plots for the two datasets are shown in Fig. 11.
5 Discussion and Conclusions
It can be observed from the results above that when the dataset becomes more difficult (more cities), fewer combinations of crossover and mutation methods were able to find the optimal solution. When considering average percentage deviation as well as average running time, the GA with OX crossover and CIM mutation was the best choice. However, it is important to note that this claim is only valid for the two datasets and the choice of GA settings described previously. It may be that for other settings of population size, crossover and mutation probability, and elitism ratio, different results are obtained. Nevertheless, we have examined by trial and error a number of other GA settings without observing any clear counter-indications that the results do not generalize. Regarding the alternative encoding scheme proposed by Üçoluk [25], we were unable to reproduce the claim that this method should lead to a significant speedup with only slightly worse performance (in terms of finding optimal or near-optimal solutions). On the contrary, we found the alternative encoding scheme to be typically 2–4 times slower than using conventional crossover methods.
Fig. 11. Best routes and convergence plots using OX with CIM
We have based our implementation (which was coded in Python 3) for finding the inversion sequences and converting back to permutations on the pseudocode provided by Üçoluk [25]. Still, there might be parts of the conversion process in our implementation that cause this particular operation to be slower than in the case of Üçoluk.
5.1 Conclusions
The TSP is a popular NP-hard combinatorial optimization benchmark problem with many real-world applications. In this paper, we have examined a GA implemented with various combinations of crossover and mutation methods for solving two datasets with 29 and 38 cities. We examined both conventional crossover methods for the TSP (PMX and OX) and an alternative chromosome encoding scheme using inversion sequences that enabled ordinary crossover methods (single-point, two-point, and uniform crossovers). All these crossover methods were tested with four different mutation methods (Twors, CIM, PSM, and RSM). For the two datasets and the population size, crossover
probability, and mutation probability that we tested, OX crossover with CIM mutation was the best method. Going forward, it would be interesting to investigate whether our code can be optimized (or not) regarding the alternative encoding, as well as examine larger datasets for further comparisons of GA crossover and mutation methods for TSPs. Acknowledgments. This research was supported by the European Research Consortium for Informatics and Mathematics (ERCIM), which provided funding to Ramesh Chandra for his postdoctoral fellowship at the CPS Lab at NTNU, Ålesund, Norway.
References 1. O. Abdoun, J. Abouchabaka, in A comparative study of adaptive crossover operators for genetic algorithms to resolve the traveling salesman problem. arXiv preprint arXiv:1203.3097 (2012) 2. O. Abdoun, J. Abouchabaka, C. Tajani, in Analyzing the performance of mutation operators to solve the travelling salesman problem. arXiv preprint arXiv:1203.3099 (2012) 3. R.T. Bye, A receding horizon genetic algorithm for dynamic resource allocation: a case study on optimal positioning of tugs, in Computational Intelligence. (Springer, Berlin, 2012), pp. 131–147 4. R.T. Bye, O.L. Osen, B.S. Pedersen, I.A. Hameed, H.G. Schaathun, A software framework for intelligent computer-automated product design, in Proceedings of the 30th European Conference on Modelling and Simulation (ECMS ’16) (June 2016), pp. 534–543 5. R.T. Bye, O.L. Osen, W. Rekdalsbakken, B.S. Pedersen, I.A. Hameed, An intelligent winch prototyping tool, in Proceedings of the 31st European Conference on Modelling and Simulation (ECMS ’17) (May 2017), pp. 276–284 6. R.T. Bye, H.G. Schaathun, Evaluation heuristics for tug fleet optimisation algorithms: a computational simulation study of a receding horizon genetic algorithm, in Proceedings of the 4th International Conference on Operations Research and Enterprise Systems (ICORES ’15) (2015), pp. 270–282 (Selected for extended publication in Springer book series Communications in Computer and Information Science (CCIS)) 7. R.T. Bye, H.G. Schaathun, An improved receding horizon genetic algorithm for the tug fleet optimisation problem, in Proceedings 28th European Conference on Modelling and Simulation (ECMS 2014), Brescia, Italy, May 27–30, 2014 (ECMS European Council for Modelling and Simulation, 2014) 8. D.E. Goldberg, Genetic algorithms in search, in Optimization and Machine Learning (Addison-Wesley Longman Publishing Co., Inc, Boston, MA, USA, 1989) 9. D.E. Goldberg, R. Lingle, et al., Alleles, loci, and the traveling salesman problem, in Proceedings of an International Conference on Genetic Algorithms and Their Applications, vol. 154 (Lawrence Erlbaum, Hillsdale, NJ, 1985), pp. 154–159 10. E.D. Goodman, Introduction to genetic algorithms, in Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation (GECCO Comp ’14) ( ACM, New York, NY, USA, 2014), pp. 205– 226
11. J. Grefenstette, R. Gopal, B. Rosmaita, D. Van Gucht, Genetic algorithms for the traveling salesman problem, in Proceedings of the First International Conference on Genetic Algorithms and their Applications, vol. 160 (1985), pp. 160–168 12. U. Hacizade, I. Kaya, Ga based traveling salesman problem solution and its application to transport routes optimization. IFAC-PapersOnLine 51(30), 620–625 (2018) 13. I.A. Hameed, R.T. Bye, O.L. Osen, O.L., Pedersen, B.S., Schaathun, H.G.: Intelligent computer-automated crane design using an online crane prototyping tool, in Proceedings of the 30th European Conference on Modelling and Simulation (ECMS’16) (June 2016), pp. 564–573 (Best Paper Nominee) 14. I.A. Hameed, R.T. Bye, B.S. Pedersen, O.L. Osen, Evolutionary winch design using an online winch prototyping tool, in Proceedings of the 31st European Conference on Modelling and Simulation (ECMS’17) (May 2017), pp. 292–298 15. J.H. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence (University of Michigan Press, Oxford, England, 1975) 16. A. Hussain, Y.S. Muhammad, M. Nauman Sajid, I. Hussain, A. Mohamd Shoukry, S. Gani, Genetic algorithm for traveling salesman problem with modified cycle crossover operator. Comput. Intell. Neurosci. 2017 (2017) 17. P. Larranaga, C.M.H. Kuijpers, R.H. Murga, I. Inza, S. Dizdarevic, Genetic algorithms for the travelling salesman problem: a review of representations and operators. Artif. Intell. Rev. 13(2), 129–170 (1999) 18. G. Li, H. Zhang, J. Zhang, R.T. Bye, Development of adaptive locomotion of a caterpillar-like robot based on a sensory feedback CPG model. Adv. Robot. 28(6), 389–401 (2014) 19. B.L. Lin, X. Sun, S. Salous, Solving travelling salesman problem with an improved hybrid genetic algorithm. J. Comput. Commun. 4(15), 98–106 (2016) 20. S. Mirjalili, Evolutionary multi-layer perceptron, in Evolutionary Algorithms and Neural Networks (Springer, Berlin, 2019), pp. 87–104 21. N.M. Razali, J. Geraghty, et al., Genetic algorithm performance with different selection strategies in solving TSP, in Proceedings of the World Congress on Engineering, vol. 2 (International Association of Engineers, Hong Kong, 2011) 22. L.D. Whitley, T. Starkweather, D. Fuquay, Scheduling problems and traveling salesmen: the genetic edge recombination operator. ICGA 89, 133–40 (1989) 23. J. Xu, L. Pei, R. Zhu, Application of a genetic algorithm with random crossover and dynamic mutation on the travelling salesman problem. Proc. Comput. Sci. 131, 937–945 (2018) 24. J. Yang, X. Shi, M. Marchese, Y. Liang, An ant colony optimization method for generalized tsp problem. Progr. Nat. Sci. 18(11), 1417–1422 (2008) ¨ coluk, Genetic algorithm solution of the TSP avoiding special crossover and 25. G. U¸ mutation. Intell. Autom. Soft Comput. 8(3), 265–272 (2002)
Comparison-Based Study to Predict Breast Cancer: A Survey Ankit Grover, Nitesh Pradhan(B) , and Prashant Hemrajani Manipal University Jaipur, Jaipur, Rajasthan 303007, India [email protected], {nitesh.pradhan,prashant.hemrajani}@jaipur.manipal.edu
Abstract. Cancer is among the most dangerous diseases and can be fatal if not treated in time. Breast cancer is the second most common cancer in women after lung cancer; its early detection is therefore of utmost importance. Machine learning plays an important role in predicting breast cancer at an early stage. In this paper, the authors present a comparative study of breast cancer prediction on the Breast Cancer Wisconsin Diagnostic dataset using six different machine learning algorithms: CART, logistic regression, support vector classifier, hard voting classifier, Extreme Gradient Boosting, and an artificial neural network. Various metrics are used for model evaluation, with accuracy treated as one of the most important factors, since highly accurate models can help doctors detect the presence of breast cancer more reliably. Keywords: CART · Hard voting classifier · Support vector classifier · Extreme Gradient Boosting · Artificial neural networks · Logistic regression
1 Introduction
Breast cancer is the second leading cause of cancer death in women, after lung cancer. The likelihood of a woman dying from breast cancer is 1 in 38 (2.6%) [1]. It is also 100 times more common in women than in men, and only 5–10% of breast cancers are inherited [2]. In medical practice, several methods have been adopted for the treatment of breast cancer; the most commonly used treatments are chemotherapy, targeted therapy, hormone therapy, and surgery, among others. However, despite treatment, there are several undesirable side effects such as recurrence of the tumor, nausea, constipation, shortness of breath, numbness, and joint and muscle pain. Machine learning plays an important role in diagnosing cancer and is broadly divided into supervised, unsupervised, and reinforcement learning classes [3]. These classes of methods are highly effective for prediction, clustering, and classification of various data sources. Classification mainly deals with assigning data such as images, audio, text, or numeric records to one or more categories after creating meaningful representations of the data through feature extraction, on which the models are applied to obtain the result.
With various machine learning techniques being implemented, there is a need for proper benchmarking of the methods in use, so as to provide insights into efficient computation and higher rates of cancer detection for practical use. The main aim of this paper is to implement common techniques and identify the best-performing models. The remaining sections are organized as follows: Section 2 describes related work, with a comparison of various algorithms on the original Wisconsin Diagnostic dataset. Section 3 describes the dataset used, the data preprocessing, and the algorithms applied, along with their architectures. Section 4 describes the system specification and compares the precision, recall, and accuracy of the applied algorithms. Section 5 concludes the work and outlines the future scope of the research.
2 Related Work
Breast cancer originates in breast tissue, and over 80% of breast cancers are found when a woman feels a lump. As per the National Cancer Institute's (NCI) report [1], 12.8% of women born in the USA today will develop breast cancer at some point during their lives, so it is very important to predict breast cancer at an early stage. Many machine learning algorithms have already been applied to predict breast cancer, such as CART, logistic regression, support vector classifier (SVC), hard voting classifier (HVC), Extreme Gradient Boosting (XGBOOST), and multilayer neural networks [4]. All the mentioned algorithms are applied to the Wisconsin Diagnostic dataset, which contains data of nearly 700 patients described by eleven attributes. Table 1 depicts a comparison-based study of the existing techniques: for each work it lists the year and reference, the technique(s)/algorithm(s) used, the advantage(s) of the technique, and the drawbacks of the existing technique.
3 Methodology
3.1 Hyperparameters
Optimizer: The 'Adam' [14] (adaptive moment estimation) optimizer has been used. It is a popular choice for optimization in many deep learning models, having high efficiency and low memory requirements. The default hyperparameters have been used.
Weight Initializer: The 'uniform' initializer returns a tensor initialized from a random uniform distribution. This has been done to prevent common gradient problems such as exploding or vanishing gradients.
Activation Function
Table 1. Comparative analysis of existing techniques

Years     | Techniques                                   | Advantages                                                              | Disadvantages
2016 [4]  | LR and BPNN                                  | BPNN exceeds accuracy of 93% with only 240 features                     | LR model utilizes large number of features
2018 [5]  | Adaptive voting ensemble of ANN and LR       | Combination of two efficient models with 98.5% accuracy                 | Extensive univariate feature selection to minimize training costs
2014 [6]  | Naïve Bayes                                  | A user-friendly GUI, which is reliable and scalable for web-based services | Average accuracy of 93%
2019 [7]  | C5 decision trees, XGBOOST                   | XGBOOST providing high accuracy of 97.71%                               | Training time of XGBOOST is long and it is prone to over-fitting
2018 [8]  | LR                                           | Simple model, highly interpretable, with accuracy of 98.5%              | Better feature extraction techniques required
2018 [9]  | GRU-SVM, kNN, MLP, Softmax Regression, SVM   | MLP producing a test accuracy of 99.038%                                | No cross-validation techniques applied for model testing
2013 [10] | C4.5 decision trees, SVM, ANN                | Comparison of various methods                                           | More features required for better accuracy
2013 [11] | KNN                                          | A high accuracy score of 98.70% for k = 1                               | KNN is a lazy algorithm and, depending on the features and dataset size, might be time consuming
2019 [12] | kNN, SVM, Naïve Bayes, NN                    | Detailed study of feature selection and its impact on precision score   | Minimal stress on model accuracy
2011 [13] | ANN                                          | Extensive analysis of NN and its performance with 85.5% accuracy        | Huge amount of training data and training time

1. ReLU: The rectified linear unit is the most widely used activation function in deep learning tasks. It is nonlinear and computationally fast, and it does not activate all the neurons at the same time. It rectifies the input based on a given threshold. Equation (1) gives the ReLU function:

f(x) = max(0, x)   (1)
2. Leaky ReLU: The LeakyReLU function is a modified form of ReLU. It introduces a small gradient for negative inputs instead of setting them to zero. This usually helps convergence, by tuning the hyperparameter α, and also helps to solve the problem of dead neurons. An α value of 0.6 has been used here. Equation (2) gives the LeakyReLU function:

f(x) = αx, if x < 0;  f(x) = x, if x ≥ 0   (2)
3. Sigmoid: The sigmoid activation function is generally used for binary classification tasks. It takes input values in (−∞, +∞) and returns values between 0 and 1. Equation (3) defines the sigmoid activation function:

f(x) = 1/(1 + e^(−x))   (3)
3.2 Data Preprocessing
The dataset used has some missing values, so data preprocessing is performed on the Wisconsin Diagnostic dataset. The sample code number column, which contains the patient ID numbers, has been dropped because it adds no significance to the model, thus removing an unnecessary feature. The remaining attributes are indexed from 0 to 8, and index 9 denotes the target variable. The Bare Nuclei column has missing values recorded as the string '?', which need to be imputed; 16 such instances of missing values were found in this feature. These missing values were replaced by the most frequently occurring (mode) value of the feature.

3.3 Algorithms Used
The problem considered here is a classification problem in which patients are to be classified into two categories, namely benign and malignant. The authors have used the following algorithms, each with its respective parameter tuning, to obtain an efficient result.
(a) CART: This refers to classification and regression trees, commonly called decision trees. Decision trees are useful for classification based on the 'gini' criterion [15]. Equation (4) defines the Gini impurity index:

Gini = 1 − Σ_{i=1}^{c} (P_i)^2   (4)
(b) Logistic Regression: Logistic regression (LR) is another efficient algorithm for binary classification that does not easily over-fit the dataset. The authors have used the 'balanced' setting for class weights to prevent inaccurate predictions on the imbalanced dataset; it automatically adjusts the weights inversely proportional to the class frequencies, and these weights are then multiplied with the sample weights passed to the logistic function. (c) Support Vector Classifier: The support vector classifier (SVC) is another efficient algorithm for classification tasks. Here, SVC uses the linear kernel to prevent over-fitting; the linear kernel can be used for binary classification problems. SVC uses support vectors in the feature space to decide the class of a new sample. (d) Hard Voting Classifier: The hard voting classifier (HVC) is an ensemble that combines the predictive capacities of conceptually similar or different machine learning classifiers and classifies using majority voting. In the hard voting method, the prediction relies on the frequency of the class labels predicted by the algorithms in the ensemble rather than on probabilistic outputs; the class label y is predicted via majority voting over the individual classifiers.
(e) Extreme Gradient Boosting: Extreme Gradient Boosting (XGBOOST) is an ensemble technique relying on gradient-boosted trees. It creates new trees sequentially, using the gradient of the loss to improve the predictions of each subsequent tree. It has several features such as sparsity awareness (automatic handling of missing values), parallel learning, continued training for further boosting, and a number of parameters for high performance and efficient computing. A learning rate of 0.09 has been used for faster convergence. (f) Artificial Neural Network: An artificial neural network (ANN) is a computing system inspired by the human brain. An ANN is a combination of three kinds of layers: input, hidden, and output. Neurons in the input layer correspond to the features of the given problem, the hidden layers adjust the weights of the network, and the output layer neuron predicts whether a patient is suffering from breast cancer or not. An ANN relies on forward propagation, where the input passes through the activation functions of each layer, and on backward propagation, where the error between the predicted and actual target values is calculated as their difference and sent back across the network so that the neuron weights can be updated. To achieve this task, the authors train the ANN for 300 epochs, with the training data divided into batches of size 64. The neural network architecture consists of an input layer of 8 neurons with ReLU as the activation function, followed by a hidden layer of 8 neurons with LeakyReLU (α = 0.6) as the activation function, as shown in Fig. 1. This is followed by the final layer consisting of 1 neuron with the sigmoid activation function. All kernel initializers are randomly initialized. The Adam optimizer has been used with binary cross-entropy as the loss function. The LeakyReLU activation function provides a small gradient for negative inputs when units are not active. A minimal code sketch of this preprocessing and network configuration is given below.
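The sketch below is only an illustration of the configuration described in Sects. 3.2 and 3.3(f), using the scikit-learn and Keras versions mentioned later in Sect. 4.2; the data file name, column positions (ID first, Bare Nuclei at index 6, class label 2/4 last), the random seed, and the 70:30 split call are assumptions rather than the authors' code.

```python
# Illustrative sketch (not the authors' code): preprocessing of Sect. 3.2 and the
# 8-8-1 ANN of Sect. 3.3(f). File name and column layout are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LeakyReLU

df = pd.read_csv("breast-cancer-wisconsin.data", header=None, na_values="?")
df = df.drop(columns=[0])                      # drop the sample code number (ID) column
df[6] = df[6].fillna(df[6].mode()[0])          # Bare Nuclei: impute '?' values with the mode

X = df.iloc[:, :-1].values
y = (df.iloc[:, -1] == 4).astype(int).values   # class column: 4 = malignant, 2 = benign

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=42)
scaler = StandardScaler().fit(X_tr)            # standardize: zero mean, unit variance
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

model = Sequential([
    Dense(8, activation="relu", kernel_initializer="uniform", input_shape=(X.shape[1],)),
    Dense(8, kernel_initializer="uniform"),
    LeakyReLU(alpha=0.6),                      # hidden layer with LeakyReLU (alpha = 0.6)
    Dense(1, activation="sigmoid", kernel_initializer="uniform"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_tr, y_tr, epochs=300, batch_size=64, verbose=0)
print("test accuracy:", model.evaluate(X_te, y_te, verbose=0)[1])
```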
4 Implementation
4.1 About Dataset
The dataset used to perform the analysis in this manuscript is the Wisconsin Diagnostic dataset [16]. Each sample has eleven attributes: ten input features and one output feature (the label). The input features essentially describe the symptoms of a patient, and the output feature is the binary diagnosis, i.e., whether the patient has breast cancer or not. The Wisconsin Diagnostic dataset of 699 patients with eleven attributes has been used, and the data is divided into training and testing sets in the ratio 70:30; that is, about 490 patients' records are used as the training set and the remaining 210 patients' records as the testing set.
4.2 System Specification
The experiments were performed on Google Colab using Jupyter notebooks with Python 3. Scikit-learn 0.20 and Keras 2.2.5 with the TensorFlow backend have been used for the machine learning algorithms. Google Colab provides a Tesla K80 GPU with 2496 CUDA cores, along with a single-core hyper-threaded CPU (1 core, 2 threads) and 12.6 GB RAM.
Fig. 1. Neural network architecture
4.3 Result and Discussion
The following results were obtained after applying the mentioned algorithms to the above dataset. The models are compared based on precision, recall, F1 measure, and test accuracy. Here, TP stands for true positives, TN for true negatives, FP for false positives, and FN for false negatives. Table 2 summarizes the performance of each model, namely CART, LR, SVC, HVC, XGBOOST, and ANN, obtained after testing on 210 instances, reporting their precision, recall, F1 measure, and accuracy on the Wisconsin Diagnostic dataset.

Table 2. Resultant analysis of applied algorithms

Algorithms | TP  | TN | FP | FN | Precision | Recall | F1 measure | Accuracy
CART       | 124 | 73 | 11 | 2  | 91.85     | 98.41  | 95.01      | 93.80
LR         | 128 | 71 | 7  | 4  | 94.81     | 96.96  | 95.87      | 94.76
SVC        | 129 | 70 | 6  | 5  | 95.56     | 96.26  | 95.52      | 94.76
HVC        | 128 | 72 | 7  | 3  | 94.81     | 97.70  | 96.23      | 95.23
XGBOOST    | 130 | 73 | 5  | 2  | 96.29     | 98.48  | 97.37      | 96.67
ANN        | 129 | 71 | 6  | 4  | 95.55     | 96.99  | 96.76      | 95.23
Precision = TP/(TP + FP)   (5)
Recall = TP/(TP + FN)   (6)
F1 Measure = 2 ∗ (Precision ∗ Recall)/(Precision + Recall)   (7)
Accuracy = (TP + TN)/(TP + TN + FP + FN)   (8)
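As a quick worked example of Eqs. (5)-(8), the snippet below recomputes the metrics of the XGBOOST row of Table 2 directly from its confusion-matrix counts (TP = 130, TN = 73, FP = 5, FN = 2) and reproduces the reported values up to rounding.

```python
# Recompute Eqs. (5)-(8) from the confusion-matrix counts of the XGBOOST row of Table 2.
tp, tn, fp, fn = 130, 73, 5, 2

precision = tp / (tp + fp)                           # Eq. (5)
recall = tp / (tp + fn)                              # Eq. (6)
f1 = 2 * precision * recall / (precision + recall)   # Eq. (7)
accuracy = (tp + tn) / (tp + tn + fp + fn)           # Eq. (8)

print(f"precision={precision:.2%} recall={recall:.2%} F1={f1:.2%} accuracy={accuracy:.2%}")
# precision=96.30% recall=98.48% F1=97.38% accuracy=96.67%
```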
In addition to the above table, the authors have also plotted the receiver operating characteristic (ROC) curves, shown in Fig. 2, with the respective area under the curve (AUC) scores. Only HVC is omitted, because hard voting does not produce predicted probabilities.
Fig. 2. Comparison of different classifiers
5 Conclusion
From the above study, it is concluded that XGBOOST, followed by ANN, is best suited for accurately predicting the presence of breast cancer. Gradient-boosting techniques perform extremely well in such tasks and hold up against ANNs, which typically require more data samples. Moreover, the time complexity and space complexity of the models need to be evaluated before deployment. Care also needs to be taken to check for bias in the models and to perform out-of-sample testing before deploying the solution, which would further consolidate the findings. Finally, larger datasets with more features would enable more accurate analysis and provide the opportunity for more feature engineering, whereas the present case works with a limited feature set.
Bibliography 1. R.L. Siegel, K.D. Miller, A. Jemal, Cancer statistics. CA Cancer J. Clinicians 69(1), 7–34 (2019) 2. A.-M. Noone, K.A. Cronin, S.F. Altekruse, N. Howlader, D.R. Lewis, V.I. Petkov, L. Penberthy, Cancer incidence and survival trends by subtype using data from the surveillance epidemiology and end results program, 1992–2013. Cancer Epidemiol. Prevent. Biomarkers 26(4), 632–641 (2017) 3. B. Settles, in Active learning literature survey. Technical Report, University of WisconsinMadison Department of Computer Sciences (2009) 4. M.R. Al-Hadidi, A. Alarabeyyat, M. Alhanahnah, Breast cancer detection using k-nearest neighbor machine learning algorithm, in 2016 9th International Conference on Developments in eSystems Engineering (DeSE) (Aug 2016), pp. 35–39 5. N. Khuriwal, N. Mishra, Breast cancer diagnosis using adaptive voting ensemble machine learning algorithm, in 2018 IEEMA Engineer Infinite Conference (eTechNxT). IEEE (2018), pp. 1–5 6. S. Kharya, S. Agrawal, S. Soni, Naive Bayes classifiers: a probabilistic detection model for breast cancer. Int. J. Comput. Appl. 92(10), 0975–8887 (2014) 7. S.D. Narvekar, A. Patil, J. Patil, S. Kudoo, Prognostication of breast cancer using data mining and machine learning. Int. J. Adv. Res. Ideas Innov. Technol. 5(2), 921–924 8. L. Liu, Research on logistic regression algorithm of breast cancer diagnose data by machine learning, in 2018 International Conference on Robots & Intelligent System (ICRIS). IEEE (2018), pp. 157–160 9. A.F.M. Agarap, On breast cancer detection: an application of machine learning algorithms on the wisconsin diagnostic dataset, in Proceedings of the 2nd International Conference on Machine Learning and Soft Computing (ACM, 2018), pp. 5–9 10. A. Lg, A.T. Eshlaghy, A. Poorebrahimi, M. Ebrahimi, R. Ar, Using three machine learning techniques for predicting breast cancer recurrence (2013) 11. S.A. Medjahed, T.A. Saadi, A. Benyettou, Breast cancer diagnosis by using k-nearest neighbor with different distances and classification rules. Int. J. Comput. Appl. 62(1) (2013) 12. Prateek, Breast cancer prediction: importance of feature selection, in Advances in Computer Communication and Computational Sciences (Springer, 2019), pp. 733–742 13. I. Saritas, Prediction of breast cancer using artificial neural networks. J. Med. Syst. 36(5), 2901–2907 (2012) 14. D.P. Kingma, J. Ba, Adam: a method for stochastic optimization (2014). arXiv preprint arXiv: 1412.6980 15. L.E. Raileanu, K. Stoffel, Theoretical comparison between the gini index and information gain criteria. Ann. Math. Artif. Intell. 41(1), 77–93 (2004) 16. A. Antos, B. Kégl, T. Linder, G. Lugosi, Data-dependent margin-based generalization bounds for classification. J. Mach. Learn. Res. 3, 73–98 (2002)
Software Quality Prediction Using Machine Learning Techniques Somya Goyal1(B) and Pradeep Kumar Bhatia2 1 Department of Computer and Communication Engineering, Manipal University Jaipur, Dehmi
Kalan, Jaipur, Rajasthan 303007, India [email protected] 2 Guru Jambheshwar University of Science & Technology, Hisar, Haryana 125001, India [email protected]
Abstract. Software quality prediction is one of the most challenging tasks in the development and maintenance of software. Machine learning (ML) is widely being incorporated to predict the quality of the final product in the early development stages of the SDLC. An ML prediction model is built using software metrics and fault data of previous projects to detect the high-risk modules of future projects, so that testing efforts can be targeted at those specific 'risky' modules. Hence, ML-based predictors help detect development anomalies early and inexpensively and ensure the timely delivery of a successful, failure-free, and high-quality software product within budget. This paper presents a comparison of 30 software quality prediction models built with five ML techniques (artificial neural network, support vector machine, decision tree, k-nearest neighbor, and Naïve-Bayes classifiers) using six datasets. These models exploit the predictive power of static code metrics, the McCabe complexity metrics, for quality prediction. All thirty predictors are compared using ROC, AUC, and accuracy as performance evaluation criteria. The results show that the ANN technique is the most promising for accurate quality prediction, irrespective of the dataset used. Keywords: ANN · Classification tree · Machine learning · Software quality
1 Introduction
Software of high quality, which meets the user's needs and requirements and performs as expected, has always been in demand. The software must ensure failure-free performance, that is, the reliability of the product. Many quality attributes and metrics, together with numerous quality assurance techniques, have been developed, but the question of how to ensure that the resulting product will possess good quality remains an open problem. The early detection of failure-prone modules directly correlates with the quality of the end product. Fault prediction involves the early detection of those 'risky' modules of the software which are prone to errors, impair the quality, and will surely incur heavy development (testing) and maintenance cost. The early detection of faulty (buggy) modules improves the effectiveness of quality enhancement activities
and ultimately improves the overall quality. Software fault prediction is used as an indicator of software quality for two reasons: (1) Quality is inversely proportional to failures, which in turn are caused by faults (development anomalies). To ensure high quality, failures must be reduced to zero, which can be achieved only by 100% detection and removal of defects from the modules; hence, the accuracy of defect prediction is the most crucial factor in judging the quality of a software product. (2) Early fault detection gives the development team the decisive power to strategically allocate testing resources: quality improvement activities can be applied intensively to the detected risky modules [1], scheduling can be done more effectively to avoid delays, and failure-prone modules can be prioritized for testing. Ultimately, quality can be improved and ensured by predicting faults in the early phases of the development cycle. If faulty modules are not detected in the early development phases, the cost of fixing a defect increases multifold. Along with the increased cost, the chances of the defect being discovered by the customer in the live environment also increase. If a defect surfaces in the live environment, it may stop operational procedures, which can eventually have fatal consequences. The software industry has already witnessed such failures, for example NASA's Mars Climate Orbiter (MCO) spacecraft, worth $125 million, lost in space due to a small data conversion bug [2]. Hence, the more accurate the fault prediction, the more precise the quality prediction. The present work is focused on the following research goals: • R1: To transform the software quality prediction problem into a learning (classification) problem. • R2: To create ML prediction models using static code metrics as predictors. • R3: To evaluate the accuracy of the prediction models empirically. • R4: To find which ML technique outperforms the other ML techniques. In this paper, (1) the software quality prediction problem is formulated as a two-class classification problem: a few features (attributes) from previous project datasets are used as predictors, and the quality of a module is generated as the response by the classifier in terms of the predicted class label. (2) In total, 30 models are built for software quality prediction using 5 ML techniques (ANN, SVM, Naïve-Bayes classifier, DT, and k-nearest neighbor) over six datasets; each model is validated using tenfold cross-validation. (3) An empirical comparison among the developed prediction models is made using ROC, AUC, and accuracy as performance evaluation criteria. The rest of the paper is organized as follows. Section 2 gives a detailed description of the experimental setup. The details of the datasets, metrics, methods, and performance evaluation criteria are included in Sect. 3 as the research methodology. Section 4 covers the results and discussion. The work is concluded in Sect. 5, with remarks on future work.
2 Experimental Setup
This section covers the experimental setup (depicted in Fig. 1) for our work, in which the effectiveness of the 30 proposed models is compared. Five ML techniques (ANN, SVM, DT, Naïve-Bayes, kNN) and six datasets from the PROMISE repository are employed to develop 30 models that predict whether a module is 'buggy' or 'clean.' The steps taken to carry out the entire research include data collection, feature extraction and selection, building the prediction models, validating the developed models, predicting software quality using the validated classifiers, and evaluating the classifiers to make a comparison among them. The complete experimental flow is modeled in Fig. 1. A total of 30 (6 datasets × 5 ML classification techniques) distinct prediction models are built, validated, and compared in this paper.
Fig. 1. Experimental setup
3 Research Methodology
This section covers the steps followed to carry out this research work, with necessary references to previous related studies in the literature. 3.1 Data Description The work utilizes data collected from NASA projects using McCabe metrics, made available in the PROMISE repository. This research uses six fault prediction datasets named CM1, KC1, KC2, PC1, JM1, and ALL_DATA (the combination of CM1, KC1, KC2, PC1, and JM1). The data has been collected using McCabe and Halstead feature extractors from the source code of multiple projects [3, 4]. The PROMISE repository is widely used by researchers for quality prediction [5–8]. 3.2 Methods The machine learning techniques used to build the prediction models are artificial neural network, support vector machine, Naïve-Bayes classifier, classification trees, and k-nearest neighbor [9–13].
3.3 Performance Evaluation Criteria The comparison among the prediction models is made by evaluating the performance of individual classifier over the following criteria: Receiver operating characteristics (ROC) curve, area under the ROC curve and accuracy [12].
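To make the experimental design concrete, the sketch below mirrors the 6-dataset × 5-technique comparison with tenfold cross-validation and ROC-AUC/accuracy scoring using scikit-learn. It is only an illustration, since the authors report conducting the experiments in MATLAB (Sect. 5); the CSV file names and the 'defects' label column are assumptions about the PROMISE data files.

```python
# Illustrative sketch (not the authors' MATLAB implementation) of the 30-model
# comparison described in Sects. 2-3. File names and the 'defects' column are assumptions.
import pandas as pd
from sklearn.model_selection import cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

files = ["cm1.csv", "kc1.csv", "kc2.csv", "pc1.csv", "jm1.csv"]   # hypothetical paths
models = {
    "ANN": MLPClassifier(max_iter=1000),
    "SVM": SVC(),
    "NB": GaussianNB(),
    "TREE": DecisionTreeClassifier(),
    "KNN": KNeighborsClassifier(),
}

frames = [pd.read_csv(f) for f in files]
frames.append(pd.concat(frames, ignore_index=True))               # ALL_DATA = union of the five
names = ["CM1", "KC1", "KC2", "PC1", "JM1", "ALL_DATA"]

for name, df in zip(names, frames):
    X = df.drop(columns=["defects"]).values
    y = df["defects"].astype(int).values                          # 1 = buggy, 0 = clean
    for label, clf in models.items():
        pipe = make_pipeline(StandardScaler(), clf)
        cv = cross_validate(pipe, X, y, cv=10,                     # tenfold cross-validation
                            scoring=["roc_auc", "accuracy"])
        print(f"{label}-{name}: AUC={cv['test_roc_auc'].mean():.4f} "
              f"acc={cv['test_accuracy'].mean():.4f}")
```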
4 Results and Discussions In this section, the relationship between the static code metrics, i.e., McCabe’s complexity metrics and the fault proneness of software modules is analyzed and presented. The analysis is done using ROC, AUC and accuracy criteria. Figure 2 gives the graphical analysis for the sensitivity of all thirty classifiers, using the box plots which are grouped by ML method (depicted by common color). It is seen that the sensitivity is high for all six classifiers built on k-nearest neighbor method.
Fig. 2. Box plots for correctly classified ‘Buggy’ instances (percentage) (grouped by ML method)
4.1 ROC Curve The next evaluation criterion is ROC curve analysis. It is the visual representation of prediction accuracy of classifier on the test data. Figures 3, 4, 5, 6, 7 and 8 show the ROC curves of all 30 classifiers grouped by the dataset used. It is found from the ROC curve analysis that the classifiers built using ANN technique are promising to predict ‘risky’ modules with pretty good accuracy irrespective of the dataset used.
Fig. 3. ROC curves for five classifiers built using CM1 dataset
Fig. 4. ROC curves for five classifiers built using JM1 dataset
4.2 AUC Measure The next performance measure used is the area under the ROC curve. We observe how close the AUC value is to 1, because an AUC value equal to 1 implies 100% prediction accuracy of the classifier. Table 1 shows the AUC values for all thirty classifiers. It is found that the ANN classifier built using the KC2 dataset has the maximum AUC, with a value of 0.8315. Figure 9 graphically represents the AUC measure for the classifiers grouped by dataset, and it is clear that ANN-KC2 is the best among all 30 classifiers with an AUC of 0.8315.
Fig. 5. ROC curves for five classifiers built using KC1 dataset
Fig. 6. ROC curves for five classifiers built using KC2 dataset
Further, to analyze which ML technique outperforms the others in terms of the AUC measure, box plots for the classifiers are drawn as in Fig. 10. It is observed that the ANN technique outperforms the other ML techniques in terms of AUC. 4.3 Accuracy Measure The performance of the classifiers is also evaluated on the accuracy criterion. The accuracy values for all 30 classifiers are tabulated in Table 2. ANN-PC1 and SVM-PC1 show similar performance, with an accuracy of 0.9296. The graphical representation of the accuracy measure is also analyzed, as shown in Fig. 11.
Fig. 7. ROC curves for five classifiers built using PC1 dataset
Fig. 8. ROC curves for five classifiers built using ALL_DATA dataset
To analyze which ML technique is superior to the others over the accuracy measure, box plots are created for the classifiers grouped by ML technique, as in Fig. 12. It is observed that the ANN-based classifiers outperform the others.
5 Conclusion and Future Scope
This work is focused on the empirical comparison of machine learning techniques for the software quality prediction task. In this paper, the software quality prediction task is modeled as a two-class classification problem so that it can be solved using ML techniques. The evaluation of the prediction models, which are developed using static code metrics, is
Table 1. AUC measure

AUC      | ANN    | SVM    | NB     | TREE   | KNN
CM1      | 0.7286 | 0.6341 | 0.6592 | 0.5289 | 0.5868
KC1      | 0.7878 | 0.7220 | 0.7442 | 0.6828 | 0.5741
KC2      | 0.8315 | 0.8021 | 0.7816 | 0.7104 | 0.6887
PC1      | 0.7187 | 0.6348 | 0.6053 | 0.5863 | 0.6050
JM1      | 0.7102 | 0.6612 | 0.6262 | 0.6227 | 0.5268
ALL_DATA | 0.7282 | 0.6664 | 0.6330 | 0.6479 | 0.522
Fig. 9. AUC measure for all 30 classifiers (grouped by dataset)
Fig. 10. Box plots for AUC measure (grouped by ML method)
done using three criteria: ROC, AUC, and accuracy. The experiments were conducted in MATLAB, and the findings are
Table 2. Accuracy measure

Accuracy | ANN    | SVM    | NB     | TREE   | KNN
CM1      | 0.8996 | 0.8996 | 0.8735 | 0.8494 | 0.8614
KC1      | 0.8482 | 0.8482 | 0.8307 | 0.8373 | 0.275
KC2      | 0.8371 | 0.7911 | 0.8409 | 0.8065 | 0.8084
PC1      | 0.9296 | 0.9296 | 0.9017 | 0.9089 | 0.7682
JM1      | 0.8107 | 0.8037 | 0.8061 | 0.7793 | 0.3930
ALL_DATA | 0.8265 | 0.8204 | 0.8215 | 0.8039 | 0.3989
Fig. 11. Accuracy measure for All 30 classifiers (grouped by dataset)
Fig. 12. Box plots for accuracy measure (grouped by ML method)
• Using six datasets from the PROMISE repository containing fault prediction data, thirty quality predictors were developed, trained, and tested with five machine learning techniques. The observation is made that static code metrics are powerful enough to predict software quality. • The performance of the developed classifiers is evaluated, and a comparison is made with the help of the necessary measures and plots. It is found that the models built using the ANN method are promising for software quality prediction with good accuracy. • It is observed that the ANN method outperforms the other ML techniques used (SVM, Naïve-Bayes, decision tree, and k-nearest neighbor), with values of 0.8315 and 0.9296 for the AUC and accuracy measures, respectively. In future, this work can be enhanced by transforming the software quality prediction task into a multi-class classification problem. The work can also be replicated with larger datasets, deep architectures can be incorporated for quality prediction of software modules, and meta-heuristic techniques can be induced to further improve the overall accuracy of the prediction models.
References 1. H. Stephen, Kan, Metrics and Models in Software Quality Engineering (Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 2002) 2. https://www.nasa.gov/sites/default/files/files/Space_Math_VI_2015.pdf 3. S. Sayyad, T. Menzies, The PROMISE Repository of Software Engineering Databases. University of Ottawa. Canada (2005). http://promise.site.uottawa.ca/SERepository 4. PROMISE—http://promise.site.uottawa.ca/SERepository 5. K. Dejaeger, T. Verbraken, B. Baesens, Toward comprehensible software fault prediction models using bayesian network classifiers. IEEE Trans. Software Eng. 39(2), 237–257 (2013) 6. D. Rodríguez, R. Ruiz, J.C. Riquelme, R. Harrison, A study of subgroup discovery approaches for defect prediction. Inf. Softw. Technol. 55(10), 1810–1822 (2013) 7. G. Czibula, Z. Marian, I.G. Czibula, Software defect prediction using relational association rule mining. Inf. Sci. 264, 260–278 (2014) 8. I.H. Laradji, M. Alshayeb, L. Ghouti, Software defect prediction using ensemble learning on selected features. Inf. Softw. Technol. 58, 388–402 (2015) 9. S. Goyal, P.K. Bhatia, GA based dimensionality reduction for effective software effort estimation using ANN. J. Adv. Appl. Math. Sci. 18(8), 637–649 10. S. Goyal, P.K. Bhatia, A non-linear technique for effective software effort estimation using multi-layer perceptrons, in 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India (2019), pp. 1–4. https://doi. org/10.1109/comitcon.2019.8862256 11. S. Goyal, P.K. Bhatia, in Feature selection technique for effective software effort estimation using multi-layer perceptrons, in Proceedings of ICETIT 2019. Lecture Notes in Electrical Engineering, ed. by P. Singh, B. Panigrahi, N. Suryadevara, S. Sharma, A. Singh, vol. 605 (Springer, Cham, 2019). https://doi.org/10.1007/978-3-030-30577-2_15 12. R. Malhotra, Comparative analysis of statistical and machine learning methods for predicting faulty modules. Appl. Soft. Comput. 21, 286–297 (2014) 13. S. Goyal, A. Parashar, Machine learning application to improve COCOMO model using neural networks. Int. J. Inform. Technol. Comput. Sci. (IJITCS) 10(3):35–51 (2018). https:// doi.org/10.5815/ijitcs.2018.03.05
Prognosis of Breast Cancer by Implementing Machine Learning Algorithms Using Modified Bootstrap Aggregating Peeyush Kumar, Ayushe Gangal, and Sunita Kumari(B) G. B. Pant Government Engineering College, New Delhi, India [email protected], [email protected], [email protected]
Abstract. Breast cancer is known to be one of the most prevalent cancers found in women all over the globe. Early-stage diagnosis can save many lives, but in most cases it is an arduous task for doctors, as the symptoms are not necessarily salient. The use of artificial intelligence for the prognosis of breast cancer has proved to be an asset for the biomedical and healthcare stratum. In this paper, a hybrid predictive model is proposed for the prediction of breast cancer. The proposed hybrid model is based on the use of decision trees for the bifurcation and preprocessing of data, followed by the application of any machine learning algorithm. We have used the proposed system with deep learning (DL) and a support vector machine (SVM). The proposed model has been found to achieve up to 99.3% accuracy, exceeding the accuracy obtained by any other approach used so far. Keywords: Breast cancer · Artificial intelligence · Decision tree · Deep learning · Support vector machine
1 Introduction
Breast cancer is the second most perilous type of cancer after lung cancer, and the most common type of cancer affecting women worldwide. According to the World Health Organization (WHO), breast cancer impacts 2.1 million women each year [1]. It occurs due to abnormal cell multiplication in breast cells [2], in the milk-producing or mammary cells, and is caused by genetic makeup and environmental influences. Major symptoms involve pain, lumps, redness, and peeling or scaling of the skin around the target area [3]. Therefore, early diagnosis is conducive to better breast cancer outcomes and survivability. A major challenge faced by doctors and practitioners worldwide is the identification and diagnosis of cancerous cells. Many times the symptoms are not visually prominent, or the tests on which the doctors rely are misdiagnosed [4]. In this paper, a hybrid machine learning model is proposed. The model makes use of the ability of decision trees to bifurcate the data on the basis of certain features, followed
by the application of machine learning algorithms on the resulting datasets. This enables the model to enhance its learning process and thus achieve much better results. In this paper, the proposed model is created using deep learning and SVM algorithms on data preprocessed using the decision tree algorithm. The paper is divided into seven sections. Section 2 reviews related work. Section 3 highlights the algorithms used in the formulation of the proposed model. Section 4 delineates the proposed methodology. Section 5 gives the observations and analysis of the algorithms and methods used so far. Section 6 summarizes the results obtained. Finally, Sect. 7 discusses the conclusions and possible future work.
2 Prior Works
Bazazeh et al. [5] carried out a comparative study of three machine learning algorithms, namely SVM, random forest, and Bayesian network, for the prediction of breast cancer. The 10-fold cross-validation method was applied, and the performance of the models was compared using indicators such as accuracy, recall, precision, and area under the ROC curve (AUC). SVM had the highest performance in terms of accuracy, specificity, and precision, while random forest had the highest probability of correctly classifying the tumor. Kumar et al. [6] classified breast cancer as malignant or benign using classification algorithms and a voting classifier. The models selected were naive Bayes, SVM, and J48, a type of decision tree. Individually, the accuracy of naive Bayes, SVM, and J48 was found to be 95.99%, 96.70%, and 94.99%, respectively. The three classifiers were then combined in three possible ways using a voting approach: naive Bayes and SVM, SVM and J48, and J48 and naive Bayes. The accuracy values obtained by these three combinations were 96.85%, 96.28%, and 95.56%, respectively. Sun et al. [7] devised a new machine learning algorithm called DSVM, created by integrating deep learning (DL) and SVM to predict the occurrence of breast cancer; a comparison between DL, SVM, and their integrated product DSVM was made with performance markers like accuracy, precision, recall, and F-score. Yadav et al. [8] used decision tree and SVM for the classification of breast cancer as malignant or benign; the overall prediction accuracy obtained was 90% to 94% for the decision tree and 94.5% to 97% for SVM. Higa [9] classified breast cancer using a decision tree and an ANN, with accuracy values of 94% and 95.4%, respectively. Suresh et al. [10] proposed a hybridized model using a neural network and a decision tree for the prediction of breast cancer. The model's main objective was to deal with misclassified values, and it made use of a radial basis function network and a decision tree. A comparative analysis of the proposed algorithm with SVM, KNN, and naive Bayes was done, and the accuracy values obtained were 96%, 97%, 98%, and 99% for KNN, naive Bayes, SVM, and the proposed model, respectively. Shravya et al. [11] classified breast cancer as malignant or benign using algorithms like logistic regression, SVM, and K-nearest neighbor; dimensionality reduction was applied to reduce the independent variables to a set of principal variables. The paper concluded that SVM had the highest accuracy, 92.78%, compared to K-nearest neighbor with 92.23% and logistic regression with 92.10%.
3 Algorithms Used
3.1 Support Vector Machine
SVMs aim at solving classification problems by finding good decision boundaries [13] between two sets of points belonging to two different categories. In an SVM, the data is first mapped to a new high-dimensional representation in which the decision boundary can be expressed as a hyperplane, which is then found by maximizing the distance between the hyperplane and the closest data points of each class. Computing this high-dimensional mapping explicitly is often impractical and can be computationally intractable, so a kernel function is used to compute the similarity between points directly (the kernel trick). The most commonly used kernel is the radial basis function (RBF) kernel, also known as the Gaussian kernel. The similarity between data points measured by the RBF kernel is given by

K_RBF(x_1, x_2) = e^(−γ ||x_1 − x_2||^2)   (1)
Here, x_1 and x_2 are data points, ||x_1 − x_2|| denotes the Euclidean distance between them, and γ is a parameter that controls the width of the RBF kernel.

Deep learning. Here, the network weights are either set to random values or initialized via a forward pass, and they are then corrected and updated using an optimization approach called the backpropagation algorithm [14–18]. The goal of the backpropagation algorithm is to minimize the cost function J(θ) given in Eq. (2):

J(θ) = −(1/m) Σ_{i=1}^{m} Σ_{k=1}^{K} [ y_k^i log(H_θ(x^i)_k) + (1 − y_k^i) log(1 − H_θ(x^i)_k) ]   (2)

where m is the size of the training set, K is the number of output units, x^i is the ith row of the matrix X, y^i is the ith element of the matrix Y, and x_00 = x_10 = · · · = x_m0 = 1. Here

θ = [θ_0, θ_1, …, θ_n]^T,  X = [x_ij] with rows x^i = (x_i0, x_i1, …, x_in) for i = 0, …, m,  Y = [y_0, y_1, …, y_m]^T

are the parameter matrix, data matrix, and label matrix, respectively.

Bootstrap aggregation. This is an ensemble technique which reduces the high variance of a combination of multiple algorithms and overcomes the problem of overfitting. It works on the principle of bootstrapping [21], that is, random selection of data from the dataset, with replacement, to produce multiple unique training sets. Multiple learners are then fitted on these separate datasets, and their predictions are averaged, thus reducing the overall variance of the models.
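As a reference point for the standard bagging that the proposed method later modifies, the snippet below shows ordinary bootstrap aggregating with scikit-learn's BaggingClassifier (whose default base learner is a decision tree). The use of scikit-learn's built-in breast cancer dataset here is an assumption made purely for illustration.

```python
# Standard bootstrap aggregating (bagging): bootstrap samples are drawn with
# replacement, one learner is fitted per sample, and the predictions are combined.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
bagging = BaggingClassifier(n_estimators=50, bootstrap=True, random_state=0)
print("10-fold CV accuracy:", cross_val_score(bagging, X, y, cv=10).mean())
```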
Value normalization. Value normalization or standardization is a technique in which each feature is transformed to have a mean equal to zero and a standard deviation equal to one. For each feature, we perform

x_j^i = (x_j^i − μ_i) / σ_i   (3)

where x_j^i is the jth value of the ith feature, μ_i is the mean of the ith feature, and σ_i is the standard deviation of the ith feature.
4 Proposed Methodology
The proposed model uses a modified form of bootstrap aggregating in which, instead of creating sample datasets from the main dataset by randomly picking training examples as in standard bootstrap aggregating [21], a sui generis approach is used to create these sample datasets. The dataset used for this study has been taken from the UCI machine learning repository. The Wisconsin Breast Cancer Dataset [12] (Diagnostic), created by Dr. William H. Wolberg, W. Nick Street, and Olvi L. Mangasarian, comprises 569 instances, of which 212 are malignant and 357 are benign, and has 30 attributes such as 'mean radius,' 'mean texture,' 'mean perimeter,' and 'mean area,' to name a few.
A decision tree [19] is built by partitioning the dataset at points where the majority of the two classes can be separated. This bifurcation property of the decision tree plays an indispensable role in the proposed model. A dataset L consists of data {(y_n, x_n), n = 1, …, N}, where the y's are class labels. Using this dataset, a predictor ϕ(x, L) can be formed: if the input is x, we predict y by ϕ(x, L). For the proposed model using SVM, the extracted feature and value are attribute number 22, name 'worst perimeter,' value 106.1; for the proposed model using deep learning, they are attribute number 20, name 'worst radius,' value 16.795. Using these extracted values, the dataset is sliced at X[22] = 106.1 and X[20] = 16.795 for the proposed models with SVM and deep learning, respectively. This results in three datasets as follows:

L_B ∈ {L_k}, where k = 0, 1, 2   (4)

L_1 = 70% of the dataset L (the remaining 30% is the test set)
L_2 = formed by randomly removing the training examples with X[a_0] > b_0
L_3 = formed by randomly removing the training examples with X[a_1] ≤ b_1

where a_i is the ith element of a = [20, 22] and b_i is the ith element of b = [106.1, 16.795], respectively. Now, on each of these three training datasets L_B ∈ {L_k}, the SVM/deep learning algorithm is applied and a set of predictors {ϕ(x, L_B)} is formed. If ϕ(x, L) predicts a class j ∈ {0, 1}, then aggregation of {ϕ(x, L_B)} is done by voting, and as a result the predictor ϕ_B(x) is obtained. The architecture of the models can be seen in the flowcharts shown in Figs. 1 and 2.
Fig. 1. Flowchart for proposed model using deep learning
The DL algorithm is applied to each of the three datasets with the following architecture: 25 nodes in the input layer and four hidden layers with 5, 10, 15, and 20 nodes, respectively. The input and hidden layers use the 'tanh' activation function, and the output layer consists of a single node with the sigmoid activation function. The learning rate is set to 0.001. The SVM algorithm uses the RBF kernel with gamma = 0.001. For better performance, the dataset is standardized before being passed to the SVM/deep learning algorithm. A sketch of this procedure for the SVM variant is given below.
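The sketch below illustrates, under stated assumptions, the SVM variant of the procedure described in Sect. 4: the 70:30 split forms L1, L2 and L3 are obtained by randomly removing training examples on either side of the 'worst perimeter' = 106.1 slice, an RBF-kernel SVM with gamma = 0.001 is fitted on each set after standardization, and the three predictions are aggregated by majority voting. The removal fraction (50%) and the use of scikit-learn's load_breast_cancer feature ordering (index 22 = 'worst perimeter') are assumptions, since they are not specified exactly here.

```python
# Illustrative sketch of the modified bagging described above (SVM variant).
# The removal fraction and the feature ordering are assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=0)  # L1

def remove_random(Xb, yb, mask, frac=0.5):
    """Randomly drop a fraction of the training examples selected by `mask`."""
    idx = np.flatnonzero(mask)
    drop = rng.choice(idx, size=int(frac * len(idx)), replace=False)
    keep = np.setdiff1d(np.arange(len(yb)), drop)
    return Xb[keep], yb[keep]

FEAT, THR = 22, 106.1                                    # 'worst perimeter' slice point
L2 = remove_random(X_tr, y_tr, X_tr[:, FEAT] > THR)      # L2: thin out one side of the slice
L3 = remove_random(X_tr, y_tr, X_tr[:, FEAT] <= THR)     # L3: thin out the other side

preds = []
for Xb, yb in [(X_tr, y_tr), L2, L3]:
    scaler = StandardScaler().fit(Xb)                    # standardize each training set
    clf = SVC(kernel="rbf", gamma=0.001).fit(scaler.transform(Xb), yb)
    preds.append(clf.predict(scaler.transform(X_te)))

vote = (np.mean(preds, axis=0) > 0.5).astype(int)        # majority vote of the 3 predictors
print("aggregated accuracy:", (vote == y_te).mean())
```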
5 Observations A comparative analysis of the various techniques and algorithms used so far, with their respective datasets and accuracy values, is given in Table 1.
6 Results
The proposed model achieved 99.3% accuracy for both deep learning and SVM, which is greater than that of all the other approaches used so far by the researchers.
Fig. 2. Flowchart for proposed model using SVM
A novel approach is given by implementing the decision tree algorithm on the preprocessed dataset to divide it on the basis of features and then individually applying machine learning algorithms on the resulting datasets. This allows the model to enhance its learning and thus gives better results. Here, SVM = support vector machine [5], NB = naive Bayes [6], J48 is a decision tree [6], DL = deep learning [7], PMDL = proposed model using DL, PMSVM = proposed model using SVM, LR = logistic regression [11], KNN = k-nearest neighbor [11], ANN = artificial neural network [9], DT = decision tree [8], RF = random forest [5], and BN = Bayesian network [5] (Graph 1). The graph juxtaposes the accuracy values obtained by the algorithms used in previous works with the accuracy of the proposed model.
7 Conclusion and Future Works
A novel hybrid approach for the prognosis of breast cancer, obtained by modifying standard bootstrap aggregating, is employed in this paper. In the presented modified bootstrap aggregating, the different datasets are created by randomly deleting specific types of training examples, and there is zero probability that a dataset will consist only of similar training examples. The results obtained show that the proposed method performs better than the standard bagging models. In the future, the proposed model is expected to be used for the diagnosis of other types of cancers and other diseases as well. The work done can
Table 1. Comparative analysis of the methods and algorithms used so far

Author              | Year | Machine learning algorithm                                                              | Source of dataset                                        | Accuracy
Shravya et al. [11] | 2019 | Logistic regression; SVM; KNN                                                           | UCI machine learning repository [12]                     | 92.10%; 92.78%; 92.23%
Suresh et al. [10]  | 2019 | KNN; Naive Bayes; SVM; RBF with decision tree                                           | UCI machine learning repository [12]                     | 96%; 97%; 98%; 99%
Higa [9]            | 2018 | Decision tree; ANN                                                                      | Wisconsin Breast Cancer dataset from UCI ML repository [12] | 94%; 95.4%
Yadav et al. [8]    | 2018 | Decision tree; Support vector machine                                                   | UCI machine learning repository [12]                     | 94.0%; 95.4%
Sun et al. [7]      | 2017 | SVM; Deep learning; DSVM                                                                | Broad Genome Data Analysis Center (GDAC) firehose [22]   | 59.4%; 69.6%; 69.8%
Kumar et al. [6]    | 2017 | SVM; Naive Bayes; J48; SVM + naive Bayes; Naive Bayes + J48; J48 + SVM; SVM + naive Bayes + J48 | University of Wisconsin, Madison [12]             | 96.70%; 95.99%; 94.99%; 96.85%; 95.56%; 96.28%; 97.13%
Bazazeh et al. [5]  | 2016 | SVM; Random forest; Bayesian network                                                    | UCI machine learning online repository [12]              | 97.0%; 96.6%; 97.12%
be applied in the real world of the medical field, in the clinical practice of practitioners, which could drastically reshape people's lives and might also change the face of medicine for future practitioners.
Graph 1. Graph comparing the accuracy values of various algorithms used in previous works, with the accuracy of the proposed model
References 1. https://www.who.int/cancer/prevention/diagnosis-screening/breast-cancer/en/. Last accessed September 8, 2019 2. https://www.mayoclinic.org/diseases-conditions/breast-cancer/diagnosis-treatment/drc-203 52475. Last accessed on September 24, 2019 3. https://www.webmd.com/breast-cancer/guide/breast-cancer-symptoms-and-types. Last accessed on September 25, 2019 4. M. Al-hadidi, Breast cancer detection using K-nearest neighbor machine learning algorithm, in 9th International Conference on Developments in eSystems Engineering (2016) 5. D. Bazazeh, et al., Comparative study of machine learning algorithms for breast cancer detection and diagnosis. Int. J. Intell. Syst. Appl. Eng. (2016) 6. U. Kumar, et al., Prediction of breast cancer using voting classifier technique, in Proceedings of the IEEE International Conference on Smart Technologies and Management for Computing, Communication, Controls, Energy and Materials, Chennai, India (2017) 7. D. Sun, et al., Prognosis prediction of human breast cancer by integrating deep neural network and support vector machine, in 9th International Conference on Developments in eSystems Engineering (DeSE), Liverpool, UK (2016) 8. P. Yadav, et al., Diagnosis of breast cancer using decision tree models and SVM. Int. Res. J. Eng. Technol. (IRJET) (2018) 9. A. Higa, Diagnosis of breast cancer using decision tree and artificial neural network algorithms. Int. J. Comput. Appl. Technol. Res. (2018) 10. A. Suresh, et al., Hybridized neural network and decision tree based classifier for prognostic decision making in breast cancers, in Soft Computing (Springer Nature, Berlin, 2019) 11. C. Shravya, et al., Prediction of breast cancer using supervised machine learning techniques. Int. J. Innov. Technol. Explor. Eng. (IJITEE) (2019) 12. Wisconsin Breast Cancer Dataset, Available at https://archive.ics.uci.edu/ml/datasets/Breast+ Cancer+Wisconsin+(Diagnostic). Last accessed on September 24, 2019 13. Y. Yang, et al., The research of the fast SVM classifier method, in 12th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). (2015), pp. 121–124
14. R. Pagariya, et al., Review paper on artificial neural networks. Int. J. Adv. Res. Comput. Sci. 4 (2013) 15. A. Müller, S. Guido, Introduction to Machine Learning with Python (O’Reilly Publication, Sebastopol, 2016) 16. F. Chollet, Deep Learning with Python (Manning Publications, Shelter Island, 2018) 17. T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning. Springer Series in Statistics (Springer, Berlin, 2009) 18. A. Ng, Machine Learning. Course on coursera. Available at https://www.coursera.org/learn/ machine-learning 19. B. Patel, et al., A survey on decision tree algorithm for classification. Int. J. Eng. Develop. Res. (IJEDR) (2014) 20. M. Denil, et al., Narrowing the gap: random forests in theory and in practice, in Proceedings of Machine Learning Research (PMLR) (2014) 21. L. Breiman, Bagging Predictors (Kluwer Academic Publishers-Plenum Publishers and Springer Machine Learning Series, 1996) 22. http://gdac.broadinstitute.org/. Last accessed on September 24, 2019
Localizing License Plates in Real Time with RetinaNet Object Detector Ritabrata Sanyal1(B) , Manan Jethanandani2 , Gummi Deepak Reddy3 , and Abhijit Kurtakoti4 1
Kalyani Government Engineering College, Kalyani, West Bengal, India [email protected] 2 LNM Institute of Information Technology, Jaipur, Rajasthan, India [email protected] 3 Jawaharlal Nehru Technological University, Hyderabad, India [email protected] 4 Visvesvaraya Technological University, Belgaum, Karnataka, India [email protected]
Abstract. Automatic license plate recognition systems have various applications in intelligent automated transportation systems and have thus been frequently researched over the past years. Yet designing a highly accurate license plate recognition pipeline is challenging in an unconstrained environment, where difficulties arise from variations in photographic conditions like illumination, distortion, and blurring, and from license plate structural variations like background, text font, size, and color across different countries. In this paper, we tackle the problem of license plate detection and propose a novel approach based on localization of the license plates with prior vehicle detection, using the state-of-the-art RetinaNet object detector. This helps us achieve real-time detection performance while having superior localization accuracy compared to other state-of-the-art object detectors. Our system proved to be robust to all those variations that can occur in an unconstrained environment and outperformed other state-of-the-art license plate detection systems to the best of our knowledge. Keywords: Automatic license plate detection · Intelligent transportation · Deep learning · Convolutional neural networks · RetinaNet

1 Introduction
Automatic license plate recognition (ALPR) plays a key role in many different aspects of intelligent transportation systems like automated toll collection, road traffic surveillance, automated parking space allotment, and many others. Due
to the diversity of applications of an ALPR system, this has been an active topic of research over the past decade. Despite a prolific literature, designing an ALPR system is highly challenging in an unconstrained environment, where difficulties may arise from variations in photographic conditions such as illumination, orientation, distortion, and blurring, and also from disparities in backgrounds, colors, and text fonts across countries. A typical ALPR pipeline consists of a license plate (LP) detection system, which localizes the vehicle plates in an input image, followed by an OCR system which reads every plate localized in the previous stage. The localization efficacy of the LP detector is of paramount importance in designing a highly accurate ALPR system; hence, in this paper, we propose an LP detection system with high localization accuracy which also performs in real time. Most previous works entail engineering handcrafted features for LP detection and using projections, active contours, CCA, etc., to segment the characters of the plate, followed by a classical machine learning classifier like SVM to classify the segmented characters. These methods do not perform well in unconstrained settings, where the degree of uncertainty is quite large. The recent success of deep learning, especially convolutional neural networks (CNNs), in image classification, localization, segmentation, and many other problems in computer vision has motivated researchers to apply these techniques to ALPR tasks. Silva et al. [19] and Laroca et al. [6] used the YOLO network [16] for LP detection. YOLO, being a one-stage object detector, is less accurate, though a lot faster, than two-stage object detectors. We use the state-of-the-art RetinaNet [12] object detector for LP detection. It is a one-stage detector, so the time taken to process a frame is almost comparable to YOLO, yet its localization accuracy is comparable to or even better than that of two-stage detectors, making RetinaNet the ideal choice for unconstrained LP detection, where both detection accuracy and real-time performance are critical. Previously, Safie et al. [18] used RetinaNet for LP detection, but they used the Resnet50 network [3] for deep feature extraction, whereas we use the VGG19 [20] architecture to that end. This is because the VGG19 network has only 19 layers and hence is much lighter than Resnet50, which has 50 layers; thus, it takes much less training and prediction time than Resnet50. Also, their work was based on a custom dataset having only one vehicle per frame, whereas our work is aimed at detecting LPs in the wild, where there may be one or more vehicles per frame. Hence, we first use RetinaNet [12] to detect vehicles and then extract the LPs from the localized vehicles. Lastly, they used only Malaysian LPs in their study, whereas we test our method on LP datasets from multiple countries.
2 Methodologies
The proposed system consists of two phases: (i) vehicle detection and (ii) license plate (LP) detection. These steps are elaborated as follows. At first, the vehicles are localized in the input image. This is followed by the LP detection phase, which extracts the license plate from each of the localized vehicles. Vehicles are detected prior to localizing the license plates simply because
vehicles, having much larger spatial dimensions than license plates, are easier to locate in a natural scene. Once the vehicles are localized, the search space is reduced drastically, which aids accurate detection of license plates. To this end, we use the state-of-the-art one-stage object detector RetinaNet [12] for both detection tasks. We specifically use a one-stage detector rather than a two-stage detector because: (i) two-stage object detectors (e.g., Faster RCNN [17]) are generally slower than one-stage ones (YOLO [15], SSD [13]), yet the average localization accuracy of the former surpasses that of the latter by a large margin, owing to the extreme foreground-background class imbalance problem in one-stage detectors; (ii) RetinaNet [12] addresses this problem by modifying the cross-entropy loss so that the loss assigned to properly classified samples is down-weighted in every iteration. Hence, the impact of easy negative samples on the total loss is reduced, and a sparse set of hard examples contributes the most to the final loss. This novel loss function, namely the focal loss, ameliorates the speed-accuracy trade-off, thus outperforming other state-of-the-art one-stage and two-stage object detectors in terms of both real-time and localization performance. RetinaNet [12] has been used [2, 10, 14] in various object detection scenarios in different domains with great efficacy.

The RetinaNet [12] architecture consists of three components, namely a backbone network for feature extraction and two subnets: one for classification and the other for bounding box regression. A feature pyramid network (FPN) is used as the backbone due to its ability to represent rich multi-scale features. The VGG19 [20] network is used to extract high-level feature maps, from which the FPN backbone generates feature maps at different scales. The classification subnet predicts the probability of presence of an LP at each spatial location of every anchor. In parallel, the regression subnet is trained, in which a small fully connected network (FCN) is attached to every pyramid level to regress the offset from each anchor to a nearby ground-truth object. The entire network is trained end to end in a single stage with the focal loss objective. The loss emphasizes hard examples over easy samples, thus addressing the class imbalance problem of other one-stage detectors. Formally, the focal loss $FL(p_t)$ is expressed [12] as:

$$FL(p_t) = -\alpha\,(1 - p_t)^{\gamma}\,\log(p_t) \quad (1)$$

where $\alpha$ and $\gamma$ are two hyperparameters, and

$$p_t = \begin{cases} p, & \text{if } y = 1 \\ 1 - p, & \text{otherwise} \end{cases} \quad (2)$$

where p is the probability predicted by the model and y = 1 represents the ground-truth class.
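The following is a minimal NumPy sketch of Eqs. (1)–(2); it is an illustration rather than the authors' implementation, and the default values α = 0.25 and γ = 2.0 are the ones reported in [12], not necessarily those used here.

    import numpy as np

    def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
        """Focal loss of Eqs. (1)-(2).

        p : predicted probability of the positive (license plate) class
        y : ground-truth label (1 = object, 0 = background)
        alpha, gamma : hyperparameters; 0.25 and 2.0 are the defaults in [12].
        """
        p = np.clip(p, eps, 1.0 - eps)           # numerical stability
        p_t = np.where(y == 1, p, 1.0 - p)       # Eq. (2)
        return -alpha * (1.0 - p_t) ** gamma * np.log(p_t)   # Eq. (1)

    # A well-classified background anchor contributes almost nothing,
    # while a hard positive dominates the loss.
    print(focal_loss(np.array([0.05, 0.05]), np.array([0, 1])))

The factor (1 − p_t)^γ is what down-weights easy, well-classified samples so that hard examples dominate the total loss.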
Experiments

Here, we demonstrate the efficacy of our proposed ALPD system. All experiments are performed on an NVIDIA 1080 GPU with 8 GB of memory.
Table 1. Comparison of precision and recall metrics of different license plate detection methods on the Caltech Cars Dataset

Method | Precision (%) | Recall (%)
Zhou et al. [24] | 95.50 | 84.80
Bai and Liu [4] | 74.10 | 68.70
Le and Li [7] | 71.40 | 61.60
Lim and Tay [11] | 83.73 | 90.47
Li and Shen [9] | 97.56 | 95.24
Faster RCNN | 97.15 | 96.30
YOLO | 96.67 | 95.88
RetinaNet | 98.50 | 97.15

The italic and the bold highlighted cells indicate the highest and the second-highest metric values, respectively, achieved by any method.

Table 2. Comparison of recall rate (%) of various license plate detection methods on the five parts of the PKU Dataset

Method | G1 | G2 | G3 | G4 | G5 | Avg
Zheng et al. [23] | 94.93 | 95.71 | 91.91 | 69.58 | 67.61 | 83.94
Zhao et al. [22] | 95.18 | 95.71 | 95.13 | 69.93 | 68.10 | 84.81
Yuan et al. [21] | 98.76 | 98.42 | 97.72 | 96.23 | 97.32 | 97.69
Zhou et al. [24] | 95.43 | 97.85 | 94.21 | 81.23 | 82.37 | 90.22
Li et al. [8] | 98.89 | 98.42 | 95.83 | 81.17 | 83.31 | 91.52
Faster RCNN | 98.95 | 98.50 | 97.60 | 95.58 | 96.90 | 97.51
YOLO | 98.54 | 96.44 | 97.28 | 97.60 | 98.40 | 97.35
RetinaNet | 99.63 | 99.15 | 97.63 | 98.55 | 98.81 | 98.75

The italic and the bold highlighted cells indicate the highest and the second-highest recall value, respectively, achieved by any method.
2.1 Datasets
A number of benchmark datasets have been used to evaluate the proposed LP detection method, namely the Caltech Cars dataset, the PKU dataset, and the AOLP dataset.

The Caltech Cars 1999 dataset [1] has 126 images of 896 × 592 resolution. The images were captured in the Caltech parking lot and contain USA license plates against cluttered backgrounds such as walls, grass, and trees.

The next dataset is the AOLP database [5]. This dataset, consisting of 2049 images with Taiwanese license plates, is divided into three groups, namely Access Control (AC), Law Enforcement (LE), and Road Patrol (RP). The images are captured under different illumination and weather conditions, and they have a lot of variations in
them, such as cluttered backgrounds, multiple vehicle plates in a single frame, and images captured from arbitrary viewpoints and distances.

The third dataset used is the "PKUData" database [21], which consists of 3977 images with Chinese license plates, captured under different conditions such as varying illumination, occlusion, degradation, multiple viewpoints, and multiple vehicles. This dataset is divided into five subsets (G1–G5) relating to different conditions, as elaborated in [21].

Table 3. Comparison of detection performance (%) of various license plate detection methods on the three parts of the AOLP Dataset
Method | AC Precision (%) | AC Recall (%) | LE Precision (%) | LE Recall (%) | RP Precision (%) | RP Recall (%) | Avg Precision (%) | Avg Recall (%)
Li and Shen [9] | 98.53 | 98.38 | 97.75 | 97.62 | 95.28 | 95.58 | 97.18 | 97.19
Hsu et al. [5] | 96 | 91 | 95 | 91 | 94 | 91 | 94 | 91
Faster RCNN | 99.20 | 99 | 98.50 | 98.20 | 95.80 | 96.62 | 97.83 | 97.94
YOLO | 98.90 | 99 | 98 | 98.33 | 95.57 | 95.89 | 97.49 | 97.74
RetinaNet | 99.65 | 99.20 | 98.88 | 98.58 | 96.35 | 96.23 | 98.29 | 98

The italic and the bold highlighted cells indicate the highest and the second-highest metric values, respectively, achieved by any method.
2.2 Evaluation Metrics
For evaluating our LP detection model, we use precision and recall rates, as in previous methods. The recall rate is defined as the ratio of correctly localized positive regions to ground-truth positive regions. The precision rate is defined as the ratio of correctly detected positive regions to the total number of detected regions. A predicted bounding box is deemed correct if it totally encompasses the license plate and its Intersection over Union (IoU) with the ground truth is at least 50% (that is, IoU ≥ 0.5), where

$$\mathrm{IoU} = \frac{\mathrm{area}(pr \cap gt)}{\mathrm{area}(pr \cup gt)} \quad (3)$$

and pr and gt are the predicted and the ground-truth license plate regions, respectively.
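As an illustration, the IoU of Eq. (3) can be computed for two axis-aligned boxes as follows; the (x1, y1, x2, y2) corner format is an assumption for this sketch and may differ from the datasets' annotation format.

    def iou(pr, gt):
        """Intersection over Union (Eq. 3) for two axis-aligned boxes.

        Boxes are assumed to be (x1, y1, x2, y2) corner coordinates.
        """
        ix1, iy1 = max(pr[0], gt[0]), max(pr[1], gt[1])
        ix2, iy2 = min(pr[2], gt[2]), min(pr[3], gt[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_pr = (pr[2] - pr[0]) * (pr[3] - pr[1])
        area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
        union = area_pr + area_gt - inter
        return inter / union if union > 0 else 0.0

    # A detection counts as correct when iou(pred, truth) >= 0.5.
    print(iou((10, 10, 60, 30), (15, 12, 65, 32)) >= 0.5)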
2.3 Discussion and Analysis
From Table 1, we can see that the RetinaNet LP detection model achieved precision and recall rates of 98.50% and 97.15%, respectively, on the Caltech Cars dataset, which are about 1–2% better than the best previous work by Li and Shen [9]. The Faster RCNN and YOLO models performed worse than our proposed method. From Table 2, our model achieves about 1% higher average recall than the best previous work by Yuan et al. [21] on the PKU dataset. The Faster
RCNN and YOLO models achieve about 1% lower recall than our proposed RetinaNet model. On the AOLP dataset, our RetinaNet model achieves an average precision and recall of 98.29% and 98%, respectively (Table 3), which is better than the best previous work [9] by a 1–2% margin. From the LP detection results on these varied datasets, we can see that RetinaNet outperforms the Faster RCNN and YOLO models by a 1–2% margin. Also, from Table 4 it is evident that the RetinaNet model, being a one-stage detector, is about twice as fast as a two-stage detector such as Faster RCNN and takes about the same time as a YOLO detector. Thus, RetinaNet is almost as fast as a one-stage object detector while having superior localization accuracy compared to both one- and two-stage detectors. It is also empirically observed that our proposed LP detection system can process a full HD video at a rate of 50 FPS and thus has near real-time performance.

Table 4. Comparison of detection time (ms) of our proposed model with different baseline models

Dataset | Method | Detection time (ms)
AOLP | Faster RCNN | 50
AOLP | YOLO | 28
AOLP | RetinaNet | 29
PKUData | Faster RCNN | 45
PKUData | YOLO | 19
PKUData | RetinaNet | 19
3 Conclusion
In this paper, we have proposed an automatic license plate detection system using the state-of-the-art one-stage RetinaNet object detector. RetinaNet is trained with the focal loss objective, which addresses the extreme foreground-background class imbalance problem typical of other one-stage object detectors. Thus, it achieves better localization accuracy than other one- or two-stage detectors such as YOLO or Faster RCNN, while providing real-time detection performance. It should be noted, however, that we had to localize the vehicles prior to LP detection, which takes extra time. As future work, we aim to obviate this prior vehicle detection stage without hampering LP localization accuracy.
References

1. Caltech car plates dataset. http://www.vision.caltech.edu/html-files/archive.html
2. Y. Cui, B. Oztan, Automated firearms detection in cargo X-ray images using RetinaNet, in Anomaly Detection and Imaging with X-Rays (ADIX) IV, vol. 10999 (International Society for Optics and Photonics, 2019), p. 109990P
3. K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 5 (2015), p. 6
4. B. Hongliang, L. Changping, A hybrid license plate extraction method based on edge statistics and morphology, in Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), vol. 2. IEEE (2004), pp. 831–834
5. G.S. Hsu, J.C. Chen, Y.Z. Chung, Application-oriented license plate recognition. IEEE Trans. Veh. Technol. 62(2), 552–561 (2012)
6. R. Laroca, E. Severo, L.A. Zanlorensi, L.S. Oliveira, G.R. Gonçalves, W.R. Schwartz, D. Menotti, A robust real-time automatic license plate recognition based on the YOLO detector, in 2018 International Joint Conference on Neural Networks (IJCNN). IEEE (2018), pp. 1–10
7. W. Le, S. Li, A hybrid license plate extraction method for complex scenes, in 18th International Conference on Pattern Recognition (ICPR'06), vol. 2. IEEE (2006), pp. 324–327
8. B. Li, B. Tian, Y. Li, D. Wen, Component-based license plate detection using conditional random field model. IEEE Trans. Intell. Transport. Syst. 14(4), 1690–1699 (2013)
9. H. Li, C. Shen, Reading car license plates using deep convolutional neural networks and LSTMs. arXiv preprint arXiv:1601.05610 (2016)
10. X. Li, H. Zhao, L. Zhang, Recurrent RetinaNet: a video object detection model based on focal loss, in International Conference on Neural Information Processing (Springer, Berlin, 2018), pp. 499–508
11. H.W. Lim, Y.H. Tay, Detection of license plate characters in natural scene with MSER and SIFT unigram classifier. https://doi.org/10.1109/STUDENT.2010.5686998
12. T.Y. Lin, P. Goyal, R. Girshick, K. He, P. Dollár, Focal loss for dense object detection, in Proceedings of the IEEE International Conference on Computer Vision (2017), pp. 2980–2988
13. W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.Y. Fu, A.C. Berg, SSD: single shot multibox detector, in European Conference on Computer Vision (Springer, Berlin, 2016), pp. 21–37
14. M.A.A. Milton, Towards pedestrian detection using RetinaNet in ECCV 2018 wider pedestrian detection challenge. arXiv preprint arXiv:1902.01031 (2019)
15. J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: unified, real-time object detection. arXiv preprint arXiv:1506.02640 (2015)
16. J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: unified, real-time object detection, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 779–788
17. S. Ren, K. He, R. Girshick, J. Sun, Faster R-CNN: towards real-time object detection with region proposal networks, in Advances in Neural Information Processing Systems (2015), pp. 91–99
18. S. Safie, N.M.A.N. Azmi, R. Yusof, R. Yunus, M.F.Z.C. Sayuti, K.K. Fai, Object localization and detection for real-time automatic license plate detection (ALPR) system using RetinaNet algorithm, in Proceedings of SAI Intelligent Systems Conference (Springer, Berlin, 2019), pp. 760–768
19. S.M. Silva, C.R. Jung, Real-time Brazilian license plate detection and recognition using deep convolutional neural networks, in 2017 30th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI). IEEE (2017), pp. 55–62
20. K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
21. Y. Yuan, W. Zou, Y. Zhao, X. Wang, X. Hu, N. Komodakis, A robust and efficient approach to license plate detection. IEEE Trans. Image Process. 26(3), 1102–1114 (2016)
22. Y. Zhao, Y. Yuan, S. Bai, K. Liu, W. Fang, Voting-based license plate location, in 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC). IEEE (2011), pp. 314–317
23. D. Zheng, Y. Zhao, J. Wang, An efficient method of license plate location. Pattern Recogn. Lett. 26(15), 2431–2438 (2005)
24. W. Zhou, H. Li, Y. Lu, Q. Tian, Principal visual word discovery for automatic license plate detection. IEEE Trans. Image Process. 21(9), 4269–4279 (2012)
Decision Support System for Detection and Classification of Skin Cancer Using CNN

Rishu Garg1, Saumil Maheshwari2(B), and Anupam Shukla3

1 National Institute of Technology Raipur, Raipur, India
[email protected]
2 Atal Bihari Vajpayee-Indian Institute of Information Technology and Management, Gwalior, MP, India
[email protected]
3 Indian Institute of Information Technology, Pune, India
[email protected]
Abstract. Skin cancer is one of the deadliest of all cancers. It is likely to spread to other parts of the body if it is not diagnosed and treated at an early stage. It is mostly caused by the abnormal growth of skin cells and often develops when the skin is exposed to sunlight. Furthermore, the characterization of skin cancer at an early stage is a costly and challenging procedure. Skin cancer is classified by where it develops and by its cell type. High precision and recall are required for the classification of lesions. This paper uses the MNIST HAM-10,000 dataset containing dermoscopy images. The objective is to propose a system that detects skin cancer and classifies it into different classes by using a convolutional neural network. The diagnostic methodology uses image processing and a deep learning model. The dermoscopy images of skin cancer undergo various techniques to remove noise and improve picture resolution. The image count is also increased by using various image augmentation techniques. Finally, the transfer learning method is used to further increase the classification accuracy. Our CNN model gave a weighted average precision of 0.88, a weighted average recall of 0.74, and a weighted F1 score of 0.77. The transfer learning approach applied using the ResNet model yielded an accuracy of 90.51%.

Keywords: Skin cancer · Skin lesion · Deep learning · CNN
1 Introduction

More than a million skin cancer cases occurred globally in 2018 [1]. Skin cancer is one of the fastest-growing diseases in the world. It occurs mainly due to exposure to ultraviolet radiation emitted by the Sun. Considering the limited availability of resources, early detection of skin cancer is highly important. Accurate diagnosis and feasibility of detection are vital for any skin cancer prevention policy. Detecting skin cancer in its early phases is a challenge even for dermatologists.
In recent times, we have witnessed extensive use of deep learning in both supervised and unsupervised learning problems. One of these models is the convolutional neural network (CNN), which has outperformed all others for object recognition and classification tasks. CNNs eliminate the need for manually handcrafting features by learning highly discriminative features while being trained end to end in a supervised manner. Convolutional neural networks have recently been used for the identification of skin cancer lesions, and several CNN models have successfully outperformed trained human professionals in classifying skin cancers. Methods like transfer learning using large datasets have further improved the accuracy of these models.

VGG-16 is a convolutional neural network trained on more than a million images from the ImageNet database. The network is 16 layers deep and can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals. Accordingly, the network has learned rich feature representations for a wide range of images. The network has an image input size of 224 × 224. The model achieves 92.7% top-5 test accuracy on ImageNet, which is a dataset of over 14 million images belonging to 1000 classes.

In this paper, we have developed a CNN model that analyzes skin pigment lesions and classifies them using a publicly available dataset by employing various deep learning techniques. We have improved the classification accuracy by implementing CNN and transfer learning models, and we have tested our models using the publicly available HAM-10,000 dataset.
2 Literature Review

CNNs have been used frequently in the field of medical image processing, image classification, and so on [2]. CNNs have already shown inspiring outcomes in the domain of microscopic image classification, such as human epithelial type 2 cell image classification [3], diabetic retinopathy fundus image classification [4], cervical cell classification [5], and skin cancer detection [6–9].

Brinker et al. [10] presented the first systematic study on classifying skin lesion diseases. The authors specifically focus on the application of CNNs for the classification of skin cancer and also discuss the challenges that need to be addressed for the classification task. Han et al. [11] proposed a classifier for 12 different skin diseases based on clinical images. They developed a ResNet model that was fine-tuned with 19,398 training images from the Asan dataset, the MED-NODE dataset, and atlas site images. This research does not include patients across different age groups. The authors in [12] presented the first comparison of a CNN with an international group of 58 dermatologists for the classification of skin cancer. Most dermatologists were outperformed by the CNN, and the authors concluded that, irrespective of experience, physicians may benefit from assistance by a CNN's image classification. Google's Inception v4 CNN architecture was trained and validated using dermoscopic images and corresponding diagnoses. Marchetti et al. [13] performed a cross-sectional study using 100 randomly selected dermoscopic images (50 melanomas, 44 nevi, and 6 lentigines). This study was performed over an international computer vision melanoma challenge dataset (n = 379), and the authors developed a fusion of five methods for the classification purpose. In
[14], the authors trained a CNN-based classification model on 7895 dermoscopic and 5829 close-up images of lesions. These images were excised at a primary skin cancer clinic between January 1, 2008 and July 13, 2017. Further, the model was tested on a set of 2072 unknown cases and compared with the results of 95 human raters who were medical personnel.

Most existing research considers binary classification, i.e., whether a lesion is melanoma or not, and little work has been done on the classification of general images, with results that are not yet optimal. The existing algorithms used for the detection and classification of skin cancer employ machine learning and neural network techniques.
3 Methodology

This section describes the methodology adopted for the classification task. All steps of the methodology are shown in Fig. 1.
Fig. 1. Flowchart of the methodology implemented
3.1 Dataset Description

We used the MNIST HAM-10,000 skin cancer dataset, which is available on Kaggle [15, 16]. It contains 10,015 images of skin pigment lesions divided among seven classes. The number of images in the dataset is sufficient for different tasks, including image retrieval, segmentation, feature extraction, deep learning, and transfer learning.

3.2 Preprocessing

The dataset had to be cleaned and organized before being fed into the model. The data is highly imbalanced, with the lesion type 'melanocytic nevi' comprising more than fifty percent of the total dataset. We applied several preprocessing steps to enhance the learnability of the network. We performed data augmentation to avoid overfitting, creating several copies of the existing images by translating, rotating, and zooming them by various factors. In addition, we increased the contrast of the skin lesions using histogram equalization.
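As an illustrative sketch (not the authors' exact pipeline), the augmentation and histogram equalization described above can be wired together with Keras and OpenCV roughly as follows; the translation, rotation, and zoom factors are placeholders, since the paper does not report the exact values.

    import cv2
    import numpy as np
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    def equalize_contrast(img):
        """Histogram equalization on the luminance channel of an RGB lesion image."""
        img = img.astype(np.uint8)
        ycrcb = cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb)
        ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
        return cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2RGB).astype(np.float32) / 255.0

    # Shift, rotation, and zoom ranges below are illustrative assumptions.
    augmenter = ImageDataGenerator(
        rotation_range=30,
        width_shift_range=0.1,
        height_shift_range=0.1,
        zoom_range=0.2,
        horizontal_flip=True,
        preprocessing_function=equalize_contrast,
    )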
4 Method

Convolutional neural networks and transfer learning methods are used for the classification task. For transfer learning, deep learning models pre-trained on the ImageNet dataset were used; ImageNet consists of a little more than 14 million labeled images belonging to more than 20,000 classes. These pre-trained models are further trained on the HAM10000 dataset by adding some additional layers and freezing some of the initial layers. We also applied different learning algorithms, such as XGBoost, SVM, and random forest, to perform the classification task on the HAM10000 dataset in order to compare the results.

4.1 Convolution Neural Network

Convolutional neural networks were inspired by biological processes: the connectivity pattern between the neurons of a network resembles the organization of the animal visual cortex. The response of an individual cortical neuron in a restricted region of the visual field is known as its receptive field, and the receptive fields of different neurons partially overlap such that they cover the entire visual field. A CNN is assembled from three kinds of neural layers: convolutional, pooling, and fully connected.

Convolution layer: The main layer in a CNN is the convolutional layer. In this layer, the output is obtained by filtering the input under specific conditions. This layer is constructed from neurons arranged in cubical blocks.

Max-pooling layer: A pooling layer performs the next operation after each convolution layer. These layers are used to reduce the size of the feature maps. They are small rectangular grids that take a small portion of the convolutional layer output and filter it to give
a single result from that block. The most commonly used pooling layer is max pooling, which takes the maximum pixel value from the block.

Fully connected layers: The final layer of a convolutional neural network (CNN) is a fully connected layer formed by connecting all preceding neurons. It reduces the spatial information, as it is fully connected like an artificial neural network, and it contains the neurons from the input side through to the output neurons.

4.2 Transfer Learning

Transfer learning is a technique in which a pre-trained model is reused on another dataset. It is mainly used when there is not enough input data to properly train a model from scratch. In such cases, a model that has already been trained on a different, large dataset is used. Here, we used models pre-trained on the ImageNet dataset, which contains millions of images associated with 1000 classes. These models are appended with additional untrained layers and further trained on the HAM10000 dataset. The architecture of one of the models used, namely VGG16, is shown in Fig. 2.
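To make the three layer types of Sect. 4.1 concrete, the following is a minimal Keras sketch of a convolution, max-pooling, and fully connected stack; the filter counts, layer count, and head size are illustrative and do not reproduce the exact architecture of Table 1.

    from tensorflow.keras import layers, models

    # Illustrative only: a small convolution / max-pooling / fully connected stack
    # mirroring the layer types described in Sect. 4.1, not the exact Table 1 model.
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(512, 512, 3)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu"),      # fully connected layer
        layers.Dense(7, activation="softmax"),     # seven HAM10000 lesion classes
    ])
    model.summary()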
Fig. 2. Architecture of VGG16
5 Implementation

In this research, we trained a CNN network to classify the images of the MNIST HAM-10,000 dataset into their respective classes. The architecture of the CNN model used for classification is shown in Table 1. The Python programming language was used to program the model. Keras is the primary deep learning library used for computation, along with NumPy, and Matplotlib was used for plotting graphs. For fair splitting of the data, we incorporated stratified splitting into training and testing sets. To further tackle the class imbalance, we incorporated class weights, by which the misclassification of a minority class is penalized more heavily. Dropout was also used as a regularization technique to ensure better generalization of the model over the test data.

The transfer learning technique was also used to compare its accuracy with that of the proposed deep learning model. The pre-trained model is trained on the new dataset by freezing more than 70% of the layers in the VGG network; the last few layers are re-trained with the new dataset to mold the network to it, and the network is concluded by adding a few fully connected layers. We used transfer learning with models such as ResNet and VGG16 pre-trained on the ImageNet dataset. The Adam optimizer was used for optimization, and categorical cross-entropy was used as the loss function.
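A hedged sketch of the training setup described in this section is given below: VGG16 pre-trained on ImageNet with roughly 70% of its layers frozen, a small fully connected head, class weights for the imbalanced classes, and the Adam optimizer with categorical cross-entropy. The head sizes, learning rate, and placeholder labels are assumptions for illustration only.

    import numpy as np
    from tensorflow.keras import layers, models, optimizers
    from tensorflow.keras.applications import VGG16
    from sklearn.utils.class_weight import compute_class_weight

    base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    for layer in base.layers[: int(0.7 * len(base.layers))]:   # freeze ~70% of layers
        layer.trainable = False

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),                      # regularization, as described above
        layers.Dense(7, activation="softmax"),    # seven lesion classes
    ])
    model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])

    # Penalize misclassification of minority classes more heavily.
    y_train = np.random.randint(0, 7, size=1000)  # placeholder labels for illustration
    weights = compute_class_weight("balanced", classes=np.unique(y_train), y=y_train)
    class_weight = dict(enumerate(weights))
    # model.fit(train_generator, epochs=50, class_weight=class_weight, ...)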
Table 1. Architecture of CNN implemented

Layers | Output size | Kernel size
Input | 3 × 512 × 512 |
Convolution | 32 × 510 × 510 | 3 × 3
ReLu activation | 32 × 510 × 510 |
Convolution | 32 × 508 × 508 | 3 × 3
ReLu activation | 32 × 252 × 252 |
Convolution | 32 × 506 × 506 | 3 × 3
ReLu activation | 32 × 506 × 506 |
Max pooling | 32 × 253 × 253 |
Convolution | 64 × 251 × 251 | 3 × 3
ReLu activation | 64 × 251 × 251 |
Convolution | 256 × 57 × 57 | 3 × 3
ReLu activation | 256 × 57 × 57 |
Convolution | 256 × 53 × 53 | 5 × 5
ReLu activation | 256 × 53 × 53 |
Global pooling | 256 |
Dense | 4096 |
ReLu activation | 4096 |
Dense 2 | 5 |
Sigmoid activation | 5 |
We have also tested the dataset on various machine learning algorithms including random forest, XGBoost, and support vector machines.
6 Results

The models were trained using the balanced and resized images from the dataset. Kaggle kernels were used for training, testing, and validation of the models, and the models were trained for 50 epochs. We then calculated the confusion matrix, as shown in Fig. 3, and evaluated the models
using the overall classification accuracy. Our CNN model gave a weighted average precision of 0.88, a weighted average recall of 0.74, and a weighted F1 score of 0.77. Table 2 shows the results of the different transfer learning models, Table 3 shows the per-class performance of our CNN model, and Table 4 shows the averaged results of the developed CNN model. We also tested our dataset on other models, namely random forest, XGBoost, and support vector classifiers; however, we did not see promising results with these learning algorithms. Their results are displayed in Table 5.
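For reference, the weighted precision, recall, and F1 values reported here correspond to the "weighted avg" row of a scikit-learn classification report; the labels below are placeholders for illustration only.

    from sklearn.metrics import classification_report, confusion_matrix

    # Placeholder predictions; in practice these come from model.predict(...)
    # on the held-out test split.
    y_true = [0, 1, 2, 2, 1, 0, 2]
    y_pred = [0, 2, 2, 2, 1, 0, 1]

    print(confusion_matrix(y_true, y_pred))
    # The "weighted avg" row of this report gives the weighted
    # precision / recall / F1 values quoted above.
    print(classification_report(y_true, y_pred, digits=2))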
Fig. 3. Confusion matrix of ResNet model
7 Conclusion

The proposed method follows an approach in which the first step is feature extraction, and these features are then used to train and test the transfer learning model. Based on the
Table 2. Results summary of models

Sr. No | Model name | Accuracy (%)
1 | ResNet | 90.5
2 | VGG16 | 78

Table 3. Results summary of self-made models

Sr. No | Class of lesion | Precision | Recall | F1 score
1 | Actinic keratoses | 0.27 | 0.62 | 0.37
2 | Basal cell carcinoma | 0.45 | 0.73 | 0.56
3 | Benign keratosis | 1.00 | 0.09 | 0.17
4 | Dermatofibroma | 0.08 | 0.67 | 0.14
5 | Melanoma | 0.21 | 0.59 | 0.30
6 | Melanocytic nevi | 0.95 | 0.82 | 0.88
7 | Vascular skin lesions | 0.67 | 0.73 | 0.70

Table 4. Results summary of self-made CNN (average value)

Sr. No | Entity | Precision | Recall | F1 score
1 | Micro-average | 0.74 | 0.74 | 0.74
2 | Weighted average | 0.88 | 0.74 | 0.77

Table 5. Result summary of other machine learning models

Sr. No | Model | Accuracy (%)
1 | Random forest | 65.9
2 | XGBoost | 65.15
3 | Support vector classifier | 65.86
observation, we have concluded that the transfer learning mechanism can be applied to the HAM10000 dataset to increase the classification accuracy of skin cancer lesions. We have also found that the ResNet model pre-trained on the ImageNet dataset can be very helpful for the successful classification of cancer lesions in the HAM10000 dataset. We have further seen that learning algorithms such as random forest, XGBoost, and SVMs are not very effective for classification tasks on the HAM10000 dataset. Encouraged
by these outcomes, future work will focus on improving the prediction results and classification accuracy.
References

1. J. Ferlay et al., Estimating the global cancer incidence and mortality in 2018: GLOBOCAN sources and methods. Int. J. Cancer 144(8), 1941–1953 (2019)
2. G. Kasinathan et al., Automated 3-D lung tumor detection and classification by an active contour model and CNN classifier. Expert Syst. Appl. 134, 112–119 (2019)
3. Z. Gao et al., HEp-2 cell image classification with deep convolutional neural networks. IEEE J. Biomed. Health Inf. 21(2), 416–428 (2016)
4. P. Wang et al., Automatic cell nuclei segmentation and classification of cervical Pap smear images. Biomed. Signal Process. Control 48, 93–103 (2019)
5. S. Sharma, S. Maheshwari, A. Shukla, An intelligible deep convolution neural network based approach for classification of diabetic retinopathy. Bio-Algorith. Med-Syst. 14(2) (2018)
6. K.M. Hosny, M.A. Kassem, M.M. Foaud, Classification of skin lesions using transfer learning and augmentation with Alex-net. PLoS ONE 14(5), e0217293 (2019)
7. X. He et al., Dense deconvolution net: multi path fusion and dense deconvolution for high resolution skin lesion segmentation. Technol. Health Care 26(S1), 307–316 (2018)
8. B. Harangi, Skin lesion classification with ensembles of deep convolutional neural networks. J. Biomed. Inform. 86, 25–32 (2018)
9. T.J. Brinker et al., Deep learning outperformed 136 of 157 dermatologists in a head-to-head dermoscopic melanoma image classification task. Eur. J. Cancer 113, 47–54 (2019)
10. T.J. Brinker et al., Skin cancer classification using convolutional neural networks: systematic review. J. Med. Int. Res. 20(10), e11936 (2018)
11. S.S. Han et al., Classification of the clinical images for benign and malignant cutaneous tumors using a deep learning algorithm. J. Investig. Dermatol. 138(7), 1529–1538 (2018)
12. H.A. Haenssle et al., Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann. Oncol. 29(8), 1836–1842 (2018)
13. M.A. Marchetti et al., Results of the 2016 International Skin Imaging Collaboration International Symposium on Biomedical Imaging challenge: comparison of the accuracy of computer algorithms to dermatologists for the diagnosis of melanoma from dermoscopic images. J. Am. Acad. Dermatol. 78(2), 270–277 (2018)
14. P. Tschandl et al., Expert-level diagnosis of nonpigmented skin cancer by combined convolutional neural networks. JAMA Dermatology 155(1), 58–65 (2019)
15. N. Codella, V. Rotemberg, P. Tschandl, M.E. Celebi, S. Dusza, D. Gutman, B. Helba, A. Kalloo, K. Liopyris, M. Marchetti, H. Kittler, A. Halpern, Skin Lesion Analysis Toward Melanoma Detection 2018: a challenge hosted by the International Skin Imaging Collaboration (ISIC) (2018). https://arxiv.org/abs/1902.03368
16. P. Tschandl, C. Rosendahl, H. Kittler, The HAM10000 dataset, a large collection of multisource dermatoscopic images of common pigmented skin lesions. Sci. Data 5, 180161 (2018). https://doi.org/10.1038/sdata.2018.161
An Attribute-Based Break-Glass Access Control Framework for Medical Emergencies

Vidyadhar Aski(B), Vijaypal Singh Dhaka, and Anubha Parashar

School of Computing and Information Technology, Manipal University, Jaipur, India
{vidyadharjinnappa.aski,anubha.parashar}@jaipur.manipal.edu, [email protected]
Abstract. IoT-enabled medical services are gaining major ground in modern lifestyle management owing to rapid advancements in the design and development of healthcare devices. The growing need for connected environments for the effective treatment of chronic disorders further adds to the relevance of modern healthcare services. IoT environments in the healthcare paradigm consist mainly of a data acquisition system that captures data from the patient's body and of data aggregators that move the physical-layer data into the cloud layer for analysis. This data is made accessible to medical professionals in the cloud space for further analysis and disease prediction, and it needs to be protected from unauthorized access by suitable encryption schemes. At the same time, it is critically important to offer data and device services in a timely manner during medical emergencies. In this view, the authors designed a lightweight access control framework that enables users to access encrypted data and devices in two modes: attribute-oriented access and emergency break-glass access. Normally, a policy-based, multilayered authentication technique is used to encrypt the data, and a medical professional who satisfies the policy guidelines can decrypt it through a set of attributes. During emergency situations, however, break-glass access is needed to bypass the access policies and stringent security walls so that rescue workers can access the data in time. The proposed framework is lightweight and fast, since only minimal computation is required at the device and storage level, resulting in low computational and latency complexity. The proposed model is proved secure against standard security models, and its efficiency is demonstrated through numerous experiments.

Keywords: Access control mechanism · Internet of things (IoT) devices · Healthcare · Multilayer authentication · Break-glass
1 Introduction

The linear growth of smart instruments and embedded technologies has been creating a phenomenal impact on healthcare services in terms of real-time monitoring [1–9], predictive analysis, and management of chronic health issues. Normalizing the major issues
of any healthcare device, such as wearability and power durability, has been a focus of interest for many researchers in the recent past, and the community is gradually addressing these issues. Numerous on-demand innovations for deploying energy-efficient healthcare devices in daily life to improve the quality of life have been observed. Applications of IoT in healthcare services pervade different verticals such as smart cities, old-age life management, assisted living, smart rehabilitation, and resource utilization. IoT adds value to the ecosystem by providing connected environments, creating a cyberspace platform that is increasingly filled with heterogeneous data collected from countless devices every day. A recent survey anticipates that every year millions of new devices are registered in various sectors to collect data for analysis. Therefore, IoT has gained the interest of many stakeholders from the research and development community, industry, and academia, enabling the development of industrial applications at different levels of the organizational hierarchy.

The IoT in the healthcare sector can be thought of as a growing network of human and wearable sensor devices that gain visibility through an IP address on the Internet. Automation of data gathering is feasible in nearly all aspects of healthcare services, since IoT features an advanced set of wireless connectivity components [6]. One of the main objectives of IoT-enabled health services is to provide end users with a network identity on the World Wide Web. This enables users to access numerous on-demand services in a timely manner and helps patients maintain their quality of life through periodic assessments and regular prescription follow-ups [7]. Further, it adds telemedicine services for subscribed users with the help of remote assistance. For instance, a healthcare professional can access all enrolled patient records stored in the medical server and dispense prescription details to the respective patients based on their conditions. Geographical network expansion still has many limitations in several developing countries, where Internet access is restricted due to evolving infrastructure and poor network reachability; in these places, remote medical services can be accelerated by distributing suitable IoT and mobility services [8]. Consequently, IoT-enabled healthcare services are likely to reduce patient wait time and the cost of treatment while improving the user experience. From the service provider's view, device downtime can be slashed through remote maintenance of such devices via cloud APIs. Overall, IoT-enabled healthcare services offer cutting-edge solutions to many existing medical issues and improve the quality of life.

Nevertheless, IoT builds open-ended solutions on wireless communication technologies (WCTs) deployed over the conventional Internet architecture, due to which there is a high chance of data and service leakage through loopholes in security models. Since the whole IoT architecture is connected through worldwide networks without a dedicated channel or medium, the healthcare sector might become a target for malicious hackers.
The data collected from individual devices through WBANs is vulnerable to third-party demodulation schemes [9, 10]. In addition, unauthorized
access to healthcare devices would create irreversible security breaches in healthcare services, with consequent deaths in the worst case. Hence, the misuse and security issues of healthcare devices may negatively impact the diversification of IoT use cases in the healthcare industry. To prevent such misuse of healthcare devices and data, many access control mechanisms are already in place to provide security support for data and devices. Access control can be achieved in many ways, including role-based and attribute-based authentication. The aim of these authentication protocols is to validate and verify the legitimacy of a user based on predefined policies and roles within the system. However, the complexity of multilayered authentication and access control algorithms is significantly high due to the recurrent execution of processes within the main process. Therefore, handling emergency situations with these algorithms is as difficult as breaking a multilayered security model. For instance, a patient suffering from cardiovascular disease who uses an IoT device for regular monitoring of bio-parameters such as heart rate and ECG may undergo a sudden cardiac arrest, lose consciousness, and be unable to grant access rights to the device or to on-the-spot rescuers. This may create a significant delay in hospitalization and could even lead to the death of the patient. In such situations, it is critical to gain data/device access and work towards the first-aid rescue of the patient [8]. In this regard, the authors present a break-glass access control framework suitable for issuing quick access rights to emergency situation handlers so that consequent casualties can be avoided. Here, break-glass refers to the fact that this algorithm executes only during emergency situations and transfers access rights to someone registered as an emergency situation handler, bypassing the policies of the normal access control scheme. Because the break-glass mechanism overrides the access control policy, it must not be usable with wrong intentions; thus, the authority transfer is done in a controlled environment.
2 Existing Work and Motivation

Access control in IoT-enabled ubiquitous medical services has been a key point of discussion for many researchers since its inception. Disruptive technologies such as blockchain computing and big data analytics, in the light of IoT, have been key contributors in taking healthcare services to the next level, and several researchers have recently produced high-quality work on securing IoT devices and services and on optimizing their space and energy constraints. In this section, the authors discuss a few such research works.

Recently, several researchers [9–12] proposed attribute-based access control (ABAC) schemes for securely managing data outsourcing applications such as electronic health records (EHR), with enforcement of access control policies and consumer revocation capabilities. Most of them realized revocation through the revocation of key attributes. In addition, another ABE scheme was proposed by Narayan et al. [13] for EHRs, in which the patient's health data was directly encrypted with a single revocation; however, the ciphertext length grew linearly with the number of non-revoked consumers. Numerous common deficiencies were observed in the aforementioned state-of-the-art, such as the use of a single trusted authority (TA) at
the time of encryption. Such a TA not only causes a bottleneck problem but also creates key exposure problems, since a single TA has an overview of all the encrypted health records and hence acts as a single door for exploiting the secured data. In addition, delegating attribute management tasks to a single TA is not practical due to its openness to vulnerabilities. Further, several current research works do not specify the attribute dimensions for public and private domains separately, even though these domains differ in organizational size and key management strategies. It was also observed that the above-mentioned ABE schemes do not provide any mechanism for dealing with emergency situations in healthcare scenarios. Later, many researchers started including a break-glass mechanism in their ABE schemes in order to address emergencies in the healthcare domain. In 2011, Marinovic et al. [14] proposed an ABE scheme integrated with break-glass access control, termed Rumpole, supported by a declarative query language for specifying the emergency break-glass decision rather than using an implicit predefined decision. Further, attribute-based access control mechanisms embedded with a break-glass scheme were proposed by Ming et al. [15] for encrypting patient health records (PHR). In this scheme, key distribution complexity is reduced by dividing the entire system into multiple security blocks, where each block manages only the users related to it. However, many of these break-glass control schemes provide only identity-based encryption rather than fine-grained access control over the shared ciphertext.

2.1 Limitations and Emergency Constraints Enforcement Strategies in ABE

Although a few researchers, as discussed above, have focused on break-glass decryption schemes in the access control context, minimal attention has been paid to defining emergency constraints, and such schemes are likely to be misinterpreted at various levels. Though various researchers have created role-based and attribute-based encryption schemes [3–12] ensuring security for EHRs stored on a central server, these schemes do not produce enough evidence of protecting data stored in distributed environments. In addition, defining the emergency constraints when designing break-glass decryption strategies plays a vital role in transferring authorization during emergencies. The lack of a systematic authorization scheme for IoT-based medical devices can cause life-threatening adversarial situations and may lead to death.
3 Proposed Architecture and Experimental Setup

The proposed system architecture mainly comprises a group of Medical Service Providers (MSP), the Cloud Infrastructure (CI), Medical Service Consumers (MSC), and Emergency Situation Handlers (ESH). Each of these constituents interacts with the others according to its responsibilities. Figure 1 shows the overall system architecture and the corresponding interactions.
Fig. 1. Overall architecture of the proposed framework
A key generator is used to create a master secret key pair for the entire scheme, and this key is attributed to the service and data users. The CI offers abundant storage and computation space to the consumers and related users. MSPs are the elementary entities of the proposed scheme and provide numerous services to their patients, while an MSC consumes the services offered by an MSP. Figure 2 illustrates the generation of the break-glass encrypted key while verifying the user and the emergency situation in order to avoid serving false alarms. The patient needs to set a password pw for viewing health data. Figure 3 gives a procedural, algorithmic representation of the proposed scheme. Initially, the patient shares the password (pw) for accessing his device and cloud data with the registered ESHs. ESH verification is done at the outer layer of the BG.KeyExt procedure: the peripheral function takes the ESH user credentials (cr1, cr2), such as email and resource access password, and once the credentials are verified against the CI database, the legitimacy of the ESH user is established. This step ensures that the proposed security scheme cannot be misused. To protect pw, the ESH user feeds pw into the peripheral function and generates an encrypted password called the emergency session key (ESK), as shown in Fig. 3. For attribute-based access, the users employ different attributes such as the session key, user ID, and date of birth. The authors adopt the attribute-based encryption–decryption scheme from [16] in order to use the available time effectively and to emphasize the design of the break-glass access. Once the ESH obtains the encrypted password EmSesKey, it is passed to the inner function fun2 along with pw, as shown in Fig. 3. The ESH user is enabled to access the data and device after the occurrence of the emergency situation
Fig. 2. Generation and extraction of keys a user authentication and generation of emergency session key (ESK) during emergency, b accessing hospital resources by user, c ESH care
is validated implicitly. Thereafter, the ESH user has access to the decrypted data and the device.
4 Result and Analysis

Figure 4 illustrates the comparison of computational complexities of various attribute-based encryption schemes and the proposed scheme. The authors demonstrate the results in an inclusive architectural manner comprising two-way authentication (2-way auth.), PRAKeyGen [16], peripheral authentication (peripheral auth.), and break-glass key extraction. All of these were compared in terms of execution time, and the proposed scheme showed better efficiency. When the number of resources is high (say, more than 25), multilayer authentication schemes take more time to execute their tasks. The proposed scheme employs a lightweight encryption technique and can be utilized in frameworks of resource-constrained devices.
5 Conclusion

In this article, the authors proposed an attribute-oriented, lightweight, secured access control scheme with break-glass capabilities for medical emergencies. The scheme can be used to decrypt the data and device in normal mode with a set of attributes, as well
Password-processed Break-Glass Key Extraction: BG.KeyExt

fun1 Peripheral_Login.Service(cr1, cr2):
    if not authorized in Peripheral_Login.Service then
        return Error "Invalid ESH"
    else
        return EmSesKey

fun2 BG.KeyExt(Pw, EmSesKey):
    Input: pre-shared patient password (Pw), Emergency Session Key (EmSesKey)
    Output: decrypted Electronic Health Record (Dec.EHR)
    if EmSesKey in False_Alarm_Entity then
        return Error "This is a false Emergency!"
    else
        Dec.EHR <- new Dec.EHR.Instance()
        Dec.EHR.add(Pw, EmSesKey)
        Role <- getRole(uID, DoB)
        ESH_Policies <- get.ESH.Policy(Role)
        for ESH_policy in ESH_Policies do
            for access_right in ESH_policy do
                // provide access rights for cloud EHR data
                // provide access to the medical device (MD)
            end
        end
        return Auth(Bg.Key)
    End
Fig. 3. Pseudo-code illustration of overall scheme
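The paper does not name the concrete cryptographic primitives behind fun1 and fun2, so the following Python sketch only illustrates the two-step flow of Fig. 3 using standard-library primitives (SHA-256/PBKDF2); the credential store, salt handling, and key combination are assumptions for illustration, not the authors' construction.

    import hashlib
    import hmac
    import os

    FALSE_ALARMS = set()   # hypothetical registry of revoked / false-alarm session keys
    ESH_CREDENTIALS = {"esh_user@example.org": hashlib.sha256(b"resource-pass").digest()}

    def peripheral_login(cr1: str, cr2: str, pre_shared_pw: str) -> bytes:
        """fun1: verify the ESH user, then return an emergency session key (ESK)."""
        stored = ESH_CREDENTIALS.get(cr1)
        candidate = hashlib.sha256(cr2.encode()).digest()
        if stored is None or not hmac.compare_digest(stored, candidate):
            raise PermissionError("Invalid ESH")
        salt = os.urandom(16)
        return hashlib.pbkdf2_hmac("sha256", pre_shared_pw.encode(), salt, 100_000)

    def bg_key_ext(pw: str, esk: bytes) -> bytes:
        """fun2: validate the emergency and derive the break-glass key from Pw and ESK."""
        if esk in FALSE_ALARMS:
            raise PermissionError("This is a false Emergency!")
        bg_key = hashlib.sha256(pw.encode() + esk).digest()
        # In the full scheme this key would decrypt the EHR ciphertext and unlock
        # the medical device according to the registered ESH role policies.
        return bg_key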
Fig. 4. Comparison of computational complexities of various access control schemes
as in emergency mode. In emergency mode, the patient pre-distributes the password to the set of registered ESHs, and the same password can be used as an attribute to extract the break-glass key. Once an ESH is successfully verified, he or she is able to decrypt the EHR data from the CI and also receives the access rights to operate on the healthcare data. On the basis of the decisional bilinear Diffie–Hellman (DBDH) assumption, the medical data is shown to be resistant against chosen-plaintext attacks. Thus, the proposed model is suitable for deployment in IoT healthcare networks.
References

1. P. Verma, S.K. Sood, Fog assisted-IoT enabled patient health monitoring in smart homes. IEEE IoT J. 5(3), 1789–1796 (2018)
2. X. Xu, F. Shucun, L. Qi, X. Zhang, Q. Liu, Q. He, S. Li, An IoT-oriented data placement method with privacy preservation in cloud environment. J. Netw. Comput. Appl. 124, 148–157 (2018)
3. Finding success in the new IoT ecosystem: market to reach $3.04 trillion and 30 billion connected things in 2020, The IDC (2014). Available http://www.idc.com/getdoc.jsp?containerId=prUS25237214. Accessed on 07 Nov 2014
4. P. Gonizzi, G. Ferrari, V. Gay, J. Leguay, Data dissemination scheme for distributed storage for IoT observation systems at large scale. Inf. Fusion 22, 16–25 (2015)
5. M.A. Khan, K. Salah, IoT security: review, blockchain solutions, and open challenges. Future Gener. Comput. Syst. 82, 395–411 (2018)
6. M. Khera, Think like a hacker: insights on the latest attack vectors (and security controls) for medical device applications. J. Diab. Sci. Technol. 11(2), 207–212 (2017)
7. J. Sametinger, J.W. Rozenblit, R.L. Lysecky, P. Ott, Security challenges for medical devices. Commun. ACM 58(4), 74–82 (2015)
8. V. Aski, S. Raghavendra, A.K. Sharma, An efficient remote disaster management technique using IoT for expeditious medical supplies to affected area: an architectural study and implementation, in XVIII International Conference on Data Science and Intelligent Analysis of Information (Springer, Cham, 2018), pp. 156–166
9. J. Hur, D.K. Noh, Attribute-based access control with efficient revocation in data outsourcing systems. IEEE Trans. Parallel Distrib. Syst. 22(7), 1214–1221 (2010)
10. J. Bethencourt, A. Sahai, B. Waters, Ciphertext-policy attribute-based encryption, in 2007 IEEE Symposium on Security and Privacy (SP '07) (IEEE, 2007), pp. 321–334
11. R. Ostrovsky, A. Sahai, B. Waters, Attribute-based encryption with non-monotonic access structures, in Proceedings of the 14th ACM Conference on Computer and Communications Security (ACM, 2007), pp. 195–203
12. A. Boldyreva, V. Goyal, V. Kumar, Identity-based encryption with efficient revocation, in Proceedings of the 15th ACM Conference on Computer and Communications Security (ACM, 2008), pp. 417–426
13. S. Narayan, M. Gagné, R. Safavi-Naini, Privacy preserving EHR system using attribute-based infrastructure, in Proceedings of the 2010 ACM Workshop on Cloud Computing Security Workshop (ACM, 2010), pp. 47–52
14. S. Marinovic, R. Craven, J. Ma, N. Dulay, Rumpole: a flexible break-glass access control model, in Proceedings of the 16th ACM Symposium on Access Control Models and Technologies (ACM, 2011), pp. 73–82
15. M. Li, S.Y. Ming, K. Ren, W. Lou, Securing personal health records in cloud computing: patient-centric and fine-grained data access control in multi-owner settings, in International Conference on Security and Privacy in Communication (Springer, 2010), pp. 89–106
16. V.J. Aski, S. Gupta, B. Sarkar, An authentication-centric multi-layered security model for data security in IoT-enabled biomedical applications, in IEEE 8th Global Conference on Consumer Electronics (GCCE 2019) (IEEE Consumer Electronics Society, Osaka, Japan, 15–18th Oct 2019) (Accepted, in Press)
Faster and Secured Web Services Communication Using Modified IDEA and Custom-Level Security Jitender Tanwar1,2(B) , Sanjay Kumar Sharma3 , and Mandeep Mittal4 1 Assistant Professor, Amity School of Engineering & Technology, Amity University, Noida,
India [email protected] 2 Research Scholar, Department of Computer Science, Banasthali Vidyapith, Vanasthali, Rajasthan, India 3 Department of Computer Science, Banasthali Vidyapith, Vanasthali, Rajasthan, India [email protected] 4 Department of Mathematics, Institute of Applied Sciences, Amity University, Noida, India [email protected]
Abstract. Web services are the main components of business communication these days. They provide communication between client and server through XML tags. The increasing number of vulnerabilities in XML tags demands more secure and efficient techniques, and the security and efficiency of communication provided through XML tags are a primary concern of researchers. The International Data Encryption Algorithm (IDEA) is considered a secure algorithm and is used for the encryption and decryption of XML text. The main concept behind the IDEA algorithm is to utilize three diverse mathematical operations: multiplication modulo, bitwise Exclusive OR, and addition modulo; the efficiency of the IDEA algorithm largely depends on these three operations. The aim of this research is to design a more efficient yet still secure encryption and decryption technique. We have modified the multiplication modulo operation used in the IDEA algorithm to make it more efficient, and we use custom-level encryption and decryption methods to provide security. The modified algorithm is implemented and tested on a dynamic train seat allocation system developed using Web services. Our experimental results show that the customized IDEA algorithm is 175.63% faster for encryption and 184.16% faster for decryption than the original IDEA algorithm while remaining secure.

Keywords: IDEA · Customized IDEA · Algebraic operations · XML · Web services
1 Introduction

Web services are software products available over the Web. They are utilized to provide real-time data, and their utility is tremendous because of language and platform
independence. Web services follow the client–server architecture for client–server communication, as shown in Fig. 1, where the client sends a request over the Web while the server acknowledges and processes it and prepares a return message according to the request. Client and server are abstracted from each other: the client does not know how the server processes a request, and vice versa. Web services need a mechanism for the transmission and format of input/output messages. The Internet (HTTP) works as the medium, and the Simple Object Access Protocol (SOAP) provides the format for Web services [1].
Fig. 1. Client–server architecture of Web services
Web service communication relies on six standard protocols given by the World Wide Web Consortium. These protocols are WSDL, UDDI, SKELETON, STUB, SOAP, and HTTP, as shown in Fig. 2, and they cooperate with each other to make the communication feasible. Web Service Description Language (WSDL), Universal Description, Discovery, and Integration (UDDI), and Simple Object Access Protocol (SOAP) are the XML-based protocols [1, 2]. XML does not have a built-in security framework, so encryption is prescribed for sensitive XML information. The IDEA algorithm is one of the symmetric encryption algorithms used to encrypt XML data. The basic concept of the algorithm is to use three algebraic operations, namely multiplication modulo, bitwise exclusive OR, and addition modulo; the efficiency of the IDEA algorithm is largely dependent on these three operations [3]. The proposed scheme is a modified version of the IDEA algorithm combined with a custom-level encryption technique. The modification replaces the multiplication modulo with a modified addition modulo or bitwise exclusive OR operation, because multiplication is a costly operation compared to addition modulo and bitwise exclusive OR. Furthermore, instead of encrypting the whole XML data, we use the custom-level encryption technique, as illustrated below, to provide efficient security (Figs. 3, 4, 5, 6, 7, and 8).
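As a rough illustration of the custom-level idea, the following Python sketch shows how a per-tag security-level attribute could drive selective encryption of an XML document. The tag names, the security_level attribute, and the encrypt_custom_idea placeholder are hypothetical and are not taken from the paper; they only show where the decision "encrypt this tag or not" would sit.

import xml.etree.ElementTree as ET

def encrypt_custom_idea(plaintext: str, key: bytes, level: int) -> str:
    # Placeholder for the customized IDEA cipher; in the real scheme the
    # user-defined security level would bound the number of rounds executed.
    return "ENC[" + plaintext[::-1] + f"|L{level}]"  # dummy transform only

def protect_document(xml_text: str, key: bytes) -> str:
    # Walk the document and encrypt only tags whose (hypothetical)
    # 'security_level' attribute is greater than zero.
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        level = int(elem.get("security_level", "0"))
        if level > 0 and elem.text:
            elem.text = encrypt_custom_idea(elem.text, key, level)
    return ET.tostring(root, encoding="unicode")

doc = """<booking>
  <train security_level="0">12345</train>
  <passenger security_level="3">Jane Doe</passenger>
</booking>"""
print(protect_document(doc, key=b"\x00" * 16))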
2 Existing System

The symmetric-key IDEA algorithm was introduced by Lai, Massey, and Murphy in 1991. It evolved from the Proposed Encryption Standard (PES), which had also been proposed by Lai and Massey earlier. PES was intended as a substitute for DES, and IDEA was initially named Improved Proposed Encryption Standard (IPES). In 1992, the name was changed to
Fig. 2. Components of Web services communication
Fig. 3. Structure of IDEA encryption process
Fig. 4. Structure of CUSTOM-IDEA encryption process
Fig. 5. Execution time graph (execution time in seconds for encryption and decryption, IDEA vs. Custom-IDEA)
IDEA. The IDEA algorithm encrypts a 64-bit plain text block into a 64-bit cipher text block. A 128-bit key controls the algorithm. IDEA comprises eight similar rounds plus a half-round for the output transformation. The fundamental concept of IDEA is the blending of three incompatible algebraic operations: addition modulo 2^16, bit-by-bit XOR, and multiplication modulo 2^16 + 1. A total of 2^16 possible 16-bit blocks, from 0000000000000000 to 1111111111111111, are handled by the IDEA algorithm [1]. The algorithm is broken into eight rounds. The plain text is partitioned into four 16-bit blocks named X1, X2, X3, and X4. The key size is 128 bits. The key is partitioned into eight 16-bit sub-keys because the
Fig. 6. Efficiency graph (efficiency in percent for encryption and decryption, IDEA vs. Custom-IDEA)
Fig. 7. Comparison of execution time during encryption
majority of the arithmetical operations used in the encryption and decryption procedures work on 16-bit data. The last round is the output transformation round, which uses four 16-bit sub-keys. Out of the eight sub-keys, six are used by each round of encryption/decryption, and the remaining two are utilized in the subsequent round after a left shift by 25 positions. The total number of sub-keys is 52: 48 (8 rounds × 6 sub-keys) are used for the first eight rounds, and four sub-keys are used for the output transformation [3, 4] (Table 1). The following steps are used for the encryption procedure in each round:

Step 1. Multiplication of X1 and Z1.
Step 2. Addition of X2 and Z2.
Step 3. Addition of X3 and Z3.
Step 4. Multiplication of X4 and Z4.
Step 5. Bitwise XOR of the outputs of Steps 1 and 3.
Step 6. Bitwise XOR of the outputs of Steps 2 and 4.
Step 7. Multiplication of the Step 5 result with Z5.
Step 8. Addition of the Step 6 and Step 7 results.
Step 9. Multiplication of the Step 8 result and Z6.
Step 10. Addition of the Step 9 and Step 7 results.
Step 11. Bitwise XOR of the results of Step 1 and Step 9.
Step 12. Bitwise XOR of the results of Step 3 and Step 9.
Step 13. Bitwise XOR of the results of Step 2 and Step 10.
Step 14. Bitwise XOR of the results of Step 4 and Step 10.

This is the sequence of steps performed in the first round, and the same sequence is performed in all eight rounds. Decryption is the reverse procedure of encryption and converts the cipher text back to the original plain message; the only difference between the two procedures is that the 16-bit sub-keys are generated in the reverse direction.

Fig. 8. Comparison of execution time during decryption

Table 1. Architectural difference between IDEA and customized IDEA

Features                      | IDEA algorithm              | Customized IDEA algorithm
Operators used                | Multiply, addition, and XOR | Modified addition, addition, and XOR
Number of levels in algorithm | 8                           | Customized (1–8)
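The round structure can be made concrete with a short sketch. The following Python fragment is a minimal illustration of the 14 steps listed above, assuming the standard interpretation of IDEA's operations (addition modulo 2^16 and multiplication modulo 2^16 + 1, with the all-zero block standing for 2^16); it is not the authors' implementation.

MOD_ADD = 1 << 16          # 2^16 for addition modulo
MOD_MUL = (1 << 16) + 1    # 2^16 + 1 for multiplication modulo

def mul(a, b):
    # Multiplication modulo 2^16 + 1, where the all-zero block represents 2^16.
    a = MOD_ADD if a == 0 else a
    b = MOD_ADD if b == 0 else b
    r = (a * b) % MOD_MUL
    return 0 if r == MOD_ADD else r

def add(a, b):
    # Addition modulo 2^16.
    return (a + b) % MOD_ADD

def idea_round(X1, X2, X3, X4, Z):
    # Z holds the six 16-bit sub-keys Z1..Z6 for this round.
    s1 = mul(X1, Z[0])        # Step 1
    s2 = add(X2, Z[1])        # Step 2
    s3 = add(X3, Z[2])        # Step 3
    s4 = mul(X4, Z[3])        # Step 4
    s5 = s1 ^ s3              # Step 5
    s6 = s2 ^ s4              # Step 6
    s7 = mul(s5, Z[4])        # Step 7
    s8 = add(s6, s7)          # Step 8
    s9 = mul(s8, Z[5])        # Step 9
    s10 = add(s9, s7)         # Step 10
    return (s1 ^ s9,          # Step 11
            s3 ^ s9,          # Step 12
            s2 ^ s10,         # Step 13
            s4 ^ s10)         # Step 14

print([hex(v) for v in idea_round(0x0123, 0x4567, 0x89AB, 0xCDEF, [1, 2, 3, 4, 5, 6])])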
3 Proposed System

The disadvantage of IDEA is that it spends a considerable amount of time in encryption/decryption, which makes the communication procedure slower [4]. Furthermore, the repeated multiplicative and additive operations within a single round increase the complexity of the algorithm. Another drawback of IDEA is its large set of weak keys; if the key size is increased from 128 bits to 512 bits to address this, the complexity of the algorithm increases again [3, 4]. More importantly, new vulnerabilities against reduced-round IDEA (up to round 6) continue to be detected. So we need an approach that is faster as well as secure. This paper describes the alterations made to the IDEA algorithm, combined with a custom-level strategy to make it more secure and fast. Encrypting the whole XML data drastically degrades the performance of the algorithm. We have proposed a system where strong encryption is applied only to very important XML information, because not every XML tag needs strong security [2]. We allow independent clients to set the security of individual tags, and the encryption of XML tags is performed depending on the security level set by the client.

In the proposed scheme, the multiplication modulo operation is replaced by the modified addition operation. The three operations are given below:

Modified addition = Plain text + Key / Plain text + Key constant
Addition = Plain text + Key / Plain text
XOR operation = Plain text XOR Key

The modified concept is named customized IDEA. The sub-keys in the customized IDEA are of the same size as in the original IDEA. The block size is 16 bits because the arithmetic operations work on 16-bit numbers. The modification is done by replacing the multiplication modulo with addition modulo or bitwise exclusive OR, because multiplication modulo is an expensive operation compared with addition modulo and bitwise exclusive OR. In customized IDEA, the encryption procedure also comprises eight rounds and one output transformation round. A 128-bit key and a 64-bit plaintext block are used, and the plaintext is divided into four equal 16-bit blocks. Six sub-keys are used in each round, and the remaining two sub-keys are utilized in the subsequent round after a left shift by 25 positions.

This system is designed specifically to improve the efficiency of the IDEA algorithm. To evaluate the results, we used the modified algorithm with the dynamic train seat allocation application developed using Web services [7–9]. The customized IDEA algorithm is used to encrypt user data and protect it from hackers and malicious users. Non-confidential data is stored/transferred as plaintext, and confidential data is stored/transferred in encrypted format [5, 6]. XML data require a different level of security for each tag, and this security level is set by the user before the data are transferred. The encryption is performed based on the security-level attribute, and the number of rounds covered by customized IDEA depends on the security-level value.
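To make the substitution concrete, the sketch below replaces the multiplication modulo of the round with the modified addition described above and lets a user-supplied security level bound the number of rounds. The key-constant value and the round-count mapping are illustrative assumptions, not the authors' exact parameters.

MOD = 1 << 16              # all operations work on 16-bit blocks
KEY_CONSTANT = 0x9E37      # hypothetical 16-bit constant for the modified addition

def modified_add(block, subkey):
    # Modified addition: plain text + key + key constant, reduced modulo 2^16.
    return (block + subkey + KEY_CONSTANT) % MOD

def add(block, subkey):
    # Ordinary addition modulo 2^16.
    return (block + subkey) % MOD

def custom_round(X1, X2, X3, X4, Z):
    # Same data flow as the IDEA round, with multiplication modulo replaced
    # by the cheaper modified addition (Steps 1, 4, 7, and 9).
    s1 = modified_add(X1, Z[0])
    s2 = add(X2, Z[1])
    s3 = add(X3, Z[2])
    s4 = modified_add(X4, Z[3])
    s5, s6 = s1 ^ s3, s2 ^ s4
    s7 = modified_add(s5, Z[4])
    s8 = add(s6, s7)
    s9 = modified_add(s8, Z[5])
    s10 = add(s9, s7)
    return (s1 ^ s9, s3 ^ s9, s2 ^ s10, s4 ^ s10)

def rounds_for_level(security_level):
    # Hypothetical mapping: the user-defined security level (1-8) directly
    # selects how many of the eight rounds are executed for that tag.
    return max(1, min(8, security_level))

out = custom_round(0x0123, 0x4567, 0x89AB, 0xCDEF, [1, 2, 3, 4, 5, 6])
print([hex(v) for v in out], rounds_for_level(3))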
4 Result

We tested the two algorithms on a Google Compute Engine virtual machine with a 64-bit operating system, an x64-based processor, a 2.00 GHz CPU, and 7.50 GB of RAM, using 1000 files of 780 MB; each file contains approximately 28 million lines, or approximately 2.1 million XML messages. We found that the IDEA algorithm takes on average 17.389 s for encryption and 17.028 s for decryption, while the modified IDEA takes on average 9.442 s for encryption and 9.695 s for decryption. The modified IDEA encryption is 175.63% and decryption is 184.16% quicker than the IDEA values.

$$
\text{Efficiency} = \frac{\text{Total execution time of IDEA algorithm}}{\text{Total execution time of Custom-IDEA algorithm}} \times 100
$$
5 Conclusion

Web services communicate with each other using XML messages, so the efficiency and security of XML messages directly affect the communication between Web services. In our proposed system, we have strengthened the security and efficiency of the XML messages through two main measures. One is the use of light operators (addition modulo instead of multiplication modulo) that perform fast computations and make the algorithm efficient; the other is the user-defined security level attached to each tag, which avoids running unnecessary rounds. After the empirical comparison between IDEA and the customized IDEA, we can conclude that on one hand the transmission remains secure, and on the other hand the efficiency has increased.
References

1. E. Cerami, Web Services Essentials: Distributed Applications with XML-RPC, SOAP, UDDI & WSDL (O'Reilly, 2002). ISBN: 0-596-00224-6
2. J. Tanwar, S.K. Sharma, M. Mittal, Secure framework for web services communication, in 2018 International Conference on Automation and Computational Engineering (Amity University Greater Noida Campus, U.P., India, 2018), pp. 187–190
3. O. Almasri, H.M. Jani, Introducing an encryption algorithm based on IDEA. Int. J. Sci. Res. (IJSR), India 2(9), 334–339 (2013). ISSN: 2319-7064
4. H.K. Sahu, V. Jadhav, S. Sonavane, R.K. Sharma, Cryptanalytic attacks on international data encryption algorithm block cipher. Def. Sci. J. 66, 582–589 (2016)
5. V. Sankar, G. Zayaraz, Securing confidential data in XML using custom level encryption, in International Conference on Computation of Power, Energy Information and Communication (ICCPEIC) (2016), pp. 227–230
6. A.W. Mohomed, A.M. Zeki, Web services SOAP optimization techniques, in 4th IEEE International Conference on Engineering Technologies and Applied Science (Salmabad, Bahrain, 2017)
7. Firebase Realtime Database. https://firebase.google.com/docs/database/. Accessed on 04 Aug 2019
8. Android Dev Summit. https://developer.android.com/. Accessed on 05 July 2019
9. Google Cloud Platform. https://cloud.google.com/. Accessed on 04 Aug 2019
Reputation-Based Stable Grouping Strategy for Holistic Content Distribution in IoV

Richa Sharma(B), T. P. Sharma, and Ajay Kumar Sharma

CSE, National Institute of Technology, Hamirpur, Himachal Pradesh, India
{richa_cs,teek}@nith.ac.in, [email protected]
Abstract. The growth of vehicles and of mobile technologies for wireless connectivity among vehicles has brought new challenges to the Internet of Vehicles (IoV). In order to locate vehicles precisely and process information rapidly, thereby reducing the pressure on reliable communication, grouping techniques are desirable so that mobile stations and roadside terminals can make the most of every transmission opportunity. A large body of work has considered the grouping paradigm for reliable information dissemination; in this paper, we incorporate a data reputation-based stable and secure grouping strategy into an optimization problem that provides QoS upper bounds for better system performance. Our proposed grouping paradigm is modeled in a simulation platform combining the mobility simulator SUMO and the network simulator NS3, with the aim of reducing overhead.

Keywords: LTE-V · Reputation · Trust · NS3 · SUMO
1 Introduction

With the improvement of social and economic conditions and of promising 5G technologies, the Internet of Things (IoT) has received more and more attention. In addition, with ever-increasing numbers of vehicles connected to the IoT, the Internet of Vehicles (IoV) is becoming a research hot spot in academia and industry throughout the world. The IoV is expected to significantly improve the productivity and safety of transportation systems by providing timely and efficient information/content dissemination between roads and vehicles, vehicles and vehicles, vehicles and people, and vehicles and everything. For productive communication in IoV, various types of wireless technologies are utilized, namely vehicular communications comprising DSRC/CALM, cellular communication (4G/LTE, WiMAX, and satellite), and short-range static communication including Zigbee, Bluetooth, and Wi-Fi [13]. The Vehicular Ad Hoc Network (VANET), as the parent technology of IoV, aims to provide traffic safety and effectiveness in terms of reduced time constraints, less pollution discharge, and limited cost. There are, however, many issues that still need attention, including problematic web access, incompatibility with standalone gadgets, limited commercialization, constrained processing ability, standalone network frameworks, and inaccessibility of distributed computing services.
Third Generation Partnership Project (3GPP) members meet regularly to collaborate on and develop cellular communication standards; recently, 3GPP has been defining specifications for 5G [6]. Defining a whole new standard for 5G is an enormous endeavor, and 3GPP has split the 5G standard into two release manuals: Release 15, which corresponds to New Radio (NR) Phase 1, and Release 16, which corresponds to NR Phase 2. In NR Phase 1, there are common components between LTE and NR; for example, both use orthogonal frequency division multiplexing (OFDM). 3GPP is working on making Release 15 adaptable to mm-wave bands to support remote/satellite communication. For V2X, further study is proposed for access network interfaces and dynamic transmission access for the sidelink (PC5). A new evaluation framework is being defined for V2X use cases, including vehicle platooning and advanced driving techniques that facilitate semi-automated, fully automated, and remote driving [7].

Grouping vehicles into various groups, based on some predefined criteria, can solve issues related to a standalone VANET. Different grouping paradigms have been proposed by various researchers in the past [2], yet little thought has been given to situations where group heads could be chosen as trusted group heads for enhanced security. Given the indispensable role of the group head (GH) within its group and between different GHs, and the severe consequences of choosing a malicious node as a group head, there is a genuine need to incorporate trust into group head selection in VANET [11]. Most existing schemes try to improve the stability of groups using mobility metrics such as moving direction, position, speed, and the path being followed [12]. Data dissemination among connected vehicles in an IoV paradigm requires secure communication and protection techniques that avoid issues concerning message integrity, denial of service, and confidentiality. This article presents a secure, trusted grouping technique for efficient data dissemination among connected vehicles that enables an intrusion detection mechanism against security attacks and provides services that meet clients' quality of service (QoS) requirements. The principal contributions of this article are summarized as follows:

• Formulating the data reputation-based stable and secure grouping strategy into an optimization problem that provides the QoS upper bounds for better system performance.
• Conducting a careful examination of a genuine vehicular trace and obtaining significant findings that support our framework design hypothesis. These findings include: (1) mobility similarity among connected vehicles; (2) a reputation-based grouping scheme that includes trust as a key factor; (3) a highly stable correlation existing among connected vehicles.
• Modeling the proposed grouping paradigm in a simulation platform combining the mobility simulator SUMO and the network simulator NS3, aiming to reduce overhead.

In the rest of this paper, we first present a review of vehicular reputation-based data dissemination systems in Sect. 2. In Sect. 3, the system is modeled into the proposed trusted reputation-based grouping strategy in IoV. A brief analysis of a real vehicular trace is conducted in Sect. 4 that demonstrates the effectiveness of our algorithm through
extensive simulation, first formulating the simulation scenario and then discussing the simulation results. Section 5 concludes the paper.
2 Related Work

Owing to the high potential of IoV for improving road safety and traffic productivity, many dissemination schemes for dynamic vehicular conditions have been proposed in the literature. These algorithms are arranged into various groupings depending on how the groups are assembled and how the group head is chosen; every algorithm relies on different criteria (metrics) that play a basic role in achieving group security. Singh and Bali [11] propose a hybrid backbone-based grouping strategy that creates groups and chooses group heads by considering the number of connections and vehicle mobility. During group formation, groups having a relatively high level of connectivity initiate the creation of authority for the backbone network; this authority then takes part in the group head election and group reorganization based on the aggregate relative speeds of vehicles. A reliable data dissemination scheme based on clustering and probabilistic broadcast (CPB) has been proposed that tries to solve issues like high latency, poor coverage, and high accident probability. Each group member forwards the received information to its group head with a calculated likelihood related to how often the same packet is received during one interval; on receiving the forwarded data, the chosen group head continues to disseminate it along the transmission route. The simulation results of that study show that the proposed CPB protocol performs better for parameters like data coverage, average message delay, and data delivery ratio. Liao et al. [5] proposed trustworthiness based on incident reports in V2V communication, forwarded to other vehicles; crowdsourcing capabilities are used for evaluating the trustworthiness value of vehicles, and the global view can easily broadcast an individual vehicle's trust value in the CSC. Furthermore, the authors intend to extend the work to security and privacy issues using unique identification and a public key infrastructure mechanism. Oubabas et al. [8] have proposed a trustworthy way to select the GH by combining stability with a calculated trust value [9], introducing a timer to control congestion among vehicles competing to become GH. Yet a parameter that categorizes information as critical or non-critical is still missing, and this approach considers only the centralized environment, which leaves a gap for implementation in decentralized networks. The appointment of the GH depending on the mobility parameter is introduced in the Vehicular Multi-hop algorithm for Stable Clustering (VMaSC) [12]: the vehicle having the least mobility is selected as the candidate for GH within its N hops. Simulation results show that VMaSC performs well in providing stability for GH election, although the results fall short in reducing overhead to some extent. Considering the limitations in the literature, we present a secure, trusted grouping technique for efficient data dissemination among connected vehicles that enables an intrusion detection mechanism against security attacks and provides services that meet clients' quality of service (QoS) requirements.
3 Framework of Proposed Approach

In this section, we describe the operation of our proposed reputation-based grouping algorithm. To evaluate the reputation among neighbors, we combine exchange trust with data trust. First the assumptions supporting our work are presented, and then the system model is developed.

3.1 Assumptions

• Vehicles are equipped with radio devices for short-range wireless communication (DSRC) and with LTE connectivity for long-range communication; GPS is also assumed in order to obtain accurate speed and position information.
• Information such as position, speed, heading, and reputation is exchanged periodically among neighboring vehicles [1].
• Information is categorized as critical or non-critical by calculating a trust parameter that is stricter for critical messages and more lenient for non-critical messages.

3.2 System Model

Building on the assumptions above, we propose a novel approach to overhead reduction for a reputation-based grouping scheme for holistic content distribution. It is made up of two significant parts: in the first part, a mobility similarity metric is evaluated, and in the second part, a trust-based grouping metric is described.

Mobility Similarity Metric. Grouping is carried out by electing a GH and group members (GMs) to perform the intra-group communication. Initially, the GH acts as a local coordinator for the intra-group communication; additionally, for non-centralized environments, LTE networks can also be used for holistic content delivery [12]. Every vehicle periodically broadcasts information such as its position, speed, and the reputation metric it has calculated; these metrics help to keep the neighbor table up to date. Two kinds of situations are feasible for holistic content dissemination in IoV: vehicles traveling in the opposite direction and vehicles traveling in the same direction. Note that the time interval during which two vehicles can disseminate depends on their relative speed and various other metrics, and this time must be used for both association and information exchange. Without loss of generality, assume V_i represents the sending vehicle and V_j is one of i's neighboring vehicles. Denote V_{v_i}, a_i and V_{v_j}, a_j as the mean values and corresponding variances of the velocities of vehicles V_i and V_j, respectively. Let I.D(t) denote the inter-vehicle distance between adjacent vehicles at time t, with an initial value at time 0 of I.D(0) = t_0. The direction of information propagation among vehicles has to be ensured by selecting a receiving vehicle that has made greater geographic progress than the sender i, subject to 0 < I.D(0) < C.R, where C.R indicates the dissemination range between two adjacent vehicles. The communication distance between adjacent vehicles is modeled here as a queue, and the drifted mean and variance derived from V_{v_i}, a_i and V_{v_j}, a_j are taken as μ = V_{v_j} − V_{v_i}
and variance as σ² = a_i + a_j. The probability density function for the link connectivity between the vehicles, denoted p(x | I.D(0), t) for I.D(t), can be written as follows:

$$
p(x \mid I.D(0), t) = \Pr\{x \le I.D(t) \le x + \mathrm{d}x \mid t_0\}
= \frac{1}{\sigma\sqrt{2\pi t}} \sum_{n=-\infty}^{\infty}
\left[ \exp\!\left(\frac{\mu f_n}{\sigma^2} - \frac{(x - t_0 - f_n - \mu t)^2}{2\sigma^2 t}\right)
- \exp\!\left(\frac{\mu f_n'}{\sigma^2} - \frac{(x - t_0 - f_n' - \mu t)^2}{2\sigma^2 t}\right) \right]
\quad (1)
$$
Here, in Eq. (1), the notations are f_n = 4n·I.D and f_n' = 2·I.D − f_n, respectively. The cumulative dissemination factor (CDF) for the route connectivity time between v_i and v_j is given in Eq. (2), and the mobility similarity factor (M.S.F) in Eq. (3):

$$
C.D.F = 1 - \int_{0}^{C.R} p(x \mid I.D(0), t)\,\mathrm{d}x \quad (2)
$$

$$
M.S.F = \frac{a_{v_i v_j} \cdot S_{v_i v_j} + p(x \mid I.D(0), t)}{\mathrm{distance}(v_i, v_j)} \quad (3)
$$
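A small numerical sketch of Eqs. (1)–(3), as reconstructed above, is given below in Python. The infinite sum is truncated, the integral is approximated with the trapezoidal rule, and the parameter values (drift, variance, range, distances) are purely illustrative assumptions rather than values from the paper.

import math

def p_density(x, d0, t, mu, sigma2, inter_dist, n_terms=50):
    # Eq. (1): truncated image-series density of the inter-vehicle distance I.D(t).
    sigma = math.sqrt(sigma2)
    total = 0.0
    for n in range(-n_terms, n_terms + 1):
        f_n = 4 * n * inter_dist
        f_np = 2 * inter_dist - f_n
        total += math.exp(mu * f_n / sigma2 - (x - d0 - f_n - mu * t) ** 2 / (2 * sigma2 * t))
        total -= math.exp(mu * f_np / sigma2 - (x - d0 - f_np - mu * t) ** 2 / (2 * sigma2 * t))
    return total / (sigma * math.sqrt(2 * math.pi * t))

def cdf_connectivity(d0, t, mu, sigma2, inter_dist, comm_range, steps=200):
    # Eq. (2): C.D.F = 1 - integral of p over [0, C.R], via the trapezoidal rule.
    h = comm_range / steps
    ps = [p_density(i * h, d0, t, mu, sigma2, inter_dist) for i in range(steps + 1)]
    integral = h * (sum(ps) - 0.5 * (ps[0] + ps[-1]))
    return 1.0 - integral

def mobility_similarity(rel_acc, rel_speed, distance, p_link):
    # Eq. (3): M.S.F combining relative motion and link-connectivity density.
    return (rel_acc * rel_speed + p_link) / distance

# Illustrative values only (not from the paper).
mu, sigma2 = 0.5, 4.0            # drift and variance of the relative velocity
d0, t, cr = 50.0, 10.0, 300.0    # initial distance, elapsed time, dissemination range C.R
p = p_density(x=60.0, d0=d0, t=t, mu=mu, sigma2=sigma2, inter_dist=d0)
print(round(cdf_connectivity(d0, t, mu, sigma2, inter_dist=d0, comm_range=cr), 4))
print(round(mobility_similarity(rel_acc=0.8, rel_speed=4.0, distance=80.0, p_link=p), 4))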
Here, for Eq. (3), a_{v_i v_j} and S_{v_i v_j} are the relative acceleration and speed of the adjacent vehicles.

Reputation-Based Trust Metric. IoV adopts the holistic LTE-V environment, and for the reliability and security of messages the vehicles must cooperate. To verify the conduct of vehicles and detect their anomalous behavior in the system, a trust metric is proposed. The reputation of every vehicle is calculated by analyzing its behavior; here, we have considered a holistic beta reputation approach that is feasible for both centralized and distributed environments [3]. The beta reputation system collects the feedback and reputation ratings of every vehicle and sends them to the server; i.e., if vehicle V1 exchanges messages with neighboring vehicles, then neighboring vehicle V2 or V3 provides feedback about the adjacent vehicle's performance or behavior [4]. The reputation collector stores the feedback from all vehicles and sends it to the EPC of the LTE network. The reputation ratings are periodically updated, and every vehicle is notified whether or not to maintain the connection with the adjacent agent. Data dissemination among vehicles involves checking the conduct of vehicles with respect to the authenticity of the data they communicate; accordingly, every vehicle must assess the data it receives from its neighbors. When an incident occurs, the vehicle immediately sends warning messages to the GH or adjacent vehicles, and this warning message also contains the reputation R_v(Msg) of the detected event. If a vehicle V_i can directly recognize the collision/event, it assigns R_v(Msg) = 1 to the event. Otherwise, if the vehicle has not detected the event, it aggregates all the messages and distinguishes between critical and non-critical messages. Here, the GH calculates the trust value of the adjacent vehicles as follows:

$$
R_v(\mathrm{Msg}) = \frac{\sum_{i=0}^{\infty} R_i(\mathrm{Msg}) \cdot I.D \cdot T_f \cdot N_i}{\sum_{i=0}^{\infty} \left(I.D \cdot T_f \cdot N_i\right)} \quad (4)
$$
Here, R_v(Msg) denotes the reputation of the trusted vehicle V_i, I.D denotes the transmission radius of a vehicle to its adjacent vehicle, and T_f denotes the global trust metric. The newness (N_i) of a vehicle is calculated as:

$$
N_i = 1 - \frac{T_{\mathrm{current}} - T_{\mathrm{capture}}}{\mathrm{Msg\ Validity}} \quad (5)
$$
Here, the current vehicle time and the message capture time are taken into consideration. The trust data that the vehicle has calculated is the product with the critical factor that we add to each message for message integrity:

$$
T_f = \frac{N_i}{\mathrm{Num\ Received\ Msgs}} \quad (6)
$$
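The following Python sketch strings Eqs. (4)–(6), as reconstructed above, together for a single GH aggregation step; the message fields and numeric values are hypothetical and only illustrate the flow of the computation.

def newness(t_current, t_capture, msg_validity):
    # Eq. (5): freshness of a report decays with the age of the message.
    return 1.0 - (t_current - t_capture) / msg_validity

def trust_factor(n_i, num_received_msgs):
    # Eq. (6): per-neighbor trust factor derived from newness and message count.
    return n_i / num_received_msgs

def aggregate_reputation(reports):
    # Eq. (4): weighted reputation of an event as computed by the GH.
    # Each report carries (R_i, I.D, T_f, N_i) for one neighboring vehicle.
    num = sum(r_i * i_d * t_f * n_i for r_i, i_d, t_f, n_i in reports)
    den = sum(i_d * t_f * n_i for _, i_d, t_f, n_i in reports)
    return num / den if den else 0.0

# Hypothetical reports from three neighbors about the same event.
reports = []
for r_i, i_d, t_cap, count in [(1.0, 120.0, 9.0, 4), (0.6, 200.0, 7.5, 6), (0.9, 80.0, 9.5, 3)]:
    n_i = newness(t_current=10.0, t_capture=t_cap, msg_validity=5.0)
    t_f = trust_factor(n_i, num_received_msgs=count)
    reports.append((r_i, i_d, t_f, n_i))
print(round(aggregate_reputation(reports), 3))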
4 Performance Evaluation

In this section, we describe the simulation scenario and the performance evaluation carried out for the reputation-based trust algorithm for holistic IoV route guidance. Finally, we discuss the performance results we obtained.

4.1 Simulation Scenario

We implemented the reputation-based grouping approach using Simulation of Urban MObility (SUMO) to provide mobility traces. The mobility traces are loaded into the network simulator NS3 to execute the framework in a simulated scenario. Our proposed approach is compared with two grouping schemes: VMaSC [12] and MoBIC [10]. The holistic dissemination is carried out by sending data through the vehicular group using the IEEE 802.11p interface to the Evolved Packet Core (EPC) of the LTE interface. The Nakagami-m radio propagation model is used; additionally, the beta probability distribution is used to provide reputation among vehicles. To ensure mobility similarity, the weighting parameter is fixed to 0.5, and the same weights are given to the trust factor (T_f) and the trust communication distance (I.D). Table 1 presents the simulation parameters used in our evaluation. Here, we have categorized the traffic density of vehicles into three categories: 15 vehicles/lane/km is categorized as low traffic density, 30 vehicles/lane/km as medium traffic density, and for variable traffic density the vehicle distribution is randomized every 260–300 m.

4.2 Performance Metrics

• Dissemination Overhead: This is the number of messages disseminated by nodes during the group formation and maintenance phases. It is a significant parameter in the LTE-V environment because it indicates how many messages are disseminated in the system in total. To limit the number of messages, or to reduce congestion among vehicles, the overhead needs to be controlled. Decreasing the communication overhead conserves bandwidth, which is a critical resource, and improves the response time, especially in a domain such as IoV where real-time response is required.
Table 1. Simulation parameters

                   | Parameters                 | Values
Traffic            | Number of lanes            | 2 (1 per direction)
                   | Neighborhood vehicle speed | 5–35 m/s
                   | Vehicle density            | 10–50 vehicles/km/lane
                   | Dissemination range        | 300 m
                   | Communication overhead     | 100 bytes
Propagation model  | Path loss                  | Dual slope model
                   | Fading model               | Nakagami-m model
                   | Threshold (w)              | 0.5
                   | Interval of rebroadcasting | 0.06
                   | MAC model                  | IEEE 802.11 DCF
• Stability Evaluation: Grouping stability is of utmost importance for vehicles because of the mobility restrictions on disseminating information. Stability is evaluated according to the lifetime of the GH in its group and the average time a vehicle resides in that group.

4.2.1 Discussion

Compared to other algorithms that use stability factors for trust evaluations, we have introduced a trust metric for efficient grouping with a critical factor. The critical factor is an index over all crucial and non-critical messages; the trust is evaluated for T_f, and this critical factor is added to every message. Figure 1 compares the trust value with the possible error rate. For a critical factor value of 1, the error rate approaches 0 for critical and non-critical messages; on the other hand, a T_d value of 0 gives an error rate of 1, which shows a nonlinear curve. To complete our simulations, we made a comparison with the VMaSC [12] and MoBIC [10] algorithms, which use the stability parameter for efficient grouping in the literature. Figure 2 shows the communication overhead per vehicle versus the number of vehicles in the network. Analyzing the figure, it can easily be seen that even with an increase in vehicle density the overhead is minimal compared to the other schemes. Overhead essentially slows the algorithm down and decreases efficiency; our results are better due to the timer installed at the time of GH election and the calculation of the trust metric for both critical and non-critical messages. The election time for VMaSC is lower than that of the MoBIC algorithm, but our proposed approach beats both approaches even as vehicle density increases. It is evident that the time to elect a GH increases with vehicle density, which happens mainly because of the exchange of control messages among vehicles and the various restrictions among them. In Fig. 3, we compare our approach with the other two grouping approaches, taking the average vehicle density with respect to the time
Fig. 1. Error rate of the trusted value for the disseminated message
Fig. 2. Dissemination overhead versus amount of vehicles
to elect a GH. Attaching the timer and the trust metric to our messages helps to limit the overhead occurring in the network.
Fig. 3. GH election time versus density of vehicles
5 Conclusion

The proposed scheme utilizes a novel metric that improves efficiency by combining data with trust. Two principal steps have been defined in the proposed scheme. The first step uses mobility measurements to choose generally stable candidates for GH by defining the mobility similarity metric; this metric not only helps to elect a stable GH but also serves as a stepping stone for the trust metric assignment. The subsequent phase of the proposed scheme revolves around coupling trust with the disseminated data. We compared our approach with the VMaSC and MoBIC grouping stability algorithms and concluded that even with an increase in vehicle density, our results are more stable and improved. The grouping overhead for VMaSC is considerably better than that of the MoBIC algorithm, yet it fails to surpass our results, mainly because of our inclusion of an election timer for controlling congestion in the network. In the future, we intend to extend this work to more QoS parameters, such as driver behavior that can affect mobility and social relations among adjacent vehicles; more security parameters should also be taken into account.
References

1. T. Butt, R. Ashraf, C.S. Iqbal, T. Shah, Umar: social internet of vehicles: architecture and enabling technologies. Comput. Electr. Eng. 69, 68–84 (2018)
2. D. Dixit, P.R. Sahu, Performance of multihop communication systems with regenerative relays in eta-mu fading channels, in 2014 IEEE 79th Vehicular Technology Conference (VTC Spring) (2014), pp. 1–5
3. M. Javed, S. Zeadally, RepGuide: reputation-based route guidance using internet of vehicles. IEEE Commun. Stand. Mag. 2(4), 81–87 (2018)
4. A. Jøsang, R. Ismail, The beta reputation system, in 15th Bled Electronic Commerce Conference, eReality: Constructing the eEconomy, Bled (2002), pp. 324–337
5. C. Liao, J. Chang, I. Lee, K.K. Venkatasubramanian, A trust model for vehicular network-based incident reports, in 2013 IEEE 5th International Symposium on Wireless Vehicular Communications (WiVeC) (2013), pp. 1–5
6. K. Lin, C. Li, P. Pace, G. Fortino, Multi-level cluster-based satellite-terrestrial integrated communication in internet of vehicles. Comput. Commun. 149, 44–50 (2020)
7. J. Mihelj, A. Kos, U. Sedlar, Source reputation assessment in an IoT-based vehicular traffic monitoring system. Proc. Comput. Sci. 147, 295–299 (2019)
8. S. Oubabas, R. Aoudjit, J. Rodrigues, S. Talbi, Secure and stable vehicular ad hoc network clustering algorithm based on hybrid mobility similarities and trust management scheme. Veh. Commun. 13, 128–138 (2018)
9. N.J. Patel, R.H. Jhaveri, Trust based approaches for secure routing in VANET: a survey. Proc. Comput. Sci. 45, 592–601 (2015)
10. M. Ren, L. Khoukhi, H. Labiod, J. Zhang, V. Veque, A mobility-based scheme for dynamic clustering in vehicular ad-hoc networks (VANETs). Veh. Commun. 9, 233–241 (2017)
11. J.P. Singh, R.S. Bali, A hybrid backbone based clustering algorithm for vehicular ad-hoc networks. Proc. Comput. Sci. 46, 1005–1013 (2015)
12. S. Ucar, S.C. Ergen, O. Ozkasap, Multihop-cluster-based IEEE 802.11p and LTE hybrid architecture for VANET safety message dissemination. IEEE Trans. Veh. Technol. 65(4), 2621–2636 (2015)
13. G. Yan, D.B. Rawat, Vehicle-to-vehicle connectivity analysis for vehicular ad-hoc networks. Ad Hoc Netw. 58, 25–35 (2017)